Commit Graph

31225 Commits

Author SHA1 Message Date
Adam Nemet
80b9e006aa [AVX512] Rename arg from Opcode32/64 to Opcode128/256 in vinsert_for_size
It's the W bit that selects between 32 or 64 elt type and not the opcode.  The
opcode selects between the width of the insert (128 or 256).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:42:04 +00:00
Matt Arsenault
37e484b5ef R600: Remove unnecessary part of computeKnownBitsForTargetNode
Zero-width BFEs are combined away already, so there's no point in
handling them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219868 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:37:49 +00:00
Matt Arsenault
bb402de0c9 Move variable down to use
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219867 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:37:42 +00:00
Tom Stellard
d3fc10a525 R600/SI: Fix bug where immediates were being used in DS addr operands
The SelectDS1Addr1Offset complex pattern always tries to store constant
lds pointers in the offset operand and store a zero value in the addr operand.
Since the addr operand does not accept immediates, the zero value
needs to first be copied to a register.

This newly created zero value will not go through normal instruction
selection, so we need to manually insert a V_MOV_B32_e32 in the complex
pattern.

This bug was hidden by the fact that if there was another zero value
in the DAG that had not been selected yet, then the CSE done by the DAG
would use the unselected node for the addr operand rather than the one
that was just created.  This would lead to the zero value being selected
and the DAG automatically inserting a V_MOV_B32_e32 instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219848 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 21:08:59 +00:00
Sid Manning
0a72d8bb20 Wrong attribute. LLVM_ATTRIBUTE_UNUSED not LLVM_ATTRIBUTE_USED
This original fix for the build break was correct.  LLVM_ATTRIBUTE_USED
removes the warning message because it keeps the function in the object
file.  LLVM_ATTRIBUTE_UNUSED indicates that it may or may not be used
depending on build settings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 20:41:17 +00:00
Sid Manning
a169d59437 Wrong attribute. LLVM_ATTRIBUTE_USED not LLVM_ATTRIBUTE_UNUSED
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219837 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:32:52 +00:00
Sid Manning
f0f7ec31d4 Add LLVM_ATTRIBUTE_UNUSED to function currently just used in an assert
Fixes break when -Wunused-function is used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219833 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:24:14 +00:00
Juergen Ributzka
7440a83e60 Reapply "[FastISel][AArch64] Add custom lowering for GEPs."
This is mostly a copy of the existing FastISel GEP code, but we have to
duplicate it for AArch64, because otherwise we would bail out even for simple
cases. This is because the standard fastEmit functions don't cover MUL at all
and ADD is lowered very inefficientily.

The original commit had a bug in the add emit logic, which has been fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219831 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:58:07 +00:00
Juergen Ributzka
32c5fde3f1 [FastISel][AArch64] Factor out add with immediate emission into a helper function. NFC.
Simplify add with immediate emission by factoring it out into a helper function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219830 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:58:02 +00:00
Sid Manning
1338612c55 Enable the instruction printer in HexagonMCTargetDesc
This adds the MCInstPrinter to the LLVMHexagonDesc library and removes
the dependency LLVMHexagonAsmPrinter had on LLVMHexagonDesc. This is
a prerequisite needed by the disassembler.

Phabricator Revision: http://reviews.llvm.org/D5734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219826 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:27:40 +00:00
Matt Arsenault
8b3a9205b7 R600/SI: Also try to use 0 base for misaligned 8-byte DS loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219823 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:06:43 +00:00
Matt Arsenault
7fdd553b66 R600: Fix miscompiles when BFE has multiple uses
SimplifyDemandedBits would break the other uses of the operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219819 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 17:58:34 +00:00
Rafael Espindola
90ce9f70e2 Simplify handling of --noexecstack by using getNonexecutableStackSection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 16:12:52 +00:00
Rafael Espindola
b510f8d08c Move getNonexecutableStackSection up to the base ELF class.
The .note.GNU-stack section is not SystemZ/X86 specific.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219796 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 15:44:16 +00:00
Matt Arsenault
18ed4acf21 R600: Use existing variable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219778 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 05:07:00 +00:00
Matt Arsenault
8a55ca3c41 R600: Remove outdated comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219777 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 05:06:57 +00:00
Juergen Ributzka
0081070cfd Revert "[FastISel][AArch64] Add custom lowering for GEPs."
This breaks our internal build bots. Reverting it to get the bots green again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 04:55:48 +00:00
Tim Northover
d3458577a9 ARM: drop check for triple that's no longer used.
Early attempts to support AAPCS bare metal MachO targets based the decision on
the CPU being compiled for. This was not a particularly great idea and we've
got a better option now, but this check remained.

No functional change for any target we care about.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219767 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 01:05:01 +00:00
Eric Christopher
c6d2db4db1 Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219750 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 00:09:07 +00:00
Gerolf Hoflehner
cd27f3fb33 [AArch64] Wrong CC access in CSINC-conditional branch sequence
This is a follow up to commit r219742. It removes the CCInMI variable
and accesses the CC in CSCINC directly. In the case of a conditional
branch accessing the CC with CCInMI was wrong.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219748 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 23:55:00 +00:00
Gerolf Hoflehner
2bddd7cf65 [AAarch64] Optimize CSINC-branch sequence
Peephole optimization that generates a single conditional branch
for csinc-branch sequences like in the examples below. This is
possible when the csinc sets or clears a register based on a condition
code and the branch checks that register. Also the condition
code may not be modified between the csinc and the original branch.

Examples:

1. Convert csinc w9, wzr, wzr, <CC>;tbnz w9, #0, 0x44
   to b.<invCC>

2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44
   to b.<CC>


rdar://problem/18506500



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219742 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 23:07:53 +00:00
Simon Pilgrim
84a3feea38 [X86][SSE] pslldq/psrldq shuffle mask decodes
Patch to provide shuffle decodes and asm comments for the sse pslldq/psrldq SSE2/AVX2 byte shift instructions.

Differential Revision: http://reviews.llvm.org/D5598


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219738 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 22:31:34 +00:00
Tim Northover
7419c9c0c0 ARM: remove ARM/Thumb distinction for preferred alignment.
Thumb1 has legitimate reasons for preferring 32-bit alignment of types
i1/i8/i16, since the 16-bit encoding of "add rD, sp, #imm" requires #imm to be
a multiple of 4. However, this is a trade-off betweem code size and RAM usage;
the DataLayout string is not the best place to represent it even if desired.

So this patch removes the extra Thumb requirements, hopefully making ARM and
Thumb completely compatible in this respect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 22:12:17 +00:00
Tim Northover
32d728fbb9 ARM: allow misaligned local variables in Thumb1 mode.
There's no hard requirement on LLVM to align local variable to 32-bits, so the
Thumb1 frame handling needs to be able to deal with variables that are only
naturally aligned without falling over.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219733 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 22:12:14 +00:00
Juergen Ributzka
569c5b62af [FastISel][AArch64] Add custom lowering for GEPs.
This is mostly a copy of the existing FastISel GEP code, but on AArch64 we bail
out even for simple cases, because the standard fastEmit functions don't cover
MUL and ADD is lowered inefficientily.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219726 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 21:41:23 +00:00
Hans Wennborg
76806748d4 [x86 asm] allow fwait alias in both At&t and Intel modes (PR21208)
Differential Revision: http://reviews.llvm.org/D5741

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219725 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 21:41:17 +00:00
Tim Northover
eddeac0b8c ARM: set preferred aggregate alignment to 32 universally.
Before, ARM and Thumb mode code had different preferred alignments, which could
lead to some rather unexpected results. There's justification for reducing it
from the default 64-bits (wasted space), but I don't think there is for going
below 32-bits.

There's no actual ABI change here, just to reassure people.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219719 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:57:26 +00:00
Juergen Ributzka
40017084f7 [FastISel][AArch64] Fix sign-/zero-extend folding when SelectionDAG is involved.
Sign-/zero-extend folding depended on the load and the integer extend to be
both selected by FastISel. This cannot always be garantueed and SelectionDAG
might interfer. This commit adds additonal checks to load and integer extend
lowering to catch this.

Related to rdar://problem/18495928.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:36:02 +00:00
Jan Vesely
d6315ea5a5 Reapply "R600: Add new intrinsic to read work dimensions"
This effectively reverts revert 219707. After fixing the test to work with
new function name format and renamed intrinsic.

Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219710 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:05:26 +00:00
Rafael Espindola
e8e8db7ff6 Revert "R600: Add new intrinsic to read work dimensions"
This reverts commit r219705.

CodeGen/R600/work-item-intrinsics.ll was failing on linux.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219707 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:58:04 +00:00
Jan Vesely
6a529850f3 R600: Add new intrinsic to read work dimensions
v2: Add SI lowering
    Add test

v3: Place work dimensions after the kernel arguments.
v4: Calculate offset while lowering arguments
v5: rebase
v6: change prefix to AMDGPU

Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219705 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:52:07 +00:00
Jan Vesely
787e3ca6a4 R600: FMA is VecALU only instruction
Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:52:04 +00:00
Reed Kotler
c180402b48 Finish getting Mips fast-isel to match up with AArch64 fast-isel
Summary:
In order to facilitate use of common code, checking by reviewers of other fast-isel ports, and hopefully to eventually move most of Mips and other fast-isel ports into target independent code, I've tried to get the two implementations to line up.

There is no functional code change. Just methods moved in the file to be in the same order as in AArch64.

Test Plan: No functional change.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, aemerson, rfuhler

Differential Revision: http://reviews.llvm.org/D5692

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219703 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:27:58 +00:00
Matt Arsenault
704b06ce61 R600/SI: Use DS offsets for constant addresses
Use 0 as the base address for a constant address, so if
we have a constant address we can save moves and form
read2/write2s.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219698 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 17:21:19 +00:00
Robert Khasanov
ad5d223cb5 [AVX512] Extended avx512_binop_rm to DQ/VL subsets.
Added encoding tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219686 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 15:13:56 +00:00
Robert Khasanov
33a95f24bb [AVX512] Extended avx512_binop_rm to BW/VL subsets.
Added encoding tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219685 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 14:36:19 +00:00
Bradley Smith
5051f6033d [AArch64] Fix crash with empty/pseudo-only blocks in A53 erratum (835769) workaround
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219684 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 14:02:41 +00:00
Eric Christopher
0ba4483d01 Grab the subtarget info off of the MachineFunction rather than
indirecting through the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219674 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 08:44:19 +00:00
Eric Christopher
ff9182749e Use the triple to figure out if this is a darwin target, not
the subtarget.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 08:25:26 +00:00
Hao Liu
75ad488c41 [AArch64]Select wide immediate offset into [Base+XReg] addressing mode
e.g Currently we'll generate following instructions if the immediate is too wide:
    MOV  X0, WideImmediate
    ADD  X1, BaseReg, X0
    LDR  X2, [X1, 0]

    Using [Base+XReg] addressing mode can save one ADD as following:
    MOV  X0, WideImmediate
    LDR  X2, [BaseReg, X0]

    Differential Revision: http://reviews.llvm.org/D5477


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219665 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 06:50:36 +00:00
Eric Christopher
9accb10855 Include map into the A15SDOptimizer rather than pick it up
transitively from the DFAPacketizer via TargetInstrInfo.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219652 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 01:13:51 +00:00
Eric Christopher
8ff8c16f58 Remove the TargetMachine from DFAPacketizer since it was only
being used to grab subtarget specific things that we can grab
from the MachineFunction anyhow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219650 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 01:03:16 +00:00
Reed Kotler
2061a56b8a Make first of several changes to bring up to AArch64 fast-isel style
Summary:
Make Mips fast-isel track the form of AArch64 where practical.
This makes it easier for people to review the code, to borrow similar code, and to see how to eventually move a lot of this
 target code for fast-isels into target independent code.

These are just cosmetic changes. Should be no functional difference.

Test Plan:
make check
test-suite for 4 flavors mips32 r1/r2 , -O0/-O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: aemerson, llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D5595

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219633 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 21:46:41 +00:00
Filipe Cabecinhas
40251eb0b0 Fix a broadcast related regression on the vector shuffle lowering.
Summary: Test by Robert Lougher!

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5745

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219617 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 16:16:16 +00:00
Matt Arsenault
162415e8db R600/SI: Minor cleanup of function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219616 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 15:47:59 +00:00
Yuri Gorshenin
ec8aeb0bc1 [asan-asm-instrumentation] Follow-up fixes to r219602: asserts are moved into
function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219610 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 11:44:06 +00:00
Renato Golin
d0c745a9f0 Adds support for the Cortex-A17 to the ARM backend
Patch by Matthew Wahab.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219606 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 10:22:19 +00:00
Bradley Smith
7e67a4b0cb [AArch64] Add workaround for Cortex-A53 erratum (835769)
Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is
possible for a 64-bit multiply-accumulate instruction in AArch64 state to
generate an incorrect result.  The details are quite complex and hard to
determine statically, since branches in the code may exist in some
 circumstances, but all cases end with a memory (load, store, or prefetch)
instruction followed immediately by the multiply-accumulate operation.

The safest work-around for this issue is to make the compiler avoid emitting
multiply-accumulate instructions immediately after memory instructions and the
simplest way to do this is to insert a NOP.

This patch implements such work-around in the backend, enabled via the option
-aarch64-fix-cortex-a53-835769.

The work-around code generation is not enabled by default.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219603 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 10:12:35 +00:00
Yuri Gorshenin
eba0a96f8e [asan-asm-instrumentation] Fixed memory references which includes %rsp as a base or an index register.
Summary: [asan-asm-instrumentation] Fixed memory references which includes %rsp as a base or an index register.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5599

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219602 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 09:37:47 +00:00
NAKAMURA Takumi
58c0f65bf2 Revert r219584, "[X86] Memory folding for commutative instructions."
It broke i686 selfhosting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219595 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 04:17:34 +00:00
Simon Pilgrim
c00cd8e3c8 [X86] Memory folding for commutative instructions.
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.

This mainly helps the stack inliner better fold reloads of 3 (or more) operand instructions (VEX encoded SSE etc.) but by performing this in the lowest foldMemoryOperandImpl implementation it also replaces the X86InstrInfo::optimizeLoadInstr version and is now used by FastISel too.

Differential Revision: http://reviews.llvm.org/D5701


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219584 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-12 10:52:55 +00:00
Simon Pilgrim
b6e1b30957 Test commit access (email fix)
Indentation tidyup.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219577 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-11 20:28:56 +00:00
Benjamin Kramer
fedd0e2a21 MC: Bit pack MCSymbolData.
On x86_64 this brings it from 80 bytes to 64 bytes. Also make any member
variables private and clean up uses to go through the existing accessors.

NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219573 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-11 15:07:21 +00:00
Simon Pilgrim
195be17c96 Test commit access
Fix comment typo + spelling.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219572 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-11 14:23:36 +00:00
Reed Kotler
dd190243ee Add basic conditional branches in mips fast-isel
Summary: Implement the most basic form of conditional branches in Mips fast-isel.

Test Plan:
br1.ll
run 4 flavors of test-suite. mips32 r1/r2 and at -O0/O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D5583

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219556 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-11 00:55:18 +00:00
Matt Arsenault
fc9fda5443 R600/SI: Change how DS offsets are printed
Match SC by using offset/offset0/offset1 and printing
in decimal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219537 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 22:16:07 +00:00
Matt Arsenault
9bd1daf4b9 R600/SI: Match read2/write2 stride 64 versions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 22:12:32 +00:00
Matt Arsenault
968e1f2f5b R600/SI: Add load / store machine optimizer pass.
Currently this only functions to match simple cases
where ds_read2_* / ds_write2_* instructions can be used.

In the future it might match some of the other weird
load patterns, such as direct to LDS loads.

Currently enabled only with a subtarget feature to enable
easier testing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219533 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 22:01:59 +00:00
Chandler Carruth
9a0be5bfd9 [mips] Actually mark that the default case is unreachable as this switch
is over a subset of condition codes.

This fixes the -Werror build which warns about use of uninitialized
variables in the default case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219531 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 21:07:03 +00:00
Reed Kotler
704d4277aa Implement floating point compare for mips fast-isel
Summary: Expand SelectCmp to handle floating point compare

Test Plan:
fpcmpa.ll
run 4 flavors of test-suite, mips32 r1/r2 O0/O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D5567

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219530 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 20:46:28 +00:00
Matt Arsenault
b144f27c8f R600/SI: Disable copying of SCC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219519 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 17:44:47 +00:00
Reed Kotler
5ae4b93565 implement integer compare in mips fast-isel
Summary: implement SelectCmp (integer compare ) in mips fast-isel

Test Plan:
icmpa.ll
also ran 4 test-suite flavors mips32 r1/r2 O0/O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, rfuhler, mcrosier

Differential Revision: http://reviews.llvm.org/D5566

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219518 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 17:39:51 +00:00
Bill Schmidt
38a202955c [PowerPC] Reduce names from Power8Vector to P8Vector
Per Hal Finkel's review, improving typability of some variable names.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219514 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 17:21:15 +00:00
Reed Kotler
f6e11eacdd Implement floating point to integer conversion in mips fast-isel
Summary: Add the ability to convert 64 or 32 bit floating point values to integer in mips fast-isel

Test Plan:
fpintconv.ll
ran 4 flavors of test-suite with no errors, misp32 r1/r2 O0/O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, rfuhler, mcrosier

Differential Revision: http://reviews.llvm.org/D5562

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 17:00:46 +00:00
Benjamin Kramer
d62c4bac66 Reduce double set lookups. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219505 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 15:32:50 +00:00
Bill Schmidt
836ca75dd7 [PowerPC] Add feature for Power8 vector extensions
The current VSX feature for PowerPC specifies availability of the VSX
instructions added with the 2.06 architecture version.  With 2.07, the
architecture adds new instructions to both the Category:Vector and
Category:VSX instruction sets.  Additionally, unaligned vector storage
operations have improved performance.

This patch adds a feature to provide access to the new instructions
and performance capabilities of Power8.  For compatibility with GCC,
the feature is controlled via a new -mpower8-vector switch, and the
feature causes the __POWER8_VECTOR__ builtin define to be generated by
the preprocessor.

There is a companion patch for cfe being committed at the same time.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219501 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 15:09:28 +00:00
Zoran Jovanovic
0bf4807a90 [mips][microMIPS] Implement ADDIUSP instruction
Differential Revision: http://reviews.llvm.org/D5084


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219500 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 14:37:30 +00:00
Zoran Jovanovic
24335e60c7 [mips][microMIPS] Implement JR16 instruction
Differential Revision: http://reviews.llvm.org/D5062


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219498 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 14:02:44 +00:00
Zoran Jovanovic
e2db3024be [mips][microMIPS] Implement ADDIUS5 instruction
Differential Revision: http://reviews.llvm.org/D5049


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219495 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 13:45:34 +00:00
Zoran Jovanovic
28b2826538 ps][microMIPS] Implement JRC instruction
Differential Revision: http://reviews.llvm.org/D5045


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219494 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 13:31:18 +00:00
Zoran Jovanovic
b581230077 [mips][microMIPS] Implement JALRS16 instruction
Differential Revision: http://reviews.llvm.org/D5027


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219493 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 13:22:28 +00:00
Chandler Carruth
f2f5070d79 Don't use an unqualified 'abs' function call with a builtin type.
This is dangerous for numerous reasons. The primary risk here is with
floating point or double types where if the wrong header files are
included in a strange order this can implicitly convert to integers and
then call the C abs function on the integers. There is a secondary risk
that even impacts integers where if the namespace the code is written in
ever defines an abs overload for types within that namespace the global
abs will be hidden. The correct form is to call std::abs or write 'using
std::abs' for builtin types (and only the latter is correct in any
generic context).

I've also added the requisite header to be a bit more explicit here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219484 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 08:27:19 +00:00
Samuel Antao
f75bfbea17 Fix bug in GPR to FPR moves in PPC64LE.
The current implementation of GPR->FPR register moves uses a stack slot. This mechanism writes a double word and reads a word. In big-endian the load address must be displaced by 4-bytes in order to get the right value. In little endian this is no longer required. This patch fixes the issue and adds LE regression tests to fast-isel-conversion which currently expose this problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219441 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 20:42:56 +00:00
Benjamin Kramer
07200c4b6c Remove a compiler bug workaround from 2007. The affected versions of gcc are long gone.
NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219433 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 19:50:39 +00:00
Matt Arsenault
2e3ee4ceb5 Fix typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219429 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 19:15:15 +00:00
Tom Stellard
a8b2e6f4af R600/SI: Legalize CopyToReg during instruction selection
The instruction emitter will crash if it encounters a CopyToReg
node with a non-register operand like FrameIndex.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219428 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 19:06:00 +00:00
Lang Hames
80bfef5ce1 [PBQP] Add missing headers from r219421.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219425 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 18:36:59 +00:00
Lang Hames
54d63b4fd5 [PBQP] Replace PBQPBuilder with composable constraints (PBQPRAConstraint).
This patch removes the PBQPBuilder class and its subclasses and replaces them
with a composable constraints class: PBQPRAConstraint. This allows constraints
that are only required for optimisation (e.g. coalescing, soft pairing) to be
mixed and matched.

This patch also introduces support for target writers to supply custom
constraints for their targets by overriding a TargetSubtargetInfo method:

std::unique_ptr<PBQPRAConstraints> getCustomPBQPConstraints() const;

This patch should have no effect on allocations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219421 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 18:20:51 +00:00
Tom Stellard
d0fb5b1c11 R600/SI: Legalize INSERT_SUBREG instructions during PostISelFolding
LLVM assumes INSERT_SUBREG will always have register operands, so
we need to legalize non-register operands, like FrameIndexes, to
avoid random assertion failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219420 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 18:09:15 +00:00
Bill Schmidt
c307b3034a [PPC64] VSX indexed-form loads use wrong instruction format
The VSX instruction definitions for lxsdx, lxvd2x, lxvdsx, and lxvw4x
incorrectly use the XForm_1 instruction format, rather than the
XX1Form instruction format.  This is likely a pasto when creating
these instructions, which were based on lvx and so forth.  This patch
uses the correct format.

The existing reformatting test (test/MC/PowerPC/vsx.s) missed this
because the two formats differ only in that XX1Form has an extension
to the target register field in bit 31.  The tests for these
instructions used a target register of 7, so the default of 0 in bit
31 for XForm_1 didn't expose a problem.  For register numbers 32-63
this would be noticeable.  I've changed the test to use higher
register numbers to verify my change is effective.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219416 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 17:51:35 +00:00
Kevin Qin
f73400a864 [AArch64] Enable partial & runtime unrolling on cortex-a57.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219401 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 10:13:27 +00:00
Robert Khasanov
340b5b9ad7 [AVX512] Extended avx512_binop_rm for AVX512VL subsets.
Added avx512_binop_rm_vl multiclass for VL subset
Added encoding tests



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219390 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 08:38:48 +00:00
Bob Wilson
8e673570b9 Use triple's isiOS() and isOSDarwin() methods.
These methods are already used in lots of places. This makes things more
consistent. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219386 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 05:43:30 +00:00
Eric Christopher
e6d97094b7 Remove unused argument to CreateTargetScheduleState and change
the TargetMachine to a TargetSubtargetInfo since everything
we wanted is off of that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219382 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 01:59:35 +00:00
Adam Nemet
8088af2547 [AVX512] Rename AVX512_masking* to AVX512_maskable*
No functional change.

This is the current AVX512_maskable multiclass hierarchy:

                 maskable_custom
                    /       \
                   /         \
          maskable_common   maskable_in_asm
            /         \
           /           \
      maskable        maskable_3src

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219363 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:39 +00:00
Adam Nemet
fbd0e464dd [AVX512] Intrinsics for vextract*x4
This adds the Pat<>'s for the intrinsics.  These are necessary because we
don't lower these intrinsics to SDNodes but match them directly.  See the
rational in the previous commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219362 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:37 +00:00
Adam Nemet
e868005a27 [AVX512] Add asm-only support for vextract*x4 masking variants
These derive from the new asm-only masking definitions.

Unfortunately I wasn't able to find a ISel pattern that we could legally
generate for the masking variants.  The problem is that since the destination
is v4* we would need VK4 register classes and v4i1 value types to express the
masking.  These are however not legal types/classes in AVX512f but only in VL,
so things get complicated pretty quickly.  We can revisit this question later
if we have a more pressing need to express something like this.

So the ISel patterns are empty for the masking instructions and the next patch
will add Pat<>s instead to match the intrinsics calls with instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219361 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:33 +00:00
Adam Nemet
9d0ec9212b [AVX512] Move DAG for all-zero node to X86VectorVTInfo
No functional change.

No change in X86.td.expanded except for the appearance of the new attributes.

The new attributes will be used in the subsequent patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219360 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:31 +00:00
Adam Nemet
6feb834941 [AVX512] Peel off an asm-only class from AVX512_masking_common.
No functional change.

This enables the generation of masking instructions that don't provide a
ISel pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219358 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:23 +00:00
Robin Morisset
b79d91ca1c [X86] Don't transform atomic-load-add into an inc/dec when inc/dec is slow
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219357 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:16:23 +00:00
Robin Morisset
48dfa127d7 [X86] Avoid generating inc/dec when slow for x.atomic_store(1 + x.atomic_load())
Summary:
I had forgotten to check for NotSlowIncDec in the patterns that can generate
inc/dec for the above pattern (added in D4796).
This currently applies to Atom Silvermont, KNL and SKX.

Test Plan: New checks on atomic_mi.ll

Reviewers: jfb, nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219336 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 19:38:18 +00:00
Robert Khasanov
0e3754615e [AVX512] Added intrinsics for 128-, 256- and 512-bit versions of VPCMP/VPCMPU{BWDQ}
Added CMP_MASK_CC intrinsic type.
Added tests for intrinsics.

Patch by Sergey Lisitsyn <sergey.lisitsyn@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219316 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 15:49:26 +00:00
Robert Khasanov
e659ba92c8 [AVX512] Refactoring of avx512_binop_rm multiclass through AVX512_masking.
Added new argrument for AVX512_masking: InstrItinClass and bit isCommutable.
No functional change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219310 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 14:37:45 +00:00
Renato Golin
1e059a88f8 Emit unaligned access build attribute for ARM
Patch by Charlie Turner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219301 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 12:26:22 +00:00
Renato Golin
ef2ed3f465 Refactor isThumb1Only() && isMClass() into a predicate called isV6M()
This must be enforced for all v6M cores, not just the cortex-m0,
irregardless of the user-specified alignment.

Patch by Charlie Turner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219300 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 12:26:16 +00:00
Renato Golin
3828392790 Simplify switch statement in ARM subtarget align access
This switch can be reduced to a simpler if/else statement.

Patch by Charlie Turner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219299 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 12:26:13 +00:00
Eric Christopher
b7cd35b171 Cache TargetLowering on SelectionDAGISel and update previous
calls to getTargetLowering() with the cached variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219284 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 07:32:17 +00:00
Chad Rosier
2929da99a9 [AArch64] Generate vector signed/unsigned mul and mla/mls long.
Phabricator Revision: http://reviews.llvm.org/D5589
Patch by Balaram Makam <bmakam@codeaurora.org>!!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219276 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 02:31:24 +00:00
Robin Morisset
28127ffb51 [X86] Fix a bug with fetch_add(INT32_MIN)
Summary:
Fix pr21099

The pseudocode of what we were doing (spread through two functions) was:
if (operand.doesNotFitIn32Bits())
  Opc.initializeWithFoo();
if (operand < 0)
  operand = -operand;
if (operand.doesFitIn8Bits())
  Opc.initializeWithBar();
else if (operand.doesFitIn32Bits())
  Opc.initializeWithBlah();
doStuff(Opc);

So for operand == INT32_MIN, Opc was never initialized because the operand changes
from fitting in 32 bits to not fitting, causing the various bugs/error messages
noted by pr21099.

This patch adds an extra test at the beginning for this case, and an
llvm_unreachable to have better error message if the operand ends up
not fitting in 32-bits at the end.

Test Plan: new test + make check

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5655

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219257 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:53:57 +00:00
Tom Stellard
806e08c5a0 R600/SI: Refactor VOP3 instruction defs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219256 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:51:41 +00:00
Tom Stellard
96c9ebbd04 R600/SI: Refactor VOPC instruction defs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:51:39 +00:00
Tom Stellard
40949737c4 R600/SI: Refactor VOP2 instruction defs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219254 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:51:38 +00:00
Tom Stellard
6bf4c41242 R600/SI: Refactor VOP1 instruction defs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219253 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:51:34 +00:00
Matt Arsenault
1ba00eab26 R600: Remove dead code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 21:29:56 +00:00
Tom Stellard
ae06e97b3b R600: Remove some redundant initializations from AMDGPUMCAsmInfo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219238 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 21:09:25 +00:00
Tom Stellard
9eb63e54db R600: Use MCAsmInfoELF as AMDGPUMCAsmInfo base class
The main reason for this is that the MCAsmInfo class,
which we were previously using as the base class, sets
PrivateGlobalPrefix to "L", which causes all global
functions that start with L to be treated as local symbols.

MCAsmInfoELF sets PrivateGlobalPrefix to ".L", which is what
we want, and it is probably a good idea to use this as the
base class anyway, since we are emitting ELF binaries.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219237 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 21:09:23 +00:00
Tom Stellard
b8fee4f1d9 R600/SI: Remove assertion in SIInstrInfo::areLoadsFromSameBasePtr()
Added a FIXME coment instead, we need to handle the case where the
two DS instructions being compared have different numbers of operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219236 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 21:09:20 +00:00
Yuri Gorshenin
86e0844d1c [asan-asm-instrumentation] CFI directives are generated for .S files.
Summary: CFI directives are generated for .S files.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5520

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 11:03:09 +00:00
Daniel Sanders
75046b4891 [mips] Return {f128} correctly for N32/N64.
Summary:
According to the ABI documentation, f128 and {f128} should both be returned
in $f0 and $f2. However, this doesn't match GCC's behaviour which is to
return f128 in $f0 and $f2, but {f128} in $f0 and $f1.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5578

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219196 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 09:29:59 +00:00
Craig Topper
95717dbb11 [X86] Fix a bug where the disassembler was ignoring the VEX.W bit in 32-bit mode for certain instructions it shouldn't.
Unfortunately, this isn't easy to fix since there's no simple way to figure out from the disassembler tables whether the W-bit is being used to select a 64-bit GPR or if its a required part of the opcode. The fix implemented here just looks for "64" in the instruction name and ignores the W-bit in 32-bit mode if its present.

Fixes PR21169.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219194 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 07:29:50 +00:00
Craig Topper
ada4703cba Formatting fixes. Most putting 'else' on the same line as the preceding curly brace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219193 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 07:29:48 +00:00
Craig Topper
038a3b8d65 Fix filename in header and use C++ version of the C header files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219192 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 07:29:46 +00:00
Juergen Ributzka
301d3d04f0 [FastISel][AArch64] Teach the address computation code to also fold sign-/zero-extends.
The code already folds sign-/zero-extends, but only if they are arguments to
mul and shift instructions. This extends the code to also fold them when they
are direct inputs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:40:06 +00:00
Juergen Ributzka
3692081566 [FastISel][AArch64] Teach the address computation to also fold sub instructions.
Tiny enhancement to the address computation code to also fold sub instructions
if the rhs is constant and can be folded into the offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:40:03 +00:00
Juergen Ributzka
ca07e256f6 [FastISel][AArch64] Fix "Fold sign-/zero-extends into the load instruction."
This commit fixes an issue with sign-/zero-extending loads that was discovered
by Richard Barton.

We use now the correct load instructions for sign-extending loads to 64bit. Also
updated and added more unit tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219185 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:39:59 +00:00
NAKAMURA Takumi
844eeb3741 ARMInstPrinter.cpp: Suppress a warning for -Asserts. [-Wunused-variable]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219172 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 23:48:04 +00:00
Tim Northover
ca6c95df36 ARM: silence unused variable warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219128 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 17:26:36 +00:00
Tim Northover
969587d62d ARM: remove dead InstPrinting code
This instruction form is handled by different AsmOperands now, so the code is
completely dead (and wrong anyway).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219127 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 17:10:13 +00:00
Benjamin Kramer
043994e266 X86: Drop the isConvertibleTo3Addr bit from shufps/shufpd now that we don't convert them anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219112 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 09:56:40 +00:00
Eric Christopher
e6f32be8df Add subtarget caches to aarch64, arm, ppc, and x86.
These will make it easier to test further changes to the
code generation and optimization pipelines as those are
moved to subtargets initialized with target feature and
target cpu.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219106 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 06:45:36 +00:00
Chandler Carruth
1d02acb7a0 [x86] Remove the 2-addr-to-3-addr "optimization" from shufps to pshufd.
This trades a (register-renamer-friendly) movaps for a floating point
/ integer domain cross. That is a very bad trade, even on architectures
where domain crossing is relatively fast. On any chip where there is
even a cycle stall, this is a Very Bad Idea. It doesn't even seem likely
to cause a spill to be introduced because the reason for the copy is to
destructively shuffle in place.

Thanks to Ben Kramer for fixing a bug in this code that my new shuffle
lowering exposed and highlighting that perhaps it should just go away.
=]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219090 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 22:57:31 +00:00
Benjamin Kramer
88b3a52eec X86: Don't drop half of the mask when converting 2-address shufps into 3-address pshufd.
It's debatable whether this transform is useful at all, but for now make sure
we don't generate invalid asm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219084 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 16:14:29 +00:00
Elena Demikhovsky
a0cb2c75b0 AVX-512-SKX: Added instruction VPMOVM2B/W/D/Q.
This instruction allows to broadacst mask vector to data vector.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219083 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 14:11:08 +00:00
Chandler Carruth
30ae72f02c [x86] Fix PR21139, one of the last remaining regressions found in the
new vector shuffle lowering.

This is loosely based on a patch by Marius Wachtler to the PR (thanks!).
I refactored it a bi to use std::count_if and a mutable array ref but
the core idea was exactly right. I also added some direct testing of
this case.

I believe PR21137 is now the only remaining regression.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219081 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 12:07:34 +00:00
Chandler Carruth
a644b090de [x86] Teach the new vector shuffle lowering how to lower 128-bit
shuffles using AVX and AVX2 instructions. This fixes PR21138, one of the
few remaining regressions impacting benchmarks from the new vector
shuffle lowering.

You may note that it "regresses" many of the vperm2x128 test cases --
these were actually "improved" by the naive lowering that the new
shuffle lowering previously did. This regression gave me fits. I had
this patch ready-to-go about an hour after flipping the switch but
wasn't sure how to have the best of both worlds here and thought the
correct solution might be a completely different approach to lowering
these vector shuffles.

I'm now convinced this is the correct lowering and the missed
optimizations shown in vperm2x128 are actually due to missing
target-independent DAG combines. I've even written most of the needed
DAG combine and will submit it shortly, but this part is ready and
should help some real-world benchmarks out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219079 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 11:41:36 +00:00
NAKAMURA Takumi
529fcbed32 HexagonMCCodeEmitter.cpp: Prune 2nd redundant \brief. [-Wdocumentation]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219073 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 04:54:54 +00:00
NAKAMURA Takumi
935e4326a5 HexagonDesc: Update LLVMBuild.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219071 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 04:54:29 +00:00
Benjamin Kramer
3cdf75c5cf [SystemZ] Make operator bool explicit. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219069 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 22:44:35 +00:00
Benjamin Kramer
814a2ffc7c Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and weirdness exposed by it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219068 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 22:44:29 +00:00
Benjamin Kramer
dbc6d9b9d7 Remove unnecessary copying or replace it with moves in a bunch of places.
NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219061 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 16:55:56 +00:00
Chandler Carruth
03a77831cc [x86] Enable the new vector shuffle lowering by default.
Update the entire regression test suite for the new shuffles. Remove
most of the old testing which was devoted to the old shuffle lowering
path and is no longer relevant really. Also remove a few other random
tests that only really exercised shuffles and only incidently or without
any interesting aspects to them.

Benchmarking that I have done shows a few small regressions with this on
LNT, zero measurable regressions on real, large applications, and for
several benchmarks where the loop vectorizer fires in the hot path it
shows 5% to 40% improvements for SSE2 and SSE3 code running on Sandy
Bridge machines. Running on AMD machines shows even more dramatic
improvements.

When using newer ISA vector extensions the gains are much more modest,
but the code is still better on the whole. There are a few regressions
being tracked (PR21137, PR21138, PR21139) but by and large this is
expected to be a win for x86 generated code performance.

It is also more correct than the code it replaces. I have fuzz tested
this extensively with ISA extensions up through AVX2 and found no
crashes or miscompiles (yet...). The old lowering had a few miscompiles
and crashers after a somewhat smaller amount of fuzz testing.

There is one significant area where the new code path lags behind and
that is in AVX-512 support. However, there was *extremely little*
support for that already and so this isn't a significant step backwards
and the new framework will probably make it easier to implement lowering
that uses the full power of AVX-512's table-based shuffle+blend (IMO).

Many thanks to Quentin, Andrea, Robert, and others for benchmarking
assistance. Thanks to Adam and others for help with AVX-512. Thanks to
Hal, Eric, and *many* others for answering my incessant questions about
how the backend actually works. =]

I will leave the old code path in the tree until the 3 PRs above are at
least resolved to folks' satisfaction. Then I will rip it (and 1000s of
lines of code) out. =] I don't expect this flag to stay around for very
long. It may not survive next week.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219046 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 03:52:55 +00:00
Jingyue Wu
ea1cccd0b6 Add fake use to suppress defined-but-unused warnings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219045 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 03:50:10 +00:00
Chandler Carruth
1e663cf69c [x86] Fix a bug in the VZEXT DAG combine that I just made more powerful.
It turns out this combine was always somewhat flawed -- there are cases
where nested VZEXT nodes *can't* be combined: if their types have
a mismatch that can be observed in the result. While none of these show
up in currently, once I switch to the new vector shuffle lowering a few
test cases actually form such nested VZEXT nodes. I've not come up with
any IR pattern that I can sensible write to exercise this, but it will
be covered by tests once I flip the switch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219044 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 02:51:03 +00:00
Chandler Carruth
cd2c1d8db1 [x86] Sink a generic combine of VZEXT nodes from the lowering to VZEXT
nodes to the DAG combining of them.

This will allow the combine to fire on both old vector shuffle lowering
and the new vector shuffle lowering and generally seems like a cleaner
design. I've trimmed down the code a bit and tried to make it and the
surrounding combine fairly clean while moving it around.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219042 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 01:05:48 +00:00
Matt Arsenault
9747d27b59 R600/SI: Custom lower f64 -> i64 conversions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219038 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 23:54:56 +00:00
Matt Arsenault
ad2f641c33 R600: Custom lower [s|u]int_to_fp for i64 -> f64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219037 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 23:54:41 +00:00
Matt Arsenault
24cee77dd4 R600/SI: Fix ftrunc f64 conformance failures.
Re-add the tests since they were deleted at some point

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219036 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 23:54:27 +00:00
Chandler Carruth
f159de96bd [x86] Add a really preposterous number of patterns for matching all of
the various ways in which blends can be used to do vector element
insertion for lowering with the scalar math instruction forms that
effectively re-blend with the high elements after performing the
operation.

This then allows me to bail on the element insertion lowering path when
we have SSE4.1 and are going to be doing a normal blend, which in turn
restores the last of the blends lost from the new vector shuffle
lowering when I got it to prioritize insertion in other cases (for
example when we don't *have* a blend instruction).

Without the patterns, using blends here would have regressed
sse-scalar-fp-arith.ll *completely* with the new vector shuffle
lowering. For completeness, I've added RUN-lines with the new lowering
here. This is somewhat superfluous as I'm about to flip the default, but
hey, it shows that this actually significantly changed behavior.

The patterns I've added are just ridiculously repetative. Suggestions on
making them better very much welcome. In particular, handling the
commuted form of the v2f64 patterns is somewhat obnoxious.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219033 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 22:43:17 +00:00
Chandler Carruth
91ea3e41ae [x86] Adjust the patterns for lowering X86vzmovl nodes which don't
perform a load to use blendps rather than movss when it is available.

For non-loads, blendps is *much* faster. It can execute on two ports in
Sandy Bridge and Ivy Bridge, and *three* ports on Haswell. This fixes
one of the "regressions" from aggressively taking the "insertion" path
in the new vector shuffle lowering.

This does highlight one problem with blendps -- it isn't commuted as
heavily as it should be. That's future work though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219022 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 21:38:49 +00:00
Richard Smith
2451e5b581 PR21145: Teach LLVM about C++14 sized deallocation functions.
C++14 adds new builtin signatures for 'operator delete'. This change allows
new/delete pairs to be removed in C++14 onwards, as they were in C++11 and
before.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219014 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 20:17:06 +00:00
Adam Nemet
726942c8bb [ISel] Keep matching state consistent when folding during X86 address match
In the X86 backend, matching an address is initiated by the 'addr' complex
pattern and its friends.  During this process we may reassociate and-of-shift
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
shift into the scale of the address.

However as demonstrated by the testcase, this can trigger CSE of not only the
shift and the AND which the code is prepared for but also the underlying load
node.  In the testcase this node is sitting in the RecordedNode and MatchScope
data structures of the matcher and becomes a deleted node upon CSE.  Returning
from the complex pattern function, we try to access it again hitting an assert
because the node is no longer a load even though this was checked before.

Now obviously changing the DAG this late is bending the rules but I think it
makes sense somewhat.  Outside of addresses we prefer and-of-shift because it
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
example because it create a non-canonical node).  We currently don't recognize
addresses during DAGCombiner where arguably this canonicalization should be
performed.  On the other hand, having this in the matcher allows us to cover
all the cases where an address can be used in an instruction.

I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
the new shift node in FoldMaskedShiftToScaledMask.  This RAUW is responsible
for initiating the recursive CSE on users
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
is not strictly necessary since the shift is hooked into the visited user.  Of
course it's safer to keep the DAG consistent at all times (e.g. for accurate
number of uses, etc.).

So rather than changing the fundamentals, I've decided to continue along the
previous patches and detect the CSE.  This patch installs a very targeted
DAGUpdateListener for the duration of a complex-pattern match and updates the
matching state accordingly.  (Previous patches used HandleSDNode to detect the
CSE but that's not practical here).  The listener is only installed on X86.

I tested that there is no measurable overhead due to this while running
through the spec2k BC files with llc.  The only thing we pay for is the
creation of the listener.  The callback never ever triggers in spec2k since
this is a corner case.

Fixes rdar://problem/18206171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219009 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 20:00:34 +00:00
Tom Stellard
77859c9e9c R600: Align functions to 256 bytes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219002 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 19:02:02 +00:00
Benjamin Kramer
63688e622c Eliminate some deep std::vector copies. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218999 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 18:33:16 +00:00
Robin Morisset
8e2b2ae80e [Power] Use lwsync for non-seq_cst fences
Summary:
hwsync is only required for seq_cst fences, acquire and release one can use
the cheaper lwsync.

Test Plan: Added some cases to atomics.ll + make check-all

Reviewers: jfb, wschmidt

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5317

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218995 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 18:04:36 +00:00
Hans Wennborg
d8af735ca7 MipsAsmParser.cpp: fix VS2012 build
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218991 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 17:16:24 +00:00
Hans Wennborg
8651aece06 HexagonMCCodeEmitter.h: deleted member functions are not supported in VS2012
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218990 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 17:02:28 +00:00
Daniel Sanders
61bc405795 [mips] Print warning when using register names not available in N32/64
Summary:
The register names t4-t7 are not available in the N32 and N64 ABIs.
This patch prints a warning, when those names are used in N32/64,
along with a fix-it with the correct register names.

Patch by Vasileios Kalintiris

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5272


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218989 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 15:37:37 +00:00
Sid Manning
c4d113192d Fix build break on Hexagon
Differential Revision: http://reviews.llvm.org/D5600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218987 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 13:59:01 +00:00
Sid Manning
30d67d99cf Adding skeleton for unit testing Hexagon Code Emission
Adding and modifying CMakeLists.txt files to run unit tests under
unittests/Target/* if the directory exists.  Adding basic unit test to check
that code emitter object can be retrieved.

Differential Revision: http://reviews.llvm.org/D5523
Change by: Colin LeMahieu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218986 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 13:18:11 +00:00
Chandler Carruth
dce98e6739 [x86] Teach the new vector shuffle lowering to aggressively form MOVSS
and MOVSD nodes for single element vector inserts.

This is particularly important because a number of patterns in the
backend detect these patterns and leverage them to simplify things. It
also fixes quite a few of the insertion bad code examples. However, it
regresses a specific area: when available, blendps and blendpd are
*dramatically* faster than movss and movsd respectively. But it doesn't
really work to form the blend logic first because the blends *aren't* as
crazy efficient when the data is coming from memory anyways, and thus
will have a movss or movsd regardless. Also, doing that would block
a bunch of the patterns that this is designed to hit.

So my plan is to go into the patterns for lowering MOVSS and MOVSD and
lower them via blends when available. However that's a pretty invasive
restructuring so it will need to be a follow-up patch.

I have already gone into the patterns to lower MOVSS and MOVSD from
memory using MOVLPD, etc. Without that, several of the test cases
I already have regress.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 13:11:13 +00:00
Renato Golin
b157cb7afd Revert 202433 - Provide a target override for the latest regalloc heuristic
That commit was introduced in order to help investigate a problem in ARM
codegen breaking from commit 202304 (Add a limit to the heuristic that register
allocates instructions in local order). Recent analisys indicated that the
problem no longer exists, so I'm reverting this change.

See PR18996.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218981 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 12:20:53 +00:00
Chandler Carruth
7ae6f2abf6 [x86] Refactor the element insertion logic in the new vector shuffle
lowering to handle the potential mirroring of 2-element vectors (because
we can't reliably sort them one way) in the caller rather than in the
insertion logic.

This will simplify things considerably as more ways to fail to match the
insertion are added because now we have a nice try and retry point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218980 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 12:01:55 +00:00
Chandler Carruth
01b3858e66 [x86] Significantly improve the ability of the new vector shuffle
lowering to match VZEXT_MOVL patterns.

I hadn't realized that these had sufficient pattern smarts in the
backend to lower zext-ing from the low element of a vector without it
being a scalar_to_vector node. They do, and this is how to match a bunch
of patterns for movq, movss, etc.

There is a weird propensity to end up using pshufd to place the element
afterward even though it means domain crossing (or rather, to use
xorps+movss to zext the element rather than movq) but that's an
orthogonal problem with VZEXT_MOVL that someone should probably look at.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218977 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 11:25:58 +00:00
Chandler Carruth
53bf81ae59 [x86] Unbreak SSE1 with the new vector shuffle lowering. We can't widen
element types to form illegal vector types.

I've added a special SSE1 test case here that makes sure we don't break
this going forward.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218974 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 10:11:39 +00:00
Eric Christopher
8f09464bc9 constify TargetMachine parameter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 00:42:41 +00:00
Eric Christopher
59cacc9dec constify TargetMachine argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218930 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 00:17:59 +00:00
Eric Christopher
1340986490 We can grab the options struct from the TargetMachine, no need to
pass it down in the constructor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218929 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 00:10:03 +00:00
Adam Nemet
6955c9d1ac [AVX512] Pull pattern for subvector insert into the instruction definition
No functional change intended.

Very similar to the change I made for subvector extract in r218480.

test/CodeGen/X86/avx512-insert-extract.ll covers this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218928 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 23:18:30 +00:00
Adam Nemet
d9e2cc7fa0 [AVX512] Refactor subvector inserts
No functional change.

Very similar to the extract refactoring I did in r218478.

Compared X86.td.expanded before and after.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218927 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 23:18:28 +00:00
Adam Nemet
a9014e5530 [AVX512] Fix i256mem->f256mem typo in VINSERTF64x4rm
Just like in the case of extracts, the refactoring is uncovering some typos in
the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218926 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 23:18:26 +00:00
Hal Finkel
626236d9bc [PowerPC] Modern Book-E cores support sync
Older Book-E cores, such as the PPC 440, support only msync (which has the same
encoding as sync 0), but not any of the other sync forms. Newer Book-E cores,
however, do support sync, and for performance reasons we should allow the use
of the more-general form.

This refactors msync use into its own feature group so that it applies by
default only to older Book-E cores (of the relevant cores, we only have
definitions for the PPC440/450 currently).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218923 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 22:34:22 +00:00
Robin Morisset
2b1874cbd4 [Power] Improve the expansion of atomic loads/stores
Summary:
Atomic loads and store of up to the native size (32 bits, or 64 for PPC64)
can be lowered to a simple load or store instruction (as the synchronization
is already handled by AtomicExpand, and the atomicity is guaranteed thanks to
the alignment requirements of atomic accesses). This is exactly what this patch
does. Previously, these were implemented by complex
load-linked/store-conditional loops.. an obvious performance problem.

For example, this patch turns
```
define void @store_i8_unordered(i8* %mem) {
  store atomic i8 42, i8* %mem unordered, align 1
  ret void
}
```
from
```
_store_i8_unordered:                    ; @store_i8_unordered
; BB#0:
    rlwinm r2, r3, 3, 27, 28
    li r4, 42
    xori r5, r2, 24
    rlwinm r2, r3, 0, 0, 29
    li r3, 255
    slw r4, r4, r5
    slw r3, r3, r5
    and r4, r4, r3
LBB4_1:                                 ; =>This Inner Loop Header: Depth=1
    lwarx r5, 0, r2
    andc r5, r5, r3
    or r5, r4, r5
    stwcx. r5, 0, r2
    bne cr0, LBB4_1
; BB#2:
    blr
```
into
```
_store_i8_unordered:                    ; @store_i8_unordered
; BB#0:
    li r2, 42
    stb r2, 0(r3)
    blr

```
which looks like a pretty clear win to me.

Test Plan:
fixed the tests + new test for indexed accesses + make check-all

Reviewers: jfb, wschmidt, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218922 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 22:27:07 +00:00
Juergen Ributzka
b3f91b0af7 [Stackmaps] Make ithe frame-pointer required for stackmaps.
Do not eliminate the frame pointer if there is a stackmap or patchpoint in the
function. All stackmap references should be FP relative.

This fixes PR21107.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218920 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 22:21:49 +00:00
Chandler Carruth
bf21d40070 [x86] Teach the new vector shuffle lowering to widen floating point
elements as well as integer elements in order to form simpler shuffle
patterns.

This is the primary reason why we were failing to match some of the
2-and-2 floating point shuffles such as PR21140. Even after fixing this
we need to support some extra patterns in the backend in order to match
the resulting X86ISD::UNPCKL nodes into the correct instructions. This
commit should fix PR21140 and includes more comprehensive testing of
insertion patterns in v4 shuffles.

Not all of the added tests are beautiful. For example, we don't have
clever instructions to insert-via-load in the integer domain. There are
also some places where we aren't sufficiently cunning with our use of
movq and movd, but that's future work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218911 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 21:37:14 +00:00
Tilmann Scheller
ad7783df73 [NVPTX] Remove dead code.
Found by the Clang static analyzer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218874 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 15:12:48 +00:00
Joerg Sonnenberger
92583e0712 Support padding unaligned data in .text.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 13:41:42 +00:00
Chandler Carruth
4bbf21e71e [x86] Improve and correct how the new vector shuffle lowering was
matching and lowering 64-bit insertions.

The first problem was that we weren't looking through bitcasts to
discover that we *could* lower as insertions. Once fixed, we in turn
weren't looking through bitcasts to discover that we could fold a load
into the lowering. Once fixed, we weren't forming a SCALAR_TO_VECTOR
node around the inserted element and instead were passing a scalar to
a DAG node that expected a vector. It turns out there are some patterns
that will "lower" this into the correct asm, but the rest of the X86
backend is very unhappy with such antics.

This should fix a few more edge case regressions I've spotted going
through the regression test suite to enable the new vector shuffle
lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218839 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 23:14:28 +00:00
Eric Christopher
300743f74a constify the TargetMachine argument used in the subtarget and
lowering constructors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 21:36:28 +00:00
Sanjay Patel
2b918388ab Lower FNEG ( FABS (x) ) -> FNABS (x) [X86 codegen] PR20578
Negative FABS of either a scalar or vector should be handled the same way
on x86 with SSE/AVX: a single OR instruction of the FP operand with a
constant to light up the sign bit(s).

http://llvm.org/bugs/show_bug.cgi?id=20578

Differential Revision: http://reviews.llvm.org/D5201



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218822 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 21:20:06 +00:00
Eric Christopher
406dccea99 Now that the optimization level is adjusting the feature string
before we hit the subtarget, remove the constructor parameter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218817 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 21:05:35 +00:00
Eric Christopher
c9038d9c1b Rework the PPC TargetMachine so that the non-function specific
overrides happen at TargetMachine creation and not on every
subtarget creation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 20:38:26 +00:00
Eric Christopher
2e07dedce3 constify TargetMachine parameter for X86TargetLowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218804 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 20:38:22 +00:00
Sanjay Patel
72447214a6 Don't repeat function/variable name in comment. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218791 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 19:39:32 +00:00
Tim Northover
472f2a056d ARM: allow copying of CPSR when all else fails.
As with x86 and AArch64, certain situations can arise where we need to spill
CPSR in the middle of a calculation. These should be avoided where possible
(MRS/MSR is rather expensive), which ARM is actually better at than the other
two since it tries to Glue defs to uses, but as a last ditch effort, copying is
better than crashing.

rdar://problem/18011155

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218789 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 19:21:03 +00:00
Adrian Prantl
02474a32eb Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218787 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 18:55:02 +00:00
Reed Kotler
befab0a552 Add fptrunc to mips fast-sel
Summary: Implement conversion of 64 to 32 bit floating point numbers (fptrunc) in mips fast-isel

Test Plan:
fptrunc.ll
checked also with 4 internal mips build bot flavors mip32r1/miprs32r2 and at -O0 and -O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: rfuhler

Differential Revision: http://reviews.llvm.org/D5553

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218785 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 18:47:02 +00:00
Adrian Prantl
10c4265675 Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218782 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 18:10:54 +00:00
Adrian Prantl
076fd5dfc1 Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218778 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 17:55:39 +00:00
Tom Stellard
56077f5796 R600: Call EmitFunctionHeader() in the AsmPrinter to populate the ELF symbol table
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 17:15:17 +00:00
Toma Tabacu
a2878ec715 [mips] Rename emit and parse functions for the .cpload assembler directive. NFC.
Summary: It's better if we have a consistent name for .cpload-related functions.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5437

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218768 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 14:53:19 +00:00
Tom Stellard
f7082f9bd7 R600/SI: Add a generic pseudo EXP instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218767 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 14:44:45 +00:00
Tom Stellard
cbb63311cd R600/SI: Add generic pseudo MTBUF instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218766 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 14:44:43 +00:00
Tom Stellard
f69ae4815a R600/SI: Add generic pseudo SMRD instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218765 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 14:44:42 +00:00
Oliver Stannard
9d7038c437 [ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5
Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions
when targeting ARMv8, but they are actually present on any target with
FP-ARMv8. Note that FP-ARMv8 is called FPv5 when is is part of an
M-profile core, but they have the same instructions so we model them
both as FPARMv8 in the ARM backend.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218763 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 13:13:18 +00:00
Chandler Carruth
7d64681274 [x86] Fix a few more tiny patterns with the new vector shuffle lowering
that keep cropping up in the regression test suite.

This also addresses one of the issues raised on the mailing list with
failing to form 'movsd' in as many cases as we realistically should.
There will be corresponding patches forthcoming for v4f32 at least. This
was a lot of fuss for a relatively small gain, but all the fuss was on
my end trying different ways of holding the pieces of the x86 fragment
patterns *just right*. Now that it works, the code is reasonably simple.

In the new test cases I'm adding here, v2i64 sticks out as just plain
horrible. I've not come up with any great ideas here other than that it
would be nice to recognize when we're *going* to take a domain crossing
hit and cross earlier to get the decent instructions. At least with AVX
it is slightly less silly....

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218756 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 11:14:02 +00:00
Chandler Carruth
a1b88ab2c1 [x86] Delete some extraneous logic from the new vector shuffle lowering.
Nothing was relying on this and there are potentially some edge cases
that it would not be correct under. Removing it seems better than trying
to "fix" it as nothing was relying on it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218755 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 11:13:57 +00:00
Tom Coxon
01649dea92 [AArch64] Allow access to all system registers with MRS/MSR instructions.
The A64 instruction set includes a generic register syntax for accessing
implementation-defined system registers. The syntax for these registers is:
    S<op0>_<op1>_<CRn>_<CRm>_<op2>

The encoding space permitted for implementation-defined system registers
is:
    op0 op1  CRn   CRm   op2
    11  xxx  1x11  xxxx  xxx

The full encoding space can now be accessed:
    op0 op1  CRn   CRm   op2
    xx  xxx  xxxx  xxxx  xxx

This is useful to anyone needing to write assembly code supporting new
system registers before the assembler has learned the official names for
them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218753 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 10:13:59 +00:00
Asiri Rathnayake
e9bbacd0a8 Add missing natual vector cast.
Summary: The natual vector cast node (similar to bitcast) AArch64ISD::NVCAST
was introduced in r217159 and r217138. This patch adds a missing cast from
v2f32 to v1i64 which is causing some compilation failures. Also added test
cases to cover various modimm types and BUILD_VECTORs with i64 elements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218751 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 09:59:45 +00:00
Oliver Stannard
ff18b9ff38 [ARM] Add support for Cortex-M7, FPv5-SP and FPv5-DP (LLVM)
The Cortex-M7 has 3 options for its FPU: none, FPv5-SP-D16 and
FPv5-DP-D16. FPv5 has the same instructions as FP-ARMv8, so it can be
modelled using the same target feature, and all double-precision
operations are already disabled by the fp-only-sp target features.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218747 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 09:02:17 +00:00
Daniel Sanders
9a11fba79f [mips] Fix disassembly of [ls][wd]c[23], cache, and pref
Fixes PR21015, and PR20993.                                                       
                                                                                  
Patch by Jun Koi



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218745 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 08:26:55 +00:00
Sasa Stankovic
05a13f0bd0 [mips] For indirect calls we don't need $gp to point to .got. Mips linker
doesn't generate lazy binding stub for a function whose address is taken in
the program.

Differential Revision: http://reviews.llvm.org/D5067


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218744 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 08:22:21 +00:00
Nick Lewycky
b69f873ee1 Fix typo in comment from r218733
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218739 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 03:37:34 +00:00
Chandler Carruth
9e2fe46484 [x86] Teach the new vector shuffle lowering to be even more aggressive
in exposing the scalar value to the broadcast DAG fragment so that we
can catch even reloads and fold them into the broadcast.

This is somewhat magical I'm afraid but seems to work. It is also what
the old lowering did, and I've switched an old test to run both
lowerings demonstrating that we get the same result.

Unlike the old code, I'm not lowering f32 or f64 scalars through this
path when we only have AVX1. The target patterns include pretty heinous
code to re-cast those as shuffles when the scalar happens to not be
spilled because AVX1 provides no broadcast mechanism from registers
what-so-ever. This is terribly brittle. I'd much rather go through our
generic lowering code to get this. If needed, we can add a peephole to
get even more opportunities to broadcast-from-spill-slots that are
exposed post-RA, but my suspicion is this just doesn't matter that much.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 03:19:43 +00:00
Chandler Carruth
429670f0e8 [x86] Hoist the zext-lowering up in the v4i32 lowering routine -- it is
the same speed as pshufd but we can fold loads into the pmovzx
instructions.

This fixes some regressions that came up in the regression test suite
for the new vector shuffle lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218733 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 02:25:54 +00:00
Adam Nemet
d0d5b08fbd [AVX512] Remove space before \t in AsmStrings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218725 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 00:41:32 +00:00
Chandler Carruth
afe75172b1 [x86] Teach the new vector shuffle lowering about VBROADCAST and
VPBROADCAST.

This has the somewhat expected pervasive impact. I don't know why
I forgot about this. Everything seems good with lots of significant
improvements in the tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218724 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 00:41:21 +00:00
Sanjay Patel
cafc85bf1e Split the estimate() interface into separate functions for each type. NFC.
It was hacky to use an opcode as a switch because it won't always match
(rsqrte != sqrte), and it looks like we'll need to add more special casing
per arch than I had hoped for. Eg, x86 will prefer a different NR estimate
implementation. ARM will want to use it's 'step' instructions. There also
don't appear to be any new estimate instructions in any arch in a long,
long time. Altivec vloge and vexpte may have been the first and last in
that field...



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218698 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 20:28:48 +00:00
Juergen Ributzka
9952c922c2 Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Note: This version fixed an issue with the TBZ/TBNZ instructions that were
generated in FastISel. The issue was that the 64bit version of TBZ (TBZX)
automagically sets the upper bit of the immediate field that is used to specify
the bit we want to test. To test for any of the lower 32bits we have to first
extract the subregister and use the 32bit version of the TBZ instruction (TBZW).

Original commit message:
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218693 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 19:59:35 +00:00
Matt Arsenault
28233d3a63 R600/SI: Fix printing of clamp and omod
No tests for omod since nothing uses it yet, but
this should get rid of the remaining annoying trailing
zeros after some instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218692 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 19:49:48 +00:00
Matt Arsenault
532a5c7dc6 R600/SI: Update VOP3b to not include obsolete operands
abs / neg are now part of the srcN_modifiers operands

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218691 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 19:49:43 +00:00
Reed Kotler
8a6f79e58d Add numeric extend, trunctate to mips fast-isel
Summary:
 Add numeric extend, trunctate to mips fast-isel

 Reactivates D4827



Test Plan:
fpext.ll
loadstoreconv.ll

Reviewers: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D5251

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218681 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 16:30:13 +00:00
Tom Coxon
8a23890385 [AArch64] Remove unnecessary whitespace. (Test commit)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218680 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 16:23:16 +00:00
Robert Khasanov
8acdc5232d [AVX512] Added intrinsics for 128-, 256- and 512-bit versions of VCMPGT{BWDQ}.
Patch by Sergey Lisitsyn <sergey.lisitsyn@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218670 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 12:15:52 +00:00
Robert Khasanov
175ff01f0f [AVX512] Added intrinsics for 128- and 256-bit versions of VCMPEQ{BWDQ}
Fixed lowering of this intrinsics in case when mask is v2i1 and v4i1.
Now cmp intrinsics lower in the following way:
 (i8 (int_x86_avx512_mask_pcmpeq_q_128
             (v2i64 %a), (v2i64 %b), (i8 %mask))) ->
 (i8 (bitcast
   (v8i1 (insert_subvector undef,
           (v2i1 (and (PCMPEQM %a, %b),
                      (extract_subvector
                         (v8i1 (bitcast %mask)), 0))), 0))))


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218669 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 11:41:54 +00:00
Robert Khasanov
cfa5724d50 [AVX512] Added intrinsics for VPCMPEQB and VPCMPEQW.
Added new operand type for intrinsics (IIT_V64)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218668 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 11:32:22 +00:00
Robert Khasanov
58da66b2bf [AVX512] Enabled intrinsics for VPCMPEQD and VPCMPEQQ.
Added CMP_MASK intrinsic type


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218667 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 11:19:50 +00:00
Job Noorman
deb16c9eac Make sure aggregates are properly alligned on MSP430.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218665 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 11:15:44 +00:00
Chandler Carruth
4abb04a65c [x86] Revert r218588, r218589, and r218600. These patches were pursuing
a flawed direction and causing miscompiles. Read on for details.

Fundamentally, the premise of this patch series was to map
VECTOR_SHUFFLE DAG nodes into VSELECT DAG nodes for all blends because
we are going to *have* to lower to VSELECT nodes for some blends to
trigger the instruction selection patterns of variable blend
instructions. This doesn't actually work out so well.

In order to match performance with the existing VECTOR_SHUFFLE
lowering code, we would need to re-slice the blend in order to fit it
into either the integer or floating point blends available on the ISA.
When coming from VECTOR_SHUFFLE (or other vNi1 style VSELECT sources)
this works well because the X86 backend ensures that these types of
operands to VSELECT get sign extended into '-1' and '0' for true and
false, allowing us to re-slice the bits in whatever granularity without
changing semantics.

However, if the VSELECT condition comes from some other source, for
example code lowering vector comparisons, it will likely only have the
required bit set -- the high bit. We can't blindly slice up this style
of VSELECT. Reid found some code using Halide that triggers this and I'm
hopeful to eventually get a test case, but I don't need it to understand
why this is A Bad Idea.

There is another aspect that makes this approach flawed. When in
VECTOR_SHUFFLE form, we have very distilled information that represents
the *constant* blend mask. Converting back to a VSELECT form actually
can lose this information, and so I think now that it is better to treat
this as VECTOR_SHUFFLE until the very last moment and only use VSELECT
nodes for instruction selection purposes.

My plan is to:
1) Clean up and formalize the target pre-legalization DAG combine that
   converts a VSELECT with a constant condition operand into
   a VECTOR_SHUFFLE.
2) Remove any fancy lowering from VSELECT during *legalization* relying
   entirely on the DAG combine to catch cases where we can match to an
   immediate-controlled blend instruction.

One additional step that I'm not planning on but would be interested in
others' opinions on: we could add an X86ISD::VSELECT or X86ISD::BLENDV
which encodes a fully legalized VSELECT node. Then it would be easy to
write isel patterns only in terms of this to ensure VECTOR_SHUFFLE
legalization only ever forms the fully legalized construct and we can't
cycle between it and VSELECT combining.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218658 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 02:52:28 +00:00
Matt Arsenault
108eb07f61 Fix missing C++ mode comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218654 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 01:05:27 +00:00
Juergen Ributzka
a0af4b0271 [FastISel][AArch64] Fold sign-/zero-extends into the load instruction.
The sign-/zero-extension of the loaded value can be performed by the memory
instruction for free. If the result of the load has only one use and the use is
a sign-/zero-extend, then we emit the proper load instruction. The extend is
only a register copy and will be optimized away later on.

Other instructions that consume the sign-/zero-extended value are also made
aware of this fact, so they don't fold the extend too.

This fixes rdar://problem/18495928.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218653 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 00:49:58 +00:00
Juergen Ributzka
e8a9706eda [FastISel][AArch64] Factor out scale factor calculation. NFC.
Factor out the code that determines the implicit scale factor of memory
operations for a given value type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218652 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 00:49:54 +00:00
Eric Christopher
11bea1bff3 Simplify conditional.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 23:31:13 +00:00
Adam Nemet
e3d2fcce41 [AVX512] Use X86VectorVTInfo in the masking helper classes and the FMAs
No functionality change.

Makes the code more compact (see the FMA part).

This needs a new type attribute MemOpFrag in X86VectorVTInfo.  For now I only
defined this in the simple cases.  See the commment before the attribute.

Diff of X86.td.expanded before and after is empty except for the appearance of
the new attribute.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218637 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 22:54:41 +00:00
Eric Christopher
6a2169eb6f Add soft-float to the key for the subtarget lookup in the TargetMachine
map, this makes sure that we can compile the same code for two different
ABIs (hard and soft float) in the same module.

Update one testcase accordingly (and fix some confusing naming) and
add a new testcase as well with the ordering swapped which would
highlight the problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218632 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 21:57:54 +00:00
Eric Christopher
200f3764bf Fix spelling and reflow comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218631 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 21:57:52 +00:00
Dave Estes
f657899174 [AArch64] Refines the Cortex-A57 Machine Model
Primarily refines all of the instructions with accurate latency
and micro-op information. Refinements largely focus on the NEON
instructions.

Additionally, a few advanced features are modeled, including
forwarding for MAC instructions and hazards for floating point SQRT
and DIV.

Lastly, the issue-width is reduced to three so that the scheduler
will better accommodate the narrower decode and dispatch width.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218627 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 21:27:36 +00:00
Matt Arsenault
0db97bb9b4 Fix include order
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218611 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 15:53:15 +00:00
Matt Arsenault
1bcadc9b5c R600/SI: Fix hardcoded values for modifiers.
Move enums to SIDefines.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218610 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 15:50:26 +00:00
Matt Arsenault
49cbc1891b R600/SI: Also fix fsub + fadd a, a to mad combines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218609 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 14:59:38 +00:00
Matt Arsenault
a5f45d5444 R600/SI: Fix using mad with multiplies by 2
These turn into fadds, so combine them into the target
mad node.

fadd (fadd (a, a), b) -> mad 2.0, a, b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218608 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 14:59:34 +00:00
Chad Rosier
ea64dce261 [AArch64] Improve cost model to handle sdiv by a pow-of-two.
This patch improves the target-specific cost model to better handle signed
division by a power of two. The immediate result is that this enables the SLP
vectorizer to do a better job.

http://reviews.llvm.org/D5469
PR20714

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218607 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 13:59:31 +00:00
Oliver Stannard
017c6111a8 [Thumb2] ldrexd and strexd are not defined on v7M
The Thumb2 ldrexd and strexd instructions are not defined for
M-class architectures.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218603 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 10:57:29 +00:00
Chandler Carruth
8ac2f142a8 [x86] Make the new vector shuffle lowering lower blends as VSELECT
nodes, and rely exclusively on its logic. This removes a ton of
duplication from the blend lowering and centralizes it in one place.

One downside is that it requires a bunch of hacks to make this work with
the current legalization framework. We have to manually speculate one
aspect of legalizing VSELECT nodes to get everything to work nicely
because the existing legalization framework isn't *actually* bottom-up.

The other grossness is that we somewhat duplicate the analysis of
constant blends. I'm on the fence here. If reviewers thing this would
look better with VSELECT when it has constant operands dumping over tho
VECTOR_SHUFFLE, we could go that way. But it would be a substantial
change because currently all of the actual blend instructions are
matched via patterns in the TD files based around VSELECT nodes (despite
them not being perfect fits for that). Suggestions welcome, but at least
this removes the rampant duplication in the backend.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218600 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 09:57:07 +00:00
Chandler Carruth
d23f1883d3 [x86] Delete a bunch of really bad and totally unnecessary code in the
X86 target-specific DAG combining that tried to convert VSELECT nodes
into VECTOR_SHUFFLE nodes that it "knew" would lower into
immediate-controlled blend nodes.

Turns out, we have perfectly good lowering of all these VSELECT nodes,
and indeed that lowering already knows how to handle lowering through
BLENDI to immediate-controlled blend nodes. The code just wasn't getting
used much because this thing forced the world to go through the vector
shuffle lowering. Yuck.

This also exposes that I was too aggressive in avoiding domain crossing
in v218588 with that lowering -- when the other option is to expand into
two 128-bit vectors, it is worth domain crossing. Restore that behavior
now that we have nice tests covering it.

The test updates here fall into two camps. One is where previously we
ended up with an unsigned encoding of the blend operand and now we get
a signed encoding. In most of those places there were elaborate comments
explaining exactly what these operands really mean. Rather than that,
just switch these tests to use the nicely decoded comments that make it
obvious that the final shuffle matches.

The other updates are just removing pointless domain crossing by
blending integers with PBLENDW rather than BLENDPS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218589 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 02:01:20 +00:00
Chandler Carruth
3589550b3e [x86] Refactor all of the VSELECT-as-blend lowering code to avoid domain
crossing and generally work more like the blend emission code in the new
vector shuffle lowering.

My goal is to have the new vector shuffle lowering just produce VSELECT
nodes that are either matched here to BLENDI or are legal and matched in
the .td files to specific blend instructions. That seems much cleaner as
there are other ways to produce a VSELECT anyways. =]

No *observable* functionality changed yet, mostly because this code
appears to be near-dead. The behavior of this lowering routine did
change though. This code being mostly dead and untestable will change
with my next commit which will also point some new tests at it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 01:32:54 +00:00
Chandler Carruth
b3cf6a65d6 [x86] Improve naming and comments for VSELECT lowering.
No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218586 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 00:51:58 +00:00
Chandler Carruth
8e93ce1780 [x86] Add the dispatch skeleton to the new vector shuffle lowering for
AVX-512.

There is no interesting logic yet. Everything ends up eventually
delegating to the generic code to split the vector and shuffle the
halves. Interestingly, that logic does a significantly better job of
lowering all of these types than the generic vector expansion code does.
Mostly, it lets most of the cases fall back to nice AVX2 code rather
than all the way back to SSE code paths.

Step 2 of basic AVX-512 support in the new vector shuffle lowering. Next
up will be to incrementally add direct support for the basic instruction
set to each type (adding tests first).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218585 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 00:37:27 +00:00
Chandler Carruth
3bc1ba672c [x86] Make the split-and-lower routine fully generic by relaxing the
assertion, making the name generic, and improving the documentation.

Step 1 in adding very primitive support for AVX-512. No functionality
changed yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218584 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 00:21:49 +00:00
Chandler Carruth
b61dfec824 [x86] Teach the new vector shuffle lowering to fall back on AVX-512
vectors.

Someone will need to build the AVX512 lowering, which should follow
AVX1 and AVX2 *very* closely for AVX512F and AVX512BW resp. I've added
a dummy test which is a port of the v8f32 and v8i32 tests from AVX and
AVX2 to v8f64 and v8i64 tests for AVX512F and AVX512BW. Hopefully this
is enough information for someone to implement proper lowering here. If
not, I'll be happy to help, but right now the AVX-512 support isn't
a priority for me.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218583 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 23:53:10 +00:00
Chandler Carruth
4f4280469c [x86] Fix the new vector shuffle lowering's use of VSELECT for AVX2
lowerings.

This was hopelessly broken. First, the x86 backend wants '-1' to be the
element value representing true in a boolean vector, and second the
operand order for VSELECT is backwards from the actual x86 instructions.
To make matters worse, the backend is just using '-1' as the true value
to get the high bit to be set. It doesn't actually symbolically map the
'-1' to anything. But on x86 this isn't quite how it works: there *only*
the high bit is relevant. As a consequence weird non-'-1' values like
0x80 actually "work" once you flip the operands to be backwards.

Anyways, thanks to Hal for helping me sort out what these *should* be.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218582 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 23:23:55 +00:00
Chandler Carruth
3f40848670 [x86] Fix a really silly bug that I introduced fixing another bug in the
new vector shuffle target DAG combines -- it helps to actually test for
the value you want rather than just using an integer in a boolean
context.

Have I mentioned that I loathe implicit conversions recently? :: sigh ::

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218576 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 06:11:04 +00:00
Chandler Carruth
21b69296fb [x86] Fix yet another bug in the new vector shuffle lowering's handling
of widening masks.

We can't widen a zeroing mask unless both elements that would be merged
are either zeroed or undef. This is the only way to widen a mask if it
has a zeroed element.

Also clean up the code here by ordering the checks in a more logical way
and by using the symoblic values for undef and zero. I'm actually torn
on using the symbolic values because the existing code is littered with
the assumption that -1 is undef, and moreover that entries '< 0' are the
special entries. While that works with the values given to these
constants, using the symbolic constants actually makes it a bit more
opaque why this is the case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218575 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-28 03:30:25 +00:00
Chandler Carruth
b66b0cf2eb [x86] Fix yet another issue with widening vector shuffle elements.
I spotted this by inspection when debugging something else, so I have no
test case what-so-ever, and am not even sure it is possible to
realistically trigger the bug. But this is what was intended here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218565 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-27 08:40:33 +00:00
Chandler Carruth
72c3b07dfd [x86] Fix terrible bugs everywhere in the new vector shuffle lowering
and in the target shuffle combining when trying to widen vector
elements.

Previously only one of these was correct, and we didn't correctly
propagate zeroing target shuffle masks (which have a different sentinel
value from undef in non- target shuffle masks now). This isn't just
a missed optimization, this caused us to drop zeroing shuffles on the
floor and miscompile code. The added test case is one example of that.

There are other fixes to the test suite as a consequence of this as well
as restoring the undef elements in some of the masks that were lost when
I brought sanity to the actual *value* of the undef and zero sentinels.

I've also just cleaned up some of the PSHUFD and PSHUFLW and PSHUFHW
combining code, but that code really needs to go. It was a nice initial
attempt, but it isn't very principled and the recursive shuffle combiner
is much more powerful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218562 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-27 04:42:44 +00:00
Chandler Carruth
8470b5b812 [x86] Flip the sentinel values used in the target shuffle mask decoding
to significantly more sane sentinels. Notably, everywhere else in the
backend's representation of shuffles uses '-1' to represent undef. The
target shuffle masks really shouldn't diverge from that, especially as
in a few places they are manipulated by shared code.

This causes us to lose some undef lanes in various test masks. I want to
get these back, but technically it isn't invalid and there are a *lot*
of bugs here so I want to try to establish a saner baseline for fixing
some of the bugs by aligning the specific senitnel values used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218561 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-27 04:42:39 +00:00
Sanjay Patel
676af35b38 Refactor reciprocal and reciprocal square root estimate into target-independent functions (part 2).
This is purely refactoring. No functional changes intended. PowerPC is the only target
that is currently using this interface.

The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this:

z = y / sqrt(x)

into:

z = y * rsqrte(x)

And:

z = y / x

into:

z = y * rcpe(x)

using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 .

There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction
along with the number of refinement steps needed to make the estimate usable.

Differential Revision: http://reviews.llvm.org/D5484



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218553 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 23:01:47 +00:00
Chandler Carruth
0a31a52b91 [x86] Fix a moderately terrifying bug in the new 128-bit shuffle logic
that managed to elude all of my fuzz testing historically. =/

Something changed to allow this code path to actually be exercised and
it was doing bad things. It is especially heavily exercised by the
patterns that emerge when doing AVX shuffles that end up lowered through
the 128-bit code path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218540 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 20:41:45 +00:00
Matt Arsenault
07b7c98d61 R600/SI: Use break instead of continue
If an instruction doesn't have src1, it doesn't have src2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:55:14 +00:00
Matt Arsenault
88416c337b R600/SI: Add a note about the order of the operands to div_scale
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218534 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:55:09 +00:00
Matt Arsenault
508b8db287 R600/SI: Move finding SGPR operand to move to separate function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218533 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:55:06 +00:00
Matt Arsenault
d991d2217b R600/SI Allow same SGPR to be used for multiple operands
Instead of moving the first SGPR that is different than the first,
legalize the operand that requires the fewest moves if one
SGPR is used for multiple operands.

This saves extra moves and is also required for some instructions
which require that the same operand be used for multiple operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218532 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:55:03 +00:00
Matt Arsenault
aed12d4bad R600/SI: Partially move operand legalization to post-isel hook.
Disable the SGPR usage restriction parts of the DAG legalizeOperands.
It now should only be doing immediate folding until it can be replaced
later. The real legalization work is now done by the other
SIInstrInfo::legalizeOperands

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218531 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:59 +00:00
Matt Arsenault
29202835d8 R600/SI: Implement findCommutedOpIndices
The base implementation of commuteInstruction is used
in some cases, but it turns out this has been broken for a
long time since modifiers were inserted between the real operands.

The base implementation of commuteInstruction also fails on immediates,
which also needs to be fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218530 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:54 +00:00
Matt Arsenault
8a70e28114 R600/SI: Don't move operands that are required to be SGPRs
e.g. v_cndmask_b32 requires the condition operand be an SGPR.
If one of the source operands were an SGPR, that would be considered
the one SGPR use and the condition operand would be illegally moved.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218529 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:52 +00:00
Matt Arsenault
5b199b585c R600/SI: Don't assert on exotic operand types
This needs a test, but I'm not sure if it is currently possible and
I originally hit it due to a bug. Right now the only global address
operands have no reason to be VALU instructions, although it
theoretically could be a problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218528 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:46 +00:00
Matt Arsenault
26b2a7834e R600/SI: Fix using wrong operand indices when commuting
No test since the current SIISelLowering::legalizeOperands
effectively hides this, and the general uses seem to only fire
on SALU instructions which don't have modifiers between
the operands.

When trying to use legalizeOperands immediately after
instruction selection, it now sees a lot more patterns
it did not see before which break on this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218527 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:43 +00:00
Matt Arsenault
ea849e9adc R600/SI: Remove apparently dead code in legalizeOperands
No tests hit this, and I don't see any way a GlobalAddress
node would survive beyond lowering on SI. It it would, the
move should probably be inserted by selection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218526 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:54:38 +00:00
Chandler Carruth
a7579ed23f [x86] The mnemonic is SHUFPS not SHUPFS. =[ I'm very bad at spelling
sadly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218524 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:27:40 +00:00
Chandler Carruth
7929a210d5 [x86] In the new vector shuffle lowering, when trying to do another
layer of tie-breaking sorting, it really helps to check that you're in
a tie first. =] Otherwise the whole thing cycles infinitely. Test case
added, another one found through fuzz testing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:24:26 +00:00
Chandler Carruth
7164a4ae0a [x86] Fix a large collection of bugs that crept in as I fleshed out the
AVX support.

New test cases included. Note that none of the existing test cases
covered these buggy code paths. =/ Also, it is clear from this that
SHUFPS and SHUFPD are the most bug prone shuffle instructions in x86. =[

These were all detected by fuzz-testing. (I <3 fuzz testing.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218522 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-26 17:11:02 +00:00