Commit Graph

2052 Commits

Craig Topper
880ef45a14 Move setOperationAction for CONCAT_VECTORS for 256-bit vectors into loop since all 256-bit types are supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161730 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-11 22:34:26 +00:00
Michael Liao
2a33cec66a add X86-specific DAG optimization to simplify boolean test
- if a boolean test (X86ISD::CMP or X86ISD::SUB) checks a boolean value
  generated from X86ISD::SETCC, try to simplify the boolean value
  generation and checking by reusing the original EFLAGS with the proper
  condition code
- add hooks to the X86-specific SETCC/BRCOND/CMOV lowering, the three major
  places consuming EFLAGS

Part of a patch series fixing PR12312.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161687 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-10 19:58:13 +00:00
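A minimal C++ sketch (illustrative, not taken from the patch) of the source-level shape this combine targets, where a flag is materialized by SETCC and then tested again:

// Before the combine: cmp; setl %al; test %al,%al; cmovne/jne.
// After the combine: the CMOV/branch reuses the original EFLAGS directly.
int select_smaller(int a, int b) {
  bool lt = (a < b);   // X86ISD::SETCC materializes the flag into a bool
  return lt ? a : b;   // the re-test of that bool is what gets simplified
}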
Michael Liao
f6c24eef62 remove trailing whitespaces and test commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161664 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-10 14:39:24 +00:00
Joerg Sonnenberger
78cab947cf Add some missing includes for the build against stdcxx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161657 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-10 10:53:56 +00:00
Manman Ren
39ad568c62 X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8, i16, i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed the pattern matching of (a>b) ? (a-b):0 and the like, since they are
handled by the peephole now.

rdar://11873276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161462 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-08 00:51:41 +00:00
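The (a>b) ? (a-b):0 pattern mentioned in the commit, written out as a hedged C++ illustration; with this change the compare and the subtract share one SUB, and the peephole turns it back into CMP when the result is unused:

// Saturating subtraction: the comparison and the subtraction can be CSE'd
// into a single SUB once CMP is lowered as SUB.
unsigned sat_sub(unsigned a, unsigned b) {
  return (a > b) ? (a - b) : 0;
}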
Evan Cheng
b64dd5f2b5 X86 cmp lowering is looking past a truncate on the condition node. It should only
do so when the high bits are known to be zero. This caused a subtle miscompilation.

rdar://12027825 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161451 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-07 22:21:00 +00:00
Craig Topper
4feb647283 Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161318 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-06 06:22:36 +00:00
Craig Topper
cc915951eb Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161306 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-05 00:36:57 +00:00
Craig Topper
638aa687d4 Use a COPY node instead of an explicit MOVA opcode in the custom inserter for pcmpestrm/pcmpistrm. Allows the register allocator to handle it better and prevents wasted identity moves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161305 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-05 00:17:48 +00:00
Bob Wilson
d49edb7ab0 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161232 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 04:06:28 +00:00
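An illustrative example (not from the patch) of the kind of builtin call that fast isel now hands off to SelectionDAG, where TargetLibraryInfo can decide whether a library implementation actually exists:

#include <cstring>

// memcpy is a compiler builtin; on a target without a C library it must be
// lowered to inline code rather than an unconditional libcall.
void copy16(char *dst, const char *src) {
  std::memcpy(dst, src, 16);
}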
Chad Rosier
a20e1e7ef5 Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161122 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-01 18:39:17 +00:00
Elena Demikhovsky
1503aba4a0 Added FMA functionality to X86 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161110 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-01 12:06:00 +00:00
Rafael Espindola
1cee71099c When a return struct pointer is passed in registers, the callee has nothing
to pop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160725 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 13:41:10 +00:00
Sylvestre Ledru
c8e41c5917 Fix a typo (the the => the)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160621 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-23 08:51:15 +00:00
Evan Cheng
a9e13ba3c8 Back out r160101 and instead implement a DAG combine to recover from the instcombine transformation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160387 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 18:54:11 +00:00
Evan Cheng
f5c0539092 Implement r160312 as a target-independent DAG combine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160354 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 08:31:11 +00:00
Evan Cheng
70e10d3fe4 This is another case where instcombine's demanded-bits optimization created
large immediates. Add DAG combine logic to recover in case the large
immediate doesn't fit in the cmp immediate operand field.

int foo(unsigned long l) {
  return (l >> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160346 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 06:53:39 +00:00
Evan Cheng
98819c9d1e For something like
uint32_t hi(uint64_t res)
{
        uint32_t hi = res >> 32;
        return !hi;
}

llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
  %lnot = icmp ult i64 %res, 4294967296
  %lnot.ext = zext i1 %lnot to i32
  ret i32 %lnot.ext
}

The optimizer has optimized away the right shift and truncate, but the
resulting constant is too large to fit in the 32-bit immediate field. The
x86 code is worse as a result:
        movabsq $4294967296, %rax       ## imm = 0x100000000
        cmpq    %rax, %rdi
        sbbl    %eax, %eax
        andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
        shrq    $32, %rdi
        testl   %edi, %edi
        sete    %al
        movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits
set, i.e. X > 0x0ffffffff -> (X >> 32) >= 1.

rdar://11866926


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 19:35:43 +00:00
Nadav Rotem
d896e24299 Teach getTargetVShiftNode about TargetConstant nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160234 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 20:27:43 +00:00
Nadav Rotem
65f489fd7d AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128-bit vector with the same element type as the input vector.
This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160222 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-14 22:26:05 +00:00
Benjamin Kramer
feae00a68e Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch them.
I already had the necessary things in place for IR-level passes but missed the machine passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160137 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 18:14:57 +00:00
Benjamin Kramer
b9bee04995 Add intrinsics for Ivy Bridge's rdrand instruction.
The rdrand/cmov sequence is the same as the one emitted by both
GCC and ICC.

Fixes PR13284.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160117 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 09:31:43 +00:00
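A usage sketch at the C++ level, assuming the usual <immintrin.h> wrapper for the new intrinsic (compile with -mrdrnd); the backend emits the rdrand/cmov sequence described above:

#include <immintrin.h>

// _rdrand32_step returns 1 and stores a random value on success, or 0 when
// the hardware had no value ready (carry flag clear), so callers retry.
unsigned hardware_random() {
  unsigned v = 0;
  while (!_rdrand32_step(&v)) {
    // retry until the DRNG delivers a value
  }
  return v;
}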
Nadav Rotem
5cd95e1478 When ext-loading and trunc-storing vectors to memory, on x86 32-bit systems, allow loads/stores of 64-bit values from xmm registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160044 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 13:27:05 +00:00
Nadav Rotem
2dd83eb1ab Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, where the
migration of bitcasts happened too late in the SelectionDAG process.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159991 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 13:25:08 +00:00
Jakob Stoklund Olesen
85dccf18ea Make X86 call and return instructions non-variadic.
Function argument and return value registers aren't part of the
encoding, so they should be implicit operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159728 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-04 23:53:27 +00:00
Jakob Stoklund Olesen
b872078701 Ensure CopyToReg nodes are always glued to the call instruction.
The CopyToReg nodes that set up the argument registers before a call
must be glued to the call instruction. Otherwise, the scheduler may emit
the physreg copies long before the call, causing long live ranges for
the fixed registers.

Besides disabling good register allocation, that can also expose
problems when EmitInstrWithCustomInserter() splits a basic block during
the live range of a physreg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-04 19:28:31 +00:00
Elena Demikhovsky
8f40f7b867 Optimize a shuffle node that can fit the register form of the VBROADCAST instruction on AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159504 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-01 06:12:26 +00:00
Rafael Espindola
94e3b388e5 In the initial-exec mode we always do a load to find the address of a variable.
Before this patch, in PIC 32-bit code we would add the global base register
but not load from that address. This is a really old bug, but before the
introduction of the TLS attributes we would never select initial exec for
PIC code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-29 04:22:35 +00:00
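An illustrative initial-exec access (the tls_model attribute here forces the model, matching the TLS attributes mentioned in the commit): the variable's offset is loaded from the GOT, and in 32-bit PIC code that load must go through the global base register.

// GCC/Clang attribute pinning this variable to the initial-exec TLS model.
__attribute__((tls_model("initial-exec"))) __thread int counter;

int bump() {
  return ++counter;   // offset loaded from the GOT, then a %gs/%fs access
}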
Elena Demikhovsky
fcb0946833 Removed unused variable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159197 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 10:50:07 +00:00
Bill Wendling
a44489d5b5 Rename to match other X86_64* names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 10:05:06 +00:00
Elena Demikhovsky
1596373671 Shuffle optimization for AVX/AVX2.
The current patch optimizes frequently used shuffle patterns and gives the following instruction sequence reduction.
Before:
      vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
       vextractf128    $1, %ymm1, %xmm1
       vextractf128    $1, %ymm0, %xmm0
       vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
       vinsertf128     $1, %xmm0, %ymm2, %ymm0
After:
      vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
      vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
      vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159188 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 08:04:10 +00:00
Eli Friedman
52d418df5d Make some ugly hacks for inline asm operands which name a specific register a bit more thorough. PR13196.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159176 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 23:42:33 +00:00
Jakob Stoklund Olesen
82d58b147f %RCX is not a function live-out in eh.return functions.
The function live-out registers must be live at all function returns,
and %RCX is only used by eh.return. When a function also has a normal
return, only %RAX holds a return value.

This fixes PR13188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159116 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 15:53:01 +00:00
Pete Cooper
6e2db65266 Remove code I'd been testing with but didn't mean to commit. Oops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159094 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 00:08:36 +00:00
Pete Cooper
b49998d76c DAG legalisation can now handle illegal fma vector types by scalarisation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159092 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 00:05:44 +00:00
Rafael Espindola
ce0a5cda8a Handle aliases to TLS variables on all architectures, not just x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159058 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 00:30:03 +00:00
Craig Topper
703c38bf58 Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow the possibility of covering more cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158792 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 05:39:26 +00:00
Rafael Espindola
d6b43a317e Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158692 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 00:48:28 +00:00
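For context, a hedged example of a static constructor this affects; with the new llc option enabled, its pointer is emitted into .init_array instead of .ctors:

#include <cstdio>

// Registered in .ctors by default on X86; in .init_array when the new
// TargetOptions setting (exposed as an llc flag) is turned on.
__attribute__((constructor))
static void announce() {
  std::puts("runs before main");
}

int main() { return 0; }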
Craig Topper
2a5dc43bd9 Use XOP vpcom intrinsics in patterns instead of a target-specific SDNode type. Remove the custom lowering code that selected the SDNode type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 17:02:24 +00:00
Craig Topper
c29106b36f Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158278 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 16:46:13 +00:00
Manman Ren
45d53b866e Enable optimization for integer ABS on X86 if the subtarget has CMOV.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158220 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 18:58:26 +00:00
Manman Ren
9236362a64 X86: optimize generated code for integer ABS
This patch will generate the following for integer ABS:
      movl    %edi, %eax
      negl    %eax
      cmovll  %edi, %eax
INSTEAD OF
      movl    %edi, %ecx
      sarl    $31, %ecx
      leal    (%rdi,%rcx), %eax
      xorl    %ecx, %eax

There exists a target-independent DAG combine for integer ABS, which converts
integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. 
This is implemented in PerformXorCombine.

rdar://10695237


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158175 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 22:39:10 +00:00
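The kind of source that produces this pattern, as a small illustration: the generic DAG combine turns the select into sar+add+xor, and PerformXorCombine now matches it back to neg+cmov on X86.

// Integer absolute value; with this patch X86 emits the mov/neg/cmov form.
int iabs(int x) {
  return x < 0 ? -x : x;
}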
Nadav Rotem
bdcae38256 Do not optimize the used bits of the x86 vselect condition operand when the condition operand is a vector of 1-bit predicates.
This may happen on MIC devices.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158168 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 20:53:48 +00:00
Manman Ren
e6fc9d40b3 PR13046: we can't replace a use of SUB with CMP in the lowering phase.
It would cause an assertion failure later on.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158160 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 19:27:33 +00:00
Manman Ren
87253c2ebd X86: replace SUB with CMP if possible
This patch will optimize the following
    movq    %rdi, %rax
    subq    %rsi, %rax
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax
to
    cmpq    %rsi, %rdi
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158126 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 00:42:47 +00:00
Benjamin Kramer
d9b0b02561 Fix typos found by http://github.com/lyda/misspell-check
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157885 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-02 10:20:22 +00:00
Hans Wennborg
f0234fcbc9 Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic accesses on
an execution path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157818 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 16:27:21 +00:00
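An illustrative case the clean-up pass improves: two local-dynamic accesses on one execution path share a single TLS-block base computation (one __tls_get_addr call) instead of recomputing it per access.

// 'static' __thread variables in PIC code typically get the local-dynamic
// model; the module's TLS block base is computed once and reused for both.
static __thread int hits;
static __thread int misses;

int total() {
  return hits + misses;   // one base-address lookup, two offsets
}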
Jakob Stoklund Olesen
ee66b417ef Add support for return value promotion in X86 calling conventions.
Patch by Yiannis Tsiouris!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157757 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 17:28:20 +00:00
Justin Holewinski
d2ea0e10cb Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157479 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 16:35:28 +00:00
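A generic, self-contained sketch of the refactoring pattern described (a parameter object); the names are illustrative stand-ins, not the actual LLVM struct:

#include <string>
#include <vector>

// Bundling the call-lowering state: a field added later does not force a
// signature change in every target's override or call site.
struct CallLoweringArgs {
  std::string CalleeName;
  std::vector<long> OutgoingArgs;
  bool IsVarArg = false;
  bool IsTailCall = false;   // new fields can simply be appended here
};

// A target hook takes the whole struct instead of a long parameter list.
long lowerCall(const CallLoweringArgs &Args) {
  return static_cast<long>(Args.OutgoingArgs.size());   // placeholder body
}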
Craig Topper
85b9e56bac Fix constant used for pshufb mask when lowering v16i8 shuffles. Bug introduced in r157043. Fixes PR12908.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157236 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-22 06:09:38 +00:00