Commit Graph

659 Commits

Author SHA1 Message Date
Nadav Rotem
7ae3bcca45 CostModel: Add tables for the common x86 compares.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167421 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-05 23:48:20 +00:00
Nadav Rotem
b4b04c3fa0 X86 CostModel: Add support for a some of the common arithmetic instructions for SSE4, AVX and AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167347 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-03 00:39:56 +00:00
Nadav Rotem
0c31e43ff3 Add a stub for the x86 cost model impl. Implement a basic cost rule for inserting/extracting from XMM registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167333 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-02 23:27:16 +00:00
Michael Liao
c5c970ee85 Clean up redundant SP register maintained in X86 TLI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167104 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-31 04:14:09 +00:00
Manman Ren
4c74a956b2 X86 MMX: optimize transfer from mmx to i32
We used to generate a store (movq) + a load.
Now we use movd.

rdar://9946746


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167056 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-30 22:15:38 +00:00
Michael Liao
a7554630e9 Add custom UINT_TO_FP from v4i8/v4i16/v8i8/v8i16 to v4f32/v8f32
- Replace v4i8/v8i8 -> v8f32 DAG combine with custom lowering to reduce
  DAG combine overhead.
- Extend the support to v4i16/v8i16 as well.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166487 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 17:36:08 +00:00
Michael Liao
d9d09600ee Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166486 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 17:34:00 +00:00
Michael Liao
facace808c Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86
- If INSERT_VECTOR_ELT is supported (above SSE2, either by custom
  sequence of legal insn), transform BUILD_VECTOR into SHUFFLE +
  INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few
  (so far 1) elements being inserted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166288 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-19 17:15:18 +00:00
Michael Liao
bedcbd433d Support v8f32 to v8i8/vi816 conversion through custom lowering
- Add custom FP_TO_SINT on v8i16 (and v8i8 which is legalized as v8i16 due to
  vector element-wise widening) to reduce DAG combiner and its overhead added
  in X86 backend.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166036 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 18:14:11 +00:00
Michael Liao
6c0e04c823 Add __builtin_setjmp/_longjmp supprt in X86 backend
- Besides used in SjLj exception handling, __builtin_setjmp/__longjmp is also
  used as a light-weight replacement of setjmp/longjmp which are used to
  implementation continuation, user-level threading, and etc. The support added
  in this patch ONLY addresses this usage and is NOT intended to support SjLj
  exception handling as zero-cost DWARF exception handling is used by default
  in X86.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 22:39:43 +00:00
Michael Liao
44c2d61b67 Add support for FP_ROUND from v2f64 to v2f32
- Due to the current matching vector elements constraints in
  ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
  v2f32) is scalarized. Add a customized v2f32 widening to convert it
  into a target-specific X86ISD::VFPROUND to work around this
  constraints.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165631 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 16:53:28 +00:00
Michael Liao
9d796db3e7 Add alternative support for FP_ROUND from v2f32 to v2f64
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
  rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
  to convert it into a target-specific X86ISD::VFPEXT to work around this
  constraints. This patch also reverts a previous attempt to fix this issue by
  recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
  reduces the overhead of supporting non-power-2 vector FP extend.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165625 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 16:32:15 +00:00
Micah Villmow
3574eca1b0 Move TargetData to DataLayout.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165402 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 16:38:25 +00:00
Michael Liao
e5e8f7656a Add missing i64 max/min/umax/umin on 32-bit target
- Turn on atomic6432.ll and add specific test case as well



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164616 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-25 18:08:13 +00:00
Evan Cheng
b1cacc7423 Fix an illegal tailcall opt where the callee returns a double via xmm while caller returns x86_fp80 via st0. rdar://12229511
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164588 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-25 05:32:34 +00:00
Michael Liao
b118a073d7 Re-work X86 code generation of atomic ops with spin-loop
- Rewrite/merge pseudo-atomic instruction emitters to address the
  following issue:
  * Reduce one unnecessary load in spin-loop

    previously the spin-loop looks like

        thisMBB:
        newMBB:
          ld  t1 = [bitinstr.addr]
          op  t2 = t1, [bitinstr.val]
          not t3 = t2  (if Invert)
          mov EAX = t1
          lcs dest = [bitinstr.addr], t3  [EAX is implicit]
          bz  newMBB
          fallthrough -->nextMBB

    the 'ld' at the beginning of newMBB should be lift out of the loop
    as lcs (or CMPXCHG on x86) will load the current memory value into
    EAX. This loop is refined as:

        thisMBB:
          EAX = LOAD [MI.addr]
        mainMBB:
          t1 = OP [MI.val], EAX
          LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined]
          JNE mainMBB
        sinkMBB:

  * Remove immopc as, so far, all pseudo-atomic instructions has
    all-register form only, there is no immedidate operand.

  * Remove unnecessary attributes/modifiers in pseudo-atomic instruction
    td

  * Fix issues in PR13458

- Add comprehensive tests on atomic ops on various data types.
  NOTE: Some of them are turned off due to missing functionality.

- Revise tests due to the new spin-loop generated.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164281 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-20 03:06:15 +00:00
Michael Liao
f966e4e5b3 Add wider vector/integer support for PR12312
- Enhance the fix to PR12312 to support wider integer, such as 256-bit
  integer. If more than 1 fully evaluated vectors are found, POR them
  first followed by the final PTEST.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163832 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-13 20:24:54 +00:00
Craig Topper
55b2405484 Make a bunch of lowering helper functions static instead of member functions. No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163596 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-11 06:15:32 +00:00
Nadav Rotem
d60cb11afd When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
this allows for better code generation.

Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and
FMINC, which are commutative.

For example:

  movaps  %xmm0, %xmm1
  movsd LC(%rip), %xmm0
  minsd %xmm1, %xmm0

becomes:

  minsd LC(%rip), %xmm0




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162187 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-19 13:06:16 +00:00
Craig Topper
c087870c47 Make ReplaceATOMIC_BINARY_64 a static function. Use a nested switch to reduce to only a single call to it thus allowing it to be inlined by the compiler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162088 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-17 06:55:11 +00:00
Michael Liao
7091b2451d fix PR11334
- FP_EXTEND only support extending from vectors with matching elements.
  This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
  extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161894 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 21:24:47 +00:00
Craig Topper
bccc8ce9b8 Remove the LowerMMXCONCAT_VECTORS function. It could never execute because there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161742 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-13 01:23:55 +00:00
Craig Topper
4feb647283 Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161318 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-06 06:22:36 +00:00
Bob Wilson
d49edb7ab0 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161232 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 04:06:28 +00:00
Elena Demikhovsky
1503aba4a0 Added FMA functionality to X86 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161110 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-01 12:06:00 +00:00
Bill Wendling
2127c9b93a Remove tabs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160479 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-19 00:15:11 +00:00
Evan Cheng
70e10d3fe4 This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160346 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 06:53:39 +00:00
Benjamin Kramer
b9bee04995 Add intrinsics for Ivy Bridge's rdrand instruction.
The rdrand/cmov sequence is the same that is emitted by both
GCC and ICC.

Fixes PR13284.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160117 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 09:31:43 +00:00
Craig Topper
2a5dc43bd9 Use XOP vpcom intrinsics in patterns instead of a target specific SDNode type. Remove the custom lowering code that selected the SDNode type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 17:02:24 +00:00
Hans Wennborg
f0234fcbc9 Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157818 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 16:27:21 +00:00
Justin Holewinski
d2ea0e10cb Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157479 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 16:35:28 +00:00
Benjamin Kramer
17c836c4b5 X86: Don't emit conditional floating point moves on when targeting pre-pentiumpro architectures.
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.

Fixes PR6679. Patch by Christoph Erhardt!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155704 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 12:07:43 +00:00
Craig Topper
8325c11d47 Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154782 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 00:41:45 +00:00
Elena Demikhovsky
73c504af9d Added VPERM optimization for AVX2 shuffles
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154761 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-15 11:18:59 +00:00
Richard Smith
42fc29e717 Fix X86 codegen for 'atomicrmw nand' to generate *x = ~(*x & y), not *x = ~*x & y.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154705 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 22:47:00 +00:00
Nadav Rotem
e611378a6e Reapply 154396 after fixing a test.
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 06:40:27 +00:00
Eric Christopher
a139051654 Temporarily revert this patch to see if it brings the buildbots back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154425 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 19:33:16 +00:00
Nadav Rotem
50e64cfe6e Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 14:33:13 +00:00
Evan Cheng
bf010eb911 Fix a long standing tail call optimization bug. When a libcall is emitted
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154370 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 01:51:00 +00:00
Nadav Rotem
154819dd6f Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154310 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09 07:45:58 +00:00
Rafael Espindola
26c8dcc692 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154011 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 12:51:34 +00:00
Evan Cheng
4bfcd4acbc Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151645 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-28 18:51:51 +00:00
Daniel Dunbar
20bd5296ce Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151630 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-28 15:36:07 +00:00
Evan Cheng
ec52aaa12f Some ARM implementaions, e.g. A-series, does return stack prediction. That is,
the processor keeps a return addresses stack (RAS) which stores the address
and the instruction execution state of the instruction after a function-call
type branch instruction.

Calling a "noreturn" function with normal call instructions (e.g. bl) can
corrupt RAS and causes 100% return misprediction so LLVM should use a
unconditional branch instead. i.e.
mov lr, pc
b _foo
The "mov lr, pc" is issued in order to get proper backtrace.

rdar://8979299


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151623 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-28 06:42:03 +00:00
NAKAMURA Takumi
9a68fdc7f8 Target/X86: Fix assertion failures and warnings caused by r151382 _ftol2 lowering for i386-*-win32 targets. Patch by Joe Groff.
[Joe Groff] Hi everyone. My previous patch applied as r151382 had a few problems:
Clang raised a warning, and X86 LowerOperation would assert out for
fptoui f64 to i32 because it improperly lowered to an illegal
BUILD_PAIR. Here's a patch that addresses these issues. Let me know if
any other changes are necessary. Thanks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151432 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-25 03:37:25 +00:00
Michael J. Spencer
1a2d061ec0 Add WIN_FTOL_* psudo-instructions to model the unique calling convention
used by the Win32 _ftol2 runtime function. Patch by Joe Groff!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151382 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-24 19:01:22 +00:00
Craig Topper
44d23825d6 Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-22 05:59:10 +00:00
Craig Topper
5aaffa8470 Make a bunch of X86ISelLowering shuffle functions static now that they are no longer needed by isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150908 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-19 02:53:47 +00:00
Craig Topper
5b209e84f4 Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149807 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-05 03:14:49 +00:00
Elena Demikhovsky
dcabc7bca9 Optimization for SIGN_EXTEND operation on AVX.
Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32
extensions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149600 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-02 09:10:43 +00:00
Elena Demikhovsky
3ae98150e3 Optimization for "truncate" operation on AVX.
Truncating v4i64 -> v4i32 and v8i32 -> v8i16 may be done with set of shuffles.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149485 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-01 07:56:44 +00:00
Craig Topper
86c7c583a3 Move some XOP patterns into instruction definition. Replae VPCMOV intrinsic patterns with custom lowering to a target specific nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149216 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-30 01:10:15 +00:00
Craig Topper
1906d32e55 Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148670 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-22 23:36:02 +00:00
Craig Topper
67609fd0eb Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types. Simplifying opcode selection and pattern matching.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148667 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-22 22:42:16 +00:00
Craig Topper
ed2e13d667 Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148664 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-22 19:15:14 +00:00
Craig Topper
6a32b6f0c0 Remove unused X86 ISD node type defines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148644 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-22 01:15:56 +00:00
Craig Topper
1a7700a3fa Merge 128-bit and 256-bit SHUFPS/SHUFPD handling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148466 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-19 08:19:12 +00:00
Victor Umansky
435d0bd09d Reverted commit #147601 upon Evan's request.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147748 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-08 17:20:33 +00:00
Victor Umansky
19d8559019 Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX.
Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)

Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147601 91177308-0d34-0410-b5e6-96231b3b80d8
2012-01-05 08:46:19 +00:00
Craig Topper
b3982da7d2 Merge X86 SHUFPS and SHUFPD node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147394 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-31 23:50:21 +00:00
Chandler Carruth
acc068e873 Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the
X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:

  (sizeof(x)*8 - 1) ^ __builtin_clz(x)

Which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.

The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.

Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.

These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improve on in this
type of code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147244 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-24 10:55:54 +00:00
Craig Topper
ab44d3cf49 Remove an unused X86ISD node type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146833 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-17 19:16:44 +00:00
Craig Topper
94438ba538 Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146726 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-16 08:06:31 +00:00
Craig Topper
d93e4c3496 Remove some remants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate since the only caller already had done the cast.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146344 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-11 19:12:35 +00:00
Craig Topper
34671b812a Merge floating point and integer UNPCK X86ISD node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145926 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 08:21:25 +00:00
Craig Topper
ec24e61ab0 Merge VPERM2F128/VPERM2I128 ISD node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145485 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 07:47:51 +00:00
Craig Topper
316cd2a2c5 Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145483 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 06:25:25 +00:00
Craig Topper
70b883b3a7 Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145238 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-28 10:14:51 +00:00
Craig Topper
38034c568c Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to canonalizing that occurs when shuffle nodes are created.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145153 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-26 22:55:48 +00:00
Craig Topper
06cb680779 Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145148 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-26 20:47:44 +00:00
Craig Topper
705f2431a0 Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type disinquish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145126 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-24 22:57:10 +00:00
Craig Topper
f475a55bd4 Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145125 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-24 22:20:08 +00:00
Craig Topper
6fa583d787 Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145028 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-21 08:26:50 +00:00
Craig Topper
6347e8662c Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145026 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-21 06:57:39 +00:00
Craig Topper
54f952afac Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144989 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 09:02:40 +00:00
Craig Topper
3113384a34 Collapse X86 PSIGNB/PSIGNW/PSIGND node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144988 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 07:33:10 +00:00
Lang Hames
15701f8969 Rename NonScalarIntSafe to something more appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143080 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-26 23:50:43 +00:00
Craig Topper
b4c945716f Remove intrinsics for X86 BLSI, BLSMSK, and BLSR intrinsics and replace with custom isel lowering code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142642 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-21 06:55:01 +00:00
Craig Topper
54a11176f6 Add X86 ANDN instruction. Including instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141947 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-14 07:06:56 +00:00
Duncan Sands
17470bee5f Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from
floating point add/sub of appropriate shuffle vectors.  Does not
synthesize the 256 bit AVX versions because they work differently.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140332 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-22 20:15:48 +00:00
Nadav Rotem
fbad25e120 CR fixes per Bruno's request.
Undo the changes from r139285 which added custom lowering to vselect.
Add tablegen lowering for vselect.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139479 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-11 15:02:23 +00:00
Nadav Rotem
8ffad56f8e Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139400 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-09 20:29:17 +00:00
Nadav Rotem
ffe3e7da84 Add X86-SSE4 codegen support for vector-select.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139285 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 08:11:19 +00:00
Rafael Espindola
5c984df26b Fix comment. Noticed by Duncan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139161 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-06 19:29:31 +00:00
Duncan Sands
28b77e968d Add codegen support for vector select (in the IR this means a select
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons.  Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all").  Patch mostly by
Nadav Rotem.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139159 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-06 19:07:46 +00:00
Duncan Sands
4a544a79bd Split the init.trampoline intrinsic, which currently combines GCC's
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC.  While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function.  To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function.  Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!).  Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC.  Patch mostly by Sanjoy Das.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139140 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-06 13:37:06 +00:00
Rafael Espindola
151ab3e2f7 Adds support for variable sized allocas. For a variable sized alloca,
code is inserted to first check if the current stacklet has enough
space. If so, space is allocated by simply decrementing the stack
pointer. Otherwise a runtime routine (__morestack_allocate_stack_space
in libgcc) is called which allocates the required memory from the
heap.

Patch by Sanjoy Das.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138818 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-30 19:47:04 +00:00
Rafael Espindola
d07b7ec772 Adds a SelectionDAG node X86SegAlloca which will be custom lowered
from DYNAMIC_STACKALLOC.

Two new pseudo instructions (SEG_ALLOCA_32 and SEG_ALLOCA_64) which
will match X86SegAlloca (based on word size) are also added.  They
will be custom emitted to inject the actual stack handling code.

Patch by Sanjoy Das.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138814 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-30 19:43:21 +00:00
Eli Friedman
43f51aeca8 Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138660 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-26 21:21:21 +00:00
Craig Topper
13894fa135 Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138427 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-24 06:14:18 +00:00
Bruno Cardoso Lopes
0e6d230abd Introduce matching patterns for vbroadcast AVX instruction. The idea is to
match splats in the form (splat (scalar_to_vector (load ...))) whenever
the load can be folded. All the logic and instruction emission is
working but because of PR8156, there are no ways to match loads, cause
they can never be folded for splats. Thus, the tests are XFAILed, but
I've tested and exercised all the logic using a relaxed version for
checking the foldable loads, as if the bug was already fixed. This
should work out of the box once PR8156 gets fixed since MayFoldLoad will
work as expected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137810 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes
53cae1362d The VPERM2F128 is a AVX instruction which permutes between two 256-bit
vectors. It operates on 128-bit elements instead of regular scalar
types. Recognize shuffles that are suitable for VPERM2F128 and teach
the x86 legalizer how to handle them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137519 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-12 21:48:26 +00:00
Bruno Cardoso Lopes
9065d4b65f Cleanup PALIGNR handling and remove the old palign pattern fragment.
Also make PALIGNR masks to don't match 256-bits, which isn't supported
It's also a step to solve PR10489

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136448 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 01:30:59 +00:00
Eli Friedman
1464846801 Code generation for 'fence' instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136283 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 22:21:52 +00:00
Bruno Cardoso Lopes
cea34e41fa The vpermilps and vpermilpd have different behaviour regarding the
usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part while in the later both lanes have independent
masks. Handle this properly and and add support for vpermilpd.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136200 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:56:34 +00:00
Bruno Cardoso Lopes
4ea496846a Recognize unpckh* masks and match 256-bit versions. The new versions are
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136157 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:03:40 +00:00
Bruno Cardoso Lopes
9123c6fea0 More movsldup/movshdup cleanup. Rewrite the mask matching function and add
support for 256-bit versions (but no instruction selection yet, coming next).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136050 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 02:39:28 +00:00
Bruno Cardoso Lopes
6683efb4cd -Inspected a AVX code block added by someone in early Feb. This was never used
and was actually very wrong, fix it and make it simpler. Also remove the
ConcatVectors function, which is unused now.

- Fix a introduction of useless nodes in r126664 and r126264. The
VUNPCKL* should never be introduced cause we don't want duplicate
nodes for 128 AVX and non-AVX modes, the actual instruction
difference only exists during isel, but not for target specific DAG
nodes. We only introduce V* target nodes when there is no 128-bit
version already there.

- Fix a fragile test and make it more useful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135729 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 00:15:07 +00:00
Bruno Cardoso Lopes
65b74e1d00 Add support for 256-bit versions of VPERMIL instruction. This is a new
instruction introduced in AVX, which can operate on 128 and 256-bit vectors.
It considers a 256-bit vector as two independent 128-bit lanes. It can permute
any 32 or 64 elements inside a lane, and restricts the second lane to
have the same permutation of the first one. With the improved splat support
introduced early today, adding codegen for this instruction enable more
efficient 256-bit code:

Instead of:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vextractf128  $1, %ymm0, %xmm1
  shufps  $1, %xmm1, %xmm1
  movss %xmm1, 28(%rsp)
  movss %xmm1, 24(%rsp)
  movss %xmm1, 20(%rsp)
  movss %xmm1, 16(%rsp)
  vextractf128  $0, %ymm0, %xmm0
  shufps  $1, %xmm0, %xmm0
  movss %xmm0, 12(%rsp)
  movss %xmm0, 8(%rsp)
  movss %xmm0, 4(%rsp)
  movss %xmm0, (%rsp)
  vmovaps (%rsp), %ymm0
We get:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vpermilps $85, %ymm0, %ymm0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135662 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-21 01:55:47 +00:00
Chris Lattner
db125cfaf5 land David Blaikie's patch to de-constify Type, with a few tweaks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135375 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-18 04:54:35 +00:00
Nadav Rotem
d0f3ef807e [VECTOR-SELECT]
During type legalization we often use the SIGN_EXTEND_INREG SDNode.
When this SDNode is legalized during the LegalizeVector phase, it is
scalarized because non-simple types are automatically marked to be expanded.
In this patch we add support for lowering SIGN_EXTEND_INREG manually.
This fixes CodeGen/X86/vec_sext.ll when running with the '-promote-elements'
flag.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135144 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-14 11:11:14 +00:00
Bruno Cardoso Lopes
c1af4772f1 The target specific node PANDN name is misleading. That happens because
it's later selected to a ANDNPD/ANDNPS instruction instead of the PANDN
instruction. Rename it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135087 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 21:36:47 +00:00
Eric Christopher
d176af8cf3 Use getRegForInlineAsmConstraint instead of custom defining regclasses
via vectors.

Part of rdar://9643582


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134079 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-29 17:23:50 +00:00
Evan Cheng
ef41ff618f Remove TargetOptions.h dependency from X86Subtarget.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133726 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-23 17:54:54 +00:00
Eric Christopher
471e422480 Add a parameter to CCState so that it can access the MachineFunction.
No functional change.

Part of PR6965


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132763 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-08 23:55:35 +00:00
Stuart Hastings
f99a4b82a4 Followup to 132458, omit unnecessary stack copy when x87 input is a
load.  rdar://problem/6373334


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132696 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-06 23:15:58 +00:00
Stuart Hastings
865f09334f Reapply 132424 with fixes. This fixes PR10068.
rdar://problem/5993888


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132606 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-03 23:53:54 +00:00
Eric Christopher
100c833416 Have LowerOperandForConstraint handle multiple character constraints.
Part of rdar://9119939


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132510 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-02 23:16:42 +00:00
Rafael Espindola
251b4a0405 Revert 132424 to fix PR10068.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132479 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-02 19:57:47 +00:00
Stuart Hastings
ec880283b3 Recommit 132404 with fixes. rdar://problem/5993888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132424 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 21:33:14 +00:00
Stuart Hastings
4abc5fea9c Revert 132404 to appease a buildbot. rdar://problem/5993888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132419 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 19:52:20 +00:00
Stuart Hastings
10ff0bbdfb Add support for x86 CMPEQSS and friends. These instructions do a
floating-point comparison, generate a mask of 0s or 1s, and generally
DTRT with NaNs.  Only profitable when the user wants a materialized 0
or 1 at runtime.  rdar://problem/5993888


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132404 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 17:17:45 +00:00
Stuart Hastings
4fd0dee3bf FGETSIGN support for x86, using movmskps/pd. Will be enabled with a
patch to TargetLowering.cpp.  rdar://problem/5660695


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132388 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 04:39:42 +00:00
Stuart Hastings
2aa0f23e1c Reverting 132105: it broke some LLVM-GCC DejaGNU tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132108 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-26 04:09:49 +00:00
Stuart Hastings
aa4e6afc9b Correctly handle a one-word struct passed byval on x86_64.
rdar://problem/6920088


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132105 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-26 02:44:56 +00:00
Eli Friedman
b8e0d3412c Clean up the mess created by r131467+r131469.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131471 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-17 18:02:22 +00:00
Stuart Hastings
6db2c2fe21 Revert 131467 due to buildbot complaint.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131469 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-17 16:59:46 +00:00
Stuart Hastings
504421e327 Fix an obscure issue in X86_64 parameter passing: if a tiny byval is
passed as the fifth parameter, insure it's passed correctly (in R9).
rdar://problem/6920088


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131467 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-17 16:45:55 +00:00
Nadav Rotem
4301222525 Add custom lowering of X86 vector SRA/SRL/SHL when the shift amount is a splat vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131179 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-11 08:12:09 +00:00
Eli Friedman
fc5d305597 Make the logic for determining function alignment more explicit. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131012 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-06 20:34:06 +00:00
Evan Cheng
485fafc840 Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127981 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-21 01:19:09 +00:00
Daniel Dunbar
7a90e04fc7 Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors
to canonicalize IR", it broke a lot of things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127954 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 21:47:14 +00:00
Evan Cheng
ae16d6b972 SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR
to have single return block (at least getting there) for optimizations. This
is general goodness but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch(x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
}

=>
LBB0_2:                                 ## %sw.bb
  callq   _f1
  popq    %rbp
  ret
LBB0_3:                                 ## %sw.bb1
  callq   _f2
  popq    %rbp
  ret
LBB0_4:                                 ## %sw.bb3
  callq   _f3
  popq    %rbp
  ret

This patch teaches codegenprep to duplicate returns when the return value
is a phi and where the phi operands are produced by tail calls followed by
an unconditional branch:

sw.bb7:                                           ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9:                                           ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0

This allows codegen to generate better code like this:

LBB0_2:                                 ## %sw.bb
        jmp     _f1                     ## TAILCALL
LBB0_3:                                 ## %sw.bb1
        jmp     _f2                     ## TAILCALL
LBB0_4:                                 ## %sw.bb3
        jmp     _f3                     ## TAILCALL

rdar://9147433


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 17:17:39 +00:00
Cameron Zwarich
7bbf0ee97c Move more logic into getTypeForExtArgOrReturn.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127809 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-17 14:53:37 +00:00
Cameron Zwarich
4457968011 Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127807 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-17 14:21:56 +00:00
Cameron Zwarich
ebe8173941 The x86-64 ABI says that a bool is only guaranteed to be sign-extended to a byte
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.

This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127766 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-16 22:20:18 +00:00
Cameron Zwarich
be2119e8e2 Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127175 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-07 21:56:36 +00:00
Owen Anderson
95771afbfd Allow targets to specify a the type of the RHS of a shift parameterized on the type of the LHS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126518 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-25 21:41:48 +00:00
David Greene
fbf05d32b4 [AVX] General VUNPCKL codegen support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126264 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-22 23:31:46 +00:00
David Greene
ccacdc1952 [AVX] Support VSINSERTF128 with more patterns and appropriate
infrastructure.  This makes lowering 256-bit vectors to 128-bit
vectors simple when 256-bit vector support is not available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124868 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-04 16:08:29 +00:00
David Greene
c38a03eeca [AVX] VEXTRACTF128 support. This commit includes patterns for
matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines
to examine and translate index values.  VINSERTF128 comes next.  With
these two in place we can begin supporting more AVX operations as
INSERT/EXTRACT can be used as a fallback when 256-bit support is not
available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124797 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-03 15:50:00 +00:00
David Greene
cfe33c46aa [AVX] Add INSERT_SUBVECTOR and support it on x86. This provides a
default implementation for x86, going through the stack in a similr
fashion to how the codegen implements BUILD_VECTOR.  Eventually this
will get matched to VINSERTF128 if AVX is available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124307 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-26 19:13:22 +00:00
David Greene
91585098ef [AVX] Support EXTRACT_SUBVECTOR on x86. This provides a default
implementation of EXTRACT_SUBVECTOR for x86, going through the stack
in a similr fashion to how the codegen implements BUILD_VECTOR.
Eventually this will get matched to VEXTRACTF128 if AVX is available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124292 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-26 15:38:49 +00:00
Nate Begeman
672fb6225b Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert
lowering to use it.  Hopefully the pattern fragment is doing the right thing with XMM0, looks correct in testing.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122277 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 22:04:24 +00:00
Chris Lattner
5b85654844 Change the X86 backend to stop using the evil ADDC/ADDE/SUBC/SUBE nodes (which
their carry depenedencies with MVT::Flag operands) and use clean and beautiful
EFLAGS dependences instead.

We do this by changing the modelling of SBB/ADC to have EFLAGS input and outputs
(which is what requires the previous scheduler change) and change X86 ISelLowering
to custom lower ADDC and friends down to X86ISD::ADD/ADC/SUB/SBB nodes.

With the previous series of changes, this causes no changes in the testsuite, woo.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122213 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 00:59:46 +00:00
Chris Lattner
c19d1c3ba2 improve the setcc -> setcc_carry optimization to happen more
consistently by moving it out of lowering into dag combine.

Add some missing patterns for matching away extended versions of setcc_c.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122201 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 22:08:31 +00:00
Nate Begeman
b65c175d32 Add support for matching psign & plendvb to the x86 target
Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122098 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 22:55:37 +00:00
Chris Lattner
b20e0b1fdd it turns out that when ".with.overflow" intrinsics were added to the X86
backend that they were all implemented except umul.  This one fell back
to the default implementation that did a hi/lo multiply and compared the
top.  Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test.  Now we compile:

void *func(long count) {
      return new int[count];
}

into:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
	testb	%cl, %cl                ## encoding: [0x84,0xc9]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

instead of:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120935 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 07:30:36 +00:00
Evan Cheng
3d2125c9db Enable sibling call optimization of libcalls which are expanded during
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120501 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 23:55:39 +00:00
Eric Christopher
82be220092 Fix some cleanups from my last patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120410 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 08:10:28 +00:00
Eric Christopher
228232b282 Rewrite mwait and monitor support and custom lower arguments.
Fixes PR8573.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120404 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 07:20:12 +00:00
Rafael Espindola
5bf7c534cf Lower TLS_addr32 and TLS_addr64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120225 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-27 20:43:02 +00:00
Wesley Peck
bf17cfa3f9 Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119990 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-23 03:31:01 +00:00
Duncan Sands
59d2dad59e On X86, MEMBARRIER, MFENCE, SFENCE, LFENCE are not target memory intrinsics,
so don't claim they are.  They are allocated using DAG.getNode, so attempts
to access MemSDNode fields results in reading off the end of the allocated
memory.  This fixes crashes with "llc -debug" due to debug code trying to
print MemSDNode fields for these barrier nodes (since the crashes are not
deterministic, use valgrind to see this).  Add some nasty checking to try
to catch this kind of thing in the future.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119901 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-20 11:25:00 +00:00
Chris Lattner
142b531e02 move the pic base symbol stuff up to MachineFunction
since it is trivial and will be shared between ppc and x86.
This substantially simplifies the X86 backend also.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119089 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-14 22:48:15 +00:00
Chris Lattner
4fd0ea0166 simplify getPICBaseSymbol a bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119088 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-14 22:37:11 +00:00
Duncan Sands
4590766580 Factorize the duplicated logic for choosing the right argument
calling convention out of the fast and normal ISel files, and
into the calling convention TD file.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117856 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-31 13:21:44 +00:00
John Thompson
44ab89eb37 Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117667 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-29 17:29:13 +00:00
Michael J. Spencer
e9c253e0bc X86: Add alloca probing to dynamic alloca on Windows. Fixes PR8424.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116984 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-21 01:41:01 +00:00
Michael J. Spencer
6e56b18e57 Fix Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116972 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-20 23:40:27 +00:00
Dan Gohman
320afb8c81 Initial va_arg support for x86-64. Patch by David Meyer!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116319 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-12 18:00:49 +00:00
Dale Johannesen
0488fb649a Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result casted back to x86_mmx, provided the result feeds into a
previous existing x86_mmx operation.

The point of all this is prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115243 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-30 23:57:10 +00:00
Chris Lattner
f93b90c5df reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114529 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 04:39:11 +00:00
Chris Lattner
492a43e6f6 convert the last 4 X86ISD nodes that should have memoperands to have them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114523 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 01:28:21 +00:00
Chris Lattner
2156b79c49 give X86ISD::FNSTCW16m a memoperand, since it touches memory. It only
can access the stack due to how it is generated though.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114522 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 01:11:26 +00:00
Chris Lattner
0729093cd7 give FP_TO_INT16_IN_MEM and friends a memoperand. They are only
used with stack slots, but hey, lets be safe.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114521 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 01:05:16 +00:00
Chris Lattner
8864155a35 give VZEXT_LOAD a memory operand, it now works with segment registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114515 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 00:34:38 +00:00
Chris Lattner
93c4a5bef7 give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114508 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 23:59:42 +00:00
Owen Anderson
bc146b0a4d Reimplement r114460 in target-independent DAGCombine rather than target-dependent, by using
the predicate to discover the number of sign bits.  Enhance X86's target lowering to provide
a useful response to this query.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114473 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 20:42:50 +00:00
John Thompson
eac6e1d0c7 Added skeleton for inline asm multiple alternative constraint support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113766 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-13 18:15:37 +00:00
Bruno Cardoso Lopes
56098f5d26 Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112694 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes
f2db5b48d0 Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112642 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes
bf8154a439 Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111704 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes
3157ef1c13 This is the first step towards refactoring the x86 vector shuffle code. The
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.

The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111691 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-20 22:55:05 +00:00
Bruno Cardoso Lopes
045573ce21 Add AVX matching patterns to Packed Bit Test intrinsics.
Apply the same approach of SSE4.1 ptest intrinsics but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.

This is slightly different from what the "ptest" does.
Tests comming with the other 256 intrinsics tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110744 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-10 23:25:42 +00:00
Nate Begeman
bdcb5afb77 ~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109549 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-27 22:37:06 +00:00
Evan Cheng
dee81010eb On x86, f32 / f64 nodes share the same registers as 128-bit vector values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109450 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-26 21:50:05 +00:00
Evan Cheng
70017e44cd Add an ILP scheduler. This is a register pressure aware scheduler that's
appropriate for targets without detailed instruction iterineries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.

On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109300 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-24 00:39:05 +00:00
Eric Christopher
9a9d275dc7 Custom lower the memory barrier instructions and add support
for lowering without sse2.  Add a couple of new testcases.

Fixes a few libgomp tests and latent bugs.  Remove a few todos.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109078 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-22 02:48:34 +00:00
Eric Christopher
dab4dac2a0 Pulling out previous patch, must've run the tests in
the wrong directory.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109005 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-21 09:23:56 +00:00
Eric Christopher
87f41370a8 Lower MEMBARRIER on x86 and support processors without SSE2.
Fixes a pile of libgomp failures in the llvm-gcc testsuite due
to the libcall not existing.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109004 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-21 09:05:23 +00:00
Jakob Stoklund Olesen
b5378ea12e Use TargetOpcode::COPY instead of X86-native register copy instructions when
lowering atomics. This will allow those copies to still be coalesced after
TII::isMoveInstr is removed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108385 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-14 23:50:27 +00:00
Dan Gohman
84023e0fbe Reapply bottom-up fast-isel, with several fixes for x86-32:
- Check getBytesToPopOnReturn().
 - Eschew ST0 and ST1 for return values.
 - Fix the PIC base register initialization so that it doesn't ever
   fail to end up the top of the entry block.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108039 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-10 09:00:22 +00:00
Bob Wilson
02266e29f9 --- Reverse-merging r107947 into '.':
U    utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U    test/CodeGen/X86/fast-isel.ll
U    test/CodeGen/X86/fast-isel-loads.ll
U    include/llvm/Target/TargetLowering.h
U    include/llvm/Support/PassNameParser.h
U    include/llvm/CodeGen/FunctionLoweringInfo.h
U    include/llvm/CodeGen/CallingConvLower.h
U    include/llvm/CodeGen/FastISel.h
U    include/llvm/CodeGen/SelectionDAGISel.h
U    lib/CodeGen/LLVMTargetMachine.cpp
U    lib/CodeGen/CallingConvLower.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U    lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U    lib/CodeGen/SelectionDAG/FastISel.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U    lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U    lib/CodeGen/SelectionDAG/TargetLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.h
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86ISelLowering.h


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107987 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-09 16:37:18 +00:00
Dan Gohman
bf87e24917 Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107943 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-09 00:39:23 +00:00
Dan Gohman
f595141525 Revert 107840 107839 107813 107804 107800 107797 107791.
Debug info intrinsics win for now.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107850 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-08 01:00:56 +00:00
Dan Gohman
f423a69839 Add X86FastISel support for return statements. This entails refactoring
a bunch of stuff, to allow the target-independent calling convention
logic to be employed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107800 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-07 18:32:53 +00:00
Dan Gohman
a4160c3434 Simplify FastISel's constructor by giving it a FunctionLoweringInfo
instance, rather than pointers to all of FunctionLoweringInfo's
members.

This eliminates an NDEBUG ABI sensitivity.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107789 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-07 16:29:44 +00:00
Dan Gohman
c9403659a9 Split the SDValue out of OutputArg so that SelectionDAG-independent
code can do calling-convention queries. This obviates OutputArgReg.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107786 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-07 15:54:55 +00:00
Dan Gohman
c9af33c685 CanLowerReturn doesn't need a SelectionDAG; it just needs an LLVMContext.
SelectBasicBlock doesn't needs its BasicBlock argument.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107712 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-06 22:19:37 +00:00
Eric Christopher
f7a0c7bf8b Fix up -fstack-protector on linux to use the segment
registers.  Split out testcases per architecture and os
now.

Patch from Nelson Elhage.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107640 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-06 05:18:56 +00:00
Dale Johannesen
1784d160e4 The hasMemory argument is irrelevant to how the argument
for an "i" constraint should get lowered; PR 6309.  While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106893 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-25 21:55:36 +00:00
Eric Christopher
30ef0e5658 Add first pass at darwin tls compiler support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105381 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-03 04:07:48 +00:00
Dale Johannesen
7d07b48b26 Fix i64->f64 conversion, x86-64, -no-sse. A bit
tricky since there's a 3rd 64-bit type, MMX vectors.
PR 7135.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104308 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-21 00:52:33 +00:00
Dan Gohman
ff7a562751 Implement a bunch more TargetSelectionDAGInfo infrastructure.
Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and
EmitTargetCodeForMemmove out of TargetLowering and into
SelectionDAGInfo to exercise this.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103481 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-11 17:31:57 +00:00
Dan Gohman
419e4f9263 Remove the TargetLowering::getSubtarget() virtual function, which
was unused. TargetMachine::getSubtarget() is used instead.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103474 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-11 16:21:03 +00:00
Dan Gohman
af1d8ca44a Get rid of the EdgeMapping map. Instead, just check for BasicBlock
changes before doing phi lowering for switches.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102809 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-01 00:01:06 +00:00
Evan Cheng
552f09a0d7 Promoting 16-bit cmp / test aren't free. Don't do it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102366 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-26 19:06:11 +00:00
Evan Cheng
962021bc7f - Move TargetLowering::EmitTargetCodeForFrameDebugValue to TargetInstrInfo and rename it to emitFrameIndexDebugValue.
- Teach spiller to modify DBG_VALUE instructions to reference spill slots.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102323 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-26 07:38:55 +00:00
Dale Johannesen
f822e733af Stop abusing EmitInstrWithCustomInserter for target-dependent
form of DEBUG_VALUE, as it doesn't have reasonable default
behavior for unsupported targets.  Add a new hook instead.
No functional change.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102320 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-25 21:33:54 +00:00
Evan Cheng
2808ccb775 Fix X86ISD::CMP i16 to i32 promotion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102192 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-23 18:21:16 +00:00
Dan Gohman
f81eca0ab9 Move HandlePHINodesInSuccessorBlocks functions out of SelectionDAGISel
and into SelectionDAGBuilder and FastISel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102123 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-22 20:46:50 +00:00
Evan Cheng
5528e7bcb1 isel (i32 anyext i16) as insert_subreg when 16-bit ops are being promoted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101979 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-21 01:47:12 +00:00
Dan Gohman
d858e90f03 Use const qualifiers with TargetLowering. This eliminates several
const_casts, and it reinforces the design of the Target classes being
immutable.

SelectionDAGISel::IsLegalToFold is now a static member function, because
PIC16 uses it in an unconventional way. There is more room for API
cleanup here.

And PIC16's AsmPrinter no longer uses TargetLowering.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101635 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-17 15:26:15 +00:00
Dan Gohman
1e93df6f0b Move per-function state out of TargetLowering subclasses and into
MachineFunctionInfo subclasses.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101634 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-17 14:41:14 +00:00
Evan Cheng
e5b51ac770 More work to allow dag combiner to promote 16-bit ops to 32-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101621 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-17 06:13:15 +00:00
Dan Gohman
37f32ee7ff Eliminate an unnecessary SelectionDAG dependency in getOptimalMemOpType.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101531 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-16 20:11:05 +00:00
Evan Cheng
64b7bf71e8 Adding support for dag combiner to promote operations for profit. This requires target specific queries. For example, x86 should promote i16 to i32 when it does not impact load folding.
x86 support is off by default. It can be enabled with -promote-16bit.

Work in progress.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101448 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-16 06:14:10 +00:00
Dan Gohman
46510a73e9 Add const qualifiers to CodeGen's use of LLVM IR constructs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101334 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-15 01:51:59 +00:00
Dan Gohman
2520864773 Factor out EH landing pad code into a separate function, and constify
a bunch of stuff to support it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101273 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-14 19:53:31 +00:00