Commit Graph

391 Commits

Nadav Rotem
4301222525 Add custom lowering of X86 vector SRA/SRL/SHL when the shift amount is a splat vector.
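
For illustration only (not part of the commit message; function name hypothetical), the splat-shift-amount case this lowering targets looks like:

define <4 x i32> @shl_splat(<4 x i32> %v) nounwind readnone {
entry:
  ; the shift amount is a splat vector: every lane shifts by 5
  %s = shl <4 x i32> %v, <i32 5, i32 5, i32 5, i32 5>
  ret <4 x i32> %s
}
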
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131179 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-11 08:12:09 +00:00
Eli Friedman
fc5d305597 Make the logic for determining function alignment more explicit. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131012 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-06 20:34:06 +00:00
Evan Cheng
485fafc840 Re-apply r127953 with fixes: eliminate the empty return block if it has no predecessors; update the dominator tree if the CFG is modified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127981 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-21 01:19:09 +00:00
Daniel Dunbar
7a90e04fc7 Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors
to canonicalize IR", it broke a lot of things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127954 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 21:47:14 +00:00
Evan Cheng
ae16d6b972 SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR
to have a single return block (or at least get closer to one) for optimizations. This
is general goodness, but it would prevent some tailcall optimizations.
One specific case is code like this:
int f1(void);
int f2(void);
int f3(void);
int f4(void);
int f5(void);
int f6(void);
int foo(int x) {
  switch(x) {
  case 1: return f1();
  case 2: return f2();
  case 3: return f3();
  case 4: return f4();
  case 5: return f5();
  case 6: return f6();
  }
  return 0;
}

=>
LBB0_2:                                 ## %sw.bb
  callq   _f1
  popq    %rbp
  ret
LBB0_3:                                 ## %sw.bb1
  callq   _f2
  popq    %rbp
  ret
LBB0_4:                                 ## %sw.bb3
  callq   _f3
  popq    %rbp
  ret

This patch teaches codegenprep to duplicate returns when the return value
is a phi whose operands are produced by tail calls, each followed by
an unconditional branch:

sw.bb7:                                           ; preds = %entry
  %call8 = tail call i32 @f5() nounwind
  br label %return
sw.bb9:                                           ; preds = %entry
  %call10 = tail call i32 @f6() nounwind
  br label %return
return:
  %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
  ret i32 %retval.0

This allows codegen to generate better code like this:

LBB0_2:                                 ## %sw.bb
        jmp     _f1                     ## TAILCALL
LBB0_3:                                 ## %sw.bb1
        jmp     _f2                     ## TAILCALL
LBB0_4:                                 ## %sw.bb3
        jmp     _f3                     ## TAILCALL

rdar://9147433


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127953 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-19 17:17:39 +00:00
Cameron Zwarich
7bbf0ee97c Move more logic into getTypeForExtArgOrReturn.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127809 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-17 14:53:37 +00:00
Cameron Zwarich
4457968011 Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127807 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-17 14:21:56 +00:00
Cameron Zwarich
ebe8173941 The x86-64 ABI says that a bool is only guaranteed to be zero-extended to a byte
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.

This just fixes the zext at the return. We still insert an i32 AssertZext when
reading a function's arguments, but it is followed by a truncate and another i8
AssertZext, so it is not optimized.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127766 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-16 22:20:18 +00:00
Cameron Zwarich
be2119e8e2 Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127175 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-07 21:56:36 +00:00
Owen Anderson
95771afbfd Allow targets to specify the type of the RHS of a shift, parameterized on the type of the LHS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126518 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-25 21:41:48 +00:00
David Greene
fbf05d32b4 [AVX] General VUNPCKL codegen support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126264 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-22 23:31:46 +00:00
David Greene
ccacdc1952 [AVX] Support VINSERTF128 with more patterns and appropriate
infrastructure.  This makes lowering 256-bit vectors to 128-bit
vectors simple when 256-bit vector support is not available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124868 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-04 16:08:29 +00:00
David Greene
c38a03eeca [AVX] VEXTRACTF128 support. This commit includes patterns for
matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines
to examine and translate index values.  VINSERTF128 comes next.  With
these two in place we can begin supporting more AVX operations as
INSERT/EXTRACT can be used as a fallback when 256-bit support is not
available.
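
As an illustrative sketch (function name hypothetical), extracting the upper 128-bit half of a 256-bit vector is the kind of EXTRACT_SUBVECTOR that can match VEXTRACTF128 under AVX:

define <4 x float> @upper_half(<8 x float> %v) nounwind readnone {
entry:
  ; elements 4..7 are the upper 128 bits of the 256-bit input
  %h = shufflevector <8 x float> %v, <8 x float> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
  ret <4 x float> %h
}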


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124797 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-03 15:50:00 +00:00
David Greene
cfe33c46aa [AVX] Add INSERT_SUBVECTOR and support it on x86. This provides a
default implementation for x86, going through the stack in a similar
fashion to how the codegen implements BUILD_VECTOR.  Eventually this
will get matched to VINSERTF128 if AVX is available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124307 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-26 19:13:22 +00:00
David Greene
91585098ef [AVX] Support EXTRACT_SUBVECTOR on x86. This provides a default
implementation of EXTRACT_SUBVECTOR for x86, going through the stack
in a similar fashion to how the codegen implements BUILD_VECTOR.
Eventually this will get matched to VEXTRACTF128 if AVX is available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124292 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-26 15:38:49 +00:00
Nate Begeman
672fb6225b Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert
lowering to use it.  Hopefully the pattern fragment is doing the right thing with XMM0; it looks correct in testing.
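
A minimal IR sketch (illustrative; assumes the SSE4.1 intrinsic that this node wraps):

declare <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8>, <16 x i8>, <16 x i8>) nounwind readnone

define <16 x i8> @blend(<16 x i8> %a, <16 x i8> %b, <16 x i8> %mask) nounwind readnone {
entry:
  ; takes bytes from %b where the mask byte's sign bit is set;
  ; the hardware instruction requires the mask in XMM0
  %r = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> %a, <16 x i8> %b, <16 x i8> %mask)
  ret <16 x i8> %r
}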


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122277 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 22:04:24 +00:00
Chris Lattner
5b85654844 Change the X86 backend to stop using the evil ADDC/ADDE/SUBC/SUBE nodes (which
model their carry dependencies with MVT::Flag operands) and use clean and beautiful
EFLAGS dependencies instead.

We do this by changing the modelling of SBB/ADC to have EFLAGS input and outputs
(which is what requires the previous scheduler change) and change X86 ISelLowering
to custom lower ADDC and friends down to X86ISD::ADD/ADC/SUB/SBB nodes.

With the previous series of changes, this causes no changes in the testsuite, woo.
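
One familiar place the old carry chain appeared is a 128-bit add, which legalizes to an add/adc pair; a minimal sketch (not from the commit):

define i128 @add128(i128 %a, i128 %b) nounwind readnone {
entry:
  ; legalizes to a 64-bit add plus adc; the carry between them is now
  ; modeled as an EFLAGS dependency instead of an MVT::Flag operand
  %s = add i128 %a, %b
  ret i128 %s
}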




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122213 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 00:59:46 +00:00
Chris Lattner
c19d1c3ba2 improve the setcc -> setcc_carry optimization to happen more
consistently by moving it out of lowering into DAG combine.

Add some missing patterns for matching away extended versions of setcc_c.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122201 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 22:08:31 +00:00
Nate Begeman
b65c175d32 Add support for matching psign & pblendvb to the x86 target.
Remove unnecessary pandn patterns; the 'vnot' patfrag looks through bitcasts.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122098 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 22:55:37 +00:00
Chris Lattner
b20e0b1fdd it turns out that when ".with.overflow" intrinsics were added to the X86
backend, they were all implemented except umul.  This one fell back
to the default implementation that did a hi/lo multiply and compared the
top.  Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test.  Now we compile:

void *func(long count) {
      return new int[count];
}

into:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
	testb	%cl, %cl                ## encoding: [0x84,0xc9]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

instead of:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.
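
At the IR level the pattern reduces to the umul.with.overflow intrinsic; a hand-written equivalent of the multiply-and-clamp above (illustrative; function name hypothetical):

declare { i64, i1 } @llvm.umul.with.overflow.i64(i64, i64) nounwind readnone

define i64 @array_bytes(i64 %count) nounwind readnone {
entry:
  ; %count * 4, with the overflow bit taken from the mul itself
  %t = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %count, i64 4)
  %val = extractvalue { i64, i1 } %t, 0
  %ovf = extractvalue { i64, i1 } %t, 1
  ; on overflow, return -1, the sentinel seen in the assembly above
  %res = select i1 %ovf, i64 -1, i64 %val
  ret i64 %res
}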




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120935 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 07:30:36 +00:00
Evan Cheng
3d2125c9db Enable sibling call optimization of libcalls which are expanded during
legalization. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target-specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777
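
A hedged example of the shape involved: on 32-bit x86, this i64 division is expanded to a __divdi3 libcall during legalization, which the hook can now turn into a sibcall (jmp __divdi3):

define i64 @quot(i64 %a, i64 %b) nounwind readnone {
entry:
  ; no 64-bit divide instruction on i386: legalization emits a call
  ; to the __divdi3 runtime function, now eligible for sibcall
  %q = sdiv i64 %a, %b
  ret i64 %q
}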


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120501 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 23:55:39 +00:00
Eric Christopher
82be220092 Clean up a few things from my last patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120410 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 08:10:28 +00:00
Eric Christopher
228232b282 Rewrite mwait and monitor support and custom lower arguments.
Fixes PR8573.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120404 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 07:20:12 +00:00
Rafael Espindola
5bf7c534cf Lower TLS_addr32 and TLS_addr64.
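
For context (an illustrative sketch), these pseudo-instructions implement the general-dynamic TLS model for IR such as:

@x = thread_local global i32 0

define i32* @addr_of_x() nounwind {
entry:
  ; taking the address of a thread_local global lowers to
  ; TLS_addr32 or TLS_addr64, i.e. a call to __tls_get_addr
  ret i32* @x
}
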
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120225 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-27 20:43:02 +00:00
Wesley Peck
bf17cfa3f9 Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119990 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-23 03:31:01 +00:00
Duncan Sands
59d2dad59e On X86, MEMBARRIER, MFENCE, SFENCE, LFENCE are not target memory intrinsics,
so don't claim they are.  They are allocated using DAG.getNode, so attempts
to access MemSDNode fields result in reading off the end of the allocated
memory.  This fixes crashes with "llc -debug" due to debug code trying to
print MemSDNode fields for these barrier nodes (since the crashes are not
deterministic, use valgrind to see this).  Add some nasty checking to try
to catch this kind of thing in the future.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119901 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-20 11:25:00 +00:00
Chris Lattner
142b531e02 move the PIC base symbol stuff up to MachineFunction
since it is trivial and will be shared between PPC and X86.
This substantially simplifies the X86 backend also.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119089 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-14 22:48:15 +00:00
Chris Lattner
4fd0ea0166 simplify getPICBaseSymbol a bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119088 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-14 22:37:11 +00:00
Duncan Sands
4590766580 Factorize the duplicated logic for choosing the right argument
calling convention out of the fast and normal ISel files, and
into the calling convention TD file.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117856 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-31 13:21:44 +00:00
John Thompson
44ab89eb37 Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117667 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-29 17:29:13 +00:00
Michael J. Spencer
e9c253e0bc X86: Add alloca probing to dynamic alloca on Windows. Fixes PR8424.
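
A minimal sketch of the construct affected (names hypothetical); a dynamic alloca must touch the stack page-by-page on Windows so the guard page is hit in order:

declare void @use(i8*)

define void @scratch(i32 %n) nounwind {
entry:
  ; variable-sized stack allocation: on Windows this now emits a
  ; stack probe instead of one large stack-pointer adjustment
  %p = alloca i8, i32 %n
  call void @use(i8* %p)
  ret void
}
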
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116984 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-21 01:41:01 +00:00
Michael J. Spencer
6e56b18e57 Fix Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116972 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-20 23:40:27 +00:00
Dan Gohman
320afb8c81 Initial va_arg support for x86-64. Patch by David Meyer!
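
A hedged sketch of the instruction being lowered; the struct below assumes the x86-64 ABI's __va_list_tag layout:

%va_list = type { i32, i32, i8*, i8* }   ; gp_offset, fp_offset, overflow_arg_area, reg_save_area

declare void @llvm.va_start(i8*)
declare void @llvm.va_end(i8*)

define i32 @first_vararg(i32 %count, ...) nounwind {
entry:
  %ap = alloca %va_list
  %ap.i8 = bitcast %va_list* %ap to i8*
  call void @llvm.va_start(i8* %ap.i8)
  ; the va_arg instruction this patch teaches x86-64 to lower
  %v = va_arg %va_list* %ap, i32
  call void @llvm.va_end(i8* %ap.i8)
  ret i32 %v
}
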
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116319 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-12 18:00:49 +00:00
Dale Johannesen
0488fb649a Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result cast back to x86_mmx, provided the result feeds into a
pre-existing x86_mmx operation.

The point of all this is to prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.
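
A minimal sketch of the intended usage (assuming the post-rewrite MMX intrinsic names):

declare x86_mmx @llvm.x86.mmx.padd.d(x86_mmx, x86_mmx) nounwind readnone

define x86_mmx @paddd(x86_mmx %a, x86_mmx %b) nounwind readnone {
entry:
  ; x86_mmx values are only created and consumed by loads, stores,
  ; bitcasts, and MMX intrinsics, so optimizations cannot touch them
  %r = tail call x86_mmx @llvm.x86.mmx.padd.d(x86_mmx %a, x86_mmx %b)
  ret x86_mmx %r
}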



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115243 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-30 23:57:10 +00:00
Chris Lattner
f93b90c5df reimplement ELF TLS support in terms of addressing modes, eliminating SegmentBaseAddress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114529 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 04:39:11 +00:00
Chris Lattner
492a43e6f6 convert the last 4 X86ISD nodes that should have memoperands to have them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114523 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 01:28:21 +00:00
Chris Lattner
2156b79c49 give X86ISD::FNSTCW16m a memoperand, since it touches memory. It can only
access the stack, though, due to how it is generated.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114522 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 01:11:26 +00:00
Chris Lattner
0729093cd7 give FP_TO_INT16_IN_MEM and friends a memoperand. They are only
used with stack slots, but hey, let's be safe.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114521 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 01:05:16 +00:00
Chris Lattner
8864155a35 give VZEXT_LOAD a memory operand, it now works with segment registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114515 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 00:34:38 +00:00
Chris Lattner
93c4a5bef7 give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114508 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 23:59:42 +00:00
Owen Anderson
bc146b0a4d Reimplement r114460 in target-independent DAGCombine rather than target-dependent, by using
the predicate to discover the number of sign bits.  Enhance X86's target lowering to provide
a useful response to this query.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114473 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 20:42:50 +00:00
John Thompson
eac6e1d0c7 Added skeleton for inline asm multiple alternative constraint support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113766 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-13 18:15:37 +00:00
Bruno Cardoso Lopes
56098f5d26 Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112694 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes
f2db5b48d0 Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
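
For illustration, the shuffle that corresponds to MOVLHPS (the mask picks the low 64-bit halves of both inputs; function name hypothetical):

define <4 x float> @movlhps(<4 x float> %a, <4 x float> %b) nounwind readnone {
entry:
  ; result = a[0], a[1], b[0], b[1], exactly what MOVLHPS produces
  %s = shufflevector <4 x float> %a, <4 x float> %b, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
  ret <4 x float> %s
}
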
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112642 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes
bf8154a439 Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111704 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes
3157ef1c13 This is the first step towards refactoring the x86 vector shuffle code. The
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.

The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111691 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-20 22:55:05 +00:00
Bruno Cardoso Lopes
045573ce21 Add AVX matching patterns to Packed Bit Test intrinsics.
Apply the same approach as the SSE4.1 ptest intrinsics, but
create a new x86 node "testp", since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on the sign-bit AND and ANDN of packed floating-point sources.

This is slightly different from what "ptest" does.
Tests coming with the other 256-bit intrinsics tests.
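
For comparison, a sketch with the existing SSE4.1 ptest intrinsic whose ZF behavior "testp" generalizes (function name hypothetical):

declare i32 @llvm.x86.sse41.ptestz(<2 x i64>, <2 x i64>) nounwind readnone

define i1 @masked_all_zero(<2 x i64> %a, <2 x i64> %mask) nounwind readnone {
entry:
  ; ptestz returns ZF: whether (%a AND %mask) is all zeros
  %z = call i32 @llvm.x86.sse41.ptestz(<2 x i64> %a, <2 x i64> %mask)
  %r = icmp ne i32 %z, 0
  ret i1 %r
}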


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110744 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-10 23:25:42 +00:00
Nate Begeman
bdcb5afb77 ~40% faster vector shl <4 x i32> on SSE 4.1.  Larger improvements for smaller types coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109549 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-27 22:37:06 +00:00
Evan Cheng
dee81010eb On x86, f32 / f64 nodes share the same registers as 128-bit vector values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109450 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-26 21:50:05 +00:00
Evan Cheng
70017e44cd Add an ILP scheduler. This is a register-pressure-aware scheduler that's
appropriate for targets without detailed instruction itineraries.
The scheduler schedules for increased instruction-level parallelism in
low register pressure situations; it schedules to reduce register pressure
when the register pressure becomes high.

On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109300 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-24 00:39:05 +00:00