Commit Graph

644 Commits

Author SHA1 Message Date
Arnold Schwaighofer
30e62c098b Tail call optimization improvements:
Move platform independent code (lowering of possibly overwritten
arguments, check for tail call optimization eligibility) from
target X86ISelectionLowering.cpp to TargetLowering.h and
SelectionDAGISel.cpp.

Initial PowerPC tail call implementation:

Support ppc32 implemented and tested (passes my tests and
test-suite llvm-test).  
Support ppc64 implemented and half tested (passes my tests).
On ppc tail call optimization is performed if 
  caller and callee are fastcc
  call is a tail call (in tail call position, call followed by ret)
  no variable argument lists or byval arguments
  option -tailcallopt is enabled
Supported:
 * non pic tail calls on linux/darwin
 * module-local tail calls on linux(PIC/GOT)/darwin(PIC)
 * inter-module tail calls on darwin(PIC)
If constraints are not met a normal call will be emitted.

A test checking the argument lowering behaviour on x86-64 was added.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50477 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-30 09:16:33 +00:00
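
The eligibility conditions listed in commit 30e62c098b can be read as a single predicate. The following is a minimal C++ sketch of that check, not the code from the patch: `CallSite`, `CallConv`, and `isEligibleForPPCTailCall` are hypothetical names invented for illustration, and the platform restrictions under "Supported" are left out.

```cpp
#include <cassert>

// Hypothetical, simplified view of a call site; these are not LLVM types.
enum class CallConv { C, Fast };

struct CallSite {
  CallConv callerCC;    // calling convention of the calling function
  CallConv calleeCC;    // calling convention of the called function
  bool inTailPosition;  // the call is immediately followed by ret
  bool isVarArg;        // the callee takes a variable argument list
  bool hasByValArg;     // at least one argument is passed byval
};

// Returns true only when every condition from the commit message holds.
bool isEligibleForPPCTailCall(const CallSite &cs, bool tailCallOptEnabled) {
  return tailCallOptEnabled &&
         cs.callerCC == CallConv::Fast && cs.calleeCC == CallConv::Fast &&
         cs.inTailPosition && !cs.isVarArg && !cs.hasByValArg;
}

// Small usage check: flipping any single condition makes the call ineligible.
void demoTailCallEligibility() {
  CallSite ok{CallConv::Fast, CallConv::Fast, true, false, false};
  assert(isEligibleForPPCTailCall(ok, true));
  assert(!isEligibleForPPCTailCall(ok, false));  // -tailcallopt not enabled
  CallSite byval = ok;
  byval.hasByValArg = true;
  assert(!isEligibleForPPCTailCall(byval, true));
}
```

When the predicate is false, the message says a normal call is emitted instead.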
Dan Gohman
1f13c686df Fix the SVOffset values for loads and stores produced by
memcpy/memset expansion. It was a bug for the SVOffset value
to be used in the actual address calculations.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50359 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-28 17:15:20 +00:00
Anton Korobeynikov
998a5bcc80 Properly lower vararg's FORMAL_ARGUMENTS node on win64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50325 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-27 23:15:03 +00:00
Chris Lattner
5e764233f3 A few inline asm cleanups:
  - Make TargetLowering.h fit in 80 cols.
  - Make LowerAsmOperandForConstraint const.
  - Make lowerXConstraint -> LowerXConstraint
  - Make LowerXConstraint return a const char* instead of taking a string byref.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50312 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-26 23:02:14 +00:00
Evan Cheng
44c0fd17e1 Extract the lower 64 bits if an MMX value is passed in an XMM register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50292 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-25 20:13:28 +00:00
Evan Cheng
10e864276b Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50289 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-25 19:11:04 +00:00
Evan Cheng
ee472b1081 MMX argument passing fixes:
On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2].
On Darwin / Linux x86-32, v1i64 values are passed in memory.
On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7].
On Darwin x86-64, v1i64 values are passed in 64-bit GPRs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50257 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-25 07:56:45 +00:00
Evan Cheng
2749c72f30 Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50239 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-25 00:26:43 +00:00
Dan Gohman
61a9213440 Implement an x86-64 ABI detail of passing structs by hidden first
argument. The x86-64 ABI requires the incoming value of %rdi to
be copied to %rax on exit from a function that is returning a
large C struct.

Also, add a README-X86-64 entry detailing the missed optimization
opportunity and proposing an alternative approach.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50075 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-21 23:59:07 +00:00
Chris Lattner
02a260aa11 Switch to using Simplified ConstantFP::get API.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49977 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-20 00:41:09 +00:00
Dan Gohman
28269139ee Fix the handling of va_copy on x86-64. As of llvm-gcc r49920
llvm-gcc is now lowering va_copy on x86-64, so this completes
the fix for PR2230.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49922 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-18 20:55:41 +00:00
Roman Levenstein
9cac5259fe Ongoing work on improving the instruction selection infrastructure:
Rename SDOperandImpl back to SDOperand.
Introduce the SDUse class that represents a use of the SDNode referred by
an SDOperand. Now it is more similar to Use/Value classes.

Patch is approved by Dan Gohman.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49795 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-16 16:15:27 +00:00
Dan Gohman
171c11ec93 Add support for the form of the SSE41 extractps instruction that
puts its result in a 32-bit GPR.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49762 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-16 02:32:24 +00:00
Dan Gohman
bcda285fcc Recreate the size SDNode instead of reusing the old one in the x86
memcpy lowering code; this ensures that the size node has the desired
result type. This fixes a regression from r49572 with @llvm.memcpy.i64
on x86-32.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49761 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-16 01:32:32 +00:00
Dan Gohman
29e4bdbf27 Fix const-correctness issues with the SrcValue handling in the
memory intrinsic expansion code.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49666 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-14 17:55:48 +00:00
Arnold Schwaighofer
4b5324ad2c This patch corrects the handling of byval arguments for tailcall
optimized x86-64 (and x86) calls so that they work (... at least for
my test cases).

Should fix the following problems:

Problem 1: When I introduced the optimized handling of arguments for
tail called functions (using a sequence of copyto/copyfrom virtual
registers instead of always lowering to top of the stack) I did not
handle byval arguments correctly, i.e. they did not work at all :).

Problem 2: On x86-64 after the arguments of the tail called function
are moved to their registers (which include ESI/RSI etc), tail call
optimization performs byval lowering which causes xSI, xDI, xCX
registers to be overwritten. This is handled in this patch by moving
the arguments to virtual registers first and after the byval lowering
the arguments are moved from those virtual registers back to
RSI/RDI/RCX.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49584 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-12 18:11:06 +00:00
Dan Gohman
707e018423 Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal
on any current target and aren't optimized in DAGCombiner. Instead
of using intermediate nodes, expand the operations, choosing between
simple loads/stores, target-specific code, and library calls,
immediately.

Previously, the code to emit optimized code for these operations
was only used at initial SelectionDAG construction time; now it is
used at all times. This fixes some cases where rep;movs was being
used for small copies where simple loads/stores would be better.

This also cleans up code that checks for alignments less than 4;
let the targets make that decision instead of doing it in
target-independent code. This allows x86 to use rep;movs in
low-alignment cases.

Also, this fixes a bug that resulted in the use of rep;stos for
memsets of 0 with non-constant memory size when the alignment was
at least 4. It's better to use the library in this case, which
can be significantly faster when the size is large.

This also preserves more SourceValue information when memory
intrinsics are lowered into simple loads/stores.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49572 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-12 04:36:06 +00:00
Dan Gohman
6f836adafe Fix a bug that prevented x86-64 from using rep.movsq for
8-byte-aligned data.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49571 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-12 02:35:39 +00:00
Dan Gohman
7d8143f0ef Make isVectorClearMaskLegal's operand list const.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49446 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-09 20:09:42 +00:00
Roman Levenstein
dc1adac582 Re-commit of the r48822, where the infinite looping problem discovered
by Dan Gohman is fixed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49330 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-07 10:06:32 +00:00
Evan Cheng
0c0f83ff5d Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49244 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-05 00:30:36 +00:00
Evan Cheng
6397c64441 Backing out 48222 temporarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49124 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-03 03:13:16 +00:00
Dan Gohman
38c92263eb Don't use __bzero for memset if the second argument isn't zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49050 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-01 20:56:18 +00:00
Dan Gohman
68d599df37 Speculatively micro-optimize memory-zeroing calls on Darwin 10.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49048 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-01 20:38:36 +00:00
Dale Johannesen
2ffbcaccd1 Accept 'y' constraint (MMX) in inline asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49011 91177308-0d34-0410-b5e6-96231b3b80d8
2008-04-01 00:57:48 +00:00
Dan Gohman
d4a2ad35e3 Fix a tokenfactor node to use the load chain rather than the
load value. This fixes PR2177.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48932 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-28 23:45:16 +00:00
Roman Levenstein
e326332acd Use a linked data structure for the uses lists of an SDNode, just like
LLVM Value/Use does and MachineRegisterInfo/MachineOperand does.
This allows constant time for all uses list maintenance operations.

The idea was suggested by Chris. Reviewed by Evan and Dan.
Patch is tested and approved by Dan.

On normal use-cases compilation speed is not affected. On very big basic
blocks there are compilation speedups in the range of 15-20% or even better. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48822 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-26 12:39:26 +00:00
Evan Cheng
62a3f1538c - SSE4.1 extractps extracts an f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teach the lowering code to use it only when the only use is a store instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48746 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-24 21:52:23 +00:00
Anton Korobeynikov
1a979d9eab Add convenient helper for win64 check. Simplify things slightly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48691 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-22 20:57:27 +00:00
Anton Korobeynikov
8f88cb0899 Initial support for Win64 calling conventions. Still in early state.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48690 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-22 20:37:30 +00:00
Duncan Sands
276dcbdc8d Introduce a new node for holding call argument
flags.  This is needed by the new legalize types
infrastructure which wants to expand the 64 bit
constants previously used to hold the flags on
32 bit machines.  There are two functional changes:
(1) in LowerArguments, if a parameter has the zext
attribute set then that is marked in the flags;
before it was being ignored; (2) PPC had some bogus
code for handling two word arguments when using the
ELF 32 ABI, which was hard to convert because of
the bogusness.  As suggested by the original author
(Nicolas Geoffray), I've disabled it for the moment.
Tested with "make check" and the Ada ACATS testsuite.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48640 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-21 09:14:45 +00:00
Chris Lattner
920c37afc5 remove Evan's "ugly hack" that sorta attempted to get
x86-64 return conventions correct, but was never enabled.
We can now do the "right thing" with multiple return values.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48635 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-21 06:50:21 +00:00
Evan Cheng
260e07ec8c Fix this xform: (sra (shl X, m), result_size) -> (sign_extend (trunc (shl X, result_size - n - m)))
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48578 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-20 02:18:41 +00:00
Christopher Lamb
15cbde3cf6 Fix X86's isTruncateFree to not claim that truncate to i1 is free. This fixes Bill's testcase that failed for r48491.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48542 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-19 08:30:06 +00:00
Evan Cheng
586ccac4ec Fix an x86-64 isel lowering bug that's been around forever. An x86-64 varargs function implicitly reads X86::AL; don't clobber it!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48515 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-18 23:36:35 +00:00
Chris Lattner
58d74910c6 Reimplement the parameter attributes support, phase #1. Highlights:
1. There is now a "PAListPtr" class, which is a smart pointer around
   the underlying uniqued parameter attribute list object, and manages
   its refcount.  It is now impossible to mess up the refcount.
2. PAListPtr is now the main interface to the underlying object, and
   the underlying object is now completely opaque.
3. Implementation details like SmallVector and FoldingSet are now no
   longer part of the interface.
4. You can create a PAListPtr with an arbitrary sequence of
   ParamAttrsWithIndex's, no need to make a SmallVector of a specific 
   size (you can just use an array or scalar or vector if you wish).
5. All the client code that had to check for a null pointer before
   dereferencing the pointer is simplified to just access the 
   PAListPtr directly.
6. The interfaces for adding attrs to a list and removing them are a
   bit simpler.

Phase #2 will rename some stuff (e.g. PAListPtr) and do other less 
invasive changes.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48289 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-12 17:45:29 +00:00
Chris Lattner
fce84acbbc start handling the 'f' x87 constraint.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48239 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-11 19:06:29 +00:00
Chris Lattner
447ff68c08 Change the model for FP Stack return to use fp operands on the
RET instruction instead of using FpSET_ST0_32.  This also generalizes
the code to handling returning of multiple FP results.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48209 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-11 03:23:40 +00:00
Chris Lattner
8e6da15e54 Eliminate the FP_GET_ST0/FP_SET_ST0 target-specific dag nodes, just lower to
copyfromreg/copytoreg instead.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48174 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-10 21:08:41 +00:00
Evan Cheng
d2cde68855 Default ISD::PREFETCH to expand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48169 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-10 19:38:10 +00:00
Scott Michel
5b8f82e35b Give TargetLowering::getSetCCResultType() a parameter so that ISD::SETCC's
return ValueType can depend on its operands' ValueType.

This is a cosmetic change, no functionality impacted.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48145 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-10 15:42:14 +00:00
Dale Johannesen
b8cafe3427 Increase ISD::ParamFlags to 64 bits. Increase the ByValSize
field to 32 bits, thus enabling correct handling of ByVal
structs bigger than 0x1ffff.  Abstract interface a bit.
Fixes gcc.c-torture/execute/pr23135.c and 
gcc.c-torture/execute/pr28982b.c in gcc testsuite (were ICE'ing
on ppc32, quietly producing wrong code on x86-32.)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48122 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-10 02:17:22 +00:00
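
The ByValSize change in commit b8cafe3427 comes down to bit-field width. The sketch below assumes the old field was 17 bits wide, which is inferred from the 0x1ffff limit in the message and is not stated there; `packField` is a hypothetical helper, not part of ISD::ParamFlags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: keep only the low `bits` bits of `value`, as storing
// into a bit-field of that width would. Valid for 1 <= bits <= 63.
constexpr std::uint64_t packField(std::uint64_t value, unsigned bits) {
  return value & ((std::uint64_t{1} << bits) - 1);
}

void demoByValSizeWidth() {
  // Assumed old width: 17 bits hold sizes up to 0x1ffff.
  assert(packField(0x1ffff, 17) == 0x1ffff);
  // One byte more loses its high bit and is stored as 0.
  assert(packField(0x20000, 17) == 0);
  // With a 32-bit field the same size survives intact.
  assert(packField(0x20000, 32) == 0x20000);
}
```

A silently truncated size of this kind is consistent with the "quietly producing wrong code" symptom the message reports for x86-32.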
Chris Lattner
afb23f48a4 rename FP_SETRESULT -> FP_SET_ST0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48094 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-09 07:08:44 +00:00
Chris Lattner
6fa2f9c636 rename FpGETRESULT32 -> FpGET_ST0_32 etc. Add support for
isel'ing value preserving FP roundings from one fp stack reg to another
into a noop, instead of stack traffic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48093 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-09 07:05:32 +00:00
Chris Lattner
67f453aae7 Finish implementing a readme entry: when inserting an i64 variable
into a vector of zeros or undef, and when the top part is obviously
zero, we can just use movd + shuffle.  This allows us to compile
vec_set-B.ll into:

_test3:
	movl	$1234567, %eax
	andl	4(%esp), %eax
	movd	%eax, %xmm0
	ret

instead of:

_test3:
	subl	$28, %esp
	movl	$1234567, %eax
	andl	32(%esp), %eax
	movl	%eax, (%esp)
	movl	$0, 4(%esp)
	movq	(%esp), %xmm0
	addl	$28, %esp
	ret



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48090 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-09 05:42:06 +00:00
Chris Lattner
62098040a1 Implement a readme entry, compiling
#include <xmmintrin.h>
__m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);}

into:
	movl	$1, %eax
	movd	%eax, %xmm0
	ret

instead of a constant pool load.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48063 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-09 01:05:04 +00:00
Chris Lattner
19f7969f81 1) Improve comments.
2) Don't try to insert an i64 value into the low part of a 
   vector with movq on an x86-32 target.  This allows us to 
   compile:

__m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);}

into:

_doload64:
	movaps	LCPI1_0, %xmm0
	ret

instead of:

_doload64:
	subl	$28, %esp
	movl	$0, 4(%esp)
	movl	$1, (%esp)
	movq	(%esp), %xmm0
	addl	$28, %esp
	ret


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48057 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-08 22:59:52 +00:00
Chris Lattner
c9517fb6eb minor simplifications to this code, don't create a dead
SCALAR_TO_VECTOR on paths that end up not using it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48056 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-08 22:48:29 +00:00
Evan Cheng
27b7db549e Implement x86 support for @llvm.prefetch. It corresponds to prefetcht{0|1|2} and prefetchnta instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48042 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-08 00:58:38 +00:00
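
Commit 27b7db549e maps @llvm.prefetch onto four x86 instructions. One plausible way to picture the mapping is by the intrinsic's locality hint, sketched below. The specific assignment (0 to prefetchnta, 3 to prefetcht0) is an assumption based on the usual meaning of the hint, where 0 means no temporal locality and 3 means the highest; the commit message does not spell it out, and `prefetchMnemonic` is a hypothetical name.

```cpp
#include <cassert>
#include <string>

// Hypothetical mapping from a locality hint (0..3) to an x86 prefetch
// mnemonic. The assignment is an assumption, not taken from the commit.
std::string prefetchMnemonic(unsigned locality) {
  switch (locality) {
  case 0: return "prefetchnta";  // non-temporal: minimize cache pollution
  case 1: return "prefetcht2";
  case 2: return "prefetcht1";
  case 3: return "prefetcht0";   // keep as close to the core as possible
  default: return "";            // out-of-range hint: no instruction chosen
  }
}

void demoPrefetchMapping() {
  assert(prefetchMnemonic(0) == "prefetchnta");
  assert(prefetchMnemonic(3) == "prefetcht0");
  assert(prefetchMnemonic(7).empty());
}
```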
Chris Lattner
d1108222fd mark frem as expand for all legal fp types on x86, regardless of whether
we're using SSE or not.  This fixes PR2122.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48006 91177308-0d34-0410-b5e6-96231b3b80d8
2008-03-07 06:36:32 +00:00