Commit Graph

235 Commits

Author SHA1 Message Date
Nate Begeman
204e84e138 The rest of the SSE4.1 intrinsic patterns that are obvious to me. Getting
Evan's help with the rest.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46697 91177308-0d34-0410-b5e6-96231b3b80d8
2008-02-04 06:00:24 +00:00
Nate Begeman
2f6f1c02ca Some more SSE 4.1 intrinsic patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46696 91177308-0d34-0410-b5e6-96231b3b80d8
2008-02-04 05:34:34 +00:00
Nate Begeman
63ec90a6a8 SSE 4.1 Intrinsics and detection
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46681 91177308-0d34-0410-b5e6-96231b3b80d8
2008-02-03 07:18:54 +00:00
Chris Lattner
d43d00cf3a Significantly simplify and improve handling of FP function results on x86-32.
This case returns the value in ST(0) and then has to convert it to an SSE
register.  This causes significant codegen ugliness in some cases.  For 
example in the trivial fp-stack-direct-ret.ll testcase we used to generate:

_bar:
	subl	$28, %esp
	call	L_foo$stub
	fstpl	16(%esp)
	movsd	16(%esp), %xmm0
	movsd	%xmm0, 8(%esp)
	fldl	8(%esp)
	addl	$28, %esp
	ret

because we move the result of foo() into an XMM register, then have to
move it back for the return of bar.

Instead of hacking ever-more special cases into the call result lowering code
we take a much simpler approach: on x86-32, fp return is modeled as always 
returning into an f80 register which is then truncated to f32 or f64 as needed.
Similarly for a result, we model it as an extension to f80 + return.

This exposes the truncate and extensions to the dag combiner, allowing target
independent code to hack on them, eliminating them in this case.  This gives 
us this code for the example above:

_bar:
	subl	$12, %esp
	call	L_foo$stub
	addl	$12, %esp
	ret

The nasty aspect of this is that these conversions are not legal, but we want
the second pass of dag combiner (post-legalize) to be able to hack on them.
To handle this, we lie to legalize and say they are legal, then custom expand
them on entry to the isel pass (PreprocessForFPConvert).  This is gross, but
less gross than the code it is replacing :)

This also allows us to generate better code in several other cases.  For 
example on fp-stack-ret-conv.ll, we now generate:

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstps	8(%esp)
	movl	16(%esp), %eax
	cvtss2sd	8(%esp), %xmm0
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

where before we produced (incidentally, the old bad code is identical to what
gcc produces):

_test:
	subl	$12, %esp
	call	L_foo$stub
	fstpl	(%esp)
	cvtsd2ss	(%esp), %xmm0
	cvtss2sd	%xmm0, %xmm0
	movl	16(%esp), %eax
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

Note that we generate slightly worse code on pr1505b.ll due to a scheduling 
deficiency that is unrelated to this patch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46307 91177308-0d34-0410-b5e6-96231b3b80d8
2008-01-24 08:07:48 +00:00
Chris Lattner
f77e037309 add some missing flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45859 91177308-0d34-0410-b5e6-96231b3b80d8
2008-01-11 06:59:07 +00:00
Chris Lattner
ba7e756c22 Start inferring side effect information more aggressively, and fix many bugs in the
x86 backend where instructions were not marked maystore/mayload, and perf issues where
instructions were not marked neverHasSideEffects.  It would be really nice if we could
write patterns for copy instructions.

I have audited all the x86 instructions down to MOVDQAmr.  The flags on others and on
other targets are probably not right in all cases, but no clients currently use this
info that are enabled by default.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45829 91177308-0d34-0410-b5e6-96231b3b80d8
2008-01-10 07:59:24 +00:00
Chris Lattner
dd41527a7d remove explicit sets of 'neverHasSideEffects' that can now be
inferred from the instr patterns.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45824 91177308-0d34-0410-b5e6-96231b3b80d8
2008-01-10 05:45:39 +00:00
Chris Lattner
834f1ce031 rename isLoad -> isSimpleLoad due to evan's desire to have such a predicate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45667 91177308-0d34-0410-b5e6-96231b3b80d8
2008-01-06 23:38:27 +00:00
Chris Lattner
4ee451de36 Remove attribution from file headers, per discussion on llvmdev.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45418 91177308-0d34-0410-b5e6-96231b3b80d8
2007-12-29 20:36:04 +00:00
Evan Cheng
700a0fba97 Fix JIT encoding for CMPSD as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45268 91177308-0d34-0410-b5e6-96231b3b80d8
2007-12-20 19:57:09 +00:00
Bill Wendling
627c00b663 Add "mayHaveSideEffects" and "neverHasSideEffects" flags to some instructions. I
based what flag to set on whether it was already marked as
"isRematerializable". If there was a further check to determine if it's "really"
rematerializable, then I marked it as "mayHaveSideEffects" and created a check
in the X86 back-end similar to the remat one.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45132 91177308-0d34-0410-b5e6-96231b3b80d8
2007-12-17 23:07:56 +00:00
Chris Lattner
d7610e16fc Fix the JIT encoding of cmp*ss, which aborts with this assertion currently:
X86CodeEmitter.cpp:378: failed assertion `0 && "Immediate size not set!"'

I *think* this is right, but Evan, please verify.  It also looks like
CMPSDrr and maybe others are missing this info.  Evan, plz investigate.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45074 91177308-0d34-0410-b5e6-96231b3b80d8
2007-12-16 20:12:41 +00:00
Evan Cheng
7a831ce85f Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45058 91177308-0d34-0410-b5e6-96231b3b80d8
2007-12-15 03:00:47 +00:00
Evan Cheng
6e141fd048 Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44960 91177308-0d34-0410-b5e6-96231b3b80d8
2007-12-12 23:12:09 +00:00
Evan Cheng
1076210ee3 Remove a bogus optimization. It's not possible to do a move to low element to a <8 x i16> or <16 x i8> vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44669 91177308-0d34-0410-b5e6-96231b3b80d8
2007-12-06 22:14:22 +00:00
Chris Lattner
8a594489bf Fix a long standing deficiency in the X86 backend: we would
sometimes emit "zero" and "all one" vectors multiple times,
for example:

_test2:
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M1
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M2
	ret

instead of:

_test2:
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M1
	movq	%mm0, _M2
	ret

This patch fixes this by always arranging for zero/one vectors
to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be
any random type.  This ensures they get trivially CSE'd on the dag.
This fix is also important for LegalizeDAGTypes, as it gets unhappy
when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when
'i64' isn't legal.

This patch makes the following changes:

1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into
   their canonical types.
2) The now-dead patterns are removed from the SSE/MMX .td files.
3) All the patterns in the .td file that referred to immAllOnesV or
   immAllZerosV in the wrong form now use *_bc to match them with a
   bitcast wrapped around them.
4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle 
   bitcast'd zero vectors, which simplifies the code actually.
5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that
   is legal, instead of generating one that is illegal and expecting
   a later legalize pass to clean it up.
6) isZeroShuffle is generalized to handle bitcast of zeros.
7) several other minor tweaks.

This patch is definite goodness, but has the potential to cause random
code quality regressions.  Please be on the lookout for these and let 
me know if they happen.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44310 91177308-0d34-0410-b5e6-96231b3b80d8
2007-11-25 00:24:49 +00:00
Nate Begeman
b348d18caf Add support for vectors to int <-> float casts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44204 91177308-0d34-0410-b5e6-96231b3b80d8
2007-11-17 03:58:34 +00:00
Dale Johannesen
c784208a73 Add missing SSE builtins: CVTPD2PI, CVTPS2PI,
CVTTPD2PI, CVTTPS2PI, CVTPI2PD, CVTPI2PS.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43523 91177308-0d34-0410-b5e6-96231b3b80d8
2007-10-30 22:15:38 +00:00
Arnold Schwaighofer
48abc5cf6b Corrected many typing errors. And removed 'nest' parameter handling
for fastcc from X86CallingConv.td.  This means that nested functions
are not supported for calling convention 'fastcc'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42934 91177308-0d34-0410-b5e6-96231b3b80d8
2007-10-12 21:30:57 +00:00
Dale Johannesen
83e105c600 Add missing argument to PALIGNR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42874 91177308-0d34-0410-b5e6-96231b3b80d8
2007-10-11 20:58:37 +00:00
Evan Cheng
c231e8c8a5 Added DAG xforms. e.g.
(vextract (v4f32 s2v (f32 load $addr)), 0) -> (f32 load $addr) 
(vextract (v4i32 bc (v4f32 s2v (f32 load $addr))), 0) -> (i32 load $addr)
Remove x86 specific patterns.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42677 91177308-0d34-0410-b5e6-96231b3b80d8
2007-10-06 02:46:29 +00:00
Evan Cheng
fef922a4d5 Typo. X86comi doesn't read / write chain's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42492 91177308-0d34-0410-b5e6-96231b3b80d8
2007-10-01 18:12:48 +00:00
Evan Cheng
e5f6204cd5 Enabling new condition code modeling scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42459 91177308-0d34-0410-b5e6-96231b3b80d8
2007-09-29 00:00:36 +00:00
Evan Cheng
0488db9b99 Added support for new condition code modeling scheme (i.e. physical register dependency). These are a bunch of instructions that are duplicated so the x86 backend can support both the old and new schemes at the same time. They will be deleted after
all the kinks are worked out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42285 91177308-0d34-0410-b5e6-96231b3b80d8
2007-09-25 01:57:46 +00:00
Dale Johannesen
f1fc3a8fa6 Fix PR 1681. When X86 target uses +sse -sse2,
keep f32 in SSE registers and f64 in x87.  This
is effectively a new codegen mode.
Change addLegalFPImmediate to permit float and
double variants to do different things.
Adjust callers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42246 91177308-0d34-0410-b5e6-96231b3b80d8
2007-09-23 14:52:20 +00:00
Evan Cheng
24f2ea3971 Add implicit def of EFLAGS on those instructions that may modify flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@41962 91177308-0d34-0410-b5e6-96231b3b80d8
2007-09-14 21:48:26 +00:00
Evan Cheng
071a279e94 Remove (somewhat confusing) Imp<> helper, use let Defs = [], Uses = [] instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@41863 91177308-0d34-0410-b5e6-96231b3b80d8
2007-09-11 19:55:27 +00:00
Dan Gohman
1ab79897e2 Avoid storing and reloading zeros and other constants from stack slots
by flagging the associated instructions as being trivially rematerializable.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@41775 91177308-0d34-0410-b5e6-96231b3b80d8
2007-09-07 21:32:51 +00:00
Evan Cheng
2f39426ec9 Mark load instructions with isLoad = 1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@41595 91177308-0d34-0410-b5e6-96231b3b80d8
2007-08-30 05:49:43 +00:00
Bill Wendling
01284b4d55 64-bit SSSE3 ops that use MMX registers don't require 16-byte alignment.
Make a 'memop' pattern just for them.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@41017 91177308-0d34-0410-b5e6-96231b3b80d8
2007-08-11 09:52:53 +00:00
Bill Wendling
ae9671b838 For kicks, I though it would be fun to use the correct opcode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40985 91177308-0d34-0410-b5e6-96231b3b80d8
2007-08-10 09:00:17 +00:00
Bill Wendling
76d708b76f Adding SSSE3 intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40982 91177308-0d34-0410-b5e6-96231b3b80d8
2007-08-10 06:22:27 +00:00
Dan Gohman
7f55fcbc6b Fix the alignment requirements of several unpck and shuf instructions.
Generalize isPSHUFDMask and add a unary SHUFPD pattern so that SHUFPD's
memory operand alignment can be tested as well, with a fix to avoid
breaking MMX's use of isPSHUFDMask.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40756 91177308-0d34-0410-b5e6-96231b3b80d8
2007-08-02 21:17:01 +00:00
Dan Gohman
f3372d1d64 Fix pastos in vector arithmetic intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40754 91177308-0d34-0410-b5e6-96231b3b80d8
2007-08-02 21:06:40 +00:00
Dan Gohman
73a902b228 Mark the SSE and MMX load instructions that
X86InstrInfo::isReallyTriviallyReMaterializable knows how to handle
with the isReMaterializable flag so that it is given a chance to handle
them. Without hoisting constant-pool loads from loops this isn't very
visible, though it does keep CodeGen/X86/constant-pool-remat-0.ll from
making a copy of the constant pool on the stack.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40736 91177308-0d34-0410-b5e6-96231b3b80d8
2007-08-02 14:27:55 +00:00
Evan Cheng
c5dd54154a Missing Requires.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40691 91177308-0d34-0410-b5e6-96231b3b80d8
2007-08-01 21:42:24 +00:00
Dan Gohman
b1576f56c8 Change the x86 assembly output to use tab characters to separate the
mnemonics from their operands instead of single spaces. This makes the
assembly output a little more consistent with various other compilers
(f.e. GCC), and slightly easier to read. Also, update the regression
tests accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40648 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-31 20:11:57 +00:00
Evan Cheng
c64a1a921c Redo and generalize previously removed opt for pinsrw: (vextract (v4i32 bc (v4f32 s2v (f32 load ))), 0) -> (i32 load )
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40628 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-31 08:04:03 +00:00
Dan Gohman
d300622eba Re-apply 40504, but with a fix for the segfault it caused in oggenc:
Make the alignedload and alignedstore patterns always require 16-byte
alignment. This way when they are used in the "Fs" instructions, in which
a vector instruction is used for a scalar purpose, they can still require
the full vector alignment. And add a regression test for this.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40555 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-27 17:16:43 +00:00
Evan Cheng
3e22947d9a Reverting 40504 for now. It's breaking oggenc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40547 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-27 01:37:47 +00:00
Dan Gohman
1704c2f9b9 Fix a whitespace difference between CMPSSrr and CMPSDrr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40528 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-26 15:11:50 +00:00
Dan Gohman
d3283832aa Remove X86ISD::LOAD_PACK and X86ISD::LOAD_UA and associated code from the
x86 target, replacing them with the new alignment attributes on memory
references.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40504 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-26 00:31:09 +00:00
Evan Cheng
b4162fd393 Because we promote SSE logical ops and loads to v2i64, we often end up generate
code that cross integer / floating point domains (e.g. generate pxor / pand for
logical ops on floating point value, movdqa to load / store floating point SSE
values). Given that, it's better to use movaps instead of movdqa and movups
instead of movdqu. They have the same latency but the "aps" variants are one
byte shorter.
If the domain crossing problem is a real performance issue, then we will have to
fix it with dynamic programming based isel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40076 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-20 00:27:43 +00:00
Evan Cheng
31d3a65052 Fix patterns so we isel the xorps, etc. for floating pt logical SSE ops. DAG combiner may fold away the (bit_convert (load)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40070 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-19 23:34:10 +00:00
Evan Cheng
64d80e3387 Change instruction description to split OperandList into OutOperandList and
InOperandList. This gives one piece of important information: # of results
produced by an instruction.
An example of the change:
def ADD32rr  : I<0x01, MRMDestReg, (ops GR32:$dst, GR32:$src1, GR32:$src2),
                 "add{l} {$src2, $dst|$dst, $src2}",
                 [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
=>
def ADD32rr  : I<0x01, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2),
                 "add{l} {$src2, $dst|$dst, $src2}",
                 [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40033 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-19 01:14:50 +00:00
Dan Gohman
4106f3714e Implement initial memory alignment awareness for SSE instructions. Vector loads
and stores that have a specified alignment of less than 16 bytes now use
instructions that support misaligned memory references.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@40015 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-18 20:23:34 +00:00
Dan Gohman
2038252c6a Define non-intrinsic instructions for vector min, max, sqrt, rsqrt, and rcp,
in addition to the intrinsic forms. Add spill-folding entries for these new
instructions, and for the scalar min and max instrinsic instructions which
were missing. And add some preliminary ISelLowering code for using the new
non-intrinsic vector sqrt instruction, and fneg and fabs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@38478 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-10 00:05:58 +00:00
Dale Johannesen
849f214a4e Fix for PR 1505 (and 1489). Rewrite X87 register
model to include f32 variants.  Some factoring
improvments forthcoming.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@37847 91177308-0d34-0410-b5e6-96231b3b80d8
2007-07-03 00:53:03 +00:00
Dan Gohman
d45eddd214 Revert the earlier change that removed the M_REMATERIALIZABLE machine
instruction flag, and use the flag along with a virtual member function
hook for targets to override if there are instructions that are only
trivially rematerializable with specific operands (i.e. constant pool
loads).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@37728 91177308-0d34-0410-b5e6-96231b3b80d8
2007-06-26 00:48:07 +00:00
Dan Gohman
32791e06d8 Make minor adjustments to whitespace and comments to reduce differences
between SSE1 instructions and their respective SSE2 analogues.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@37718 91177308-0d34-0410-b5e6-96231b3b80d8
2007-06-25 15:44:19 +00:00