Commit Graph

2451 Commits

Author SHA1 Message Date
Evan Cheng
fb11288109 Model inline asm constraint which ties an input to an output register as machine operand TIED_TO constraint. This eliminated the need to pre-allocate registers for these. This also allows register allocator can eliminate the unneeded copies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67512 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-23 08:01:15 +00:00
Evan Cheng
7d6d4b360f Do not fold away subreg_to_reg if the source register has a sub-register index. That means the source register is taking a sub-register of a larger register. e.g. On x86
%RAX<def> = ...
%RAX<def> = SUBREG_TO_REG 0, %EAX:3<kill>, 3
The first def is defining RAX, not EAX so the top bits were not zero-extended.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67511 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-23 07:19:58 +00:00
Rafael Espindola
6b2c7ae2c9 Add -relocation-model=pic so that the test works
both in Linux and Darwin.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67191 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-18 09:38:28 +00:00
Mon P Wang
aa9df0b0c3 Added missing support for widening when splitting an unary op (PR3683)
and expanding a bit convert (PR3711).  In both cases, we extract the
valid part of the widen vector and then do the conversion.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67175 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-18 06:24:04 +00:00
Evan Cheng
7367319991 Add another test case for r64440.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67156 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-18 02:43:01 +00:00
Chris Lattner
ff81ebf758 Disable the "call to immediate" optimization on x86-64. It is
not safe in general because the immediate could be an arbitrary
value that does not fit in a 32-bit pcrel displacement.  
Conservatively fall back to loading the value into a register
and calling through it.

We still do the optzn on X86-32.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67142 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-18 00:43:52 +00:00
Bill Wendling
64ec298e68 A more proper -mtriple.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67138 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-18 00:19:44 +00:00
Bill Wendling
652c3c36c5 Temporary fix. I think Rafael wanted this to be Linux-only.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67137 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-18 00:16:36 +00:00
Chris Lattner
b7e64ac3ac LSR shouldn't ever try to hack on integer IV's larger than 64-bits. Right now
it is not APInt clean, but even when it is it needs to be evaluated carefully
to determine whether it is actually profitable.

This fixes a crash on PR3806


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67134 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 23:58:30 +00:00
Rafael Espindola
152932b71c Don't force promotion of return arguments on the callee.
Some architectures (like x86) don't require it.
This fixes bug 3779.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67132 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 23:43:59 +00:00
Chris Lattner
3985728bde this is apparently passing now. Evan/Dan, please check
to see if this is producing the expected code or not, I'm
not sure what the test was intended to check.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67099 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 20:23:43 +00:00
Chris Lattner
0b18e59336 Fix codegen to compute the size of an allocation by multiplying the
size by the array amount as an i32 value instead of promoting from
i32 to i64 then doing the multiply.  Not doing this broke wrap-around
assumptions that the optimizers (validly) made.  The ultimate real
fix for this is to introduce i64 version of alloca and remove mallocinst.

This fixes PR3829


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67093 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 19:36:00 +00:00
Evan Cheng
58e2287d00 Add newline at end of file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67085 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 17:08:25 +00:00
Scott Michel
a82d3f7c57 CellSPU:
Revert inadvertent mis-fix of fneg.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67084 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 16:45:16 +00:00
Duncan Sands
a5fec0dba3 Reapply r67049, with the test adjusted for darwin
(which produces "call L_f$stub" rather than "call f").


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67079 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 09:46:22 +00:00
Mon P Wang
93b7415f4c Fix a problem with DAGCombine where we were building an illegal build
vector shuffle mask. Forced the mask to be built using i32.  Note: this will
be irrelevant once vector_shuffle no longer takes a build vector for the
shuffle mask.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67076 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 06:33:10 +00:00
Dan Gohman
9626447e70 Recognize bswapl as bswap too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67072 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 02:45:40 +00:00
Dan Gohman
d73566609e Recognize "bswapq" as an alternate spelling for the bswap instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67071 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 02:17:27 +00:00
Evan Cheng
e47b0089d9 Spiller may unfold load / mod / store instructions as an optimization when the would be loaded value is available in a register. It needs to check if it's legal to clobber the register. Also, the register can contain values of multiple spill slots, make sure to check all instead of just the one being unfolded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67068 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 01:23:09 +00:00
Scott Michel
7ea02ffe91 CellSPU:
- Fix fabs, fneg for f32 and f64.
- Use BuildVectorSDNode.isConstantSplat, now that the functionality exists
- Continue to improve i64 constant lowering. Lower certain special constants
  to the constant pool when they correspond to SPU's shufb instruction's
  special mask values. This avoids the overhead of performing a shuffle on a
  zero-filled vector just to get the special constant when the memory load
  suffices.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67067 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17 01:15:45 +00:00
Bill Wendling
db14d63cea --- Reverse-merging (from foreign repository) r67049 into '.':
U    test/CodeGen/X86/2009-03-13-PHIElimBug.ll
D    test/CodeGen/X86/2009-03-16-PHIElimInLPad.ll
U    lib/CodeGen/PHIElimination.cpp

r67049 was causing this failure:

Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/dg.exp ...
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/2009-03-13-PHIElimBug.ll for PR3784
Failed with exit(1) at line 1
while running:  llvm-as < /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/2009-03-13-PHIElimBug.ll |  llc -march=x86 | /usr/bin/grep -A 2 {call f} | /usr/bin/grep movl
child process exited abnormally


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67051 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-16 20:27:20 +00:00
Duncan Sands
dfec24c877 Tweak the fix for PR3784: be less sensitive about just
how invokes are set up.  The fix could be disturbed by
register copies coming after the EH_LABEL, and also didn't
behave quite right when it was the invoke result that
was used in a phi node.  Also (see new testcase) fix
another phi elimination bug while there: register copies
in the landing pad need to come after the EH_LABEL, because
that's where execution branches to when unwinding.  If they
come before the EH_LABEL then they will never be executed...
Also tweak the original testcase so it doesn't use a no-longer
existing counter.
The accumulated phi elimination changes fix two of seven Ada
testsuite failures that turned up after landing pad critical
edge splitting was turned off.  So there's probably more to come.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67049 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-16 19:58:38 +00:00
Scott Michel
6e1d1470c2 CellSPU:
Incorporate Tilmann's 128-bit operation patch. Evidently, it gets the
llvm-gcc bootstrap a bit further along.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67048 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-16 18:47:25 +00:00
Dan Gohman
318f5055a4 Add a testcase that covers a wide variety of ABI isel cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67003 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-14 02:35:10 +00:00
Dan Gohman
72bb0a64af Use %rip-relative addressing on x86-64 whenever practical, as
it has a smaller encoding than absolute addressing.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67002 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-14 02:33:41 +00:00
Dan Gohman
d00d2feab2 Add a few more ptrtoint/inttoptr cast tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66989 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 23:54:51 +00:00
Dan Gohman
474d3b3f40 Improve FastISel's handling of truncates to i1, and implement
ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66988 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 23:53:06 +00:00
Evan Cheng
fc0b80d974 Fix PR3784: If the source of a phi comes from a bb ended with an invoke, make sure the copy is inserted before the try range (unless it's used as an input to the invoke, then insert it after the last use), not at the end of the bb.
Also re-apply r66140 which was disabled as a workaround.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66976 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 22:59:14 +00:00
Dan Gohman
14ea1ec232 Fix FastISel's assumption that i1 values are always zero-extended
by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66941 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 20:42:20 +00:00
Rafael Espindola
9b922aa3b8 Improve sext and zext of TLS variables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66922 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 18:37:06 +00:00
Evan Cheng
1606e8e4cd Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.


Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 07:51:59 +00:00
Chris Lattner
cee56e7d33 generalize the previous code to use the full generality of LEA
for i32/i64 expressions (we could also do i16 on cpus where
i16 lea is fast, but I didn't add this).  On the example, we now
generate:

_test:
	movl	4(%esp), %eax
	cmpl	$42, (%eax)
	setl	%al
	movzbl	%al, %eax
	leal	4(%eax,%eax,8), %eax
	ret

instead of:

_test:
	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	movl	$4, %ecx
	movl	$13, %eax
	cmovg	%ecx, %eax
	ret



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66869 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 05:53:31 +00:00
Chris Lattner
97a29a5fee optimize the case of cond ? 42 : 41 and friends. This compiles the
example to:

_test:
	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	setg	%al
	movzbl	%al, %eax
	orl	$4294967294, %eax
	ret

instead of:

        movl    4(%esp), %eax
        cmpl    $41, (%eax)
	movl	$4294967294, %ecx
	movl	$4294967295, %eax
	cmova	%ecx, %eax
	ret

which is smaller in code size and faster. rdar://6668608



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66868 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 05:22:11 +00:00
Dan Gohman
77502c9344 Enhance address-mode folding of ISD::ADD to handle cases where the
operands can't both be fully folded at the same time. For example,
in the included testcase, a global variable is being added with
an add of two values. The global variable wants RIP-relative
addressing, so it can't share the address with another base
register, but it's still possible to fold the initial add.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66865 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 02:25:09 +00:00
Evan Cheng
379e15e988 Add this test back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66838 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12 23:01:35 +00:00
Duncan Sands
58256f83c8 Revert commit 66140 since it caused several failures
in the Ada testcase.  Reverting this only covers up
the real problem, which is a nasty conceptual difficulty
in the phi elimination pass: when eliminating phi nodes
in landing pads, the register copies need to come before
the invoke, not at the end of the basic block which is
too late...  See PR3784.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66826 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12 21:13:42 +00:00
Evan Cheng
826af20ed5 Typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66797 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12 17:07:39 +00:00
Evan Cheng
0b220d0e6c Fix test after Chris' select changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66795 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12 16:10:08 +00:00
Chris Lattner
d1980a5acd Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))"
related transformations out of target-specific dag combine into the
ARM backend.  These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).

Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0  -> (zext(cond) << 3).  This happens frequently
with the recently added cp constant select optimization, but is a
very general xform.  For example, we now compile the second example
in const-select.ll to:

_test:
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        seta    %al
        movzbl  %al, %eax
        movl    4(%esp), %ecx
        movsbl  (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl    4(%esp), %eax
        leal    4(%eax), %ecx
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        cmovbe  %eax, %ecx
        movsbl  (%ecx), %eax
        ret

This passes multisource and dejagnu.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66779 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12 06:52:53 +00:00
Evan Cheng
536e66764b On x86, if the only use of a i64 load is a i64 store, generate a pair of double load and store instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66776 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12 05:59:15 +00:00
Chris Lattner
1285295e61 add no-unwind, remove duplicate run line.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66775 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12 05:56:37 +00:00
Chris Lattner
7a15c8fe22 add nounwinds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66773 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12 05:35:33 +00:00
Dan Gohman
30143763b9 Revert r66024. The JIT encoding for CALLpcrel32 is wrong -- see PR3773, and the
assembly text output uses an indirect call ("call *") instead of a direct call.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66735 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-11 23:01:47 +00:00
Rafael Espindola
b316f90e57 optimize i8 and i16 tls values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66725 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-11 22:40:04 +00:00
Evan Cheng
a597a97618 My last coalescer fix introduced a subtler one. It's aborting a commuting optimization too late and left the live intervals to be out of sync with instructions. This fixes 8b10b.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66715 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-11 22:18:44 +00:00
Mon P Wang
6b3ef693d7 For yonah, fix a vector shuffle case for v16i8 where we didn't properly clear some bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66684 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-11 18:47:57 +00:00
Mon P Wang
37b9a19653 Fixed a v8i16 shuffle case that should generate a pshufb instead of a pshuflw/hw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66645 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-11 06:35:11 +00:00
Chris Lattner
600fec3cea reapply my previous patch (r66358) with a tweak to set the
alignment of the generated constant pool entry to the
desired alignment of a type.  If we don't do this, we end up
trying to do movsd from 4-byte alignment memory.  This fixes
450.soplex and 456.hmmer.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66641 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-11 05:08:08 +00:00
Evan Cheng
a2e6435e48 Two coalescer fixes in one.
1. Use the same value# to represent unknown values being merged into sub-registers.
2. When coalescer commute an instruction and the destination is a physical register, update its sub-registers by merging in the extended ranges.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66610 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-11 00:03:21 +00:00
Bill Wendling
aad49feb6a Readd test, but XFAIL it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66581 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-10 21:31:00 +00:00
Evan Cheng
41d88d2ac0 Revert 66358 for now. It's breaking povray, 450.soplex, and 456.hmmer on x86 / Darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66574 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-10 20:47:18 +00:00
Bill Wendling
3528e38fdf Add radar number.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66534 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-10 06:53:54 +00:00
Chris Lattner
3e0cc2634e wire up support for emitting "special" values from inline asm
format strings with the standard ${:foo} syntax.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66527 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-10 05:37:13 +00:00
Chris Lattner
66b8bc3289 Fix PR3763 by using proper APInt methods instead of uint64_t's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66434 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-09 20:22:18 +00:00
Evan Cheng
6501153fc0 ARM isLegalAddressImmediate should check if type is a simple type now that optimizer can create values of funky scalar types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66429 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-09 19:15:00 +00:00
Evan Cheng
0d8fc52ed3 Yet another case where the spiller marked two uses of the same register on the same instruction as kill. This fixes PR3706.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66428 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-09 19:00:05 +00:00
Evan Cheng
4b1747430a Recognize triplets starting with armv5-, armv6- etc. And set the ARM arch version accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66365 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-08 04:02:49 +00:00
Evan Cheng
821b8560e7 If a MI uses the same register more than once, only mark one of them as 'kill'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66363 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-08 03:58:35 +00:00
Chris Lattner
476769498e implement an optimization to codegen c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4.
For 2009-03-07-FPConstSelect.ll we now produce:

_f:
	xorl	%eax, %eax
	testl	%edi, %edi
	movl	$4, %ecx
	cmovne	%rax, %rcx
	leaq	LCPI1_0(%rip), %rax
	movss	(%rcx,%rax), %xmm0
	ret

previously we produced:

_f:
	subl	$4, %esp
	cmpl	$0, 8(%esp)
	movss	LCPI1_0, %xmm0
	je	LBB1_2	## entry
LBB1_1:	## entry
	movss	LCPI1_1, %xmm0
LBB1_2:	## entry
	movss	%xmm0, (%esp)
	flds	(%esp)
	addl	$4, %esp
	ret

on PPC the code also improves to:

_f:
	cntlzw r2, r3
	srwi r2, r2, 5
	li r3, lo16(LCPI1_0)
	slwi r2, r2, 2
	addis r3, r3, ha16(LCPI1_0)
	lfsx f1, r3, r2
	blr 

from:

_f:
	li r2, lo16(LCPI1_1)
	cmplwi cr0, r3, 0
	addis r2, r2, ha16(LCPI1_1)
	beq cr0, LBB1_2	; entry
LBB1_1:	; entry
	li r2, lo16(LCPI1_0)
	addis r2, r2, ha16(LCPI1_0)
LBB1_2:	; entry
	lfs f1, 0(r2)
	blr 

This also improves the existing pic-cpool case from:

foo:
	subl	$12, %esp
	call	.Lllvm$1.$piclabel
.Lllvm$1.$piclabel:
	popl	%eax
	addl	$_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
	cmpl	$0, 16(%esp)
	movsd	.LCPI1_0@GOTOFF(%eax), %xmm0
	je	.LBB1_2	# entry
.LBB1_1:	# entry
	movsd	.LCPI1_1@GOTOFF(%eax), %xmm0
.LBB1_2:	# entry
	movsd	%xmm0, (%esp)
	fldl	(%esp)
	addl	$12, %esp
	ret

to:

foo:
	call	.Lllvm$1.$piclabel
.Lllvm$1.$piclabel:
	popl	%eax
	addl	$_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
	xorl	%ecx, %ecx
	cmpl	$0, 4(%esp)
	movl	$8, %edx
	cmovne	%ecx, %edx
	fldl	.LCPI1_0@GOTOFF(%eax,%edx)
	ret

This triggers a few dozen times in spec FP 2000.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66358 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-08 01:51:30 +00:00
Dan Gohman
3112581441 Arithmetic instructions don't set EFLAGS bits OF and CF bits
the same say the "test" instruction does in overflow cases,
so eliminating the test is only safe when those bits aren't
needed, as is the case for COND_E and COND_NE, or if it
can be proven that no overflow will occur. For now, just
restrict the optimization to COND_E and COND_NE and don't
do any overflow analysis.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66318 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-07 01:58:32 +00:00
Dan Gohman
16e8eda4b8 Fix ScheduleDAGRRList::CopyAndMoveSuccessors' handling of nodes
with multiple chain operands. This can occur when the scheduler
has added chain operands to a node that already has a chain
operand, in order to handle physical register dependencies.

This fixes an llvm-gcc bootstrap failure on x86-64 introduced
in r66058.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66240 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-06 02:23:01 +00:00
Dan Gohman
4bfcf2a2a6 Fix the "test" optimization to recognize "dec" as an add of
negative one, as subtracts of immediates are canonicalized
to adds.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66180 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-05 19:32:48 +00:00
Dan Gohman
8733db351a Make this test more thorough. Not only should there be no %esi,
there should be no spilling of anything.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66179 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-05 19:31:32 +00:00
Evan Cheng
6fb8f421ee Do not split edges to EH landing pads. It will cause code size explosion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66140 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-05 06:31:26 +00:00
Dan Gohman
076aee32e8 Re-apply 66008, now that the unfoldMemoryOperand bug is fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66058 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 19:44:21 +00:00
Owen Anderson
c93023a89e Add a restore folder, which shaves a dozen or so machineinstrs off oggenc. Update a testcase to check this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66029 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 08:52:31 +00:00
Evan Cheng
ae3f2b6c77 Fix PR3666: isel calls to constant addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66024 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 06:48:53 +00:00
Eli Friedman
27759f41ca PR3686: make the legalizer handle bitcast from i80 to x86 long double.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66021 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 06:23:34 +00:00
Dan Gohman
29582d1223 Revert r66004 for now; it's causing a variety of test failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66008 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 03:54:19 +00:00
Evan Cheng
10029df79a Rename test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66006 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 02:47:25 +00:00
Dan Gohman
12bbc52aa7 Teach the x86 backend to eliminate "test" instructions by using the EFLAGS
result from add, sub, inc, and dec instructions in simple cases.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66004 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 02:33:24 +00:00
Evan Cheng
599a6a88ce Fix PR3701. 1. X86 target renamed eflags register to flags. This matches what llvm-gcc generates so codegen knows flags register is being clobbered by inline asm. 2. BURR scheduler should also check if inline asm nodes can clobber "live" physical registers. Previously it was only checking target nodes with implicit defs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65996 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 01:41:49 +00:00
Bill Wendling
36ae6c1827 The DAG combiner was performing a BT combine. The BT combine had a value of -1,
so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it
would go through the DAG combiner again. This time it had a value of 31, which
was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping pong
forever.

Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded
value is an XOR of all ones.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65985 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04 00:18:06 +00:00
Nate Begeman
cbd88adea6 Fix a problem with DAGCombine on 64b targets where folding
extracts + build_vector into a shuffle would fail, because the
type of the new build_vector would not be legal.  Try harder to
create a legal build_vector type.  Note: this will be totally 
irrelevant once vector_shuffle no longer takes a build_vector for
shuffle mask.

New:
_foo:
	xorps	%xmm0, %xmm0
	xorps	%xmm1, %xmm1
	subps	%xmm1, %xmm1
	mulps	%xmm0, %xmm1
	addps	%xmm0, %xmm1
	movaps	%xmm1, 0

Old:
_foo:
	xorps	%xmm0, %xmm0
	movss	%xmm0, %xmm1
	xorps	%xmm2, %xmm2
	unpcklps	%xmm1, %xmm2
	pshufd	$80, %xmm1, %xmm1
	unpcklps	%xmm1, %xmm2
	pslldq	$16, %xmm2
	pshufd	$57, %xmm2, %xmm1
	subps	%xmm0, %xmm1
	mulps	%xmm0, %xmm1
	addps	%xmm0, %xmm1
	movaps	%xmm1, 0



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65791 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-01 23:44:07 +00:00
Evan Cheng
870b80722f Minor optimization:
Look for situations like this:                                                                                                                                                              
%reg1024<def> = MOV r1                                                                                                                                                                      
%reg1025<def> = MOV r0                                                                                                                                                                      
%reg1026<def> = ADD %reg1024, %reg1025                                                                                                                                                      
r0            = MOV %reg1026                                                                                                                                                                
Commute the ADD to hopefully eliminate an otherwise unavoidable copy.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65752 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-01 02:03:43 +00:00
Evan Cheng
c8bb37a50a Last commit accidentially deleted this code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65679 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-28 06:02:14 +00:00
Rafael Espindola
9a58023c6c Refactor TLS code and add some tests. The tests and expected results are:
pic |  declaration | linkage  | visibility |

!pic |  declaration | external | default    | tls1.ll     tls2.ll     | local exec
 pic |  declaration | external | default    | tls1-pic.ll tls2-pic.ll | general dynamic
!pic | !declaration | external | default    | tls3.ll     tls4.ll     | initial exec
 pic | !declaration | external | default    | tls3-pic.ll tls4-pic.ll | general dynamic

!pic |  declaration | external | hidden     | tls7.ll     tls8.ll     | local exec
 pic |  declaration | external | hidden     | X                       | local dynamic
!pic | !declaration | external | hidden     | tls9.ll     tls10.ll    | local exec
 pic | !declaration | external | hidden     | X                       | local dynamic

!pic |  declaration | internal | default    | tls5.ll     tls6.ll     | local exec
 pic |  declaration | internal | default    | X                       | local dynamic

The ones marked with an X have not been implemented since local dynamic is not implemented.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65632 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-27 13:37:18 +00:00
Evan Cheng
98f122fc40 Make sure this test passes on linux-ppc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65600 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-27 00:51:50 +00:00
Evan Cheng
efc783951c MachineLICM CSE should match destination register classes; avoid hoisting implicit_def's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65592 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-27 00:02:22 +00:00
Evan Cheng
236aa8a503 ADDS{D|S}rr_Int and MULS{D|S}rr_Int are not commutable. The users of these intrinsics expect the high bits will not be modified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65499 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-26 03:12:02 +00:00
Evan Cheng
04cf3e39f3 The last commit was overly conservative. It's ok to reuse value that's already marked livein.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65498 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-26 03:02:21 +00:00
Evan Cheng
a87008d90b Revert BuildVectorSDNode related patches: 65426, 65427, and 65296.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65482 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-25 22:49:59 +00:00
Dan Gohman
e9865945ad Fast-isel can't do TLS yet, so it should fall back to SDISel
if it sees TLS addresses.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65341 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-23 22:03:08 +00:00
Dan Gohman
834db732ba Use the -stack-alignment option instead of using a target triple
for avoiding dynamic stack realignment.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65319 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-23 16:34:46 +00:00
Evan Cheng
242b38bae5 Only v1i16 (i.e. _m64) is returned via RAX / RDX.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65313 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-23 09:03:22 +00:00
Nate Begeman
932be21c08 Make this test use darwin targe triple, to avoid stack traffic on linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65312 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-23 09:03:06 +00:00
Nate Begeman
b9a47b824f Generate better code for v8i16 shuffles on SSE2
Generate better code for v16i8 shuffles on SSE2 (avoids stack)
Generate pshufb for v8i16 and v16i8 shuffles on SSSE3 where it is fewer uops.
Document the shuffle matching logic and add some FIXMEs for later further
  cleanups.
New tests that test the above.

Examples:

New:
_shuf2:
	pextrw	$7, %xmm0, %eax
	punpcklqdq	%xmm1, %xmm0
	pshuflw	$128, %xmm0, %xmm0
	pinsrw	$2, %eax, %xmm0

Old:
_shuf2:
	pextrw	$2, %xmm0, %eax
	pextrw	$7, %xmm0, %ecx
	pinsrw	$2, %ecx, %xmm0
	pinsrw	$3, %eax, %xmm0
	movd	%xmm1, %eax
	pinsrw	$4, %eax, %xmm0
	ret

=========

New:
_shuf4:
	punpcklqdq	%xmm1, %xmm0
	pshufb	LCPI1_0, %xmm0

Old:
_shuf4:
	pextrw	$3, %xmm0, %eax
	movsd	%xmm1, %xmm0
	pextrw	$3, %xmm1, %ecx
	pinsrw	$4, %ecx, %xmm0
	pinsrw	$5, %eax, %xmm0

========

New:
_shuf1:
	pushl	%ebx
	pushl	%edi
	pushl	%esi
	pextrw	$1, %xmm0, %eax
	rolw	$8, %ax
	movd	%xmm0, %ecx
	rolw	$8, %cx
	pextrw	$5, %xmm0, %edx
	pextrw	$4, %xmm0, %esi
	pextrw	$3, %xmm0, %edi
	pextrw	$2, %xmm0, %ebx
	movaps	%xmm0, %xmm1
	pinsrw	$0, %ecx, %xmm1
	pinsrw	$1, %eax, %xmm1
	rolw	$8, %bx
	pinsrw	$2, %ebx, %xmm1
	rolw	$8, %di
	pinsrw	$3, %edi, %xmm1
	rolw	$8, %si
	pinsrw	$4, %esi, %xmm1
	rolw	$8, %dx
	pinsrw	$5, %edx, %xmm1
	pextrw	$7, %xmm0, %eax
	rolw	$8, %ax
	movaps	%xmm1, %xmm0
	pinsrw	$7, %eax, %xmm0
	popl	%esi
	popl	%edi
	popl	%ebx
	ret

Old:
_shuf1:
	subl	$252, %esp
	movaps	%xmm0, (%esp)
	movaps	%xmm0, 16(%esp)
	movaps	%xmm0, 32(%esp)
	movaps	%xmm0, 48(%esp)
	movaps	%xmm0, 64(%esp)
	movaps	%xmm0, 80(%esp)
	movaps	%xmm0, 96(%esp)
	movaps	%xmm0, 224(%esp)
	movaps	%xmm0, 208(%esp)
	movaps	%xmm0, 192(%esp)
	movaps	%xmm0, 176(%esp)
	movaps	%xmm0, 160(%esp)
	movaps	%xmm0, 144(%esp)
	movaps	%xmm0, 128(%esp)
	movaps	%xmm0, 112(%esp)
	movzbl	14(%esp), %eax
	movd	%eax, %xmm1
	movzbl	22(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm1, %xmm2
	movzbl	42(%esp), %eax
	movd	%eax, %xmm1
	movzbl	50(%esp), %eax
	movd	%eax, %xmm3
	punpcklbw	%xmm1, %xmm3
	punpcklbw	%xmm2, %xmm3
	movzbl	77(%esp), %eax
	movd	%eax, %xmm1
	movzbl	84(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm1, %xmm2
	movzbl	104(%esp), %eax
	movd	%eax, %xmm1
	punpcklbw	%xmm1, %xmm0
	punpcklbw	%xmm2, %xmm0
	movaps	%xmm0, %xmm1
	punpcklbw	%xmm3, %xmm1
	movzbl	127(%esp), %eax
	movd	%eax, %xmm0
	movzbl	135(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm0, %xmm2
	movzbl	155(%esp), %eax
	movd	%eax, %xmm0
	movzbl	163(%esp), %eax
	movd	%eax, %xmm3
	punpcklbw	%xmm0, %xmm3
	punpcklbw	%xmm2, %xmm3
	movzbl	188(%esp), %eax
	movd	%eax, %xmm0
	movzbl	197(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm0, %xmm2
	movzbl	217(%esp), %eax
	movd	%eax, %xmm4
	movzbl	225(%esp), %eax
	movd	%eax, %xmm0
	punpcklbw	%xmm4, %xmm0
	punpcklbw	%xmm2, %xmm0
	punpcklbw	%xmm3, %xmm0
	punpcklbw	%xmm1, %xmm0
	addl	$252, %esp
	ret



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65311 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-23 08:49:38 +00:00
Scott Michel
4214a5531c Introduce the BuildVectorSDNode class that encapsulates the ISD::BUILD_VECTOR
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65296 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-22 23:36:09 +00:00
Richard Pennington
3d3c955b18 bug 3610: Test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65287 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-22 15:54:44 +00:00
Evan Cheng
58207f12ee If a use operand is marked isKill, don't forget to add kill to its live interval as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65279 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-22 08:35:56 +00:00
Evan Cheng
6140a8b057 Be bug compatible with gcc by returning MMX values in RAX.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65274 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-22 08:05:12 +00:00
Anton Korobeynikov
b5bd026a75 Drop bunch of half-working stuff in the ext_weak linkage support.
Now we're using one gross, but quite robust hack :) (previous ones
did not work, for example, when ext_weak symbol was used deep inside
constant expression in the initializer).

The proper fix of this problem will require some quite huge asmprinter
changes and that's why was postponed. This fixes PR3629 by the way :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65230 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-21 11:53:32 +00:00
Evan Cheng
28c7ce3fd4 If two-address def is dead and the instruction does not define other registers, and it doesn't produce side effects, just delete the instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65218 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-21 03:14:25 +00:00
Evan Cheng
d9fb712403 Teach LSR sink to sink the immediate portion of the common expression back into uses if they fit in address modes of all the uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65215 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-21 02:06:47 +00:00
Evan Cheng
d33cec18a9 Fix strange logic in CollectIVUsers used to determine whether all uses are
addresses, part 1. This fixes an obvious logic bug. Previously if the only
in-loop use is a PHI, it would return AllUsesAreAddresses as true.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65178 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-20 22:16:49 +00:00
Evan Cheng
79fb3b434f Support return of MMX values in 64-bit mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65152 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-20 20:43:02 +00:00
Owen Anderson
4cafbb58e2 Fix a crash in the pre-alloc splitter exposed by recent codegen changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65121 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-20 10:02:23 +00:00
Chris Lattner
e772842505 make these tests pass when run on a G5.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65117 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-20 07:10:11 +00:00
Dan Gohman
c17e0cf6c0 Implement "superhero" strength reduction, or full strength
reduction of address calculations down to basic pointer arithmetic.
This is currently off by default, as it needs a few other features
before it becomes generally useful. And even when enabled, full
strength reduction is only performed when it doesn't increase
register pressure, and when several other conditions are true.

This also factors out a bunch of exisiting LSR code out of
StrengthReduceStridedIVUsers into separate functions, and tidies
up IV insertion. This actually decreases register pressure even
in non-superhero mode. The change in iv-users-in-other-loops.ll
is an example of this; there are two more adds because there are
two fewer leas, and there is less spilling.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65108 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-20 04:17:46 +00:00
Evan Cheng
caa0c2cadd GV with null value initializer shouldn't go to BSS if it's meant for a mergeable strings section. Currently it only checks for Darwin. Someone else please check if it should apply to other targets as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64877 91177308-0d34-0410-b5e6-96231b3b80d8
2009-02-18 02:19:52 +00:00