Commit Graph

3011 Commits

Author SHA1 Message Date
NAKAMURA Takumi
9cbb0d2b3c test/CodeGen/X86/opt-shuff-tstore.ll: Add explicit -mtriple=x86_64-linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137262 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-10 22:52:48 +00:00
Devang Patel
c722c3d5ff While extending definition range of a debug variable, consult lexical scopes also. There is no point extending debug variable out side its lexical block. This provides 6x compile time speedup in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137250 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-10 21:25:34 +00:00
Nadav Rotem
f429767765 Fix the test. Add cpu target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137241 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-10 19:49:19 +00:00
Nadav Rotem
614061bfb4 When performing a truncating store, it is sometimes possible to rearrange the
data in-register prior to saving to memory.  When we reorder the data in memory
we prevent the need to save multiple scalars to memory, making a single regular
store.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137238 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-10 19:30:14 +00:00
Bruno Cardoso Lopes
6ad251358e The following X86 pattern is incorrect:
def : Pat<(X86Movss VR128:$src1,
                   (bc_v4i32 (v2i64 (load addr:$src2)))),
          (MOVLPSrm VR128:$src1, addr:$src2)>;
This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase is added and illustrates the bug and also modified the one that was already present. Patch by Tanya Lattner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137227 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-10 17:45:17 +00:00
Bruno Cardoso Lopes
155a92a491 Fix a bug in vpermilps mask checking. Fix PR10560
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137194 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-10 01:54:17 +00:00
Bruno Cardoso Lopes
d40aa24ebf Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137179 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 23:27:13 +00:00
Bruno Cardoso Lopes
18deb04e9c Add v16i16 and v32i8 store patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137166 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes
cde4a1abd5 Use fp unpack instructions to unpack int types. Until we have AVX2, this
is the best we can do for these patterns. This fix PR10554.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137161 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 22:18:37 +00:00
Eli Friedman
fc430a662f Fix a couple ridiculous copy-paste errors. rdar://9914773 .
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137160 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 22:17:39 +00:00
Bruno Cardoso Lopes
e2406dfd89 Reapply a more appropriate solution than in r137114. AVX supports
v4f64 = sitofp v4i32. This fix PR10559.
Also add support for v4i32 = fptosi v4f64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137128 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes
a511b8e519 Revert r137114
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137127 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 17:39:01 +00:00
Bruno Cardoso Lopes
e321d7ffc5 Handle sitofp between v4f64 <- v4i32. Fix PR10559
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137114 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 05:48:01 +00:00
Bruno Cardoso Lopes
2f613c5fff Add support for avx vector fextend
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137105 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes
b33ea56448 Rename and tidy up tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137103 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 03:04:23 +00:00
Bruno Cardoso Lopes
e5118ab7bb Add two patterns to match special vmovss and vmovsd cases. Also fix
the patterns already there to be more strict regarding the predicate.
This fixes PR10558

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137100 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes
0f0e0a0e58 Make LowerVSETCC aware of AVX types and add patterns to match them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137090 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes
328a9d4a0f Add support for several vector shifts operations while in AVX mode. Fix PR10581
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137067 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-08 21:31:08 +00:00
Jakob Stoklund Olesen
66b0f515d5 Don't clobber pending ST regs when FP regs are killed.
X86FloatingPoint keeps track of pending ST registers for an upcoming
inline asm instruction with fixed stack register constraints.  It does
this by remembering which FP register holds the value that should appear
at a fixed stack position for the inline asm.

When that FP register is killed before the inline asm, make sure to
duplicate it to a scratch register, so the ST register still has a live
FP reference.

This could happen when the same FP register was copied to two ST
registers, or when a spill instruction is inserted between the ST copy
and the inline asm.

This fixes PR10602.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137050 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-08 17:15:43 +00:00
Bill Wendling
456a925c61 Only access both operands of an INSERT_SUBVECTOR if it is an INSERT_SUBVECTOR.
Fixes PR10527.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136853 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-04 00:32:58 +00:00
Jakob Stoklund Olesen
56e3232d5a Handle IMPLICIT_DEF instructions in X86FloatingPoint.
This fixes PR10575.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136787 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-03 16:33:19 +00:00
Rafael Espindola
49cb9b8886 Assume .cfi_startproc is the first thing in a function. If the function is
externally visable, create a local symbol to use in the CFE. If not, use the
function label itself.

Fixes PR10420.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136716 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-02 20:24:22 +00:00
Bruno Cardoso Lopes
ac5f13fe3f Make this kind of lowering to be supported by 256-bit instructions:
shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0>
To:
  shuffle (vload ptr)), undef, <1, 1, 1, 1>
Fix PR10494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136691 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-02 16:06:18 +00:00
Bruno Cardoso Lopes
55244ceac4 Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise
the legalizer. This commit together with the two previous ones fixes
PR10495.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136654 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes
531f19f767 Since vectors with all ones can't be created with a 256-bit instruction,
avoid returning early for v8i32 types, which would only be valid for
vector with all zeros. Also split the handling of zeros and ones into separate
checking logic since they are handled differently. This fixes PR10547

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136642 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-01 19:51:53 +00:00
Jakob Stoklund Olesen
4af0f5fecb Revert "Don't check liveness of unallocatable registers."
The ARM target depends on CPSR liveness being tracked after register
allocation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136548 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-30 00:57:25 +00:00
Jakob Stoklund Olesen
eeb57c7701 Don't check liveness of unallocatable registers.
This includes registers like EFLAGS and ST0-ST7. We don't check for
liveness issues in the verifier and scavenger because registers will
never be allocated from these classes.

While in SSA form, we do care about the liveness of unallocatable
unreserved registers. Liveness of EFLAGS and ST0 neds to be correct for
MachineDCE and MachineSinking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136541 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 23:36:21 +00:00
Bruno Cardoso Lopes
6126005259 Fix two tests that I crashed in the previous commits. The mask elts
on the second half must be reindexed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136454 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 02:05:28 +00:00
Bruno Cardoso Lopes
dd6353073f Match VPERMIL masks more strictly and update the target specific mask
generation to always catch the weird cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136453 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 01:31:15 +00:00
Bruno Cardoso Lopes
e89c7d4ce3 Add v8i32 and v4i64 vpermil patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136451 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 01:31:07 +00:00
Bruno Cardoso Lopes
93fa4766c2 Add patterns to generate copies for extract_subvector instead of
using vextractf128. This will reduce the number of issued instruction
for several avx codes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136323 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes
a23236c360 Add a few patterns to match allzeros without having to use the fp unit.
Take advantage that the 128-bit vpxor zeros the higher part and use it.
This also fixes PR10491

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136321 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes
2e64ae4101 Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move
a convert pattern close to the instruction definition.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136320 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-28 01:26:39 +00:00
Bruno Cardoso Lopes
cea34e41fa The vpermilps and vpermilpd have different behaviour regarding the
usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part while in the later both lanes have independent
masks. Handle this properly and and add support for vpermilpd.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136200 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:56:34 +00:00
Devang Patel
26a92003cd It is quiet possible that inlined function body is split into multiple chunks of consequtive instructions. But, there is not any way to describe this in .debug_inline accelerator table used by gdb. However, describe non contiguous ranges of inlined function body appropriately using AT_range of DW_TAG_inlined_subroutine debug info entry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136196 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:34:13 +00:00
Jakob Stoklund Olesen
e4709777e3 Eliminate copies of undefined values during coalescing.
These copies would coalesce easily, but the resulting value would be
defined by a deleted instruction. Now we also remove the undefined value
number from the destination register.

This fixes PR10503.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136174 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 23:00:24 +00:00
Benjamin Kramer
25ad783322 Update test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136170 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:45:39 +00:00
Benjamin Kramer
162ee5c725 Add a neat little two's complement hack for x86.
On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can
fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add,
which can be commuted and encoded efficiently.

This code is generated for __builtin_clz and friends.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136167 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes
4ea496846a Recognize unpckh* masks and match 256-bit versions. The new versions are
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136157 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:03:40 +00:00
Eli Friedman
61cc47e15d Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136148 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 21:02:58 +00:00
Eli Friedman
ce1986bd21 XFAIL this test while I investigate it; it's failing for an unexpected reason.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136131 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 20:41:03 +00:00
Eli Friedman
24f05334e6 Add obvious missing case to switch. PR10497.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136130 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 20:38:49 +00:00
Bruno Cardoso Lopes
5d348b4dc4 Add 256-bit isel for movsldup/movshdup
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136051 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes
863bd9d5cf Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128
This also fixes PR10452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136004 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes
6a32adc4e5 - Handle special scalar_to_vector case: splats. Using a native 128-bit
shuffle before inserting on a 256-bit vector.
- Add AVX versions of movd/movq instructions
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoid emiting on more instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136002 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 23:05:25 +00:00
Eli Friedman
9eff19896e Attempt to fix test failure reported on llvm-commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135995 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 22:28:51 +00:00
Eli Friedman
ed4b4272ba Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135993 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 22:25:42 +00:00
Eli Friedman
63f8dde482 Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask.
Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135980 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 21:36:45 +00:00
Jakob Stoklund Olesen
b09701db9e Correctly handle <undef> tied uses when rewriting after a split.
This fixes PR10463. A two-address instruction with an <undef> use
operand was incorrectly rewritten so the def and use no longer used the
same register, violating the tie constraint.

Fix this by always rewriting <undef> operands with the register a def
operand would use.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135885 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-24 20:23:50 +00:00
Bruno Cardoso Lopes
bb37dcd66f Fix test check!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135802 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 20:55:28 +00:00