Commit Graph

2990 Commits

Author SHA1 Message Date
Rafael Espindola
49cb9b8886 Assume .cfi_startproc is the first thing in a function. If the function is
externally visable, create a local symbol to use in the CFE. If not, use the
function label itself.

Fixes PR10420.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136716 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-02 20:24:22 +00:00
Bruno Cardoso Lopes
ac5f13fe3f Make this kind of lowering to be supported by 256-bit instructions:
shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0>
To:
  shuffle (vload ptr)), undef, <1, 1, 1, 1>
Fix PR10494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136691 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-02 16:06:18 +00:00
Bruno Cardoso Lopes
55244ceac4 Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise
the legalizer. This commit together with the two previous ones fixes
PR10495.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136654 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes
531f19f767 Since vectors with all ones can't be created with a 256-bit instruction,
avoid returning early for v8i32 types, which would only be valid for
vector with all zeros. Also split the handling of zeros and ones into separate
checking logic since they are handled differently. This fixes PR10547

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136642 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-01 19:51:53 +00:00
Jakob Stoklund Olesen
4af0f5fecb Revert "Don't check liveness of unallocatable registers."
The ARM target depends on CPSR liveness being tracked after register
allocation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136548 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-30 00:57:25 +00:00
Jakob Stoklund Olesen
eeb57c7701 Don't check liveness of unallocatable registers.
This includes registers like EFLAGS and ST0-ST7. We don't check for
liveness issues in the verifier and scavenger because registers will
never be allocated from these classes.

While in SSA form, we do care about the liveness of unallocatable
unreserved registers. Liveness of EFLAGS and ST0 neds to be correct for
MachineDCE and MachineSinking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136541 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 23:36:21 +00:00
Bruno Cardoso Lopes
6126005259 Fix two tests that I crashed in the previous commits. The mask elts
on the second half must be reindexed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136454 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 02:05:28 +00:00
Bruno Cardoso Lopes
dd6353073f Match VPERMIL masks more strictly and update the target specific mask
generation to always catch the weird cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136453 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 01:31:15 +00:00
Bruno Cardoso Lopes
e89c7d4ce3 Add v8i32 and v4i64 vpermil patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136451 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 01:31:07 +00:00
Bruno Cardoso Lopes
93fa4766c2 Add patterns to generate copies for extract_subvector instead of
using vextractf128. This will reduce the number of issued instruction
for several avx codes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136323 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes
a23236c360 Add a few patterns to match allzeros without having to use the fp unit.
Take advantage that the 128-bit vpxor zeros the higher part and use it.
This also fixes PR10491

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136321 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes
2e64ae4101 Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move
a convert pattern close to the instruction definition.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136320 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-28 01:26:39 +00:00
Bruno Cardoso Lopes
cea34e41fa The vpermilps and vpermilpd have different behaviour regarding the
usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part while in the later both lanes have independent
masks. Handle this properly and and add support for vpermilpd.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136200 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:56:34 +00:00
Devang Patel
26a92003cd It is quiet possible that inlined function body is split into multiple chunks of consequtive instructions. But, there is not any way to describe this in .debug_inline accelerator table used by gdb. However, describe non contiguous ranges of inlined function body appropriately using AT_range of DW_TAG_inlined_subroutine debug info entry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136196 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:34:13 +00:00
Jakob Stoklund Olesen
e4709777e3 Eliminate copies of undefined values during coalescing.
These copies would coalesce easily, but the resulting value would be
defined by a deleted instruction. Now we also remove the undefined value
number from the destination register.

This fixes PR10503.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136174 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 23:00:24 +00:00
Benjamin Kramer
25ad783322 Update test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136170 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:45:39 +00:00
Benjamin Kramer
162ee5c725 Add a neat little two's complement hack for x86.
On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can
fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add,
which can be commuted and encoded efficiently.

This code is generated for __builtin_clz and friends.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136167 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes
4ea496846a Recognize unpckh* masks and match 256-bit versions. The new versions are
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136157 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:03:40 +00:00
Eli Friedman
61cc47e15d Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136148 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 21:02:58 +00:00
Eli Friedman
ce1986bd21 XFAIL this test while I investigate it; it's failing for an unexpected reason.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136131 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 20:41:03 +00:00
Eli Friedman
24f05334e6 Add obvious missing case to switch. PR10497.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136130 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 20:38:49 +00:00
Bruno Cardoso Lopes
5d348b4dc4 Add 256-bit isel for movsldup/movshdup
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136051 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes
863bd9d5cf Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128
This also fixes PR10452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136004 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes
6a32adc4e5 - Handle special scalar_to_vector case: splats. Using a native 128-bit
shuffle before inserting on a 256-bit vector.
- Add AVX versions of movd/movq instructions
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoid emiting on more instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136002 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 23:05:25 +00:00
Eli Friedman
9eff19896e Attempt to fix test failure reported on llvm-commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135995 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 22:28:51 +00:00
Eli Friedman
ed4b4272ba Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135993 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 22:25:42 +00:00
Eli Friedman
63f8dde482 Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask.
Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135980 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 21:36:45 +00:00
Jakob Stoklund Olesen
b09701db9e Correctly handle <undef> tied uses when rewriting after a split.
This fixes PR10463. A two-address instruction with an <undef> use
operand was incorrectly rewritten so the def and use no longer used the
same register, violating the tie constraint.

Fix this by always rewriting <undef> operands with the register a def
operand would use.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135885 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-24 20:23:50 +00:00
Bruno Cardoso Lopes
bb37dcd66f Fix test check!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135802 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 20:55:28 +00:00
Bruno Cardoso Lopes
dad38638e1 Fix PR10422 by adding the necessary AVX UCOMISD memory versions to
load folding logic

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135801 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 20:53:20 +00:00
Rafael Espindola
23e31011fb Turn shuffles into unpacks for VT == MVT::v2i64 and MVT::v2f64
too. Patch by Jeff Muizelaar.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135789 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 18:56:05 +00:00
Bruno Cardoso Lopes
6683efb4cd -Inspected a AVX code block added by someone in early Feb. This was never used
and was actually very wrong, fix it and make it simpler. Also remove the
ConcatVectors function, which is unused now.

- Fix a introduction of useless nodes in r126664 and r126264. The
VUNPCKL* should never be introduced cause we don't want duplicate
nodes for 128 AVX and non-AVX modes, the actual instruction
difference only exists during isel, but not for target specific DAG
nodes. We only introduce V* target nodes when there is no 128-bit
version already there.

- Fix a fragile test and make it more useful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135729 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 00:15:07 +00:00
Bruno Cardoso Lopes
08b076cc96 Although we already support this, add testcases for consistency
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135728 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 00:15:03 +00:00
Bruno Cardoso Lopes
74dad551d8 Add a DAGCombine for transforming 128->256 casts into a simple
vxorps + vinsertf128 pair of instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135727 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 00:15:00 +00:00
Bruno Cardoso Lopes
dbd4fe2b0a - Register v16i16 as valid VR256 register class
- Add more bitcasts for v16i16
- Since 135661 and 135662 already added the splat logic,
just add one more splat test for v16i16

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135663 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes
65b74e1d00 Add support for 256-bit versions of VPERMIL instruction. This is a new
instruction introduced in AVX, which can operate on 128 and 256-bit vectors.
It considers a 256-bit vector as two independent 128-bit lanes. It can permute
any 32 or 64 elements inside a lane, and restricts the second lane to
have the same permutation of the first one. With the improved splat support
introduced early today, adding codegen for this instruction enable more
efficient 256-bit code:

Instead of:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vextractf128  $1, %ymm0, %xmm1
  shufps  $1, %xmm1, %xmm1
  movss %xmm1, 28(%rsp)
  movss %xmm1, 24(%rsp)
  movss %xmm1, 20(%rsp)
  movss %xmm1, 16(%rsp)
  vextractf128  $0, %ymm0, %xmm0
  shufps  $1, %xmm0, %xmm0
  movss %xmm0, 12(%rsp)
  movss %xmm0, 8(%rsp)
  movss %xmm0, 4(%rsp)
  movss %xmm0, (%rsp)
  vmovaps (%rsp), %ymm0
We get:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vpermilps $85, %ymm0, %ymm0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135662 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-21 01:55:47 +00:00
Devang Patel
4ec14b0dee While emitting constant value, look through derived type and use underlying basic type to determine size and signness of the constant value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135627 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-20 21:57:04 +00:00
Eli Friedman
0381c21d2d PR10421: Fix a straightforward bug in the widening logic for CONCAT_VECTORS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135595 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-20 18:14:33 +00:00
Eric Christopher
03c45f60f3 New pointer rotate test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135562 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-20 03:09:11 +00:00
Evan Cheng
70955c2d12 Fix an obvious typo that's preventing x86 (32-bit) from using .literal16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135535 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-19 23:14:32 +00:00
Devang Patel
497a397f3e Revert r135423.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135454 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-19 00:28:24 +00:00
Devang Patel
1360bc8eb0 During bottom up fast-isel, instructions emitted to materalize registers are at top of basic block and do not have debug location. This may misguide debugger while entering the basic block and sometimes debugger provides semi useful view of current location to developer by picking up previous known location as current location. Assign a sensible location to the first instruction in a basic block, if it does not have one location derived from source file, so that debugger can provide meaningful user experience to developers in edge cases.
[take 2]


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135423 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-18 20:55:23 +00:00
Bruno Cardoso Lopes
3aaa010ece Add AVX 128-bit sqrt versions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135404 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-18 17:51:40 +00:00
Nick Lewycky
b8c129ea66 Delete empty unused file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135379 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-18 05:54:06 +00:00
Bruno Cardoso Lopes
4201ecae92 Add AVX 128-bit patterns for sint_to_fp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135332 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes
5bc37dd131 Fix a couple of things:
1) Make non-legal 256-bit loads to be promoted to v4i64. This lets us
canonize the loads and handle things the same way we use to handle
for 128-bit registers. Despite of what one of the removed comments
explained, the load promotion would not mess with VPERM, it's only a
matter of doing the appropriate bitcasts when this instructions comes
to be introduced. Also make LOAD v8i32 legal.

2) Doing 1) exposed two bugs:
- v4i64 was being promoted to itself for several opcodes (introduced
in r124447 by David Greene) causing endless recursion and the stack to
explode.
- there was no support for allOnes BUILD_VECTORs and ANDNP would fail to
match because it was generating early target constant pools during
lowering.

3) The testcases are already checked-in, doing 1) exposed the
bugs in the current testcases.

4) Tidy up code to be more clear and explicit about AVX.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135313 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-15 22:24:33 +00:00
Eric Christopher
5427edeb68 Check register class matching instead of width of type matching
when determining validity of matching constraint. Allow i1
types access to the GR8 reg class for x86.

Fixes PR10352 and rdar://9777108

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135180 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-14 20:13:52 +00:00
Bruno Cardoso Lopes
62f67f86fe Add 256-bit load/store recognition and matching in several places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135171 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-14 18:50:58 +00:00
Benjamin Kramer
3ff25514ce Don't emit a bit test if there is only one case the test can yield false. A simple SETNE is sufficient.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135126 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-14 01:38:42 +00:00
Bruno Cardoso Lopes
53c95880f0 We already support 256-bit packed ADD, SUB, DIV, MUL. Add testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135099 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 22:28:55 +00:00