Commit Graph

3472 Commits

Author SHA1 Message Date
Chandler Carruth
2901243fda Add some comments to the latest test case I added here to document what
is actually being tested. Also add some FileCheck goodness to much more
carefully ensure that the result is the desired result. Before this test
would only have failed through an assert failure if the underlying fix
were reverted.

Also, add some weight metadata and a comment explaining exactly what is
going on to a trick section of the test case. Originally, we were
getting very unlucky and trying to form a block chain that isn't
actually profitable. I'm working on a fix to avoid forming these
unprofitable chains, and that would also have masked any failure from
this test case. The easy solution is to add some metadata that makes it
*really* profitable to form the bad chain here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145006 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-20 09:30:40 +00:00
Craig Topper
0d86d462f8 Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145005 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-20 00:12:05 +00:00
Craig Topper
745a86bac9 Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145004 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 22:34:59 +00:00
Chandler Carruth
03300ecaee Move the handling of unanalyzable branches out of the loop-driven chain
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fallback on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can caused major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144994 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 10:26:02 +00:00
Craig Topper
6bf57b0272 Test cases for SSSE3/AVX integer horizontal add/sub.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144990 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 09:03:33 +00:00
Craig Topper
1666cb6d63 Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144987 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 07:07:26 +00:00
Nadav Rotem
cbbe33fde4 Add AVX2 vpbroadcast support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144967 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-18 02:49:55 +00:00
Devang Patel
ce35d8b5a1 DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or ad DISignedSubrange.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144937 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-17 23:43:15 +00:00
Eli Friedman
4db4addcd4 Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144863 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 23:50:22 +00:00
Evan Cheng
2b89498979 Another missing X86ISD::MOVLPD pattern. rdar://10450317
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144839 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 22:24:44 +00:00
Evan Cheng
c3aa7c5c5a Disable expensive two-address optimizations at -O0. rdar://10453055
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144806 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 18:44:48 +00:00
Eli Friedman
ee94dc212e Fix testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144769 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 03:03:52 +00:00
Eli Friedman
d577df8e5a CONCAT_VECTORS can have more than two operands. PR11389.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144768 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-16 02:52:39 +00:00
Nadav Rotem
f8c10e5cb1 AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144720 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 22:50:37 +00:00
NAKAMURA Takumi
ec0af2f4e1 test/CodeGen/X86/dec-eflags-lower.ll: Relax expression for win32 x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144714 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 22:30:37 +00:00
Pete Cooper
2d49689793 Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144705 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 21:57:53 +00:00
Rafael Espindola
6c5b2dcd83 We currently use a callback to handle an IL pass deleting a BB that still
has a reference to it. Unfortunately, that doesn't work for codegen passes
since we don't get notified of MBB's being deleted (the original BB stays).

Use that fact to our advantage and after printing a function, check if
any of the IL BBs corresponds to a symbol that was not printed. This fixes
pr11202.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144674 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 19:08:46 +00:00
Jakob Stoklund Olesen
f805a7c25c Revert r144611 and r144613.
These tests are actually correct, clang was miscompiling ExeDepsFix::processUses.

Evan fixed the miscompilation in r144628.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144630 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 07:13:03 +00:00
Chandler Carruth
3273c8937b Rather than trying to use the loop block sequence *or* the function
block sequence when recovering from unanalyzable control flow
constructs, *always* use the function sequence. I'm not sure why I ever
went down the path of trying to use the loop sequence, it is
fundamentally not the correct sequence to use. We're trying to preserve
the incoming layout in the cases of unreasonable control flow, and that
is only encoded at the function level. We already have a filter to
select *exactly* the sub-set of blocks within the function that we're
trying to form into a chain.

The resulting code layout is also significantly better because of this.
In several places we were ending up with completely unreasonable control
flow constructs due to the ordering chosen by the loop structure for its
internal storage. This change removes a completely wasteful vector of
basic blocks, saving memory allocation in the common case even though it
costs us CPU in the fairly rare case of unnatural loops. Finally, it
fixes the latest crasher reduced out of GCC's single source. Thanks
again to Benjamin Kramer for the reduction, my bugpoint skills failed at
it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144627 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 06:26:43 +00:00
Craig Topper
4c077a1f04 Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144622 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 05:55:35 +00:00
Jakob Stoklund Olesen
ff70467aa2 Really fix test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144613 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 03:17:01 +00:00
Jakob Stoklund Olesen
3c84ec070a Allow for depencendy-breaking instructions before cvt*.
This should unbreak clang-x86_64-darwin10-RA, but I can't actually
reproduce the failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144611 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 02:29:48 +00:00
Jakob Stoklund Olesen
c2ecf3efbf Break false dependencies before partial register updates.
Two new TargetInstrInfo hooks lets the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previoius N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144602 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-15 01:15:30 +00:00
Evan Cheng
76c8f08567 Add a missing pattern for X86ISD::MOVLPD. rdar://10436044
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144566 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 20:35:52 +00:00
Evan Cheng
2a4410df44 Teach two-address pass to re-schedule two-address instructions (or the kill
instructions of the two-address operands) in order to avoid inserting copies.
This fixes the few regressions introduced when the two-address hack was
disabled (without regressing the improvements).
rdar://10422688


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144559 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 19:48:55 +00:00
Pete Cooper
a77214a4c4 Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered
Constant idx case is still done in tablegen but other cases are then expanded

Fixes <rdar://problem/10435460>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144557 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 19:38:42 +00:00
Chandler Carruth
2770c14185 Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on
the sum of the edge weights not overflowing uint32, and crashed when
they did. This is generally safe as BranchProbabilityInfo tries to
provide this guarantee. However, the CFG can get modified during codegen
in a way that grows the *sum* of the edge weights. This doesn't seem
unreasonable (imagine just adding more blocks all with the default
weight of 16), but it is hard to come up with a case that actually
triggers 32-bit overflow. Fortuately, the single-source GCC build is
good at this. The solution isn't very pretty, but its no worse than the
previous code. We're already summing all of the edge weights on each
query, we can sum them, check for an overflow, compute a scale, and sum
them again.

I've included a *greatly* reduced test case out of the GCC source that
triggers it. It's a pretty lame test, as it clearly is just barely
triggering the overflow. I'd like to have something that is much more
definitive, but I don't understand the fundamental pattern that triggers
an explosion in the edge weight sums.

The buggy code is duplicated within this file. I'll colapse them into
a single implementation in a subsequent commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144526 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 08:50:16 +00:00
Chandler Carruth
b5856c83ff Teach machine block placement to cope with unnatural loops. These don't
get loop info structures associated with them, and so we need some way
to make forward progress selecting and placing basic blocks. The
technique used here is pretty brutal -- it just scans the list of blocks
looking for the first unplaced candidate. It keeps placing blocks like
this until the CFG becomes tractable.

The cost is somewhat unfortunate, it requires allocating a vector of all
basic block pointers eagerly. I have some ideas about how to simplify
and optimize this, but I'm trying to get the logic correct first.

Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly
there are other bugs that GCC is tickling that I'm reducing and working
on now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144516 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-14 00:00:35 +00:00
Chandler Carruth
df234353fb Rewrite #3 of machine block placement. This is based somewhat on the
second algorithm, but only loosely. It is more heavily based on the last
discussion I had with Andy. It continues to walk from the inner-most
loop outward, but there is a key difference. With this algorithm we
ensure that as we visit each loop, the entire loop is merged into
a single chain. At the end, the entire function is treated as a "loop",
and merged into a single chain. This chain forms the desired sequence of
blocks within the function. Switching to a single algorithm removes my
biggest problem with the previous approaches -- they had different
behavior depending on which system triggered the layout. Now there is
exactly one algorithm and one basis for the decision making.

The other key difference is how the chain is formed. This is based
heavily on the idea Andy mentioned of keeping a worklist of blocks that
are viable layout successors based on the CFG. Having this set allows us
to consistently select the best layout successor for each block. It is
expensive though.

The code here remains very rough. There is a lot that needs to be done
to clean up the code, and to make the runtime cost of this pass much
lower. Very much WIP, but this was a giant chunk of code and I'd rather
folks see it sooner than later. Everything remains behind a flag of
course.

I've added a couple of tests to exercise the issues that this iteration
was motivated by: loop structure preservation. I've also fixed one test
that was exhibiting the broken behavior of the previous version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144495 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-13 11:20:44 +00:00
Jakob Stoklund Olesen
334575e79b Remove the -color-ss-with-regs option.
It was off by default.

The new register allocators don't have the problems that made it
necessary to reallocate registers during stack slot coloring.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144481 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-13 00:31:23 +00:00
Jakob Stoklund Olesen
7f67091259 Linear scan is going away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144472 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:34 +00:00
Jakob Stoklund Olesen
bf27b61593 Remove obsolete test.
This test was committed with a bugfix to RemoveCopyByCommutingDef, but
that optimization is no longer triggered by this test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144470 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:27 +00:00
Jakob Stoklund Olesen
55adef0c43 Remove obsolete test.
This test is for a very specific LocalRewriter bug.  LocalRewriter is
going away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144469 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 22:39:24 +00:00
Jakob Stoklund Olesen
bb2fdd63c6 Remove obsolete test.
I don't think this test does what is was supposed to do, and
LocalRewriter is going away anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144463 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 20:37:57 +00:00
Jakob Stoklund Olesen
d211e731aa Eliminate more linear scan tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144462 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 20:35:26 +00:00
Jakob Stoklund Olesen
7d7d569cbb Switch a couple -O0 tests to RABasic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144461 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 20:11:04 +00:00
Jakob Stoklund Olesen
4ee1aa7020 Delete old test of a VirtRegRewriter feature.
This test doesn't expose the issue with RAGreedy.

I filed PR11363 to track the missing InlineSpiller feature.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144459 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 19:53:48 +00:00
Jakob Stoklund Olesen
8658c51c1b Remove old test that doesn't make sense.
The test is checking that the output doesn't contains any 'mov '
strings. It does contain movl, though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144458 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 19:53:45 +00:00
Craig Topper
7be5dfd1a1 Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144457 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-12 09:58:49 +00:00
Craig Topper
46154eb6fd Add lowering for AVX2 shift instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144380 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-11 07:39:23 +00:00
NAKAMURA Takumi
bd165eac9d test/CodeGen/X86/lsr-loop-exit-cond.ll: Try to appease linux and freebsd bots to specify explicit -mtriple=x86_64-darwin.
I guess it expects -relocation-model=pic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144290 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-10 14:18:59 +00:00
Evan Cheng
623a7e146b Use a bigger hammer to fix PR11314 by disabling the "forcing two-address
instruction lower optimization" in the pre-RA scheduler.

The optimization, rather the hack, was done before MI use-list was available.
Now we should be able to implement it in a better way, perhaps in the
two-address pass until a MI scheduler is available.

Now that the scheduler has to backtrack to handle call sequences. Adding
artificial scheduling constraints is just not safe. Furthermore, the hack
is not taking all the other scheduling decisions into consideration so it's just
as likely to pessimize code. So I view disabling this optimization goodness
regardless of PR11314.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144267 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-10 07:43:16 +00:00
Jakob Stoklund Olesen
17afb06648 Strip old implicit operands after foldMemoryOperand.
The TII.foldMemoryOperand hook preserves implicit operands from the
original instruction.  This is not what we want when those implicit
operands refer to the register being spilled.

Implicit operands referring to other registers are preserved.

This fixes PR11347.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144247 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-10 00:17:03 +00:00
Nadav Rotem
c6c7e85a71 AVX2: Add patterns for variable shift operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144212 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 21:22:13 +00:00
Duncan Sands
ef0b3ca3a8 Speculatively revert commit 144124 (djg) in the hope that the 32 bit
dragonegg self-host buildbot will recover (it is complaining about object
files differing between different build stages).  Original commit message:

Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144188 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 14:20:48 +00:00
Nadav Rotem
bb539bf973 Add AVX2 support for vselect of v32i8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144187 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 13:21:28 +00:00
Craig Topper
b80ada98c5 Enable execution dependency fix pass for YMM registers when AVX2 is enabled. Add AVX2 logical operations to list of replaceable instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144179 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 09:37:21 +00:00
Craig Topper
0a15035f52 Add instruction selection for AVX2 integer comparisons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144176 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 08:06:13 +00:00
Craig Topper
aaa643c70e Add AVX2 instruction lowering for add, sub, and mul.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144174 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 07:28:55 +00:00
Jakob Stoklund Olesen
f4c4768fb2 Collapse DomainValues across loop back-edges.
During the initial RPO traversal of the basic blocks, remember the ones
that are incomplete because of back-edges from predecessors that haven't
been visited yet.

After the initial RPO, revisit all those loop headers so the incoming
DomainValues on the back-edges can be properly collapsed.

This will properly fix execution domains on software pipelined code,
like the included test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144151 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-09 01:06:56 +00:00
Dan Gohman
9cae2d2225 Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144124 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 21:29:06 +00:00
Pete Cooper
d9eb920aa4 Adding test for machine-licm operating on invariant load instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144104 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 19:06:53 +00:00
NAKAMURA Takumi
a422294ab1 test/CodeGen/X86/vec_shuffle-39.ll: Add explicit -mtriple=x86_64-linux. Passing packed value is not compatible on Win32 x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144068 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 03:46:39 +00:00
NAKAMURA Takumi
916d6441e1 test/CodeGen/X86/vec_shuffle-38.ll: Relax expression for Win32 x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144067 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 03:46:32 +00:00
NAKAMURA Takumi
5fb870d861 test/CodeGen/X86/vec_shuffle.ll: Add explicit -mtriple=i686-linux. We may see some suboptimal frame (%ebp) emission on certain hosts. Possible [PR11031]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144066 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 03:46:25 +00:00
Eli Friedman
2efa35f779 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
Re-commit of r144034, with an extra fix so that RemoveDeadNode doesn't blow up.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144055 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 01:25:24 +00:00
Evan Cheng
7bc389b6b0 Add x86 isel logic and patterns to match movlps from clang generated IR for _mm_loadl_pi(). rdar://10134392, rdar://10050222
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144052 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:31:58 +00:00
Bill Wendling
8b7d76990c Convert to the new EH model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144049 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:17:28 +00:00
Bill Wendling
30ceba32b2 Convert tests to the new EH model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144048 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:09:27 +00:00
Pete Cooper
02e5fb0f58 Added missing newline
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144046 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-08 00:03:24 +00:00
Eli Friedman
58dd0fec4d Revert r144034 while I try to track down a crash.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144044 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 23:53:20 +00:00
Jakob Stoklund Olesen
61f46de349 Fix test for Windows as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144038 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 23:10:43 +00:00
Jakob Stoklund Olesen
b26c7727c9 Kill and collapse outstanding DomainValues.
DomainValues that are only used by "don't care" instructions are now
collapsed to the first possible execution domain after all basic blocks
have been processed.  This typically means the PS domain on x86.

For example, the vsel_i64 and vsel_double functions in sse2-blend.ll are
completely collapsed to the PS domain instead of containing a mix of
execution domains created by isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144037 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 23:08:21 +00:00
Pete Cooper
a29fc806fe InstCombine now optimizes vector udiv by power of 2 to shifts
Fixes r8429


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144036 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 23:04:49 +00:00
Eli Friedman
1b4f6f2532 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144034 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 22:51:10 +00:00
Jakob Stoklund Olesen
32dd4eb204 Fix test for Linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144003 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 20:47:23 +00:00
Jakob Stoklund Olesen
3e5d5c53a0 Expand V_SET0 to xorps by default.
The xorps instruction is smaller than pxor, so prefer that encoding.

The ExecutionDepsFix pass will switch the encoding to pxor and xorpd
when appropriate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143996 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 19:15:58 +00:00
Craig Topper
4c763ee613 Add AVX2 variable shift instructions and intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143915 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 08:26:24 +00:00
Craig Topper
28692044db Add AVX2 VPMOVMASK instructions and intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143904 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 03:20:35 +00:00
Craig Topper
69f5df7778 Add AVX2 VEXTRACTI128 and VINSERTI128 instructions. Fix VPERM2I128 to be qualified with HasAVX2 instead of HasAVX. Mark VINSERTF128 and VEXTRACTF128 as never having side effects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143902 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-07 02:00:04 +00:00
Craig Topper
c8eb880a7f More AVX2 instructions and their intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143895 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-06 23:04:08 +00:00
Craig Topper
27e5d0c72a Add more AVX2 instructions and intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143861 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-06 06:12:20 +00:00
Benjamin Kramer
c25c908977 Add an option to pad an uleb128 to MCObjectWriter and remove the uleb128 encoding from the DWARF asm printer.
As a side effect we now print dwarf ulebs with .ascii directives.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143809 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-05 11:52:44 +00:00
Eli Friedman
bd00a934c6 Enhanced vzeroupper insertion pass that avoids inserting vzeroupper where it is unnecessary through local analysis. Patch from Bruno Cardoso Lopes, with some additional changes.
I'm going to wait for any review comments and perform some additional testing before turning this on by default.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143750 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-04 23:46:11 +00:00
Craig Topper
517497cce0 Add intrinsics for X86 vcvtps2ph and vcvtph2ps instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143682 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-04 06:59:21 +00:00
Dan Gohman
65fd6564b8 Reapply r143206, with fixes. Disallow physical register lifetimes
across calls, and only check for nested dependences on the special
call-sequence-resource register.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143660 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-03 21:49:52 +00:00
Pete Cooper
71fccadbed Reverted r143600 - selector reference change
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143646 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-03 20:47:50 +00:00
Craig Topper
98e0b9c86d Add new X86 AVX2 VBROADCAST instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143612 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-03 07:35:53 +00:00
Pete Cooper
d1ffc739c1 Treat objc selector reference globals as invariant so that MachineLICM can hoist them out of loops. Fixes <rdar://problem/6027699>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143600 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-03 00:56:36 +00:00
Nick Lewycky
6c1a703e54 Don't emit a directory entry for the value in DW_AT_comp_dir, that is always
implied by directory index zero.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143570 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-02 20:55:33 +00:00
Craig Topper
205e3378fd More AVX2 instructions and intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143536 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-02 06:54:17 +00:00
Craig Topper
3f2b2c218f Add a bunch more X86 AVX2 instructions and their corresponding intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143529 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-02 04:42:13 +00:00
Eli Friedman
f6aa6b12f1 Teach the x86 backend a couple tricks for dealing with v16i8 sra by a constant splat value. Fixes PR11289.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143498 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-01 21:18:39 +00:00
Craig Topper
ce7de9f36d Fix operand type for x86 pmadd_ub_sw intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143455 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-01 07:25:22 +00:00
Craig Topper
782c8fbd6e Fix operand type for int_x86_ssse3_phadd_sw_128 intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143336 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-31 07:16:37 +00:00
Craig Topper
593c1d9761 Test case for X86 FS/GS Base intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143332 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-31 02:15:47 +00:00
Craig Topper
6b1c5fc02a Begin adding AVX2 instructions. No selection support yet other than intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143331 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-31 02:15:10 +00:00
Nick Lewycky
4e478fed1b Switch new .file directive emission off by default, change llc's flag for it to
-enable-dwarf-directory.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143326 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-31 01:06:02 +00:00
Benjamin Kramer
dade3c1448 X86: Emit logical shift by constant splat of <16 x i8> as a <8 x i16> shift and zero out the bits where zeros should've been shifted in.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143315 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-30 17:31:21 +00:00
Craig Topper
6762427e8e Fix return type for X86 mpsadbw instrinsic. The instruction takes in a vector of 8-bit integers, but produces a vector of 16-bit integers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143313 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-30 17:22:45 +00:00
Nadav Rotem
fb0dfbbff7 Fix pr11266.
On x86: (shl V, 1) -> add V,V

Hardware support for vector-shift is sparse and in many cases we scalarize the
result. Additionally, on sandybridge padd is faster than shl.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143311 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-30 13:24:22 +00:00
Nadav Rotem
5157588840 Stabilize the test by specifying an exact cpu target
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143307 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-30 08:07:50 +00:00
Nadav Rotem
b00418af67 Add a new DAGCombine optimization for BUILD_VECTOR.
If all of the inputs are zero/any_extended, create a new simple BV
which can be further optimized by other BV optimizations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143297 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-29 21:23:04 +00:00
Benjamin Kramer
f86545ecfd Force SSE for this test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143291 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-29 19:43:44 +00:00
Dan Gohman
6f3ddef7c5 Revert r143206, as there are still some failing tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143262 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-29 00:41:52 +00:00
Dan Gohman
bf923b815d Reapply r143177 and r143179 (reverting r143188), with scheduler
fixes: Use a separate register, instead of SP, as the
calling-convention resource, to avoid spurious conflicts with
actual uses of SP. Also, fix unscheduling of calling sequences,
which can be triggered by pseudo-two-address dependencies.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143206 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-28 17:55:38 +00:00
NAKAMURA Takumi
c3e48c38bf Dwarf: [PR11022] Fix emitting DW_AT_const_value(>i64), to be host-endian-neutral.
Don't assume APInt::getRawData() would hold target-aware endianness nor host-compliant endianness. rawdata[0] holds most lower i64, even on big endian host.

FIXME: Add a testcase for big endian target.

FIXME: Ditto on CompileUnit::addConstantFPValue() ?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143194 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-28 14:12:22 +00:00
NAKAMURA Takumi
5c56f0b589 test/CodeGen/X86/2010-08-10-DbgConstant.ll: Add explicit -mtriple=i686-linux. It must be for elf!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143189 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-28 10:50:52 +00:00
Duncan Sands
62c1d00dfd Speculatively disable Dan's commits 143177 and 143179 to see if
it fixes the dragonegg self-host (it looks like gcc is miscompiled).
Original commit messages:
Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

Delete #if 0 code accidentally left in.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143188 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-28 09:55:57 +00:00
Nick Lewycky
6a7efcfc02 Always use the string pool, even when it makes the .o larger. This may help
tools that read the debug info in the .o files by making the DIE sizes more
consistent.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143186 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-28 05:29:47 +00:00
Dan Gohman
2ba60e5930 Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143177 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-28 01:29:32 +00:00
Pete Cooper
cbe35f2147 Changed test to check for correct load size instead of shift as the shift might change if optimised
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143116 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-27 18:15:58 +00:00
Nick Lewycky
390c40d96a Teach our Dwarf emission to use the string pool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143097 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-27 06:44:11 +00:00
Eli Friedman
fd58cd7563 Don't crash on 128-bit sdiv by constant. Found by inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143095 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-27 02:06:39 +00:00
Rafael Espindola
2a1286ed58 Run test with -verify-machineinstrs.
Patch by Sanjoy Das.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143066 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-26 21:20:26 +00:00
Rafael Espindola
66bf7430f5 Fixes an issue reported by -verify-machineinstrs.
Patch by Sanjoy Das.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143064 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-26 21:16:41 +00:00
Rafael Espindola
e840e88239 This commit introduces two fake instructions MORESTACK_RET and
MORESTACK_RET_RESTORE_R10; which are lowered to a RET and a RET
followed by a MOV respectively.  Having a fake instruction prevents
the verifier from seeing a MachineBasicBlock end with a
non-terminator (MOV).  It also prevents the rather eccentric case of a
MachineBasicBlock ending with RET but having successors nevertheless.

Patch by Sanjoy Das.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143062 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-26 21:12:27 +00:00
Chandler Carruth
3071363bcd Completely re-write the algorithm behind MachineBlockPlacement based on
discussions with Andy. Fundamentally, the previous algorithm is both
counter productive on several fronts and prioritizing things which
aren't necessarily the most important: static branch prediction.

The new algorithm uses the existing loop CFG structure information to
walk through the CFG itself to layout blocks. It coalesces adjacent
blocks within the loop where the CFG allows based on the most likely
path taken. Finally, it topologically orders the block chains that have
been formed. This allows it to choose a (mostly) topologically valid
ordering which still priorizes fallthrough within the structural
constraints.

As a final twist in the algorithm, it does violate the CFG when it
discovers a "hot" edge, that is an edge that is more than 4x hotter than
the competing edges in the CFG. These are forcibly merged into
a fallthrough chain.

Future transformations that need te be added are rotation of loop exit
conditions to be fallthrough, and better isolation of cold block chains.
I'm also planning on adding statistics to model how well the algorithm
does at laying out blocks based on the probabilities it receives.

The old tests mostly still pass, and I have some new tests to add, but
the nested loops are still behaving very strangely. This almost seems
like working-as-intended as it rotated the exit branch to be
fallthrough, but I'm not convinced this is actually the best layout. It
is well supported by the probabilities for loops we currently get, but
those are pretty broken for nested loops, so this may change later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142743 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-23 09:18:45 +00:00
Nadav Rotem
5b2bba6122 Fix pr11193.
SHL inserts zeros from the right, thus even when the original
sign_extend_inreg value was of 1-bit, we need to sra.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142724 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-22 12:39:25 +00:00
Nadav Rotem
a054bcb4cf Fix pr11194. When promoting and splitting integers we need to use
ZExtPromotedInteger and SExtPromotedInteger based on the operation we legalize.

SetCC return type needs to be legalized via PromoteTargetBoolean.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142660 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-21 17:35:19 +00:00
Chandler Carruth
7555c40c48 Don't hard code the desired alignment for loops -- it isn't 16-bytes on
all x86 systems. Sorry for the breakage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142656 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-21 16:41:39 +00:00
Nadav Rotem
4bd222ae26 1. Fix the widening of SETCC in WidenVecOp_SETCC. Use the correct return CC type.
2. Fix a typo in CONCAT_VECTORS which exposed the bug in #1.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142648 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-21 11:42:07 +00:00
Chandler Carruth
4a85cc982a Add loop aligning to MachineBlockPlacement based on review discussion so
it's a bit more plausible to use this instead of CodePlacementOpt. The
code for this was shamelessly stolen from CodePlacementOpt, and then
trimmed down a bit. There doesn't seem to be much utility in returning
true/false from this pass as we may or may not have rewritten all of the
blocks. Also, the statistic of counting how many loops were aligned
doesn't seem terribly important so I removed it. If folks would like it
to be included, I'm happy to add it back.

This was probably the most egregious of the missing features, and now
I'm going to start gathering some performance numbers and looking at
specific loop structures that have different layout between the two.

Test is updated to include both basic loop alignment and nested loop
alignment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142645 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-21 08:57:37 +00:00
Chandler Carruth
4162eced73 Add a very basic test for MachineBlockPlacement. This is essentially the
canonical example I used when developing it, and is one of the primary
motivating real-world use cases for __builtin_expect (when burried under
a macro).

I'm working on more test cases here, but I'm trying to make sure both
that the pass is doing the right thing with the test cases and that they
aren't too brittle to changes elsewhere in the code generation pipeline.

Feedback and/or suggestions on how to test this are very welcome.
Especially feedback on whether testing the block comments is a good
strategy; I couldn't find any good examples to steal from but all the
other ideas I had were a lot uglier or more fragile.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142644 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-21 08:01:56 +00:00
Craig Topper
b4c945716f Remove intrinsics for X86 BLSI, BLSMSK, and BLSR intrinsics and replace with custom isel lowering code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142642 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-21 06:55:01 +00:00
Evan Cheng
fd230df463 Fix TLS lowering bug. The CopyFromReg must be glued to the TLSCALL. rdar://10291355
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142550 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-19 22:22:54 +00:00
Nadav Rotem
815af82b74 Improve code generation for vselect on SSE2:
When checking the availability of instructions using the TLI, a 'promoted'
instruction IS available. It means that the value is bitcasted to another type
for which there is an operation. The correct check for the availablity of an
instruction is to check if it should be expanded.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142542 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-19 20:43:16 +00:00
Nadav Rotem
ca58c72267 Add support for the vector-widening of vselect and vector-setcc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142488 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-19 09:45:11 +00:00
Craig Topper
717cdb0df8 Rename PEXTR to PEXT. Add intrinsics for BMI instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142480 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-19 07:48:35 +00:00
Lang Hames
aa13603a3e Added testcase for <rdar://problem/10215997>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142462 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-18 23:50:52 +00:00
Nadav Rotem
d88bc2a683 Add additional element-promotion tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142442 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-18 23:05:33 +00:00
Nadav Rotem
fbf19ef186 Fix a bug in the legalization of vector anyext-load and trunc-store. Mem Index starts with zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142434 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-18 22:32:43 +00:00
Nick Lewycky
44d798d976 Add support for a new extension to the .file directive:
.file filenumber "directory" "filename"

This removes one join+split of the directory+filename in MC internals. Because
bitcode files have independent fields for directory and filenames in debug info,
this patch may change the .o files written by existing .bc files.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142300 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-17 23:05:28 +00:00
Nadav Rotem
63aa7910ae stabalize tests by specifying the exact sse level
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142229 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-17 19:45:38 +00:00
Nadav Rotem
39c14efd6d Clean the triple, add check lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142183 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-17 07:07:51 +00:00
Nadav Rotem
ee6fd6a2b2 Previously v2i32 vectors were legalized to v4i32. Now, they are legalized to
v2i64. These tests do not check MMX nor zmoving into them.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142182 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-17 06:59:01 +00:00
Nadav Rotem
f6481aee43 Add tripple and stabalize a few more tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142158 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-16 21:20:54 +00:00
Nadav Rotem
acb10f1519 Add triple to tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142154 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-16 20:53:20 +00:00
Nadav Rotem
9eee96a598 fix a typo in the test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142153 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-16 20:43:41 +00:00
Nadav Rotem
8fb06b3e8f Enable element promotion type legalization by deafault.
Changed tests which assumed that vectors are legalized by widening them.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142152 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-16 20:31:33 +00:00
Jakob Stoklund Olesen
ac7caa0d43 Update live-in lists when splitting critical edges.
Fixes PR10814. Patch by Jan Sjödin!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141960 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-14 17:25:46 +00:00
Craig Topper
54a11176f6 Add X86 ANDN instruction. Including instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141947 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-14 07:06:56 +00:00
Craig Topper
909652f687 Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 processor which is gcc's name for Haswell.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141939 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-14 03:21:46 +00:00
Jakob Stoklund Olesen
a80444f88d Add value numbers when spilling dead defs.
When spilling around an instruction with a dead def, remember to add a
value number for the def.

The missing value number wouldn't normally create problems since there
would be an incoming live range as well.  However, due to another bug
we could spill a dead V_SET0 instruction which doesn't read any values.

The missing value number caused an empty live range to be created which
is dangerous since it doesn't interfere with anything.

This fixes part of PR11125.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141923 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-14 00:34:31 +00:00
Benjamin Kramer
d38e99e949 Force CPU type on test so it doesn't accidentally emit movbe instead of bswap on Intel Atom CPUs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141863 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-13 14:27:54 +00:00
Bill Wendling
4e68054b20 More closely follow libgcc, which has code after the `ret' instruction to
release the stack segment and reset the stack pointer. Place the code in its own
MBB to make the verifier happy.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141859 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-13 08:24:19 +00:00
Bill Wendling
1203fe7fc8 Revert r141854 because it was causing failures:
http://lab.llvm.org:8011/builders/llvm-x86_64-linux/builds/101

--- Reverse-merging r141854 into '.':
U    test/MC/Disassembler/X86/x86-32.txt
U    test/MC/Disassembler/X86/simple-tests.txt
D    test/CodeGen/X86/bmi.ll
U    lib/Target/X86/X86InstrInfo.td
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86.td
U    lib/Target/X86/X86Subtarget.h



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141857 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-13 07:48:07 +00:00
Bill Wendling
82222c20be Should not add instructions to a BB after a return instruction. The machine instruction verifier doesn't like this, nor do I.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141856 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-13 07:42:32 +00:00
Craig Topper
8ab1d1e900 Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 processor which is gcc's name for Haswell.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141854 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-13 07:09:14 +00:00
Jakob Stoklund Olesen
dee83c90bb Also inflate register classes around inline asm.
Now that MI->getRegClassConstraint() can also handle inline assembly,
don't bail when recomputing the register class of a virtual register
used by inline asm.

This fixes PR11078.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141836 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-12 23:37:40 +00:00
Bill Wendling
f6fb7ed53c We need to verify that the machine instruction we're using as a replacement for
our current machine instruction defines a register with the same register class
as what's being replaced. This showed up in the SPEC 403.gcc benchmark, where it
would ICE because a tail call was expecting one register class but was given
another. (The machine instruction verifier catches this situation.)
<rdar://problem/10270968>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141830 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-12 23:03:40 +00:00
Bob Wilson
b75e5d96dd Make this test more specific. There are 3 stats that matched "machine-licm".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141741 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-11 23:34:31 +00:00
Eric Christopher
6618a241f7 Add a new wrapper node for a DILexicalBlock that encapsulates it and a
file. Since it should only be used when necessary propagate it through
the backend code generation and tweak testcases accordingly.

This helps with code like in clang's test/CodeGen/debug-info-line.c where
we have multiple #line directives within a single lexical block and want
to generate only a single block that contains each file change.

Part of rdar://10246360

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141729 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-11 22:59:11 +00:00
Devang Patel
2e35047947 Add dominance check for the instruction being hoisted.
For example, MachineLICM should not hoist a load that is not guaranteed to be executed.
Radar 10254254.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141689 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-11 18:09:58 +00:00
Nadav Rotem
6fe4e51547 Add support for legalization of vector SHL/SRA/SRL instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141667 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-11 14:36:35 +00:00
Craig Topper
a6f386b4cd Test case for X86 LZCNT instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141652 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-11 06:47:01 +00:00
NAKAMURA Takumi
b1044b042f test/CodeGen/X86/movbe.ll: Give explicit -mtriple=x86_64-linux, to unbreak win32 hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141640 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-11 03:41:03 +00:00
Devang Patel
db7334dbc5 Revert r141569 and r141576.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141594 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-10 23:18:02 +00:00
Eli Friedman
dca62d53b7 Make sure the X86 backend doesn't explode on 128-bit shuffles in AVX mode. Fixes PR11102.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141585 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-10 22:28:47 +00:00
Devang Patel
6b50bc9d88 If loop header is also loop exiting block then it may not be safe to hoist instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141576 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-10 20:32:03 +00:00
Nadav Rotem
a7934dd8e4 Fix 10892 - When lowering SIGN_EXTEND_INREG do not lower v2i64 because the
instruction set has no 64-bit SRA support.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141570 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-10 19:31:45 +00:00
Devang Patel
9ac743a4ee Add dominance check for the instruction being hoisted.
For example, MachineLICM should not hoist a load that is not guaranteed to be executed.
Radar 10254254.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141569 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-10 19:09:20 +00:00
Benjamin Kramer
a86a58695d X86: Add patterns for the movbe instruction (mov + bswap, only available on atom)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141563 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-10 18:34:56 +00:00
Jakob Stoklund Olesen
ed74482704 Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies.
In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX
instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot
target all GR8 registers, only those in GR8_NOREX.

TO enforce this, we ensure that all instructions using the
EXTRACT_SUBREG are GR8_NOREX constrained.

This fixes PR11088.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141499 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-08 18:28:28 +00:00
Jakob Stoklund Olesen
a55f6575ae Add missing test case for r141410.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141498 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-08 18:06:54 +00:00
Evan Cheng
7c1780c5fe High bits of movmskp{s|d} and pmovmskb are known zero. rdar://10247336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141371 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-07 17:21:44 +00:00
Bill Wendling
0676d2a04c Filecheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140904 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-30 23:40:29 +00:00
Bill Wendling
127f410c3a Add new line at end of file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140903 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-30 23:21:11 +00:00
Bill Wendling
e09b2a0d49 When inferring the pointer alignment, if the global doesn't have an initializer
and the alignment is 0 (i.e., it's defined globally in one file and declared in
another file) it could get an alignment which is larger than the ABI allows for
that type, resulting in aligned moves being used for unaligned loads.

For instance, in file A.c:

   struct S s;

In file B.c:
   struct {
     // something long
   };
   extern S s;

   void foo() {
     struct S p = s;
     // ...
   }

this copy is a 'memcpy' which is turned into a series of 'movaps' instructions
on X86. But this is wrong, because 'struct S' has alignment of 4, not 16.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140902 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-30 23:19:55 +00:00
Andrew Trick
0c01bc385a LSR: rewrite inner loops only.
Rewriting the entire loop nest now requires -enable-lsr-nested.
See PR11035 for some performance data.
A few unit tests specifically test nested LSR, and are now under a flag.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140762 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-29 01:33:38 +00:00
Eli Friedman
7d3e2b78c7 PR11033: Make sure we don't generate PCMPGTQ and PCMPEQQ if the target CPU does not support them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140723 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-28 21:00:25 +00:00
Jakob Stoklund Olesen
df4b35e3dd Remove X86-dependent stuff from SSEDomainFix.
This also enables domain swizzling for AVX code which required a few
trivial test changes.

The pass will be moved to lib/CodeGen shortly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140659 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-27 23:50:46 +00:00
Eli Friedman
139e6699c4 Last batch of test conversions to new atomic instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140585 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-27 00:17:29 +00:00
Eli Friedman
184944acdf Convert a bunch more tests over to the new atomic instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140582 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-26 23:15:09 +00:00
Jakob Stoklund Olesen
51f0c76419 Only run MF.verify() with EXPENSIVE_CHECKS=1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140441 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-24 01:11:19 +00:00
Jakob Stoklund Olesen
5adc07ebe8 Verify that terminators follow non-terminators.
This exposes a -segmented-stacks bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140429 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-23 22:45:39 +00:00
Eli Friedman
bde81d5be9 PR10998: It is not legal to sink an instruction past the terminator of a block; make sure we don't do that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140428 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-23 22:41:57 +00:00
Eli Friedman
7666c7e4d2 PR10989: Don't print .hidden on Windows.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140356 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-23 00:13:02 +00:00
Eli Friedman
a6176adc8a PR10991: make fast-isel correctly check whether accessing a global through an alias involves thread-local storage. (I'm not entirely sure how this is supposed to work, but this patch makes fast-isel consistent with the normal isel path.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140355 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-22 23:41:28 +00:00
Duncan Sands
17470bee5f Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from
floating point add/sub of appropriate shuffle vectors.  Does not
synthesize the 256 bit AVX versions because they work differently.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140332 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-22 20:15:48 +00:00
Devang Patel
1dd4e56d55 Do not unnecessarily use AT_specification DIE because it does not add any value.
Few weeks ago, llvm completely inverted the debug info graph. Earlier each debug info node used to keep track of its compile unit, now compile unit keeps track of important nodes. One impact of this change is that the global variable's do not have any context, which should be checked before deciding to use AT_specification DIE.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140282 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-21 23:41:11 +00:00
Nadav Rotem
d7e0ceaa59 add another testcase for pr10902
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140257 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-21 17:13:40 +00:00
Nadav Rotem
1147248e6f [VECTOR-SELECT] Address one of the bugs in pr10902.
Vector SetCC result types need to be type-legalized.
This code worked before because scalar result types are known to be legal.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140249 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-21 14:34:38 +00:00
Bruno Cardoso Lopes
e97190fdf8 Add a DAGCombine for subvector extracts to remove useless chains of
subvector inserts and extracts. Initial patch by Rackover, Zvi with
some tweak done by me.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140204 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-20 23:19:33 +00:00
Bruno Cardoso Lopes
f4b841d4e2 Revert r140097, working on a better approach
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140203 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-20 23:19:29 +00:00
NAKAMURA Takumi
6aaf0561ae test/CodeGen/X86/avx-minmax.ll: Unbreak Win32.
On Windows x64, 128-bit arguments are not passed by reg but by indirect. eg.

maxpd:
        vmovapd (%rcx), %xmm0
        vmaxpd  (%rdx), %xmm0, %xmm0

FIXME: I don't care YMM on x64 for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140143 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-20 14:11:35 +00:00
Craig Topper
3699261d3f Extend changes from r139986 to produce 256-bit AVX minps/minpd/maxps/maxpd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140140 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-20 07:38:59 +00:00
Bruno Cardoso Lopes
278cbfb3f5 Attempt to fix -mtriple=i686-{cygwin|mingw|win32} regressions. Nakamura,
if this doesn't work, please provide more details.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140107 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-20 00:08:12 +00:00
Bruno Cardoso Lopes
97136c922e Based on the small opt Zvi's patch was trying to achieve, eliminate
128-bit undef subvector insertion into a 256-bit vector

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140097 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-19 23:36:50 +00:00
Bruno Cardoso Lopes
97dc60b759 Match X86ISD::FSETCCsd and X86ISD::FSETCCss while in AVX mode. This fix
PR10955 and PR10948.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140069 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-19 21:29:24 +00:00
Nadav Rotem
354efd88db setOperationAction should be done on the return value of the type, not the operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140001 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-18 14:57:03 +00:00
Nadav Rotem
91e43fd17a When promoting integer vectors we often create ext-loads. This patch adds a
dag-combine optimization to implement the ext-load efficiently (using shuffles).

For example the type <4 x i8> is stored in memory as i32, but it needs to
find its way into a <4 x i32> register. Previously we scalarized the memory
access, now we use shuffles.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139995 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-18 10:39:32 +00:00
Benjamin Kramer
5778fef314 Apply Duncan's test fix from r139986 to the avx version of that test too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139992 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-18 00:41:38 +00:00
Duncan Sands
6bcd2196e5 Synthesize x86 max/min instructions also for vectors (i.e. produce
maxps and maxpd).  This broke the sse41-blend.ll testcase by causing
maxpd to be produced rather than a cmp+blend pair, which is the reason
I tweaked it.  Gives a small speedup on doduc with dragonegg when the
GCC vectorizer is used.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139986 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-17 16:49:39 +00:00
Andrew Trick
cc32efd592 Test case trial and error. Not sure the proper way to check MBB names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139900 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-16 03:57:19 +00:00
Andrew Trick
17bd2c5d68 Reduced a stronger test case for coalescer bug PR10920.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139898 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-16 03:46:49 +00:00
Jakob Stoklund Olesen
01afdb3a45 VirtRegMap is counting spill slots, not register spills.
Fix the stats counters to reflect that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139819 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-15 18:31:13 +00:00
Bruno Cardoso Lopes
0c4b9ff077 Change all checks regarding the presence of any SSE level to always
take into consideration the presence of AVX. This change, together with
the SSEDomainFix enabled for AVX, makes AVX codegen to always (hopefully)
emit the same code as SSE for 128-bit vector ops. I don't
have a testcase for this, but AVX now beats SSE in performance for
128-bit ops in the majority of programas in the llvm testsuite

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139817 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-15 18:27:36 +00:00
Andrew Trick
b1afbac64b [regcoalescing] bug fix for RegistersDefinedFromSameValue.
An improper SlotIndex->VNInfo lookup was leading to unsafe copy removal.
Fixes PR10920 401.bzip2 miscompile with no IV rewrite.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139765 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-15 01:09:33 +00:00
Nadav Rotem
436fe8498a Add integer promotion support for vselect
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139692 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-14 14:42:15 +00:00
Bruno Cardoso Lopes
5ca0d14915 Vector shuffle mask <i32 4, i32 5, i32 2, i32 3> should yield "movsd", not "movss".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139686 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-14 02:36:14 +00:00
Devang Patel
64789c582c Remove unnecessary old test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139674 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-14 00:28:54 +00:00
Eli Friedman
fe731214d2 Error out on CodeGen of unaligned load/store. Fix test so it isn't accidentally testing that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139641 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-13 20:50:54 +00:00
Nadav Rotem
e1490d1e43 update checked pattern
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139631 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-13 19:59:18 +00:00
Nadav Rotem
aec5861bb6 Add vselect target support for targets that do not support blend but do support
xor/and/or (For example SSE2).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139623 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-13 19:17:42 +00:00
Bruno Cardoso Lopes
8970060a4c Change testcase commandline to be more strict and silence buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139554 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-12 22:59:26 +00:00
Bruno Cardoso Lopes
5fc48100ee Fix PR10845. SUBREG_TO_REG shouldn't be used when the input and
destination types are equal!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139553 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-12 22:59:23 +00:00
Bruno Cardoso Lopes
457d53d9ce Revert the wrong part of r139528, and fix testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139541 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-12 21:24:07 +00:00
Bruno Cardoso Lopes
8e03a821f9 Not sure how CMPPS and CMPPD had already ever worked, I guess it didn't.
However with this fix it does now.

Basically the operand order for the x86 target specific node
is not the same as the instruction, but since the intrinsic need that
specific order at the instruction definition, just change the order
during legalization. Also, there were some wrong invertions of condition
codes, such as GE => LE, GT => LT, fix that too. Fix PR10907.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139528 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-12 19:30:40 +00:00
Eli Friedman
cfeb55cdbc Really un-XFAIL the testcase, like I said I would in r139458.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139459 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-10 02:02:27 +00:00
Richard Trieu
2db8628085 Fixed an assert from:
assert("not implemented for target shuffle node");

to:

  assert(0 && "not implemented for target shuffle node");

This causes a test failure in CodeGen/X86/palignr.ll which has
been marked as XFAIL for the time being.
Test failure filed at PR10901.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139454 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-10 01:26:21 +00:00
Nadav Rotem
8ffad56f8e Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139400 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-09 20:29:17 +00:00
Bruno Cardoso Lopes
7ec8fb8830 Add a AVX version of a simple i64 -> f64 bitcast. This could be
triggered using llc with -O0, which wouldn't let it be folded and
expose the lack of this pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139320 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 21:52:33 +00:00
Bruno Cardoso Lopes
7cf79a88c8 Reapply testcase from r139309!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139318 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 21:05:43 +00:00
Bruno Cardoso Lopes
caa60f15e4 Remove this crashing test, until I figure out what's going wrong here
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139309 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 18:32:36 +00:00
Bruno Cardoso Lopes
814c6ced85 Add AVX versions of blend vector operations and fix some issues noticed
in Nadav's r139285 and r139287 commits.

1) Rename vsel.ll to a more descriptive name
2) Change the order of BLEND operands to "Op1, Op2, Cond", this is
necessary because PBLENDVB is already used in different places with
this order, and it was being emitted in the wrong way for vselect
3) Add AVX patterns and tests for the same SSE41 instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139305 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 18:05:08 +00:00
Bruno Cardoso Lopes
7db2d3a504 Fix PR10844: Add patterns to cover non foldable versions of X86vzmovl.
Triggered using llc -O0. Also fix some SET0PS patterns to their AVX
forms and test it on the testcase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139304 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 18:05:02 +00:00
Nadav Rotem
cbdd2d10ba add a testcase for the previous patch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139287 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 08:31:31 +00:00
Eli Friedman
d5ccb0558f Fix atomic load and store on x86 to pass -verify-machineinstrs (and possibly fix some subtle bugs involving passes which check mayStore()).
This isn't exactly ideal, but it is good enough for the moment.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139245 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-07 18:48:32 +00:00
Duncan Sands
68b859f757 Another forgotten trampoline testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139230 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-07 10:05:14 +00:00
Devang Patel
541a81cc2b While sinking machine instructions, sink matching DBG_VALUEs also otherwise live debug variable pass will drop DBG_VALUEs on the floor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139208 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-07 00:07:58 +00:00
Jakob Stoklund Olesen
5047d76575 Pseudo CMOV instructions don't clobber EFLAGS.
The explanation about a 0 argument being materialized as xor is no
longer valid.  Rematerialization will check if EFLAGS is live before
clobbering it.

The code produced by X86TargetLowering::EmitLoweredSelect does not
clobber EFLAGS.

This causes one less testb instruction to be generated in the cmov.ll
test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139057 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-02 23:52:55 +00:00
Eli Friedman
4136d23c48 Don't fast-isel for atomic load/store; some cases require extra handling missing from fast-isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139044 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-02 22:33:24 +00:00
Duncan Sands
147272b8a7 Darwin wants ctors/dtors to be ordered the other way round to linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139015 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-02 18:07:19 +00:00
Benjamin Kramer
d4f27d7daa This test depends on cmov being available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138954 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-01 18:40:01 +00:00
Bruno Cardoso Lopes
a39ccdb9d4 Fix vbroadcast matching logic to early unmatch if the node doesn't have
only one use. Fix PR10825.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138951 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-01 18:15:06 +00:00
Andrew Trick
340d78f4e7 PreRA scheduler should avoid cloning compares.
Added canClobberReachingPhysRegUse() to handle a particular pattern in
which a two-address instruction could be forced to interfere with
EFLAGS, causing a compare to be unnecessarilly cloned.
Fixes rdar://problem/5875261


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138924 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-01 00:54:31 +00:00
Bill Wendling
78ae1f7947 Remove old declare statements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138905 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 21:41:20 +00:00
Bill Wendling
6b94b67319 Update more tests to the new EH scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138904 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 21:40:15 +00:00
Bill Wendling
935903191f Update more tests to the new EH scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138903 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 21:39:05 +00:00
David Greene
d92e2e4f88 Compress Repeated Byte Output
Emit a repeated sequence of bytes using .zero.  This saves an enormous
amount of asm file space for certain programs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138864 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 17:30:56 +00:00
Benjamin Kramer
31d27ce568 This test requires sse, otherwise x87 ops will block tailcall optimization
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138859 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 16:49:05 +00:00
Bruno Cardoso Lopes
57d6a5e491 - Move all MOVSS and MOVSD patterns close to their definitions
- Duplicate some store patterns to their AVX forms!
- Catched a bug while restricting the patterns subtarget, fix it
  and update a testcase to check it properly

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138851 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 03:04:20 +00:00
Evan Cheng
0899f5c62d Fix (movhps load) lowering / pattern to match more cases. rdar://10050549
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138848 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 02:05:24 +00:00
Benjamin Kramer
8f00ffce50 Fix test typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138843 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 00:02:59 +00:00
Rafael Espindola
6cac2025da Add a triple.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138831 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-30 21:19:37 +00:00
Rafael Espindola
b0bf8935ee Some test code to check if correct code is being generated.
Patch by Sanjoy Das.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138820 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-30 19:51:29 +00:00
Eli Friedman
f3704769bb Explicitly zero out parts of a vector which are required to be zero by the algorithm in LowerUINT_TO_FP_i32. This only has a substantial effect on the generated code when the input is extracted from a vector register; other ways of loading an i32 do the appropriate zeroing implicitly. Fixes PR10802.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138768 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-29 21:15:46 +00:00
Duncan Sands
fd9c4f76f4 Fix PR5329: pay attention to constructor/destructor priority
when outputting them.  With this, the entire LLVM testsuite
passes when built with dragonegg.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138724 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-28 13:17:22 +00:00
Bill Wendling
234e43a888 Update to new EH scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138699 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-27 04:53:41 +00:00
Bill Wendling
f2cf25b212 Cannot have an llvm.eh.exception call in a non-landing pad block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138698 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-27 04:53:28 +00:00
Eli Friedman
43f51aeca8 Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138660 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-26 21:21:21 +00:00
Bruno Cardoso Lopes
6292eceea0 Add support for AVX 256-bit version of MOVDDUP!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138588 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-25 21:40:37 +00:00
Bruno Cardoso Lopes
07b7f672a0 Add support for 256-bit versions of VSHUFPD and VSHUFPS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138546 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-25 02:58:26 +00:00
Eli Friedman
f8f90f0174 Hook up 64-bit atomic load/store on x86-32. I plan to write more efficient implementations eventually.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138505 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-24 22:33:28 +00:00
Eli Friedman
bbc87a3a9a Basic tests for atomic load and store on x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138486 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-24 21:16:59 +00:00
Craig Topper
13894fa135 Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138427 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-24 06:14:18 +00:00
Bruno Cardoso Lopes
d8b7dd5252 Fix a nasty bug where a v4i64 was being wrong emitted with 32-bit
permutations. Also tidy up some patterns and make them close to their
instruction definition!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138392 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-23 22:06:37 +00:00
Nick Lewycky
726ebd6ff3 PerformSubCombine to work on integers larger than i128. Fixes a crasher.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138354 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-23 19:01:24 +00:00
Craig Topper
a534780da0 Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding sclarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138321 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-23 04:36:33 +00:00
Bruno Cardoso Lopes
3bde6fe0df Introduce a pass to insert vzeroupper instructions to avoid AVX to
SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper"
llc command line option. This is only the first step (very naive and
conservative one) to sketch out the idea, but proper DFA is coming next
to allow smarter decisions. Comments and ideas now and in further commits
will be very appreciated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138317 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-23 01:14:17 +00:00
Bruno Cardoso Lopes
2ac8111159 Add support for breaking 256-bit int VETCC into two 128-bit ones,
avoding scalarization of the compare. Reduces code from 59 to 6
instructions. Fix PR10712.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138271 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-22 20:31:04 +00:00
Jakob Stoklund Olesen
7c6da77810 Add test case for r138018.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138033 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-19 04:30:24 +00:00
Ivan Krasin
74af88a666 FastISel: avoid function calls between the materialization of the constant and its use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137993 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-18 22:06:10 +00:00
Bruno Cardoso Lopes
24b90e2287 Cleanup vector logical ops in AVX and add use int versions for simple
v2i64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137919 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-18 02:11:34 +00:00
Bruno Cardoso Lopes
0dd80b0d69 Fix PR10688. Add support for spliting 256-bit vector shifts when the
shift amount is variable

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137885 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-17 22:12:20 +00:00
Bruno Cardoso Lopes
0e6d230abd Introduce matching patterns for vbroadcast AVX instruction. The idea is to
match splats in the form (splat (scalar_to_vector (load ...))) whenever
the load can be folded. All the logic and instruction emission is
working but because of PR8156, there are no ways to match loads, cause
they can never be folded for splats. Thus, the tests are XFAILed, but
I've tested and exercised all the logic using a relaxed version for
checking the foldable loads, as if the bug was already fixed. This
should work out of the box once PR8156 gets fixed since MayFoldLoad will
work as expected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137810 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes
666f500592 Update test to not use the scalar type to splat from a load
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137809 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-17 02:29:15 +00:00
Bruno Cardoso Lopes
fc0a702128 Now that we have a canonical way to handle 256-bit splats:
vinsertf128 $1 + vpermilps $0, remove the old code that used to first
do the splat in a 128-bit vector and then insert it into a larger one.
This is better because the handling code gets simpler and also makes a
better room for the upcoming vbroadcast!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137807 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-17 02:29:10 +00:00
Bruno Cardoso Lopes
3b86598cfa Instead of always leaving the work to the generic legalizer when
there is no support for native 256-bit shuffles, be more smart in some
cases, for example, when you can extract specific 128-bit parts and use
regular 128-bit shuffles for them. Example:

For this shuffle:
  shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32>
                <i32 1, i32 0, i32 7, i32 6>

This was expanded to:
  vextractf128  $1, %ymm1, %xmm2
  vpextrq $0, %xmm2, %rax
  vmovd %rax, %xmm1
  vpextrq $1, %xmm2, %rax
  vmovd %rax, %xmm2
  vpunpcklqdq %xmm1, %xmm2, %xmm1
  vpextrq $0, %xmm0, %rax
  vmovd %rax, %xmm2
  vpextrq $1, %xmm0, %rax
  vmovd %rax, %xmm0
  vpunpcklqdq %xmm2, %xmm0, %xmm0
  vinsertf128 $1, %xmm1, %ymm0, %ymm0
  ret

Now we get:
  vshufpd $1, %xmm0, %xmm0, %xmm0
  vextractf128  $1, %ymm1, %xmm1
  vshufpd $1, %xmm1, %xmm1, %xmm1
  vinsertf128 $1, %xmm1, %ymm0, %ymm0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137733 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-16 18:21:54 +00:00