Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.
Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152366 91177308-0d34-0410-b5e6-96231b3b80d8
This was testing the handling of sub-register coalescing followed by
remat. The original problem was caused by the extra <imp-def> operands
added by sub-register coalescing. Those <imp-def> operands are not
added any longer, and the test case passes even when the original patch
is reverted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152040 91177308-0d34-0410-b5e6-96231b3b80d8
Some BBs can become dead after codegen preparation. If we delete them here, it
could help enable tail-call optimizations later on.
<rdar://problem/10256573>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152002 91177308-0d34-0410-b5e6-96231b3b80d8
In this instance we are generating the tail-call during legalizeDAG. The 2nd
floor call can't be a tail call because it clobbers %xmm1, which is defined by
the first floor call. The first floor call can't be a tail-call because it's
not in the tail position. The only reasonable way I could think to fix this
in a target-independent manner was to check for glue logic on the copy reg.
rdar://10930395
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151877 91177308-0d34-0410-b5e6-96231b3b80d8
so that the test will not fail when run on an Intel Atom
processor, due to the Atom scheduler producing an instruction sequence that is
different from that which is normally expected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151832 91177308-0d34-0410-b5e6-96231b3b80d8
To avoid problems with zero shifts when getting the bits that move between words
we use a trick: first shift the by amount-1, then do another shift by one. When
amount is 0 (and size 32) we first shift by 31, then by one, instead of by 32.
Also fix a latent bug that emitted the low and high words in the wrong order
when shifting right.
Fixes PR12113.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151637 91177308-0d34-0410-b5e6-96231b3b80d8
When the GEP index is a vector of pointers, the code that calculated the size
of the element started from the vector type, and not the contained pointer type.
As a result, instead of looking at the data element pointed by the vector, this
code used the size of the vector. This works for 32bit members (on 32bit
systems), but not for other types. Added code to peel the vector type and
added a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151626 91177308-0d34-0410-b5e6-96231b3b80d8
[Joe Groff] Hi everyone. My previous patch applied as r151382 had a few problems:
Clang raised a warning, and X86 LowerOperation would assert out for
fptoui f64 to i32 because it improperly lowered to an illegal
BUILD_PAIR. Here's a patch that addresses these issues. Let me know if
any other changes are necessary. Thanks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151432 91177308-0d34-0410-b5e6-96231b3b80d8
ecx = mov eax
al = mov ch
The second copy is not a nop because the sub-indices of ecx,ch is not the
same of that of eax/al.
Re-enabled machine-cp.
PR11940
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151002 91177308-0d34-0410-b5e6-96231b3b80d8
Thanks to Anton, Duncan and Rafael for helping me track this down.
Pointy hat to Rafael for introducing the bug in the first place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150811 91177308-0d34-0410-b5e6-96231b3b80d8
processor, due to the Atom scheduler producing an instruction sequence that is
different from that which is expected.
Patch by Michael Spencer!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150736 91177308-0d34-0410-b5e6-96231b3b80d8
that are greater than the vector element type. For example BUILD_VECTOR
of type <1 x i1> with a constant i8 operand.
This patch fixes the assertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150477 91177308-0d34-0410-b5e6-96231b3b80d8
If the DEC node had more than one user, it was doing this lowering but
leaving the original DEC node around and so decrementing twice.
Fixes PR11964.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150356 91177308-0d34-0410-b5e6-96231b3b80d8
v8i8 -> v8i32 on AVX machines. The codegen often scalarizes ANY_EXTEND nodes.
The DAGCombiner has two optimizations that can mitigate the problem. First,
if all of the operands of a BUILD_VECTOR node are extracted from an ZEXT/ANYEXT
nodes, then it is possible to create a new simplified BUILD_VECTOR which uses
UNDEFS/ZERO values to eliminate the scalar ZEXT/ANYEXT nodes.
Second, another dag combine optimization lowers BUILD_VECTOR into a shuffle
vector instruction.
In the case of zext v8i8->v8i32 on AVX, a value in an XMM register is to be
shuffled into a wide YMM register.
This patch modifes the second optimization and allows the creation of
shuffle vectors even when the newly generated vector and the original vector
from which we extract the values are of different types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150340 91177308-0d34-0410-b5e6-96231b3b80d8
Creates a configurable regalloc pipeline.
Ensure specific llc options do what they say and nothing more: -reglloc=... has no effect other than selecting the allocator pass itself. This patch introduces a new umbrella flag, "-optimize-regalloc", to enable/disable the optimizing regalloc "superpass". This allows for example testing coalscing and scheduling under -O0 or vice-versa.
When a CodeGen pass requires the MachineFunction to have a particular property, we need to explicitly define that property so it can be directly queried rather than naming a specific Pass. For example, to check for SSA, use MRI->isSSA, not addRequired<PHIElimination>.
CodeGen transformation passes are never "required" as an analysis
ProcessImplicitDefs does not require LiveVariables.
We have a plan to massively simplify some of the early passes within the regalloc superpass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150226 91177308-0d34-0410-b5e6-96231b3b80d8
In this patch we optimize this pattern and convert the sequence into extract op of a narrow type.
This allows the BUILD_VECTOR dag optimizations to construct efficient shuffle operations in many cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149692 91177308-0d34-0410-b5e6-96231b3b80d8