approach taken is different to that in LegalizeDAG
when it is a question of expanding or promoting the
result type: for example, if extracting an i64 from
a <2 x i64>, when i64 needs expanding, it bitcasts
the vector to <4 x i32>, extracts the appropriate
two i32's, and uses those for the Lo and Hi parts.
Likewise, when extracting an i16 from a <4 x i16>,
and i16 needs promoting, it bitcasts the vector to
<2 x i32>, extracts the appropriate i32, twiddles
the bits if necessary, and uses that as the promoted
value. This puts more pressure on bitcast legalization,
and I've added the appropriate cases. They needed to
be added anyway since users can generate such bitcasts
too if they want to. Also, when considering various
cases (Legal, Promote, Expand, Scalarize, Split) it is
a pain that expand can correspond to Expand, Scalarize
or Split, so I've changed the LegalizeTypes enum so it
lists those different cases - now Expand only means
splitting a scalar in two.
The code produced is the same as by LegalizeDAG for
all relevant testcases, except for
2007-10-31-extractelement-i64.ll, where the code seems
to have improved (see below; can an expert please tell
me if it is better or not).
Before < vs after >.
< subl $92, %esp
< movaps %xmm0, 64(%esp)
< movaps %xmm0, (%esp)
< movl 4(%esp), %eax
< movl %eax, 28(%esp)
< movl (%esp), %eax
< movl %eax, 24(%esp)
< movq 24(%esp), %mm0
< movq %mm0, 56(%esp)
---
> subl $44, %esp
> movaps %xmm0, 16(%esp)
> pshufd $1, %xmm0, %xmm1
> movd %xmm1, 4(%esp)
> movd %xmm0, (%esp)
> movq (%esp), %mm0
> movq %mm0, 8(%esp)
< subl $92, %esp
< movaps %xmm0, 64(%esp)
< movaps %xmm0, (%esp)
< movl 12(%esp), %eax
< movl %eax, 28(%esp)
< movl 8(%esp), %eax
< movl %eax, 24(%esp)
< movq 24(%esp), %mm0
< movq %mm0, 56(%esp)
---
> subl $44, %esp
> movaps %xmm0, 16(%esp)
> pshufd $3, %xmm0, %xmm1
> movd %xmm1, 4(%esp)
> movhlps %xmm0, %xmm0
> movd %xmm0, (%esp)
> movq (%esp), %mm0
> movq %mm0, 8(%esp)
< subl $92, %esp
< movaps %xmm0, 64(%esp)
---
> subl $44, %esp
< movl 16(%esp), %eax
< movl %eax, 48(%esp)
< movl 20(%esp), %eax
< movl %eax, 52(%esp)
< movaps %xmm0, (%esp)
< movl 4(%esp), %eax
< movl %eax, 60(%esp)
< movl (%esp), %eax
< movl %eax, 56(%esp)
---
> pshufd $1, %xmm0, %xmm1
> movd %xmm1, 4(%esp)
> movd %xmm0, (%esp)
> movd %xmm1, 12(%esp)
> movd %xmm0, 8(%esp)
< subl $92, %esp
< movaps %xmm0, 64(%esp)
---
> subl $44, %esp
< movl 24(%esp), %eax
< movl %eax, 48(%esp)
< movl 28(%esp), %eax
< movl %eax, 52(%esp)
< movaps %xmm0, (%esp)
< movl 12(%esp), %eax
< movl %eax, 60(%esp)
< movl 8(%esp), %eax
< movl %eax, 56(%esp)
---
> pshufd $3, %xmm0, %xmm1
> movd %xmm1, 4(%esp)
> movhlps %xmm0, %xmm0
> movd %xmm0, (%esp)
> movd %xmm1, 12(%esp)
> movd %xmm0, 8(%esp)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47672 91177308-0d34-0410-b5e6-96231b3b80d8
operand of a VECTOR_SHUFFLE. The mask is a
vector of constant integers. The code in
LegalizeDAG doesn't bother to legalize the
mask, since it's basically just storage for
a bunch of constants, however LegalizeTypes
is more picky. The problem is that there may
not exist any legal vector-of-integers type
with a legal element type, so it is impossible
to create a legal mask! Unless of course you
cheat by creating a BUILD_VECTOR where the
operands have a different type to the element
type of the vector being built... This is
pretty ugly but works - all relevant tests in
the testsuite pass, and produce the same
assembler with and without LegalizeTypes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47670 91177308-0d34-0410-b5e6-96231b3b80d8
stack slot and store if the SINT_TO_FP is actually legal. This allows
us to compile:
double a(double b) {return (unsigned)b;}
to:
_a:
cvttsd2siq %xmm0, %rax
movl %eax, %eax
cvtsi2sdq %rax, %xmm0
ret
instead of:
_a:
subq $8, %rsp
cvttsd2siq %xmm0, %rax
movl %eax, %eax
cvtsi2sdq %rax, %xmm0
addq $8, %rsp
ret
crazy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47660 91177308-0d34-0410-b5e6-96231b3b80d8
_test:
movl %edi, %eax
ret
instead of:
_test:
movl $4294967295, %ecx
movq %rdi, %rax
andq %rcx, %rax
ret
It would be great to write this as a Pat pattern that used subregs
instead of a 'pseudo' instruction, but I don't know how to do that
in td files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47658 91177308-0d34-0410-b5e6-96231b3b80d8
Change several cases in SimplifyDemandedMask that don't ever do any
simplifying to reuse the logic in ComputeMaskedBits instead of
duplicating it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47648 91177308-0d34-0410-b5e6-96231b3b80d8
instead of init'ing it maximally to zeros on entry. getFreePhysReg
is pretty hot and only a few elements are typically used. This speeds
up linscan by 5% on 176.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47631 91177308-0d34-0410-b5e6-96231b3b80d8
CodeGen/PowerPC/illegal-element-type.ll): suppose
a node X is processed, and processing maps it to
a node Y. Then X continues to exist in the DAG,
but with no users. While processing some other
node, a new node may be created that happens to
be equal to X, and thus X will be reused rather
than a truly new node. This can cause X to
"magically reappear", and since it is in the
Processed state in will not be reprocessed, so
at the end of type legalization the illegal node
X can still be present. The solution is to replace
X with Y whenever X gets resurrected like this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47601 91177308-0d34-0410-b5e6-96231b3b80d8
GOT-style position independent code. Before only tail calls to
protected/hidden functions within the same module were optimized.
Now all function calls are tail call optimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47594 91177308-0d34-0410-b5e6-96231b3b80d8
calls. Before arguments that could overwrite each other were
explicitly lowered to a stack slot, not giving the register allocator
a chance to optimize. Now a sequence of copyto/copyfrom virtual
registers ensures that arguments are loaded in (virtual) registers
before they are lowered to the stack slot (and might overwrite each
other). Also parameter stack slots are marked mutable for
(potentially) tail calling functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47593 91177308-0d34-0410-b5e6-96231b3b80d8
vr1 = extract_subreg vr2, 3
...
vr3 = extract_subreg vr1, 2
The end result is vr3 is equal to vr2 with subidx 2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47592 91177308-0d34-0410-b5e6-96231b3b80d8
after legalize. Just because a constant is legal (e.g. 0.0 in SSE)
doesn't mean that its negated value is legal (-0.0). We could make
this stronger by checking to see if the negated constant is actually
legal post negation, but it doesn't seem like a big deal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47591 91177308-0d34-0410-b5e6-96231b3b80d8