vr1 = extract_subreg vr2, 3
...
vr3 = extract_subreg vr1, 2
The end result is that vr3 is equal to vr2 with subidx 2.
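For illustration, here is a minimal standalone C++ sketch of that folding. The
SubRegDef/Defs/foldChain names are made up for this toy; the real coalescer
works on MachineInstrs and register classes rather than a plain map.

#include <cstdio>
#include <map>

// Toy model: each vreg defined by an extract_subreg maps to its source vreg
// and subreg index.
struct SubRegDef { unsigned SrcReg; unsigned SubIdx; };
static std::map<unsigned, SubRegDef> Defs;

// Walk the chain of extract_subreg definitions and rewrite the final one to
// read straight from the original register, so the intermediate vreg can be
// coalesced away.
static SubRegDef foldChain(unsigned VReg) {
  SubRegDef D = Defs.at(VReg);
  while (Defs.count(D.SrcReg))
    D.SrcReg = Defs.at(D.SrcReg).SrcReg;
  return D;                          // vr3 ends up reading vr2 with subidx 2
}

int main() {
  Defs[1] = {2, 3};                  // vr1 = extract_subreg vr2, 3
  Defs[3] = {1, 2};                  // vr3 = extract_subreg vr1, 2
  SubRegDef F = foldChain(3);
  std::printf("vr3 = extract_subreg vr%u, %u\n", F.SrcReg, F.SubIdx);
}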
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47592 91177308-0d34-0410-b5e6-96231b3b80d8
after legalize. Just because a constant is legal (e.g. 0.0 in SSE)
doesn't mean that its negated value is legal (-0.0). We could make
this stronger by checking whether the negated constant is actually
legal, but it doesn't seem like a big deal.
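A toy C++ sketch of that "stronger" variant (the names here are illustrative,
not the actual DAGCombiner interface): only treat fneg(C) as free after
legalize if the target says the negated constant is itself a legal immediate.

// Illustrative only; the real code would query the target, not a callback.
bool isNegatedConstantFree(double C, bool AfterLegalize,
                           bool (*IsFPImmLegal)(double)) {
  if (!AfterLegalize)
    return true;              // pre-legalize, an illegal constant will still
                              // be cleaned up by legalization later
  return IsFPImmLegal(-C);    // post-legalize: 0.0 may be legal (SSE) while
                              // -0.0 is not, so ask the target explicitly
}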
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47591 91177308-0d34-0410-b5e6-96231b3b80d8
for CellSPU modifications:
- SPUInstrInfo.td refactoring: "multiclass" really is _your_ friend.
- Other improvements based on the refactoring effort in SPUISelLowering.cpp,
esp. in SPUISelLowering::PerformDAGCombine(), where zero-amount shifts and
rotates are now eliminated (see the sketch after this list) and other
scalar-to-vector-to-scalar silliness is also eliminated.
- 64-bit operations are being implemented; _muldi3.c from the gcc runtime now
compiles and generates the right code. More work still needs to be done.
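A minimal C++ sketch of the zero-amount fold (a toy node type, not real
SDNodes; the real combine runs in PerformDAGCombine): a shift or rotate whose
amount is the constant 0 is the identity, so the node is replaced by its input.

struct Node { int Opcode; Node *Operand; long ConstAmount; };

Node *combineZeroShiftOrRotate(Node *N) {
  if (N->ConstAmount == 0)     // shl/srl/sra/rotl/rotr by a constant 0
    return N->Operand;         // fold the node away, forwarding its input
  return N;                    // leave all other shifts/rotates alone
}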
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47532 91177308-0d34-0410-b5e6-96231b3b80d8
instead of with mmx registers. This horribleness is apparently
done by gcc to avoid having to insert emms in places that really
should have it. This is the second half of rdar://5741668.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47474 91177308-0d34-0410-b5e6-96231b3b80d8
GCC apparently does this, and code depends on not having to do
emms when this happens. This is x86-64 only so far; the second half
should handle x86-32.
rdar://5741668
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47470 91177308-0d34-0410-b5e6-96231b3b80d8
any, we force sdisel to do all regalloc for an asm. This
leads to gross but correct codegen.
This fixes the rest of PR2078.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47454 91177308-0d34-0410-b5e6-96231b3b80d8
inline asms.
Fix PR2078 by marking the aliases of a register as used whenever that
register is marked used. This prevents EAX from being allocated when AX is listed
in the clobber set for the asm.
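A standalone C++ sketch of the idea (the alias table and names are toy
stand-ins for the target's register info, not the actual code): marking a
register used also marks everything that overlaps it, so a clobber of AX keeps
EAX out of the allocatable set.

#include <map>
#include <set>
#include <string>
#include <vector>

// Toy alias table: each register maps to the registers it overlaps.
static std::map<std::string, std::vector<std::string>> Aliases = {
    {"AX",  {"AL", "AH", "EAX"}},
    {"EAX", {"AX", "AL", "AH"}},
};

// Mark Reg used and also mark all of its aliases.
static void markRegUsed(const std::string &Reg, std::set<std::string> &Used) {
  Used.insert(Reg);
  for (const std::string &A : Aliases[Reg])
    Used.insert(A);
}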
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47426 91177308-0d34-0410-b5e6-96231b3b80d8
- X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47290 91177308-0d34-0410-b5e6-96231b3b80d8
has plain one-result scalar integer multiplication instructions.
This avoids expanding such instructions into MUL_LOHI sequences that
must be special-cased at isel time, and avoids the problem in that
code that prevented memory operands from being folded.
This fixes PR1874, addressing the most common case. The uncommon
cases of optimizing multiply-high operations will require work
in DAGCombiner.
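The equivalence this relies on can be shown in a few lines of C++ (a toy
demonstration, not the isel code): the low half of a widening MUL_LOHI product
is exactly what a plain one-result multiply produces, so when only the low
half is used the single-result instruction suffices and its memory operand can
be folded normally.

#include <cassert>
#include <cstdint>

// What the MUL_LOHI expansion computes: the full double-width product,
// conceptually split into {Hi, Lo}; here only Lo is kept.
static uint32_t mulLoViaLoHi(uint32_t A, uint32_t B) {
  uint64_t Full = (uint64_t)A * B;
  return (uint32_t)Full;
}

// What a plain one-result multiply instruction computes.
static uint32_t mulDirect(uint32_t A, uint32_t B) { return A * B; }

int main() {
  assert(mulLoViaLoHi(0xDEADBEEF, 0x12345678) ==
         mulDirect(0xDEADBEEF, 0x12345678));   // identical low halves
}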
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47277 91177308-0d34-0410-b5e6-96231b3b80d8
CTTZ and CTPOP. The expansion code differs from
that in LegalizeDAG in that it chooses to take the
CTLZ/CTTZ count from the Hi/Lo part depending on
whether the Hi/Lo value is zero, not on whether
CTLZ/CTTZ of Hi/Lo returned 32 (or whatever the
width of the type is) for it. I made this change
because the optimizers may well know that Hi/Lo
is zero and exploit it. The promotion code for
CTTZ also differs from that in LegalizeDAG: it
uses an "or" to get the right result when the
original value is zero, rather than using a compare
and select. This also means the value doesn't
need to be zero extended.
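Concretely, the expansion and promotion can be sketched in C++ over 32-bit
halves (a toy model; GCC/Clang builtins stand in for the target's CTLZ/CTTZ,
and the real code builds SelectionDAG nodes instead):

#include <cassert>
#include <cstdint>

// Expansion: 64-bit CTLZ/CTTZ from 32-bit halves.  The half to count in is
// picked by testing whether Hi (resp. Lo) is zero, not by testing whether its
// CTLZ/CTTZ came back as 32, so known-zero halves optimize well.  (A zero
// 64-bit input is left out of this toy, since the builtins are undefined on 0.)
static uint32_t ctlz64(uint64_t X) {
  uint32_t Hi = (uint32_t)(X >> 32), Lo = (uint32_t)X;
  return Hi != 0 ? (uint32_t)__builtin_clz(Hi)
                 : 32 + (uint32_t)__builtin_clz(Lo);
}
static uint32_t cttz64(uint64_t X) {
  uint32_t Hi = (uint32_t)(X >> 32), Lo = (uint32_t)X;
  return Lo != 0 ? (uint32_t)__builtin_ctz(Lo)
                 : 32 + (uint32_t)__builtin_ctz(Hi);
}

// Promotion: CTTZ of an i16 promoted to i32.  OR-ing in bit 16 makes the
// result come out as 16 when the original value is zero, with no compare and
// select, and the high bits of the promoted value don't need to be zeroed.
static uint32_t cttz16(uint16_t X) {
  return (uint32_t)__builtin_ctz((uint32_t)X | 0x10000u);
}

int main() {
  assert(ctlz64(1) == 63 && cttz64(1ull << 40) == 40);
  assert(cttz16(0) == 16 && cttz16(8) == 3);
}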
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47075 91177308-0d34-0410-b5e6-96231b3b80d8
node as soon as we create it in SDISel. Previously we would lower it in
legalize. The problem with that is that the argument loads implied by
FORMAL_ARGUMENTS only get exposed after legalize, so only dag combine 2
can hack on them. This causes us to miss some optimizations because
datatype expansion also happens there.
Exposing the loads early allows us to do optimizations on them. For example
we now compile arg-cast.ll to:
_foo:
        movl $2147483647, %eax
        andl 8(%esp), %eax
        ret
where we previously produced:
_foo:
        subl $12, %esp
        movsd 16(%esp), %xmm0
        movsd %xmm0, (%esp)
        movl $2147483647, %eax
        andl 4(%esp), %eax
        addl $12, %esp
        ret
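For reference, a rough C++ analogue of what arg-cast.ll presumably exercises
(the exact test contents are an assumption inferred from the asm above): the
high 32 bits of an incoming double are read and the sign bit is masked off,
so with the loads exposed early we can load just the 4 bytes we need from the
argument area instead of round-tripping the whole double through %xmm0 and a
stack temporary.

#include <cstdint>
#include <cstring>

static uint32_t foo(double Y) {
  uint64_t Bits;
  std::memcpy(&Bits, &Y, sizeof(Bits));         // bitcast double -> i64
  return (uint32_t)(Bits >> 32) & 0x7FFFFFFFu;  // andl $2147483647, high word
}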
It might also make sense to do this for ISD::CALL nodes, which have implicit
stores on many targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47054 91177308-0d34-0410-b5e6-96231b3b80d8