happening.
Enhance scheduling to set the DEAD flag on implicit defs
more aggressively. Before, we'd set an implicit def operand
to dead if it were present in the SDNode corresponding to
the machineinstr but had no use. Now we do it in this case
AND if the implicit def does not exist in the SDNode at all.
This exposes a couple of problems: one is the FIXME, which
causes a live intervals crash on CodeGen/X86/sibcall.ll.
The second is that it makes machinecse and licm more
aggressive (which is a good thing) but also exposes a case
where licm hoists a set0 and then it doesn't get resunk.
Talking to codegen folks about both these issues, but I need
this patch in in the meantime.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99485 91177308-0d34-0410-b5e6-96231b3b80d8
override prefix and only the r/m16 forms should have had that. Also for variant
one, the AT&T syntax, added suffixes to all forms. Also added the missing
64-bit form for 'CRC32 r64, r/m8'. Plus added test cases for all forms and
tweaked one test case to add the needed suffixes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98980 91177308-0d34-0410-b5e6-96231b3b80d8
- Although it would be nice to allow this decoupling, the assembler needs to be able to reason about MCSymbolRefExprs in too many places to make this viable. We can use a target specific encoding of the variant if this becomes an issue.
- This patch also extends llvm-mc to support parsing of the modifiers, as opposed to lumping them in with the symbol.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98592 91177308-0d34-0410-b5e6-96231b3b80d8
32-bit indices. Instead of shuffling each element out of the index vector,
when all indices are needed, just store the input vector to the stack and
load the elements out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98588 91177308-0d34-0410-b5e6-96231b3b80d8
label is generated, but then the block is deleted. Since the
value is undefined, we just emit the label right after the entry
label of the function. It might matter that the label is in the
same section as the function was afterall.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98579 91177308-0d34-0410-b5e6-96231b3b80d8
function, then the BB is RAUW'd before the definition is emitted. There
are still two cases not being handled, but this should improve us back to
the situation before I touched anything.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98566 91177308-0d34-0410-b5e6-96231b3b80d8
no arguments instead of having to come up with a unique name.
This also makes the code less fragile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98364 91177308-0d34-0410-b5e6-96231b3b80d8
cl = EXTRACT_SUBREG reg1024, 1, is overly conservative. It should check
for overlaps of vr's live interval with the super registers of the
physical register (ECX in this case) and let JoinIntervals() handle checking
the coalescing feasibility against the physical register (cl in this case).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98251 91177308-0d34-0410-b5e6-96231b3b80d8
available, the only thing this affects is that we produce
.set in one case we didn't before, which shouldn't harm
anything. Make EmitSectionOffset call EmitDifference
instead of duplicating it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98005 91177308-0d34-0410-b5e6-96231b3b80d8
is a workaround for <rdar://problem/7672401/> (which I filed).
This let's us build Wine on Darwin, and it gets the Qt build there a little bit
further (so Doug says).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97845 91177308-0d34-0410-b5e6-96231b3b80d8
CALL ... %RAX<imp-def>
... [not using %RAX]
%EAX = ..., %RAX<imp-use, kill>
RET %EAX<imp-use,kill>
Now we do this:
CALL ... %RAX<imp-def, dead>
... [not using %RAX]
%EAX = ...
RET %EAX<imp-use,kill>
By not artificially keeping %RAX alive, we lower register pressure a bit.
The correct number of instructions for 2008-08-05-SpillerBug.ll is obviously
55, anybody can see that. Sheesh.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97838 91177308-0d34-0410-b5e6-96231b3b80d8
node which has a flag. That flag in turn was used by an
already-selected adde which turned into an ADC32ri8 which
used a selected load which was chained to the load we
folded. This flag use caused us to form a cycle. Fix
this by not ignoring chains in IsLegalToFold even in
cases where the isel thinks it can.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97791 91177308-0d34-0410-b5e6-96231b3b80d8
This code:
float floatingPointComparison(float x, float y) {
double product = (double)x * y;
if (product == 0.0)
return product;
return product - 1.0;
}
produces this:
_floatingPointComparison:
0000000000000000 cvtss2sd %xmm1,%xmm1
0000000000000004 cvtss2sd %xmm0,%xmm0
0000000000000008 mulsd %xmm1,%xmm0
000000000000000c pxor %xmm1,%xmm1
0000000000000010 ucomisd %xmm1,%xmm0
0000000000000014 jne 0x00000004
0000000000000016 jp 0x00000002
0000000000000018 jmp 0x00000008
000000000000001a addsd 0x00000006(%rip),%xmm0
0000000000000022 cvtsd2ss %xmm0,%xmm0
0000000000000026 ret
The "jne/jp/jmp" sequence can be reduced to this instead:
_floatingPointComparison:
0000000000000000 cvtss2sd %xmm1,%xmm1
0000000000000004 cvtss2sd %xmm0,%xmm0
0000000000000008 mulsd %xmm1,%xmm0
000000000000000c pxor %xmm1,%xmm1
0000000000000010 ucomisd %xmm1,%xmm0
0000000000000014 jp 0x00000002
0000000000000016 je 0x00000008
0000000000000018 addsd 0x00000006(%rip),%xmm0
0000000000000020 cvtsd2ss %xmm0,%xmm0
0000000000000024 ret
for a savings of 2 bytes.
This xform can happen when we recognize that jne and jp jump to the same "true"
MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch
is the fall-through MBB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97766 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG
reads AX in order to avoid reading AH with a REX instruction.
Fix PR6489.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97742 91177308-0d34-0410-b5e6-96231b3b80d8
long test(long x) { return (x & 123124) | 3; }
Currently compiles to:
_test:
orl $3, %edi
movq %rdi, %rax
andq $123127, %rax
ret
This is because instruction and DAG combiners canonicalize
(or (and x, C), D) -> (and (or, D), (C | D))
However, this is only profitable if (C & D) != 0. It gets in the way of the
3-addressification because the input bits are known to be zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97616 91177308-0d34-0410-b5e6-96231b3b80d8
CopyToReg/CopyFromReg/INLINEASM. These are annoying because
they have the same opcode before an after isel. Fix this by
setting their NodeID to -1 to indicate that they are selected,
just like what automatically happens when selecting things that
end up being machine nodes.
With that done, give IsLegalToFold a new flag that causes it to
ignore chains. This lets the HandleMergeInputChains routine be
the one place that validates chains after a match is successful,
enabling the new hotness in chain processing. This smarter
chain processing eliminates the need for "PreprocessRMW" in the
X86 and MSP430 backends and enables MSP to start matching it's
multiple mem operand instructions more aggressively.
I currently #if out the dead code in the X86 backend and MSP
backend, I'll remove it for real in a follow-on patch.
The testcase changes are:
test/CodeGen/X86/sse3.ll: we generate better code
test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was
miscompiling this before, we now generate correct code
Convert it to filecheck while I'm at it.
test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem
folding to make anton happy. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97596 91177308-0d34-0410-b5e6-96231b3b80d8