We cannot use a normal call here since it has extra unmodelled side
effects (it changes stack pointer). This should fix PR5292.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97884 91177308-0d34-0410-b5e6-96231b3b80d8
This code:
float floatingPointComparison(float x, float y) {
double product = (double)x * y;
if (product == 0.0)
return product;
return product - 1.0;
}
produces this:
_floatingPointComparison:
0000000000000000 cvtss2sd %xmm1,%xmm1
0000000000000004 cvtss2sd %xmm0,%xmm0
0000000000000008 mulsd %xmm1,%xmm0
000000000000000c pxor %xmm1,%xmm1
0000000000000010 ucomisd %xmm1,%xmm0
0000000000000014 jne 0x00000004
0000000000000016 jp 0x00000002
0000000000000018 jmp 0x00000008
000000000000001a addsd 0x00000006(%rip),%xmm0
0000000000000022 cvtsd2ss %xmm0,%xmm0
0000000000000026 ret
The "jne/jp/jmp" sequence can be reduced to this instead:
_floatingPointComparison:
0000000000000000 cvtss2sd %xmm1,%xmm1
0000000000000004 cvtss2sd %xmm0,%xmm0
0000000000000008 mulsd %xmm1,%xmm0
000000000000000c pxor %xmm1,%xmm1
0000000000000010 ucomisd %xmm1,%xmm0
0000000000000014 jp 0x00000002
0000000000000016 je 0x00000008
0000000000000018 addsd 0x00000006(%rip),%xmm0
0000000000000020 cvtsd2ss %xmm0,%xmm0
0000000000000024 ret
for a savings of 2 bytes.
This xform can happen when we recognize that jne and jp jump to the same "true"
MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch
is the fall-through MBB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97766 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG
reads AX in order to avoid reading AH with a REX instruction.
Fix PR6489.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97742 91177308-0d34-0410-b5e6-96231b3b80d8
register if it isn't possible to match the indexes *and* the base.
This fixes some fast isel rejects of load instructions on oggenc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97739 91177308-0d34-0410-b5e6-96231b3b80d8
'dsload' pattern. tblgen doesn't check patterns to see if they're
textually identical. This allows better factoring.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97630 91177308-0d34-0410-b5e6-96231b3b80d8
that they are not destination type specific. This allows
tblgen to factor them and the type check is redundant with
what the isel does anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97629 91177308-0d34-0410-b5e6-96231b3b80d8
CopyToReg/CopyFromReg/INLINEASM. These are annoying because
they have the same opcode before an after isel. Fix this by
setting their NodeID to -1 to indicate that they are selected,
just like what automatically happens when selecting things that
end up being machine nodes.
With that done, give IsLegalToFold a new flag that causes it to
ignore chains. This lets the HandleMergeInputChains routine be
the one place that validates chains after a match is successful,
enabling the new hotness in chain processing. This smarter
chain processing eliminates the need for "PreprocessRMW" in the
X86 and MSP430 backends and enables MSP to start matching it's
multiple mem operand instructions more aggressively.
I currently #if out the dead code in the X86 backend and MSP
backend, I'll remove it for real in a follow-on patch.
The testcase changes are:
test/CodeGen/X86/sse3.ll: we generate better code
test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was
miscompiling this before, we now generate correct code
Convert it to filecheck while I'm at it.
test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem
folding to make anton happy. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97596 91177308-0d34-0410-b5e6-96231b3b80d8
DoInstructionSelection. Inline "SelectRoot" into it from DAGISelHeader.
Sink some other stuff out of DAGISelHeader into SDISel.
Eliminate the various 'Indent' stuff from various targets, which dates
to when isel was recursive.
17 files changed, 114 insertions(+), 430 deletions(-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97555 91177308-0d34-0410-b5e6-96231b3b80d8
Extracting the low element of a vector is now done with EXTRACT_SUBREG,
and the zero-extension performed by load movss is now modeled with
SUBREG_TO_REG, and so on.
Register-to-register movss and movsd are no longer considered copies;
they are two-address instructions which insert a scalar into a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97354 91177308-0d34-0410-b5e6-96231b3b80d8
necessary to swap the operands to handle NaN and negative zero properly.
Also, reintroduce logic for checking for NaN conditions when forming
SSE min and max instructions, fixed to take into consideration NaNs and
negative zeros. This allows forming min and max instructions in more
cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97025 91177308-0d34-0410-b5e6-96231b3b80d8
place where an i32 imm was required, the old isel just got lucky.
This fixes CodeGen/X86/x86-64-and-mask.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96894 91177308-0d34-0410-b5e6-96231b3b80d8
don't alis it in the MMX .td file with a different width,
split into two X86ISD opcodes. This fixes an x86 testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96859 91177308-0d34-0410-b5e6-96231b3b80d8
during a tail call. A parameter might overwrite this stack slot during the tail
call.
The sequence during a tail call is:
1.) load return address to temp reg
2.) move parameters (might involve storing to return address stack slot)
3.) store return address to new location from temp reg
If the stack location is marked immutable CodeGen can colocate load (1) with the
store (3).
This fixes bug 6225.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96783 91177308-0d34-0410-b5e6-96231b3b80d8
SSE min and max instructions. The real thing this code needs to be
concerned about is negative zero.
Update the sse-minmax.ll test accordingly, and add tests for
-enable-unsafe-fp-math mode as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96775 91177308-0d34-0410-b5e6-96231b3b80d8
create an X86ISD::Cmp node with result type i64 on the
CodeGen/X86/shift-i256.ll testcase and the new isel was assert on it
downstream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96768 91177308-0d34-0410-b5e6-96231b3b80d8
it to follow the mode needed by the new isel. Instead of returning
the input and output chains, it just returns the (currently only one,
which is a silly limitation) node that has input and output chains.
Since we want the old thing to still work, add a new
SelectScalarSSELoad to emulate the old interface. The XXX suffix
and the wrapper will eventually go away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96715 91177308-0d34-0410-b5e6-96231b3b80d8
dragonegg self-host build. I reverted 96640 in order to revert
96556 (96640 goes on top of 96556), but it also looks like with
both of them applied the breakage happens even earlier. The
symptom of the 96556 miscompile is the following crash:
llvm[3]: Compiling AlphaISelLowering.cpp for Release build
cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed.
Stack dump:
0. Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE'
g++: Internal error: Aborted (program cc1plus)
This occurs when building LLVM using LLVM built by LLVM (via
dragonegg). Probably LLVM has miscompiled itself, though it
may have miscompiled GCC and/or dragonegg itself: at this point
of the self-host build, all of GCC, LLVM and dragonegg were built
using LLVM. Unfortunately this kind of thing is extremely hard
to debug, and while I did rummage around a bit I didn't find any
smoking guns, aka obviously miscompiled code.
Found by bisection.
r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines
Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"
r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines
Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.
e.g. On x86_64
%0 = icmp eq i32 %x, 0
%1 = icmp eq i32 %y, 0
%2 = xor i1 %1, %0
br i1 %2, label %bb, label %return
=>
testl %edi, %edi
sete %al
testl %esi, %esi
sete %cl
cmpb %al, %cl
je LBB1_2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96672 91177308-0d34-0410-b5e6-96231b3b80d8
It's not clear why this is really required, but it was explicitly
added in r48808 with no real explanation or rdar #.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96438 91177308-0d34-0410-b5e6-96231b3b80d8
into a roundss intrinsic, producing a cyclic dag. The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection. Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96408 91177308-0d34-0410-b5e6-96231b3b80d8
If there exists a use of a build_vector that's the bitwise complement of the mask,
then transform the node to
(and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)).
Since this transformation is only useful when 1) the given build_vector will
become a load from constpool, and 2) (and (xor x -1), y) matches to a single
instruction, I decided this is appropriate as a x86 specific transformation.
rdar://7323335
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96389 91177308-0d34-0410-b5e6-96231b3b80d8
non-temporal. Fix from r96241 for botched encoding of MOVNTDQ.
Add documentation for !nontemporal metadata.
Add a simpler movnt testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96386 91177308-0d34-0410-b5e6-96231b3b80d8
IsLegalToFold and IsProfitableToFold. The generic version of the later simply checks whether the folding candidate has a single use.
This allows the target isel routines more flexibility in deciding whether folding makes sense. The specific case we are interested in is folding constant pool loads with multiple uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96255 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise AT&T asm printer is used with non-compatible MCAsmInfo and
there is no way to override this behaviour.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96165 91177308-0d34-0410-b5e6-96231b3b80d8
created. This ensures it's updated at all time. It means targets which perform
dynamic stack alignment would know whether it is required and whether frame
pointer register cannot be made available register allocation.
This is a fix for rdar://7625239. Sorry, I can't create a reasonably sized test
case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96069 91177308-0d34-0410-b5e6-96231b3b80d8
We still have the templated X86 JIT emitter, *and* the
almost-copy in X86InstrInfo for getting instruction sizes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96059 91177308-0d34-0410-b5e6-96231b3b80d8
the tables to be const. Teach MCCodeEmitter to handle
the target-indep kinds so that we don't crash on them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95924 91177308-0d34-0410-b5e6-96231b3b80d8
r12b, etc) also encodes to a R/M value of 4, which is just
as illegal as ESP/RSP for the non-sib version an address.
This fixes x86-64 jit miscompilations of a bunch of programs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95866 91177308-0d34-0410-b5e6-96231b3b80d8
Stub out some dummy fixups to make things work.
We can now emit fixups like this:
subl $20, %esp ## encoding: [0x83,0xec,A]
## fixup A - offset: 2, value: 20, kind: fixup_1byte_imm
Emitting $20 as a single-byte fixup to be later resolved
by the assembler is ridiculous of course (vs just emitting
the byte) but this is a failure of the matcher, which
should be producing an imm of 20, not an MCExpr of 20.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95860 91177308-0d34-0410-b5e6-96231b3b80d8
lowering and requires that certain types exist in ValueTypes.h. Modified widening to
check if an op can trap and if so, the widening algorithm will apply only the op on
the defined elements. It is safer to do this in widening because the optimizer can't
guarantee removing unused ops in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95823 91177308-0d34-0410-b5e6-96231b3b80d8
Enhance the x86 backend to show the hex values of immediates in
comments when they are large. For example:
movl $1072693248, 4(%esp) ## imm = 0x3FF00000
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95728 91177308-0d34-0410-b5e6-96231b3b80d8
Move some utility TableGen defs, classes, etc. into a common file so
they may be used my multiple pattern files. We will use this for
the AVX specification to help with the transition from the current
SSE specification.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95727 91177308-0d34-0410-b5e6-96231b3b80d8
in X86-32 mode. This is still required in x86-64 mode to avoid
forming [disp+rip] encoding. Rewrite the SIB byte decision logic
to be actually understandable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95693 91177308-0d34-0410-b5e6-96231b3b80d8
into TargetOpcodes.h. #include the new TargetOpcodes.h
into MachineInstr. Add new inline accessors (like isPHI())
to MachineInstr, and start using them throughout the
codebase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95687 91177308-0d34-0410-b5e6-96231b3b80d8
for register tokens. Before, if it encountered
'%al,' it would report 'al,' as the token. Now it
correctly reports '%al'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95594 91177308-0d34-0410-b5e6-96231b3b80d8
only run for x86 with fastisel. I've found it being very effective in
eliminating some obvious dead code as result of formal parameter lowering
especially when tail call optimization eliminated the need for some of the loads
from fixed frame objects. It also shrinks a number of the tests. A couple of
tests no longer make sense and are now eliminated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95493 91177308-0d34-0410-b5e6-96231b3b80d8
Instruction selection for X86 now can choose an instruction
sequence that will fit any address of any symbol, no matter
the pointer width. X86-64 uses a mov+call-via-reg sequence
for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95323 91177308-0d34-0410-b5e6-96231b3b80d8
Lock prefix, Repeat string operation prefixes and the Segment override prefixes.
Also added versions of the move string and store string instructions without the
repeat prefixes to X86InstrInfo.td. And finally marked the rep versions of
move/store string records in X86InstrInfo.td as isCodeGenOnly = 1 so tblgen is
happy building the disassembler files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95252 91177308-0d34-0410-b5e6-96231b3b80d8
the end of the instruction instead of expecting the caller to
do it. This currently causes the asm-verbose instruction
comments to be on the next line.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95178 91177308-0d34-0410-b5e6-96231b3b80d8
than DEBUG_VALUE :( ) into the target indep AsmPrinter.cpp
file. This allows elimination of the
NO_ASM_WRITER_BOILERPLATE hack among other things.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95177 91177308-0d34-0410-b5e6-96231b3b80d8