in the 32-bit signed offset field of addresses. Even though this
may be intended, some linkers refuse to relocate code where the
relocated address computation overflows.
Also, fix the sign-extension of constant offsets to use the
actual pointer size, rather than the size of the GlobalAddress
node, which may be different, for example on x86-64 where MVT::i32
is used when the address is being fit into the 32-bit displacement
field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57885 91177308-0d34-0410-b5e6-96231b3b80d8
Where previously LLVM might emit code like this:
ucomisd %xmm1, %xmm0
setne %al
setp %cl
orb %al, %cl
jne .LBB4_2
it now emits this:
ucomisd %xmm1, %xmm0
jne .LBB4_2
jp .LBB4_2
It has fewer instructions and uses fewer registers, but it does
have more branches. And in the case that this code is followed by
a non-fallthrough edge, it may be followed by a jmp instruction,
resulting in three branch instructions in sequence. Some effort
is made to avoid this situation.
To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
FCMP_UNE in lowered form, and replace them with code that emits
two branches, except in the case where it would require converting
a fall-through edge to an explicit branch.
Also, X86InstrInfo.cpp's branch analysis and transform code now
knows now to handle blocks with multiple conditional branches. It
uses loops instead of having fixed checks for up to two
instructions. It can now analyze and transform code generated
from FCMP_OEQ and FCMP_UNE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57873 91177308-0d34-0410-b5e6-96231b3b80d8
the copy instruction from the instruction list before asking the
target to create the new instruction. This gets the old instruction
out of the way so that it doesn't interfere with the target's
rematerialization code. In the case of x86, this helps it find
more cases where EFLAGS is not live.
Also, in the X86InstrInfo.cpp, teach isSafeToClobberEFLAGS to check
to see if it reached the end of the block after scanning each
instruction, instead of just before. This lets it notice when the
end of the block is only two instructions away, without doing any
additional scanning.
These changes allow rematerialization to clobber EFLAGS in more
cases, for example using xor instead of mov to set the return value
to zero in the included testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57872 91177308-0d34-0410-b5e6-96231b3b80d8
for strange asm conditions earlier. In this case, we have a
double being passed in an integer reg class. Convert to like
sized integer register so that we allocate the right number
for the class (two i32's for the f64 in this case).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57862 91177308-0d34-0410-b5e6-96231b3b80d8
the previous patch this one actually passes make check.
"Fix PR2356 on PowerPC: if we have an input and output that are tied together
that have different sizes (e.g. i32 and i64) make sure to reserve registers for
the bigger operand."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57771 91177308-0d34-0410-b5e6-96231b3b80d8
and add a TargetLowering hook for it to use to determine when this
is legal (i.e. not in PIC mode, etc.)
This allows instruction selection to emit folded constant offsets
in more cases, such as the included testcase, eliminating the need
for explicit arithmetic instructions.
This eliminates the need for the C++ code in X86ISelDAGToDAG.cpp
that attempted to achieve the same effect, but wasn't as effective.
Also, fix handling of offsets in GlobalAddressSDNodes in several
places, including changing GlobalAddressSDNode's offset from
int to int64_t.
The Mips, Alpha, Sparc, and CellSPU targets appear to be
unaware of GlobalAddress offsets currently, so set the hook to
false on those targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57748 91177308-0d34-0410-b5e6-96231b3b80d8
in 32-bit mode instead of assigning a register pair. This has nothing to
do with PR2356, but I happened to notice it while working on it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57704 91177308-0d34-0410-b5e6-96231b3b80d8
use a SUB instruction instead of an ADD, because -128 can be
encoded in an 8-bit signed immediate field, while +128 can't be.
This avoids the need for a 32-bit immediate field in this case.
A similar optimization applies to 64-bit adds with 0x80000000,
with the 32-bit signed immediate field.
To support this, teach tablegen how to handle 64-bit constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57663 91177308-0d34-0410-b5e6-96231b3b80d8
shift counts, and patterns that match dynamic shift counts
when the subtract is obscured by a truncate node.
Add DAGCombiner support for recognizing rotate patterns
when the shift counts are defined by truncate nodes.
Fix and simplify the code for commuting shld and shrd
instructions to work even when the given instruction doesn't
have a parent, and when the caller needs a new instruction.
These changes allow LLVM to use the shld, shrd, rol, and ror
instructions on x86 to replace equivalent code using two
shifts and an or in many more cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57662 91177308-0d34-0410-b5e6-96231b3b80d8
i.e. conditions that cannot be checked with a single instruction. For example,
SETONE and SETUEQ on x86.
- Teach legalizer to implement *illegal* setcc as a and / or of a number of
legal setcc nodes. For now, only implement FP conditions. e.g. SETONE is
implemented as SETO & SETNE, SETUEQ is SETUO | SETEQ.
- Move x86 target over.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57542 91177308-0d34-0410-b5e6-96231b3b80d8
create a new DAG node to represent the new shift to keep the
DAG consistent, even though it'll almost always be folded into
the address.
If a user of the resulting address has multiple uses, the
nodes may get revisited by a later MatchAddress call, in which
case DAG inconsistencies do matter.
This fixes PR2849.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57465 91177308-0d34-0410-b5e6-96231b3b80d8
parameters instead of raw Constants. This prevents the constants from
being selected by the isel pass, fixing PR2735.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57385 91177308-0d34-0410-b5e6-96231b3b80d8
instead.
So now: -fast-isel or -fast-isel=true enable fast-isel, and
-fast-isel=false disables it. Fast-isel is also on by default
with -fast, and off by default otherwise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57270 91177308-0d34-0410-b5e6-96231b3b80d8
are Inexact. (These are not Inexact as defined
by IEEE754, but that seems like a reasonable way
to abstract what happens: information is lost.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57218 91177308-0d34-0410-b5e6-96231b3b80d8
was setting kill flags on tied uses in two-address instructions.
The kill flags were causing the allocator to think it could
allocate the use and its tied def in different registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57039 91177308-0d34-0410-b5e6-96231b3b80d8