other day that PPC custom lowering could create
a BUILD_PAIR of two f64 with a result type of...
f64! - already fixed). Fix a place that triggers
the sanity check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58378 91177308-0d34-0410-b5e6-96231b3b80d8
is morphed by AnalyzeNewNode into a previously
processed node, and different result values of
that node are remapped to values with different
nodes, then we could end up using wrong values
here [we were assuming that all results remap
to values with the same underlying node]. This
seems theoretically possible, but I don't have
a testcase. The meat of the patch is in the
changes to AnalyzeNewNode/AnalyzeNewValue and
ReplaceNodeWith. While there, I changed names
like RemapNode to RemapValue, since it really
remaps values. To tell the truth, I would be
much happier if we were only remapping nodes
(it would simplify a bunch of logic, and allow
for some cute speedups) but I haven't yet worked
out how to do that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58372 91177308-0d34-0410-b5e6-96231b3b80d8
- One functionality change, '\\' in a name is now printed as a hex
escape instead of "\\\\". This is consistent with other users of
PrintEscapedString.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58343 91177308-0d34-0410-b5e6-96231b3b80d8
Since the ARM constant pool handling supercedes the standard LLVM constant
pool entirely, the JIT emitter does not allocate space for the constants,
nor initialize the memory. The constant pool is considered part of the
instruction stream.
Likewise, when resolving relocations into the constant pool, a hook into
the target back end is used to resolve from the constant ID# to the
address where the constant is stored.
For now, the support in the ARM emitter is limited to 32-bit integer. Future
patches will expand this to the full range of constants necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58338 91177308-0d34-0410-b5e6-96231b3b80d8
ppcf128 to i32 conversion and expand it into a code
sequence like in LegalizeDAG. This needs custom
ppc lowering of FP_ROUND_INREG, so turn that on and
make it work with LegalizeTypes. Probably PPC should
simply custom lower the original conversion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58329 91177308-0d34-0410-b5e6-96231b3b80d8
id could end up being wrong mostly because of
forgetting to remap new nodes that morphed into
processed nodes through CSE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58323 91177308-0d34-0410-b5e6-96231b3b80d8
a memset using 16-byte XMM stores, but where the stack realignment code
didn't work. Until it does (PR2962) disable use of xmm regs in memcpy
and memset formation for linux and other targets with insufficiently
aligned stacks.
This is part of PR2888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58317 91177308-0d34-0410-b5e6-96231b3b80d8
flag. Then in a debugger developers can set breakpoints at these calls
to see waht is about to be selected and what the resulting subgraph
looks like. This really helps when debugging instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58278 91177308-0d34-0410-b5e6-96231b3b80d8
can give it the same stack slot as the spilled interval if it is folded.
This prevents the fold/unfold code from pointing to the wrong register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58255 91177308-0d34-0410-b5e6-96231b3b80d8
(and a bunch of other node types). While there, I
added a doNotCSE predicate and used it to reduce code
duplication (some of the duplicated code was wrong...).
This fixes ARM/cse-libcalls.ll when using LegalizeTypes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58249 91177308-0d34-0410-b5e6-96231b3b80d8
worklist twice: UpdateNodeOperands could morph
a new node into a node already on the worklist.
We would then recalculate the NodeId for this
existing node and add it to the worklist. The
testcase is ARM/cse-libcalls.ll, the problem
showing up once UpdateNodeOperands is taught to
do CSE for calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58246 91177308-0d34-0410-b5e6-96231b3b80d8
LargeBlockInfo, we can now dramatically simplify their implementation
and speed them up at the same time. Now the code has time proportional
to the number of uses of the alloca, not the size of the block.
This also eliminates code that tried to batch up different allocas which
are used in the same blocks, and eliminates the 'retry list' logic which
was baroque and no unneccesary. In addition to being a speedup for crazy
cases, this is also a nice cleanup:
PromoteMemoryToRegister.cpp | 270 +++++++++++++++-----------------------------
1 file changed, 96 insertions(+), 174 deletions(-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58229 91177308-0d34-0410-b5e6-96231b3b80d8
a trivial dense map. Use this in RewriteSingleStoreAlloca to
avoid aggressively rescanning blocks over and over again. This
fixes PR2925, speeding up mem2reg on the testcase in that bug
from 4.56s to 0.02s in a debug build on my machine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58227 91177308-0d34-0410-b5e6-96231b3b80d8
target-independent code to target-specific code. This prevents it
from running on targets that aren't using fast-isel.
In addition to saving compile time, this addresses the problem
that not all targets are prepared for it. In order to use this
pass, all instructions must declare all their fixed uses and
defs of physical registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58144 91177308-0d34-0410-b5e6-96231b3b80d8
variable is moved to the execution engine. The JIT calls the TargetJITInfo
to allocate thread local storage. Currently, only linux/x86 knows how to
allocate thread local global variables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58142 91177308-0d34-0410-b5e6-96231b3b80d8
be saved/restored in the prolog/epilog. We need
to do this iff something in the function stores
into it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58116 91177308-0d34-0410-b5e6-96231b3b80d8
LHS is a foldable load, then LHS and RHS are swapped
and SetCCOpcode is changed to SETUGT. But the later
code is expecting operands to be the wrong way round
for SETUGT, but they are not in this case, resulting
in an inverted compare. The solution is to move the
load normalization before the correction for SETUGT.
This bug was tickled by LegalizeTypes which happened
to legalize the testcase slightly differently to
LegalizeDAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58092 91177308-0d34-0410-b5e6-96231b3b80d8
Prevents DeadMachineInstructionElim from thinking
things like MTCTR are dead (fixes massive
testsuite breakage at -O0).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58043 91177308-0d34-0410-b5e6-96231b3b80d8
LoopPass*.
- Although less precise, this means they can be used in clients
without RTTI (who would otherwise need to include LoopPass.h, which
eventually includes things using dynamic_cast). This was the
simplest solution that presented itself, but I am happy to use a
better one if available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@58010 91177308-0d34-0410-b5e6-96231b3b80d8
assume that i64 has been turned into a BUILD_PAIR
node (when called from LegalizeTypes this hasn't
happened yet) and don't use a vector shuffle mask
with an illegal element type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57972 91177308-0d34-0410-b5e6-96231b3b80d8
may return i8, which can result in SELECT nodes for
which the type of the condition is i8, but there are
no patterns for select with i8 condition. Tweak the
LegalizeTypes logic to avoid this as much as possible.
This isn't a real fix because it is still perfectly
possible to end up with such select nodes - CellSPU
needs to be fixed IMHO.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57968 91177308-0d34-0410-b5e6-96231b3b80d8
that is not of type MVT::i1 in SELECT and SETCC nodes.
Relax the LegalizeTypes SELECT condition promotion
sanity checks to allow other condition types than i1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57966 91177308-0d34-0410-b5e6-96231b3b80d8
to have a different type to the vector element
type. This should be fairly harmless because in
the past guys like this were being built all over
the place (and were cleaned up when I added this
check). The reason for relaxing this check is
that it helps LegalizeTypes legalize vector
shuffles: the mask is a BUILD_VECTOR that it is
*not always possible* to legalize while keeping it
a BUILD_VECTOR (vector_shuffle requires the mask
to be a BUILD_VECTOR, as opposed to a vector with
the right vector type). With this check it is even
harder to legalize the mask - turning the check off
means that LegalizeTypes manages to legalize almost
all vector shuffles encountered in practice. The
correct solution is to change vector_shuffle to be a
variadic node with the mask built into it as operands.
While waiting for that change, this hack stops the
problem with vector_shuffle from blocking the turning
on of LegalizeTypes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57965 91177308-0d34-0410-b5e6-96231b3b80d8
The same one Apple gcc uses, faster. Also gets the
extreme case in gcc.c-torture/execute/ieee/rbug.c
correct which we weren't before; this is not
sufficient to get the test to pass though, there
is another bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57926 91177308-0d34-0410-b5e6-96231b3b80d8
in the 32-bit signed offset field of addresses. Even though this
may be intended, some linkers refuse to relocate code where the
relocated address computation overflows.
Also, fix the sign-extension of constant offsets to use the
actual pointer size, rather than the size of the GlobalAddress
node, which may be different, for example on x86-64 where MVT::i32
is used when the address is being fit into the 32-bit displacement
field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57885 91177308-0d34-0410-b5e6-96231b3b80d8
Where previously LLVM might emit code like this:
ucomisd %xmm1, %xmm0
setne %al
setp %cl
orb %al, %cl
jne .LBB4_2
it now emits this:
ucomisd %xmm1, %xmm0
jne .LBB4_2
jp .LBB4_2
It has fewer instructions and uses fewer registers, but it does
have more branches. And in the case that this code is followed by
a non-fallthrough edge, it may be followed by a jmp instruction,
resulting in three branch instructions in sequence. Some effort
is made to avoid this situation.
To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
FCMP_UNE in lowered form, and replace them with code that emits
two branches, except in the case where it would require converting
a fall-through edge to an explicit branch.
Also, X86InstrInfo.cpp's branch analysis and transform code now
knows now to handle blocks with multiple conditional branches. It
uses loops instead of having fixed checks for up to two
instructions. It can now analyze and transform code generated
from FCMP_OEQ and FCMP_UNE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57873 91177308-0d34-0410-b5e6-96231b3b80d8
the copy instruction from the instruction list before asking the
target to create the new instruction. This gets the old instruction
out of the way so that it doesn't interfere with the target's
rematerialization code. In the case of x86, this helps it find
more cases where EFLAGS is not live.
Also, in the X86InstrInfo.cpp, teach isSafeToClobberEFLAGS to check
to see if it reached the end of the block after scanning each
instruction, instead of just before. This lets it notice when the
end of the block is only two instructions away, without doing any
additional scanning.
These changes allow rematerialization to clobber EFLAGS in more
cases, for example using xor instead of mov to set the return value
to zero in the included testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57872 91177308-0d34-0410-b5e6-96231b3b80d8