Hook up the TableGen lowering for simple pseudo instructions for ARM and
use it for a subset of the many pseudos the backend has as proof of concept.
More conversions to come.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134705 91177308-0d34-0410-b5e6-96231b3b80d8
We have to do this in DAGBuilder instead of DAGCombiner, because the exact bit is lost after building.
struct foo { char x[24]; };
long bar(struct foo *a, struct foo *b) { return a-b; }
is now compiled into
movl 4(%esp), %eax
subl 8(%esp), %eax
sarl $3, %eax
imull $-1431655765, %eax, %eax
instead of
movl 4(%esp), %eax
subl 8(%esp), %eax
movl $715827883, %ecx
imull %ecx
movl %edx, %eax
shrl $31, %eax
sarl $2, %edx
addl %eax, %edx
movl %edx, %eax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134695 91177308-0d34-0410-b5e6-96231b3b80d8
- Each target asm parser now creates its own MCSubtatgetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
to generate asm matcher subtarget feature queries. e.g.
"ModeThumb,FeatureThumb2" is translated to
"(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134678 91177308-0d34-0410-b5e6-96231b3b80d8
numbers should be printed instead of symbolic register names in
MCAsmStreamer::EmitRegisterName. This is necessary because some versions of
GNU assembler won't accept code in which symbolic register names are used in
cfi directives. There is no change in behavior unless the flag is explicitly
set to true by a backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134635 91177308-0d34-0410-b5e6-96231b3b80d8
before the offset. This change will enable simplification of function
MipsRegisterInfo::eliminateFrameIndex.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134625 91177308-0d34-0410-b5e6-96231b3b80d8
DBG_VALUE 3.310000e+02, 0, !"ds"; dbg:sse.stepfft.c:138:18 @[ sse.stepfft.c:32:10 ]
DBG_VALUE 3.310000e+02, 0, !"ds"; dbg:sse.stepfft.c:138:18 @[ sse.stepfft.c:31:10 ]
These two MIs represent identical value, 3.31..., for one variable, ds, but they are not identical because the represent two separate instances of inlined variable "ds".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134620 91177308-0d34-0410-b5e6-96231b3b80d8
hasPredecessorHelper function allows predecessors to be cached to speed up
repeated invocations. This fixes PR10186.
X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X)
Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with
empty Visited and Worklist sets (i.e. no caching over invocations).
Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited
and Worklist to speed up repeated calls. The Visited set is searched for X
before going to the worklist to further search the DAG if necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134592 91177308-0d34-0410-b5e6-96231b3b80d8
Unfortunately, the testcase I have is large and confidential, so I don't have a test to commit at the moment; I'll see if I can come up with something smaller where this issue reproduces.
<rdar://problem/9716278>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134565 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to remove the (bogus and unneeded) encoding information from
the pseudo-instruction class definitions. All of the pseudos that haven't
been converted yet and still need encoding information instance from the normal
instruction classes and explicitly set isCodeGenOnly, and so are distinct
from this change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134540 91177308-0d34-0410-b5e6-96231b3b80d8
Pseudo-instructions don't have encoding information, as they're lowered
to real instructions by the time we're doing binary encoding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134533 91177308-0d34-0410-b5e6-96231b3b80d8
The promotion code lost any alignment information, when hoisting loads and
stores out of the loop. This lead to incorrect aligned memory accesses. We now
use the largest alignment we can prove to be correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134520 91177308-0d34-0410-b5e6-96231b3b80d8
This is impossible in theory, I can prove it. In practice, our near-zero
threshold can cause the network to oscillate between equally good
solutions.
<rdar://problem/9720596>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134428 91177308-0d34-0410-b5e6-96231b3b80d8
If the function allocates reserved stack space for callee argument frames,
estimateStackSize() needs to account for that, as it doesn't show up as
ordinary frame objects. Otherwise, a callee with a large argument list will
throw off the calculations for whether to allocate an emergency spill slot
and we get assert() failures in the register scavenger.
rdar://9715469
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134415 91177308-0d34-0410-b5e6-96231b3b80d8
Remat during spilling triggers dead code elimination. If a phi-def
becomes unused, that may also cause live ranges to split into separate
connected components.
This type of splitting is different from normal live range splitting. In
particular, there may not be a common original interval.
When the split range is its own original, make sure that the new
siblings are also their own originals. The range being split cannot be
used as an original since it doesn't cover the new siblings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134413 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes the issue noted in PR10251 where early tail dup of bbs with
indirectbr would cause a bb to be duplicated into a loop preheader
and then into its predecessors, creating phi nodes with identical
operands just before register allocation.
This helps with jsinterp.o size (__TEXT goes from 163568 to 126656)
and a bit with performance 1.005x faster on sunspider (jits still enabled).
The result on webkit with the jit disabled is more significant: 1.021x faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134372 91177308-0d34-0410-b5e6-96231b3b80d8
A split point inserted in a block with a landing pad successor may be
hoisted above the call to ensure that it dominates all successors. The
code that handles the rest of the basic block must take this into
account.
I am not including a test case, it would be very fragile. PR10244 comes
from building clang with exceptions enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134369 91177308-0d34-0410-b5e6-96231b3b80d8
Add a MI->emitError() method that the backend can use to report errors
related to inline assembly. Call it from X86FloatingPoint.cpp when the
constraints are wrong.
This enables proper clang diagnostics from the backend:
$ clang -c pr30848.c
pr30848.c:5:12: error: Inline asm output regs must be last on the x87 stack
__asm__ ("" : "=u" (d)); /* { dg-error "output regs" } */
^
1 error generated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134307 91177308-0d34-0410-b5e6-96231b3b80d8
Every live range is assigned a cascade number the first time it is
involved in an eviction. As the evictor, it gets a new cascade number.
Every evictee is assigned the same cascade number as the evictor.
Eviction is prohibited if the evictor has a lower assigned cascade
number than the evictee.
This means that assigned cascade numbers are monotonically increasing
with every eviction, yet they are bounded by NextCascade which can only
be incremented by new live ranges. Thus, infinite loops cannot happen,
but eviction cascades can still be triggered by new live ranges as we
want.
Thanks to Andy for explaining this to me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134303 91177308-0d34-0410-b5e6-96231b3b80d8
outside the loop and reducible.
This more completely hides them from LSR, which isn't usually able to
do anything meaningful with non-affine expressions anyway, and this
consequently hides them from SCEVExpander, which is acutely unprepared
for non-affine expressions.
Replace test/CodeGen/X86/lsr-nonaffine.ll with a new test that tests
the new behavior.
This works around the bug in PR10117 / rdar://problem/9633149, and is
generally an improvement besides.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134268 91177308-0d34-0410-b5e6-96231b3b80d8
The DSP instructions in the Thumb2 instruction set are an optional extension
in the Cortex-M* archtitecture. When present, the implementation is considered
an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation."
Add a subtarget feature hook for the v7e-m instructions and hook it up. The
cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is
a v7e-m implementation.
rdar://9572992
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134261 91177308-0d34-0410-b5e6-96231b3b80d8
itineraries.
- Refactor TargetSubtarget to be based on MCSubtargetInfo.
- Change tablegen generated subtarget info to initialize MCSubtargetInfo
and hide more details from targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134257 91177308-0d34-0410-b5e6-96231b3b80d8
t2MOVCC[ri] are just t2MOV[ri] instructions, so properly pseudo-ize them.
The Thumb1 versions, tMOVCC[ri] were only present for use by the size-
reduction pass, so they're no longer necessary at all and can be deleted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134242 91177308-0d34-0410-b5e6-96231b3b80d8
copy is a kill") to see if it fixes the i386 dragonegg buildbot, which is timing out
because gcc built with dragonegg is going into an infinite loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134237 91177308-0d34-0410-b5e6-96231b3b80d8
The constraints are represented by the register class of the original
virtual register created for the inline asm. If the register class were
included in the operand descriptor, we might be able to do this.
For now, just give up on regclass inflation when inline asm is involved.
No test case, this bug hasn't happened yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134226 91177308-0d34-0410-b5e6-96231b3b80d8
We would put the return value from long double functions in the wrong
register.
This fixes gcc.c-torture/execute/conversion.c
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134205 91177308-0d34-0410-b5e6-96231b3b80d8
Merge the tMOVr, tMOVgpr2tgpr, tMOVtgpr2gpr, and tMOVgpr2gpr instructions
into tMOVr. There's no need to keep them separate. Giving the tMOVr
instruction the proper GPR register class for its operands is sufficient
to give the register allocator enough information to do the right thing
directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134204 91177308-0d34-0410-b5e6-96231b3b80d8
Fix a FIXME and allow predication (in Thumb2) for the T1 register to
register MOV instructions. This allows some better codegen with
if-conversion (as seen in the test updates), plus it lays the groundwork
for pseudo-izing the tMOVCC instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134197 91177308-0d34-0410-b5e6-96231b3b80d8
It's just a call to a special helper function. Get rid of the T2 variant
entirely, as it's identical to the Thumb1 version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134178 91177308-0d34-0410-b5e6-96231b3b80d8
It's just a t2LDMIA_UPD instruction with extra codegen properties, so it
doesn't need the encoding information. As a side-benefit, we now correctly
recognize for instruction printing as a 'pop' instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134173 91177308-0d34-0410-b5e6-96231b3b80d8
It's just a tPOP instruction with additional code-gen properties, so it
doesn't need encoding information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134172 91177308-0d34-0410-b5e6-96231b3b80d8
tADDrSPi is not predicable, so we can't size-reduce a t2ADDri to it if the
predicate is anything other than "always."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134130 91177308-0d34-0410-b5e6-96231b3b80d8
be the first encoded as the first feature. It then uses the CPU name to look up
features / scheduling itineray even though clients know full well the CPU name
being used to query these properties.
The fix is to just have the clients explictly pass the CPU name!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134127 91177308-0d34-0410-b5e6-96231b3b80d8
This patch will sometimes choose live range split points next to
interference instead of always splitting next to a register point. That
means spill code can now appear almost anywhere, and it was necessary
to fix code that didn't expect that.
The difficult places were:
- Between a CALL returning a value on the x87 stack and the
corresponding FpPOP_RETVAL (was FpGET_ST0). Probably also near x87
inline assembly, but that didn't actually show up in testing.
- Between a CALL popping arguments off the stack and the corresponding
ADJCALLSTACKUP.
Both are fixed now. The only place spill code can't appear is after
terminators, see SplitAnalysis::getLastSplitPoint.
Original commit message:
Rewrite RAGreedy::splitAroundRegion, now with cool ASCII art.
This function has to deal with a lot of special cases, and the old
version got it wrong sometimes. In particular, it would sometimes leave
multiple uses in the stack interval in a single block. That causes bad
code with multiple reloads in the same basic block.
The new version handles block entry and exit in a single pass. It first
eliminates all the easy cases, and then goes on to create a local
interval for the blocks with difficult interference. Previously, we
would only create the local interval for completely isolated blocks.
It can happen that the stack interval becomes completely empty because
we could allocate a register in all edge bundles, and the new local
intervals deal with the interference. The empty stack interval is
harmless, but we need to remove a SplitKit assertion that checks for
empty intervals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134125 91177308-0d34-0410-b5e6-96231b3b80d8
Unlike Thumb1, Thumb2 does not have dedicated encodings for adjusting the
stack pointer. It can just use the normal add-register-immediate encoding
since it can use all registers as a source, not just R0-R7. The extra
instruction definitions are just duplicates of the normal instructions with
the (not well enforced) constraint that the source register was SP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134114 91177308-0d34-0410-b5e6-96231b3b80d8
Some x86-32 calls pop values off the stack, and we need to readjust the
stack pointer after the call. This happens when ADJCALLSTACKUP is
eliminated.
It could happen that spill code was inserted between the CALL and
ADJCALLSTACKUP instructions, and we would compute wrong stack pointer
offsets for those frame index references.
Fix this by inserting the stack pointer adjustment immediately after the
call instead of where the ADJCALLSTACKUP instruction was erased.
I don't have a test case since we don't currently insert code in that
position. We will soon, though. I am testing a regalloc patch that
didn't work on Linux because of this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134113 91177308-0d34-0410-b5e6-96231b3b80d8
already makes the assumption, which is correct on ARM, that a type's alignment is
less than its alloc size. This improves codegen with Clang (which inserts a lot of
extraneous alignment specifiers) and fixes <rdar://problem/9695089>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134106 91177308-0d34-0410-b5e6-96231b3b80d8
The tSpill and tRestore instructions are just copies of the tSTRspi and
tLDRspi instructions, respectively. Just use those directly instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134092 91177308-0d34-0410-b5e6-96231b3b80d8
For example, ".byte 256" would previously assert() when emitting an object
file. Now it generates a diagnostic that the literal value is out of range.
rdar://9686950
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134069 91177308-0d34-0410-b5e6-96231b3b80d8