MachineBasicBlock::canFallThrough(). We're interested in the state of the
instruction (i.e., is this a barrier or not?), not if the instruction is
predicable or not.
rdar://10501092
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149070 91177308-0d34-0410-b5e6-96231b3b80d8
The live range of the source register may be extended when a redundant
copy is eliminated. Make sure any kill flags between the two copies are
cleared.
This fixes PR11765.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149069 91177308-0d34-0410-b5e6-96231b3b80d8
This boils down to using MachineOperand::readsReg() more.
This fixes PR11829 where a use ended up after the first def when
lowering REG_SEQUENCE instructions involving IMPLICIT_DEFs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148996 91177308-0d34-0410-b5e6-96231b3b80d8
A REG_SEQUENCE instruction is lowered into a sequence of partial defs:
%vreg7:ssub_0<def,undef> = COPY %vreg20:ssub_0
%vreg7:ssub_1<def> = COPY %vreg2
%vreg7:ssub_2<def> = COPY %vreg2
%vreg7:ssub_3<def> = COPY %vreg2
The first def needs an <undef> flag to indicate it is the beginning of
the live range, while the other defs are read-modify-write. Previously,
we depended on LiveIntervalAnalysis to notice and fix the missing
<def,undef>, but that solution was never robust, it was causing problems
with ProcessImplicitDefs and the lowering of chained REG_SEQUENCE
instructions.
This fixes PR11841.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148879 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds an new option --arm-enable-ehabi-descriptors that
enables emitting unwinding descriptors. This provides a mode with a
working backtrace() without the (currently broken) exception support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148800 91177308-0d34-0410-b5e6-96231b3b80d8
violation -- MC cannot depend on CodeGen.
Specifically, the MCTargetDesc component of each target is actually
a subcomponent of the MC library. As such, it cannot depend on the
target-independent code generator, because MC itself cannot depend on
the target-independent code generator. This change moved a flag from the
ARM MCTargetDesc file ARMMCAsmInfo.cpp to the CodeGen layer in
ARMException.cpp, leaving behind an 'extern' to refer back to it. That
layering order isn't viable givin the constraints outlined above.
Commandline flags are designed to be static specifically to avoid these
types of bugs.
Fixing this is likely going to require some non-trivial refactoring.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148759 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds an new value to the --arm-enable-ehabi option that
disables emitting unwinding descriptors. This mode gives a working
backtrace() without the (currently broken) exception support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148686 91177308-0d34-0410-b5e6-96231b3b80d8
We have patterns for vector sext and zext operations but were missing
anyext. Without those patterns, codegen will fail when the selection DAG
has any_extend nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148568 91177308-0d34-0410-b5e6-96231b3b80d8
overly conservative. It was concerned about cases where it would prohibit
folding simple [r, c] addressing modes. e.g.
ldr r0, [r2]
ldr r1, [r2, #4]
=>
ldr r0, [r2], #4
ldr r1, [r2]
Change the logic to look for such cases which allows it to form indexed memory
ops more aggressively.
rdar://10674430
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148086 91177308-0d34-0410-b5e6-96231b3b80d8
Allow LDRD to be formed from pairs with different LDR encodings. This was the original intention of the pass. Somewhere along the way, the LDR opcodes were refined which broke the optimization. We really don't care what the original opcodes are as long as they both map to the same LDRD and the immediate still fits.
Fixes rdar://10435045 ARMLoadStoreOptimization cannot handle mixed LDRi8/LDRi12
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147922 91177308-0d34-0410-b5e6-96231b3b80d8
define physical registers. It's currently very restrictive, only catching
cases where the CE is in an immediate (and only) predecessor. But it catches
a surprising large number of cases.
rdar://10660865
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147827 91177308-0d34-0410-b5e6-96231b3b80d8
This enables basic local CSE, giving us 20% smaller code for
consumer-typeset in -O0 builds.
<rdar://problem/10658692>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147720 91177308-0d34-0410-b5e6-96231b3b80d8
opportunities that only present themselves after late optimizations
such as tail duplication .e.g.
## BB#1:
movl %eax, %ecx
movl %ecx, %eax
ret
The register allocator also leaves some of them around (due to false
dep between copies from phi-elimination, etc.)
This required some changes in codegen passes. Post-ra scheduler and the
pseudo-instruction expansion passes have been moved after branch folding
and tail merging. They were before branch folding before because it did
not always update block livein's. That's fixed now. The pass change makes
independently since we want to properly schedule instructions after
branch folding / tail duplication.
rdar://10428165
rdar://10640363
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147716 91177308-0d34-0410-b5e6-96231b3b80d8
This eliminates a lot of constant pool entries for -O0 builds of code
with many global variable accesses.
This speeds up -O0 codegen of consumer-typeset by 2x because the
constant island pass no longer has to look at thousands of constant pool
entries.
<rdar://problem/10629774>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147712 91177308-0d34-0410-b5e6-96231b3b80d8
Now that canRealignStack() understands frozen reserved registers, it is
safe to use it for aligned spill instructions.
It will only return true if the registers reserved at the beginning of
register allocation allow for dynamic stack realignment.
<rdar://problem/10625436>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147579 91177308-0d34-0410-b5e6-96231b3b80d8
This patch caused a miscompilation of oggenc because a frame pointer was
suddenly needed halfway through register allocation.
<rdar://problem/10625436>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147487 91177308-0d34-0410-b5e6-96231b3b80d8
Use the spill slot alignment as well as the local variable alignment to
determine when the stack needs to be realigned. This works now that the
ARM target can always realign the stack by using a base pointer.
Still respect the ARMBaseRegisterInfo::canRealignStack() function
vetoing a realigned stack. Don't use aligned spill code in that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146997 91177308-0d34-0410-b5e6-96231b3b80d8
We used to rely on the *eh_sjlj_setjmp instructions to mark that a function
with setjmp/longjmp exception handling clobbers all the registers. But with
the recent reorganization of ARM EH, those eh_sjlj_setjmp instructions are
expanded away earlier, before PEI can see them to determine what registers to
save and restore. Mark the dispatchsetup instruction in the same way, since
that instruction cannot be expanded early. This also more accurately reflects
when the registers are clobbered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146949 91177308-0d34-0410-b5e6-96231b3b80d8
On ARM, peephole optimization for ABS creates a trivial cfg triangle which tempts machine sink to sink instructions in code which is really straight line code. Sometimes this sinking may alter register allocator input such that use and def of a reg is divided by a branch in between, which may result in extra spills. Now mahine sink avoids sinking if final sink destination is post dominator.
Radar 10266272.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146604 91177308-0d34-0410-b5e6-96231b3b80d8
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
prevent IT blocks from being broken apart.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146542 91177308-0d34-0410-b5e6-96231b3b80d8
These modifiers simply select either the low or high D subregister of a Neon
Q register. I've also removed the unimplemented 'p' modifier, which turns out
to be a bit different than the comment here suggests and as far as I can tell
was only intended for internal use in Apple's version of gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146417 91177308-0d34-0410-b5e6-96231b3b80d8
I followed three heuristics for deciding whether to set 'true' or
'false':
- Everything target independent got 'true' as that is the expected
common output of the GCC builtins.
- If the target arch only has one way of implementing this operation,
set the flag in the way that exercises the most of codegen. For most
architectures this is also the likely path from a GCC builtin, with
'true' being set. It will (eventually) require lowering away that
difference, and then lowering to the architecture's operation.
- Otherwise, set the flag differently dependending on which target
operation should be tested.
Let me know if anyone has any issue with this pattern or would like
specific tests of another form. This should allow the x86 codegen to
just iteratively improve as I teach the backend how to differentiate
between the two forms, and everything else should remain exactly the
same.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146370 91177308-0d34-0410-b5e6-96231b3b80d8