Registers are not allocated strictly in spill weight order when live range
splitting and spilling has created new shorter intervals with higher spill
weights.
When one of the new heavy intervals conflicts with a single lighter interval,
simply evict the old interval instead of trying to split the heavy one.
The lighter interval is a better candidate for splitting, it has a smaller use
density.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125151 91177308-0d34-0410-b5e6-96231b3b80d8
The last split point can be anywhere in the block, so it interferes with the
strictly monotonic requirements of advanceTo().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125132 91177308-0d34-0410-b5e6-96231b3b80d8
This is a lot easier than trying to get kill flags right during live range
splitting and rematerialization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125113 91177308-0d34-0410-b5e6-96231b3b80d8
anything but the simplest of cases, lower a 256-bit BUILD_VECTOR by
splitting it into 128-bit parts and recombining.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125105 91177308-0d34-0410-b5e6-96231b3b80d8
If a live range is used by a terminator instruction, and that live range needs
to leave the block on the stack or in a different register, it can be necessary
to have both sides of the split live at the terminator instruction.
Example:
%vreg2 = COPY %vreg1
JMP %vreg1
Becomes after spilling %vreg2:
SPILL %vreg1
JMP %vreg1
The spill doesn't kill the register as is normally the case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125102 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid using the same register for two def operands or and earlyclobber
def and use operand. This fixes PR8986 and improves on the prior fix
for rdar://problem/8959122.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125089 91177308-0d34-0410-b5e6-96231b3b80d8
t2LDRpci with t2LDRi12.
There are a couple of problems with this.
1. The encoding for the literal and immediate constant are different.
Note bit 7 of the literal case is 'U' so it can be negative.
2. t2LDRi12 is now narrowed to tLDRpci before constant island pass is run.
So we end up never using the Thumb2 instruction, which ends up creating a
lot more constant islands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125074 91177308-0d34-0410-b5e6-96231b3b80d8
the active loop. This is generally desirable, and it avoids trouble
in situations such as the testcase in PR9123, though the failure
mode depends on use-list order, so it is infeasible to test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125065 91177308-0d34-0410-b5e6-96231b3b80d8
After uses of a live range are removed, recompute the live range to only cover
the remaining uses. This is necessary after rematerializing the value before
some (but not all) uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125058 91177308-0d34-0410-b5e6-96231b3b80d8
parsing of operands introduced in r125030. As a small note, besides using a more
generic approach we can also have more descriptive output when debugging
llvm-mc, example:
mcr p7, #1, r5, c1, c1, #4
note: parsed instruction:
['mcr', <ARMCC::al>,
<coprocessor number: 7>,
1,
<register 73>,
<coprocessor register: 1>,
<coprocessor register: 1>,
4]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125052 91177308-0d34-0410-b5e6-96231b3b80d8
The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using
post-increment versions, but all the rest of the NEON load/store instructions
should be handled now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125014 91177308-0d34-0410-b5e6-96231b3b80d8
These operations are expanded to pairs of loads or stores, and the first one
uses the address register update to produce the address for the second one.
So far, the second load/store has also updated the address register, just
for convenience, since that output has never been used. In anticipation of
actually supporting post-increment updates for these operations, this changes
the non-updating operations to use a non-updating load/store for the second
instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125013 91177308-0d34-0410-b5e6-96231b3b80d8
failures with relocations.
The code committed is a first cut at compatibility for emitted relocations in
ELF .o.
Why do this? because existing ARM tools like emitting relocs symbols as
explicit relocations, not as section-offset relocs.
Result is that with these changes,
1) relocs are now substantially identical what to gcc outputs.
2) larger apps (including many spec2k tests) compile, cross-link, and pass
Added reminder fixme to tests for future conversion to .s form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124996 91177308-0d34-0410-b5e6-96231b3b80d8
Unified EmitTextAttribute for both Asm and Obj emission (.cpu only)
Added necessary cortex-A8 related attrs for codegen compat tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124995 91177308-0d34-0410-b5e6-96231b3b80d8
config.h.* have conditions whether each symbol is defined or not.
Autoconf and CMake may check symbols in libgcc.a for JIT on Mingw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124950 91177308-0d34-0410-b5e6-96231b3b80d8
<rdar://problem/8959122> illegal register operands for UMULL instruction in cfrac nightly test
I'm stil working on a unit test, but the case is:
rx = movcc rx, r3
r2 = ldr
r2, r3 = umull r2, r2
The anti-dep breaker should not convert this into an illegal instruction:
r2, r2 = umull
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124932 91177308-0d34-0410-b5e6-96231b3b80d8
a) Making it a per call site bonus for functions that we can move from
indirect to direct calls.
b) Reduces the bonus from 500 to 100 per call site.
c) Subtracts the size of the possible newly inlineable call from the
bonus to only add a bonus if we can inline a small function to devirtualize
it.
Also changes the bonus from a positive that's subtracted to a negative
that's added.
Fixes the remainder of rdar://8546196 by reducing the object file size
after inlining by 84%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124916 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to easily support 256-bit operations that don't have
native 256-bit support. This applies to integer operations, certain
types of shuffles and various othher things.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124910 91177308-0d34-0410-b5e6-96231b3b80d8
If interference reaches the last split point, it is effectively live out and
should be marked as 'MustSpill'.
This can make a difference when the terminator uses a register. There is no way
that register can be reused in the outgoing CFG bundle, even if it isn't live
out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124900 91177308-0d34-0410-b5e6-96231b3b80d8
(yes, this is different from R_ARM_CALL)
- Adds a new method getARMBranchTargetOpValue() which handles the
necessary distinction between the conditional and unconditional br/bl
needed for ARM/ELF
At least for ARM mode, the needed fixup for conditional versus unconditional
br/bl is identical, but the ARM docs and existing ARM tools expect this
reloc type...
Added a few FIXME's for future naming fixups in ARMInstrInfo.td
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124895 91177308-0d34-0410-b5e6-96231b3b80d8
A live range cannot be split everywhere in a basic block. A split must go before
the first terminator, and if the variable is live into a landing pad, the split
must happen before the call that can throw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124894 91177308-0d34-0410-b5e6-96231b3b80d8
We should not be attempting a region split if it won't lead to at least one
directly allocatable interval. That could cause infinite splitting loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124893 91177308-0d34-0410-b5e6-96231b3b80d8
infrastructure. This makes lowering 256-bit vectors to 128-bit
vectors simple when 256-bit vector support is not available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124868 91177308-0d34-0410-b5e6-96231b3b80d8
precisely track pressure on a selection DAG, but we can at least keep
it balanced. This design accounts for various interesting aspects of
selection DAGS: register and subregister copies, glued nodes, dead
nodes, unused registers, etc.
Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter.
Note: I disabled PrescheduleNodesWithMultipleUses when register
pressure is enabled, based on no evidence other than I don't think it
makes sense to have both enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124853 91177308-0d34-0410-b5e6-96231b3b80d8
When the live range is live through a block that doesn't use the register, but
that has interference, region splitting wants to split at the top and bottom of
the basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124839 91177308-0d34-0410-b5e6-96231b3b80d8
Allow a live range to end with a kill flag, but don't allow a kill flag that
doesn't end the live range.
This makes the machine code verifier more useful during register allocation when
kill flag computation is deferred.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124838 91177308-0d34-0410-b5e6-96231b3b80d8
If the found value is not live-through the block, we should only add liveness up
to the requested slot index. When the value is live-through, the whole block
should be colored.
Bug found by SSA verification in the machine code verifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124812 91177308-0d34-0410-b5e6-96231b3b80d8
These end points come from the inserted copies, and can be passed directly to
useIntv. This simplifies the coloring code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124799 91177308-0d34-0410-b5e6-96231b3b80d8
matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines
to examine and translate index values. VINSERTF128 comes next. With
these two in place we can begin supporting more AVX operations as
INSERT/EXTRACT can be used as a fallback when 256-bit support is not
available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124797 91177308-0d34-0410-b5e6-96231b3b80d8
auto-simplifier). This has a big impact on Ada code, but not much else.
Unfortunately the impact is mostly negative! This is due to PR9004 (aka
SCCP failing to resolve conditional branch conditions in the destination
blocks of the branch), in which simple correlated expressions are not
resolved but complicated ones are, so simplifying has a bad effect!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124788 91177308-0d34-0410-b5e6-96231b3b80d8
Reversing the operands allows us to fold, but doesn't force us to. Also, at
this point the DAG is still being optimized, so the check for hasOneUse is not
very precise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124773 91177308-0d34-0410-b5e6-96231b3b80d8
The greedy register allocator revealed some problems with the value mapping in
SplitKit. We would sometimes start mapping values before all defs were known,
and that could change a value from a simple 1-1 mapping to a multi-def mapping
that requires ssa update.
The new approach collects all defs and register assignments first without
filling in any live intervals. Only when finish() is called, do we compute
liveness and mapped values. At this time we know with certainty which values map
to multiple values in a split range.
This also has the advantage that we can compute live ranges based on the
remaining uses after rematerializing at split points.
The current implementation has many opportunities for compile time optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124765 91177308-0d34-0410-b5e6-96231b3b80d8
overflow (nsw flag), which was disabled because it breaks 254.gap. I have
informed the GAP authors of the mistake in their code, and arranged for the
testsuite to use -fwrapv when compiling this benchmark.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124746 91177308-0d34-0410-b5e6-96231b3b80d8
This makes the job of the later optzn passes easier, allowing the vast amount of
icmp transforms to chew on it.
We transform 840 switches in gcc.c, leading to a 16k byte shrink of the resulting
binary on i386-linux.
The testcase from README.txt now compiles into
decl %edi
cmpl $3, %edi
sbbl %eax, %eax
andl $1, %eax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124724 91177308-0d34-0410-b5e6-96231b3b80d8