1. Allocate them in the entry block of the function to enable function-wide
re-use. The instructions to create them should be re-materializable, so
there shouldn't be additional cost compared to creating them local
to the basic blocks where they are used.
2. Collect all of the frame index references for the function and sort them
by the local offset referenced. Iterate over the sorted list to
allocate the virtual base registers. This enables creation of base
registers optimized for positive-offset access of frame references.
(Note: This may be appropriate to later be a target hook to do the
sorting in a target appropriate manner. For now it's done here for
simplicity.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112609 91177308-0d34-0410-b5e6-96231b3b80d8
any more. I plan to reimplement alloca promotion using SSAUpdater later.
It looks like Bill's URoR logic really always needs domtree, so the pass
now always asks for domtree info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112597 91177308-0d34-0410-b5e6-96231b3b80d8
Eventually, we want to disable physreg coalescing completely, and let the
register allocator do its job using hints.
This option makes it possible to measure the impact of disabling physreg
coalescing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112567 91177308-0d34-0410-b5e6-96231b3b80d8
1) nuke ConstDataCoalSection, which is dead.
2) revise my previous patch for rdar://8018335,
which was completely wrong. Specifically, it doesn't
make sense to mark __TEXT,__const_coal as PURE_INSTRUCTIONS,
because it is for readonly data. templates (it turns out)
go to const_coal_nt. The real fix for rdar://8018335 was
to give ConstTextCoalSection a section kind of ReadOnly
instead of Text.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112496 91177308-0d34-0410-b5e6-96231b3b80d8
instead of PromoteMemToReg. This allows it to stop using DF and DT,
eliminating a computation of DT and DF from clang -O3. Clang is now
down to 2 runs of DomFrontier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112457 91177308-0d34-0410-b5e6-96231b3b80d8
virtual base registers handle this function, and more. A bit more cleanup
to do on the interface to eliminateFrameIndex() after this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112237 91177308-0d34-0410-b5e6-96231b3b80d8
expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This
affects two places in the code: handling cross block values and handling
function return and arguments. Since vectors are already widened by
legalizetypes, this gives us much better code and unblocks x86-64 abi
and SPU abi work.
For example, this (which is a silly example of a cross-block value):
define <4 x float> @test2(<4 x float> %A) nounwind {
%B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
%C = fadd <2 x float> %B, %B
br label %BB
BB:
%D = fadd <2 x float> %C, %C
%E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
ret <4 x float> %E
}
Now compiles into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
addps %xmm0, %xmm0
ret
previously it compiled into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
pshufd $1, %xmm0, %xmm1
## kill: XMM0<def> XMM0<kill> XMM0<def>
insertps $0, %xmm0, %xmm0
insertps $16, %xmm1, %xmm0
addps %xmm0, %xmm0
ret
This implements rdar://8230384
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112101 91177308-0d34-0410-b5e6-96231b3b80d8
For now it's still a command line option, but the interface to the generic
code doesn't need to know that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111942 91177308-0d34-0410-b5e6-96231b3b80d8
relative offsets when there are offsets encoded in the instructions and
simplifies final allocation in PEI. rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111836 91177308-0d34-0410-b5e6-96231b3b80d8
hierarchy with virtual methods and using llvm_unreachable to properly indicate
unreachable states which would otherwise leave variables uninitialized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111803 91177308-0d34-0410-b5e6-96231b3b80d8
It's similar to "linker_private_weak", but it's known that the address of the
object is not taken. For instance, functions that had an inline definition, but
the compiler decided not to inline it. Note, unlike linker_private and
linker_private_weak, linker_private_weak_def_auto may have only default
visibility. The symbols are removed by the linker from the final linked image
(executable or dynamic library).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111684 91177308-0d34-0410-b5e6-96231b3b80d8
it involves specific floating-point types, legalize should expand an
extending load to a non-extending load followed by a separate extend operation.
For example, we currently expand SEXTLOAD to EXTLOAD+SIGN_EXTEND_INREG (and
assert that EXTLOAD should always be supported). Now we can expand that to
LOAD+SIGN_EXTEND. This is needed to allow vector SIGN_EXTEND and ZERO_EXTEND
to be used for NEON.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111586 91177308-0d34-0410-b5e6-96231b3b80d8
The Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01
implements parts of C++0x based on the draft standard. An old version of
the draft had a bug that makes std::pair<T1*, T2*>(something, 0) fail to
compile. This is because the template<class U, class V> pair(U&& x, V&& y)
constructor is selected, even though it later fails to implicitly convert
U and V to frist_type and second_type.
This has been fixed in n3090, but it seems that Microsoft is not going to
update msvc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111535 91177308-0d34-0410-b5e6-96231b3b80d8
base registers were required. This will allow for slightly better packing
of the locals when alignment padding is necessary after callee saved registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111508 91177308-0d34-0410-b5e6-96231b3b80d8
frame index reference to an object in the local block is seen, check if
it's near enough to any previously allocaated base register to re-use.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111443 91177308-0d34-0410-b5e6-96231b3b80d8
We must complete the DFS, otherwise we might miss needed phi-defs, and
prematurely color live ranges with a non-dominating value.
This is not a big deal since we get to color more of the CFG and the next
mapValue call will be faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111397 91177308-0d34-0410-b5e6-96231b3b80d8
LiveIntervalMap maps values from a parent LiveInterval to a child interval that
is a strict subset. It will create phi-def values as needed to preserve the
VNInfo SSA form in the child interval.
This leads to an algorithm very similar to the one in SSAUpdaterImpl.h, but with
enough differences that the code can't be reused:
- We don't need to manipulate PHI instructions.
- LiveIntervals have kills.
- We have MachineDominatorTree.
- We can use df_iterator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111393 91177308-0d34-0410-b5e6-96231b3b80d8
Nothing fancy, just ask the target if any currently available base reg
is in range for the instruction under consideration and use the first one
that is. Placeholder ARM implementation simply returns false for now.
ongoing saga of rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111374 91177308-0d34-0410-b5e6-96231b3b80d8
the local block. Resolve references to those indices to a new base register.
For simplification and testing purposes, a new virtual base register is
allocated for each frame index being resolved. The result is truly horrible,
but correct, code that's good for exercising the new code paths.
Next up is adding thumb1 support, which should be very simple. Following that
will be adding base register re-use and implementing a reasonable ARM
heuristic for when a virtual base register should be generated at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111315 91177308-0d34-0410-b5e6-96231b3b80d8
whether to allocate a virtual frame base register to resolve the frame
index reference in it. Implement a simple version for ARM to aid debugging.
In LocalStackSlotAllocation, scan the function for frame index references
to local frame indices and ask the target whether to allocate virtual
frame base registers for any it encounters. Purely infrastructural for
debug output. Next step is to actually allocate base registers, then add
intelligent re-use of them.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111262 91177308-0d34-0410-b5e6-96231b3b80d8
mapping. Have the local block track its alignment requirement, and then
apply that when the block itself is allocated. Previously, offsets could
get adjusted in PEI to be different, relative to one another, than the
block allocation thought they would be, which defeats the point of doing
the allocation this way. Continuing rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111197 91177308-0d34-0410-b5e6-96231b3b80d8
experimental pass that allocates locals relative to one another before
register allocation and then assigns them to actual stack slots as a block
later in PEI. This will eventually allow targets with limited index offset
range to allocate additional base registers (not just FP and SP) to
more efficiently reference locals, as well as handle situations where
locals cannot be referenced via SP or FP at all (dynamic stack realignment
together with variable sized objects, for example). It's currently
incomplete and almost certainly buggy. Work in progress.
Disabled by default and gated via the -enable-local-stack-alloc command
line option.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111059 91177308-0d34-0410-b5e6-96231b3b80d8
The earliestStart argument is entirely specific to linear scan allocation, and
can be easily calculated by RegAllocLinearScan.
Replace std::vector with SmallVector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111055 91177308-0d34-0410-b5e6-96231b3b80d8
When a live range is contained a single block, we can split it around
instruction clusters. The current approach is very primitive, splitting before
and after the largest gap between uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111043 91177308-0d34-0410-b5e6-96231b3b80d8
numbers match. The old check could accidentally leave holes in openli.
Also let useIntv add all ranges for the phi-def value inserted by
enterIntvAtEnd. This works as long at the value mapping is established in
enterIntvAtEnd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110995 91177308-0d34-0410-b5e6-96231b3b80d8
This can happen if the original interval has been broken into two disconnected
parts. Ideally, we should be able to detect when the graph is disconnected and
create separate intervals, but that code is not implemented yet.
Example:
Two basic blocks are both branching to a loop header. Our interval is defined in
both basic blocks, and live into the loop along both edges.
We decide to split the interval around the loop. The interval is split into an
inside part and an outside part. The outside part now has two disconnected
segments, one in each basic block.
If we later decide to split the outside interval into single blocks, we get one
interval per basic block and an empty dupli for the remainder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110976 91177308-0d34-0410-b5e6-96231b3b80d8
split intervals. THis means the analysis can be used for multiple splits as long
as curli doesn't shrink.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110975 91177308-0d34-0410-b5e6-96231b3b80d8
Before spilling a live range, we split it into a separate range for each basic
block where it is used. That way we only get one reload per basic block if the
new smaller ranges can allocate to a register.
This type of splitting is already present in the standard spiller.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110934 91177308-0d34-0410-b5e6-96231b3b80d8
operands. We don't currently have a hook to provide "the largest super class of
A where all registers' getSubReg(subidx) is valid and in B".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110730 91177308-0d34-0410-b5e6-96231b3b80d8
The live interval may be used for a spill slot as well, and that spill slot
could be shared by split registers. We cannot shrink it, even if we know the
current register won't need the spill slot in that range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110721 91177308-0d34-0410-b5e6-96231b3b80d8
When splitting a live range, the new registers have fewer uses and the
permissible register class may be less constrained. Recompute the register class
constraint from the uses of new registers created for a split. This may let them
be allocated from a larger set, possibly avoiding a spill.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110703 91177308-0d34-0410-b5e6-96231b3b80d8
register at a time. This turns out to be slightly faster than iterating over
instructions, but more importantly, it allows us to compute spill weights for
new registers created after the spill weight pass has run.
Also compute the allocation hint at the same time as the spill weight. This
allows us to use the spill weight as a cost metric for copies, and choose the
most profitable hint if there is more than one possibility.
The new hints provide a very small (< 0.1%) but universal code size improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110631 91177308-0d34-0410-b5e6-96231b3b80d8
pass. This pass should expand with all of the small, fine-grained optimization
passes to reduce compile time and increase happiment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110627 91177308-0d34-0410-b5e6-96231b3b80d8
If we are emitting COPY instructions for the REG_SEQUENCE, make sure the kill
flag goes on the last COPY. Otherwise we may be using a killed register.
<rdar://problem/8287792>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110589 91177308-0d34-0410-b5e6-96231b3b80d8
relatively expensive comparison analyzer on each instruction. Also rename the
comparison analyzer method to something more in line with what it actually does.
This pass is will eventually be folded into the Machine CSE pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110539 91177308-0d34-0410-b5e6-96231b3b80d8
necessary.
Sometimes, live range splitting doesn't shrink the current interval, but simply
changes some instructions to use a new interval. That makes the original more
suitable for spilling. In this case, we don't need to duplicate the original.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110481 91177308-0d34-0410-b5e6-96231b3b80d8
After heavy editing of a live interval, it is much easier to simply renumber the
live values instead of trying to keep track of the unused ones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110463 91177308-0d34-0410-b5e6-96231b3b80d8
When a physical register is in use, some alias of that register has a live
interval with a relevant live range. That is the sad state of intervals after
physreg coalescing of subregs, and it is good enough for correct register
allocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110452 91177308-0d34-0410-b5e6-96231b3b80d8
This pass tries to remove comparison instructions when possible. For instance,
if you have this code:
sub r1, 1
cmp r1, 0
bz L1
and "sub" either sets the same flag as the "cmp" instruction or could be
converted to set the same flag, then we can eliminate the "cmp" instruction all
together. This is a important for ARM where the ALU instructions could set the
CPSR flag, but need a special suffix ('s') to do so.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110423 91177308-0d34-0410-b5e6-96231b3b80d8
When a joined COPY changes subreg liveness, we keep it around as a KILL,
otherwise it is safe to delete.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110403 91177308-0d34-0410-b5e6-96231b3b80d8
LiveVariables becomes horribly wrong while the coalescer is running, but the
analysis is not zapped until after the coalescer pass has run. This causes tons
of false reports when calling verify form the coalescer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110402 91177308-0d34-0410-b5e6-96231b3b80d8
We verify that the LiveInterval is live at uses and defs, and that all
instructions have a SlotIndex.
Stuff we don't check yet:
- Is the LiveInterval minimal?
- Do all defs correspond to instructions or phis?
- Do all defs dominate all their live ranges?
- Are all live ranges continually reachable from their def?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110386 91177308-0d34-0410-b5e6-96231b3b80d8
be killed before being redefined.
These checks are usually disabled, and usually fail when enabled. We de facto
allow live registers to be redefined without a kill, the corresponding
assertions in RegScavenger were removed long ago.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110362 91177308-0d34-0410-b5e6-96231b3b80d8
We are now at a point where we can split around simple single-entry, single-exit
loops, although still with some bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110257 91177308-0d34-0410-b5e6-96231b3b80d8
When the normalizeSpillWeights function was introduced, I forgot to remove this
normalization.
This change could affect register allocation. Hopefully for the better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110119 91177308-0d34-0410-b5e6-96231b3b80d8
check the range of the constant when optimizing a comparison between a
constant and a sign_extend_inreg node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109854 91177308-0d34-0410-b5e6-96231b3b80d8
multiple defs, like t2LDRSB_POST.
The first def could accidentally steal the physreg that the second, tied def was
required to be allocated to.
Now, the tied use-def is treated more like an early clobber, and the physreg is
reserved before allocating the other defs.
This would never be a problem when the tied def was the only def which is the
usual case.
This fixes MallocBench/gs for thumb2 -O0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109715 91177308-0d34-0410-b5e6-96231b3b80d8
protectors, to be near the stack protectors on the stack. Accomplish this by
tagging the stack object with a predicate that indicates that it would trigger
this. In the prolog-epilog inserter, assign these objects to the stack after the
stack protector but before the other objects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109481 91177308-0d34-0410-b5e6-96231b3b80d8
instead of fixed size arrays, so that increasing FirstVirtualRegister to 16K
won't cause a compile time performance regression.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109330 91177308-0d34-0410-b5e6-96231b3b80d8
appropriate for targets without detailed instruction iterineries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.
On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109300 91177308-0d34-0410-b5e6-96231b3b80d8
to be of a different register class. For example, in Thumb1 if the live-in is
a high register, we want the vreg to be a low register. rdar://8224931
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109291 91177308-0d34-0410-b5e6-96231b3b80d8
it's too late to start backing off aggressive latency scheduling when most
of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
For ARM, this is almost always a win on # of instructions. It's runtime
neutral for most of the tests. But for some kernels with high register
pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
54 and sped up by 20%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109279 91177308-0d34-0410-b5e6-96231b3b80d8