expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This
affects two places in the code: handling cross block values and handling
function return and arguments. Since vectors are already widened by
legalizetypes, this gives us much better code and unblocks x86-64 abi
and SPU abi work.
For example, this (which is a silly example of a cross-block value):
define <4 x float> @test2(<4 x float> %A) nounwind {
%B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
%C = fadd <2 x float> %B, %B
br label %BB
BB:
%D = fadd <2 x float> %C, %C
%E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
ret <4 x float> %E
}
Now compiles into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
addps %xmm0, %xmm0
ret
previously it compiled into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
pshufd $1, %xmm0, %xmm1
## kill: XMM0<def> XMM0<kill> XMM0<def>
insertps $0, %xmm0, %xmm0
insertps $16, %xmm1, %xmm0
addps %xmm0, %xmm0
ret
This implements rdar://8230384
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112101 91177308-0d34-0410-b5e6-96231b3b80d8
For now it's still a command line option, but the interface to the generic
code doesn't need to know that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111942 91177308-0d34-0410-b5e6-96231b3b80d8
relative offsets when there are offsets encoded in the instructions and
simplifies final allocation in PEI. rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111836 91177308-0d34-0410-b5e6-96231b3b80d8
hierarchy with virtual methods and using llvm_unreachable to properly indicate
unreachable states which would otherwise leave variables uninitialized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111803 91177308-0d34-0410-b5e6-96231b3b80d8
It's similar to "linker_private_weak", but it's known that the address of the
object is not taken. For instance, functions that had an inline definition, but
the compiler decided not to inline it. Note, unlike linker_private and
linker_private_weak, linker_private_weak_def_auto may have only default
visibility. The symbols are removed by the linker from the final linked image
(executable or dynamic library).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111684 91177308-0d34-0410-b5e6-96231b3b80d8
it involves specific floating-point types, legalize should expand an
extending load to a non-extending load followed by a separate extend operation.
For example, we currently expand SEXTLOAD to EXTLOAD+SIGN_EXTEND_INREG (and
assert that EXTLOAD should always be supported). Now we can expand that to
LOAD+SIGN_EXTEND. This is needed to allow vector SIGN_EXTEND and ZERO_EXTEND
to be used for NEON.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111586 91177308-0d34-0410-b5e6-96231b3b80d8
The Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01
implements parts of C++0x based on the draft standard. An old version of
the draft had a bug that makes std::pair<T1*, T2*>(something, 0) fail to
compile. This is because the template<class U, class V> pair(U&& x, V&& y)
constructor is selected, even though it later fails to implicitly convert
U and V to frist_type and second_type.
This has been fixed in n3090, but it seems that Microsoft is not going to
update msvc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111535 91177308-0d34-0410-b5e6-96231b3b80d8
base registers were required. This will allow for slightly better packing
of the locals when alignment padding is necessary after callee saved registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111508 91177308-0d34-0410-b5e6-96231b3b80d8
frame index reference to an object in the local block is seen, check if
it's near enough to any previously allocaated base register to re-use.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111443 91177308-0d34-0410-b5e6-96231b3b80d8
We must complete the DFS, otherwise we might miss needed phi-defs, and
prematurely color live ranges with a non-dominating value.
This is not a big deal since we get to color more of the CFG and the next
mapValue call will be faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111397 91177308-0d34-0410-b5e6-96231b3b80d8
LiveIntervalMap maps values from a parent LiveInterval to a child interval that
is a strict subset. It will create phi-def values as needed to preserve the
VNInfo SSA form in the child interval.
This leads to an algorithm very similar to the one in SSAUpdaterImpl.h, but with
enough differences that the code can't be reused:
- We don't need to manipulate PHI instructions.
- LiveIntervals have kills.
- We have MachineDominatorTree.
- We can use df_iterator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111393 91177308-0d34-0410-b5e6-96231b3b80d8
Nothing fancy, just ask the target if any currently available base reg
is in range for the instruction under consideration and use the first one
that is. Placeholder ARM implementation simply returns false for now.
ongoing saga of rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111374 91177308-0d34-0410-b5e6-96231b3b80d8
the local block. Resolve references to those indices to a new base register.
For simplification and testing purposes, a new virtual base register is
allocated for each frame index being resolved. The result is truly horrible,
but correct, code that's good for exercising the new code paths.
Next up is adding thumb1 support, which should be very simple. Following that
will be adding base register re-use and implementing a reasonable ARM
heuristic for when a virtual base register should be generated at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111315 91177308-0d34-0410-b5e6-96231b3b80d8
whether to allocate a virtual frame base register to resolve the frame
index reference in it. Implement a simple version for ARM to aid debugging.
In LocalStackSlotAllocation, scan the function for frame index references
to local frame indices and ask the target whether to allocate virtual
frame base registers for any it encounters. Purely infrastructural for
debug output. Next step is to actually allocate base registers, then add
intelligent re-use of them.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111262 91177308-0d34-0410-b5e6-96231b3b80d8
mapping. Have the local block track its alignment requirement, and then
apply that when the block itself is allocated. Previously, offsets could
get adjusted in PEI to be different, relative to one another, than the
block allocation thought they would be, which defeats the point of doing
the allocation this way. Continuing rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111197 91177308-0d34-0410-b5e6-96231b3b80d8
experimental pass that allocates locals relative to one another before
register allocation and then assigns them to actual stack slots as a block
later in PEI. This will eventually allow targets with limited index offset
range to allocate additional base registers (not just FP and SP) to
more efficiently reference locals, as well as handle situations where
locals cannot be referenced via SP or FP at all (dynamic stack realignment
together with variable sized objects, for example). It's currently
incomplete and almost certainly buggy. Work in progress.
Disabled by default and gated via the -enable-local-stack-alloc command
line option.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111059 91177308-0d34-0410-b5e6-96231b3b80d8
The earliestStart argument is entirely specific to linear scan allocation, and
can be easily calculated by RegAllocLinearScan.
Replace std::vector with SmallVector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111055 91177308-0d34-0410-b5e6-96231b3b80d8
When a live range is contained a single block, we can split it around
instruction clusters. The current approach is very primitive, splitting before
and after the largest gap between uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111043 91177308-0d34-0410-b5e6-96231b3b80d8
numbers match. The old check could accidentally leave holes in openli.
Also let useIntv add all ranges for the phi-def value inserted by
enterIntvAtEnd. This works as long at the value mapping is established in
enterIntvAtEnd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110995 91177308-0d34-0410-b5e6-96231b3b80d8
This can happen if the original interval has been broken into two disconnected
parts. Ideally, we should be able to detect when the graph is disconnected and
create separate intervals, but that code is not implemented yet.
Example:
Two basic blocks are both branching to a loop header. Our interval is defined in
both basic blocks, and live into the loop along both edges.
We decide to split the interval around the loop. The interval is split into an
inside part and an outside part. The outside part now has two disconnected
segments, one in each basic block.
If we later decide to split the outside interval into single blocks, we get one
interval per basic block and an empty dupli for the remainder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110976 91177308-0d34-0410-b5e6-96231b3b80d8
split intervals. THis means the analysis can be used for multiple splits as long
as curli doesn't shrink.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110975 91177308-0d34-0410-b5e6-96231b3b80d8
Before spilling a live range, we split it into a separate range for each basic
block where it is used. That way we only get one reload per basic block if the
new smaller ranges can allocate to a register.
This type of splitting is already present in the standard spiller.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110934 91177308-0d34-0410-b5e6-96231b3b80d8
operands. We don't currently have a hook to provide "the largest super class of
A where all registers' getSubReg(subidx) is valid and in B".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110730 91177308-0d34-0410-b5e6-96231b3b80d8
The live interval may be used for a spill slot as well, and that spill slot
could be shared by split registers. We cannot shrink it, even if we know the
current register won't need the spill slot in that range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110721 91177308-0d34-0410-b5e6-96231b3b80d8
When splitting a live range, the new registers have fewer uses and the
permissible register class may be less constrained. Recompute the register class
constraint from the uses of new registers created for a split. This may let them
be allocated from a larger set, possibly avoiding a spill.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110703 91177308-0d34-0410-b5e6-96231b3b80d8
register at a time. This turns out to be slightly faster than iterating over
instructions, but more importantly, it allows us to compute spill weights for
new registers created after the spill weight pass has run.
Also compute the allocation hint at the same time as the spill weight. This
allows us to use the spill weight as a cost metric for copies, and choose the
most profitable hint if there is more than one possibility.
The new hints provide a very small (< 0.1%) but universal code size improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110631 91177308-0d34-0410-b5e6-96231b3b80d8
pass. This pass should expand with all of the small, fine-grained optimization
passes to reduce compile time and increase happiment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110627 91177308-0d34-0410-b5e6-96231b3b80d8
If we are emitting COPY instructions for the REG_SEQUENCE, make sure the kill
flag goes on the last COPY. Otherwise we may be using a killed register.
<rdar://problem/8287792>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110589 91177308-0d34-0410-b5e6-96231b3b80d8
relatively expensive comparison analyzer on each instruction. Also rename the
comparison analyzer method to something more in line with what it actually does.
This pass is will eventually be folded into the Machine CSE pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110539 91177308-0d34-0410-b5e6-96231b3b80d8
necessary.
Sometimes, live range splitting doesn't shrink the current interval, but simply
changes some instructions to use a new interval. That makes the original more
suitable for spilling. In this case, we don't need to duplicate the original.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110481 91177308-0d34-0410-b5e6-96231b3b80d8
After heavy editing of a live interval, it is much easier to simply renumber the
live values instead of trying to keep track of the unused ones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110463 91177308-0d34-0410-b5e6-96231b3b80d8
When a physical register is in use, some alias of that register has a live
interval with a relevant live range. That is the sad state of intervals after
physreg coalescing of subregs, and it is good enough for correct register
allocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110452 91177308-0d34-0410-b5e6-96231b3b80d8
This pass tries to remove comparison instructions when possible. For instance,
if you have this code:
sub r1, 1
cmp r1, 0
bz L1
and "sub" either sets the same flag as the "cmp" instruction or could be
converted to set the same flag, then we can eliminate the "cmp" instruction all
together. This is a important for ARM where the ALU instructions could set the
CPSR flag, but need a special suffix ('s') to do so.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110423 91177308-0d34-0410-b5e6-96231b3b80d8
When a joined COPY changes subreg liveness, we keep it around as a KILL,
otherwise it is safe to delete.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110403 91177308-0d34-0410-b5e6-96231b3b80d8
LiveVariables becomes horribly wrong while the coalescer is running, but the
analysis is not zapped until after the coalescer pass has run. This causes tons
of false reports when calling verify form the coalescer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110402 91177308-0d34-0410-b5e6-96231b3b80d8
We verify that the LiveInterval is live at uses and defs, and that all
instructions have a SlotIndex.
Stuff we don't check yet:
- Is the LiveInterval minimal?
- Do all defs correspond to instructions or phis?
- Do all defs dominate all their live ranges?
- Are all live ranges continually reachable from their def?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110386 91177308-0d34-0410-b5e6-96231b3b80d8
be killed before being redefined.
These checks are usually disabled, and usually fail when enabled. We de facto
allow live registers to be redefined without a kill, the corresponding
assertions in RegScavenger were removed long ago.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110362 91177308-0d34-0410-b5e6-96231b3b80d8
We are now at a point where we can split around simple single-entry, single-exit
loops, although still with some bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110257 91177308-0d34-0410-b5e6-96231b3b80d8
When the normalizeSpillWeights function was introduced, I forgot to remove this
normalization.
This change could affect register allocation. Hopefully for the better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110119 91177308-0d34-0410-b5e6-96231b3b80d8
check the range of the constant when optimizing a comparison between a
constant and a sign_extend_inreg node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109854 91177308-0d34-0410-b5e6-96231b3b80d8
multiple defs, like t2LDRSB_POST.
The first def could accidentally steal the physreg that the second, tied def was
required to be allocated to.
Now, the tied use-def is treated more like an early clobber, and the physreg is
reserved before allocating the other defs.
This would never be a problem when the tied def was the only def which is the
usual case.
This fixes MallocBench/gs for thumb2 -O0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109715 91177308-0d34-0410-b5e6-96231b3b80d8
protectors, to be near the stack protectors on the stack. Accomplish this by
tagging the stack object with a predicate that indicates that it would trigger
this. In the prolog-epilog inserter, assign these objects to the stack after the
stack protector but before the other objects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109481 91177308-0d34-0410-b5e6-96231b3b80d8
instead of fixed size arrays, so that increasing FirstVirtualRegister to 16K
won't cause a compile time performance regression.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109330 91177308-0d34-0410-b5e6-96231b3b80d8
appropriate for targets without detailed instruction iterineries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.
On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109300 91177308-0d34-0410-b5e6-96231b3b80d8
to be of a different register class. For example, in Thumb1 if the live-in is
a high register, we want the vreg to be a low register. rdar://8224931
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109291 91177308-0d34-0410-b5e6-96231b3b80d8
it's too late to start backing off aggressive latency scheduling when most
of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
For ARM, this is almost always a win on # of instructions. It's runtime
neutral for most of the tests. But for some kernels with high register
pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
54 and sped up by 20%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109279 91177308-0d34-0410-b5e6-96231b3b80d8
Make MDNode::destroy private.
Fix the one thing that used MDNode::destroy, outside of MDNode itself.
One should never delete or destroy an MDNode explicitly. MDNodes
implicitly go away when there are no references to them (implementation
details aside).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109028 91177308-0d34-0410-b5e6-96231b3b80d8
The spillers can pluck the analyses they need from the pass reference.
Switch some never-null pointers to references.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108969 91177308-0d34-0410-b5e6-96231b3b80d8
Determine which loop exit blocks need a 'pre-exit' block inserted.
Recognize when this would be impossible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108941 91177308-0d34-0410-b5e6-96231b3b80d8
This is a work in progress. So far we have some basic loop analysis to help
determine where it is useful to split a live range around a loop.
The actual loop splitting code from Splitter.cpp is also going to move in here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108842 91177308-0d34-0410-b5e6-96231b3b80d8
and interval table. Reduces output HTML file sizes by ~80% in my test cases.
Also fix access of private member type by << operator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108823 91177308-0d34-0410-b5e6-96231b3b80d8
Updated renderer to use allocation information from VirtRegMap (if
available) to render spilled intervals differently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108815 91177308-0d34-0410-b5e6-96231b3b80d8
loop, for the reasons in the comments. This is a
major win on 253.perlbmk on ARM Darwin. I expect it
to be a good heuristic in general, but it's possible
some things will regress; I'll be watching.
7940152.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108792 91177308-0d34-0410-b5e6-96231b3b80d8
update the current basic block in addition to the current insert
position, so that they remain consistent. This fixes rdar://8204072.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108765 91177308-0d34-0410-b5e6-96231b3b80d8
- Unfortunate, but necessary for now to handle subtarget instruction matching. Eventually we should factor out the lower level target machine information so we don't need to do this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108664 91177308-0d34-0410-b5e6-96231b3b80d8
I am assured by people more knowledgeable than me that there are no rounding issues in eliminating this.
This fixed <rdar://problem/8197504>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108639 91177308-0d34-0410-b5e6-96231b3b80d8
Still very much under development. Comments and fixes will be forthcoming.
(This commit includes some small tweaks to LiveIntervals & LoopInfo to support the splitter)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108615 91177308-0d34-0410-b5e6-96231b3b80d8
any command line paramater changed the register allocation produced by
PBQP.
Turns out variety is not the spice of life.
Fixed some comparators, added others. All good now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108613 91177308-0d34-0410-b5e6-96231b3b80d8
void foo() { __builtin_unreachable(); }
It will output the following on Darwin X86:
_func1:
Leh_func_begin0:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
Leh_func_end0:
This prolog adds a new Call Frame Information (CFI) row to the FDE with an
address that is not within the address range of the code it describes -- part is
equal to the end of the function -- and therefore results in an invalid EH
frame. If we emit a nop in this situation, then the CFI row is now within the
address range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108568 91177308-0d34-0410-b5e6-96231b3b80d8
since it doesn't work for front-ends which don't emit column information
(which includes llvm-gcc in its present configuration), and doesn't
work for clang for K&R style variables where the variables are declared
in a different order from the parameter list.
Instead, make a separate pass through the instructions to collect the
llvm.dbg.declare instructions in order. This ensures that the debug
information for variables is emitted in this order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108538 91177308-0d34-0410-b5e6-96231b3b80d8
occasions, caused code to be generated in a different order.
All cases I've seen involved float softening in the type
legalizer, and this could be perhaps be fixed there, but
it's better not to generate things differently in the first
place. 7797940 (6/29/2010..7/15/2010).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108484 91177308-0d34-0410-b5e6-96231b3b80d8
the function. We'll just turn it into a "trap" instruction instead.
The problem with not handling this is that it might generate a prologue without
the equivalent epilogue to go with it:
$ cat t.ll
define void @foo() {
entry:
unreachable
}
$ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables
.section __TEXT,__text,regular,pure_instructions
.globl _foo
.align 4, 0x90
_foo: ## @foo
Leh_func_begin0:
## BB#0: ## %entry
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
Leh_func_end0:
...
The unwind tables then have bad data in them causing all sorts of problems.
Fixes <rdar://problem/8096481>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108473 91177308-0d34-0410-b5e6-96231b3b80d8
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108465 91177308-0d34-0410-b5e6-96231b3b80d8
to keep "Text" in sync with the "pure instructions" section attribute.
Lack of this attribute was preventing the assembler from emitting
multibyte noops instructions for templates (and inlines, and other
coalesced stuff) and was causing the assembler to mismatch .o files.
This fixes rdar://8018335
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108461 91177308-0d34-0410-b5e6-96231b3b80d8
independent of the order that isel happens to visit the dbg_declare
intrinsics. This fixes a bug in which the formal arguments were
being printed in reverse order, now that fast isel is going bottom up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108369 91177308-0d34-0410-b5e6-96231b3b80d8
constants, since they may not be emited near the other instructions
which get the same line, and this confuses debug info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108302 91177308-0d34-0410-b5e6-96231b3b80d8
LiveInterval::overlapsFrom dereferences end() if it is called on an empty
interval.
It would be reasonable to just return false - an empty interval doesn't overlap
anything, but I want to know who is doing it first.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108264 91177308-0d34-0410-b5e6-96231b3b80d8
they already have one.
This fixes the himenobmtxpa miscompilation on ARM.
The PostRA scheduler got confused by the double memoperand and hoisted a stack
slot load above a store to the same slot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108219 91177308-0d34-0410-b5e6-96231b3b80d8
AggressiveAntiDepBreaker should not be using getPhysicalRegisterRegClass. An
instruction might be using a register that can only be replaced with one from
a subclass of getPhysicalRegisterRegClass.
With this patch we use getMinimalPhysRegClass. This is correct, but
conservative. We should check the uses of the register and select the
largest register class that can be used in all of them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108122 91177308-0d34-0410-b5e6-96231b3b80d8
physical register can be allocated in the class of the virtual are sufficient.
I think that the test for virtual registers is more strict than it needs to be,
it should be possible to coalesce two virtual registers the class of one
is a subclass of the other.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108118 91177308-0d34-0410-b5e6-96231b3b80d8
getMinimalPhysRegClass. It was used to produce spills, and it is better to
use the most specific class if possible.
Update getLoadStoreRegOpcode to handle GR32_AD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108115 91177308-0d34-0410-b5e6-96231b3b80d8
Targets must now implement TargetInstrInfo::copyPhysReg instead. There is no
longer a default implementation forwarding to copyRegToReg.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108095 91177308-0d34-0410-b5e6-96231b3b80d8
The first one was used just to call isSafeToMoveRegClassDefs. In
general, using a more specific reg class is better, in practice only
x86 implements that method and the results are always the same.
The second one is in FindFreeRegister and is used to check if a register
is in a register class, a much more direct call to contains is better as
it should cover more cases and is faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108093 91177308-0d34-0410-b5e6-96231b3b80d8
assert()s, switching to void-casts. Removed an unneeded Compiler.h include as
a result. There are two other uses in LLVM, but they're not due to assert()s,
so I've left them alone.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108088 91177308-0d34-0410-b5e6-96231b3b80d8
correct alignment information, which simplifies ExpandRes_VAARG a bit.
The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:
* The 's' in target data: If this is set to the minimal alignment of any
argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
example.
* The getTransientStackAlignment method. It is possible for an architecture to
have argument less aligned than what we maintain the stack pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108072 91177308-0d34-0410-b5e6-96231b3b80d8
The remaining copyRegToReg calls actually check the return value (shock!), so we
cannot trivially replace them with COPY instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108069 91177308-0d34-0410-b5e6-96231b3b80d8
if a block is split (by a custom inserter), the insert point may be in a
different block than it was originally. This fixes 32-bit llvm-gcc
bootstrap builds, and I haven't been able to reproduce it otherwise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108060 91177308-0d34-0410-b5e6-96231b3b80d8
ScheduleDAGEmit, TwoAddressLowering, and PHIElimination.
This switches the bulk of register copies to using COPY, but many less used
copyRegToReg calls remain.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108050 91177308-0d34-0410-b5e6-96231b3b80d8
- Check getBytesToPopOnReturn().
- Eschew ST0 and ST1 for return values.
- Fix the PIC base register initialization so that it doesn't ever
fail to end up the top of the entry block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108039 91177308-0d34-0410-b5e6-96231b3b80d8
inserted in a MBB, and return an already inserted MI.
This target API change is necessary to allow foldMemoryOperand to call
storeToStackSlot and loadFromStackSlot when folding a COPY to a stack slot
reference in a target independent way.
The foldMemoryOperandImpl hook is going to change in the same way, but I'll wait
until COPY folding is actually implemented. Most targets only fold copies and
won't need to specialize this hook at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107991 91177308-0d34-0410-b5e6-96231b3b80d8
U utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U test/CodeGen/X86/fast-isel.ll
U test/CodeGen/X86/fast-isel-loads.ll
U include/llvm/Target/TargetLowering.h
U include/llvm/Support/PassNameParser.h
U include/llvm/CodeGen/FunctionLoweringInfo.h
U include/llvm/CodeGen/CallingConvLower.h
U include/llvm/CodeGen/FastISel.h
U include/llvm/CodeGen/SelectionDAGISel.h
U lib/CodeGen/LLVMTargetMachine.cpp
U lib/CodeGen/CallingConvLower.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U lib/CodeGen/SelectionDAG/FastISel.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U lib/CodeGen/SelectionDAG/TargetLowering.cpp
U lib/Target/XCore/XCoreISelLowering.cpp
U lib/Target/XCore/XCoreISelLowering.h
U lib/Target/X86/X86ISelLowering.cpp
U lib/Target/X86/X86FastISel.cpp
U lib/Target/X86/X86ISelLowering.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107987 91177308-0d34-0410-b5e6-96231b3b80d8
disabled and then never turned back on again. Adjust some tests, one because
this change avoids an unnecessary instruction, and the other to make it
continue testing what it was intended to test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107941 91177308-0d34-0410-b5e6-96231b3b80d8
the simplification of frame index register scavenging to not have to check
for available registers directly and instead just let scavengeRegister()
handle it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107880 91177308-0d34-0410-b5e6-96231b3b80d8
EXTRACT_SUBREG no longer appears as a machine instruction. Use COPY instead.
Add isCopy() checks in many places using isMoveInstr() and isExtractSubreg().
The isMoveInstr hook will be removed later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107879 91177308-0d34-0410-b5e6-96231b3b80d8
This target hook is intended to replace copyRegToReg entirely, but for now it
calls copyRegToReg.
Any remaining calls to copyRegToReg wil be replaced by COPY instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107854 91177308-0d34-0410-b5e6-96231b3b80d8
(if there are any) and use the one which remains available for the longest
rather than just using the first one. This should help enable better re-use
of the loaded frame index values. rdar://7318760
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107847 91177308-0d34-0410-b5e6-96231b3b80d8
around everywhere, and also give it an InsertPt member, to enable isel
to operate at an arbitrary position within a block, rather than just
appending to a block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107791 91177308-0d34-0410-b5e6-96231b3b80d8
instance, rather than pointers to all of FunctionLoweringInfo's
members.
This eliminates an NDEBUG ABI sensitivity.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107789 91177308-0d34-0410-b5e6-96231b3b80d8
than assuming a target will custom lower them. Targets which do so should
exlicitly mark them as having custom lowerings. PR7454.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107734 91177308-0d34-0410-b5e6-96231b3b80d8
INSERT_SUBREG will now only appear in SSA machine instructions.
Fix the handling of partial redefs in ProcessImplicitDefs. This is now relevant
since partial redef COPY instructions appear.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107726 91177308-0d34-0410-b5e6-96231b3b80d8
It is OK for an alias live range to overlap if there is a copy to or from the
physical register. CoalescerPair can work out if the copy is coalescable
independently of the alias.
This means that we can join with the actual destination interval instead of
using the getOrigDstReg() hack. It is no longer necessary to merge clobber
ranges into subregisters.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107695 91177308-0d34-0410-b5e6-96231b3b80d8
This way *only* debug sections can be discarded, but not the opposite. Seems like the copy-and-pasto from ELF code, since there it contains the reverse flag ('alloc').
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107658 91177308-0d34-0410-b5e6-96231b3b80d8
This code is transitional, it will soon be possible to eliminate
isExtractSubreg, isInsertSubreg, and isMoveInstr in most places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107547 91177308-0d34-0410-b5e6-96231b3b80d8
The COPY instruction is intended to replace the target specific copy
instructions for virtual registers as well as the EXTRACT_SUBREG and
INSERT_SUBREG instructions in MachineFunctions. It won't we used in a selection
DAG.
COPY is lowered to native register copies by LowerSubregs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107529 91177308-0d34-0410-b5e6-96231b3b80d8
new basic blocks, and if used as a function argument, that can cause call frame
setup / destroy pairs to be split across a basic block boundary. That prevents
us from doing a simple assertion to check that the pairs match and alloc/
dealloc the same amount of space. Modify the assertion to only check the
amount allocated when there are matching pairs in the same basic block.
rdar://8022442
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107517 91177308-0d34-0410-b5e6-96231b3b80d8
- X86 unfolding should check if the instructions being unfolded has memoperands.
If there is no memoperands, then it must assume conservative alignment. If this
would introduce an expensive sse unaligned load / store, then unfoldMemoryOperand
etc. should not unfold the instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107509 91177308-0d34-0410-b5e6-96231b3b80d8
PrologEpilog code, and use it to determine whether
the asm forces stack alignment or not. gcc consistently
does not do this for GCC-style asms; Apple gcc inconsistently
sometimes does it for asm blocks. There is no
convenient place to put a bit in either the SDNode or
the MachineInstr form, so I've added an extra operand
to each; unlovely, but it does allow for expansion for
more bits, should we need it. PR 5125. Some
existing testcases are affected.
The operand lists of the SDNode and MachineInstr forms
are indexed with awesome mnemonics, like "2"; I may
fix this someday, but not now. I'm not making it any
worse. If anyone is inspired I think you can find all
the right places from this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107506 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to recognize the common case where all uses could be
rematerialized, and no stack slot allocation is necessary.
If some values could be fully rematerialized, remove them from the live range
before allocating a stack slot for the rest.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107492 91177308-0d34-0410-b5e6-96231b3b80d8
Objective-C metadata types which should be marked as "weak", but which the
linker will remove upon final linkage. However, this linkage isn't specific to
Objective-C.
For example, the "objc_msgSend_fixup_alloc" symbol is defined like this:
.globl l_objc_msgSend_fixup_alloc
.weak_definition l_objc_msgSend_fixup_alloc
.section __DATA, __objc_msgrefs, coalesced
.align 3
l_objc_msgSend_fixup_alloc:
.quad _objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_1
This is different from the "linker_private" linkage type, because it can't have
the metadata defined with ".weak_definition".
Currently only supported on Darwin platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107433 91177308-0d34-0410-b5e6-96231b3b80d8
available in a register. This is pretty primitive, but it reduces the
number of instructions in common testcases by 4%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107380 91177308-0d34-0410-b5e6-96231b3b80d8
correct catch-all value. This saves having to iterate through all of the
selectors in the program again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107345 91177308-0d34-0410-b5e6-96231b3b80d8
LocalRewriter::runOnMachineFunction uses this information to mark dead spill
slots.
This means that InlineSpiller now also works for functions that spill.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107302 91177308-0d34-0410-b5e6-96231b3b80d8
InlineSpiller inserts loads and spills immediately instead of deferring to
VirtRegMap. This is possible now because SlotIndexes allows instructions to be
inserted and renumbered.
This is work in progress, and is mostly a copy of TrivialSpiller so far. It
works very well for functions that don't require spilling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107227 91177308-0d34-0410-b5e6-96231b3b80d8
metadata types which should be marked as "weak", but which the linker will
remove upon final linkage. For example, the "objc_msgSend_fixup_alloc" symbol is
defined like this:
.globl l_objc_msgSend_fixup_alloc
.weak_definition l_objc_msgSend_fixup_alloc
.section __DATA, __objc_msgrefs, coalesced
.align 3
l_objc_msgSend_fixup_alloc:
.quad _objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_1
This is different from the "linker_private" linkage type, because it can't have
the metadata defined with ".weak_definition".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107205 91177308-0d34-0410-b5e6-96231b3b80d8
A partial redefine needs to be treated like a tied operand, and the register
must be reloaded while processing use operands.
This fixes a bug where partially redefined registers were processed as normal
defs with a reload added. The reload could clobber another use operand if it was
a kill that allowed register reuse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107193 91177308-0d34-0410-b5e6-96231b3b80d8
The LowerSubregs pass needs to preserve implicit def operands attached to
EXTRACT_SUBREG instructions when it replaces those instructions with copies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107189 91177308-0d34-0410-b5e6-96231b3b80d8
of getPhysicalRegisterRegClass with it.
If we want to make a copy (or estimate its cost), it is better to use the
smallest class as more efficient operations might be possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107140 91177308-0d34-0410-b5e6-96231b3b80d8
back-edges), make sure not to include dbg_value instructions in the count.
Closing in on the end of rdar://7797940
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107119 91177308-0d34-0410-b5e6-96231b3b80d8
There are 2 changes relative to the previous version of the patch:
1) For the "simple" if-conversion case, there's no need to worry about
RemoveExtraEdges not handling an unanalyzable branch. Predicated terminators
are ignored in this context, so RemoveExtraEdges does the right thing.
This might break someday if we ever treat indirect branches (BRIND) as
predicable, but for now, I just removed this part of the patch, because
in the case where we do not add an unconditional branch, we rely on keeping
the fall-through edge to CvtBBI (which is empty after this transformation).
The change relative to the previous patch is:
@@ -1036,10 +1036,6 @@
IterIfcvt = false;
}
- // RemoveExtraEdges won't work if the block has an unanalyzable branch,
- // which is typically the case for IfConvertSimple, so explicitly remove
- // CvtBBI as a successor.
- BBI.BB->removeSuccessor(CvtBBI->BB);
RemoveExtraEdges(BBI);
// Update block info. BB can be iteratively if-converted.
2) My patch exposed a bug in the code for merging the tail of a "diamond",
which had previously never been exercised. The code was simply checking that
the tail had a single predecessor, but there was a case in
MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was
neither edge of the diamond. I added the following change to check for
that:
@@ -1276,7 +1276,18 @@
// tail, add a unconditional branch to it.
if (TailBB) {
BBInfo TailBBI = BBAnalysis[TailBB->getNumber()];
- if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) {
+ bool CanMergeTail = !TailBBI.HasFallThrough;
+ // There may still be a fall-through edge from BBI1 or BBI2 to TailBB;
+ // check if there are any other predecessors besides those.
+ unsigned NumPreds = TailBB->pred_size();
+ if (NumPreds > 1)
+ CanMergeTail = false;
+ else if (NumPreds == 1 && CanMergeTail) {
+ MachineBasicBlock::pred_iterator PI = TailBB->pred_begin();
+ if (*PI != BBI1->BB && *PI != BBI2->BB)
+ CanMergeTail = false;
+ }
+ if (CanMergeTail) {
MergeBlocks(BBI, TailBBI);
TailBBI.IsDone = true;
} else {
With these fixes, I was able to run all the SingleSource and MultiSource
tests successfully.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107110 91177308-0d34-0410-b5e6-96231b3b80d8
have to be registers, per gcc documentation. This affects
the logic for determining what "g" should lower to. PR 7393.
A couple of existing testcases are affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107079 91177308-0d34-0410-b5e6-96231b3b80d8