process those instructions that define phi sources. This is a 47% speedup of
StrongPHIElimination compile time on 403.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122627 91177308-0d34-0410-b5e6-96231b3b80d8
we are only interested in the defs when discovering interferences.
This is a 28% speedup running StrongPHIElimination on 403.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122596 91177308-0d34-0410-b5e6-96231b3b80d8
when running without the verifier, and I have not yet checked them to see if
the new results are still correct. There are more verifier failures, but they
all seem to be additional occurrences of verifier failures that occur with the
existing PHIElimination pass. There are a few obvious issues with the code:
1) It doesn't properly update the register equivalence classes during copy
insertion, and instead recomputes them before merging live intervals and
renaming registers. I wanted to keep this first patch simple for debugging
purposes, but it shouldn't be very hard to do this.
2) It doesn't mix the renaming and live interval merging with the copy insertion
process, which leads to a lot of virtual register churn. Virtual registers and
live intervals are created, only to later be merged into others. The code should
be smarter and only create a new virtual register if there is no existing
register in the same congruence class.
3) In one place the code uses a DenseMap per basic block, which is unnecessary
heap allocation. There should be an inline storage version of DenseMap.
I did a quick compile-time test of running llc on 403.gcc with and without
StrongPHIElimination. It is slightly slower with StrongPHIElimination, because
the small decrease in the coalescer runtime can't beat the increase in phi
elimination runtime. Perhaps fixing the above performance issues will narrow
the gap.
I also haven't yet run any tests of the quality of the generated code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122582 91177308-0d34-0410-b5e6-96231b3b80d8
valno verification. The "Different value live out of predecessor" check is
incorrect in the case of phi-def valnos, so just skip that check for phi-def
valnos and instead check that all of the valnos for predecessors have phi-kill.
Fixes PR8863.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122581 91177308-0d34-0410-b5e6-96231b3b80d8
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.
Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.
Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.
Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.
ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.
ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122541 91177308-0d34-0410-b5e6-96231b3b80d8
In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122472 91177308-0d34-0410-b5e6-96231b3b80d8
loads properly. We miscompiled the testcase into:
_test: ## @test
movl $128, (%rdi)
movzbl 1(%rdi), %eax
ret
Now we get a proper:
_test: ## @test
movl $128, (%rdi)
movsbl (%rdi), %eax
movzbl %ah, %eax
ret
This fixes PR8757.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122392 91177308-0d34-0410-b5e6-96231b3b80d8
the shift type was needed one place, the shift count
type another. The transform in 123555 had the same
problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122366 91177308-0d34-0410-b5e6-96231b3b80d8
count operand. These should be the same but apparently are
not always, and this is cleaner anyway. This improves the
code in an existing test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122354 91177308-0d34-0410-b5e6-96231b3b80d8
of the problems with my last attempt were in the updating of LiveIntervals
rather than the coalescing itself. Therefore, I decided to get that right first
by essentially reimplementing the existing PHIElimination using LiveIntervals.
It works correctly, with only a few tests failing (which may not be legitimate
failures) and no new verifier failures (at least as far as I can tell, I didn't
count the number per file).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122321 91177308-0d34-0410-b5e6-96231b3b80d8
Edge bundles is an annotation on the CFG that turns it into a bipartite directed
graph where each basic block is connected to an outgoing and an ingoing bundle.
These bundles are useful for identifying regions of the CFG for live range
splitting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122301 91177308-0d34-0410-b5e6-96231b3b80d8
ARM (and other 32-bit-only) targets support for i8 and i16 overflow
multiplies. The generated code isn't great, but this at least fixes
CodeGen/Generic/overflow.ll when running on ARM hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122221 91177308-0d34-0410-b5e6-96231b3b80d8
Imagine we see:
EFLAGS = inst1
EFLAGS = inst2 FLAGS
gpr = inst3 EFLAGS
Previously, we would refuse to schedule inst2 because it clobbers
the EFLAGS of the predecessor. However, it also uses the EFLAGS
of the predecessor, so it is safe to emit. SDep edges ensure that
the right order happens already anyway.
This fixes 2 testsuite crashes with the X86 patch I'm going to
commit next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122211 91177308-0d34-0410-b5e6-96231b3b80d8
alternative register allocator that does not require LiveIntervals by specifying
it on the command-line for a target that has StrongPHIElimination enabled by
default.
These checks are pretty meaningless anyways, since StrongPHIElimination and
PHIElimination are never used at the same time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122176 91177308-0d34-0410-b5e6-96231b3b80d8
isel is *required* to split the edge. PHI values get evaluated
on the edge, not in their predecessor block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122170 91177308-0d34-0410-b5e6-96231b3b80d8
use before rematerializing the load.
This allows us to produce:
addps LCPI0_1(%rip), %xmm2
Instead of:
movaps LCPI0_1(%rip), %xmm3
addps %xmm3, %xmm2
Saving a register and an instruction. The standard spiller already knows how to
do this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122133 91177308-0d34-0410-b5e6-96231b3b80d8
the loop predecessors.
The register can be live-out from a predecessor without being live-in to the
loop header if there is a critical edge from the predecessor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122123 91177308-0d34-0410-b5e6-96231b3b80d8
createMachineVerifierPass and MachineFunction::verify.
The banner is printed before the machine code dump, just like the printer pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122113 91177308-0d34-0410-b5e6-96231b3b80d8
RegAllocBase::VerifyEnabled.
Run the machine code verifier in a few interesting places during RegAllocGreedy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122107 91177308-0d34-0410-b5e6-96231b3b80d8
The heuristics split around the largest loop where the current register may be
allocated without interference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122106 91177308-0d34-0410-b5e6-96231b3b80d8
may be called. If the entry block is empty, the insertion point iterator will be
the "end()" value. Calling ->getParent() on it (among others) causes problems.
Modify materializeFrameBaseRegister to take the machine basic block and insert
the frame base register at the beginning of that block. (It's very similar to
what the code does all ready. The only difference is that it will always insert
at the beginning of the entry block instead of after a previous materialization
of the frame base register. I doubt that that matters here.)
<rdar://problem/8782198>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122104 91177308-0d34-0410-b5e6-96231b3b80d8
BUILD_VECTOR operands where the element type is not legal. I had previously
changed this code to insert TRUNCATE operations, but that was just wrong.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122102 91177308-0d34-0410-b5e6-96231b3b80d8
the operand uses the same register as a tied operand:
%r1 = add %r1, %r1
If add were a three-address instruction, kill flags would be required on at
least one of the uses. Since it is a two-address instruction, the tied use
operand must not have a kill flag.
This change makes the kill flag on the untied use operand optional.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122082 91177308-0d34-0410-b5e6-96231b3b80d8
This is a three-way interval list intersection between a virtual register, a
live interval union, and a loop. It will be used to identify interference-free
loops for live range splitting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122034 91177308-0d34-0410-b5e6-96231b3b80d8
live range splitting around loops guided by register pressure.
So far, trySplit() simply prints a lot of debug output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121918 91177308-0d34-0410-b5e6-96231b3b80d8
A MachineLoopRange contains the intervals of slot indexes covered by the blocks
in a loop. This representation of the loop blocks is more efficient to compare
against interfering registers during register coalescing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121917 91177308-0d34-0410-b5e6-96231b3b80d8
Bypass loops have the current live range live through, but contain no uses or
defs. Splitting around a bypass loop can free registers for other uses inside
the loop by spilling the split range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121871 91177308-0d34-0410-b5e6-96231b3b80d8
regB = move RCX
regA = op regB, regC
RAX = move regA
where both regB and regC are killed. If regB is constrainted to non-compatible
physical registers but regC is not constrainted at all, then it's better to
commute the instruction.
movl %edi, %eax
shlq $32, %rcx
leaq (%rcx,%rax), %rax
=>
movl %edi, %eax
shlq $32, %rcx
orq %rcx, %rax
rdar://8762995
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121793 91177308-0d34-0410-b5e6-96231b3b80d8
spill weight. Filter out fixed registers instead.
Add support for reassigning an interference that was assigned to an alias.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121737 91177308-0d34-0410-b5e6-96231b3b80d8
when the wider type is legal. This allows us to compile:
define zeroext i16 @test1(i16 zeroext %x) nounwind {
entry:
%div = udiv i16 %x, 33
ret i16 %div
}
into:
test1: # @test1
movzwl 4(%esp), %eax
imull $63551, %eax, %eax # imm = 0xF83F
shrl $21, %eax
ret
instead of:
test1: # @test1
movw $-1985, %ax # imm = 0xFFFFFFFFFFFFF83F
mulw 4(%esp)
andl $65504, %edx # imm = 0xFFE0
movl %edx, %eax
shrl $5, %eax
ret
Implementing rdar://8760399 and example #4 from:
http://blog.regehr.org/archives/320
We should implement the same thing for [su]mul_hilo, but I don't
have immediate plans to do this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121696 91177308-0d34-0410-b5e6-96231b3b80d8
for each constant pool entry. Using WriteTypeSymbolic here takes time
proportional to the size of the module, for each constant pool entry.
This speeds up -verbose-asm llc on 252.eon (a random testcase at my disposal)
from 4.4s to 2.137s. llc takes 2.11s with asm-verbose off, so this is now a
pretty reasonable cost for verbose comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121691 91177308-0d34-0410-b5e6-96231b3b80d8
catch this here rather than later after accessing uninitialized memory
etc. Fires when compiling the testcase in PR8237.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121635 91177308-0d34-0410-b5e6-96231b3b80d8
The spiller should only spill. The register allocator will drive live range
splitting, it has the needed information about register pressure and
interferences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121590 91177308-0d34-0410-b5e6-96231b3b80d8
registers for a given virtual register.
Reserved registers are filtered from the allocation order, and any valid hint is
returned as the first suggestion.
For target dependent hints, a number of arcane target hooks are invoked.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121497 91177308-0d34-0410-b5e6-96231b3b80d8
references instead.
Similarly, IntervalMap::begin() is almost as expensive as find(), so use find(x)
instead of begin().advanceTo(x);
This makes RegAllocBasic run another 5% faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121344 91177308-0d34-0410-b5e6-96231b3b80d8
abstract priority queue interface in subclasses that want to override the
priority calculations.
Subclasses must provide a getPriority() implementation instead.
This approach requires less code as long as priorities are expressable as simple
floats, and it avoids the dangers of defining potentially expensive priority
comparison functions.
It also should speed up priority_queue operations since they no longer have to
chase pointers when comparing registers. This is not measurable, though.
Preferably, we shouldn't use floats to guide code generation. The use of floats
here is derived from the use of floats for spill weights. Spill weights have a
dynamic range that doesn't lend itself easily to a fixpoint implementation.
When someone invents a stable spill weight representation, it can be reused for
allocation priorities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121294 91177308-0d34-0410-b5e6-96231b3b80d8
both forward and backward scheduling. Rename it to
ScoreboardHazardRecognizer (Scoreboard is one word). Remove integer
division from the scoreboard's critical path.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121274 91177308-0d34-0410-b5e6-96231b3b80d8
This new register allocator is initially identical to RegAllocBasic, but it will
receive all of the tricks that RegAllocBasic won't get.
RegAllocGreedy will eventually replace linear scan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121234 91177308-0d34-0410-b5e6-96231b3b80d8
Minor optimization to the use of IntervalMap iterators. They are fairly
heavyweight, so prefer SI.valid() over SI != end().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121217 91177308-0d34-0410-b5e6-96231b3b80d8
zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method
trunc(), to be const and to return a new value instead of modifying the
object in place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121120 91177308-0d34-0410-b5e6-96231b3b80d8
as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc
bootstrap on darwin10 using darwin9's assembler and linker.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121006 91177308-0d34-0410-b5e6-96231b3b80d8
time, this method existed, but now PHIElimination uses the method of the same
name on MachineBasicBlock.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120952 91177308-0d34-0410-b5e6-96231b3b80d8
The StrongPHIElimination pass did not work, and nobody has worked on it for two
years.
A rewrite is underway, so I am leaving this shell pass instead of deleting it
completely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120830 91177308-0d34-0410-b5e6-96231b3b80d8
Scan the MachineFunction for DBG_VALUE instructions, and replace them with a
data structure similar to LiveIntervals. The live range of a DBG_VALUE is
determined by propagating it down the dominator tree until a new DBG_VALUE is
found. When a DBG_VALUE lives in a register, its live range is confined to the
live range of the register's value.
LiveDebugVariables runs before coalescing, so DBG_VALUEs are not artificially
extended when registers are joined.
The missing half will recreate DBG_VALUE instructions from the intervals when
register allocation is complete.
The pass is disabled by default. It can be enabled with the temporary command
line option -live-debug-variables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120636 91177308-0d34-0410-b5e6-96231b3b80d8
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120501 91177308-0d34-0410-b5e6-96231b3b80d8
in favor of the widespread llvm style. Capitalize variables and add
newlines for visual parsing. Rename variables for readability.
And other cleanup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120490 91177308-0d34-0410-b5e6-96231b3b80d8
This analysis is going to run immediately after LiveIntervals. It will stay
alive during register allocation and keep track of user variables mentioned in
DBG_VALUE instructions.
When the register allocator is moving values between registers and the stack, it
is very hard to keep track of DBG_VALUE instructions. We usually get it wrong.
This analysis maintains a data structure that makes it easy to update DBG_VALUE
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120385 91177308-0d34-0410-b5e6-96231b3b80d8
so don't claim they are. They are allocated using DAG.getNode, so attempts
to access MemSDNode fields results in reading off the end of the allocated
memory. This fixes crashes with "llc -debug" due to debug code trying to
print MemSDNode fields for these barrier nodes (since the crashes are not
deterministic, use valgrind to see this). Add some nasty checking to try
to catch this kind of thing in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119901 91177308-0d34-0410-b5e6-96231b3b80d8
DAGCombine from making an illegal transformation of bitcast of a scalar to a
vector into a scalar_to_vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119819 91177308-0d34-0410-b5e6-96231b3b80d8
MCStreamer instead of just MCObjectStreamer. Address changes cannot
be as efficient as we have to use DW_LNE_set_addres, but at least
most of the logic is shared.
This will be used so that, with CodeGen still using EmitDwarfLocDirective,
llvm-gcc is able to produce debug_line sections without needing an
assembler that supports .loc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119777 91177308-0d34-0410-b5e6-96231b3b80d8
if the extension types were not the same. The result was that if you
fed a select with sext and zext loads, as in the testcase, then it
would get turned into a zext (or sext) of the select, which is wrong
in the cases when it should have been an sext (resp. zext). Reported
and diagnosed by Sebastien Deldon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119728 91177308-0d34-0410-b5e6-96231b3b80d8
and testing is easier. A good example is the unknown-location.ll test that
now can just look for ".loc 1 0 0". We also don't use a DW_LNE_set_address for
every address change anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119613 91177308-0d34-0410-b5e6-96231b3b80d8
memset; we may need it to decide between MOVAPS and MOVUPS
later. Adjust a test that was looking for wrong code.
PR 3866 / 8675131.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119605 91177308-0d34-0410-b5e6-96231b3b80d8
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.
Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.
Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sinked the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.
2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.
rdar://8663787, rdar://8241368
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8
SrcMgrDiagHandler, we can improve clang diagnostics for inline asm:
instead of reporting them on a source line of the original line,
we can report it on the correct line wherever the string literal came
from. For something like this:
void foo() {
asm("push %rax\n"
".code32\n");
}
we used to get this: (note that the line in t.c isn't helpful)
t.c:4:7: error: warning: ignoring directive for now
asm("push %rax\n"
^
<inline asm>:2:1: note: instantiated into assembly here
.code32
^
now we get:
t.c:5:8: error: warning: ignoring directive for now
".code32\n"
^
<inline asm>:2:1: note: instantiated into assembly here
.code32
^
Note that we're pointing to line 5 properly now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119488 91177308-0d34-0410-b5e6-96231b3b80d8
cookie argument to the SourceMgr diagnostic stuff. This cleanly separates
LLVMContext's inlineasm handler from the sourcemgr error handling
definition, increasing type safety and cleaning things up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119486 91177308-0d34-0410-b5e6-96231b3b80d8
easier to debug, and to avoid complications when the CFG changes
in the middle of the instruction selection process.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119382 91177308-0d34-0410-b5e6-96231b3b80d8
Always spill the full representative register at any point where any subregister
is live.
This fixes PR8620 which caused the old logic to get confused and not spill
anything at all.
The fundamental problem here is that the coalescer is too aggressive about
physical register coalescing. It sometimes makes it impossible to allocate
registers without these emergency spills.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119375 91177308-0d34-0410-b5e6-96231b3b80d8