do not double-count the duplicate instructions by counting once from the
beginning and again from the end. Keep track of where the duplicates from
the beginning ended and don't go past that point when counting duplicates
at the end. Radar 8589805.
This change causes one of the MC/ARM/simple-fp-encoding tests to produce
different (better!) code without the vmovne instruction being tested.
I changed the test to produce vmovne and vmoveq instructions but moving
between register files in the opposite direction. That's not quite the same
but predicated versions of those instructions weren't being tested before,
so at least the test coverage is not any worse, just different.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117333 91177308-0d34-0410-b5e6-96231b3b80d8
instructions separately from the count of non-predicated instructions. The
instruction count is used in places to determine how many instructions to
copy, predicate, etc. and things get confused if that count includes the
extra cost for microcoded ops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117332 91177308-0d34-0410-b5e6-96231b3b80d8
2) live-outs.
Previously the post-RA schedulers completely ignore these dependencies since
returns, branches, etc. are all scheduling barriers. This patch model the
latencies between instructions being scheduled and the barriers. It also
handle calls by marking their register uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117193 91177308-0d34-0410-b5e6-96231b3b80d8
framework. It's purpose is not to improve register allocation per se,
but to make it easier to develop powerful live range splitting. I call
it the basic allocator because it is as simple as a global allocator
can be but provides the building blocks for sophisticated register
allocation with live range splitting.
A minimal implementation is provided that trivially spills whenever it
runs out of registers. I'm checking in now to get high-level design
and style feedback. I've only done minimal testing. The next step is
implementing a "greedy" allocation algorithm that does some register
reassignment and makes better splitting decisions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117174 91177308-0d34-0410-b5e6-96231b3b80d8
When a block has exactly two uses and the register is both live-in and live-out,
don't isolate the block. We would be inserting two copies, so we haven't really
made any progress.
If the live-in and live-out values separate into disconnected components after
splitting, we would be making progress. We can't detect that for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117169 91177308-0d34-0410-b5e6-96231b3b80d8
An exit block with a critical edge must only have predecessors in the loop, or
just before the loop. This guarantees that the inserted copies in the loop
predecessors dominate the exit block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117144 91177308-0d34-0410-b5e6-96231b3b80d8
- Initial register pressure in the loop should be all the live defs into the
loop. Not just those from loop preheader which is often empty.
- When an instruction is hoisted, update register pressure from loop preheader
to the original BB.
- Treat only use of a virtual register as kill since the code is still SSA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116956 91177308-0d34-0410-b5e6-96231b3b80d8
operand, also check if subregisters are killed.
Add <imp-def> operands for subregisters that remain alive after a super register
is killed.
I don't have a testcase for this that reproduces on trunk. <rdar://problem/8441758>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116940 91177308-0d34-0410-b5e6-96231b3b80d8
setup they require. Use this for ARM/Darwin to rematerialize the base
pointer from the frame pointer when required. rdar://8564268
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116879 91177308-0d34-0410-b5e6-96231b3b80d8
Pull an unsigned out of the Contents union such that it has the same size as two
pointers and no padding.
Arrange members such that the Contents union and all pointers can be 8-byte
aligned without padding.
This speeds up code generation by 0.8% on a 64-bit host. 32-bit hosts should be
unaffected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116857 91177308-0d34-0410-b5e6-96231b3b80d8
must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize
the pass's dependencies.
Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.
I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems
with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass
registration/creation, please send the testcase to me directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116820 91177308-0d34-0410-b5e6-96231b3b80d8
in MultiSource/Benchmarks/VersaBench/beamformer/beamformer.
SmallSet.insert returns true if the element is inserted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116790 91177308-0d34-0410-b5e6-96231b3b80d8
"long latency" enough to hoist even if it may increase spilling. Reloading
a value from spill slot is often cheaper than performing an expensive
computation in the loop. For X86, that means machine LICM will hoist
SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
instructions.
- Enable register pressure aware machine LICM by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116781 91177308-0d34-0410-b5e6-96231b3b80d8
does normal initialization and normal chaining. Change the default
AliasAnalysis implementation to NoAlias.
Update StandardCompileOpts.h and friends to explicitly request
BasicAliasAnalysis.
Update tests to explicitly request -basicaa.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116720 91177308-0d34-0410-b5e6-96231b3b80d8
All registers created during splitting or spilling are assigned to the same
stack slot as the parent register.
When splitting or rematting, we may not spill at all. In that case the stack
slot is still assigned, but it will be dead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116546 91177308-0d34-0410-b5e6-96231b3b80d8
splitting or spillling, and to help with rematerialization.
Use LiveRangeEdit in InlineSpiller and SplitKit. This will eventually make it
possible to share remat code between InlineSpiller and SplitKit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116543 91177308-0d34-0410-b5e6-96231b3b80d8
Before we would also split around a loop if any peripheral block had multiple
uses. This could cause repeated splitting when splitting a different live range
would insert uses into the periphery.
Now -spiller=inline passes the nightly test suite again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116494 91177308-0d34-0410-b5e6-96231b3b80d8
perform initialization without static constructors AND without explicit initialization
by the client. For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve. I hope to be able to relax
the latter requirement in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116334 91177308-0d34-0410-b5e6-96231b3b80d8
LocalRewriter.
This is a bit of a hack that adds an implicit use operand to model the
read-modify-write nature of a partial redef. Uses and defs are rewritten in
separate passes, and a single operand would never be processed twice.
<rdar://problem/8518892>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116210 91177308-0d34-0410-b5e6-96231b3b80d8
functions: computeRemainder and rewrite.
When the remainder breaks up into multiple components, remember to rewrite those
uses as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116121 91177308-0d34-0410-b5e6-96231b3b80d8
Such a check does not make any sense in presense of inlining and other compiler-dependent stuff.
This should fix bunch of warnings on mingw32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116113 91177308-0d34-0410-b5e6-96231b3b80d8
implicit. e.g.
%D6<def>, %D7<def> = VLD1q16 %R2<kill>, 0, ..., %Q3<imp-def>
%Q1<def> = VMULv8i16 %Q1<kill>, %Q3<kill>, ...
The real definition indices are 0,1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116080 91177308-0d34-0410-b5e6-96231b3b80d8
connected components. These components should be allocated different virtual
registers because there is no reason for them to be allocated together.
Add the ConnectedVNInfoEqClasses class to calculate the connected components,
and move values to new LiveIntervals.
Use it from SplitKit::rewrite by creating new virtual registers for the
components.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116006 91177308-0d34-0410-b5e6-96231b3b80d8
This function is intended to be used when inserting a machine instruction that
trivially restricts the legal registers, like LEA requiring a GR32_NOSP
argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115875 91177308-0d34-0410-b5e6-96231b3b80d8
allow target to correctly compute latency for cases where static scheduling
itineraries isn't sufficient. e.g. variable_ops instructions such as
ARM::ldm.
This also allows target without scheduling itineraries to compute operand
latencies. e.g. X86 can return (approximated) latencies for high latency
instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
e.g. ldm and those used by store multiple instructions, e.g. stm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115755 91177308-0d34-0410-b5e6-96231b3b80d8
never kept after splitting.
Keeping the original interval made sense when the split region doesn't modify
the register, and the original is spilled. We can get the same effect by
detecting reloaded values when spilling around copies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115695 91177308-0d34-0410-b5e6-96231b3b80d8
Insert copy after defining instruction.
Fix LiveIntervalMap::extendTo to properly handle live segments starting before
the current basic block.
Make sure the open live range is extended to the inserted copy's use slot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115665 91177308-0d34-0410-b5e6-96231b3b80d8
having to do a double cast (uint64_t --> double --> float). This is based on the algorithm from compiler_rt's __floatundisf
for X86-64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115634 91177308-0d34-0410-b5e6-96231b3b80d8
// %a = ...
// %b = and i32 %a, 2
// %c = srl i32 %b, 1
// brcond i32 %c ...
//
// into
//
// %a = ...
// %b = and i32 %a, 2
// %c = setcc eq %b, 0
// brcond %c ...
Make sure it restores local variable N1, which corresponds to the condition operand if it fails to match.
This apparently breaks TCE but since that backend isn't in the tree I don't have a test for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115571 91177308-0d34-0410-b5e6-96231b3b80d8
scheduling change in svn 115121. The CriticalAntiDepBreaker had bad
liveness information. It was calculating the KillIndices for one scheduling
region in a basic block, rescheduling that region so the KillIndices were
no longer valid, and then using those wrong KillIndices to make decisions
for the next scheduling region. I've not been able to reduce a small
testcase for this. Radar 8502534.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115400 91177308-0d34-0410-b5e6-96231b3b80d8
LiveInterval::MergeValueNumberInto instead of trying to extend LiveRanges and
getting it wrong.
This fixed PR8249 where a valno with a multi-segment live range was defined by
an identity copy created by RemoveCopyByCommutingDef. Some of the live
segments disappeared.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115385 91177308-0d34-0410-b5e6-96231b3b80d8
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115364 91177308-0d34-0410-b5e6-96231b3b80d8
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.
Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics.
MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces. Optimizations may occur on these forms and the
result casted back to x86_mmx, provided the result feeds into a
previous existing x86_mmx operation.
The point of all this is prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115243 91177308-0d34-0410-b5e6-96231b3b80d8
edited during emission.
If the basic block ends in a switch that gets lowered to a jump table, any
phis at the default edge were getting updated wrong. The jump table data
structure keeps a pointer to the header blocks that wasn't getting updated
after the MBB is split.
This bug was exposed on 32-bit Linux when disabling critical edge splitting in
codegen prepare.
The fix is to uipdate stale MBB pointers whenever a block is split during
emission.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115191 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than having arbitrary cutoffs, actually try to cost model the conversion.
For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114973 91177308-0d34-0410-b5e6-96231b3b80d8
support aligned comm. Detect when compiling for 10.4 and don't
emit an alignment for comm. THis will hopefully fix PR8198.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114817 91177308-0d34-0410-b5e6-96231b3b80d8
when the unconditional branch destination is the fallthrough block. The
canonicalization makes it easier to allow optimizations on DAGs to invert
conditional branches. The branch folding pass (and AnalyzeBranch) will clean up
the unnecessary unconditional branches later.
This is one of the patches leading up to disabling codegen prepare critical edge
splitting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114630 91177308-0d34-0410-b5e6-96231b3b80d8
Allocator instances can now be created by calling createPBQPRegisterAllocator.
Tidied up use of CoalescerPair as per Jakob's suggestions.
Made the new PBQPBuilder based construction process the default. The internal construction process
remains in-place and available via -pbqp-builder=false for now. It will be removed shortly if the new
process doesn't cause any regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114626 91177308-0d34-0410-b5e6-96231b3b80d8
creating it before and subtracting split ranges.
This way, the SSA update code in LiveIntervalMap can properly create and use new
phi values in dupli. Now it is possible to create split regions where a value
escapes along two different CFG edges, creating phi values outside the split
region.
This is a work in progress and probably quite broken.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114492 91177308-0d34-0410-b5e6-96231b3b80d8
that complex patterns are matched after the entire pattern has
a structural match, therefore the NodeStack isn't in a useful
state when the actual call to the matcher happens.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114489 91177308-0d34-0410-b5e6-96231b3b80d8
the predicate to discover the number of sign bits. Enhance X86's target lowering to provide
a useful response to this query.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114473 91177308-0d34-0410-b5e6-96231b3b80d8
I think I've audited all uses, so it should be dependable for address spaces,
and the pointer+offset info should also be accurate when there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114464 91177308-0d34-0410-b5e6-96231b3b80d8
instead of calling lower_bound or upper_bound directly.
This cleans up the search logic a bit because {lower,upper}_bound compare
LR->start by default, and it is usually simpler to search LR->end.
Funnelling all searches through one function also makes it possible to replace
the search algorithm with something faster than binary search.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114448 91177308-0d34-0410-b5e6-96231b3b80d8
into OptimizeCompareInstr.
This necessitates the passing of CmpValue around,
so widen the virtual functions to accomodate.
No functionality changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114428 91177308-0d34-0410-b5e6-96231b3b80d8
"getFixedStack" on the MachinePointerInfo class. While
this isn't the problem I'm setting out to solve, it is the
right way to eliminate PseudoSourceValue, so lets go with it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114406 91177308-0d34-0410-b5e6-96231b3b80d8
MachinePointerInfo, propagating the type out a level of API. Remove
the old MachineFunction::getMachineMemOperand impl.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114393 91177308-0d34-0410-b5e6-96231b3b80d8
MachinePointerInfo struct, no functionality change.
This also adds an assert to MachineMemOperand::MachineMemOperand
that verifies that the Value* is either null or is an IR pointer type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114389 91177308-0d34-0410-b5e6-96231b3b80d8
CombinerAA cannot assume that different FrameIndex's never alias, but can instead use
MachineFrameInfo to get the actual offsets of these slots and check for actual aliasing.
This fixes CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll and CodeGen/X86/tailcallstack64.ll
when CombinerAA is enabled, modulo a different register allocation sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114348 91177308-0d34-0410-b5e6-96231b3b80d8
For now the allocator still uses the old (internal) construction mechanism by default. This will be phased out soon assuming
no issues with the builder system come up.
To invoke the new construction mechanism just pass '-regalloc=pbqp -pbqp-builder' to llc. To provide custom constraints a
Target just needs to extend PBQPBuilder and pass an instance of their derived builder to the RegAllocPBQP constructor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114272 91177308-0d34-0410-b5e6-96231b3b80d8
NO path to the destination containing side effects, not that SOME path contains no side effects.
In practice, this only manifests with CombinerAA enabled, because otherwise the chain has little
to no branching, so "any" is effectively equivalent to "all".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114268 91177308-0d34-0410-b5e6-96231b3b80d8
1) Do forward copy propagation. This makes it easier to estimate the cost of the
instruction being sunk.
2) Break critical edges on demand, including cases where the value is used by
PHI nodes.
Critical edge splitting is not yet enabled by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114227 91177308-0d34-0410-b5e6-96231b3b80d8