Allow targets to expand COPY and other standard pseudo-instructions
before they are expanded with copyPhysReg().
This allows the target to examine the COPY instruction for extra
operands indicating it can be widened to a preferable super-register
copy. See the ARM -widen-vmovs option.
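As a rough illustration (not the actual ARM code; the class name and the
widening test are invented), a target override of the post-RA expansion hook
could look like this:

  // Called by ExpandPostRAPseudos before the generic COPY lowering
  // falls back to TargetInstrInfo::copyPhysReg().
  bool MyTargetInstrInfo::expandPostRAPseudo(MachineBasicBlock::iterator MI) const {
    if (MI->isCopy() && MI->getNumOperands() > 2) {
      // Extra implicit operands on the COPY mark it as widenable to a
      // super-register copy; emit the wide copy and erase the original.
      emitWideCopy(MI);          // hypothetical target helper
      MI->eraseFromParent();
      return true;               // expanded here, skip copyPhysReg()
    }
    return false;                // let the generic expansion handle it
  }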
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141578 91177308-0d34-0410-b5e6-96231b3b80d8
For example, MachineLICM should not hoist a load that is not guaranteed to be executed.
Radar 10254254.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141569 91177308-0d34-0410-b5e6-96231b3b80d8
across unwind edges. This is for the back-end which expects such things.
The code is from the original SjLj EH pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141463 91177308-0d34-0410-b5e6-96231b3b80d8
to the landing pad. This will be used by the back-end to generate the jump
tables for dispatching the arriving longjmp in sjlj eh.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141224 91177308-0d34-0410-b5e6-96231b3b80d8
PhysReg operands are not allowed to have sub-register indices at all.
For virtual registers with sub-reg indices, check that all registers in
the register class support the sub-reg index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141220 91177308-0d34-0410-b5e6-96231b3b80d8
EXTRACT_SUBREG is emitted as %dst = COPY %src:sub, so there is no need to
constrain the %dst register class. RegisterCoalescer will apply the
necessary constraints if it decides to eliminate the COPY.
The %src register class does need to be constrained to something with
the right sub-registers, though. This is currently done manually with
COPY_TO_REGCLASS nodes. They can possibly be removed after this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141207 91177308-0d34-0410-b5e6-96231b3b80d8
The register class created by INSERT_SUBREG and SUBREG_TO_REG must be
legal and support the SubIdx sub-registers.
The new getSubClassWithSubReg() hook can compute that.
This may create INSERT_SUBREG instructions defining a larger register
class than the sub-register being inserted. That is OK,
RegisterCoalescer will constrain the register class as needed when it
eliminates the INSERT_SUBREG instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141198 91177308-0d34-0410-b5e6-96231b3b80d8
TwoAddressInstructionPass should annotate instructions with <undef>
flags when it lowers REG_SEQUENCE instructions. LiveIntervals should not
be in the business of modifying code (except for kill flags, perhaps).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141187 91177308-0d34-0410-b5e6-96231b3b80d8
For example:
%vreg10:dsub_0<def,undef> = COPY %vreg1
%vreg10:dsub_1<def> = COPY %vreg2
is rewritten as:
%D2<def> = COPY %D0, %Q1<imp-def>
%D3<def> = COPY %D1, %Q1<imp-use,kill>, %Q1<imp-def>
The first COPY doesn't care about the previous value of %Q1, so it
doesn't read that register.
The second COPY is a partial redefinition of %Q1, so it implicitly kills
and redefines that register.
This makes it possible to recognize instructions that can harmlessly
clobber the full super-register: they write the super-register but do
not read it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141139 91177308-0d34-0410-b5e6-96231b3b80d8
RegisterCoalescer can create sub-register defs when it is joining a
register with a sub-register. Add <undef> flags to these new
sub-register defs where appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141138 91177308-0d34-0410-b5e6-96231b3b80d8
The <undef> flag says that a MachineOperand doesn't read its register,
or doesn't depend on the previous value of its register.
A full register def never depends on the previous register value. A
partial register def may depend on the previous value if it is intended
to update part of a register.
For example:
%vreg10:dsub_0<def,undef> = COPY %vreg1
%vreg10:dsub_1<def> = COPY %vreg2
The first copy instruction defines the full %vreg10 register with the
bits not covered by dsub_0 defined as <undef>. It is not considered a
read of %vreg10.
The second copy modifies part of %vreg10 while preserving the rest. It
has an implicit read of %vreg10.
This patch adds a MachineOperand::readsReg() method to determine if an
operand reads its register.
Previously, this was modelled by adding a full-register <imp-def>
operand to the instruction. This approach makes it possible to
determine directly from a MachineOperand if it reads its register. No
scanning of MI operands is required.
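In rough terms (a simplified sketch, not necessarily the exact
implementation), the new predicate behaves like:

  bool MachineOperand::readsReg() const {
    // A plain full def never reads; an <undef> operand never reads.
    // A use, or a partial (sub-register) def without <undef>, does.
    return !isUndef() && (isUse() || getSubReg());
  }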
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141124 91177308-0d34-0410-b5e6-96231b3b80d8
and the alignment is 0 (i.e., it's defined globally in one file and declared in
another file) it could get an alignment which is larger than the ABI allows for
that type, resulting in aligned moves being used for unaligned loads.
For instance, in file A.c:
  struct S s;
In file B.c:
  struct S {
    // something long
  };
  extern struct S s;
  void foo() {
    struct S p = s;
    // ...
  }
this copy is a 'memcpy' which is turned into a series of 'movaps' instructions
on X86. But this is wrong, because 'struct S' has alignment of 4, not 16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140902 91177308-0d34-0410-b5e6-96231b3b80d8
This helps with porting code from 2.9 to 3.0 as TargetSelect.h changed location,
and if you include the old one by accident you will trigger this assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140848 91177308-0d34-0410-b5e6-96231b3b80d8
The function needs to scan the implicit operands anyway, so no
performance is won by caching the number of implicit operands added to
an instruction.
This also fixes a bug when adding operands after an implicit operand has
been added manually. The NumImplicitOps count wasn't kept up to date.
MachineInstr::addOperand() will now consistently place all explicit
operands before all the implicit operands, regardless of the order they
are added. It is possible to change an MI opcode and add additional
explicit operands. They will be inserted before any existing implicit
operands.
The only exception is inline asm instructions where operands are never
reordered. This is because of a hack that marks explicit clobber regs
on inline asm as <implicit-def> to please the fast register allocator.
This hack can go away when InstrEmitter and FastIsel can add exact
<dead> flags to physreg defs.
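For illustration only (the opcode and register names are invented, and the
usual MBB/InsertPt/DL/TII values are assumed to be in scope), the guarantee
means a sequence like this still produces the expected operand layout:

  MachineInstrBuilder MIB = BuildMI(MBB, InsertPt, DL, TII->get(Opc), DstReg);
  MIB.addReg(ScratchReg, RegState::ImplicitDefine); // implicit def added first
  MIB.addReg(SrcReg);                               // explicit use added later
  // addOperand() still places the explicit SrcReg use ahead of the
  // implicit ScratchReg def in the final operand list.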
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140744 91177308-0d34-0410-b5e6-96231b3b80d8
Upon further review, most of the EH code should remain written at the IR
level. The part which breaks SSA form is the dispatch table, so that part will
be moved to the back-end.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140730 91177308-0d34-0410-b5e6-96231b3b80d8
This intrinsic is used to pass the index of the function context to the back-end
for further processing. The back-end is in charge of filling in the rest of the
entries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140676 91177308-0d34-0410-b5e6-96231b3b80d8
The DWARF exception pass uses the call site information, which is set up here. A
pre-RA pass is too late for it to use this information. So create and setup the
function context here, and then insert the call site values here (and map the
call sites for the DWARF EH pass). This is simpler than the original pass, and
doesn't make the CFG lose its SSA-ness.
It's a win-win-win-win-lose-win-win situation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140675 91177308-0d34-0410-b5e6-96231b3b80d8
current IR-level pass.
The old SjLj EH pass has some problems, especially with the new EH model. Most
significantly, it violates some of the new restrictions the new model has. For
instance, the 'dispatch' table wants to jump to the landing pad, but we cannot
allow that because only an invoke's unwind edge can jump to a landing pad. This
requires us to mangle the code rather badly. In addition, we need to keep the
now dead landingpad instructions around instead of CSE'ing them because the
DWARF emitter uses that information (they are dead because no control flow edge
will execute them - the control flow edge from an invoke's unwind is superseded
by the edge coming from the dispatch).
Basically, this pass belongs not at the IR level where SSA is king, but at the
code-gen level, where we have more flexibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140646 91177308-0d34-0410-b5e6-96231b3b80d8
Many targets use pseudo instructions to help register allocation. Like
the COPY instruction, these pseudos can be expanded after register
allocation. The early expansion can make life easier for PEI and the
post-ra scheduler.
This patch adds a hook that is called for all remaining pseudo
instructions from the ExpandPostRAPseudos pass.
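Conceptually, the pass's per-instruction dispatch ends up looking roughly
like this (a sketch; exact hook and helper signatures may differ, and
LowerCopy/LowerSubregToReg stand in for the pass's existing lowering code):

  for (MachineBasicBlock &MBB : MF)
    for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end(); I != E; ) {
      MachineInstr &MI = *I++;           // expansion may erase MI
      if (MI.isCopy())
        LowerCopy(MI);                   // generic lowering via copyPhysReg()
      else if (MI.isSubregToReg())
        LowerSubregToReg(MI);
      else if (MI.getDesc().isPseudo())
        TII->expandPostRAPseudo(MI);     // the new hook for target pseudos
    }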
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140472 91177308-0d34-0410-b5e6-96231b3b80d8
SDNodes may return values which are wider than the incoming element types. In
this patch we fix the integer promotion of these nodes.
Fixes spill-q.ll when running -promote-elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140471 91177308-0d34-0410-b5e6-96231b3b80d8
I'll fix the file contents in the next commit.
This pass is currently expanding the COPY and SUBREG_TO_REG pseudos. I
am going to add a hook so targets can expand more pseudo-instructions
after register allocation.
Many targets have pseudo-instructions that assist the register
allocator. They can be expanded after register allocation, before PEI
and PostRA scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140469 91177308-0d34-0410-b5e6-96231b3b80d8
(this is always the case for scalars), otherwise use the promoted result type.
Fix test/CodeGen/X86/vsplit-and.ll when promote-elements is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140464 91177308-0d34-0410-b5e6-96231b3b80d8
When generating the trunc-store of i1's, we need to use the vector type and not
the scalar type.
This patch fixes the assertion in CodeGen/Generic/bool-vector.ll when
running with -promote-elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140463 91177308-0d34-0410-b5e6-96231b3b80d8
DecomposeMERGE_VALUES to "know" that results are legalized in
a particular order, by passing it the number of the result
being legalized (the type legalization core provides this, it
just needs to be passed on).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140373 91177308-0d34-0410-b5e6-96231b3b80d8
integer-promotion of CONCAT_VECTORS.
Test: test/CodeGen/X86/widen_shuffle-1.ll
This patch fixes the above tests (when running with -promote-elements).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140372 91177308-0d34-0410-b5e6-96231b3b80d8
Sometimes register class constraints are trivial, like GR32->GR32_NOSP,
or GPR->rGPR. Teach InstrEmitter to simply constrain the virtual
register instead of emitting a copy in these cases.
Normally, these copies are handled by the coalescer. This saves some
coalescer work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140340 91177308-0d34-0410-b5e6-96231b3b80d8
The function will refuse to use a register class with fewer registers
than MinNumRegs. This can be used by clients to avoid accidentally
increasing register pressure too much.
The default value of MinNumRegs=0 doesn't affect how constrainRegClass()
works.
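A sketch of the resulting interface and a guarded call (parameter and
helper names here are illustrative, not taken from the tree):

  const TargetRegisterClass *
  MachineRegisterInfo::constrainRegClass(unsigned Reg,
                                         const TargetRegisterClass *RC,
                                         unsigned MinNumRegs = 0);

  // Only constrain if the common subclass still has at least 4 allocatable
  // registers; otherwise fall back to some other strategy (e.g. a COPY).
  if (!MRI->constrainRegClass(VReg, RequiredRC, /*MinNumRegs=*/4))
    emitCopyToConstrainedVReg(VReg, RequiredRC);   // hypothetical fallback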
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140339 91177308-0d34-0410-b5e6-96231b3b80d8
A few weeks ago, llvm completely inverted the debug info graph. Earlier, each debug info node kept track of its compile unit; now the compile unit keeps track of the important nodes. One impact of this change is that global variables no longer have any context, which should be checked before deciding to use an AT_specification DIE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140282 91177308-0d34-0410-b5e6-96231b3b80d8
Vector SetCC result types need to be type-legalized.
This code worked before because scalar result types are known to be legal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140249 91177308-0d34-0410-b5e6-96231b3b80d8
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140228 91177308-0d34-0410-b5e6-96231b3b80d8
No functionality change. The hook makes it explicit which patterns
require "special" handling, i.e., it self-documents tblgen
deficiencies. I plan to add verification in ExpandISelPseudos and
Thumb2SizeReduce to catch any missing hasPostISelHooks. Otherwise it's
too fragile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140160 91177308-0d34-0410-b5e6-96231b3b80d8
Modified ARMISelLowering::AdjustInstrPostInstrSelection to handle the
full gamut of CPSR defs/uses including instructions whose "optional"
cc_out operand is not really optional. This allowed removal of the
hasPostISelHook to simplify the .td files and make the implementation
more robust.
Fixes rdar://10137436: sqlite3 miscompile
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140134 91177308-0d34-0410-b5e6-96231b3b80d8
The leaveIntvAfter() function normally inserts a back-copy after the
requested instruction, making the back-copy kill the live range.
In spill mode, try to insert the back-copy before the last use instead.
That means the last use becomes the kill instead of the back-copy. This
lowers the register pressure because the last use can now redefine the
same register it was reading.
This will also improve compile time: The back-copy isn't a kill, so
hoisting it in hoistCopiesForSize() won't force a recomputation of the
source live range. Similarly, if the back-copy isn't hoisted by the
splitter, the spiller will not attempt hoisting it locally.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139883 91177308-0d34-0410-b5e6-96231b3b80d8
If the source register is live after the copy being spilled, there is no
point to hoisting it. Hoisting inside a basic block only serves to
resolve interferences by shortening the live range of the source.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139882 91177308-0d34-0410-b5e6-96231b3b80d8
When -split-spill-mode is enabled, spill hoisting is performed by
SplitKit instead of by InlineSpiller. This hidden command line option
is for testing the splitter spill mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139845 91177308-0d34-0410-b5e6-96231b3b80d8
Adjust counters when removing spill and reload instructions.
We still don't account for reloads being removed by eliminateDeadDefs().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139806 91177308-0d34-0410-b5e6-96231b3b80d8
When traceSiblingValue() encounters a PHI-def value created by live
range splitting, don't look at all the predecessor blocks. That can be
very expensive in a complicated CFG.
Instead, consider that all the non-PHI defs jointly dominate all the
PHI-defs. Tracing directly to all the non-PHI defs is much faster than
zipping around in the CFG when there are many PHIs with many
predecessors.
This significantly improves compile time for indirectbr interpreters.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139797 91177308-0d34-0410-b5e6-96231b3b80d8
Blocks with multiple PHI successors only need to go on the worklist
once. Use a SmallPtrSet to track the live-out blocks that have already
been handled. This is a lot faster than the two live range checks we
would otherwise do.
Also stop recomputing hasPHIKill flags. Like RenumberValues(), it is
conservatively correct to leave them in, and they are not used for
anything important.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139792 91177308-0d34-0410-b5e6-96231b3b80d8
It does, after all.
RemoveCopyByCommutingDef rewrites the uses of one particular value
number in A. It doesn't know how to rewrite phi uses, so there can't be
any.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139787 91177308-0d34-0410-b5e6-96231b3b80d8
There is only one legitimate use remaining, in addIntervalsForSpills().
All other calls to hasPHIKill() are only used to update PHIKill flags.
The addIntervalsForSpills() function is part of the old spilling
framework, only used by linearscan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139783 91177308-0d34-0410-b5e6-96231b3b80d8
Instead, let HasOtherReachingDefs() test for defs in B that overlap any
phi-defs in A as well. This test is slightly different, but almost
identical.
A perfectly precise test would only check those phi-defs in A that are
reachable from AValNo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139782 91177308-0d34-0410-b5e6-96231b3b80d8
The source live range is recomputed using shrinkToUses() which does
handle phis correctly. The hasPHIKill() condition was relevant in the
old days when ReMaterializeTrivialDef() tried to recompute the live
range itself.
The shrinkToUses() function will mark the original def as dead when no
more uses and phi kills remain. It is then removed by
runOnMachineFunction().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139781 91177308-0d34-0410-b5e6-96231b3b80d8
It is conservatively correct to keep the hasPHIKill flags, even after
deleting PHI-defs.
The calculation can be very expensive after taildup has created a
quadratic number of indirectbr edges in the CFG, and the hasPHIKill flag
isn't used for anything after RenumberValues().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139780 91177308-0d34-0410-b5e6-96231b3b80d8
An improper SlotIndex->VNInfo lookup was leading to unsafe copy removal.
Fixes PR10920 401.bzip2 miscompile with no IV rewrite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139765 91177308-0d34-0410-b5e6-96231b3b80d8
The LRE_DidCloneVirtReg callback may be called with virtual registers
that RAGreedy doesn't even know about yet. In that case, there are no
data structures to update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139702 91177308-0d34-0410-b5e6-96231b3b80d8
When a back-copy is hoisted to the nearest common dominator, keep
looking up the dominator tree for a less loopy dominator, and place the
back-copy there instead.
Don't do this when a single existing back-copy dominates all the others.
Assume the client knows what he is doing, and keep the dominating
back-copy.
This prevents us from hoisting back-copies into loops in most cases. If
a value is defined in a loop with multiple exits, we may still hoist
back-copies into that loop. That is the speed/size tradeoff.
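The dominator-tree walk can be pictured roughly as follows (a sketch using
MachineDominatorTree/MachineLoopInfo, with DomTree/Loops assumed to be in
scope; variable names are invented):

  MachineBasicBlock *Best = CommonDom;              // nearest common dominator
  unsigned BestDepth = Loops->getLoopDepth(Best);
  for (MachineDomTreeNode *N = DomTree->getNode(CommonDom)->getIDom();
       N && BestDepth > 0; N = N->getIDom()) {
    unsigned Depth = Loops->getLoopDepth(N->getBlock());
    if (Depth < BestDepth) {                        // strictly less loopy
      Best = N->getBlock();
      BestDepth = Depth;
    }
  }
  // Insert the hoisted back-copy in Best rather than in CommonDom.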
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139698 91177308-0d34-0410-b5e6-96231b3b80d8
When a ParentVNI maps to multiple defs in a new interval, its live range
may still be derived directly from RegAssign by transferValues().
On the other hand, when instructions have been rematerialized or
hoisted, it may be necessary to completely recompute live ranges using
LiveRangeCalc::extend() to all uses.
Use a bit in the value map to indicate that a live range must be
recomputed. Rename markComplexMapped() to forceRecompute().
This fixes some live range verification errors when
-split-spill-mode=size hoists back-copies by recomputing source ranges
when RegAssign kills can't be moved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139660 91177308-0d34-0410-b5e6-96231b3b80d8
Whenever the complement interval is defined by multiple copies of the
same value, hoist those back-copies to the nearest common dominator.
This ensures that at most one copy is inserted per value in the
complement interval, and no phi-defs are needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139651 91177308-0d34-0410-b5e6-96231b3b80d8
This function is used to flag values where the complement interval may
overlap other intervals. Call it from overlapIntv, and use the flag to
fully recompute those live ranges in transferValues().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139612 91177308-0d34-0410-b5e6-96231b3b80d8
Three out of four clients prefer this interface which is consistent with
extendIntervalEndTo() and LiveRangeCalc::extend().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139604 91177308-0d34-0410-b5e6-96231b3b80d8
The complement interval may overlap the other intervals created, so use
a separate LiveRangeCalc instance to compute its live range.
A LiveRangeCalc instance can only be shared among non-overlapping
intervals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139603 91177308-0d34-0410-b5e6-96231b3b80d8
SplitKit will soon need two copies of these data structures, and the
algorithms will also be useful when LiveIntervalAnalysis becomes
independent of LiveVariables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139572 91177308-0d34-0410-b5e6-96231b3b80d8
Splitting a landing pad takes considerable care because of PHIs and other
nasties. The problem is that the jump table needs to jump to the landing pad
block. However, the landing pad block can be jumped to only by an invoke
instruction. So we clone the landingpad instruction into its own basic block and
have the invoke jump to it. The landingpad instruction's basic block's
successor is now the target for the jump table.
But because of PHI nodes, we need to create another basic block for the jump
table to jump to. This is definitely a hack, because the values for the PHI
nodes may not be defined on the edge from the jump table. But that's okay,
because the jump table is simply a construct to mimic what is happening in the
CFG. So the values are mysteriously there, even though there is no value for the
PHI from the jump table's edge (hence calling this a hack).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139545 91177308-0d34-0410-b5e6-96231b3b80d8
SplitKit always computes a complement live range to cover the places
where the original live range was live, but no explicit region has been
allocated.
Currently, the complement live range is created to be as small as
possible - it never overlaps any of the regions. This minimizes
register pressure, but if the complement is going to be spilled anyway,
that is not very important. The spiller will eliminate redundant
spills, and hoist others by making the spill slot live range overlap
some of the regions created by splitting. Stack slots are cheap.
This patch adds the interface to enable spill modes in SplitKit. In
spill mode, SplitKit will assume that the complement is going to spill,
so it will allow it to overlap regions in order to avoid back-copies.
By doing some of the spiller's work early, the complement live range
becomes simpler. In some cases, it can become much simpler because no
extra PHI-defs are required. This will speed up both splitting and
spilling.
This is only the interface to enable spill modes, no implementation yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139500 91177308-0d34-0410-b5e6-96231b3b80d8
In some cases such as interpreters using indirectbr, the CFG can be very
complicated, and live range splitting may be forced to insert a large
number of phi-defs. When that happens, traceSiblingValue can spend a
lot of time zipping around in the CFG looking for defs and reloads.
This patch causes more information to be cached in SibValues, and the
cached values are used to terminate searches early. This speeds up
spilling by 20x in one interpreter test case. For more typical code,
this is just a 10% speedup of spilling.
The previous version had bugs that caused miscompilations. They have
been fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139378 91177308-0d34-0410-b5e6-96231b3b80d8
In some cases such as interpreters using indirectbr, the CFG can be very
complicated, and live range splitting may be forced to insert a large
number of phi-defs. When that happens, traceSiblingValue can spend a
lot of time zipping around in the CFG looking for defs and reloads.
This patch causes more information to be cached in SibValues, and the
cached values are used to terminate searches early. This speeds up
spilling by 20x in one interpreter test case. For more typical code,
this is just a 10% speedup of spilling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139247 91177308-0d34-0410-b5e6-96231b3b80d8
(The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139221 91177308-0d34-0410-b5e6-96231b3b80d8
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons. Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all"). Patch mostly by
Nadav Rotem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139159 91177308-0d34-0410-b5e6-96231b3b80d8
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC. While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust.trampoline is needed
in a different (nested) function. To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function. Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!). Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC. Patch mostly by Sanjoy Das.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139140 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a chain of zext -> assert_zext -> zext -> use, the first zext would get simplified away because of the later zext, and then the later zext would get simplified away because of the assert. The solution is to teach SimplifyDemandedBits that assert_zext demands all of the high bits of its input, rather than only those demanded by its users. No testcase because the only example I have manifests as llvm-gcc miscompiling LLVM, and I haven't found a smaller case that reproduces this problem.
Fixes <rdar://problem/10063365>.
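The demanded-bits computation at the heart of the fix can be sketched as a
small helper (names are illustrative; the real change lives inside
SimplifyDemandedBits):

  // For AssertZext, demand every bit above the asserted width in addition
  // to whatever the users of the assert actually demand.
  APInt demandedForAssertZext(const APInt &UserDemanded, unsigned BitWidth,
                              unsigned AssertedWidth) {
    APInt LowBits = APInt::getLowBitsSet(BitWidth, AssertedWidth);
    return ~LowBits | UserDemanded;
  }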
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139059 91177308-0d34-0410-b5e6-96231b3b80d8
to be unreliable on platforms which require memcpy calls, and it is
complicating broader legalize cleanups. It is hoped that these cleanups
will make memcpy byval easier to implement in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138977 91177308-0d34-0410-b5e6-96231b3b80d8
- On COFF the .lcomm directive has an alignment argument.
- On ELF we fall back to .local + .comm
Based on a patch by NAKAMURA Takumi.
Fixes PR9337, PR9483 and PR10128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138976 91177308-0d34-0410-b5e6-96231b3b80d8
An instruction may define part of a register where the other bits are
undefined. In that case, it is safe to rematerialize the instruction.
For example:
%vreg2:ssub_0<def> = VLDRS <cp#0>, 0, pred:14, pred:%noreg, %vreg2<imp-def>
The extra <imp-def> operand indicates that the instruction does not read
the other parts of the virtual register, so a remat is safe.
This patch simply allows multiple def operands for the virtual register.
It is MI->readsVirtualRegister() that determines whether we depend on a
previous value, in which case remat is impossible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138953 91177308-0d34-0410-b5e6-96231b3b80d8
The problem is fixed for all register allocators by r138944, so this
patch is no longer necessary.
<rdar://problem/10032939>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138945 91177308-0d34-0410-b5e6-96231b3b80d8
An instruction that redefines only part of a larger register can never
be rematerialized since the virtual register value depends on the old
value in other parts of the register.
This was fixed for the inline spiller in r138794. This patch fixes the
problem for all register allocators, and includes a small test case.
<rdar://problem/10032939>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138944 91177308-0d34-0410-b5e6-96231b3b80d8
Added canClobberReachingPhysRegUse() to handle a particular pattern in
which a two-address instruction could be forced to interfere with
EFLAGS, causing a compare to be unnecessarily cloned.
Fixes rdar://problem/5875261
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138924 91177308-0d34-0410-b5e6-96231b3b80d8
Emit a repeated sequence of bytes using .zero. This saves an enormous
amount of asm file space for certain programs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138864 91177308-0d34-0410-b5e6-96231b3b80d8
X86. Modify the pass added in the previous patch to call this new
code.
The new prologue generated will call a libgcc routine (__morestack)
to allocate more stack space from the heap when required.
Patch by Sanjoy Das.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138812 91177308-0d34-0410-b5e6-96231b3b80d8
Add an instruction flag: hasPostISelHook which tells the pre-RA scheduler to
call a target hook to adjust the instruction. For ARM, this is used to
adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC
instructions have implicit def of CPSR (required since it now uses CPSR physical
register dependency rather than "glue"). If the carry flag is used, then the
target hook will *fill in* the optional operand with CPSR. Otherwise, the hook
will remove the CPSR implicit def from the MachineInstr.
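In outline, the target hook might be structured like this (heavily
simplified; the helper functions are hypothetical and the real ARM code
handles many more cases):

  void ARMTargetLowering::AdjustInstrPostInstrSelection(MachineInstr *MI,
                                                        SDNode *Node) const {
    unsigned CCOutIdx = findOptionalCCOutOperand(MI);  // hypothetical helper
    if (nodeNeedsFlagSetting(Node)) {                  // hypothetical helper
      // The 's' variant is really required: make cc_out define CPSR.
      MI->getOperand(CCOutIdx).setReg(ARM::CPSR);
      MI->getOperand(CCOutIdx).setIsDef(true);
    } else {
      // Flags are unused: drop the implicit CPSR def added by tablegen.
      removeImplicitDef(MI, ARM::CPSR);                // hypothetical helper
    }
  }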
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138810 91177308-0d34-0410-b5e6-96231b3b80d8
I don't currently have a good testcase for this; will try to get one
tomorrow. <rdar://problem/10032939>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138794 91177308-0d34-0410-b5e6-96231b3b80d8
I don't really like the patterns, but I'm having trouble coming up with a
better way to handle them.
I plan on making other targets use the same legalization
ARM-without-memory-barriers is using... it's not especially efficient, but
if anyone cares, it's not that hard to fix for a given target if there's
some better lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138621 91177308-0d34-0410-b5e6-96231b3b80d8
A value of -1 at a call site tells the personality function that this call isn't
handled by the current function. Since the ResumeInsts are converted to calls to
_Unwind_SjLj_Resume, add a (volatile) store of -1 to its 'call site'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138416 91177308-0d34-0410-b5e6-96231b3b80d8
This is not necessarily the first or dominating use of the EH values. The IR
breaks if it's not. So replace the specific value in the instruction with the
new value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138406 91177308-0d34-0410-b5e6-96231b3b80d8
The invoke could be at the end of the entry block. If it's the only one, then we
won't process all of the landingpad instructions correctly. This code is
currently ugly, but should be made much nicer once the new EH switch is thrown.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138397 91177308-0d34-0410-b5e6-96231b3b80d8
value, we insert a load of the exception object and selector object from memory,
which is where it actually resides. If it's used by a PHI node, we follow that
to where it is being used. Eventually, all landingpad instructions should have
no uses. Any PHI nodes that were associated with those landingpads should be
removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138302 91177308-0d34-0410-b5e6-96231b3b80d8
the intent seems to be to terminate even in Release builds, just use abort()
directly.
If program flow ever reaches a __builtin_unreachable (which llvm_unreachable is
#define'd to on newer GCCs) then the program is undefined.
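A minimal illustration of the difference:

  #include <cstdlib>

  void checkInvariant(bool OK) {
    if (!OK) {
      // llvm_unreachable() can expand to __builtin_unreachable() on newer
      // GCCs, so reaching it in a Release build is undefined behavior, not
      // a guaranteed stop.  When termination is the intent even in Release
      // builds, call abort() directly.
      abort();
    }
  }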
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138068 91177308-0d34-0410-b5e6-96231b3b80d8
Normally, a partial register def is treated as reading the
super-register unless it also defines the full register like this:
%vreg110:sub_32bit<def> = COPY %vreg77:sub_32bit, %vreg110<imp-def>
This patch also uses the <undef> flag on partial defs to recognize
non-reading operands:
%vreg110:sub_32bit<def,undef> = COPY %vreg77:sub_32bit
This fixes a subtle bug in RegisterCoalescer where LIS->shrinkToUses
would treat a coalesced copy as still reading the register, extending
the live range artificially.
My test case only works when I disable DCE, leaving a dead copy for
RegisterCoalescer, so I am not including it.
<rdar://problem/9967101>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138018 91177308-0d34-0410-b5e6-96231b3b80d8
The landingpad instruction is lowered into the EXCEPTIONADDR and EHSELECTION
SDNodes. The information from the landingpad instruction is harvested by the
'AddLandingPadInfo' function. The new EH uses the current EH scheme in the
back-end. This will change once we switch over to the new scheme. (Reviewed by
Jakob!)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137880 91177308-0d34-0410-b5e6-96231b3b80d8
This generates the SDNodes for the new exception handling scheme. It takes the
two values coming from the landingpad instruction and assigns them to the
EXCEPTIONADDR and EHSELECTION nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137873 91177308-0d34-0410-b5e6-96231b3b80d8
Things are much saner now. We no longer need to modify the landing pads, because
of the invariants we impose upon them. The only thing DwarfEHPrepare needs to do
is convert the 'resume' instruction into a call to '_Unwind_Resume'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137855 91177308-0d34-0410-b5e6-96231b3b80d8
MDNode graph structure such that the compile unit keeps track of important MDNodes, and update the DWARF writer to process MDNodes top-down instead of bottom-up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137778 91177308-0d34-0410-b5e6-96231b3b80d8
When a variable is inlined in multiple places, the abstract variable keeps the name, location, type, etc., and all other concrete instances of the variable refer directly to the abstract variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137637 91177308-0d34-0410-b5e6-96231b3b80d8
be illegal, even if the requested vector type is legal. Testcase is one of the
disabled ARM tests in the vector-select patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137562 91177308-0d34-0410-b5e6-96231b3b80d8
This implements the 'landingpad' instruction. It's used to indicate that a basic
block is a landing pad. There are several restrictions on its use (see
LangRef.html for more detail). These restrictions allow the exception handling
code to gather the information it needs in a much more sane way.
This patch has the definition, implementation, C interface, parsing, and bitcode
support in it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137501 91177308-0d34-0410-b5e6-96231b3b80d8
The Query class now holds two iterators instead of an InterferenceResult
instance. The iterators are used as bookmarks for repeated
collectInterferingVRegs calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137380 91177308-0d34-0410-b5e6-96231b3b80d8
The InterferenceResult iterator turned out to be less important than we
thought it would be. LiveIntervalUnion clients want higher level
information, like the list of interfering virtual registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137346 91177308-0d34-0410-b5e6-96231b3b80d8
lower XMM register gets in first. This will allow the SUBREG pattern to
eliminate the first vector insertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137310 91177308-0d34-0410-b5e6-96231b3b80d8
Coalescing can remove copy-like instructions with sub-register operands
that constrained the register class. Examples are:
x86: GR32_ABCD:sub_8bit_hi -> GR32
arm: DPR_VFP2:ssub0 -> DPR
Recompute the register class of any virtual registers that are used by
fewer instructions after coalescing.
This affects code generation for the Cortex-A8 where we use NEON
instructions for f32 operations, c.f. fp_convert.ll:
vadd.f32 d16, d1, d0
vcvt.s32.f32 d0, d16
The register allocator is now free to use d16 for the temporary, and
that comes first in the allocation order because it doesn't interfere
with any s-registers.
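The coalescer-side call is roughly of this shape (sketch only; the exact
signature and surrounding bookkeeping may differ, and MRI/TM/DstReg are
assumed to be in scope):

  // After the coalesced copy is erased, the remaining uses of DstReg may no
  // longer need the constrained class (GR32_ABCD, DPR_VFP2, ...), so ask
  // MRI to inflate it back to the largest class the uses allow.
  if (MRI->recomputeRegClass(DstReg, *TM))
    ++NumInflated;   // hypothetical statistics counter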
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137133 91177308-0d34-0410-b5e6-96231b3b80d8
This function doesn't have anything to do with spill weights, and MRI
already has functions for manipulating the register class of a virtual
register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137123 91177308-0d34-0410-b5e6-96231b3b80d8
These methods are target-independent since they simply scan the
memory operands. They can live in TargetInstrInfoImpl.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137063 91177308-0d34-0410-b5e6-96231b3b80d8
All new local ranges are marked as RS_New now, so there is no need to
attempt splitting of RS_Spill ranges any more.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137002 91177308-0d34-0410-b5e6-96231b3b80d8
The local ranges created get to stay in the RS_New stage, just like for
local and region splitting.
This gives tryLocalSplit a bit more freedom the first time it sees one
of these new local ranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137001 91177308-0d34-0410-b5e6-96231b3b80d8
These functions are no longer used, and they are easily replaced with a
loop calling shouldSplitSingleBlock and splitSingleBlock.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136993 91177308-0d34-0410-b5e6-96231b3b80d8
Drop the use of SplitAnalysis::getMultiUseBlocks, there is no need to go
through a SmallPtrSet any more.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136992 91177308-0d34-0410-b5e6-96231b3b80d8
Normally, we don't create a live range for a single instruction in a
basic block, the spiller does that anyway. However, when splitting a
live range that belongs to a proper register sub-class, inserting these
extra COPY instructions completely removes the constraints from the
remainder interval, and it may then be allocated from the larger super-class.
The spiller will mop up these small live ranges if we end up spilling
anyway. It calls them snippets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136989 91177308-0d34-0410-b5e6-96231b3b80d8
Some instructions require restricted register classes, but most of the
time that doesn't affect register allocation. For example, some
instructions don't work with the stack pointer, but that is a reserved
register anyway.
Sometimes it matters: GR32_ABCD only has 4 allocatable registers. For
such a proper sub-class, the register allocator should try to enable
register class inflation since that makes more registers available for
allocation.
Make sure only legal super-classes are considered. For example, tGPR is
not a proper sub-class in Thumb mode, but in ARM mode it is.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136981 91177308-0d34-0410-b5e6-96231b3b80d8
The old code would look at kills and defs in one pass over the
instruction operands, causing problems with this code:
%R0<def>, %CPSR<def,dead> = tLSLri %R5<kill>, 2, pred:14, pred:%noreg
%R0<def>, %CPSR<def,dead> = tADDrr %R4<kill>, %R0<kill>, pred:14, pred:%noreg
The last instruction kills and redefines %R0, so it is still live after
the instruction.
This caused a register scavenger crash when compiling 483.xalancbmk for
armv6. I am not including a test case because it requires too much bad
luck to expose this old bug.
First you need to convince the register allocator to use %R0 twice on
the tADDrr instruction, then you have to convince BranchFolding to do
something that causes it to run the register scavenger on the bad block.
<rdar://problem/9898200>
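The fix amounts to scanning the operands twice (a sketch of the idea, not
the exact RegScavenger code; setUnused/setUsed stand in for the scavenger's
bookkeeping, and MI is the instruction being processed):

  // First pass: uses.  Registers killed here become free.
  for (const MachineOperand &MO : MI.operands())
    if (MO.isReg() && MO.isUse() && MO.isKill())
      setUnused(MO.getReg());
  // Second pass: defs.  A register that is both killed and redefined by MI
  // (like %R0 above) is correctly marked live after the instruction.
  for (const MachineOperand &MO : MI.operands())
    if (MO.isReg() && MO.isDef() && !MO.isDead())
      setUsed(MO.getReg());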
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136973 91177308-0d34-0410-b5e6-96231b3b80d8
inlined variable, based on the discussion in PR10542.
This explodes the runtime of several passes down the pipeline due to
a large number of "copies" remaining live across a large function. This
only shows up with both debug and opt, but when it does it creates
a many-minute compile when self-hosting LLVM+Clang. There are several
other cases that show these types of regressions.
All of this is tracked in PR10542, and progress is being made on fixing
the issue. Once it's addressed, this change can be re-instated; until then,
this restores the performance for self-hosting and other opt+debug builds.
Devang, let me know if this causes any trouble, or impedes fixing it in
any way, and thanks for working on this!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136953 91177308-0d34-0410-b5e6-96231b3b80d8
It is possible to have multiple DBG_VALUEs for the same variable:
32L TEST32rr %vreg0<kill>, %vreg0, %EFLAGS<imp-def>; GR32:%vreg0
DBG_VALUE 2, 0, !"i"
DBG_VALUE %noreg, %0, !"i"
When that happens, keep the last one instead of the first.
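The "keep the last one" rule is essentially a map overwrite while walking
the block in order (sketch; the map and surrounding names are illustrative):

  DenseMap<const MDNode *, MachineInstr *> LiveDbgValue;
  for (MachineInstr &MI : MBB)
    if (MI.isDebugValue())
      // A later DBG_VALUE for the same variable overwrites the earlier one.
      LiveDbgValue[MI.getDebugVariable()] = &MI;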
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136842 91177308-0d34-0410-b5e6-96231b3b80d8
This helps generate better code in functions with high register
pressure.
The previous version of compact region splitting caused regressions
because the regions were a bit too large. A stronger negative bias
applied in r136832 fixed this problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136836 91177308-0d34-0410-b5e6-96231b3b80d8
Apply twice the negative bias on transparent blocks when computing the
compact regions. This excludes loop backedges from the region when only
one of the loop blocks uses the register.
Previously, we would include the backedge in the region if the loop
preheader and the loop latch both used the register, but the loop header
didn't.
When both the header and latch blocks use the register, we still keep it
live on the backedge.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136832 91177308-0d34-0410-b5e6-96231b3b80d8