This new function will decrement the reference count, and collapse a
domain value when the last reference is gone.
This simplifies DomainValue reference counting, and decouples it from
the LiveRegs array.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144131 91177308-0d34-0410-b5e6-96231b3b80d8
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling seqeunce
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144124 91177308-0d34-0410-b5e6-96231b3b80d8
The old value may still be referenced by some live-out list, and we
don't wan't to collapse those instructions twice.
This fixes the "Can only swizzle VMOVD" assertion in some armv7 SPEC
builds.
<rdar://problem/10413292>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144117 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for trimming constants to GetDemandedBits. This fixes some funky
constant generation that occurs when stores are expanded for targets that don't
support unaligned stores natively.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144102 91177308-0d34-0410-b5e6-96231b3b80d8
When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144100 91177308-0d34-0410-b5e6-96231b3b80d8
DomainValues that are only used by "don't care" instructions are now
collapsed to the first possible execution domain after all basic blocks
have been processed. This typically means the PS domain on x86.
For example, the vsel_i64 and vsel_double functions in sse2-blend.ll are
completely collapsed to the PS domain instead of containing a mix of
execution domains created by isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144037 91177308-0d34-0410-b5e6-96231b3b80d8
The enterBasicBlock() function is combining live-out values from
predecessor blocks. The RPO traversal means that more predecessors
have been visited when that happens, only back-edges are missing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144025 91177308-0d34-0410-b5e6-96231b3b80d8
to fix the types section (all types, not just global types), and testcases.
The code to do the final emission is disabled by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143923 91177308-0d34-0410-b5e6-96231b3b80d8
the pubnames and pubtypes tables. LLDB can currently use this format
and a full spec is forthcoming and submission for standardization is planned.
A basic summary:
The dwarf accelerator tables are an indirect hash table optimized
for null lookup rather than access to known data. They are output into
an on-disk format that looks like this:
.-------------.
| HEADER |
|-------------|
| BUCKETS |
|-------------|
| HASHES |
|-------------|
| OFFSETS |
|-------------|
| DATA |
`-------------'
where the header contains a magic number, version, type of hash function,
the number of buckets, total number of hashes, and room for a special
struct of data and the length of that struct.
The buckets contain an index (e.g. 6) into the hashes array. The hashes
section contains all of the 32-bit hash values in contiguous memory, and
the offsets contain the offset into the data area for the particular
hash.
For a lookup example, we could hash a function name and take it modulo the
number of buckets giving us our bucket. From there we take the bucket value
as an index into the hashes table and look at each successive hash as long
as the hash value is still the same modulo result (bucket value) as earlier.
If we have a match we look at that same entry in the offsets table and
grab the offset in the data for our final match.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143921 91177308-0d34-0410-b5e6-96231b3b80d8
into the function. Reflect that here so that the array will be placed next to
the SP.
<rdar://problem/10128329>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143590 91177308-0d34-0410-b5e6-96231b3b80d8
the mailing list. Suggestions for other statistics to collect would be
awesome. =]
Currently these are implemented as a separate pass guarded by a separate
flag. I'm not thrilled by that, but I wanted to be able to collect the
statistics for the old code placement as well as the new in order to
have a point of comparison. I'm planning on folding them into the single
pass if / when there is only one pass of interest.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143537 91177308-0d34-0410-b5e6-96231b3b80d8
If all of the inputs are zero/any_extended, create a new simple BV
which can be further optimized by other BV optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143297 91177308-0d34-0410-b5e6-96231b3b80d8
fixes: Use a separate register, instead of SP, as the
calling-convention resource, to avoid spurious conflicts with
actual uses of SP. Also, fix unscheduling of calling sequences,
which can be triggered by pseudo-two-address dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143206 91177308-0d34-0410-b5e6-96231b3b80d8
Don't assume APInt::getRawData() would hold target-aware endianness nor host-compliant endianness. rawdata[0] holds most lower i64, even on big endian host.
FIXME: Add a testcase for big endian target.
FIXME: Ditto on CompileUnit::addConstantFPValue() ?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143194 91177308-0d34-0410-b5e6-96231b3b80d8
it fixes the dragonegg self-host (it looks like gcc is miscompiled).
Original commit messages:
Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.
Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.
Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.
Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.
This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.
Delete #if 0 code accidentally left in.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143188 91177308-0d34-0410-b5e6-96231b3b80d8
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.
Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.
Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.
Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.
This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143177 91177308-0d34-0410-b5e6-96231b3b80d8
trying to legalize the operand types when only the result type
is required to be legalized - the type legalization machinery
will get round to the operands later if they need legalizing.
There can be a point to legalizing operands in parallel with
the result: when this saves compile time or results in better
code. There was only one case in which this was true: when
the operand is also split, so keep the logic for that bit.
As a result of this change, additional operand legalization
methods may need to be introduced to handle nodes where the
result and operand types can differ, like SIGN_EXTEND, but
the testsuite doesn't contain any tests where this is the case.
In any case, it seems better to require such methods (and die
with an assert if they doesn't exist) than to quietly produce
wrong code if we forgot to special case the node in
SplitVecRes_UnaryOp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143026 91177308-0d34-0410-b5e6-96231b3b80d8
This code makes different decisions when compiled into x87 instructions
because of different rounding behavior. That caused phase 2/3
miscompares on 32-bit Linux when the phase 1 compiler was built with gcc
(using x87), and the phase 2 compiler was built with clang (using SSE).
This fixes PR11200.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143006 91177308-0d34-0410-b5e6-96231b3b80d8
An MBB which branches to an EH landing pad shouldn't be considered for tail merging.
In SjLj EH, the jump to the landing pad is not done explicitly through a branch
statement. The EH landing pad is added as a successor to the throwing
BB. Because of that however, the branch folding pass could mistakenly think that
it could merge the throwing BB with another BB. This isn't safe to do.
<rdar://problem/10334833>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143001 91177308-0d34-0410-b5e6-96231b3b80d8
down to this commit. Original commit message:
An MBB which branches to an EH landing pad shouldn't be considered for tail merging.
In SjLj EH, the jump to the landing pad is not done explicitly through a branch
statement. The EH landing pad is added as a successor to the throwing
BB. Because of that however, the branch folding pass could mistakenly think that
it could merge the throwing BB with another BB. This isn't safe to do.
<rdar://problem/10334833>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142920 91177308-0d34-0410-b5e6-96231b3b80d8
In SjLj EH, the jump to the landing pad is not done explicitly through a branch
statement. The EH landing pad is added as a successor to the throwing
BB. Because of that however, the branch folding pass could mistakenly think that
it could merge the throwing BB with another BB. This isn't safe to do.
<rdar://problem/10334833>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142891 91177308-0d34-0410-b5e6-96231b3b80d8
to get important constant branch probabilities and use them for finding
the best branch out of a set of possibilities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142762 91177308-0d34-0410-b5e6-96231b3b80d8
discussions with Andy. Fundamentally, the previous algorithm is both
counter productive on several fronts and prioritizing things which
aren't necessarily the most important: static branch prediction.
The new algorithm uses the existing loop CFG structure information to
walk through the CFG itself to layout blocks. It coalesces adjacent
blocks within the loop where the CFG allows based on the most likely
path taken. Finally, it topologically orders the block chains that have
been formed. This allows it to choose a (mostly) topologically valid
ordering which still priorizes fallthrough within the structural
constraints.
As a final twist in the algorithm, it does violate the CFG when it
discovers a "hot" edge, that is an edge that is more than 4x hotter than
the competing edges in the CFG. These are forcibly merged into
a fallthrough chain.
Future transformations that need te be added are rotation of loop exit
conditions to be fallthrough, and better isolation of cold block chains.
I'm also planning on adding statistics to model how well the algorithm
does at laying out blocks based on the probabilities it receives.
The old tests mostly still pass, and I have some new tests to add, but
the nested loops are still behaving very strangely. This almost seems
like working-as-intended as it rotated the exit branch to be
fallthrough, but I'm not convinced this is actually the best layout. It
is well supported by the probabilities for loops we currently get, but
those are pretty broken for nested loops, so this may change later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142743 91177308-0d34-0410-b5e6-96231b3b80d8
The assumption in the back-end is that PHIs are not allowed at the start of the
landing pad block for SjLj exceptions.
<rdar://problem/10313708>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142689 91177308-0d34-0410-b5e6-96231b3b80d8
ZExtPromotedInteger and SExtPromotedInteger based on the operation we legalize.
SetCC return type needs to be legalized via PromoteTargetBoolean.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142660 91177308-0d34-0410-b5e6-96231b3b80d8
it's a bit more plausible to use this instead of CodePlacementOpt. The
code for this was shamelessly stolen from CodePlacementOpt, and then
trimmed down a bit. There doesn't seem to be much utility in returning
true/false from this pass as we may or may not have rewritten all of the
blocks. Also, the statistic of counting how many loops were aligned
doesn't seem terribly important so I removed it. If folks would like it
to be included, I'm happy to add it back.
This was probably the most egregious of the missing features, and now
I'm going to start gathering some performance numbers and looking at
specific loop structures that have different layout between the two.
Test is updated to include both basic loop alignment and nested loop
alignment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142645 91177308-0d34-0410-b5e6-96231b3b80d8
block frequency analyses. This differs substantially from the existing
block-placement pass in LLVM:
1) It operates on the Machine-IR in the CodeGen layer. This exposes much
more (and more precise) information and opportunities. Also, the
results are more stable due to fewer transforms ocurring after the
pass runs.
2) It uses the generalized probability and frequency analyses. These can
model static heuristics, code annotation derived heuristics as well
as eventual profile loading. By basing the optimization on the
analysis interface it can work from any (or a combination) of these
inputs.
3) It uses a more aggressive algorithm, both building chains from tho
bottom up to maximize benefit, and using an SCC-based walk to layout
chains of blocks in a profitable ordering without O(N^2) iterations
which the old pass involves.
The pass is currently gated behind a flag, and not enabled by default
because it still needs to grow some important features. Most notably, it
needs to support loop aligning and careful layout of loop structures
much as done by hand currently in CodePlacementOpt. Once it supports
these, and has sufficient testing and quality tuning, it should replace
both of these passes.
Thanks to Nick Lewycky and Richard Smith for help authoring & debugging
this, and to Jakob, Andy, Eric, Jim, and probably a few others I'm
forgetting for reviewing and answering all my questions. Writing
a backend pass is *sooo* much better now than it used to be. =D
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142641 91177308-0d34-0410-b5e6-96231b3b80d8
When checking the availability of instructions using the TLI, a 'promoted'
instruction IS available. It means that the value is bitcasted to another type
for which there is an operation. The correct check for the availablity of an
instruction is to check if it should be expanded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142542 91177308-0d34-0410-b5e6-96231b3b80d8
svn r139159 caused SelectionDAG::getConstant() to promote BUILD_VECTOR operands
with illegal types, even before type legalization. For this testcase, that led
to one BUILD_VECTOR with i16 operands and another with promoted i32 operands,
which triggered the assertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142370 91177308-0d34-0410-b5e6-96231b3b80d8
.file filenumber "directory" "filename"
This removes one join+split of the directory+filename in MC internals. Because
bitcode files have independent fields for directory and filenames in debug info,
this patch may change the .o files written by existing .bc files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142300 91177308-0d34-0410-b5e6-96231b3b80d8
Use the custom inserter for the ARM setjmp intrinsics. Instead of creating the
SjLj dispatch table in IR, where it frequently violates serveral assumptions --
in particular assumptions made by the landingpad instruction about what can
branch to a landing pad and what cannot. Performing this in the back-end allows
us to violate these assumptions without the IR getting angry at us.
It also allows us to perform a small optimization. We can shove the address of
the dispatch's basic block into the function context and not have to add code
around the setjmp to check for the return value and jump to the dispatch.
Neat, huh?
<rdar://problem/10116753>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142294 91177308-0d34-0410-b5e6-96231b3b80d8
to match its final use.
With this change, all of test-suite compiles for Thumb2 with -verify-coalescing
enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142287 91177308-0d34-0410-b5e6-96231b3b80d8
Some code want to check that *any* call within a function has the 'returns
twice' attribute, not just that the current function has one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142221 91177308-0d34-0410-b5e6-96231b3b80d8
This isn't put into the 'clear()' method because the information needs to stick
around (at least for a little bit) after the selection DAG is built.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142032 91177308-0d34-0410-b5e6-96231b3b80d8
When spilling around an instruction with a dead def, remember to add a
value number for the def.
The missing value number wouldn't normally create problems since there
would be an incoming live range as well. However, due to another bug
we could spill a dead V_SET0 instruction which doesn't read any values.
The missing value number caused an empty live range to be created which
is dangerous since it doesn't interfere with anything.
This fixes part of PR11125.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141923 91177308-0d34-0410-b5e6-96231b3b80d8
have the same address as the one we deleted, and we don't want that in the set
yet. Noticed by inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141849 91177308-0d34-0410-b5e6-96231b3b80d8
Now that MI->getRegClassConstraint() can also handle inline assembly,
don't bail when recomputing the register class of a virtual register
used by inline asm.
This fixes PR11078.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141836 91177308-0d34-0410-b5e6-96231b3b80d8
Most instructions have some requirements for their register operands.
Usually, this is expressed as register class constraints in the
MCInstrDesc, but for inline assembly the constraints are encoded in the
flag words.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141835 91177308-0d34-0410-b5e6-96231b3b80d8
The inline asm operand constraint is initially encoded in the virtual
register for the operand, but that register class may change during
coalescing, and the original constraint is lost.
Encode the original register class as part of the flag word for each
inline asm operand. This makes it possible to recover the actual
constraint required by inline asm, just like we can for normal
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141833 91177308-0d34-0410-b5e6-96231b3b80d8
our current machine instruction defines a register with the same register class
as what's being replaced. This showed up in the SPEC 403.gcc benchmark, where it
would ICE because a tail call was expecting one register class but was given
another. (The machine instruction verifier catches this situation.)
<rdar://problem/10270968>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141830 91177308-0d34-0410-b5e6-96231b3b80d8
rather than the previous index. If a block has a single instruction, the
previous index may be in a different basic block.
I have no clue how this used to work on all of test-suite, because now this
failure is seen quite often when trying to compile code with -strong-phi-elim.
This fixes PR10252.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141812 91177308-0d34-0410-b5e6-96231b3b80d8
containing loop's header to see if that's a landing pad. If it is, then we don't
want to hoist instructions out of the loop and above the header.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141767 91177308-0d34-0410-b5e6-96231b3b80d8
1. The speculation check may not have been performed if the BB hasn't had a load
LICM candidate.
2. If the candidate would be CSE'ed, then go ahead and speculatively LICM the
instruction even if it's in high register pressure situation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141747 91177308-0d34-0410-b5e6-96231b3b80d8
file. Since it should only be used when necessary propagate it through
the backend code generation and tweak testcases accordingly.
This helps with code like in clang's test/CodeGen/debug-info-line.c where
we have multiple #line directives within a single lexical block and want
to generate only a single block that contains each file change.
Part of rdar://10246360
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141729 91177308-0d34-0410-b5e6-96231b3b80d8
The blocks with invokes have branches to the dispatch block, because that more
correctly models the behavior of the CFG. The dispatch of course has edges to
the landing pads. Those landing pads could contain invokes, which then have
branches back to the dispatch. This creates a loop. The machine LICM pass looks
at this loop and thinks it can hoist elements out of it. But because the
dispatch is an alternate entry point into the program, the hoisted instructions
won't be executed.
I wasn't able to get a testcase which was small and could reproduce all of the
time. The function_try_block.cpp in llvm-test was where this showed up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141726 91177308-0d34-0410-b5e6-96231b3b80d8
For example, MachineLICM should not hoist a load that is not guaranteed to be executed.
Radar 10254254.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141689 91177308-0d34-0410-b5e6-96231b3b80d8
Allow targets to expand COPY and other standard pseudo-instructions
before they are expanded with copyPhysReg().
This allows the target to examine the COPY instruction for extra
operands indicating it can be widened to a preferable super-register
copy. See the ARM -widen-vmovs option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141578 91177308-0d34-0410-b5e6-96231b3b80d8
For example, MachineLICM should not hoist a load that is not guaranteed to be executed.
Radar 10254254.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141569 91177308-0d34-0410-b5e6-96231b3b80d8
across unwind edges. This is for the back-end which expects such things.
The code is from the original SjLj EH pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141463 91177308-0d34-0410-b5e6-96231b3b80d8
to the landing pad. This will be used by the back-end to generate the jump
tables for dispatching the arriving longjmp in sjlj eh.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141224 91177308-0d34-0410-b5e6-96231b3b80d8
PhysReg operands are not allowed to have sub-register indices at all.
For virtual registers with sub-reg indices, check that all registers in
the register class support the sub-reg index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141220 91177308-0d34-0410-b5e6-96231b3b80d8
EXTRACT_SUBREG is emitted as %dst = COPY %src:sub, so there is no need to
constrain the %dst register class. RegisterCoalescer will apply the
necessary constraints if it decides to eliminate the COPY.
The %src register class does need to be constrained to something with
the right sub-registers, though. This is currently done manually with
COPY_TO_REGCLASS nodes. They can possibly be removed after this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141207 91177308-0d34-0410-b5e6-96231b3b80d8
The register class created by INSERT_SUBREG and SUBREG_TO_REG must be
legal and support the SubIdx sub-registers.
The new getSubClassWithSubReg() hook can compute that.
This may create INSERT_SUBREG instructions defining a larger register
class than the sub-register being inserted. That is OK,
RegisterCoalescer will constrain the register class as needed when it
eliminates the INSERT_SUBREG instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141198 91177308-0d34-0410-b5e6-96231b3b80d8
TwoAddressInstructionPass should annotate instructions with <undef>
flags when it lower REG_SEQUENCE instructions. LiveIntervals should not
be in the business of modifying code (except for kill flags, perhaps).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141187 91177308-0d34-0410-b5e6-96231b3b80d8
For example:
%vreg10:dsub_0<def,undef> = COPY %vreg1
%vreg10:dsub_1<def> = COPY %vreg2
is rewritten as:
%D2<def> = COPY %D0, %Q1<imp-def>
%D3<def> = COPY %D1, %Q1<imp-use,kill>, %Q1<imp-def>
The first COPY doesn't care about the previous value of %Q1, so it
doesn't read that register.
The second COPY is a partial redefinition of %Q1, so it implicitly kills
and redefines that register.
This makes it possible to recognize instructions that can harmlessly
clobber the full super-register. The write and don't read the
super-register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141139 91177308-0d34-0410-b5e6-96231b3b80d8
RegisterCoalescer can create sub-register defs when it is joining a
register with a sub-register. Add <undef> flags to these new
sub-register defs where appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141138 91177308-0d34-0410-b5e6-96231b3b80d8
The <undef> flag says that a MachineOperand doesn't read its register,
or doesn't depend on the previous value of its register.
A full register def never depends on the previous register value. A
partial register def may depend on the previous value if it is intended
to update part of a register.
For example:
%vreg10:dsub_0<def,undef> = COPY %vreg1
%vreg10:dsub_1<def> = COPY %vreg2
The first copy instruction defines the full %vreg10 register with the
bits not covered by dsub_0 defined as <undef>. It is not considered a
read of %vreg10.
The second copy modifies part of %vreg10 while preserving the rest. It
has an implicit read of %vreg10.
This patch adds a MachineOperand::readsReg() method to determine if an
operand reads its register.
Previously, this was modelled by adding a full-register <imp-def>
operand to the instruction. This approach makes it possible to
determine directly from a MachineOperand if it reads its register. No
scanning of MI operands is required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141124 91177308-0d34-0410-b5e6-96231b3b80d8
and the alignment is 0 (i.e., it's defined globally in one file and declared in
another file) it could get an alignment which is larger than the ABI allows for
that type, resulting in aligned moves being used for unaligned loads.
For instance, in file A.c:
struct S s;
In file B.c:
struct {
// something long
};
extern S s;
void foo() {
struct S p = s;
// ...
}
this copy is a 'memcpy' which is turned into a series of 'movaps' instructions
on X86. But this is wrong, because 'struct S' has alignment of 4, not 16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140902 91177308-0d34-0410-b5e6-96231b3b80d8
This helps with porting code from 2.9 to 3.0 as TargetSelect.h changed location,
and if you include the old one by accident you will trigger this assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140848 91177308-0d34-0410-b5e6-96231b3b80d8
The function needs to scan the implicit operands anyway, so no
performance is won by caching the number of implicit operands added to
an instruction.
This also fixes a bug when adding operands after an implicit operand has
been added manually. The NumImplicitOps count wasn't kept up to date.
MachineInstr::addOperand() will now consistently place all explicit
operands before all the implicit operands, regardless of the order they
are added. It is possible to change an MI opcode and add additional
explicit operands. They will be inserted before any existing implicit
operands.
The only exception is inline asm instructions where operands are never
reordered. This is because of a hack that marks explicit clobber regs
on inline asm as <implicit-def> to please the fast register allocator.
This hack can go away when InstrEmitter and FastIsel can add exact
<dead> flags to physreg defs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140744 91177308-0d34-0410-b5e6-96231b3b80d8
Upon further review, most of the EH code should remain written at the IR
level. The part which breaks SSA form is the dispatch table, so that part will
be moved to the back-end.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140730 91177308-0d34-0410-b5e6-96231b3b80d8
This intrinsic is used to pass the index of the function context to the back-end
for further processing. The back-end is in charge of filling in the rest of the
entries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140676 91177308-0d34-0410-b5e6-96231b3b80d8
The DWARF exception pass uses the call site information, which is set up here. A
pre-RA pass is too late for it to use this information. So create and setup the
function context here, and then insert the call site values here (and map the
call sites for the DWARF EH pass). This is simpler than the original pass, and
doesn't make the CFG lose its SSA-ness.
It's a win-win-win-win-lose-win-win situation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140675 91177308-0d34-0410-b5e6-96231b3b80d8
current IR-level pass.
The old SjLj EH pass has some problems, especially with the new EH model. Most
significantly, it violates some of the new restrictions the new model has. For
instance, the 'dispatch' table wants to jump to the landing pad, but we cannot
allow that because only an invoke's unwind edge can jump to a landing pad. This
requires us to mangle the code something awful. In addition, we need to keep the
now dead landingpad instructions around instead of CSE'ing them because the
DWARF emitter uses that information (they are dead because no control flow edge
will execute them - the control flow edge from an invoke's unwind is superceded
by the edge coming from the dispatch).
Basically, this pass belongs not at the IR level where SSA is king, but at the
code-gen level, where we have more flexibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140646 91177308-0d34-0410-b5e6-96231b3b80d8
Many targets use pseudo instructions to help register allocation. Like
the COPY instruction, these pseudos can be expanded after register
allocation. The early expansion can make life easier for PEI and the
post-ra scheduler.
This patch adds a hook that is called for all remaining pseudo
instructions from the ExpandPostRAPseudos pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140472 91177308-0d34-0410-b5e6-96231b3b80d8
SDNodes may return values which are wider than the incoming element types. In
this patch we fix the integer promotion of these nodes.
Fixes spill-q.ll when running -promote-elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140471 91177308-0d34-0410-b5e6-96231b3b80d8
I'll fix the file contents in the next commit.
This pass is currently expanding the COPY and SUBREG_TO_REG pseudos. I
am going to add a hook so targets can expand more pseudo-instructions
after register allocation.
Many targets have pseudo-instructions that assist the register
allocator. They can be expanded after register allocation, before PEI
and PostRA scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140469 91177308-0d34-0410-b5e6-96231b3b80d8
(this is always the case for scalars), otherwise use the promoted result type.
Fix test/CodeGen/X86/vsplit-and.ll when promote-elements is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140464 91177308-0d34-0410-b5e6-96231b3b80d8
When generating the trunc-store of i1's, we need to use the vector type and not
the scalar type.
This patch fixes the assertion in CodeGen/Generic/bool-vector.ll when
running with -promote-elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140463 91177308-0d34-0410-b5e6-96231b3b80d8
DecomposeMERGE_VALUES to "know" that results are legalized in
a particular order, by passing it the number of the result
being legalized (the type legalization core provides this, it
just needs to be passed on).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140373 91177308-0d34-0410-b5e6-96231b3b80d8
integer-promotion of CONCAT_VECTORS.
Test: test/CodeGen/X86/widen_shuffle-1.ll
This patch fixes the above tests (when running in with -promote-elements).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140372 91177308-0d34-0410-b5e6-96231b3b80d8
Sometimes register class constraints are trivial, like GR32->GR32_NOSP,
or GPR->rGPR. Teach InstrEmitter to simply constrain the virtual
register instead of emitting a copy in these cases.
Normally, these copies are handled by the coalescer. This saves some
coalescer work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140340 91177308-0d34-0410-b5e6-96231b3b80d8
The function will refuse to use a register class with fewer registers
than MinNumRegs. This can be used by clients to avoid accidentally
increase register pressure too much.
The default value of MinNumRegs=0 doesn't affect how constrainRegClass()
works.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140339 91177308-0d34-0410-b5e6-96231b3b80d8
Few weeks ago, llvm completely inverted the debug info graph. Earlier each debug info node used to keep track of its compile unit, now compile unit keeps track of important nodes. One impact of this change is that the global variable's do not have any context, which should be checked before deciding to use AT_specification DIE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140282 91177308-0d34-0410-b5e6-96231b3b80d8
Vector SetCC result types need to be type-legalized.
This code worked before because scalar result types are known to be legal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140249 91177308-0d34-0410-b5e6-96231b3b80d8
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140228 91177308-0d34-0410-b5e6-96231b3b80d8
No functionality change. The hook makes it explicit which patterns
require "special" handling. i.e. it self-documents tblgen
deficiencies. I plan to add verification in ExpandISelPseudos and
Thumb2SizeReduce to catch any missing hasPostISelHooks. Otherwise it's
too fragile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140160 91177308-0d34-0410-b5e6-96231b3b80d8
Modified ARMISelLowering::AdjustInstrPostInstrSelection to handle the
full gamut of CPSR defs/uses including instructins whose "optional"
cc_out operand is not really optional. This allowed removal of the
hasPostISelHook to simplify the .td files and make the implementation
more robust.
Fixes rdar://10137436: sqlite3 miscompile
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140134 91177308-0d34-0410-b5e6-96231b3b80d8
The leaveIntvAfter() function normally inserts a back-copy after the
requested instruction, making the back-copy kill the live range.
In spill mode, try to insert the back-copy before the last use instead.
That means the last use becomes the kill instead of the back-copy. This
lowers the register pressure because the last use can now redefine the
same register it was reading.
This will also improve compile time: The back-copy isn't a kill, so
hoisting it in hoistCopiesForSize() won't force a recomputation of the
source live range. Similarly, if the back-copy isn't hoisted by the
splitter, the spiller will not attempt hoisting it locally.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139883 91177308-0d34-0410-b5e6-96231b3b80d8
If the source register is live after the copy being spilled, there is no
point to hoisting it. Hoisting inside a basic block only serves to
resolve interferences by shortening the live range of the source.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139882 91177308-0d34-0410-b5e6-96231b3b80d8
When -split-spill-mode is enabled, spill hoisting is performed by
SplitKit instead of by InlineSpiller. This hidden command line option
is for testing the splitter spill mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139845 91177308-0d34-0410-b5e6-96231b3b80d8
Adjust counters when removing spill and reload instructions.
We still don't account for reloads being removed by eliminateDeadDefs().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139806 91177308-0d34-0410-b5e6-96231b3b80d8
When traceSiblingValue() encounters a PHI-def value created by live
range splitting, don't look at all the predecessor blocks. That can be
very expensive in a complicated CFG.
Instead, consider that all the non-PHI defs jointly dominate all the
PHI-defs. Tracing directly to all the non-PHI defs is much faster that
zipping around in the CFG when there are many PHIs with many
predecessors.
This significantly improves compile time for indirectbr interpreters.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139797 91177308-0d34-0410-b5e6-96231b3b80d8
Blocks with multiple PHI successors only need to go on the worklist
once. Use a SmallPtrSet to track the live-out blocks that have already
been handled. This is a lot faster than the two live range check we
would otherwise do.
Also stop recomputing hasPHIKill flags. Like RenumberValues(), it is
conservatively correct to leave them in, and they are not used for
anything important.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139792 91177308-0d34-0410-b5e6-96231b3b80d8
It does, after all.
RemoveCopyByCommutingDef rewrites the uses of one particular value
number in A. It doesn't know how to rewrite phi uses, so there can't be
any.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139787 91177308-0d34-0410-b5e6-96231b3b80d8
There is only one legitimate use remaining, in addIntervalsForSpills().
All other calls to hasPHIKill() are only used to update PHIKill flags.
The addIntervalsForSpills() function is part of the old spilling
framework, only used by linearscan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139783 91177308-0d34-0410-b5e6-96231b3b80d8
Instead, let HasOtherReachingDefs() test for defs in B that overlap any
phi-defs in A as well. This test is slightly different, but almost
identical.
A perfectly precise test would only check those phi-defs in A that are
reachable from AValNo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139782 91177308-0d34-0410-b5e6-96231b3b80d8
The source live range is recomputed using shrinkToUses() which does
handle phis correctly. The hasPHIKill() condition was relevant in the
old days when ReMaterializeTrivialDef() tried to recompute the live
range itself.
The shrinkToUses() function will mark the original def as dead when no
more uses and phi kills remain. It is then removed by
runOnMachineFunction().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139781 91177308-0d34-0410-b5e6-96231b3b80d8
It is conservatively correct to keep the hasPHIKill flags, even after
deleting PHI-defs.
The calculation can be very expensive after taildup has created a
quadratic number of indirectbr edges in the CFG, and the hasPHIKill flag
isn't used for anything after RenumberValues().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139780 91177308-0d34-0410-b5e6-96231b3b80d8
An improper SlotIndex->VNInfo lookup was leading to unsafe copy removal.
Fixes PR10920 401.bzip2 miscompile with no IV rewrite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139765 91177308-0d34-0410-b5e6-96231b3b80d8
THe LRE_DidCloneVirtReg callback may be called with vitual registers
that RAGreedy doesn't even know about yet. In that case, there are no
data structures to update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139702 91177308-0d34-0410-b5e6-96231b3b80d8
When a back-copy is hoisted to the nearest common dominator, keep
looking up the dominator tree for a less loopy dominator, and place the
back-copy there instead.
Don't do this when a single existing back-copy dominates all the others.
Assume the client knows what he is doing, and keep the dominating
back-copy.
This prevents us from hoisting back-copies into loops in most cases. If
a value is defined in a loop with multiple exits, we may still hoist
back-copies into that loop. That is the speed/size tradeoff.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139698 91177308-0d34-0410-b5e6-96231b3b80d8
When a ParentVNI maps to multiple defs in a new interval, its live range
may still be derived directly from RegAssign by transferValues().
On the other hand, when instructions have been rematerialized or
hoisted, it may be necessary to completely recompute live ranges using
LiveRangeCalc::extend() to all uses.
Use a bit in the value map to indicate that a live range must be
recomputed. Rename markComplexMapped() to forceRecompute().
This fixes some live range verification errors when
-split-spill-mode=size hoists back-copies by recomputing source ranges
when RegAssign kills can't be moved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139660 91177308-0d34-0410-b5e6-96231b3b80d8
Whenever the complement interval is defined by multiple copies of the
same value, hoist those back-copies to the nearest common dominator.
This ensures that at most one copy is inserted per value in the
complement inteval, and no phi-defs are needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139651 91177308-0d34-0410-b5e6-96231b3b80d8