A partial redefine needs to be treated like a tied operand, and the register
must be reloaded while processing use operands.
This fixes a bug where partially redefined registers were processed as normal
defs with a reload added. The reload could clobber another use operand if it was
a kill that allowed register reuse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107193 91177308-0d34-0410-b5e6-96231b3b80d8
The LowerSubregs pass needs to preserve implicit def operands attached to
EXTRACT_SUBREG instructions when it replaces those instructions with copies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107189 91177308-0d34-0410-b5e6-96231b3b80d8
of getPhysicalRegisterRegClass with it.
If we want to make a copy (or estimate its cost), it is better to use the
smallest class as more efficient operations might be possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107140 91177308-0d34-0410-b5e6-96231b3b80d8
There are 2 changes relative to the previous version of the patch:
1) For the "simple" if-conversion case, there's no need to worry about
RemoveExtraEdges not handling an unanalyzable branch. Predicated terminators
are ignored in this context, so RemoveExtraEdges does the right thing.
This might break someday if we ever treat indirect branches (BRIND) as
predicable, but for now, I just removed this part of the patch, because
in the case where we do not add an unconditional branch, we rely on keeping
the fall-through edge to CvtBBI (which is empty after this transformation).
The change relative to the previous patch is:
@@ -1036,10 +1036,6 @@
IterIfcvt = false;
}
- // RemoveExtraEdges won't work if the block has an unanalyzable branch,
- // which is typically the case for IfConvertSimple, so explicitly remove
- // CvtBBI as a successor.
- BBI.BB->removeSuccessor(CvtBBI->BB);
RemoveExtraEdges(BBI);
// Update block info. BB can be iteratively if-converted.
2) My patch exposed a bug in the code for merging the tail of a "diamond",
which had previously never been exercised. The code was simply checking that
the tail had a single predecessor, but there was a case in
MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was
neither edge of the diamond. I added the following change to check for
that:
@@ -1276,7 +1276,18 @@
// tail, add a unconditional branch to it.
if (TailBB) {
BBInfo TailBBI = BBAnalysis[TailBB->getNumber()];
- if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) {
+ bool CanMergeTail = !TailBBI.HasFallThrough;
+ // There may still be a fall-through edge from BBI1 or BBI2 to TailBB;
+ // check if there are any other predecessors besides those.
+ unsigned NumPreds = TailBB->pred_size();
+ if (NumPreds > 1)
+ CanMergeTail = false;
+ else if (NumPreds == 1 && CanMergeTail) {
+ MachineBasicBlock::pred_iterator PI = TailBB->pred_begin();
+ if (*PI != BBI1->BB && *PI != BBI2->BB)
+ CanMergeTail = false;
+ }
+ if (CanMergeTail) {
MergeBlocks(BBI, TailBBI);
TailBBI.IsDone = true;
} else {
With these fixes, I was able to run all the SingleSource and MultiSource
tests successfully.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107110 91177308-0d34-0410-b5e6-96231b3b80d8
have to be registers, per gcc documentation. This affects
the logic for determining what "g" should lower to. PR 7393.
A couple of existing testcases are affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107079 91177308-0d34-0410-b5e6-96231b3b80d8
When an instruction has tied operands and physreg defines, we must take extra
care that the tied operands conflict with neither physreg defs nor uses.
The special treatment is given to inline asm and instructions with tied operands
/ early clobbers and physreg defines.
This fixes PR7509.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107043 91177308-0d34-0410-b5e6-96231b3b80d8
regressions.
--- Reverse-merging r106939 into '.':
U test/CodeGen/Thumb2/thumb2-ifcvt3.ll
U lib/CodeGen/IfConversion.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106951 91177308-0d34-0410-b5e6-96231b3b80d8
if-conversion. The RemoveExtraEdges function doesn't work for blocks that
end with unanalyzable branches, so in those cases, the "extra" edges must
be explicitly removed. The CopyAndPredicateBlock and MergeBlocks methods
can also avoid copying successor edges due to branches that have already
been removed. The latter case is especially helpful when MergeBlocks is
called for handling "diamond" if-conversions, where otherwise you can end
up with some weird intermediate states in the CFG. Unfortunately I've
been unable to find cases where this cleanup actually makes a significant
difference in the code. There is one test where we manage to remove an
empty block at the end of a function. Radar 6911268.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106939 91177308-0d34-0410-b5e6-96231b3b80d8
CopyFromReg nodes for aliasing registers (AX and AL). This confuses the fast
register allocator.
Instead of CopyFromReg(AL), use ExtractSubReg(CopyFromReg(AX), sub_8bit).
This fixes PR7312.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106934 91177308-0d34-0410-b5e6-96231b3b80d8
introduced in r106343, but only showed up recently (with a particular compiler &
linker combination) because of the particular check, and because we have no
builtin checking for dereferencing the end of an array, which is truly
unfortunate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106908 91177308-0d34-0410-b5e6-96231b3b80d8
for an "i" constraint should get lowered; PR 6309. While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106893 91177308-0d34-0410-b5e6-96231b3b80d8
original SDNode. This is badness. Also, this function allows one SDNode to point
multiple flags to another SDNode. Badness as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106793 91177308-0d34-0410-b5e6-96231b3b80d8
address requires a register or secondary load to compute
(most PIC modes). This improves "g" constraint handling. 8015842.
The test from 2007 is attempting to test the fix for PR1761,
but since -relocation-model=static doesn't work on Darwin
x86-64, it was not testing what it was supposed to be testing
and was passing erroneously. Fixed to use Linux x86-64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106779 91177308-0d34-0410-b5e6-96231b3b80d8
CoalescerPair can determine if a copy can be coalesced, and which register gets
merged away. The old logic in SimpleRegisterCoalescing had evolved into
something a bit too convoluted.
This second attempt fixes some crashes that only occurred Linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106769 91177308-0d34-0410-b5e6-96231b3b80d8
when the condition is constant. This optimization shouldn't be
necessary, because codegen shouldn't be able to find dead control
paths that the IR-level optimizer can't find. And it's undesirable,
because it encourages bugpoint to leave "br i1 false" branches
in its output. And it wasn't updating the CFG.
I updated all the tests I could, but some tests are too reduced
and I wasn't able to meaningfully preserve them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106748 91177308-0d34-0410-b5e6-96231b3b80d8
CoalescerPair can determine if a copy can be coalesced, and which register gets
merged away. The old logic in SimpleRegisterCoalescing had evolved into
something a bit too convoluted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106701 91177308-0d34-0410-b5e6-96231b3b80d8
void t(int *cp0, int *cp1, int *dp, int fmd) {
int c0, c1, d0, d1, d2, d3;
c0 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000);
c1 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000);
/* ... */
}
It code gens into something pretty bad. But with this change (analogous to the
X86 back-end), it will use ldm and generate few instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106693 91177308-0d34-0410-b5e6-96231b3b80d8
into the same node, but with different non-memory operands, we need to replace
the memory operands after it's finished morphing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106643 91177308-0d34-0410-b5e6-96231b3b80d8
Measurements show that it does not speed up coalescing, so there is no reason
the keep the added complexity around.
Also clean out some unused methods and static functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106548 91177308-0d34-0410-b5e6-96231b3b80d8
opportunities. For example, this lets it emit this:
movq (%rax), %rcx
addq %rdx, %rcx
instead of this:
movq %rdx, %rcx
addq (%rax), %rcx
in the case where %rdx has subsequent uses. It's the same number
of instructions, and usually the same encoding size on x86, but
it appears faster, and in general, it may allow better scheduling
for the load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106493 91177308-0d34-0410-b5e6-96231b3b80d8
This allows the fast regiser allocator to remove redundant
register moves.
Update a set of tests that depend on the register allocator
to be linear scan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106420 91177308-0d34-0410-b5e6-96231b3b80d8
use sharing map. The reconcileNewOffset logic already forces a
separate use if the kinds differ, so incorporating the kind in the
key means we can track more sharing opportunities.
More sharing means fewer total uses to track, which means smaller
problem sizes, which means the conservative throttles don't kick
in as often.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106396 91177308-0d34-0410-b5e6-96231b3b80d8
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
scheduler. If-converter now runs branch folding / tail merging first to
maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
change the instruction ordering in the IT block (since IT mask has been
finalized). It also ensures no other instructions can be scheduled between
instructions in the IT block.
This is not yet enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106344 91177308-0d34-0410-b5e6-96231b3b80d8
instructions, but it doesn't really understand live ranges, so the first
INSERT_SUBREG uses an implicitly defined register.
Fix it in LiveVariableAnalysis by adding the <undef> flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106333 91177308-0d34-0410-b5e6-96231b3b80d8
basic tests.
This has been well tested on Darwin but not elsewhere.
It should work provided the linker correctly resolves
B.W <label in other function>
which it has not seen before, at least from llvm-based
compilers. I'm leaving the arm-tail-calls switch in
until I see if there's any problems because of that;
it might need to be disabled for some environments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106299 91177308-0d34-0410-b5e6-96231b3b80d8
does for {flags}. If we create virtual registers of the CCR class, RegAllocFast
may try to spill them, and we can't do that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106289 91177308-0d34-0410-b5e6-96231b3b80d8
LiveVariableAnalysis was a bit picky about a register only being redefined once,
but that really isn't necessary.
Here is an example of chained INSERT_SUBREGs that we can handle now:
68 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1028<kill>, 14
register: %reg1040 +[70,134:0)
76 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1029<kill>, 13
register: %reg1040 replace range with [70,78:1) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,134:0) 0@78-(134) 1@70-(78)
84 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1030<kill>, 12
register: %reg1040 replace range with [78,86:2) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,134:0) 0@86-(134) 1@70-(78) 2@78-(86)
92 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1031<kill>, 11
register: %reg1040 replace range with [86,94:3) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,94:3)[94,134:0) 0@94-(134) 1@70-(78) 2@78-(86) 3@86-(94)
rdar://problem/8096390
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106152 91177308-0d34-0410-b5e6-96231b3b80d8
will conflict with another live range. The place which creates this scenerio is
the code in X86 that lowers a select instruction by splitting the MBBs. This
eliminates the need to check from the bottom up in an MBB for live pregs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106066 91177308-0d34-0410-b5e6-96231b3b80d8
Early clobbers defining a virtual register were first alocated to a physreg and
then processed as a physreg EC, spilling the virtreg.
This fixes PR7382.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105998 91177308-0d34-0410-b5e6-96231b3b80d8
Given a copy instruction, CoalescerPair can determine which registers to
coalesce in order to eliminate the copy. It deals with all the subreg fun to
determine a tuple (DstReg, SrcReg, SubIdx) such that:
- SrcReg is a virtual register that will disappear after coalescing.
- DstReg is a virtual or physical register whose live range will be extended.
- SubIdx is 0 when DstReg is a physical register.
- SrcReg can be joined with DstReg:SubIdx.
CoalescerPair::isCoalescable() determines if another copy instruction is
compatible with the same tuple. This fixes some NEON miscompilations where
shuffles are getting coalesced as if they were copies.
The CoalescerPair class will replace a lot of the spaghetti logic in JoinCopy
later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105997 91177308-0d34-0410-b5e6-96231b3b80d8
replacing the overly conservative checks that I had introduced recently to
deal with correctness issues. This makes a pretty noticable difference
in our testcases where reg_sequences are used. I've updated one test to
check that we no longer emit the unnecessary subreg moves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105991 91177308-0d34-0410-b5e6-96231b3b80d8
symbols as declarations in the X86 backend. This would manifest
on darwin x86-32 as errors like this with -fvisibility=hidden:
symbol '__ZNSbIcED1Ev' can not be undefined in a subtraction expression
This fixes PR7353.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105954 91177308-0d34-0410-b5e6-96231b3b80d8
i64 and f64 types, but now it also handle Neon vector types, so the f64 result
of VMOVDRR may need to be converted to a Neon type. Radar 8084742.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105845 91177308-0d34-0410-b5e6-96231b3b80d8
This is a bit of a hack to make inline asm look more like call instructions.
It would be better to produce correct dead flags during isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105749 91177308-0d34-0410-b5e6-96231b3b80d8
there could be multiple subexpressions within a single expansion which
require insert point adjustment. This fixes PR7306.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105510 91177308-0d34-0410-b5e6-96231b3b80d8
replace an OpA with a widened OpB, it is possible to get new uses of OpA due to CSE
when recursively updating nodes. Since OpA has been processed, the new uses are
not examined again. The patch checks if this occurred and it it did, updates the
new uses of OpA to use OpB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105453 91177308-0d34-0410-b5e6-96231b3b80d8
registers it defines then interfere with an existing preg live range.
For instance, if we had something like these machine instructions:
BB#0
... = imul ... EFLAGS<imp-def,dead>
test ..., EFLAGS<imp-def>
jcc BB#2 EFLAGS<imp-use>
BB#1
... ; fallthrough to BB#2
BB#2
... ; No code that defines EFLAGS
jcc ... EFLAGS<imp-use>
Machine sink will come along, see that imul implicitly defines EFLAGS, but
because it's "dead", it assumes that it can move imul into BB#2. But when it
does, imul's "dead" imp-def of EFLAGS is raised from the dead (a zombie) and
messes up the condition code for the jump (and pretty much anything else which
relies upon it being correct).
The solution is to know which pregs are live going into a basic block. However,
that information isn't calculated at this point. Nor does the LiveVariables pass
take into account non-allocatable physical registers. In lieu of this, we do a
*very* conservative pass through the basic block to determine if a preg is live
coming out of it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105387 91177308-0d34-0410-b5e6-96231b3b80d8
expansion is the same as that used by LegalizeDAG.
The resulting code sucks in terms of performance/codesize on x86-32 for a
64-bit operation; I haven't looked into whether different expansions might be
better in general.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105378 91177308-0d34-0410-b5e6-96231b3b80d8
that are too large. This causes the freebsd bootloader to be too
large apparently.
It's unclear if this should be an -Os or -Oz thing. Thoughts welcome.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105228 91177308-0d34-0410-b5e6-96231b3b80d8
optimization level.
This only really affects llc for now because both the llvm-gcc and clang front
ends override the default register allocator. I intend to remove that code later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104904 91177308-0d34-0410-b5e6-96231b3b80d8
Mon Ping provided; unfortunately bugpoint failed to
reduce it, but I think it's important to have a test for
this in the suite. 8023512.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104624 91177308-0d34-0410-b5e6-96231b3b80d8
Fix it by changing the T2I_rbin_s_is multiclass to handle the CPSR
output and 'S' suffix in the same way as T2I_bin_s_irs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104531 91177308-0d34-0410-b5e6-96231b3b80d8
copying VFP subregs. This exposed a bunch of dead code in the *spill-q.ll
tests, so I tweaked those tests to keep that code from being optimized away.
Radar 7872877.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104415 91177308-0d34-0410-b5e6-96231b3b80d8
so that it will continue to test what it was meant to test when I commit a
separate change for better support of BUILD_VECTOR and VECTOR_SHUFFLE for Neon.
Fix a DAG combiner crash exposed by this test change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104380 91177308-0d34-0410-b5e6-96231b3b80d8
pass after isel instead of being interlaced with it, we can
trust that all the code for a function has been isel'd before
it is run.
The practical impact of this is that we can scan for machine
instr phis instead of doing a fuzzy match on the LLVM BB for
phi nodes. Doing the fuzzy match required knowing when isel
would produce an fp reg stack phi which was gross. It was
also wrong in cases where select got lowered to a branch
tree because cmovs aren't available (PR6828).
Just do the scan on machine phis which is simpler, faster
and more correct. This fixes PR6828.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104333 91177308-0d34-0410-b5e6-96231b3b80d8
definitions of the virtual register.
This happens when spilling the registers produced by REG_SEQUENCE:
%reg1047:5<def>, %reg1047:6<def>, %reg1047:7<def> = VLD3d8 %reg1033, 0, pred:14, pred:%reg0
The rewriter would spill the register multiple times, dead store elimination
tried to keep up, but ended up cutting the branch it was sitting on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104321 91177308-0d34-0410-b5e6-96231b3b80d8
operand on the left, the interesting operand is on the right. This
fixes a bug where LSR was failing to recognize ICmpZero uses,
which led it to be unable to reverse the induction variable in the
attached testcase.
Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test
is extremely fragile and hard to meaningfully update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104262 91177308-0d34-0410-b5e6-96231b3b80d8
the addressing modes don't make this trivially easy. This allows
it to avoid falling into the less precise heuristics in more
cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104186 91177308-0d34-0410-b5e6-96231b3b80d8
lowering REG_SEQUENCE instructions.
Insert copies for REG_SEQUENCE sources not killed to avoid breaking later passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104146 91177308-0d34-0410-b5e6-96231b3b80d8
The trouble arises when the result of a vector cmp + sext is then and'ed with all ones. Instcombine will turn it into a vector cmp + zext, dag combiner will miss turning it into a vsetcc and hell breaks loose after that.
Teach dag combine to turn a vector cpm + zest into a vsetcc + and 1. This fixes rdar://7923010.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104094 91177308-0d34-0410-b5e6-96231b3b80d8