The register allocator expects minimal physreg live ranges. Schedule
physreg copies accordingly. This is slightly tricky when they occur in
the middle of the scheduling region. For now, this is handled by
rescheduling the copy when its associated instruction is
scheduled. Eventually we may instead bundle them, but only if we can
preserve the bundles as parallel copies during regalloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179449 91177308-0d34-0410-b5e6-96231b3b80d8
When debugging performance regressions we often ask ourselves if the regression
that we see is due to poor isel/sched/ra or due to some micro-architetural
problem. When comparing two code sequences one good way to rule out front-end
bottlenecks (and other the issues) is to force code alignment. This pass adds
a flag that forces the alignment of all of the basic blocks in the program.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179353 91177308-0d34-0410-b5e6-96231b3b80d8
In the simple and triangle if-conversion cases, when CopyAndPredicateBlock is
used because the to-be-predicated block has other predecessors, we need to
explicitly remove the old copied block from the successors list. Normally if
conversion relies on TII->AnalyzeBranch combined with BB->CorrectExtraCFGEdges
to cleanup the successors list, but if the predicated block contained an
un-analyzable branch (such as a now-predicated return), then this will fail.
These extra successors were causing a problem on PPC because it was causing
later passes (such as PPCEarlyReturm) to leave dead return-only basic blocks in
the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179227 91177308-0d34-0410-b5e6-96231b3b80d8
The target hooks are getting out of hand. What does it mean to run
before or after regalloc anyway? Allowing either Pass* or AnalysisID
pass identification should make it much easier for targets to use the
substitutePass and insertPass APIs, and create less need for badly
named target hooks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179140 91177308-0d34-0410-b5e6-96231b3b80d8
therefore not at all) of the pc or statement list. We also don't
need to emit the compilation dir so save so space and time
and don't bother.
Fix up the testcase accordingly and verify that we don't emit
the attributes or the items that they use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179114 91177308-0d34-0410-b5e6-96231b3b80d8
This pattern occurs in SROA output due to the way vector arguments are lowered
on ARM.
The testcase from PR15525 now compiles into this, which is better than the code
we got with the old scalarrepl:
_Store:
ldr.w r9, [sp]
vmov d17, r3, r9
vmov d16, r1, r2
vst1.8 {d16, d17}, [r0]
bx lr
Differential Revision: http://llvm-reviews.chandlerc.com/D647
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179106 91177308-0d34-0410-b5e6-96231b3b80d8
a relocation across sections. Do this for DW_AT_stmt list in the
skeleton CU and check the relocations in the debug_info section.
Add a FIXME for multiple CUs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178969 91177308-0d34-0410-b5e6-96231b3b80d8
We used to do "SmallString += CUID", which is incorrect, since CUID will
be truncated to a char.
rdar://problem/13573833
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178941 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes PEI as previously described, but correctly handles the case where
the instruction defining the virtual register to be scavenged is the first in
the block. Arnold provided me with a bugpoint-reduced test case, but even that
seems too large to use as a regression test. If I'm successful in cleaning it
up then I'll commit that as well.
Original commit message:
This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spills slots the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.
Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178919 91177308-0d34-0410-b5e6-96231b3b80d8
During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178917 91177308-0d34-0410-b5e6-96231b3b80d8
Reverting because this breaks one of the LTO builders. Original commit message:
This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spills slots the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.
Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178916 91177308-0d34-0410-b5e6-96231b3b80d8
This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spills slots the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.
Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178845 91177308-0d34-0410-b5e6-96231b3b80d8
For now, just save the compile time since the ConvergingScheduler
heuristics don't use this analysis. We'll probably enable it later
after compile-time investigation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178822 91177308-0d34-0410-b5e6-96231b3b80d8
On certain architectures we can support efficient vectorized version of
instructions if the operand value is uniform (splat) or a constant scalar.
An example of this is a vector shift on x86.
We can efficiently support
for (i = 0 ; i < ; i += 4)
w[0:3] = v[0:3] << <2, 2, 2, 2>
but not
for (i = 0; i < ; i += 4)
w[0:3] = v[0:3] << x[0:3]
This patch adds a parameter to getArithmeticInstrCost to further qualify operand
values as uniform or uniform constant.
Targets can then choose to return a different cost for instructions with such
operand values.
A follow-up commit will test this feature on x86.
radar://13576547
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178807 91177308-0d34-0410-b5e6-96231b3b80d8
There is a difference for FORM_ref_addr between DWARF 2 and DWARF 3+.
Since Eric is against guarding DWARF 2 ref_addr with DarwinGDBCompat, we are
still in discussion on how to handle this.
The correct solution is to update our header to say version 4 instead of version
2 and update tool chains as well.
rdar://problem/13559431
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178806 91177308-0d34-0410-b5e6-96231b3b80d8
the target system.
It was hard-coded to 4 bytes before. I can't get llvm to generate a
ref_addr on a reasonably sized testing case.
rdar://problem/13559431
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178722 91177308-0d34-0410-b5e6-96231b3b80d8
For this we need to use a libcall. Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner. A test case from the bug report is included.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178639 91177308-0d34-0410-b5e6-96231b3b80d8
The new instruction scheduling models provide information about the
number of cycles consumed on each processor resource. This makes it
possible to estimate ILP more accurately than simply counting
instructions / issue width.
The functions getResourceDepth() and getResourceLength() now identify
the limiting processor resource, and return a cycle count based on that.
This gives more precise resource information, particularly in traces
that use one resource a lot more than others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178553 91177308-0d34-0410-b5e6-96231b3b80d8
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.
copy(char *a, char *b, int n) {
do {
int t0 = a[0];
int t1 = a[1];
b[0] = t0;
b[1] = t1;
radar://13536387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178546 91177308-0d34-0410-b5e6-96231b3b80d8
We would also like to merge sequences that involve a variable index like in the
example below.
int index = *idx++
int i0 = c[index+0];
int i1 = c[index+1];
b[0] = i0;
b[1] = i1;
By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.
The dag for the code above will look something like:
(load (i64 add (i64 copyfromreg %c)
(i64 signextend (i8 load %index))))
(load (i64 add (i64 copyfromreg %c)
(i64 signextend (i32 add (i32 signextend (i8 load %index))
(i32 1)))))
The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.
(load (i64 add (i64 copyfromreg %c)
(i64 signextend (add (i8 load %index)
(i8 1))))
vs
(load (i64 add (i64 copyfromreg %c)
(i64 signextend (i32 add (i32 signextend (i8 load %index))
(i32 1)))))
radar://13536387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178483 91177308-0d34-0410-b5e6-96231b3b80d8
die values. A lot of DIEs have 10 attributes in C++ code (example
clang), none had more than 12. Seems like a good default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178366 91177308-0d34-0410-b5e6-96231b3b80d8
immediate in a register. I don't believe this should ever fail, but I see no
harm in trying to make this code bullet proof.
I've added an assert to ensure my assumtion is correct. If the assertion fires
something is wrong and we should fix it, rather then just silently fall back to
SelectionDAG isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178305 91177308-0d34-0410-b5e6-96231b3b80d8
This is a follow-up to r178073 (which should actually make target-customized
spilling work again).
I still don't have a regression test for this (but it would be good to have
one; Thumb 1 and Mips16 use this callback as well).
Patch by Richard Sandiford.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178137 91177308-0d34-0410-b5e6-96231b3b80d8
As pointed out by Richard Sandiford, my recent updates to the register
scavenger broke targets that use custom spilling (because the new code assumed
that if there were no valid spill slots, than spilling would be impossible).
I don't have a test case, but it should be possible to create one for Thumb 1,
Mips 16, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178073 91177308-0d34-0410-b5e6-96231b3b80d8
The previous algorithm could not deal properly with scavenging multiple virtual
registers because it kept only one live virtual -> physical mapping (and
iterated through operands in order). Now we don't maintain a current mapping,
but rather use replaceRegWith to completely remove the virtual register as
soon as the mapping is established.
In order to allow the register scavenger to return a physical register killed
by an instruction for definition by that same instruction, we now call
RS->forward(I) prior to eliminating virtual registers defined in I. This
requires a minor update to forward to ignore virtual registers.
These new features will be tested in forthcoming commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178058 91177308-0d34-0410-b5e6-96231b3b80d8
- Handle the case where the result of 'insert_subvect' is bitcasted
before 'extract_subvec'. This removes the redundant insertf128/extractf128
pair on unaligned 256-bit vector load/store on vectors of non 64-bit integer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177945 91177308-0d34-0410-b5e6-96231b3b80d8
For instance, following transformation will be disabled:
x + x + x => 3.0f * x;
The problem of these transformations is that it introduces a FP constant, which
following Instruction-Selection pass cannot handle.
Reviewed by Nadav, thanks a lot!
rdar://13445387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177933 91177308-0d34-0410-b5e6-96231b3b80d8