The height on an instruction is the minimum number of cycles from the
instruction is issued to the end of the trace. Heights are computed for
all instructions in and below the trace center block.
The method for computing heights is different from the depth
computation. As we visit instructions in the trace bottom-up, heights of
used instructions are pushed upwards. This way, we avoid scanning long
use lists, looking for uses in the current trace.
At each basic block boundary, a list of live-in registers and their
minimum heights is saved in the trace block info. These live-in lists
are used when restarting depth computations on a trace that
converges with an already computed trace. They will also be used to
accurately compute the critical path length.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161138 91177308-0d34-0410-b5e6-96231b3b80d8
Assuming infinite issue width, compute the earliest each instruction in
the trace can issue, when considering the latency of data dependencies.
The issue cycle is record as a 'depth' from the beginning of the trace.
This is half the computation required to find the length of the critical
path through the trace. Heights are next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161074 91177308-0d34-0410-b5e6-96231b3b80d8
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.
rdar://11980766
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161062 91177308-0d34-0410-b5e6-96231b3b80d8
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.
<rdar://problem/11950722>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161023 91177308-0d34-0410-b5e6-96231b3b80d8
We branch to the successor with higher edge weight first.
Convert from
je LBB4_8 --> to outer loop
jmp LBB4_14 --> to inner loop
to
jne LBB4_14
jmp LBB4_8
PR12750
rdar: 11393714
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161018 91177308-0d34-0410-b5e6-96231b3b80d8
This lets traces include the final iteration of a nested loop above the
center block, and the first iteration of a nested loop below the center
block.
We still don't allow traces to contain backedges, and traces are
truncated where they would leave a loop, as seen from the center block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161003 91177308-0d34-0410-b5e6-96231b3b80d8
When computing a trace, all the candidates for pred/succ must have been
visited. Filter out back-edges first, though. The PO traversal ignores
them.
Thanks to Andy for spotting this in review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160995 91177308-0d34-0410-b5e6-96231b3b80d8
This is a cleaned up version of the isFree() function in
MachineTraceMetrics.cpp.
Transient instructions are very unlikely to produce any code in the
final output. Either because they get eliminated by RegisterCoalescing,
or because they are pseudo-instructions like labels and debug values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160977 91177308-0d34-0410-b5e6-96231b3b80d8
The MachineTraceMetrics analysis must be invalidated before modifying
the CFG. This will catch some of the violations of that rule.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160969 91177308-0d34-0410-b5e6-96231b3b80d8
A->isPredecessor(B) is the same as B->isSuccessor(A), but it can
tolerate a B that is null or dangling. This shouldn't happen normally,
but it it useful for verification code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160968 91177308-0d34-0410-b5e6-96231b3b80d8
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.
rdar://10554090 and rdar://11873276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160919 91177308-0d34-0410-b5e6-96231b3b80d8
A value number is a PHI def if and only if it begins at a block
boundary. This can be derived from the def slot, a separate flag is not
necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160893 91177308-0d34-0410-b5e6-96231b3b80d8
This option replaces the existing live interval computation with one
based on LiveRangeCalc.cpp. The new algorithm does not depend on
LiveVariables, and it can be run at any time, before or after leaving
SSA form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160892 91177308-0d34-0410-b5e6-96231b3b80d8
This is still a work in progress.
Out-of-order CPUs usually execute instructions from multiple basic
blocks simultaneously, so it is necessary to look at longer traces when
estimating the performance effects of code transformations.
The MachineTraceMetrics analysis will pick a typical trace through a
given basic block and provide performance metrics for the trace. Metrics
will include:
- Instruction count through the trace.
- Issue count per functional unit.
- Critical path length, and per-instruction 'slack'.
These metrics can be used to determine the performance limiting factor
when executing the trace, and how it will be affected by a code
transformation.
Initially, this will be used by the early if-conversion pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160796 91177308-0d34-0410-b5e6-96231b3b80d8
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.
This also fixed rdar://11830760 where xor is expected instead of movi0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160749 91177308-0d34-0410-b5e6-96231b3b80d8
When a live range splits into multiple connected components, we would
arbitrarily assign <undef> uses to component 0. This is wrong when the
use is tied to a def that gets assigned to a different component:
%vreg69<def> = ADD8ri %vreg68<undef>, 1
The use and def must get the same virtual register.
Fix this by assigning <undef> uses to the same component as the value
defined by the instruction, if any:
%vreg69<def> = ADD8ri %vreg69<undef>, 1
This fixes PR13402. The PR has a test case which I am not including
because it is unlikely to keep exposing this behavior in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160739 91177308-0d34-0410-b5e6-96231b3b80d8
that do not support it (X86 does not lower select_cc).
PR: 13428
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160619 91177308-0d34-0410-b5e6-96231b3b80d8
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.
This fixes PR13414.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160575 91177308-0d34-0410-b5e6-96231b3b80d8
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge anyway.
Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement from injecting the
copies into the loop.
The test case demonstrates the improvement. Before:
LBB0_1:
cmpb $0, (%rdx)
leaq 1(%rdx), %rdx
movl %esi, %eax
je LBB0_1
After:
LBB0_1:
cmpb $0, (%rdx)
leaq 1(%rdx), %rdx
je LBB0_1
movl %esi, %eax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160571 91177308-0d34-0410-b5e6-96231b3b80d8