Rewrite the shared implementation of BlockFrequencyInfo and
MachineBlockFrequencyInfo entirely.
The old implementation had a fundamental flaw: precision losses from
nested loops (or very wide branches) compounded past loop exits (and
convergence points).
The @nested_loops testcase at the end of
test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating. This
function has three nested loops, with branch weights in the loop headers
of 1:4000 (exit:continue). The old analysis gives non-sensical results:
Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
---- Block Freqs ----
entry = 1.0
for.cond1.preheader = 1.00103
for.cond4.preheader = 5.5222
for.body6 = 18095.19995
for.inc8 = 4.52264
for.inc11 = 0.00109
for.end13 = 0.0
The new analysis gives correct results:
Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
block-frequency-info: nested_loops
- entry: float = 1.0, int = 8
- for.cond1.preheader: float = 4001.0, int = 32007
- for.cond4.preheader: float = 16008001.0, int = 128064007
- for.body6: float = 64048012001.0, int = 512384096007
- for.inc8: float = 16008001.0, int = 128064007
- for.inc11: float = 4001.0, int = 32007
- for.end13: float = 1.0, int = 8
Most importantly, the frequency leaving each loop matches the frequency
entering it.
The new algorithm leverages BlockMass and PositiveFloat to maintain
precision, separates "probability mass distribution" from "loop
scaling", and uses dithering to eliminate probability mass loss. I have
unit tests for these types out of tree, but it was decided in the review
to make the classes private to BlockFrequencyInfoImpl, and try to shrink
them (or remove them entirely) in follow-up commits.
The new algorithm should generally have a complexity advantage over the
old. The previous algorithm was quadratic in the worst case. The new
algorithm is still worst-case quadratic in the presence of irreducible
control flow, but it's linear without it.
The key difference between the old algorithm and the new is that control
flow within a loop is evaluated separately from control flow outside,
limiting propagation of precision problems and allowing loop scale to be
calculated independently of mass distribution. Loops are visited
bottom-up, their loop scales are calculated, and they are replaced by
pseudo-nodes. Mass is then distributed through the function, which is
now a DAG. Finally, loops are revisited top-down to multiply through
the loop scales and the masses distributed to pseudo nodes.
There are some remaining flaws.
- Irreducible control flow isn't modelled correctly. LoopInfo and
MachineLoopInfo ignore irreducible edges, so this algorithm will
fail to scale accordingly. There's a note in the class
documentation about how to get closer. See also the comments in
test/Analysis/BlockFrequencyInfo/irreducible.ll.
- Loop scale is limited to 4096 per loop (2^12) to avoid exhausting
the 64-bit integer precision used downstream.
- The "bias" calculation proposed on llvmdev is *not* incorporated
here. This will be added in a follow-up commit, once comments from
this review have been handled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206548 91177308-0d34-0410-b5e6-96231b3b80d8
to a more normal move operation on the graph itself. The definition
already got removed, but I missed the declaration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206455 91177308-0d34-0410-b5e6-96231b3b80d8
This will become necessary to build up the SCC iterators and SCC
definitions. Moving it now so that subsequent diffs are incremental.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206454 91177308-0d34-0410-b5e6-96231b3b80d8
graph. This simplifies the custom move constructor operation to one of
walking the graph and updating the 'up' pointers to point to the new
location of the graph. Switch the nodes from a reference to a pointer
for the 'up' edge to facilitate this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206450 91177308-0d34-0410-b5e6-96231b3b80d8
It doesn't work. I'm still cleaning up all the places where I blindly
followed this pattern. There are more to come in this code too.
As a benefit, this lets the default copy and move operations Just Work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206375 91177308-0d34-0410-b5e6-96231b3b80d8
Moves redundant template parameters into an implementation detail of
BlockFrequencyInfoImpl.
No functionality change.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206084 91177308-0d34-0410-b5e6-96231b3b80d8
This is a shared implementation class for BlockFrequencyInfo and
MachineBlockFrequencyInfo, not for BlockFrequency, a related (but
distinct) class.
No functionality change.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206083 91177308-0d34-0410-b5e6-96231b3b80d8
Also updated as many loops as I could find using df_begin/idf_begin -
strangely I found no uses of idf_begin. Is that just used out of tree?
Also a few places couldn't use df_begin because either they used the
member functions of the depth first iterators or had specific ordering
constraints (I added a comment in the latter case).
Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T>
where you needed iterator_range<idf_iterator<T>>)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206016 91177308-0d34-0410-b5e6-96231b3b80d8
In preparation for an upcoming commit implementing unrolling preferences for
x86, this adds additional fields to the UnrollingPreferences structure:
- PartialThreshold and PartialOptSizeThreshold - Like Threshold and
OptSizeThreshold, but used when not fully unrolling. These are necessary
because we need different thresholds for full unrolling from those used when
partially unrolling (the full unrolling thresholds are generally going to be
larger).
- MaxCount - A cap on the unrolling factor when partially unrolling. This can
be used by a target to prevent the unrolled loop from exceeding some
resource limit independent of the loop size (such as number of branches).
There should be no functionality change for any in-tree targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205347 91177308-0d34-0410-b5e6-96231b3b80d8
Implement Pass::releaseMemory() in BlockFrequencyInfo and
MachineBlockFrequencyInfo. Just delete the private implementation when
not in use. Switch to a std::unique_ptr to make the logic more clear.
<rdar://problem/14292693>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204741 91177308-0d34-0410-b5e6-96231b3b80d8
Extend the target hook to take also the operand index into account when
calculating the cost of the constant materialization.
Related to <rdar://problem/16381500>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204435 91177308-0d34-0410-b5e6-96231b3b80d8
This commit extends the coverage of the constant hoisting pass, adds additonal
debug output and updates the function names according to the style guide.
Related to <rdar://problem/16381500>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204389 91177308-0d34-0410-b5e6-96231b3b80d8
as well. I don't see any particular need but it imposes no cost to
support it and it makes the API cleaner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203448 91177308-0d34-0410-b5e6-96231b3b80d8
the stack of the analysis group because they are all immutable passes.
This is made clear by Craig's recent work to use override
systematically -- we weren't overriding anything for 'finalizePass'
because there is no such thing.
This is kind of a lame restriction on the API -- we can no longer push
and pop things, we just set up the stack and run. However, I'm not
invested in building some better solution on top of the existing
(terrifying) immutable pass and legacy pass manager.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203437 91177308-0d34-0410-b5e6-96231b3b80d8
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203083 91177308-0d34-0410-b5e6-96231b3b80d8
to ensure we don't mess up any of the overrides. Necessary for cleaning
up the Value use iterators and enabling range-based traversing of use
lists.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202958 91177308-0d34-0410-b5e6-96231b3b80d8
a bit surprising, as the class is almost entirely abstracted away from
any particular IR, however it encodes the comparsion predicates which
mutate ranges as ICmp predicate codes. This is reasonable as they're
used for both instructions and constants. Thus, it belongs in the IR
library with instructions and constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202838 91177308-0d34-0410-b5e6-96231b3b80d8
this would have been required because of the use of DataLayout, but that
has moved into the IR proper. It is still required because this folder
uses the constant folding in the analysis library (which uses the
datalayout) as the more aggressive basis of its folder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202832 91177308-0d34-0410-b5e6-96231b3b80d8
Move the test for this class into the IR unittests as well.
This uncovers that ValueMap too is in the IR library. Ironically, the
unittest for ValueMap is useless in the Support library (honestly, so
was the ValueHandle test) and so it already lives in the IR unittests.
Mmmm, tasty layering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202821 91177308-0d34-0410-b5e6-96231b3b80d8
business.
This header includes Function and BasicBlock and directly uses the
interfaces of both classes. It has to do with the IR, it even has that
in the name. =] Put it in the library it belongs to.
This is one step toward making LLVM's Support library survive a C++
modules bootstrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202814 91177308-0d34-0410-b5e6-96231b3b80d8
Eventually DataLayoutPass should go away, but for now that is the only easy
way to get a DataLayout in some APIs. This patch only changes the ones that
have easy access to a Module.
One interesting issue with sometimes using DataLayoutPass and sometimes
fetching it from the Module is that we have to make sure they are equivalent.
We can get most of the way there by always constructing the pass with a Module.
In fact, the pass could be changed to point to an external DataLayout instead
of owning one to make this stricter.
Unfortunately, the C api passes a DataLayout, so it has to be up to the caller
to make sure the pass and the module are in sync.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202204 91177308-0d34-0410-b5e6-96231b3b80d8
After this I will set the default back to F_None. The advantage is that
before this patch forgetting to set F_Binary would corrupt a file on windows.
Forgetting to set F_Text produces one that cannot be read in notepad, which
is a better failure mode :-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202052 91177308-0d34-0410-b5e6-96231b3b80d8
During LSR of one loop we can run into a situation where we have to expand the
start of a recurrence of a loop induction variable in this loop. This start
value is a value derived of the induction variable of a preceeding loop. SCEV
has cannonicalized this value to a different recurrence than the recurrence of
the preceeding loop's induction variable (the type and/or step direction) has
changed). When we come to instantiate this SCEV we created a second induction
variable in this preceeding loop. This patch tries to base such derived
induction variables of the preceeding loop's induction variable.
This helps twolf on arm and seems to help scimark2 on x86.
Reapply with a fix for the case of a value derived from a pointer.
radar://15970709
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201496 91177308-0d34-0410-b5e6-96231b3b80d8