to std::set<std::pair<Inst,Func>> to avoid duplicate entries.
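A minimal, self-contained sketch of the dedup idea; the Inst/Func types below are illustrative stand-ins, not the real DSA classes:

  #include <cstdio>
  #include <set>
  #include <utility>

  struct Inst {};  // stand-in for a call instruction
  struct Func {};  // stand-in for a callee function

  int main() {
    Inst I1, I2;
    Func F1;

    // A std::set keyed on (instruction, callee) pairs records each pair once,
    // no matter how many times the same caller/callee relationship is revisited.
    std::set<std::pair<Inst*, Func*>> ActualCallees;

    ActualCallees.insert({&I1, &F1});
    ActualCallees.insert({&I1, &F1});  // duplicate: ignored by the set
    ActualCallees.insert({&I2, &F1});

    std::printf("%zu unique (call site, callee) pairs\n", ActualCallees.size());
    return 0;
  }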
This speeds up the CompleteBU pass from 1.99s to 0.15s on povray and the
eqgraph passes from 1.5s to 0.16s on the same program.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21031 91177308-0d34-0410-b5e6-96231b3b80d8
this clone is supposed to be used for *ALL* of the functions in the SCC.
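Roughly, the fix amounts to cloning once per SCC and pointing every member
function at that one clone; a hypothetical sketch (Graph/Function below are
stand-ins, not the DSA types):

  #include <cstdio>
  #include <map>
  #include <vector>

  struct Function { const char *Name; };
  struct Graph    { int Id; };

  int main() {
    Function A{"a"}, B{"b"}, C{"c"};
    std::vector<Function*> SCC = {&A, &B, &C};

    // Clone the SCC's graph exactly once...
    Graph SharedClone{42};

    // ...and point every member of the SCC at that single clone, instead of
    // materializing one private copy per function (the source of the blowup).
    std::map<Function*, Graph*> GraphForFunction;
    for (Function *F : SCC)
      GraphForFunction[F] = &SharedClone;

    std::printf("functions: %zu, clones: 1\n", GraphForFunction.size());
    return 0;
  }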
This fixes the memory explosion problem the TD pass was having, reducing the
memory growth from 24MB -> 3.5MB on povray and 270MB -> 8.3MB on perlbmk!
This obviously also speeds up the TD pass *a lot*.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20763 91177308-0d34-0410-b5e6-96231b3b80d8
sites that target multiple callees. For example, if we have a function
table with N callees and M callers calling through it, we used to have to
perform O(M*N) graph inlinings; now we perform O(M+N) inlinings.
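A toy sketch of the counting argument, with illustrative types and an
invented inlineInto helper standing in for graph inlining:

  #include <cstdio>
  #include <vector>

  struct Graph {
    int Inlinings = 0;
    void inlineInto(Graph &Dest) const { ++Dest.Inlinings; }  // count one inlining
  };

  int main() {
    const int N = 5, M = 7;
    std::vector<Graph> Callees(N), Callers(M);

    // Old scheme: every caller inlines every callee -> M*N inlinings.
    // New scheme: N inlinings to build one merged callee graph, then M
    // inlinings to push that merged graph into the callers -> M+N total.
    Graph Merged;
    for (const Graph &Callee : Callees)
      Callee.inlineInto(Merged);              // N inlinings

    int Work = Merged.Inlinings;
    for (Graph &Caller : Callers) {
      Merged.inlineInto(Caller);              // M inlinings
      Work += Caller.Inlinings;
    }
    std::printf("inlinings performed: %d (vs %d with the old scheme)\n",
                Work, M * N);
    return 0;
  }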
This speeds up the TD pass on perlbmk from 36.26s to 25.75s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20743 91177308-0d34-0410-b5e6-96231b3b80d8
graph into all of the functions it calls when we visit a graph, change it so
that, when the graph visitor visits a graph, it inlines all of that graph's
callers into it.
While we're at it, inline global information from the GG instead of from each
of the callers. The GG contains a superset of the info that the callers have
anyway, and this way we only need to do it once (not once for each caller).
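A rough sketch of the reversed visitation; the Graph struct and its
mergeCallerInfo/mergeGlobalsGraphInfo helpers are made up for illustration,
not the actual TD pass code:

  #include <cstdio>
  #include <vector>

  struct Graph {
    const char *Name;
    std::vector<Graph*> Callers;   // graphs that call into this one

    void mergeCallerInfo(const Graph &G) {
      std::printf("  inline %s into %s\n", G.Name, Name);
    }
    void mergeGlobalsGraphInfo() {
      std::printf("  merge GG info into %s\n", Name);
    }
  };

  // Visit graphs in top-down order: each graph pulls information from its
  // callers when it is visited, so it is completed exactly once, and global
  // information comes from the single GlobalsGraph rather than every caller.
  static void visitTD(Graph &G) {
    for (Graph *Caller : G.Callers)
      G.mergeCallerInfo(*Caller);
    G.mergeGlobalsGraphInfo();
  }

  int main() {
    Graph Main{"main", {}}, Helper{"helper", {}};
    Helper.Callers.push_back(&Main);

    std::printf("visit main:\n");   visitTD(Main);
    std::printf("visit helper:\n"); visitTD(Helper);
    return 0;
  }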
This speeds up the TD pass substantially on several programs, and there is
still room for improvement. For example, the TD pass used to take 147s
on perlbmk; it now takes 36s. On povray, we went from about 5s to 1.97s.
134.perl is down from ~1s for Loc+BU+TD to 0.6s.
The TD pass needs a lot of improvement though, which will occur with later
patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20723 91177308-0d34-0410-b5e6-96231b3b80d8
to tell apart anyway, and only track the leader of each of these equivalence
classes in our graphs.
This dramatically reduces the number of GlobalValue*'s that appear in scalar
maps, which (A) reduces memory usage by eliminating many, many scalar map
entries, and (B) reduces the time for operations that need to do work for
each global in the scalar map.
As an example, this reduces the memory used to analyze 176.gcc from 1GB to
511MB, which (while it's still way too much) is better because it doesn't hit
swap anymore. On eon, this shrinks the local graphs from 14MB to 6.8MB,
shrinks the BU+TD graphs of povray from 50MB to 40MB, shrinks the TD graphs of
130.li from 8.8MB to 3.6MB, etc.
This change also speeds up DSA on large programs, where it makes a big
difference. For example, 130.li goes from 1.17s -> 0.56s, 134.perl goes
from 2.14s -> 0.93s, and povray goes from 15.63s -> 7.99s (!!!).
This also apparently either fixes or hides the problem that caused DSA to
crash on perlbmk and gcc, because DSA now works on both. They still take
entirely too much time in the TD pass (147s for perl, 538s for gcc, vs
7.67s/5.9s respectively in the BU pass), but this is a known problem that
I'll deal with later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20696 91177308-0d34-0410-b5e6-96231b3b80d8
* Change the FunctionCalls and AuxFunctionCalls vectors into std::lists.
This makes many operations on these lists much more natural, and avoids
*extremely* expensive copying of DSCallSites (e.g. moving nodes around
between lists, erasing a node from anywhere but the end of the vector, etc.).
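A small sketch of why std::list wins here (the DSCallSite below is a cheap
stand-in for the real class): splice and mid-list erase just relink nodes
instead of copying elements.

  #include <cstdio>
  #include <list>
  #include <vector>

  struct DSCallSite { std::vector<int> Args; };   // pretend this is expensive to copy

  int main() {
    std::list<DSCallSite> FunctionCalls(3, DSCallSite{{1, 2, 3}});
    std::list<DSCallSite> AuxFunctionCalls;

    // Move the first call site into the other list: O(1) pointer surgery,
    // no DSCallSite is copied or destroyed.
    AuxFunctionCalls.splice(AuxFunctionCalls.end(), FunctionCalls,
                            FunctionCalls.begin());

    // Erase from the middle: also just unlinks a node, unlike vector::erase,
    // which would shift (copy) every later element down.
    FunctionCalls.erase(FunctionCalls.begin());

    std::printf("FunctionCalls: %zu, AuxFunctionCalls: %zu\n",
                FunctionCalls.size(), AuxFunctionCalls.size());
    return 0;
  }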
With a profile build of analyze, this speeds up BU DS from 25.14s to
12.59s on 176.gcc. I expect that it would help TD even more, but I don't
have data for it.
This effectively eliminates removeIdenticalCalls and its children from the
profile, going from 6.53s to 0.27s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19939 91177308-0d34-0410-b5e6-96231b3b80d8
Move include/Config and include/Support into include/llvm/Config,
include/llvm/ADT and include/llvm/Support. From here on out, all LLVM
public header files must be under include/llvm/.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16137 91177308-0d34-0410-b5e6-96231b3b80d8
removeDeadNodes is called, only call it at the end of the pass being run.
This saves 1.3 seconds running DSA on 177.mesa (5.3s -> 4.0s), which is
pretty big. This is only possible because of the automatic garbage
collection done on forwarding nodes.
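Schematically, the change is just batching the cleanup; the Graph struct and
mergeFunction helper below are hypothetical stand-ins, not the actual pass:

  #include <cstdio>
  #include <vector>

  struct Graph {
    void mergeFunction(int F)  { std::printf("merge function %d\n", F); }
    void removeDeadNodes()     { std::printf("sweep dead nodes once\n"); }
  };

  int main() {
    Graph G;
    std::vector<int> Functions = {0, 1, 2, 3};

    for (int F : Functions)
      G.mergeFunction(F);   // no per-function cleanup here any more

    G.removeDeadNodes();    // single sweep at the end of the pass
    return 0;
  }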
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11178 91177308-0d34-0410-b5e6-96231b3b80d8
the globals directly. This doesn't save any substantial time, however, because the
globals graph only contains globals!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10997 91177308-0d34-0410-b5e6-96231b3b80d8
efficient in the case where a function calls into the same graph multiple times
(i.e., it either contains multiple calls to the same function, or multiple calls
to functions in the same SCC graph).
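A hypothetical sketch of the idea: key the "already inlined" check on the
callee's graph rather than on the callee function, so a graph shared by an
SCC (or called repeatedly) is folded in only once per caller. The Graph and
CallSite types here are invented for illustration:

  #include <cstdio>
  #include <set>
  #include <vector>

  struct Graph { int Id; };
  struct CallSite { Graph *CalleeGraph; };

  int main() {
    Graph SCCGraph{1}, Other{2};
    std::vector<CallSite> Calls = {{&SCCGraph}, {&SCCGraph}, {&Other}};

    std::set<Graph*> InlinedGraphs;   // graphs already folded into the caller
    for (const CallSite &CS : Calls) {
      if (!InlinedGraphs.insert(CS.CalleeGraph).second)
        continue;                     // same graph as an earlier call: skip it
      std::printf("inline graph %d\n", CS.CalleeGraph->Id);
    }
    return 0;
  }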
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10986 91177308-0d34-0410-b5e6-96231b3b80d8
map was only used to implement a marginal GlobalsGraph optimization, and it
actually slows the analysis down (due to the overhead of keeping it), so just
eliminate it entirely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10955 91177308-0d34-0410-b5e6-96231b3b80d8
after all callers are inlined into the current graph.
(2) Optimize the way a graph is inlined into its callees in the TD phase:
(a) Use DSGraph::cloneReachableSubgraph to clone only a subgraph at
each call site, for faster inlining.
    (b) Clone separately for the same callee at different call sites,
        since only the reachable subgraph is being cloned, not the entire
        caller graph (a rough sketch of both points follows below).
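A purely illustrative sketch: the Node type and cloneReachableFrom helper are
invented stand-ins for the real DSGraph::cloneReachableSubgraph machinery.

  #include <cstdio>
  #include <set>
  #include <vector>

  struct Node { std::vector<Node*> Links; };

  // Collect the nodes reachable from the given roots (the subgraph that would
  // actually be cloned into the callee at this call site).
  static void reach(Node *N, std::set<Node*> &Seen) {
    if (!N || !Seen.insert(N).second) return;
    for (Node *L : N->Links) reach(L, Seen);
  }

  static std::set<Node*> cloneReachableFrom(const std::vector<Node*> &Roots) {
    std::set<Node*> Subgraph;
    for (Node *R : Roots) reach(R, Subgraph);
    return Subgraph;                  // stand-in for the cloned subgraph
  }

  int main() {
    // Caller graph: A -> B, plus an unrelated node C.
    Node A, B, C;
    A.Links.push_back(&B);

    // Call site 1 passes A; call site 2 passes C. Each site clones only what
    // it can reach, so the two clones stay small and independent.
    std::printf("site 1 clones %zu nodes\n", cloneReachableFrom({&A}).size());
    std::printf("site 2 clones %zu nodes\n", cloneReachableFrom({&C}).size());
    return 0;
  }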
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7188 91177308-0d34-0410-b5e6-96231b3b80d8