1. Increase max node size from 64->256 to avoid collapsing an important
structure in 181.mcf
2. If we have multiple calls to an indirect call node with an indirect
callee, fold these call nodes together, to avoid DSA turning apoc into
a flaming fireball of death when analyzing 176.gcc.
With this change, 176.gcc now takes ~7s to analyze for loc+bu+td, with
5.7s of that in the BU pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20775 91177308-0d34-0410-b5e6-96231b3b80d8
this clone is supposed to be used for *ALL* of the functions in the SCC.
This fixes the memory explosion problem the TD pass was having, reducing the
memory growth from 24MB -> 3.5MB on povray and 270MB ->8.3MB on perlbmk!
This obviously also speeds up the TD pass *a lot*.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20763 91177308-0d34-0410-b5e6-96231b3b80d8
up the TD pass about 30% for povray and perlbmk. It's still not clear why
copying a 5MB set of graphs turns into a 25MB set of graphs though :(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20762 91177308-0d34-0410-b5e6-96231b3b80d8
sites that target multiple callees. If we have a function table, for
example, with N callees, and M callers call through it, we used to have
to perform O(M*N) graph inlinings. Now we perform O(M+N) inlinings.
This speeds up the td pass on perlbmk from 36.26s to 25.75s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20743 91177308-0d34-0410-b5e6-96231b3b80d8
graph into all of the functions it calls when we visit a graph, change it so
that the graph visitor inlines all of the callers of a graph into the current
graph when it visits it.
While we're at it, inline global information from the GG instead of from each
of the callers. The GG contains a superset of the info that the callers do
anyway, and this way we only need to do it one time (not one for each caller).
This speeds up the TD pass substantially on several programs, and there is
still room for improvement. For example, the TD pass used to take 147s
on perlbmk, it now takes 36s. On povray, we went from about 5s to 1.97s.
134.perl is down from ~1s for Loc+BU+TD to .6s.
The TD pass needs a lot of improvement though, which will occur with later
patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20723 91177308-0d34-0410-b5e6-96231b3b80d8
Globals Graph for the local pass, the second is after all of the locals
graphs have been constructed. This allows for many additional global EC's
to be recognized that weren't before. This speeds up analysis of programs
like 177.mesa, where it changes DSA from taking 0.712s to 0.4018s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20711 91177308-0d34-0410-b5e6-96231b3b80d8
to tell apart anyway, and only track the leader for of these equivalence
classes in our graphs.
This dramatically reduces the number of GlobalValue*'s that appear in scalar
maps, which A) reduces memory usage, by eliminating many many scalarmap entries
and B) reduces time for operations that need to execute an operation for each
global in the scalar map.
As an example, this reduces the memory used to analyze 176.gcc from 1GB to
511MB, which (while it's still way too much) is better because it doesn't hit
swap anymore. On eon, this shrinks the local graphs from 14MB to 6.8MB,
shrinks the bu+td graphs of povray from 50M to 40M, shrinks the TD graphs of
130.li from 8.8M to 3.6M, etc.
This change also speeds up DSA on large programs where this makes a big
difference. For example, 130.li goes from 1.17s -> 0.56s, 134.perl goes
from 2.14 -> 0.93s, povray goes from 15.63s->7.99s (!!!).
This also apparently either fixes the problem that caused DSA to crash on
perlbmk and gcc, or it hides it, because DSA now works on these. These
both take entirely too much time in the TD pass (147s for perl, 538s for
gcc, vs 7.67/5.9s in the bu pass for either one), but this is a known
problem that I'll deal with later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20696 91177308-0d34-0410-b5e6-96231b3b80d8
effect these calls can have is due to global variables, and these passes
all use the globals graph to capture their effect anyway. This speeds up
the BU pass very slightly on perlbmk, reducing the number of dsnodes
allocated from 98913 to 96423.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20676 91177308-0d34-0410-b5e6-96231b3b80d8
graph, and the combination of a function that does not reference globals, takes
not arguments and returns no value is pretty rare.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20670 91177308-0d34-0410-b5e6-96231b3b80d8
to determine mod/ref behavior, instead of creating a *copy* of the caller
graph and inlining the callee graph into the copy.
This speeds up aa-eval on Ptrdist/yacr2 from 109.13s to 3.98s, and gives
identical results. The speedup is similar on other programs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20669 91177308-0d34-0410-b5e6-96231b3b80d8
1. Chain to the parent implementation of M/R analysis if we can't find
any information. It has some heuristics that often do well.
2. Do not clear all flags, this can make invalid nodes by turning nodes
that used to be collapsed into non-collapsed nodes (fixing crashes)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20659 91177308-0d34-0410-b5e6-96231b3b80d8
{ short, short }
to
short
where the second short maps onto the second field of the first struct. In
this case, the struct index is not aligned, so we should avoid calling
getLink(2), which asserts out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20626 91177308-0d34-0410-b5e6-96231b3b80d8
void foo() {
G = 1;
}
would have an empty DSGraph even though G (a global) is directly used
in the function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20619 91177308-0d34-0410-b5e6-96231b3b80d8
using Function::arg_{iterator|begin|end}. Likewise Module::g* -> Module::global_*.
This patch is contributed by Gabor Greif, thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20597 91177308-0d34-0410-b5e6-96231b3b80d8
If we fold three constants together (c1+c2+c3), make sure to keep
LHSC updated, instead of reusing (in this case), the 1 instead of the
partial sum.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20337 91177308-0d34-0410-b5e6-96231b3b80d8
Actually teach dsa about select instructions. This doesn't affect the
graph in any way other than not setting a spurious U marker on pointer
nodes that are selected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20324 91177308-0d34-0410-b5e6-96231b3b80d8
X = gep null, ...
Used to not create a scalar map entry for X, which caused clients to barf.
This is bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20316 91177308-0d34-0410-b5e6-96231b3b80d8
infinite loops (using the new replaceSymbolicValuesWithConcrete method).
This patch reverts this patch:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050131/023830.html
... which was an attempted fix for this problem. Unfortunately, that patch
caused test/Regression/Transforms/IndVarsSimplify/exit_value_tests.llx to fail
and slightly castrated the entire analysis. This patch fixes it right.
This patch is dedicated to jeffc, for making me deal with this. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20146 91177308-0d34-0410-b5e6-96231b3b80d8
into a temporary graph, remember it for later, then inline the tmp graph into
the call site.
In the case where there are other call sites to the same set of functions, this
permits us to just inline the temporary graph instead of all of the callees.
This turns N*M inlining situations into an N+M inlining situation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20036 91177308-0d34-0410-b5e6-96231b3b80d8
* Change the FunctionCalls and AuxFunctionCalls vectors into std::lists.
This makes many operations on these lists much more natural, and avoids
*exteremely* expensive copying of DSCallSites (e.g. moving nodes around
between lists, erasing a node from not the end of the vector, etc).
With a profile build of analyze, this speeds up BU DS from 25.14s to
12.59s on 176.gcc. I expect that it would help TD even more, but I don't
have data for it.
This effectively eliminates removeIdenticalCalls and children from the
profile, going from 6.53 to 0.27s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19939 91177308-0d34-0410-b5e6-96231b3b80d8
not invalidated on entry and on exit of the block. This fixes some N^2
behavior in common cases, and speeds up gcc another 5% to 22.35s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19910 91177308-0d34-0410-b5e6-96231b3b80d8
will cause us to miss cases where the input pointer to a load could be value
numbered to another load. Something like this:
%X = load int* %P1
%Y = load int* %P2
Those are obviously the same if P1/P2 are the same. The code this patch
removes attempts to handle that. However, since GCSE iterates, this doesn't
actually buy us anything: GCSE will first replace P1 or P2 with the other
one, then the load can be value numbered as equal.
Removing this code speeds up gcse a lot. On 176.gcc in debug mode, this
speeds up gcse from 29.08s -> 25.73s, a 13% savings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19906 91177308-0d34-0410-b5e6-96231b3b80d8
If needed, this can be resurrected from CVS.
Note that several of the interfaces (e.g. the IPModRef ones) are supersumed
by generic AliasAnalysis interfaces that have been written since this code
was developed (and they are not DSA specific).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19864 91177308-0d34-0410-b5e6-96231b3b80d8
1. Rename createLoaderPass to CreateProfileLoaderPass
2. Opt shouldn't use the pass registered in CodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19431 91177308-0d34-0410-b5e6-96231b3b80d8
a function call at the core of the loop inline and removing unused
stack variables from an often called function. This doesn't improve things
much, the real saving will be by reducing the number of calls to this
function (100K+ when linking kimwitu++).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19119 91177308-0d34-0410-b5e6-96231b3b80d8
Regression/Analysis/GlobalsModRef/purecse.ll. Isn't this what the
-Woverload-whatever flag would warn about :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19018 91177308-0d34-0410-b5e6-96231b3b80d8
Make only one print method to avoid overloaded virtual warnings when \
compiled with -Woverloaded-virtual
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18589 91177308-0d34-0410-b5e6-96231b3b80d8
All SPEC CFP 95 programs now work, though the JIT isn't loading -lf2c right
so they aren't testing correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18499 91177308-0d34-0410-b5e6-96231b3b80d8