global roots in from callees to callers. The BU graphs do not have accurate
globals information and all of the clients know it. Instead, just make sure
the GG is up-to-date, and they will be perfectly satiated.
This speeds up the BU pass on 176.gcc from 5.5s to 1.5s, and Loc+BU+TD
from 7s to 2.7s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20786 91177308-0d34-0410-b5e6-96231b3b80d8
1. Increase max node size from 64->256 to avoid collapsing an important
structure in 181.mcf
2. If we have multiple calls to an indirect call node with an indirect
callee, fold these call nodes together, to avoid DSA turning apoc into
a flaming fireball of death when analyzing 176.gcc.
With this change, 176.gcc now takes ~7s to analyze for loc+bu+td, with
5.7s of that in the BU pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20775 91177308-0d34-0410-b5e6-96231b3b80d8
this clone is supposed to be used for *ALL* of the functions in the SCC.
This fixes the memory explosion problem the TD pass was having, reducing the
memory growth from 24MB -> 3.5MB on povray and 270MB ->8.3MB on perlbmk!
This obviously also speeds up the TD pass *a lot*.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20763 91177308-0d34-0410-b5e6-96231b3b80d8
up the TD pass about 30% for povray and perlbmk. It's still not clear why
copying a 5MB set of graphs turns into a 25MB set of graphs though :(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20762 91177308-0d34-0410-b5e6-96231b3b80d8
sites that target multiple callees. If we have a function table, for
example, with N callees, and M callers call through it, we used to have
to perform O(M*N) graph inlinings. Now we perform O(M+N) inlinings.
This speeds up the td pass on perlbmk from 36.26s to 25.75s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20743 91177308-0d34-0410-b5e6-96231b3b80d8
graph into all of the functions it calls when we visit a graph, change it so
that the graph visitor inlines all of the callers of a graph into the current
graph when it visits it.
While we're at it, inline global information from the GG instead of from each
of the callers. The GG contains a superset of the info that the callers do
anyway, and this way we only need to do it one time (not one for each caller).
This speeds up the TD pass substantially on several programs, and there is
still room for improvement. For example, the TD pass used to take 147s
on perlbmk, it now takes 36s. On povray, we went from about 5s to 1.97s.
134.perl is down from ~1s for Loc+BU+TD to .6s.
The TD pass needs a lot of improvement though, which will occur with later
patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20723 91177308-0d34-0410-b5e6-96231b3b80d8
Globals Graph for the local pass, the second is after all of the locals
graphs have been constructed. This allows for many additional global EC's
to be recognized that weren't before. This speeds up analysis of programs
like 177.mesa, where it changes DSA from taking 0.712s to 0.4018s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20711 91177308-0d34-0410-b5e6-96231b3b80d8
to tell apart anyway, and only track the leader for of these equivalence
classes in our graphs.
This dramatically reduces the number of GlobalValue*'s that appear in scalar
maps, which A) reduces memory usage, by eliminating many many scalarmap entries
and B) reduces time for operations that need to execute an operation for each
global in the scalar map.
As an example, this reduces the memory used to analyze 176.gcc from 1GB to
511MB, which (while it's still way too much) is better because it doesn't hit
swap anymore. On eon, this shrinks the local graphs from 14MB to 6.8MB,
shrinks the bu+td graphs of povray from 50M to 40M, shrinks the TD graphs of
130.li from 8.8M to 3.6M, etc.
This change also speeds up DSA on large programs where this makes a big
difference. For example, 130.li goes from 1.17s -> 0.56s, 134.perl goes
from 2.14 -> 0.93s, povray goes from 15.63s->7.99s (!!!).
This also apparently either fixes the problem that caused DSA to crash on
perlbmk and gcc, or it hides it, because DSA now works on these. These
both take entirely too much time in the TD pass (147s for perl, 538s for
gcc, vs 7.67/5.9s in the bu pass for either one), but this is a known
problem that I'll deal with later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20696 91177308-0d34-0410-b5e6-96231b3b80d8
effect these calls can have is due to global variables, and these passes
all use the globals graph to capture their effect anyway. This speeds up
the BU pass very slightly on perlbmk, reducing the number of dsnodes
allocated from 98913 to 96423.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20676 91177308-0d34-0410-b5e6-96231b3b80d8
graph, and the combination of a function that does not reference globals, takes
not arguments and returns no value is pretty rare.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20670 91177308-0d34-0410-b5e6-96231b3b80d8
to determine mod/ref behavior, instead of creating a *copy* of the caller
graph and inlining the callee graph into the copy.
This speeds up aa-eval on Ptrdist/yacr2 from 109.13s to 3.98s, and gives
identical results. The speedup is similar on other programs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20669 91177308-0d34-0410-b5e6-96231b3b80d8
1. Chain to the parent implementation of M/R analysis if we can't find
any information. It has some heuristics that often do well.
2. Do not clear all flags, this can make invalid nodes by turning nodes
that used to be collapsed into non-collapsed nodes (fixing crashes)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20659 91177308-0d34-0410-b5e6-96231b3b80d8
{ short, short }
to
short
where the second short maps onto the second field of the first struct. In
this case, the struct index is not aligned, so we should avoid calling
getLink(2), which asserts out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20626 91177308-0d34-0410-b5e6-96231b3b80d8
void foo() {
G = 1;
}
would have an empty DSGraph even though G (a global) is directly used
in the function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20619 91177308-0d34-0410-b5e6-96231b3b80d8
using Function::arg_{iterator|begin|end}. Likewise Module::g* -> Module::global_*.
This patch is contributed by Gabor Greif, thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20597 91177308-0d34-0410-b5e6-96231b3b80d8
If we fold three constants together (c1+c2+c3), make sure to keep
LHSC updated, instead of reusing (in this case), the 1 instead of the
partial sum.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20337 91177308-0d34-0410-b5e6-96231b3b80d8
Actually teach dsa about select instructions. This doesn't affect the
graph in any way other than not setting a spurious U marker on pointer
nodes that are selected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20324 91177308-0d34-0410-b5e6-96231b3b80d8
X = gep null, ...
Used to not create a scalar map entry for X, which caused clients to barf.
This is bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20316 91177308-0d34-0410-b5e6-96231b3b80d8
infinite loops (using the new replaceSymbolicValuesWithConcrete method).
This patch reverts this patch:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050131/023830.html
... which was an attempted fix for this problem. Unfortunately, that patch
caused test/Regression/Transforms/IndVarsSimplify/exit_value_tests.llx to fail
and slightly castrated the entire analysis. This patch fixes it right.
This patch is dedicated to jeffc, for making me deal with this. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20146 91177308-0d34-0410-b5e6-96231b3b80d8
into a temporary graph, remember it for later, then inline the tmp graph into
the call site.
In the case where there are other call sites to the same set of functions, this
permits us to just inline the temporary graph instead of all of the callees.
This turns N*M inlining situations into an N+M inlining situation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20036 91177308-0d34-0410-b5e6-96231b3b80d8
* Change the FunctionCalls and AuxFunctionCalls vectors into std::lists.
This makes many operations on these lists much more natural, and avoids
*exteremely* expensive copying of DSCallSites (e.g. moving nodes around
between lists, erasing a node from not the end of the vector, etc).
With a profile build of analyze, this speeds up BU DS from 25.14s to
12.59s on 176.gcc. I expect that it would help TD even more, but I don't
have data for it.
This effectively eliminates removeIdenticalCalls and children from the
profile, going from 6.53 to 0.27s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19939 91177308-0d34-0410-b5e6-96231b3b80d8
not invalidated on entry and on exit of the block. This fixes some N^2
behavior in common cases, and speeds up gcc another 5% to 22.35s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19910 91177308-0d34-0410-b5e6-96231b3b80d8
will cause us to miss cases where the input pointer to a load could be value
numbered to another load. Something like this:
%X = load int* %P1
%Y = load int* %P2
Those are obviously the same if P1/P2 are the same. The code this patch
removes attempts to handle that. However, since GCSE iterates, this doesn't
actually buy us anything: GCSE will first replace P1 or P2 with the other
one, then the load can be value numbered as equal.
Removing this code speeds up gcse a lot. On 176.gcc in debug mode, this
speeds up gcse from 29.08s -> 25.73s, a 13% savings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19906 91177308-0d34-0410-b5e6-96231b3b80d8
If needed, this can be resurrected from CVS.
Note that several of the interfaces (e.g. the IPModRef ones) are supersumed
by generic AliasAnalysis interfaces that have been written since this code
was developed (and they are not DSA specific).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19864 91177308-0d34-0410-b5e6-96231b3b80d8
1. Rename createLoaderPass to CreateProfileLoaderPass
2. Opt shouldn't use the pass registered in CodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19431 91177308-0d34-0410-b5e6-96231b3b80d8
a function call at the core of the loop inline and removing unused
stack variables from an often called function. This doesn't improve things
much, the real saving will be by reducing the number of calls to this
function (100K+ when linking kimwitu++).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19119 91177308-0d34-0410-b5e6-96231b3b80d8
Regression/Analysis/GlobalsModRef/purecse.ll. Isn't this what the
-Woverload-whatever flag would warn about :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19018 91177308-0d34-0410-b5e6-96231b3b80d8
Make only one print method to avoid overloaded virtual warnings when \
compiled with -Woverloaded-virtual
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18589 91177308-0d34-0410-b5e6-96231b3b80d8
All SPEC CFP 95 programs now work, though the JIT isn't loading -lf2c right
so they aren't testing correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18499 91177308-0d34-0410-b5e6-96231b3b80d8
themselves. Make sure to update DSInfo correctly. This fixes a testcase
reduced from Prolangs-C++/objects
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17439 91177308-0d34-0410-b5e6-96231b3b80d8
function pointer equivalences. This fixes many problems, including a testcase
reduced Prolangs-C++/objects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17437 91177308-0d34-0410-b5e6-96231b3b80d8
a DSGraph at a time instead of a function at a time. This is also more
correct, though it doesn't seem to fix any programs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17435 91177308-0d34-0410-b5e6-96231b3b80d8
* *DO NOT* print CBU graphs when asked to print our own. This is just
FREAKING confusing and misleading: it's better to not print anything.
* Simplify and clean up some code
* Add some more paranoia assertion checking code that I found to track
down this bug:
* Fix a nasty bug that was causing us to crash on Prolangs-C++/objects,
where we were missing processing some graphs. This hunk is the bugfix:
- if (!I->isExternal() && !FoldedGraphsMap.count(I))
+ if (!I->isExternal() && !ValMap.count(I))
urg!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17386 91177308-0d34-0410-b5e6-96231b3b80d8
the CBU graphs, copy them instead of hacking on the CBU graphs.
Also, instead of forwarding request from ECGraphs clients to the CBU graphs
clients, service them ourselves.
Finally, remove a broken "optimization"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17378 91177308-0d34-0410-b5e6-96231b3b80d8
1. Calls to external global VARIABLES should not be treated as a call to an
external function
2. Efficiently deleting an element from a vector by using std::swap with
the back, then pop_back is NOT a good way to keep the vector sorted.
3. Our hope of having stuff get deleted by making them redundant just won't
work. In particular, if we have three calls in sequence that should be
merged: A, B, C first we unify B into A. To be sure that they appeared
identical (so B would be erased) we set B = A. On the next step, we
unified C into A and set C = A. Unfortunately, this is no guarantee that
C = B, so we would fail to delete the dead call. Switch to a more
explicit scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17357 91177308-0d34-0410-b5e6-96231b3b80d8
* change some uses of NH.getNode() in a bool context to use !NH.isNull()
* Fix a bunch of places where we depended on the (undefined) order of
evaluation of arguments to function calls to ensure that getNode() was
called before getOffset(). In practice, this was NOT happening.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17354 91177308-0d34-0410-b5e6-96231b3b80d8
to go in. This patch allows us to compute the trip count of loops controlled
by values loaded from constant arrays. The cannonnical example of this is
strlen when passed a constant argument:
for (int i = 0; "constantstring"[i]; ++i) ;
return i;
In this case, it will compute that the loop executes 14 times, which means
that the exit value of i is 14. Because of this, the loop gets DCE'd and
we are happy. This also applies to anything that does similar things, e.g.
loops like this:
const float Array[] = { 0.1, 2.1, 3.2, 23.21 };
for (int i = 0; Array[i] < 20; ++i)
and is actually fairly general.
The problem with this is that it almost never triggers. The reason is that
we run indvars and the loop optimizer only at compile time, which is before
things like strlen and strcpy have been inlined into the program from libc.
Because of this, it almost never is used (it triggers twice in specint2k).
I'm committing it because it DOES work, may be useful in the future, and
doesn't slow us down at all. If/when we start running the loop optimizer
at link-time (-O4?) this will be very nice indeed :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16926 91177308-0d34-0410-b5e6-96231b3b80d8