may-aliasing stores that partially overlap with different base
pointers. This implements PR6043 and the non-variable part of
PR8657
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120485 91177308-0d34-0410-b5e6-96231b3b80d8
1. if the underlying pointer passed in can be resolved
to any argument or alloca, then we don't need to scan.
Previously we would only avoid the scan if the alloca
or byval was actually considered dead.
2. The dead store processing code is itself completely
dead and didn't handle volatile stores right anyway,
so delete it. This allows simplifying the interface
to RemoveAccessedObjects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120467 91177308-0d34-0410-b5e6-96231b3b80d8
made sense to me. We now have a set of dead stack objects, and
they become live when loaded. Fix a theoretical problem where
we'd pass in the wrong pointer to the alias query.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120465 91177308-0d34-0410-b5e6-96231b3b80d8
If the call might read all the allocas, stop scanning early.
Convert a vector to smallvector, shrink SmallPtrSet to 16 instead
of 64 to avoid crazy linear scans.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120463 91177308-0d34-0410-b5e6-96231b3b80d8
now that DSE hacks on them. This fixes a regression I introduced,
by generalizing DSE to hack on transfers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120445 91177308-0d34-0410-b5e6-96231b3b80d8
about pairs of AA::Location's instead of looking for MemDep's
"Def" predicate. This is more powerful and general, handling
memset/memcpy/store all uniformly, and implementing PR8701 and
probably obsoleting parts of memcpyoptimizer.
This also fixes an obscure bug with init.trampoline and i8
stores, but I'm not surprised it hasn't been hit yet. Enhancing
init.trampoline to carry the size that it stores would allow
DSE to be much more aggressive about optimizing them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120406 91177308-0d34-0410-b5e6-96231b3b80d8
contains "ref".
Enhance DSE to use a modref query instead of a store-specific hack
to generalize the "ignore may-alias stores" optimization to handle
memset and memcpy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120368 91177308-0d34-0410-b5e6-96231b3b80d8
1. Don't bother trying to optimize:
lifetime.end(ptr)
store(ptr)
as it is undefined, and therefore shouldn't exist.
2. Move the 'storing a loaded pointer' xform up, simplifying
the may-aliased store code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120359 91177308-0d34-0410-b5e6-96231b3b80d8
by my recent GVN improvement. Looking through a single layer of
PHI nodes when attempting to sink GEPs, we need to iteratively
look through arbitrary PHI nests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120202 91177308-0d34-0410-b5e6-96231b3b80d8
method in MemDep instead of inserting an instruction, doing a query,
then removing it. Neither operation is effectively cached.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119930 91177308-0d34-0410-b5e6-96231b3b80d8
allowing the memcpy to be eliminated.
Unfortunately, the requirements on byval's without explicit
alignment are really weak and impossible to predict in the
mid-level optimizer, so this doesn't kick in much with current
frontends. The fix is to change clang to set alignment on all
byval arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119916 91177308-0d34-0410-b5e6-96231b3b80d8
if all the operands of the PHI are equivalent. This allows CodeGenPrepare to undo
unprofitable PRE transforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119853 91177308-0d34-0410-b5e6-96231b3b80d8
preserves LCSSA form out of ScalarEvolution and into the LoopInfo
class. Use it to check that SimplifyInstruction simplifications
are not breaking LCSSA form. Fixes PR8622.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119727 91177308-0d34-0410-b5e6-96231b3b80d8
this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum
if the initial lookup failed. This led to really bad performance on tall, narrow CFGs.
We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually
represented by a hashtable with a list of Value*'s as the value type), and then
determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by
DominatorTree. Because there are typically few duplicates of a given value, this scan tends to be
quite fast. Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary
allocation in representing the value-side of the multimap.
This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I
think is pretty good considering that includes all the "real work" being done by MemDep as well.
The one downside to this approach is that we can no longer use GVN to perform simple conditional progation,
but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up
the slack. If you see conditional propagation that's not happening, please file bugs against LVI or CVP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119714 91177308-0d34-0410-b5e6-96231b3b80d8
refusing to optimize two memcpy's like this:
copy A <- B
copy C <- A
if it couldn't prove that noalias(B,C). We can eliminate
the copy by producing a memmove instead of memcpy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119694 91177308-0d34-0410-b5e6-96231b3b80d8
there is no need to check to see if the source and dest of a memcpy are noalias,
behavior is undefined if not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119691 91177308-0d34-0410-b5e6-96231b3b80d8
if it is passed as a byval argument. The byval argument will just be a
read, so it is safe to read from the original global instead. This allows
us to promote away the %agg.tmp alloca in PR8582
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119686 91177308-0d34-0410-b5e6-96231b3b80d8
systematically, CollapsePhi will always return null here. Note
that CollapsePhi did an extra check, isSafeReplacement, which
the SimplifyInstruction logic does not do. I think that check
was bogus - I guess we will soon find out! (It was originally
added in commit 41998 without a testcase).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119456 91177308-0d34-0410-b5e6-96231b3b80d8
"%z = %x and %y". If GVN can prove that %y equals %x, then it turns
this into "%z = %x and %x". With the new code, %z will be replaced
with %x everywhere (and then deleted). Previously %z would be value
numbered too, which is a waste of time. Also, while a clever value
numbering algorithm would give %z the same value number as %x, our
current one doesn't do so (at least I don't think it does). The new
logic has an essentially equivalent effect to what you would get if
%z was given the same value number as %x, i.e. it should make value
numbering smarter. While there, get hold of target data once at the
start rather than a gazillion times all over the place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118923 91177308-0d34-0410-b5e6-96231b3b80d8
references. For example, this allows gvn to eliminate the load in
this example:
void foo(int n, int* p, int *q) {
p[0] = 0;
p[1] = 1;
if (n) {
*q = p[0];
}
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118714 91177308-0d34-0410-b5e6-96231b3b80d8
needs to be guaranteed never to be run on an unreachable block. However, earlier block simplifications may have
changed the CFG to make block that were reachable when we began our iteration unreachable by the time we try to
simplify them. (Note that this also means that our depth-first iterators were potentially being invalidated).
This should not have a large impact on code quality, since later runs of instcombine should pick up these simplifications.
Fixes PR8506.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117709 91177308-0d34-0410-b5e6-96231b3b80d8
must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize
the pass's dependencies.
Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.
I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems
with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass
registration/creation, please send the testcase to me directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116820 91177308-0d34-0410-b5e6-96231b3b80d8
perform initialization without static constructors AND without explicit initialization
by the client. For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve. I hope to be able to relax
the latter requirement in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116334 91177308-0d34-0410-b5e6-96231b3b80d8
formulae which become illegal as a result of the offset updating don't
escape.
This is for rdar://8529692. No testcase yet, because the given cases
hit use-list ordering differences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116093 91177308-0d34-0410-b5e6-96231b3b80d8
This doesn't usually matter, because the other heuristics usually
succeed regardless, but it's good to keep the register use
bookkeeping consistent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116005 91177308-0d34-0410-b5e6-96231b3b80d8
initialization functions that initialize the set of passes implemented in
that library. Add C bindings for these functions as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115927 91177308-0d34-0410-b5e6-96231b3b80d8
Anyone interested in more general PRE would be better served by implementing it separately, to get real
anticipation calculation, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115337 91177308-0d34-0410-b5e6-96231b3b80d8
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.
Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics.
MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces. Optimizations may occur on these forms and the
result casted back to x86_mmx, provided the result feeds into a
previous existing x86_mmx operation.
The point of all this is prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115243 91177308-0d34-0410-b5e6-96231b3b80d8
code size (making this transform code size neutral), and it allows us to hoist values out of loops, which is always
a good thing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115205 91177308-0d34-0410-b5e6-96231b3b80d8
Because of this, we cannot use the Simplify* APIs, as they can assert-fail on unreachable code. Since it's not easy to determine
if a given threading will cause a block to become unreachable, simply defer simplifying simplification to later InstCombine and/or
DCE passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115082 91177308-0d34-0410-b5e6-96231b3b80d8
register pressure and thus excess spills, which we don't currently recover from well. This should
be re-evaluated in the future if our ability to generate good spills/splits improves.
Partial fix for <rdar://problem/7635585>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114919 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts revision 114633. It was breaking llvm-gcc-i386-linux-selfhost.
It seems there is a downstream bug that is exposed by
-cgp-critical-edge-splitting=0. When that bug is fixed, this patch can go back
in.
Note that the changes to tailcallfp2.ll are not reverted. They were good are
required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114859 91177308-0d34-0410-b5e6-96231b3b80d8
Splitting critical edges at the merge point only addressed part of the issue; it is also possible for non-post-domination
to occur when the path from the load to the merge has branches in it. Unfortunately, full anticipation analysis is
time-consuming, so for now approximate it. This is strictly more conservative than real anticipation, so we will miss
some cases that real PRE would allow, but we also no longer insert loads into paths where they didn't exist before. :-)
This is a very slight net positive on SPEC for me (0.5% on average). Most of the benchmarks are largely unaffected, but
when it pays off it pays off decently: 181.mcf improves by 4.5% on my machine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114785 91177308-0d34-0410-b5e6-96231b3b80d8