larger memsets. Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:
_mad_synth_mute: ## @mad_synth_mute
## BB#0: ## %entry
pushq %rax
movl $4096, %esi ## imm = 0x1000
callq ___bzero
popq %rax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123089 91177308-0d34-0410-b5e6-96231b3b80d8
that it was leaving in loops after rotation (between the original latch
block and the original header.
With this change, it is possible for rotated loops to have just a single
basic block, which is useful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123075 91177308-0d34-0410-b5e6-96231b3b80d8
1. Rip out LoopRotate's domfrontier updating code. It isn't
needed now that LICM doesn't use DF and it is super complex
and gross.
2. Make DomTree updating code a lot simpler and faster. The
old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
SplitCriticalEdge instead of doing an overcomplex
reimplementation of it.
No behavior change, except for the name of the inserted preheader.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123072 91177308-0d34-0410-b5e6-96231b3b80d8
they all ready do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.
The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123064 91177308-0d34-0410-b5e6-96231b3b80d8
them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops. This also better exposes the
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.
Not aggressively handling this is pessimizing later loop optimizations
somethin' fierce by making "dominates all exit blocks" checks fail.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123060 91177308-0d34-0410-b5e6-96231b3b80d8
1. Take a flags argument instead of a bool. This makes
it more clear to the reader what it is used for.
2. Add a flag that says that "remapping a value not in the
map is ok".
3. Reimplement MapValue to share a bunch of code and be a lot
more efficient. For lookup failures, don't drop null values
into the map.
4. Using the new flag a bunch of code can vaporize in LinkModules
and LoopUnswitch, kill it.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123058 91177308-0d34-0410-b5e6-96231b3b80d8
map from ValueMapper.h (giving us access to its utilities)
and add a fastpath in the loop rotation code, avoiding expensive
ssa updator manipulation for values with nothing to update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123057 91177308-0d34-0410-b5e6-96231b3b80d8
into a separate function, so that it can be called from a loop using a worklist
rather than a loop traversing a whole basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122943 91177308-0d34-0410-b5e6-96231b3b80d8
step is to only process instructions in subloops if they have been modified by
an earlier simplification.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122869 91177308-0d34-0410-b5e6-96231b3b80d8
skipping them, but it should probably use a worklist and only revisit those
instructions in subloops that have actually changed. It should probably also
use a worklist after the first iteration like instsimplify now does. Regardless,
it's only 0.3% of opt -O2 time on 403.gcc if it replaces the instcombine placed
in the middle of the loop passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122868 91177308-0d34-0410-b5e6-96231b3b80d8
fewer things into the value numbering maps, but any speedup is beneath the noise threshold on my machine
on 403.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122844 91177308-0d34-0410-b5e6-96231b3b80d8
avoids adding them to the various value numbering tables, resulting in a minor (~3%) speedup for GVN
on 40.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122819 91177308-0d34-0410-b5e6-96231b3b80d8
when safe.
The testcase is basically this nested loop:
void foo(char *X) {
for (int i = 0; i != 100; ++i)
for (int j = 0; j != 100; ++j)
X[j+i*100] = 0;
}
which gets turned into a single memset now. clang -O3 doesn't optimize
this yet though due to a phase ordering issue I haven't analyzed yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122806 91177308-0d34-0410-b5e6-96231b3b80d8
instruction *after* the store. The store will always be deleted
if the transformation kicks in, so we'd do an N^2 scan of every
loop block. Whoops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122805 91177308-0d34-0410-b5e6-96231b3b80d8
FunctionPass. It probably doesn't have a reason to be a LoopPass, as it will
probably drop the simple fixed point and either use RPO iteration or Duncan's
approach in instsimplify of only revisiting instructions that have changed.
The next step is to preserve LoopSimplify. This looks like it won't be too hard,
although the pass manager doesn't actually seem to respect when non-loop passes
claim to preserve LCSSA or LoopSimplify. This will have to be fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122791 91177308-0d34-0410-b5e6-96231b3b80d8
that are allowed to have metadata operands are intrinsic calls,
and the only ones that take metadata currently return void.
Just reject all void instructions, which should not be value
numbered anyway. To future proof things, add an assert to the
getHashValue impl for calls to check that metadata operands
aren't present.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122759 91177308-0d34-0410-b5e6-96231b3b80d8
nested values, so they can change and drop to null, which can
change the hash and cause havok.
It turns out that it isn't a good idea to value number stuff
with metadata operands anyway, so... don't.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122758 91177308-0d34-0410-b5e6-96231b3b80d8
capacity on the Visited SmallPtrSet. On 403.gcc, this is about a 4.5% speedup of
CodeGenPrepare time (which itself is 10% of time spent in the backend).
This is progress towards PR8889.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122741 91177308-0d34-0410-b5e6-96231b3b80d8
On 176.gcc, this catches 13090 loads and calls, and increases the
number of simple instructions CSE'd from 29658 to 36208.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122727 91177308-0d34-0410-b5e6-96231b3b80d8
of instcombine that is currently in the middle of the loop pass pipeline. This
commit only checks in the pass; it will hopefully be enabled by default later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122719 91177308-0d34-0410-b5e6-96231b3b80d8
sure that the loop we're promoting into a memcpy doesn't mutate the input
of the memcpy. Before we were just checking that the dest of the memcpy
wasn't mod/ref'd by the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122712 91177308-0d34-0410-b5e6-96231b3b80d8
blocks in a loop, instead of just the header block. This makes it more
aggressive, able to handle Duncan's Ada examples.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122704 91177308-0d34-0410-b5e6-96231b3b80d8
isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was
*just* a tree and didn't have DFS numbers. Checking DFS numbers is faster
and easier than "limiting the search of the tree".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122702 91177308-0d34-0410-b5e6-96231b3b80d8
header for now for memset/memcpy opportunities. It turns out that loop-rotate
is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for
loops" into 2 basic block loops that loop-idiom was ignoring.
With this fix, we form many *many* more memcpy and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:
for (j=0; j<MAX_history; ++j) {
history_new[i][j+1] = history[2*i][j];
}
Transforming these loops into memcpy's speeds up the viterbi benchmark from
11.98s to 3.55s on my machine. Woo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122685 91177308-0d34-0410-b5e6-96231b3b80d8
size of a loop header instead of its own code size estimator.
This allows it to handle bitcasts etc more precisely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122681 91177308-0d34-0410-b5e6-96231b3b80d8
pipeline to be caught by instcombine, and it's not feasible to catch them in SimplifyCFG because the
use-lists are in an inconsistent state at the point where it could know that it need to simplify them.
Instead, have CodeGenPrepare look for trivially redundant PHIs as part of its general cleanup effort.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122516 91177308-0d34-0410-b5e6-96231b3b80d8
I still think that LVI should be handling this, but that capability is some ways off in the future,
and this matters for some significant benchmarks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122378 91177308-0d34-0410-b5e6-96231b3b80d8
which have trapping constant exprs in them due to PHI nodes.
Eliminating them can cause the constant expr to be evalutated
on new paths if the input edges are critical.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122164 91177308-0d34-0410-b5e6-96231b3b80d8
a null endptr argument, because they may write to errno.
This fixes a seflhost miscompile observed on Linux targets when TBAA
was enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122014 91177308-0d34-0410-b5e6-96231b3b80d8
When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121859 91177308-0d34-0410-b5e6-96231b3b80d8
The last uses of these functions were removed in r113852 when LazyValueInfo was permanently enabled and removed the need for them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121133 91177308-0d34-0410-b5e6-96231b3b80d8
zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method
trunc(), to be const and to return a new value instead of modifying the
object in place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121120 91177308-0d34-0410-b5e6-96231b3b80d8
memcpy's like:
memcpy(A, B)
memcpy(A, C)
we cannot delete the first memcpy as dead if A and C might be aliases.
If so, we actually get:
memcpy(A, B)
memcpy(A, A)
which is not correct to transform into:
memcpy(A, A)
This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks
Jakub!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120974 91177308-0d34-0410-b5e6-96231b3b80d8
Should have no functional change other than the order of two transformations that are mutually-exclusive and the exact formatting of debug output.
Internally, it now stores the ConstantInt*s as Constant*s, and actual undef values instead of nulls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120946 91177308-0d34-0410-b5e6-96231b3b80d8
20040709-1.c from the gcc testsuite. I was using the size of a
pointer instead of the pointee. This fixes rdar://8713376
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120519 91177308-0d34-0410-b5e6-96231b3b80d8
may-aliasing stores that partially overlap with different base
pointers. This implements PR6043 and the non-variable part of
PR8657
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120485 91177308-0d34-0410-b5e6-96231b3b80d8
1. if the underlying pointer passed in can be resolved
to any argument or alloca, then we don't need to scan.
Previously we would only avoid the scan if the alloca
or byval was actually considered dead.
2. The dead store processing code is itself completely
dead and didn't handle volatile stores right anyway,
so delete it. This allows simplifying the interface
to RemoveAccessedObjects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120467 91177308-0d34-0410-b5e6-96231b3b80d8
made sense to me. We now have a set of dead stack objects, and
they become live when loaded. Fix a theoretical problem where
we'd pass in the wrong pointer to the alias query.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120465 91177308-0d34-0410-b5e6-96231b3b80d8
If the call might read all the allocas, stop scanning early.
Convert a vector to smallvector, shrink SmallPtrSet to 16 instead
of 64 to avoid crazy linear scans.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120463 91177308-0d34-0410-b5e6-96231b3b80d8
now that DSE hacks on them. This fixes a regression I introduced,
by generalizing DSE to hack on transfers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120445 91177308-0d34-0410-b5e6-96231b3b80d8
about pairs of AA::Location's instead of looking for MemDep's
"Def" predicate. This is more powerful and general, handling
memset/memcpy/store all uniformly, and implementing PR8701 and
probably obsoleting parts of memcpyoptimizer.
This also fixes an obscure bug with init.trampoline and i8
stores, but I'm not surprised it hasn't been hit yet. Enhancing
init.trampoline to carry the size that it stores would allow
DSE to be much more aggressive about optimizing them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120406 91177308-0d34-0410-b5e6-96231b3b80d8
contains "ref".
Enhance DSE to use a modref query instead of a store-specific hack
to generalize the "ignore may-alias stores" optimization to handle
memset and memcpy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120368 91177308-0d34-0410-b5e6-96231b3b80d8
1. Don't bother trying to optimize:
lifetime.end(ptr)
store(ptr)
as it is undefined, and therefore shouldn't exist.
2. Move the 'storing a loaded pointer' xform up, simplifying
the may-aliased store code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120359 91177308-0d34-0410-b5e6-96231b3b80d8
by my recent GVN improvement. Looking through a single layer of
PHI nodes when attempting to sink GEPs, we need to iteratively
look through arbitrary PHI nests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120202 91177308-0d34-0410-b5e6-96231b3b80d8
method in MemDep instead of inserting an instruction, doing a query,
then removing it. Neither operation is effectively cached.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119930 91177308-0d34-0410-b5e6-96231b3b80d8
allowing the memcpy to be eliminated.
Unfortunately, the requirements on byval's without explicit
alignment are really weak and impossible to predict in the
mid-level optimizer, so this doesn't kick in much with current
frontends. The fix is to change clang to set alignment on all
byval arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119916 91177308-0d34-0410-b5e6-96231b3b80d8
if all the operands of the PHI are equivalent. This allows CodeGenPrepare to undo
unprofitable PRE transforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119853 91177308-0d34-0410-b5e6-96231b3b80d8
preserves LCSSA form out of ScalarEvolution and into the LoopInfo
class. Use it to check that SimplifyInstruction simplifications
are not breaking LCSSA form. Fixes PR8622.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119727 91177308-0d34-0410-b5e6-96231b3b80d8
this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum
if the initial lookup failed. This led to really bad performance on tall, narrow CFGs.
We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually
represented by a hashtable with a list of Value*'s as the value type), and then
determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by
DominatorTree. Because there are typically few duplicates of a given value, this scan tends to be
quite fast. Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary
allocation in representing the value-side of the multimap.
This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I
think is pretty good considering that includes all the "real work" being done by MemDep as well.
The one downside to this approach is that we can no longer use GVN to perform simple conditional progation,
but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up
the slack. If you see conditional propagation that's not happening, please file bugs against LVI or CVP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119714 91177308-0d34-0410-b5e6-96231b3b80d8
refusing to optimize two memcpy's like this:
copy A <- B
copy C <- A
if it couldn't prove that noalias(B,C). We can eliminate
the copy by producing a memmove instead of memcpy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119694 91177308-0d34-0410-b5e6-96231b3b80d8
there is no need to check to see if the source and dest of a memcpy are noalias,
behavior is undefined if not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119691 91177308-0d34-0410-b5e6-96231b3b80d8
if it is passed as a byval argument. The byval argument will just be a
read, so it is safe to read from the original global instead. This allows
us to promote away the %agg.tmp alloca in PR8582
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119686 91177308-0d34-0410-b5e6-96231b3b80d8
systematically, CollapsePhi will always return null here. Note
that CollapsePhi did an extra check, isSafeReplacement, which
the SimplifyInstruction logic does not do. I think that check
was bogus - I guess we will soon find out! (It was originally
added in commit 41998 without a testcase).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119456 91177308-0d34-0410-b5e6-96231b3b80d8
"%z = %x and %y". If GVN can prove that %y equals %x, then it turns
this into "%z = %x and %x". With the new code, %z will be replaced
with %x everywhere (and then deleted). Previously %z would be value
numbered too, which is a waste of time. Also, while a clever value
numbering algorithm would give %z the same value number as %x, our
current one doesn't do so (at least I don't think it does). The new
logic has an essentially equivalent effect to what you would get if
%z was given the same value number as %x, i.e. it should make value
numbering smarter. While there, get hold of target data once at the
start rather than a gazillion times all over the place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118923 91177308-0d34-0410-b5e6-96231b3b80d8
references. For example, this allows gvn to eliminate the load in
this example:
void foo(int n, int* p, int *q) {
p[0] = 0;
p[1] = 1;
if (n) {
*q = p[0];
}
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118714 91177308-0d34-0410-b5e6-96231b3b80d8
needs to be guaranteed never to be run on an unreachable block. However, earlier block simplifications may have
changed the CFG to make block that were reachable when we began our iteration unreachable by the time we try to
simplify them. (Note that this also means that our depth-first iterators were potentially being invalidated).
This should not have a large impact on code quality, since later runs of instcombine should pick up these simplifications.
Fixes PR8506.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117709 91177308-0d34-0410-b5e6-96231b3b80d8
must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize
the pass's dependencies.
Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.
I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems
with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass
registration/creation, please send the testcase to me directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116820 91177308-0d34-0410-b5e6-96231b3b80d8
perform initialization without static constructors AND without explicit initialization
by the client. For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve. I hope to be able to relax
the latter requirement in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116334 91177308-0d34-0410-b5e6-96231b3b80d8
formulae which become illegal as a result of the offset updating don't
escape.
This is for rdar://8529692. No testcase yet, because the given cases
hit use-list ordering differences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116093 91177308-0d34-0410-b5e6-96231b3b80d8