with a fix
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101397 91177308-0d34-0410-b5e6-96231b3b80d8
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101364 91177308-0d34-0410-b5e6-96231b3b80d8
numerator is an induction variable. For example, with code like this:
for (i=0;i<n;++i)
x[i%n] = 0;
IndVarSimplify will now recognize that i is always less than n inside
the loop, and eliminate the remainder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101113 91177308-0d34-0410-b5e6-96231b3b80d8
expression is a UDiv and it doesn't appear that the UDiv came from
the user's source.
ScalarEvolution has recently figured out how to compute a tripcount
expression for the inner loop in
SingleSource/Benchmarks/Shootout/sieve.c, using a udiv. Emitting a
udiv instruction dramatically slows down the enclosing loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101068 91177308-0d34-0410-b5e6-96231b3b80d8
a ScalarEvolution bug with overflow handling is fixed, the normal analysis
code will automatically decline to operate on the icmp instructions which
are responsible for the loop exit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101032 91177308-0d34-0410-b5e6-96231b3b80d8
instead of deleting just the user. This makes it more consistent with
other code in IndVarSimplify, and theoretically can eliminate more users
earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101027 91177308-0d34-0410-b5e6-96231b3b80d8
the loop exit test. This usually doesn't come up for a variety of
reasons, but it isn't impossible, so make IndVarSimplify handle it
conservatively.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101008 91177308-0d34-0410-b5e6-96231b3b80d8
variables. For example, with code like this:
for (i=0;i<n;++i)
if (i<n)
x[i] = 0;
IndVarSimplify will now recognize that i is always less than n inside
the loop, and eliminate the if.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101000 91177308-0d34-0410-b5e6-96231b3b80d8
into adjacent loops. Also, ensure that the insert position is
dominated by the loop latch of any loop in the post-inc set which
has a latch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100906 91177308-0d34-0410-b5e6-96231b3b80d8
forced constant is changed to a constant, we would end
up adding the instruction to the wrong worklist,
preventing it from being properly revisited. This fixes
rdar://7832370
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100837 91177308-0d34-0410-b5e6-96231b3b80d8
explicitly split into stride-and-offset pairs. Also, add the
ability to track multiple post-increment loops on the same expression.
This refines the concept of "normalizing" SCEV expressions used for
to post-increment uses, and introduces a dedicated utility routine for
normalizing and denormalizing expressions.
This fixes the expansion of expressions which are post-increment users
of more than one loop at a time. More broadly, this takes LSR another
step closer to being able to reason about more than one loop at a time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100699 91177308-0d34-0410-b5e6-96231b3b80d8
undefs in branches/switches, we have two cases: a branch on a literal
undef or a branch on a symbolic value which is undef. If we have a
literal undef, the code was correct: forcing it to a constant is the
right thing to do.
If we have a branch on a symbolic value that is undef, we should force
the symbolic value to a constant, which then makes the successor block
live. Forcing the condition of the branch to being a constant isn't
safe if later paths become live and the value becomes overdefined. This
is the case that 'forcedconstant' is designed to handle, so just use it.
This fixes rdar://7765019 but there is no good testcase for this, the
one I have is too insane to be useful in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100478 91177308-0d34-0410-b5e6-96231b3b80d8
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100304 91177308-0d34-0410-b5e6-96231b3b80d8
exits the loop. With this information we can guarantee
the iteration count of the loop is bounded by the
compare. I think this xforms is finally safe now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100285 91177308-0d34-0410-b5e6-96231b3b80d8
checker. Amusingly, we already had tests that we should
have rejects because they would be miscompiled in the
testsuite.
The remaining issue with this is that we don't check that
the branch causes us to exit the loop if it fails, so we
don't actually know if we remain in bounds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100284 91177308-0d34-0410-b5e6-96231b3b80d8
to a signed vs unsigned value depending on the sign of the
constant fp means that we can't distinguish between a
truly negative number and a positive number so large the
32nd bit is set. So, do don't this!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100283 91177308-0d34-0410-b5e6-96231b3b80d8
the required validity checks in the first place, and supporting
a condition large enough to require the 32'nd bit isn't worth it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100280 91177308-0d34-0410-b5e6-96231b3b80d8
this cleans up a bunch of code and also fixes several crashes and
miscompiles. More to come unfortunately, this optimization
is quite broken.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100270 91177308-0d34-0410-b5e6-96231b3b80d8
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100191 91177308-0d34-0410-b5e6-96231b3b80d8
is necessary. Inherits from new templated baseclass CallSiteBase<>
which is highly customizable. Base CallSite on it too, in a configuration
that allows full mutation.
Adapt some call sites in analyses to employ ImmutableCallSite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100100 91177308-0d34-0410-b5e6-96231b3b80d8
generate wrong code pretty much anywhere AFAICT.
A case that hits the bug reproducibly is impossible,
but the situation was like this:
Addr = ...
Store -> Addr
Addr2 = GEP , 0, 0
Store -> Addr2
Handling the first store, the code changed replaced Addr
with a sunkaddr and deleted Addr, but not its table
entry. Code in OptimizedBlock replaced Addr2 with a
bitcast; if that happened to reuse the memory of Addr,
the old table entry was erroneously found when handling
the second store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100044 91177308-0d34-0410-b5e6-96231b3b80d8
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
A update of langref will occur in a subsequent checkin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99928 91177308-0d34-0410-b5e6-96231b3b80d8
pointer. There was also a SmallPtrSet whose settiness wasn't being used, so I
changed it to a SmallVector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99713 91177308-0d34-0410-b5e6-96231b3b80d8
I have audited all getOperandNo calls now, fixing
hidden assumptions. CallSite related uglyness will
be eliminated successively.
Note this patch has a long and griveous history,
for all the back-and-forths have a look at
CallSite.h's log.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99399 91177308-0d34-0410-b5e6-96231b3b80d8
so that the SCEVExpander doesn't retain a dangling pointer as its
insert position. The dangling pointer in this case wasn't ever used
to insert new instructions, but it was causing trouble with
SCEVExpander's code for automatically advancing its insert position
past debug intrinsics.
This fixes use-after-free errors that valgrind noticed in
test/Transforms/IndVarSimplify/2007-06-06-DeleteDanglesPtr.ll and
test/Transforms/IndVarSimplify/exit_value_tests.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99036 91177308-0d34-0410-b5e6-96231b3b80d8
This time I did a self-hosted bootstrap on Linux x86-64,
with no problems. Let's see how darwin 64-bit self-hosting
goes. At the first sign of failure I'll back this out.
Maybe the valgrind bots give me a hint of what may be wrong
(it at all).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98957 91177308-0d34-0410-b5e6-96231b3b80d8
out the remainder of the calls that we should lower in some way and
move the tests to the new correct directory. Fix up tests that are now
optimized more than they were before by -instcombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97875 91177308-0d34-0410-b5e6-96231b3b80d8
can be used in more places. Add an argument for the TargetData that
most of them need. Update for the getInt8PtrTy() change. Should be
no functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97844 91177308-0d34-0410-b5e6-96231b3b80d8
predecessors before returning. Otherwise, if multiple predecessor edges need
splitting, we only get one of them per iteration. This makes a small but
measurable compile time improvement with -enable-full-load-pre.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97521 91177308-0d34-0410-b5e6-96231b3b80d8
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.
Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97126 91177308-0d34-0410-b5e6-96231b3b80d8
argument is non-null, pass it along to PHITranslateSubExpr so that it can
prefer using existing values that dominate the PredBB, instead of just
blindly picking the first equivalent value that it finds on a uselist.
Also when the DominatorTree is specified, have PHITranslateValue filter
out any result that does not dominate the PredBB. This is basically just
refactoring the check that used to be in GetAvailablePHITranslatedSubExpr
and also in GVN.
Despite my initial expectations, this change does not affect the results
of GVN for any testcases that I could find, but it should help compile time.
Before this change, if PHITranslateSubExpr picked a value that does not
dominate, PHITranslateWithInsertion would then insert a new value, which GVN
would later determine to be redundant and would replace. By picking a good
value to begin with, we save GVN the extra work of inserting and then
replacing a new value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97010 91177308-0d34-0410-b5e6-96231b3b80d8
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96774 91177308-0d34-0410-b5e6-96231b3b80d8
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96629 91177308-0d34-0410-b5e6-96231b3b80d8
a loop exit value, so that if a loop gets deleted, ScalarEvolution
isn't stick holding on to dangling SCEVAddRecExprs for that loop. This
fixes PR6339.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96626 91177308-0d34-0410-b5e6-96231b3b80d8
and T->isPointerTy(). Convert most instances of the first form to the second form.
Requested by Chris.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96344 91177308-0d34-0410-b5e6-96231b3b80d8
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96308 91177308-0d34-0410-b5e6-96231b3b80d8
with multiplication by constants distributed through, occasionally
those subexpressions can include both x and -x. For now, if this
condition is discovered within LSR, just prune such cases away,
as they won't be profitable. This fixes a "zero allocated in a
base register" assertion failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96177 91177308-0d34-0410-b5e6-96231b3b80d8
and add a doxygen comment.
Cache the phi entry to avoid doing tons of
PHINode::getBasicBlockIndex calls in the common case.
On my insane testcase from re2c, this speeds up CGP from
617.4s to 7.9s (78x).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96083 91177308-0d34-0410-b5e6-96231b3b80d8
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.
This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95975 91177308-0d34-0410-b5e6-96231b3b80d8
block. Other blocks may have pointer cycles that will crash
basicaa and other alias analyses. In any case, there is no
point wasting cycles optimizing dead blocks. This fixes
rdar://7635088
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95852 91177308-0d34-0410-b5e6-96231b3b80d8
Initial skeleton and SCEVUnknown lowering implemented,
the rest should come relatively quickly. Move testcase
to new directory.
Move pass to right before SimplifyLibCalls - which is
moved down a bit so we can take advantage of a few opts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95628 91177308-0d34-0410-b5e6-96231b3b80d8
container data. This prevents it from holding onto dangling
pointers and potentially behaving unpredictably.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95409 91177308-0d34-0410-b5e6-96231b3b80d8
short-circuited conditions to AND/OR expressions, and those expressions
are often converted back to a short-circuited form in code gen. The
original source order may have been optimized to take advantage of the
expected values, and if we reassociate them, we change the order and
subvert that optimization. Radar 7497329.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95333 91177308-0d34-0410-b5e6-96231b3b80d8
The SRThreshold value makes perfect sense for checking if an entire aggregate
should be promoted to a scalar integer, but it is not so good for splitting
an aggregate into its separate elements. A struct may contain a large embedded
array along with some scalar fields that would benefit from being split apart
by SROA. Even if the total aggregate size is large, it may still be good to
perform SROA. Thus, the most important piece of this patch is simply moving
the aggregate size comparison vs. SRThreshold so that it guards only the
aggregate promotion.
We have also been checking the number of elements to decide if an aggregate
should be split up. The limit of "SRThreshold/4" seemed rather arbitrary,
and I don't think it's very useful to derive this limit from SRThreshold
anyway. I've collected some data showing that the current default limit of
32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct
types. One thing suggested by the data is that distinguishing between structs
and arrays might be useful. There are (obviously) a lot more large arrays
than large structs (as measured by the number of elements and not the total
size -- a large array inside a struct still counts as a single element given
the way we do SROA right now). Out of 8377 arrays where we successfully
performed SROA while compiling a large set of benchmarks, only 16 of them had
more than 8 elements. And, for those 16 arrays, it's not at all clear that
SROA was actually beneficial. So, to offset the compile time cost of
investigating more large structs for SROA, the patch lowers the limit on array
elements to 8.
This fixes Apple Radar 7563690.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95224 91177308-0d34-0410-b5e6-96231b3b80d8