the LHS and RHS of an and/or instruction, don't multiply add
known predecessor values. This fixes the crash on testcase
from PR7498
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108114 91177308-0d34-0410-b5e6-96231b3b80d8
operation, but the way it's implemented requires the operation to also be
commutative. So add a check for commutativity (and tweak the corresponding
comments). This makes no difference in practice since every associative
LLVM instruction is also commutative! Here's an example to show the need
for commutativity: the accum_recursion.ll testcase calculates the factorial
function. Before the transformation the result of a call is
((((1*1)*2)*3)...)*x
while afterwards it is
(((1*x)*(x-1))...*2)*1
which clearly requires both associativity and commutativity of * to be equal
to the original.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108056 91177308-0d34-0410-b5e6-96231b3b80d8
have any effect, and second, deleting stores can potentially invalidate
an AliasAnalysis, and there's currently no notification for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107496 91177308-0d34-0410-b5e6-96231b3b80d8
large integers, the first inserted value would always create
an 'or X, 0'. Even though this is trivially zapped by
instcombine, don't bother creating this pointless instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106979 91177308-0d34-0410-b5e6-96231b3b80d8
the returned value after the tail call if it differs from other return
values. The optimal thing to do would be to introduce a phi node for
the return value, but for the moment just fix the miscompile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106947 91177308-0d34-0410-b5e6-96231b3b80d8
SCEVUnknown values which are loop-variant, as LSR can't do anything
interesting with these values in any case. This fixes very slow compile
times on loops which have large numbers of such values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106897 91177308-0d34-0410-b5e6-96231b3b80d8
for an "i" constraint should get lowered; PR 6309. While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106893 91177308-0d34-0410-b5e6-96231b3b80d8
enough special case, and it theoretically allows more folding because
it works even when x is unanalyzable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106763 91177308-0d34-0410-b5e6-96231b3b80d8
use sharing map. The reconcileNewOffset logic already forces a
separate use if the kinds differ, so incorporating the kind in the
key means we can track more sharing opportunities.
More sharing means fewer total uses to track, which means smaller
problem sizes, which means the conservative throttles don't kick
in as often.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106396 91177308-0d34-0410-b5e6-96231b3b80d8
The memcmp will be optimized further and even the pathological case
'strstr(x, "x") == x' generates optimal code now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106097 91177308-0d34-0410-b5e6-96231b3b80d8
Changed directly instead of using a return value.
Rename FilterOutUndesirableDedicatedRegisters's Changed variable to
distinguish it from LSRInstance's Changed member.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104269 91177308-0d34-0410-b5e6-96231b3b80d8
operand on the left, the interesting operand is on the right. This
fixes a bug where LSR was failing to recognize ICmpZero uses,
which led it to be unable to reverse the induction variable in the
attached testcase.
Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test
is extremely fragile and hard to meaningfully update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104262 91177308-0d34-0410-b5e6-96231b3b80d8
the addressing modes don't make this trivially easy. This allows
it to avoid falling into the less precise heuristics in more
cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104186 91177308-0d34-0410-b5e6-96231b3b80d8
of its formulae have been removed into a helper function, and also
teach it how to update the RegUseTracker.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104087 91177308-0d34-0410-b5e6-96231b3b80d8
is inconsistent with the BaseRegs field. It's not print's job to
assert on an invalid condition, but it can make one more obvious.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104077 91177308-0d34-0410-b5e6-96231b3b80d8
when it detects undefined behavior. llvm.trap generally codegens into some
thing really small (e.g. a 2 byte ud2 instruction on x86) and debugging this
sort of thing is "nontrivial". For example, we now compile:
void foo() { *(int*)0 = 42; }
into:
_foo:
pushl %ebp
movl %esp, %ebp
ud2
Some may even claim that this is a security hole, though that seems dubious
to me. This addresses rdar://7958343 - Optimizing away null dereference
potentially allows arbitrary code execution
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103356 91177308-0d34-0410-b5e6-96231b3b80d8
LSRUse's Regs set after all pruning is done, rather than trying
to do it on the fly, which can produce an incomplete result.
This fixes a case where heuristic pruning was stripping all
formulae from a use, which led the solver to enter an infinite
loop.
Also, add a few asserts to diagnose this kind of situation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103328 91177308-0d34-0410-b5e6-96231b3b80d8
indirect branches in all the predecessors. This avoids unnecessarily
splitting edges in cases where load PRE is not possible anyway.
Thanks to Jakub Staszak for pointing this out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103034 91177308-0d34-0410-b5e6-96231b3b80d8
misses an opportunity to fold add operands, but folds them
after LSR has separated them out. This fixes rdar://7886751.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102157 91177308-0d34-0410-b5e6-96231b3b80d8
arguments are handled with a new InlineFunctionInfo class. This
makes it easier to extend InlineFunction to return more info in the
future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102137 91177308-0d34-0410-b5e6-96231b3b80d8
condition we're unswitching on. In this case, don't try to
simplify the second copy of the loop which may be dead or not,
but is probably a constant now. This fixes PR6879
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101870 91177308-0d34-0410-b5e6-96231b3b80d8
just ask ScalarEvolution for it on demand. This helps IVUsers be more robust
in the case of expressions changing underneath it. This fixes PR6862.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101819 91177308-0d34-0410-b5e6-96231b3b80d8
Probably the best way to know that all getOperand() calls have been handled
is to replace that API instead of updating.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101579 91177308-0d34-0410-b5e6-96231b3b80d8
callee is expected to be expanded to something else by codegen, so that
normal infinitely recursive calls are still transformed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101468 91177308-0d34-0410-b5e6-96231b3b80d8
with a fix for self-hosting
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101465 91177308-0d34-0410-b5e6-96231b3b80d8
with a fix
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101397 91177308-0d34-0410-b5e6-96231b3b80d8
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101364 91177308-0d34-0410-b5e6-96231b3b80d8
numerator is an induction variable. For example, with code like this:
for (i=0;i<n;++i)
x[i%n] = 0;
IndVarSimplify will now recognize that i is always less than n inside
the loop, and eliminate the remainder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101113 91177308-0d34-0410-b5e6-96231b3b80d8
expression is a UDiv and it doesn't appear that the UDiv came from
the user's source.
ScalarEvolution has recently figured out how to compute a tripcount
expression for the inner loop in
SingleSource/Benchmarks/Shootout/sieve.c, using a udiv. Emitting a
udiv instruction dramatically slows down the enclosing loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101068 91177308-0d34-0410-b5e6-96231b3b80d8
a ScalarEvolution bug with overflow handling is fixed, the normal analysis
code will automatically decline to operate on the icmp instructions which
are responsible for the loop exit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101032 91177308-0d34-0410-b5e6-96231b3b80d8
instead of deleting just the user. This makes it more consistent with
other code in IndVarSimplify, and theoretically can eliminate more users
earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101027 91177308-0d34-0410-b5e6-96231b3b80d8
the loop exit test. This usually doesn't come up for a variety of
reasons, but it isn't impossible, so make IndVarSimplify handle it
conservatively.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101008 91177308-0d34-0410-b5e6-96231b3b80d8
variables. For example, with code like this:
for (i=0;i<n;++i)
if (i<n)
x[i] = 0;
IndVarSimplify will now recognize that i is always less than n inside
the loop, and eliminate the if.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101000 91177308-0d34-0410-b5e6-96231b3b80d8
into adjacent loops. Also, ensure that the insert position is
dominated by the loop latch of any loop in the post-inc set which
has a latch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100906 91177308-0d34-0410-b5e6-96231b3b80d8
forced constant is changed to a constant, we would end
up adding the instruction to the wrong worklist,
preventing it from being properly revisited. This fixes
rdar://7832370
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100837 91177308-0d34-0410-b5e6-96231b3b80d8
explicitly split into stride-and-offset pairs. Also, add the
ability to track multiple post-increment loops on the same expression.
This refines the concept of "normalizing" SCEV expressions used for
to post-increment uses, and introduces a dedicated utility routine for
normalizing and denormalizing expressions.
This fixes the expansion of expressions which are post-increment users
of more than one loop at a time. More broadly, this takes LSR another
step closer to being able to reason about more than one loop at a time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100699 91177308-0d34-0410-b5e6-96231b3b80d8
undefs in branches/switches, we have two cases: a branch on a literal
undef or a branch on a symbolic value which is undef. If we have a
literal undef, the code was correct: forcing it to a constant is the
right thing to do.
If we have a branch on a symbolic value that is undef, we should force
the symbolic value to a constant, which then makes the successor block
live. Forcing the condition of the branch to being a constant isn't
safe if later paths become live and the value becomes overdefined. This
is the case that 'forcedconstant' is designed to handle, so just use it.
This fixes rdar://7765019 but there is no good testcase for this, the
one I have is too insane to be useful in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100478 91177308-0d34-0410-b5e6-96231b3b80d8
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100304 91177308-0d34-0410-b5e6-96231b3b80d8
exits the loop. With this information we can guarantee
the iteration count of the loop is bounded by the
compare. I think this xforms is finally safe now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100285 91177308-0d34-0410-b5e6-96231b3b80d8
checker. Amusingly, we already had tests that we should
have rejects because they would be miscompiled in the
testsuite.
The remaining issue with this is that we don't check that
the branch causes us to exit the loop if it fails, so we
don't actually know if we remain in bounds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100284 91177308-0d34-0410-b5e6-96231b3b80d8
to a signed vs unsigned value depending on the sign of the
constant fp means that we can't distinguish between a
truly negative number and a positive number so large the
32nd bit is set. So, do don't this!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100283 91177308-0d34-0410-b5e6-96231b3b80d8
the required validity checks in the first place, and supporting
a condition large enough to require the 32'nd bit isn't worth it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100280 91177308-0d34-0410-b5e6-96231b3b80d8
this cleans up a bunch of code and also fixes several crashes and
miscompiles. More to come unfortunately, this optimization
is quite broken.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100270 91177308-0d34-0410-b5e6-96231b3b80d8
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100191 91177308-0d34-0410-b5e6-96231b3b80d8
is necessary. Inherits from new templated baseclass CallSiteBase<>
which is highly customizable. Base CallSite on it too, in a configuration
that allows full mutation.
Adapt some call sites in analyses to employ ImmutableCallSite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100100 91177308-0d34-0410-b5e6-96231b3b80d8
generate wrong code pretty much anywhere AFAICT.
A case that hits the bug reproducibly is impossible,
but the situation was like this:
Addr = ...
Store -> Addr
Addr2 = GEP , 0, 0
Store -> Addr2
Handling the first store, the code changed replaced Addr
with a sunkaddr and deleted Addr, but not its table
entry. Code in OptimizedBlock replaced Addr2 with a
bitcast; if that happened to reuse the memory of Addr,
the old table entry was erroneously found when handling
the second store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100044 91177308-0d34-0410-b5e6-96231b3b80d8
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
A update of langref will occur in a subsequent checkin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99928 91177308-0d34-0410-b5e6-96231b3b80d8
pointer. There was also a SmallPtrSet whose settiness wasn't being used, so I
changed it to a SmallVector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99713 91177308-0d34-0410-b5e6-96231b3b80d8
I have audited all getOperandNo calls now, fixing
hidden assumptions. CallSite related uglyness will
be eliminated successively.
Note this patch has a long and griveous history,
for all the back-and-forths have a look at
CallSite.h's log.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99399 91177308-0d34-0410-b5e6-96231b3b80d8
so that the SCEVExpander doesn't retain a dangling pointer as its
insert position. The dangling pointer in this case wasn't ever used
to insert new instructions, but it was causing trouble with
SCEVExpander's code for automatically advancing its insert position
past debug intrinsics.
This fixes use-after-free errors that valgrind noticed in
test/Transforms/IndVarSimplify/2007-06-06-DeleteDanglesPtr.ll and
test/Transforms/IndVarSimplify/exit_value_tests.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99036 91177308-0d34-0410-b5e6-96231b3b80d8
This time I did a self-hosted bootstrap on Linux x86-64,
with no problems. Let's see how darwin 64-bit self-hosting
goes. At the first sign of failure I'll back this out.
Maybe the valgrind bots give me a hint of what may be wrong
(it at all).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98957 91177308-0d34-0410-b5e6-96231b3b80d8
out the remainder of the calls that we should lower in some way and
move the tests to the new correct directory. Fix up tests that are now
optimized more than they were before by -instcombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97875 91177308-0d34-0410-b5e6-96231b3b80d8
can be used in more places. Add an argument for the TargetData that
most of them need. Update for the getInt8PtrTy() change. Should be
no functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97844 91177308-0d34-0410-b5e6-96231b3b80d8
predecessors before returning. Otherwise, if multiple predecessor edges need
splitting, we only get one of them per iteration. This makes a small but
measurable compile time improvement with -enable-full-load-pre.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97521 91177308-0d34-0410-b5e6-96231b3b80d8
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.
Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97126 91177308-0d34-0410-b5e6-96231b3b80d8
argument is non-null, pass it along to PHITranslateSubExpr so that it can
prefer using existing values that dominate the PredBB, instead of just
blindly picking the first equivalent value that it finds on a uselist.
Also when the DominatorTree is specified, have PHITranslateValue filter
out any result that does not dominate the PredBB. This is basically just
refactoring the check that used to be in GetAvailablePHITranslatedSubExpr
and also in GVN.
Despite my initial expectations, this change does not affect the results
of GVN for any testcases that I could find, but it should help compile time.
Before this change, if PHITranslateSubExpr picked a value that does not
dominate, PHITranslateWithInsertion would then insert a new value, which GVN
would later determine to be redundant and would replace. By picking a good
value to begin with, we save GVN the extra work of inserting and then
replacing a new value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97010 91177308-0d34-0410-b5e6-96231b3b80d8
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96774 91177308-0d34-0410-b5e6-96231b3b80d8
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96629 91177308-0d34-0410-b5e6-96231b3b80d8
a loop exit value, so that if a loop gets deleted, ScalarEvolution
isn't stick holding on to dangling SCEVAddRecExprs for that loop. This
fixes PR6339.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96626 91177308-0d34-0410-b5e6-96231b3b80d8
and T->isPointerTy(). Convert most instances of the first form to the second form.
Requested by Chris.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96344 91177308-0d34-0410-b5e6-96231b3b80d8
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96308 91177308-0d34-0410-b5e6-96231b3b80d8
with multiplication by constants distributed through, occasionally
those subexpressions can include both x and -x. For now, if this
condition is discovered within LSR, just prune such cases away,
as they won't be profitable. This fixes a "zero allocated in a
base register" assertion failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96177 91177308-0d34-0410-b5e6-96231b3b80d8
and add a doxygen comment.
Cache the phi entry to avoid doing tons of
PHINode::getBasicBlockIndex calls in the common case.
On my insane testcase from re2c, this speeds up CGP from
617.4s to 7.9s (78x).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96083 91177308-0d34-0410-b5e6-96231b3b80d8
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.
This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95975 91177308-0d34-0410-b5e6-96231b3b80d8
block. Other blocks may have pointer cycles that will crash
basicaa and other alias analyses. In any case, there is no
point wasting cycles optimizing dead blocks. This fixes
rdar://7635088
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95852 91177308-0d34-0410-b5e6-96231b3b80d8
Initial skeleton and SCEVUnknown lowering implemented,
the rest should come relatively quickly. Move testcase
to new directory.
Move pass to right before SimplifyLibCalls - which is
moved down a bit so we can take advantage of a few opts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95628 91177308-0d34-0410-b5e6-96231b3b80d8
container data. This prevents it from holding onto dangling
pointers and potentially behaving unpredictably.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95409 91177308-0d34-0410-b5e6-96231b3b80d8
short-circuited conditions to AND/OR expressions, and those expressions
are often converted back to a short-circuited form in code gen. The
original source order may have been optimized to take advantage of the
expected values, and if we reassociate them, we change the order and
subvert that optimization. Radar 7497329.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95333 91177308-0d34-0410-b5e6-96231b3b80d8
The SRThreshold value makes perfect sense for checking if an entire aggregate
should be promoted to a scalar integer, but it is not so good for splitting
an aggregate into its separate elements. A struct may contain a large embedded
array along with some scalar fields that would benefit from being split apart
by SROA. Even if the total aggregate size is large, it may still be good to
perform SROA. Thus, the most important piece of this patch is simply moving
the aggregate size comparison vs. SRThreshold so that it guards only the
aggregate promotion.
We have also been checking the number of elements to decide if an aggregate
should be split up. The limit of "SRThreshold/4" seemed rather arbitrary,
and I don't think it's very useful to derive this limit from SRThreshold
anyway. I've collected some data showing that the current default limit of
32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct
types. One thing suggested by the data is that distinguishing between structs
and arrays might be useful. There are (obviously) a lot more large arrays
than large structs (as measured by the number of elements and not the total
size -- a large array inside a struct still counts as a single element given
the way we do SROA right now). Out of 8377 arrays where we successfully
performed SROA while compiling a large set of benchmarks, only 16 of them had
more than 8 elements. And, for those 16 arrays, it's not at all clear that
SROA was actually beneficial. So, to offset the compile time cost of
investigating more large structs for SROA, the patch lowers the limit on array
elements to 8.
This fixes Apple Radar 7563690.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95224 91177308-0d34-0410-b5e6-96231b3b80d8
disabled by default. This divides the existing load PRE code into 2 phases:
first it checks that it is safe to move the load to each of the predecessors
where it is unavailable, and then if it is safe, the code is changed to move
the load. Radar 7571861.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95007 91177308-0d34-0410-b5e6-96231b3b80d8
unconditionally. Besides checking the offset, also check that the underlying
object is aligned as much as the load itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94875 91177308-0d34-0410-b5e6-96231b3b80d8
parameter with a default value, instead of just hardcoding it in the
implementation. The limit of MaxLookup = 6 was introduced in r69151 to fix
a performance problem with O(n^2) behavior in instcombine, but the scalarrepl
pass is relying on getUnderlyingObject to go all the way back to an AllocaInst.
Making the limit part of the method signature makes it clear that by default
the result is limited and should help avoid similar problems in the future.
This fixes pr6126.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94433 91177308-0d34-0410-b5e6-96231b3b80d8
for arbitrary terminators in predecessors, don't assume
it is a conditional or uncond branch. The testcase shows
an example where they can happen with switches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94323 91177308-0d34-0410-b5e6-96231b3b80d8
handle the case when we can infer an input to the xor
from all inputs that agree, instead of going into an
infinite loop. Another part of PR6199
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94321 91177308-0d34-0410-b5e6-96231b3b80d8
missing ones are libsupport, libsystem and libvmcore. libvmcore is
currently blocked on bugpoint, which uses EH. Once it stops using
EH, we can switch it off.
This #if 0's out 3 unit tests, because gtest requires RTTI information.
Suggestions welcome on how to fix this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94164 91177308-0d34-0410-b5e6-96231b3b80d8
loop-variant components, adds must be inserted after the increment.
Keep track of the increment position for this case, and insert
these adds in the correct location.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94110 91177308-0d34-0410-b5e6-96231b3b80d8
operands exceeds the number of registers used in the initial
solution, as that wouldn't lead to a profitable solution anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94107 91177308-0d34-0410-b5e6-96231b3b80d8
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.
It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94061 91177308-0d34-0410-b5e6-96231b3b80d8
than the scaled register. This makes it more likely that subsequent
AddrModeMatcher queries will match the new address the same way as the
old, instead of accidentally matching what had been the base register
as the new scaled register, and then failing to match the scaled register.
This fixes some problems with address-mode sinking multiple muls into a
block, which will be a lot more common with some upcoming
LoopStrengthReduction changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93935 91177308-0d34-0410-b5e6-96231b3b80d8
are the same. I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value. Radar 7552893.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93848 91177308-0d34-0410-b5e6-96231b3b80d8
in JT.
2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go. This allows us to
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93253 91177308-0d34-0410-b5e6-96231b3b80d8
condition is a xor with a phi node. This eliminates nonsense
like this from 176.gcc in several places:
LBB166_84:
testl %eax, %eax
- setne %al
- xorb %cl, %al
- notb %al
- testb $1, %al
- je LBB166_85
+ je LBB166_69
+ jmp LBB166_85
This is rdar://7391699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93221 91177308-0d34-0410-b5e6-96231b3b80d8
on the example in PR4216. This doesn't trigger in the testsuite,
so I'd really appreciate someone scrutinizing the logic for
correctness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92458 91177308-0d34-0410-b5e6-96231b3b80d8
occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
is copied in multiple apps) in _sch_istable, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92427 91177308-0d34-0410-b5e6-96231b3b80d8
when a consequtive sequence of elements all satisfies the
predicate. Like the double compare case, this generates better
code than the magic constant case and generalizes to more than
32/64 element array lookups.
Here are some examples where it triggers. From 403.gcc, most
accesses to the rtx_class array are handled, e.g.:
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
%142 = icmp eq i8 %141, 105
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
%165 = icmp eq i8 %164, 60
Also, most of the 59-element arrays (mode_class/rid_to_yy, etc)
optimized before are actually range compares. This lets 32-bit
machines optimize them.
400.perlbmk has stuff like this:
400.perlbmk: PL_regkind, even for 32-bit:
@PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
%811 = icmp ne i8 %810, 33
@PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
%12 = icmp ult i8 %10, 2
etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92426 91177308-0d34-0410-b5e6-96231b3b80d8
two elements match or don't match with two comparisons. For
example, the testcase compiles into:
define i1 @test5(i32 %X) {
%1 = icmp eq i32 %X, 2 ; <i1> [#uses=1]
%2 = icmp eq i32 %X, 7 ; <i1> [#uses=1]
%R = or i1 %1, %2 ; <i1> [#uses=1]
ret i1 %R
}
This generalizes the previous xforms when the array is larger than
64 elements (and this case matches) and generates better code for
cases where it overlaps with the magic bitshift case.
This generalizes more cases than you might expect. For example,
400.perlbmk has:
@PL_utf8skip = constant [256 x i8] c"\01\01\01\...
%15 = icmp ult i8 %7, 7
403.gcc has:
@rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ...
%18 = icmp eq i16 %16, 295
and xalancbmk has a bunch of examples, such as
_ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92417 91177308-0d34-0410-b5e6-96231b3b80d8
arrays with variable indices into a comparison of the index
with a constant. The most common occurrence of this that
I see by far is stuff like:
if ("foobar"[i] == '\0') ...
which we compile into: if (i == 6), saving a load and
materialization of the global address. This also exposes
loop trip count information to later passes in many cases.
This triggers hundreds of times in xalancbmk, which is where I first
noticed it, but it also triggers in many other apps. Here are a few
interesting ones from various apps:
@must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
%scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
%17 = load ...
%18 = icmp eq i8* %17, null ; <i1> [#uses=1]
-> icmp eq i64 %indvar.i, 7
@yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
%57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
%mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
load ...
%64 = icmp eq i8 %58, 4 ; <i1> [#uses=1]
-> icmp eq i64 %.pn, 35 ; <i1> [#uses=0]
@gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
%scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
%425 = load %scevgep.i
%426 = icmp eq i16 %425, -32768 ; <i1> [#uses=0]
-> false
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92411 91177308-0d34-0410-b5e6-96231b3b80d8
pointer to int casts that confuse later optimizations. See PR3351
for details.
This improves but doesn't complete fix 483.xalancbmk because llvm-gcc
does this xform in GCC's "fold" routine as well. Clang++ will do
better I guess.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92408 91177308-0d34-0410-b5e6-96231b3b80d8
a constantexpr gep on the 'base' side of the expression.
This completes comment #4 in PR3351, which comes from
483.xalancbmk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92402 91177308-0d34-0410-b5e6-96231b3b80d8