- The only meat here is in Value.{h,cpp} the rest is essential 'const
std::string &' -> 'const Twine &'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77048 91177308-0d34-0410-b5e6-96231b3b80d8
- Some clients which used DOUT have moved to DEBUG. We are deprecating the
"magic" DOUT behavior which avoided calling printing functions when the
statement was disabled. In addition to being unnecessary magic, it had the
downside of leaving code in -Asserts builds, and of hiding potentially
unnecessary computations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77019 91177308-0d34-0410-b5e6-96231b3b80d8
- Yay for '-'s and simplifications!
- I kept StringMap::GetOrCreateValue for compatibility purposes, this can
eventually go away. Likewise the StringMapEntry Create functions still follow
the old style.
- NIFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76888 91177308-0d34-0410-b5e6-96231b3b80d8
also apply to vectors. This allows us to compile this:
#include <emmintrin.h>
__m128i a(__m128 a, __m128 b) { return a==a & b==b; }
__m128i b(__m128 a, __m128 b) { return a!=a | b!=b; }
to:
_a:
cmpordps %xmm1, %xmm0
ret
_b:
cmpunordps %xmm1, %xmm0
ret
with clang instead of to a ton of horrible code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76863 91177308-0d34-0410-b5e6-96231b3b80d8
Getelementptrs that are defined to wrap are virtually useless to
optimization, and getelementptrs that are undefined on any kind
of overflow are too restrictive -- it's difficult to ensure that
all intermediate addresses are within bounds. I'm going to take
a different approach.
Remove a few optimizations that depended on this flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76437 91177308-0d34-0410-b5e6-96231b3b80d8
insertelement/extractelement.
I'm not entirely sure this is precisely what we want to do: should we
prefer bitcast(insertelement) or insertelement(bitcast)? Similarly. should we
prefer extractelement(bitcast) or bitcast(extractelement)?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76345 91177308-0d34-0410-b5e6-96231b3b80d8
all values belonging to the intersection will belong to the resulting range.
The former was inconsistent about that point (either way is fine, just pick
one.) This is part of PR4545.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76289 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantExpr and Instruction. This involves duplicating some code
between GetElementPtrInst and GEPOperator, but it's not a lot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76265 91177308-0d34-0410-b5e6-96231b3b80d8
GEPOperator's hasNoPointer0verflow(), and make a few places in instcombine
that create GEPs that may overflow clear the NoOverflow value. Among
other things, this partially addresses PR2831.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76252 91177308-0d34-0410-b5e6-96231b3b80d8
in a convenient manner, factoring out some common code from
InstructionCombining and ValueTracking. Move the contents of
BinaryOperators.h into Operator.h and use Operator to generalize them
to support ConstantExprs as well as Instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76232 91177308-0d34-0410-b5e6-96231b3b80d8
isSafeToSpeculativelyExecute. The new method is a bit closer to what
the callers actually care about in that it rejects more things callers
don't want. It also adds more precise handling for integer
division, and unifies code for analyzing the legality of a speculative
load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76150 91177308-0d34-0410-b5e6-96231b3b80d8
operands; it's possible to end up with a constant-foldable operand to
most instructions, even those which can't trap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75845 91177308-0d34-0410-b5e6-96231b3b80d8
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75640 91177308-0d34-0410-b5e6-96231b3b80d8
(I think it's reasonably clear that we want to have a canonical form for
constructs like this; if anyone thinks that a select is not the best
canonical form, please tell me.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75531 91177308-0d34-0410-b5e6-96231b3b80d8
using the Curiously Recurring Template Pattern with LoopBase.
This will help further refactoring, and future functionality for
Loop. Also, Headers can now foward-declare Loop, instead of pulling
in LoopInfo.h or doing tricks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75519 91177308-0d34-0410-b5e6-96231b3b80d8
the changes are allowed by not calling this function for bitcasts.
The Instruction::AShr case is dead because
SimplifyDemandedInstructionBits handles that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75514 91177308-0d34-0410-b5e6-96231b3b80d8
This involves temporarily hard wiring some parts to use the global context. This isn't ideal, but it's
the only way I could figure out to make this process vaguely incremental.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75445 91177308-0d34-0410-b5e6-96231b3b80d8
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75379 91177308-0d34-0410-b5e6-96231b3b80d8
per icmp predicate out of predsimplify and into ConstantRange.
Add another utility method that determines whether one range is a subset of
another. Combine with the former to determine whether icmp pred range, range
is known to be true or not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75357 91177308-0d34-0410-b5e6-96231b3b80d8
This way ScalarEvolution can examine the loop to determine what state
it needs to update, if it chooses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75029 91177308-0d34-0410-b5e6-96231b3b80d8
This was considering vector intrinsics to have cost 2, but non-vector
intrinsics to have cost 1, which is backward.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74698 91177308-0d34-0410-b5e6-96231b3b80d8
inserted to replace that value must dominate all of of the basic
blocks associated with the uses of the value in the PHI, not just
one of them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74376 91177308-0d34-0410-b5e6-96231b3b80d8
This helps it avoid reusing an instruction that doesn't dominate all
of the users, in cases where the original instruction was inserted
before all of the users were known. This may result in redundant
expansions of sub-expressions that depend on loop-unpredictable values
in some cases, however this isn't very common, and it primarily impacts
IndVarSimplify, so GVN can be expected to clean these up.
This eliminates the need for IndVarSimplify's FixUsesBeforeDefs,
which fixes several bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74352 91177308-0d34-0410-b5e6-96231b3b80d8
terminator, instead of after the last phi. This fixes a bug
exposed by ScalarEvolution analyzing more kinds of loops.
This fixes PR4436.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74072 91177308-0d34-0410-b5e6-96231b3b80d8
trip counts in more cases.
Generalize ScalarEvolution's isLoopGuardedByCond code to recognize
And and Or conditions, splitting the code out into an
isNecessaryCond helper function so that it can evaluate Ands and Ors
recursively, and make SCEVExpander be much more aggressive about
hoisting instructions out of loops.
test/CodeGen/X86/pr3495.ll has an additional instruction now, but
it appears to be due to an arbitrary register allocation difference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74048 91177308-0d34-0410-b5e6-96231b3b80d8
now, this hasn't mattered, because ScalarEvolution hasn't been able
to compute trip counts for loops with multiple exits. But it will
soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73864 91177308-0d34-0410-b5e6-96231b3b80d8
as if they were multiple uses of the same instruction. This interacts
well with the existing loadpre that j-t does to open up many new jump
threads earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73768 91177308-0d34-0410-b5e6-96231b3b80d8
as signed max tests. Along with r73717, this helps CodeGen avoid
emitting code for a maximum operation for this class of loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73718 91177308-0d34-0410-b5e6-96231b3b80d8
casted induction variables in cases where the cast
isn't foldable. It ended up being a pessimization in
many cases. This could be fixed, but it would require
a bunch of complicated code in IVUsers' clients. The
advantages of this approach aren't visible enough to
justify it at this time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73706 91177308-0d34-0410-b5e6-96231b3b80d8
move loads back past a check that the load address
is valid, see new testcase. The test that went
in with 72661 has exactly this case, except that
the conditional it's moving past is checking
something else; I've settled for changing that
test to reference a global, not a pointer. It
may be possible to scan all the tests you pass and
make sure none of them are checking any component
of the address, but it's not trivial and I'm not
trying to do that here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73632 91177308-0d34-0410-b5e6-96231b3b80d8
to ignore readonly calls, and factor it out of instcombine so
that it can be used by other passes. Patch by Frits van Bommel!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73506 91177308-0d34-0410-b5e6-96231b3b80d8
failures.
To support this, add some utility functions to Type to help support
vector/scalar-independent code. Change ConstantInt::get and
ConstantFP::get to support vector types, and add an overload to
ConstantInt::get that uses a static IntegerType type, for
convenience.
Introduce a new getConstant method for ScalarEvolution, to simplify
common use cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73431 91177308-0d34-0410-b5e6-96231b3b80d8
induction variable when the addrec to be expanded does not require
a wider type. This eliminates the need for IndVarSimplify to
micro-manage SCEV expansions, because SCEVExpander now
automatically expands them in the form that IndVarSimplify considers
to be canonical. (LSR still micro-manages its SCEV expansions,
because it's optimizing for the target, rather than for
other optimizations.)
Also, this uses the new getAnyExtendExpr, which has more clever
expression simplification logic than the IndVarSimplify code it
replaces, and this cleans up some ugly expansions in code such as
the included masked-iv.ll testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73294 91177308-0d34-0410-b5e6-96231b3b80d8
integer and floating-point opcodes, introducing
FAdd, FSub, and FMul.
For now, the AsmParser, BitcodeReader, and IRBuilder all preserve
backwards compatability, and the Core LLVM APIs preserve backwards
compatibility for IR producers. Most front-ends won't need to change
immediately.
This implements the first step of the plan outlined here:
http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72897 91177308-0d34-0410-b5e6-96231b3b80d8
instcombine doesn't know when it's safe. To partially compensate
for this, introduce new code to do this transformation in
dagcombine, which can use UnsafeFPMath.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72872 91177308-0d34-0410-b5e6-96231b3b80d8
RewriteStoreUserOfWholeAlloca deal with tail padding because
isSafeUseOfBitCastedAllocation expects them to. Otherwise, we crash
trying to erase the bitcast.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72688 91177308-0d34-0410-b5e6-96231b3b80d8
rewrite the comparison if there is any implicit extension or truncation
on the induction variable. I'm planning for IVUsers to eventually take
over some of the work of this code, and for it to be generalized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72496 91177308-0d34-0410-b5e6-96231b3b80d8
in the case where a loop exit value cannot be computed, instead of only in
some cases while using SCEVCouldNotCompute in others. This simplifies
getSCEVAtScope's callers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72375 91177308-0d34-0410-b5e6-96231b3b80d8
one of the RecursivelyDeleteTriviallyDeadInstructions.
Add a comment explaining why the cache needs to be cleared.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72372 91177308-0d34-0410-b5e6-96231b3b80d8
leave the original comparison in place if it has other uses, since the
other uses won't be dominated by the new comparison instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72369 91177308-0d34-0410-b5e6-96231b3b80d8
Fix by clearing the rewriter cache before deleting the trivially dead
instructions.
Also make InsertedExpressions use an AssertingVH to catch these
bugs easier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72364 91177308-0d34-0410-b5e6-96231b3b80d8
assuming that the use of the value is in a block dominated by the
"normal" destination. LangRef.html and other documentation sources
don't explicitly guarantee this, but it seems to be assumed in
other places in LLVM at least.
This fixes an assertion failure on the included testcase, which
is derived from the Ada testsuite.
FixUsesBeforeDefs is a temporary measure which I'm looking to
replace with a more capable solution.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72266 91177308-0d34-0410-b5e6-96231b3b80d8
Instcombine to be more aggressive about using SimplifyDemandedBits
on shift nodes. This allows a shift to be simplified to zero in the
included test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72204 91177308-0d34-0410-b5e6-96231b3b80d8
of the comparison is defined inside the loop. This fixes a
use-before-def problem, because the transformation puts a use
of the RHS outside the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72149 91177308-0d34-0410-b5e6-96231b3b80d8
instructions. It attempts to create high-level multi-operand GEPs,
though in cases where this isn't possible it falls back to casting
the pointer to i8* and emitting a GEP with that. Using GEP instructions
instead of ptrtoint+arithmetic+inttoptr helps pointer analyses that
don't use ScalarEvolution, such as BasicAliasAnalysis.
Also, make the AddrModeMatcher more aggressive in handling GEPs.
Previously it assumed that operand 0 of a GEP would require a register
in almost all cases. It now does extra checking and can do more
matching if operand 0 of the GEP is foldable. This fixes a problem
that was exposed by SCEVExpander using GEPs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72093 91177308-0d34-0410-b5e6-96231b3b80d8
is not known to be nothrow. This allows readnone/readonly functions
to be deleted even if we don't know whether the callee can throw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@71676 91177308-0d34-0410-b5e6-96231b3b80d8
without one. Use it where we were using abs on
int64_t objects.
(I strongly suspect the casts to unsigned in the
fragments in LoopStrengthReduce are not doing whatever
the original intent was, but the obvious change to
uint64_t doesn't work. Maybe later.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@71612 91177308-0d34-0410-b5e6-96231b3b80d8
and generalize it so that it can be used by IndVarSimplify. Implement the
base IndVarSimplify transformation code using IVUsers. This removes
TestOrigIVForWrap and associated code, as ScalarEvolution now has enough
builtin overflow detection and folding logic to handle all the same cases,
and more. Run "opt -iv-users -analyze -disable-output" on your favorite
loop for an example of what IVUsers does.
This lets IndVarSimplify eliminate IV casts and compute trip counts in
more cases. Also, this happens to finally fix the remaining testcases
in PR1301.
Now that IndVarSimplify is being more aggressive, it occasionally runs
into the problem where ScalarEvolutionExpander's code for avoiding
duplicate expansions makes it difficult to ensure that all expanded
instructions dominate all the instructions that will use them. As a
temporary measure, IndVarSimplify now uses a FixUsesBeforeDefs function
to fix up instructions inserted by SCEVExpander. Fortunately, this code
is contained, and can be easily removed once a more comprehensive
solution is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@71535 91177308-0d34-0410-b5e6-96231b3b80d8
Also, if the compare is the only use, LSR would place the iv increment instruction before the compare instead in the latch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@71485 91177308-0d34-0410-b5e6-96231b3b80d8
count down to 0 instead, under very restricted
circumstances. Adjust 4 testcases in which this
optimization fires.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@71439 91177308-0d34-0410-b5e6-96231b3b80d8
method, fixing a crash on PR4146. While the store will
ultimately overwrite the "padded size" number of bits in memory,
the stored value may be a subset of this size. This function
only wants to handle the case where all bits are stored.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@71224 91177308-0d34-0410-b5e6-96231b3b80d8
the optimizers about this. For example, a readonly
function with no uses cannot be removed unless it is
also marked nounwind.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@71071 91177308-0d34-0410-b5e6-96231b3b80d8
CallbackVH, with fixes. allUsesReplacedWith need to
walk the def-use chains and invalidate all users of a
value that is replaced. SCEVs of users need to be
recalcualted even if the new value is equivalent. Also,
make forgetLoopPHIs walk def-use chains, since any
SCEV that depends on a PHI should be recalculated when
more information about that PHI becomes available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@70927 91177308-0d34-0410-b5e6-96231b3b80d8
makes ScalarEvolution::deleteValueFromRecords, and it's code that
subtly needed to be called before ReplaceAllUsesWith, unnecessary.
It also makes ValueDeletionListener unnecessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@70645 91177308-0d34-0410-b5e6-96231b3b80d8
of returning a list of pointers to Values that are deleted. This was
unsafe, because the pointers in the list are, by nature of what
RecursivelyDeleteDeadInstructions does, always dangling. Replace this
with a simple callback mechanism. This may eventually be removed if
all clients can reasonably be expected to use CallbackVH.
Use this to factor out the dead-phi-cycle-elimination code from LSR
utility function, and generalize it to use the
RecursivelyDeleteTriviallyDeadInstructions utility function.
This makes LSR more aggressive about eliminating dead PHI cycles;
adjust tests to either be less trivial or to simply expect fewer
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@70636 91177308-0d34-0410-b5e6-96231b3b80d8
of LSR. This makes the AddUsersIfInteresting phase of LSR a pure
analysis instead of a phase that potentially does CFG modifications.
The conditions where this code would actually perform a split are
rare, and in the cases where it actually would do a split the split
is usually undone by CodeGenPrepare, and in cases where splits
actually survive into codegen, they appear to hurt more often than
they help.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@70625 91177308-0d34-0410-b5e6-96231b3b80d8
target hooks canLosslesslyBitCastTo and isTruncateFree. This allows
targets to avoid worrying about handling all combinations of integer
and pointer types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@70555 91177308-0d34-0410-b5e6-96231b3b80d8
with the persistent insertion point, and change IndVars to make
use of it. This fixes a bug where IndVars was holding on to a
stale insertion point and forcing the SCEVExpander to continue to
use it.
This fixes PR4038.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69892 91177308-0d34-0410-b5e6-96231b3b80d8
have pointer types, though in contrast to C pointer types, SCEV
addition is never implicitly scaled. This not only eliminates the
need for special code like IndVars' EliminatePointerRecurrence
and LSR's own GEP expansion code, it also does a better job because
it lets the normal optimizations handle pointer expressions just
like integer expressions.
Also, since LLVM IR GEPs can't directly index into multi-dimensional
VLAs, moving the GEP analysis out of client code and into the SCEV
framework makes it easier for clients to handle multi-dimensional
VLAs the same way as other arrays.
Some existing regression tests show improved optimization.
test/CodeGen/ARM/2007-03-13-InstrSched.ll in particular improved to
the point where if-conversion started kicking in; I turned it off
for this test to preserve the intent of the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69258 91177308-0d34-0410-b5e6-96231b3b80d8
and sext over (iv | const), if a longer iv is
available. Allow expressions to have more than
one zext/sext parent. All from OpenSSL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69241 91177308-0d34-0410-b5e6-96231b3b80d8
sext around sext(shorter IV + constant), using a
longer IV instead, when it can figure out the
add can't overflow. This comes up a lot in
subscripting; mainly affects 64 bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69123 91177308-0d34-0410-b5e6-96231b3b80d8
strncat :(
strncat(foo, "bar", 99)
would be optimized to
memcpy(foo+strlen(foo), "bar", 100, 1)
instead of
memcpy(foo+strlen(foo), "bar", 4, 1)"
Patch by Benjamin Kramer!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68905 91177308-0d34-0410-b5e6-96231b3b80d8
integer types, unless they are already strange. This prevents it from
turning the code produced by SROA into crazy libcalls and stuff that
the code generator can't handle. In the attached example, the result
was an i96 multiply that caused the x86 backend to assert.
Note that if TargetData had an idea of what the legal types are for
a target that this could be used to stop instcombine from introducing
i64 muls, as Scott wanted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68598 91177308-0d34-0410-b5e6-96231b3b80d8
instead of the place where it started to perform the string copy.
- PR3661
- Patch by Benjamin Kramer!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68443 91177308-0d34-0410-b5e6-96231b3b80d8
to/from integer types that are not intptr_t to convert to intptr_t
then do an integer conversion to the dest type. This exposes the
cast to the optimizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67638 91177308-0d34-0410-b5e6-96231b3b80d8
1. Make instcombine always canonicalize trunc x to i1 into an icmp(x&1). This
exposes the AND to other instcombine xforms and is more of what the code
generator expects.
2. Rewrite the remaining trunc pattern match to use 'match', which
simplifies it a lot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67635 91177308-0d34-0410-b5e6-96231b3b80d8
linkage: the value may be replaced with something
different at link time. (Frontends that want to
allow values to be loaded out of weak constants can
give their constants weak_odr linkage).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67407 91177308-0d34-0410-b5e6-96231b3b80d8
and was deleting Instructions without clearing the
corresponding map entry. This led to nondeterministic
behavior if the same address got allocated to another
Instruction within a short time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67306 91177308-0d34-0410-b5e6-96231b3b80d8
it is not APInt clean, but even when it is it needs to be evaluated carefully
to determine whether it is actually profitable.
This fixes a crash on PR3806
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67134 91177308-0d34-0410-b5e6-96231b3b80d8
changes.
For InvokeInst now all arguments begin at op_begin().
The Callee, Cont and Fail are now faster to get by
access relative to op_end().
This patch introduces some temporary uglyness in CallSite.
Next I'll bring CallInst up to a similar scheme and then
the uglyness will magically vanish.
This patch also exposes all the reliance of the libraries
on InvokeInst's operand ordering. I am thinking of taking
care of that too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66920 91177308-0d34-0410-b5e6-96231b3b80d8
in the Ada testcase. Reverting this only covers up
the real problem, which is a nasty conceptual difficulty
in the phi elimination pass: when eliminating phi nodes
in landing pads, the register copies need to come before
the invoke, not at the end of the basic block which is
too late... See PR3784.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66826 91177308-0d34-0410-b5e6-96231b3b80d8
allocations. Apparently the assumption is there is an
instruction (terminator?) following the allocation so I
am allowing the same assumption.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66716 91177308-0d34-0410-b5e6-96231b3b80d8
use, check also for the case where it has two uses,
the other being a llvm.dbg.declare. This is needed so
debug info doesn't affect codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65970 91177308-0d34-0410-b5e6-96231b3b80d8
info with it.
Don't count debug info insns against the scan maximum
in FindAvailableLoadedValue (lest they affect codegen).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65910 91177308-0d34-0410-b5e6-96231b3b80d8
to more accurately describe what it does. Expand its doxygen comment
to describe what the backedge-taken count is and how it differs
from the actual iteration count of the loop. Adjust names and
comments in associated code accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65382 91177308-0d34-0410-b5e6-96231b3b80d8
ashr instcombine to help expose this code. And apply the fix to
SelectionDAG's copy of this code too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65364 91177308-0d34-0410-b5e6-96231b3b80d8
trip counts that use signed comparisons. It's not obviously the best
approach for preserving trip count information, and at any rate there
isn't anything in the tree right now that makes use of that, so for
now always using zero-extensions is preferable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65347 91177308-0d34-0410-b5e6-96231b3b80d8
so that ScalarEvolution doesn't hang onto a dangling Loop*, which
could be a problem if another Loop happens to get allocated at the
same address.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65323 91177308-0d34-0410-b5e6-96231b3b80d8
-std-compile-opts sequence, this avoids the need for ScalarEvolution to
be rerun before LoopDeletion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65318 91177308-0d34-0410-b5e6-96231b3b80d8
memcpy to match the alignment of the destination. It isn't necessary
for making loads and stores handled like the SSE loadu/storeu
intrinsics, and it was causing a performance regression in
MultiSource/Applications/JM/lencod.
The problem appears to have been a memcpy that copies from some
highly aligned array into an alloca; the alloca was then being
assigned a large alignment, which required codegen to perform
dynamic stack-pointer re-alignment, which forced the enclosing
function to have a frame pointer, which led to increased spilling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65289 91177308-0d34-0410-b5e6-96231b3b80d8
as legality. Make load sinking and gep sinking more careful: we only
do it when it won't pessimize loads from the stack. This has the added
benefit of not producing code that is unanalyzable to SROA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65209 91177308-0d34-0410-b5e6-96231b3b80d8
addresses, part 1. This fixes an obvious logic bug. Previously if the only
in-loop use is a PHI, it would return AllUsesAreAddresses as true.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65178 91177308-0d34-0410-b5e6-96231b3b80d8
reduction of address calculations down to basic pointer arithmetic.
This is currently off by default, as it needs a few other features
before it becomes generally useful. And even when enabled, full
strength reduction is only performed when it doesn't increase
register pressure, and when several other conditions are true.
This also factors out a bunch of exisiting LSR code out of
StrengthReduceStridedIVUsers into separate functions, and tidies
up IV insertion. This actually decreases register pressure even
in non-superhero mode. The change in iv-users-in-other-loops.ll
is an example of this; there are two more adds because there are
two fewer leas, and there is less spilling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65108 91177308-0d34-0410-b5e6-96231b3b80d8
trip count value when the original loop iteration condition is
signed and the canonical induction variable won't undergo signed
overflow. This isn't required for correctness; it just preserves
more information about original loop iteration values.
Add a getTruncateOrSignExtend method to ScalarEvolution,
following getTruncateOrZeroExtend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64918 91177308-0d34-0410-b5e6-96231b3b80d8
are multiple IV's in a loop, some of them may under go signed
or unsigned wrapping even if the IV that's used in the loop
exit condition doesn't. Restrict sign-extension-elimination
and zero-extension-elimination to only those that operate on
the original loop-controlling IV.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64866 91177308-0d34-0410-b5e6-96231b3b80d8
modified in a way that may effect the trip count calculation. Change
IndVars to use this method when it rewrites pointer or floating-point
induction variables instead of using a doInitialization method to
sneak these changes in before ScalarEvolution has a chance to see
the loop. This eliminates the need for LoopPass to depend on
ScalarEvolution.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64810 91177308-0d34-0410-b5e6-96231b3b80d8
eliminate all the extensions and all but the one required truncate
from the testcase, but the or/and/shift stuff still isn't zapped.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64809 91177308-0d34-0410-b5e6-96231b3b80d8
Enhance instcombine to use the preferred field of
GetOrEnforceKnownAlignment in more cases, so that regular IR operations are
optimized in the same way that the intrinsics currently are.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64623 91177308-0d34-0410-b5e6-96231b3b80d8
when I was looking at functions used by python.
Highlights include, better largefile support (64-bit file sizes on 32-bit
systems), fputs string is nocapture, popen/pclose added (popen being noalias
return), modf and frexp and friends. Also added some missing 'break' statements
and combined identical sections.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64615 91177308-0d34-0410-b5e6-96231b3b80d8
- Test for signed and unsigned wrapping conditions, instead of just
testing for non-negative induction ranges.
- Handle loops with GT comparisons, in addition to LT comparisons.
- Support more cases of induction variables that don't start at 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64532 91177308-0d34-0410-b5e6-96231b3b80d8
addrec in a different loop to check the value being added to
the accumulated Start value, not the Start value before it has
the new value added to it. This prevents LSR from going crazy
on the included testcase. Dale, please review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64440 91177308-0d34-0410-b5e6-96231b3b80d8
after sorting by stride value. This prevents it from missing
IV reuse opportunities in a host-sensitive manner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64415 91177308-0d34-0410-b5e6-96231b3b80d8
loop induction on LP64 targets. When the induction variable is
used in addressing, IndVars now is usually able to inserst a
64-bit induction variable and eliminates the sign-extending cast.
This is also useful for code using C "short" types for
induction variables on targets with 32-bit addressing.
Inserting a wider induction variable is easy; the tricky part is
determining when trunc(sext(i)) expressions are no-ops. This
requires range analysis of the loop trip count. A common case is
when the original loop iteration starts at 0 and exits when the
induction variable is signed-less-than a fixed value; this case
is now handled.
This replaces IndVarSimplify's OptimizeCanonicalIVType. It was
doing the same optimization, but it was limited to loops with
constant trip counts, because it was running after the loop
rewrite, and the information about the original induction
variable is lost by that point.
Rename ScalarEvolution's executesAtLeastOnce to
isLoopGuardedByCond, generalize it to be able to test for
ICMP_NE conditions, and move it to be a public function so that
IndVars can use it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64407 91177308-0d34-0410-b5e6-96231b3b80d8
accessed at least once as a vector. This prevents it from
compiling the example in not-a-vector into:
define double @test(double %A, double %B) {
%tmp4 = insertelement <7 x double> undef, double %A, i32 0
%tmp = insertelement <7 x double> %tmp4, double %B, i32 4
%tmp2 = extractelement <7 x double> %tmp, i32 4
ret double %tmp2
}
instead, producing the integer code. Producing vectors when they
aren't otherwise in the program is dangerous because a lot of other
code treats them carefully and doesn't want to break them down.
OTOH, many things want to break down tasty i448's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63638 91177308-0d34-0410-b5e6-96231b3b80d8
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type
itself. This allows us to un-xfail
test/CodeGen/X86/vec_ins_extract.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63590 91177308-0d34-0410-b5e6-96231b3b80d8
turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63487 91177308-0d34-0410-b5e6-96231b3b80d8
improvements to the EvaluateInDifferentType code. This code works
by just inserted a bunch of new code and then seeing if it is
useful. Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point. Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63483 91177308-0d34-0410-b5e6-96231b3b80d8
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it. This allows
it to simplify the code in multi-use-or.ll into a single 'add
double'.
This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch. When working on fixing those bugs,
this should be disabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63481 91177308-0d34-0410-b5e6-96231b3b80d8
Now, if it detects that "V" is the same as some other value,
SimplifyDemandedBits returns the new value instead of RAUW'ing it immediately.
This has two benefits:
1) simpler code in the recursive SimplifyDemandedBits routine.
2) it allows future fun stuff in instcombine where an operation has multiple
uses and can be simplified in one context, but not all.
#2 isn't implemented yet, this patch should have no functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63479 91177308-0d34-0410-b5e6-96231b3b80d8
not doing so prevents it from properly iterating and prevents it
from deleting the entire body of dce-iterate.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63476 91177308-0d34-0410-b5e6-96231b3b80d8
be able to handle *ANY* alloca that is poked by loads and stores of
bitcasts and GEPs with constant offsets. Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc. This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.
One case that is pretty great is that we compile
2006-11-07-InvalidArrayPromote.ll into:
define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
%tmp105 = bitcast <4 x i32> %tmp10 to i128
%tmp1056 = zext i128 %tmp105 to i256
%tmp.upgrd.43 = lshr i256 %tmp1056, 96
%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32
ret i32 %tmp.upgrd.44
}
which turns into:
_func:
subl $28, %esp
cvttps2dq %xmm1, %xmm0
movaps %xmm0, (%esp)
movl 12(%esp), %eax
addl $28, %esp
ret
Which is pretty good code all things considering :).
One effect of this is that SROA will start generating arbitrary bitwidth
integers that are a multiple of 8 bits. In the case above, we got a
256 bit integer, but the codegen guys assure me that it can handle the
simple and/or/shift/zext stuff that we're doing on these operations.
This addresses rdar://6532315
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63469 91177308-0d34-0410-b5e6-96231b3b80d8
Thus we need to check whether the struct is empty before trying to index into
it. This fixes PR3381.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62918 91177308-0d34-0410-b5e6-96231b3b80d8
handling the case in Transforms/InstCombine/cast-store-gep.ll, which
is a heavily reduced testcase from Clang on x86-64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62904 91177308-0d34-0410-b5e6-96231b3b80d8
There is now a direct way from value-use-iterator to incoming block in PHINode's API.
This way we avoid the iterator->index->iterator trip, and especially the costly
getOperandNo() invocation. Additionally there is now an assertion that the iterator
really refers to one of the PHI's Uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62869 91177308-0d34-0410-b5e6-96231b3b80d8
Besides APFloat, this involved removing code
from two places that thought they knew the
result of frem(0., x) but were wrong.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62645 91177308-0d34-0410-b5e6-96231b3b80d8
putc, puts, perror, vscanf and vsscanf from getting annotations.
Add annotations for eight printf functions, memalign, pread and pwrite.
On Linux, llvm-gcc sometimes renames strdup, getc, putc, strtok_r, scanf and
sscanf. Match the alternate function names.
Fix a crash annotating opendir.
Don't mark fsetpos's second parameter as nocapture. It's supposed to be
captured.
Do mark fopen's path and mode strings as nocapture. Mark ferror as readonly,
but not fileno which may set errno.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62456 91177308-0d34-0410-b5e6-96231b3b80d8
- Looking at the number of sign bits of the a sext instruction to determine whether new trunc + sext pair should be added when its source is being evaluated in a different type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62263 91177308-0d34-0410-b5e6-96231b3b80d8
my earlier patch to this file.
The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop. This was extra bad
because register pressure later forced both base and IV into
memory. Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this. However,
there were side effects....
It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before. And when inserting
new code that feeds into a PHI, it's right to put such
code at the original location rather than in the PHI's
immediate predecessor(s) when the original location is outside
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.
Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.
Also, when we build an expression that involves a (possibly
non-affine) IV from a different loop as well as an IV from
the one we're interested in (containsAddRecFromDifferentLoop),
don't recurse into that. We can't do much with it and will
get in trouble if we try to create new non-affine IVs or something.
More testcases are coming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62212 91177308-0d34-0410-b5e6-96231b3b80d8
canonicalization transform based on duncan's comments:
1) improve the comment about %.
2) within our index loop make sure the offset stays
within the *type size*, instead of within the *abi size*.
This allows us to reason explicitly about landing in tail
padding and means that issues like non-zero offsets into
[0 x foo] types don't occur anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@62045 91177308-0d34-0410-b5e6-96231b3b80d8
loads from allocas that cover the entire aggregate. This handles
some memcpy/byval cases that are produced by llvm-gcc. This triggers
a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator
<kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61915 91177308-0d34-0410-b5e6-96231b3b80d8
integer to a (transitive) bitcast the alloca and if that integer
has the full size of the alloca, then it clobbers the whole thing.
Handle this by extracting pieces out of the stored integer and
filing them away in the SROA'd elements.
This triggers fairly frequently because the CFE uses integers to
pass small structs by value and the inliner exposes these. For
example, in kimwitu++, I see a bunch of these with i64 stores to
"%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>"
In 176.gcc I see a few i32 stores to "%struct..0anon".
In the testcase, this is a difference between compiling test1 to:
_test1:
subl $12, %esp
movl 20(%esp), %eax
movl %eax, 4(%esp)
movl 16(%esp), %eax
movl %eax, (%esp)
movl (%esp), %eax
addl 4(%esp), %eax
addl $12, %esp
ret
vs:
_test1:
movl 8(%esp), %eax
addl 4(%esp), %eax
ret
The second half of this will be to handle loads of the same form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61853 91177308-0d34-0410-b5e6-96231b3b80d8
as template arguments instead of as instance variables, exposing more
optimization opportunities to the compiler earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61776 91177308-0d34-0410-b5e6-96231b3b80d8
Finalization occurs after all the FunctionPasses in the group have run, which
is clearly not what we want.
This also means that we have to make sure that we apply the right param
attributes when creating a new function.
Also, add a missed optimization: strdup and strndup. NoCapture and
NoAlias return!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61658 91177308-0d34-0410-b5e6-96231b3b80d8
other SPEC breakage. I'll be reverting all recent
changes shortly, this checking is mostly so this
change doesn't get lost.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61402 91177308-0d34-0410-b5e6-96231b3b80d8
my last patch to this file.
The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop. This was extra bad
because register pressure later forced both base and IV into
memory. Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this. However,
there were side effects....
It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before. And when inserting
new code that feeds into a PHI, it's right to put such
code at the original location rather than in the PHI's
immediate predecessor(s) when the original location is outside
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.
Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.
I owe some testcases for this, want to get it in for nightly runs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61362 91177308-0d34-0410-b5e6-96231b3b80d8
- Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61248 91177308-0d34-0410-b5e6-96231b3b80d8
my last patch to this file.
The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop. This was extra bad
because register pressure later forced both base and IV into
memory. Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this. However,
there were side effects....
It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before. (This patch does not handle
all the cases where this can happen.) And when inserting
new code that feeds into a PHI, it's right to put such
code at the original location rather than in the PHI's
immediate predecessor(s) when the original location is outside
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.
Everything above is exercised in
CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is
the same IR).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61178 91177308-0d34-0410-b5e6-96231b3b80d8
can be negative. Keep track of whether all uses of
an IV are outside the loop. Some cosmetics; no
functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61109 91177308-0d34-0410-b5e6-96231b3b80d8
CFG when there is exactly one predecessor where the load is not available.
This is designed to not increase code size but still eliminate partially
redundant loads. This fires 1765 times on 403.gcc even though it doesn't
do critical edge splitting yet (the most common reason for it to fail).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61027 91177308-0d34-0410-b5e6-96231b3b80d8
cleans up the generated code a bit. This should have the added benefit of
not randomly renaming functions/globals like my previous patch did. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61023 91177308-0d34-0410-b5e6-96231b3b80d8
llvm[2]: Linking Release executable opt (without symbols)
...
Undefined symbols:
"llvm::APFloat::IEEEsingle", referenced from:
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
"llvm::APFloat::IEEEdouble", referenced from:
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
ld: symbol(s) not found
This is in release mode. To replicate, compile llvm and llvm-gcc in optimized
mode. Then build llvm, in optimized mode, with the newly created compiler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60977 91177308-0d34-0410-b5e6-96231b3b80d8
of a pointer. This allows is to catch more equivalencies. For example,
the type_lists_compatible_p function used to require two iterations of
the gvn pass (!) to delete its 18 redundant loads because the first pass
would CSE all the addressing computation cruft, which would unblock the
second memdep/gvn passes from recognizing them. This change allows
memdep/gvn to catch all 18 when run just once on the function (as is
typical :) instead of just 3.
On all of 403.gcc, this bumps up the # reundandancies found from:
63 gvn - Number of instructions PRE'd
153991 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
to:
63 gvn - Number of instructions PRE'd
154137 gvn - Number of instructions deleted
50185 gvn - Number of loads deleted
+120 loads deleted isn't bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60799 91177308-0d34-0410-b5e6-96231b3b80d8
MemDep::getNonLocalPointerDependency method. There are
some open issues with this (missed optimizations) and
plenty of future work, but this does allow GVN to eliminate
*slightly* more loads (49246 vs 49033).
Switching over now allows simplification of the other code
path in memdep.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60780 91177308-0d34-0410-b5e6-96231b3b80d8
jump threading has been shown to only expose problems not
have bugs itself. I'm sure it's completely bug free! ;-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60725 91177308-0d34-0410-b5e6-96231b3b80d8
doesn't do its own local caching, and is slightly more aggressive about
free/store dse (see testcase). This eliminates the last external client
of MemDep::getDependenceFrom().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60619 91177308-0d34-0410-b5e6-96231b3b80d8
loops when they can be subsumed into addressing modes.
Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60608 91177308-0d34-0410-b5e6-96231b3b80d8
1. Merge the 'None' result into 'Normal', making loads
and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
distinguish between the cases when memdep knows the value is
produced from when we just know if may be changed.
3. Move some of the logic for determining whether readonly calls
are CSEs into memdep instead of it being in GVN. This still
leaves verification that the arguments are hte same to GVN to
let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use
getModRefInfo(CallSite,CallSite) instead of doing something
very weak. This only really matters for things like DSA, but
someday maybe we'll have some other decent context sensitive
analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
a) readonly call CSE is slightly simpler
b) I eliminated the "getDependencyFrom" chaining for load
elimination and load CSE doesn't have to worry about
volatile (they are always clobbers) anymore.
c) GVN no longer does any 'lastLoad' caching, leaving it to
memdep.
7. The logic in DSE is simplified a bit and sped up. A potentially
unsafe case was eliminated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60607 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes many bugs. I will add more test cases in a separate check-in.
Some day, the code that manipulates CFG and updates dom. info could use refactoring help.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60554 91177308-0d34-0410-b5e6-96231b3b80d8
1) have it fold "br undef", which does occur with
surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks. This removes the successor
edges, reducing the in-edges of other blocks, allowing
recursive simplification.
3) Fold things like:
br COND, BBX, BBY
BBX:
br COND, BBZ, BBW
which also happens because jump threading iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60470 91177308-0d34-0410-b5e6-96231b3b80d8
straight-forward implementation. This does not require any extra
alias analysis queries beyond what we already do for non-local loads.
Some programs really really like load PRE. For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.
The biggest limitation to the implementation is that it does not split
critical edges. This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.
The implementation of this should incidentally speed up rejection of
non-local loads because it avoids creating the repl densemap in cases
when it won't be used for fully redundant loads.
This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact. This is pretty close to ready though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60408 91177308-0d34-0410-b5e6-96231b3b80d8
constant. If X is a constant, then this is folded elsewhere.
- Added a note to Target/README.txt to indicate that we'd like to implement
this when we're able.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60399 91177308-0d34-0410-b5e6-96231b3b80d8
a new value numbering set after splitting a critical edge. This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60393 91177308-0d34-0410-b5e6-96231b3b80d8
figuring out the base of the IV. This produces better
code in the example. (Addresses use (IV) instead of
(BASE,IV) - a significant improvement on low-register
machines like x86).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60374 91177308-0d34-0410-b5e6-96231b3b80d8
instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60335 91177308-0d34-0410-b5e6-96231b3b80d8
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then
notifying.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60329 91177308-0d34-0410-b5e6-96231b3b80d8
new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60327 91177308-0d34-0410-b5e6-96231b3b80d8
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal. This makes load elimination
more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60314 91177308-0d34-0410-b5e6-96231b3b80d8
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60313 91177308-0d34-0410-b5e6-96231b3b80d8