e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
A update of langref will occur in a subsequent checkin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99928 91177308-0d34-0410-b5e6-96231b3b80d8
pointer. There was also a SmallPtrSet whose settiness wasn't being used, so I
changed it to a SmallVector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99713 91177308-0d34-0410-b5e6-96231b3b80d8
I have audited all getOperandNo calls now, fixing
hidden assumptions. CallSite related uglyness will
be eliminated successively.
Note this patch has a long and griveous history,
for all the back-and-forths have a look at
CallSite.h's log.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99399 91177308-0d34-0410-b5e6-96231b3b80d8
so that the SCEVExpander doesn't retain a dangling pointer as its
insert position. The dangling pointer in this case wasn't ever used
to insert new instructions, but it was causing trouble with
SCEVExpander's code for automatically advancing its insert position
past debug intrinsics.
This fixes use-after-free errors that valgrind noticed in
test/Transforms/IndVarSimplify/2007-06-06-DeleteDanglesPtr.ll and
test/Transforms/IndVarSimplify/exit_value_tests.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99036 91177308-0d34-0410-b5e6-96231b3b80d8
This time I did a self-hosted bootstrap on Linux x86-64,
with no problems. Let's see how darwin 64-bit self-hosting
goes. At the first sign of failure I'll back this out.
Maybe the valgrind bots give me a hint of what may be wrong
(it at all).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98957 91177308-0d34-0410-b5e6-96231b3b80d8
out the remainder of the calls that we should lower in some way and
move the tests to the new correct directory. Fix up tests that are now
optimized more than they were before by -instcombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97875 91177308-0d34-0410-b5e6-96231b3b80d8
can be used in more places. Add an argument for the TargetData that
most of them need. Update for the getInt8PtrTy() change. Should be
no functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97844 91177308-0d34-0410-b5e6-96231b3b80d8
predecessors before returning. Otherwise, if multiple predecessor edges need
splitting, we only get one of them per iteration. This makes a small but
measurable compile time improvement with -enable-full-load-pre.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97521 91177308-0d34-0410-b5e6-96231b3b80d8
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.
Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97126 91177308-0d34-0410-b5e6-96231b3b80d8
argument is non-null, pass it along to PHITranslateSubExpr so that it can
prefer using existing values that dominate the PredBB, instead of just
blindly picking the first equivalent value that it finds on a uselist.
Also when the DominatorTree is specified, have PHITranslateValue filter
out any result that does not dominate the PredBB. This is basically just
refactoring the check that used to be in GetAvailablePHITranslatedSubExpr
and also in GVN.
Despite my initial expectations, this change does not affect the results
of GVN for any testcases that I could find, but it should help compile time.
Before this change, if PHITranslateSubExpr picked a value that does not
dominate, PHITranslateWithInsertion would then insert a new value, which GVN
would later determine to be redundant and would replace. By picking a good
value to begin with, we save GVN the extra work of inserting and then
replacing a new value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97010 91177308-0d34-0410-b5e6-96231b3b80d8
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96774 91177308-0d34-0410-b5e6-96231b3b80d8
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96629 91177308-0d34-0410-b5e6-96231b3b80d8
a loop exit value, so that if a loop gets deleted, ScalarEvolution
isn't stick holding on to dangling SCEVAddRecExprs for that loop. This
fixes PR6339.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96626 91177308-0d34-0410-b5e6-96231b3b80d8
and T->isPointerTy(). Convert most instances of the first form to the second form.
Requested by Chris.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96344 91177308-0d34-0410-b5e6-96231b3b80d8
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96308 91177308-0d34-0410-b5e6-96231b3b80d8
with multiplication by constants distributed through, occasionally
those subexpressions can include both x and -x. For now, if this
condition is discovered within LSR, just prune such cases away,
as they won't be profitable. This fixes a "zero allocated in a
base register" assertion failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96177 91177308-0d34-0410-b5e6-96231b3b80d8
and add a doxygen comment.
Cache the phi entry to avoid doing tons of
PHINode::getBasicBlockIndex calls in the common case.
On my insane testcase from re2c, this speeds up CGP from
617.4s to 7.9s (78x).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96083 91177308-0d34-0410-b5e6-96231b3b80d8
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.
This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95975 91177308-0d34-0410-b5e6-96231b3b80d8
block. Other blocks may have pointer cycles that will crash
basicaa and other alias analyses. In any case, there is no
point wasting cycles optimizing dead blocks. This fixes
rdar://7635088
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95852 91177308-0d34-0410-b5e6-96231b3b80d8
Initial skeleton and SCEVUnknown lowering implemented,
the rest should come relatively quickly. Move testcase
to new directory.
Move pass to right before SimplifyLibCalls - which is
moved down a bit so we can take advantage of a few opts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95628 91177308-0d34-0410-b5e6-96231b3b80d8
container data. This prevents it from holding onto dangling
pointers and potentially behaving unpredictably.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95409 91177308-0d34-0410-b5e6-96231b3b80d8
short-circuited conditions to AND/OR expressions, and those expressions
are often converted back to a short-circuited form in code gen. The
original source order may have been optimized to take advantage of the
expected values, and if we reassociate them, we change the order and
subvert that optimization. Radar 7497329.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95333 91177308-0d34-0410-b5e6-96231b3b80d8
The SRThreshold value makes perfect sense for checking if an entire aggregate
should be promoted to a scalar integer, but it is not so good for splitting
an aggregate into its separate elements. A struct may contain a large embedded
array along with some scalar fields that would benefit from being split apart
by SROA. Even if the total aggregate size is large, it may still be good to
perform SROA. Thus, the most important piece of this patch is simply moving
the aggregate size comparison vs. SRThreshold so that it guards only the
aggregate promotion.
We have also been checking the number of elements to decide if an aggregate
should be split up. The limit of "SRThreshold/4" seemed rather arbitrary,
and I don't think it's very useful to derive this limit from SRThreshold
anyway. I've collected some data showing that the current default limit of
32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct
types. One thing suggested by the data is that distinguishing between structs
and arrays might be useful. There are (obviously) a lot more large arrays
than large structs (as measured by the number of elements and not the total
size -- a large array inside a struct still counts as a single element given
the way we do SROA right now). Out of 8377 arrays where we successfully
performed SROA while compiling a large set of benchmarks, only 16 of them had
more than 8 elements. And, for those 16 arrays, it's not at all clear that
SROA was actually beneficial. So, to offset the compile time cost of
investigating more large structs for SROA, the patch lowers the limit on array
elements to 8.
This fixes Apple Radar 7563690.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95224 91177308-0d34-0410-b5e6-96231b3b80d8
disabled by default. This divides the existing load PRE code into 2 phases:
first it checks that it is safe to move the load to each of the predecessors
where it is unavailable, and then if it is safe, the code is changed to move
the load. Radar 7571861.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95007 91177308-0d34-0410-b5e6-96231b3b80d8
unconditionally. Besides checking the offset, also check that the underlying
object is aligned as much as the load itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94875 91177308-0d34-0410-b5e6-96231b3b80d8
parameter with a default value, instead of just hardcoding it in the
implementation. The limit of MaxLookup = 6 was introduced in r69151 to fix
a performance problem with O(n^2) behavior in instcombine, but the scalarrepl
pass is relying on getUnderlyingObject to go all the way back to an AllocaInst.
Making the limit part of the method signature makes it clear that by default
the result is limited and should help avoid similar problems in the future.
This fixes pr6126.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94433 91177308-0d34-0410-b5e6-96231b3b80d8
for arbitrary terminators in predecessors, don't assume
it is a conditional or uncond branch. The testcase shows
an example where they can happen with switches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94323 91177308-0d34-0410-b5e6-96231b3b80d8
handle the case when we can infer an input to the xor
from all inputs that agree, instead of going into an
infinite loop. Another part of PR6199
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94321 91177308-0d34-0410-b5e6-96231b3b80d8
missing ones are libsupport, libsystem and libvmcore. libvmcore is
currently blocked on bugpoint, which uses EH. Once it stops using
EH, we can switch it off.
This #if 0's out 3 unit tests, because gtest requires RTTI information.
Suggestions welcome on how to fix this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94164 91177308-0d34-0410-b5e6-96231b3b80d8
loop-variant components, adds must be inserted after the increment.
Keep track of the increment position for this case, and insert
these adds in the correct location.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94110 91177308-0d34-0410-b5e6-96231b3b80d8
operands exceeds the number of registers used in the initial
solution, as that wouldn't lead to a profitable solution anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94107 91177308-0d34-0410-b5e6-96231b3b80d8
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.
It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94061 91177308-0d34-0410-b5e6-96231b3b80d8
than the scaled register. This makes it more likely that subsequent
AddrModeMatcher queries will match the new address the same way as the
old, instead of accidentally matching what had been the base register
as the new scaled register, and then failing to match the scaled register.
This fixes some problems with address-mode sinking multiple muls into a
block, which will be a lot more common with some upcoming
LoopStrengthReduction changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93935 91177308-0d34-0410-b5e6-96231b3b80d8
are the same. I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value. Radar 7552893.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93848 91177308-0d34-0410-b5e6-96231b3b80d8
in JT.
2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go. This allows us to
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93253 91177308-0d34-0410-b5e6-96231b3b80d8
condition is a xor with a phi node. This eliminates nonsense
like this from 176.gcc in several places:
LBB166_84:
testl %eax, %eax
- setne %al
- xorb %cl, %al
- notb %al
- testb $1, %al
- je LBB166_85
+ je LBB166_69
+ jmp LBB166_85
This is rdar://7391699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93221 91177308-0d34-0410-b5e6-96231b3b80d8
on the example in PR4216. This doesn't trigger in the testsuite,
so I'd really appreciate someone scrutinizing the logic for
correctness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92458 91177308-0d34-0410-b5e6-96231b3b80d8
occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
is copied in multiple apps) in _sch_istable, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92427 91177308-0d34-0410-b5e6-96231b3b80d8
when a consequtive sequence of elements all satisfies the
predicate. Like the double compare case, this generates better
code than the magic constant case and generalizes to more than
32/64 element array lookups.
Here are some examples where it triggers. From 403.gcc, most
accesses to the rtx_class array are handled, e.g.:
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
%142 = icmp eq i8 %141, 105
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
%165 = icmp eq i8 %164, 60
Also, most of the 59-element arrays (mode_class/rid_to_yy, etc)
optimized before are actually range compares. This lets 32-bit
machines optimize them.
400.perlbmk has stuff like this:
400.perlbmk: PL_regkind, even for 32-bit:
@PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
%811 = icmp ne i8 %810, 33
@PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
%12 = icmp ult i8 %10, 2
etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92426 91177308-0d34-0410-b5e6-96231b3b80d8
two elements match or don't match with two comparisons. For
example, the testcase compiles into:
define i1 @test5(i32 %X) {
%1 = icmp eq i32 %X, 2 ; <i1> [#uses=1]
%2 = icmp eq i32 %X, 7 ; <i1> [#uses=1]
%R = or i1 %1, %2 ; <i1> [#uses=1]
ret i1 %R
}
This generalizes the previous xforms when the array is larger than
64 elements (and this case matches) and generates better code for
cases where it overlaps with the magic bitshift case.
This generalizes more cases than you might expect. For example,
400.perlbmk has:
@PL_utf8skip = constant [256 x i8] c"\01\01\01\...
%15 = icmp ult i8 %7, 7
403.gcc has:
@rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ...
%18 = icmp eq i16 %16, 295
and xalancbmk has a bunch of examples, such as
_ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92417 91177308-0d34-0410-b5e6-96231b3b80d8
arrays with variable indices into a comparison of the index
with a constant. The most common occurrence of this that
I see by far is stuff like:
if ("foobar"[i] == '\0') ...
which we compile into: if (i == 6), saving a load and
materialization of the global address. This also exposes
loop trip count information to later passes in many cases.
This triggers hundreds of times in xalancbmk, which is where I first
noticed it, but it also triggers in many other apps. Here are a few
interesting ones from various apps:
@must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
%scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
%17 = load ...
%18 = icmp eq i8* %17, null ; <i1> [#uses=1]
-> icmp eq i64 %indvar.i, 7
@yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
%57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
%mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
load ...
%64 = icmp eq i8 %58, 4 ; <i1> [#uses=1]
-> icmp eq i64 %.pn, 35 ; <i1> [#uses=0]
@gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
%scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
%425 = load %scevgep.i
%426 = icmp eq i16 %425, -32768 ; <i1> [#uses=0]
-> false
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92411 91177308-0d34-0410-b5e6-96231b3b80d8
pointer to int casts that confuse later optimizations. See PR3351
for details.
This improves but doesn't complete fix 483.xalancbmk because llvm-gcc
does this xform in GCC's "fold" routine as well. Clang++ will do
better I guess.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92408 91177308-0d34-0410-b5e6-96231b3b80d8
a constantexpr gep on the 'base' side of the expression.
This completes comment #4 in PR3351, which comes from
483.xalancbmk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92402 91177308-0d34-0410-b5e6-96231b3b80d8
positive and negative forms of constants together. This
allows us to compile:
int foo(int x, int y) {
return (x-y) + (x-y) + (x-y);
}
into:
_foo: ## @foo
subl %esi, %edi
leal (%rdi,%rdi,2), %eax
ret
instead of (where the 3 and -3 were not factored):
_foo:
imull $-3, 8(%esp), %ecx
imull $3, 4(%esp), %eax
addl %ecx, %eax
ret
this started out as:
movl 12(%ebp), %ecx
imull $3, 8(%ebp), %eax
subl %ecx, %eax
subl %ecx, %eax
subl %ecx, %eax
ret
This comes from PR5359.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92381 91177308-0d34-0410-b5e6-96231b3b80d8
non-templated IRBuilderBase class. Move that large CreateGlobalString
out of line, eliminating the need to #include GlobalVariable.h in IRBuilder.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92227 91177308-0d34-0410-b5e6-96231b3b80d8
SDISel. This optimization was causing simplifylibcalls to
introduce type-unsafe nastiness. This is the first step, I'll be
expanding the memcmp optimizations shortly, covering things that
we really really wouldn't want simplifylibcalls to do.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92098 91177308-0d34-0410-b5e6-96231b3b80d8
load is needed when we have a small store into a large alloca (at which
point we get a load/insert/store sequence), but when you do a full-sized
store, this load ends up being dead.
This dead load is bad in really large nasty testcases where the load ends
up causing mem2reg to insert large chains of dependent phi nodes which only
ADCE can delete. Instead of doing this, just don't insert the dead load.
This fixes rdar://6864035
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91917 91177308-0d34-0410-b5e6-96231b3b80d8
missing check that an array reference doesn't go past the end of the array,
and remove some redundant checks for in-bound array and vector references
that are no longer needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91897 91177308-0d34-0410-b5e6-96231b3b80d8
by merging all returns in a function into a single one, but simplifycfg
currently likes to duplicate the return (an unfortunate choice!)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91890 91177308-0d34-0410-b5e6-96231b3b80d8
instead of stored. This reduces memdep memory usage, and also eliminates a bunch of
weakvh's. This speeds up gvn on gcc.c-torture/20001226-1.c from 23.9s to 8.45s (2.8x)
on a different machine than earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91885 91177308-0d34-0410-b5e6-96231b3b80d8
load to avoid even messing around with SSAUpdate at all. In this case (which
is very common, we can just use the input value directly).
This speeds up GVN time on gcc.c-torture/20001226-1.c from 36.4s to 16.3s,
which still isn't great, but substantially better and this is a simple speedup
that applies to lots of different cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91851 91177308-0d34-0410-b5e6-96231b3b80d8
two-element arrays. After restructuring the SROA code, it was not safe to
do this without adding more checking. It is not clear that this special-case
has really been useful, and removing this simplifies the code quite a bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91828 91177308-0d34-0410-b5e6-96231b3b80d8
implement some optimizations for MIN(MIN()) and MAX(MAX()) and
MIN(MAX()) etc. This substantially improves the code in PR5822 but
doesn't kick in much elsewhere. 2 max's were optimized in
pairlocalalign and one in smg2000.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91814 91177308-0d34-0410-b5e6-96231b3b80d8
Use the presence of NSW/NUW to fold "icmp (x+cst), x" to a constant in
cases where it would otherwise be undefined behavior.
Surprisingly (to me at least), this triggers hundreds of the times in
a few benchmarks: lencode, ldecode, and 466.h264ref seem to *really*
like this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91812 91177308-0d34-0410-b5e6-96231b3b80d8
a bunch in lencode, ldecod, spass, 176.gcc, 252.eon, among others. It is
also the first part of PR5822
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91811 91177308-0d34-0410-b5e6-96231b3b80d8
where instcombine would have to split a critical edge due to a
phi node of an invoke. Since instcombine can't change the CFG,
it has to bail out from doing the transformation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91763 91177308-0d34-0410-b5e6-96231b3b80d8
* change FindElementAndOffset to return a uint64_t instead of unsigned, and
to identify the type to be used for that result in a GEP instruction.
* move "isa<ConstantInt>" to be first in conditional.
* replace some dyn_casts with casts.
* add a comment about handling mem intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91762 91177308-0d34-0410-b5e6-96231b3b80d8
bootstrap. This also replaces the WeakVH references that Chris objected to
with normal Value references.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91711 91177308-0d34-0410-b5e6-96231b3b80d8
contains another loop, or an instruction. The loop form is
substantially more efficient on large loops than the typical
code it replaces.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91654 91177308-0d34-0410-b5e6-96231b3b80d8
of 91296 that caused trouble -- the Processed list needs to be
preserved for the livetime of the pass, as AddUsersIfInteresting
is called from other passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91641 91177308-0d34-0410-b5e6-96231b3b80d8
to memcpy. (Such a memcpy is technically illegal, but in practice is safe
and is generated by struct self-assignment in C code.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91621 91177308-0d34-0410-b5e6-96231b3b80d8
problem", this broke llvm-gcc bootstrap for release builds on
x86_64-apple-darwin10.
This reverts commit db22309800b224a9f5f51baf76071d7a93ce59c9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91534 91177308-0d34-0410-b5e6-96231b3b80d8
found last time. Instead of trying to modify the IR while iterating over it,
I've change it to keep a list of WeakVH references to dead instructions, and
then delete those instructions later. I also added some special case code to
detect and handle the situation when both operands of a memcpy intrinsic are
referencing the same alloca.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91459 91177308-0d34-0410-b5e6-96231b3b80d8