when doing this transform if the GEP is not inbounds. No testcase because
it is very difficult to trigger this: instcombine already canonicalizes
GEP indices to pointer size, so it relies specific permutations of the
instcombine worklist.
Thanks to Duncan for pointing this possible problem out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92495 91177308-0d34-0410-b5e6-96231b3b80d8
on the example in PR4216. This doesn't trigger in the testsuite,
so I'd really appreciate someone scrutinizing the logic for
correctness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92458 91177308-0d34-0410-b5e6-96231b3b80d8
occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
is copied in multiple apps) in _sch_istable, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92427 91177308-0d34-0410-b5e6-96231b3b80d8
when a consequtive sequence of elements all satisfies the
predicate. Like the double compare case, this generates better
code than the magic constant case and generalizes to more than
32/64 element array lookups.
Here are some examples where it triggers. From 403.gcc, most
accesses to the rtx_class array are handled, e.g.:
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
%142 = icmp eq i8 %141, 105
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
%165 = icmp eq i8 %164, 60
Also, most of the 59-element arrays (mode_class/rid_to_yy, etc)
optimized before are actually range compares. This lets 32-bit
machines optimize them.
400.perlbmk has stuff like this:
400.perlbmk: PL_regkind, even for 32-bit:
@PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
%811 = icmp ne i8 %810, 33
@PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
%12 = icmp ult i8 %10, 2
etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92426 91177308-0d34-0410-b5e6-96231b3b80d8
two elements match or don't match with two comparisons. For
example, the testcase compiles into:
define i1 @test5(i32 %X) {
%1 = icmp eq i32 %X, 2 ; <i1> [#uses=1]
%2 = icmp eq i32 %X, 7 ; <i1> [#uses=1]
%R = or i1 %1, %2 ; <i1> [#uses=1]
ret i1 %R
}
This generalizes the previous xforms when the array is larger than
64 elements (and this case matches) and generates better code for
cases where it overlaps with the magic bitshift case.
This generalizes more cases than you might expect. For example,
400.perlbmk has:
@PL_utf8skip = constant [256 x i8] c"\01\01\01\...
%15 = icmp ult i8 %7, 7
403.gcc has:
@rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ...
%18 = icmp eq i16 %16, 295
and xalancbmk has a bunch of examples, such as
_ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92417 91177308-0d34-0410-b5e6-96231b3b80d8
arrays with variable indices into a comparison of the index
with a constant. The most common occurrence of this that
I see by far is stuff like:
if ("foobar"[i] == '\0') ...
which we compile into: if (i == 6), saving a load and
materialization of the global address. This also exposes
loop trip count information to later passes in many cases.
This triggers hundreds of times in xalancbmk, which is where I first
noticed it, but it also triggers in many other apps. Here are a few
interesting ones from various apps:
@must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
%scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
%17 = load ...
%18 = icmp eq i8* %17, null ; <i1> [#uses=1]
-> icmp eq i64 %indvar.i, 7
@yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
%57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
%mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
load ...
%64 = icmp eq i8 %58, 4 ; <i1> [#uses=1]
-> icmp eq i64 %.pn, 35 ; <i1> [#uses=0]
@gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
%scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
%425 = load %scevgep.i
%426 = icmp eq i16 %425, -32768 ; <i1> [#uses=0]
-> false
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92411 91177308-0d34-0410-b5e6-96231b3b80d8
pointer to int casts that confuse later optimizations. See PR3351
for details.
This improves but doesn't complete fix 483.xalancbmk because llvm-gcc
does this xform in GCC's "fold" routine as well. Clang++ will do
better I guess.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92408 91177308-0d34-0410-b5e6-96231b3b80d8
a constantexpr gep on the 'base' side of the expression.
This completes comment #4 in PR3351, which comes from
483.xalancbmk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92402 91177308-0d34-0410-b5e6-96231b3b80d8
implement some optimizations for MIN(MIN()) and MAX(MAX()) and
MIN(MAX()) etc. This substantially improves the code in PR5822 but
doesn't kick in much elsewhere. 2 max's were optimized in
pairlocalalign and one in smg2000.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91814 91177308-0d34-0410-b5e6-96231b3b80d8
Use the presence of NSW/NUW to fold "icmp (x+cst), x" to a constant in
cases where it would otherwise be undefined behavior.
Surprisingly (to me at least), this triggers hundreds of the times in
a few benchmarks: lencode, ldecode, and 466.h264ref seem to *really*
like this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91812 91177308-0d34-0410-b5e6-96231b3b80d8
a bunch in lencode, ldecod, spass, 176.gcc, 252.eon, among others. It is
also the first part of PR5822
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91811 91177308-0d34-0410-b5e6-96231b3b80d8
where instcombine would have to split a critical edge due to a
phi node of an invoke. Since instcombine can't change the CFG,
it has to bail out from doing the transformation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91763 91177308-0d34-0410-b5e6-96231b3b80d8
to memcpy. (Such a memcpy is technically illegal, but in practice is safe
and is generated by struct self-assignment in C code.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91621 91177308-0d34-0410-b5e6-96231b3b80d8
generates store to undef and some generates store to null as the idiom
for undefined behavior. Since simplifycfg zaps both, don't remove the
undefined behavior in instcombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89971 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantExpr, not just the top-level operator. This allows it to
fold many more constants.
Also, make GlobalOpt call ConstantFoldConstantExpression on
GlobalVariable initializers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89659 91177308-0d34-0410-b5e6-96231b3b80d8
llvm.invariant.start to be used without necessarily being paired with a call
to llvm.invariant.end. If you run the entire optimization pipeline then such
calls are in fact deleted (adce does it), but that's actually a good thing since
we probably do want them to be zapped late in the game. There should really be
an integration test that checks that the llvm.invariant.start call lasts long
enough that all passes that do interesting things with it get to do their stuff
before it is deleted. But since no passes do anything interesting with it yet
this will have to wait for later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86840 91177308-0d34-0410-b5e6-96231b3b80d8