1. Teach it new tricks: in particular how to propagate through signed shr and sexts.
2. Teach it to return a bitset of known-1 and known-0 bits, instead of just zero.
3. Teach instcombine (AND X, C) to fold when we know all C bits of X.
This implements Regression/Transforms/InstCombine/bittest.ll, and allows
future things to be simplified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26087 91177308-0d34-0410-b5e6-96231b3b80d8
instruction onto the worklist (in case they are now dead).
Add a really trivial local DSE implementation to help out bitfield code.
We now fold this:
struct S {
unsigned char a : 1, b : 1, c : 1, d : 2, e : 3;
S();
};
S::S() : a(0), b(0), c(1), d(0), e(6) {}
to this:
void %_ZN1SC1Ev(%struct.S* %this) {
entry:
%tmp.1 = getelementptr %struct.S* %this, int 0, uint 0
store ubyte 38, ubyte* %tmp.1
ret void
}
much earlier (in gccas instead of only in gccld after DSE runs).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26050 91177308-0d34-0410-b5e6-96231b3b80d8
is just as efficient as MVIZ and is also more general.
Fix a few minor bugs introduced in recent patches
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26036 91177308-0d34-0410-b5e6-96231b3b80d8
mask. This allows the code to be simpler and more efficient.
Also, generalize some of the cases in MVIZ a bit, making it slightly more aggressive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26035 91177308-0d34-0410-b5e6-96231b3b80d8
'demanded bits', inspired by Nate's work in the dag combiner. This isn't
complete, but needs to unrelated instcombiner changes to continue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26033 91177308-0d34-0410-b5e6-96231b3b80d8
Turn A / (C1 << N), where C1 is "1<<C2" into A >> (N+C2) [udiv only].
Tested with: rem.ll:test5, div.ll:test10
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26003 91177308-0d34-0410-b5e6-96231b3b80d8
the shifts.
This allows us to fold this (which is the 'integer add a constant' sequence
from cozmic's scheme compmiler):
int %x(uint %anf-temporary776) {
%anf-temporary777 = shr uint %anf-temporary776, ubyte 1
%anf-temporary800 = cast uint %anf-temporary777 to int
%anf-temporary804 = shl int %anf-temporary800, ubyte 1
%anf-temporary805 = add int %anf-temporary804, -2
%anf-temporary806 = or int %anf-temporary805, 1
ret int %anf-temporary806
}
into this:
int %x(uint %anf-temporary776) {
%anf-temporary776 = cast uint %anf-temporary776 to int
%anf-temporary776.mask1 = add int %anf-temporary776, -2
%anf-temporary805 = or int %anf-temporary776.mask1, 1
ret int %anf-temporary805
}
note that instcombine already knew how to eliminate the AND that the two
shifts fold into. This is tested by InstCombine/shift.ll:test26
-Chris
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@25128 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for specifying alignment and size of setjmp jmpbufs.
No targets currently do anything with this information, nor is it presrved
in the bytecode representation. That's coming up next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24196 91177308-0d34-0410-b5e6-96231b3b80d8
a few times in crafty:
OLD: %tmp.36 = div int %tmp.35, 8 ; <int> [#uses=1]
NEW: %tmp.36 = div uint %tmp.35, 8 ; <uint> [#uses=0]
OLD: %tmp.19 = div int %tmp.18, 8 ; <int> [#uses=1]
NEW: %tmp.19 = div uint %tmp.18, 8 ; <uint> [#uses=0]
OLD: %tmp.117 = div int %tmp.116, 8 ; <int> [#uses=1]
NEW: %tmp.117 = div uint %tmp.116, 8 ; <uint> [#uses=0]
OLD: %tmp.92 = div int %tmp.91, 8 ; <int> [#uses=1]
NEW: %tmp.92 = div uint %tmp.91, 8 ; <uint> [#uses=0]
Which all turn into shrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24190 91177308-0d34-0410-b5e6-96231b3b80d8
8 times in vortex, allowing the srems to be turned into shrs:
OLD: %tmp.104 = rem int %tmp.5.i37, 16 ; <int> [#uses=1]
NEW: %tmp.104 = rem uint %tmp.5.i37, 16 ; <uint> [#uses=0]
OLD: %tmp.98 = rem int %tmp.5.i24, 16 ; <int> [#uses=1]
NEW: %tmp.98 = rem uint %tmp.5.i24, 16 ; <uint> [#uses=0]
OLD: %tmp.91 = rem int %tmp.5.i19, 8 ; <int> [#uses=1]
NEW: %tmp.91 = rem uint %tmp.5.i19, 8 ; <uint> [#uses=0]
OLD: %tmp.88 = rem int %tmp.5.i14, 8 ; <int> [#uses=1]
NEW: %tmp.88 = rem uint %tmp.5.i14, 8 ; <uint> [#uses=0]
OLD: %tmp.85 = rem int %tmp.5.i9, 1024 ; <int> [#uses=2]
NEW: %tmp.85 = rem uint %tmp.5.i9, 1024 ; <uint> [#uses=0]
OLD: %tmp.82 = rem int %tmp.5.i, 512 ; <int> [#uses=2]
NEW: %tmp.82 = rem uint %tmp.5.i1, 512 ; <uint> [#uses=0]
OLD: %tmp.48.i = rem int %tmp.5.i.i161, 4 ; <int> [#uses=1]
NEW: %tmp.48.i = rem uint %tmp.5.i.i161, 4 ; <uint> [#uses=0]
OLD: %tmp.20.i2 = rem int %tmp.5.i.i, 4 ; <int> [#uses=1]
NEW: %tmp.20.i2 = rem uint %tmp.5.i.i, 4 ; <uint> [#uses=0]
it also occurs 9 times in gcc, but with odd constant divisors (1009 and 61)
so the payoff isn't as great.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24189 91177308-0d34-0410-b5e6-96231b3b80d8
one use (but one is a cast). This handles the very common case of:
X = alloc [n x byte]
Y = cast X to somethingbetter
seteq X, null
In order to avoid infinite looping when there are multiple casts, we only
allow this if the xform is strictly increasing the alignment of the
allocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23961 91177308-0d34-0410-b5e6-96231b3b80d8
where the second has less alignment required. If we had explicit alignment
support in the IR, we could handle this case, but we can't until we do.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23960 91177308-0d34-0410-b5e6-96231b3b80d8
This is useful for 178.galgel where resolution of dope vectors (by the
optimizer) causes the scales to become apparent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23328 91177308-0d34-0410-b5e6-96231b3b80d8
if () { store A -> P; } else { store B -> P; }
into a PHI node with one store, in the most trival case. This implements
load.ll:test10.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23324 91177308-0d34-0410-b5e6-96231b3b80d8
load are exactly consequtive. This is picked up by other passes, but this
triggers thousands of times in fortran programs that use static locals
(and is thus a compile-time speedup).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23320 91177308-0d34-0410-b5e6-96231b3b80d8
BasicBlock's removePredecessor routine. This requires shuffling around
the definition and implementation of hasContantValue from Utils.h,cpp into
Instructions.h,cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22664 91177308-0d34-0410-b5e6-96231b3b80d8
Because the instcombine has to scan the entire function when it starts up
to begin with, we might as well do it in DFO so we can nuke unreachable code.
This fixes: Transforms/InstCombine/2005-07-07-DeadPHILoop.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22348 91177308-0d34-0410-b5e6-96231b3b80d8
It is actually always true. This fixes PR586 and
Transforms/InstCombine/2005-06-16-SetCCOrSetCCMiscompile.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22236 91177308-0d34-0410-b5e6-96231b3b80d8
in. This tends to get cases like this:
X = cast ubyte to int
Y = shr int X, ...
Tested by: shift.ll:test24
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21775 91177308-0d34-0410-b5e6-96231b3b80d8
the result, turn signed shift rights into unsigned shift rights if possible.
This leads to later simplification and happens *often* in 176.gcc. For example,
this testcase:
struct xxx { unsigned int code : 8; };
enum codes { A, B, C, D, E, F };
int foo(struct xxx *P) {
if ((enum codes)P->code == A)
bar();
}
used to be compiled to:
int %foo(%struct.xxx* %P) {
%tmp.1 = getelementptr %struct.xxx* %P, int 0, uint 0 ; <uint*> [#uses=1]
%tmp.2 = load uint* %tmp.1 ; <uint> [#uses=1]
%tmp.3 = cast uint %tmp.2 to int ; <int> [#uses=1]
%tmp.4 = shl int %tmp.3, ubyte 24 ; <int> [#uses=1]
%tmp.5 = shr int %tmp.4, ubyte 24 ; <int> [#uses=1]
%tmp.6 = cast int %tmp.5 to sbyte ; <sbyte> [#uses=1]
%tmp.8 = seteq sbyte %tmp.6, 0 ; <bool> [#uses=1]
br bool %tmp.8, label %then, label %UnifiedReturnBlock
Now it is compiled to:
%tmp.1 = getelementptr %struct.xxx* %P, int 0, uint 0 ; <uint*> [#uses=1]
%tmp.2 = load uint* %tmp.1 ; <uint> [#uses=1]
%tmp.2 = cast uint %tmp.2 to sbyte ; <sbyte> [#uses=1]
%tmp.8 = seteq sbyte %tmp.2, 0 ; <bool> [#uses=1]
br bool %tmp.8, label %then, label %UnifiedReturnBlock
which is the difference between this:
foo:
subl $4, %esp
movl 8(%esp), %eax
movl (%eax), %eax
shll $24, %eax
sarl $24, %eax
testb %al, %al
jne .LBBfoo_2
and this:
foo:
subl $4, %esp
movl 8(%esp), %eax
movl (%eax), %eax
testb %al, %al
jne .LBBfoo_2
This occurs 3243 times total in the External tests, 215x in povray,
6x in each f2c'd program, 1451x in 176.gcc, 7x in crafty, 20x in perl,
25x in gap, 3x in m88ksim, 25x in ijpeg.
Maybe this will cause a little jump on gcc tommorow :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21715 91177308-0d34-0410-b5e6-96231b3b80d8
This implements set.ll:test20.
This triggers 2x on povray, 9x on mesa, 11x on gcc, 2x on crafty, 1x on eon,
6x on perlbmk and 11x on m88ksim.
It allows us to compile these two functions into the same code:
struct s { unsigned int bit : 1; };
unsigned foo(struct s *p) {
if (p->bit)
return 1;
else
return 0;
}
unsigned bar(struct s *p) { return p->bit; }
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21690 91177308-0d34-0410-b5e6-96231b3b80d8
Completely rework the 'setcc (cast x to larger), y' code. This code has
the advantage of implementing setcc.ll:test19 (being more general than
the previous code) and being correct in all cases.
This allows us to unxfail 2004-11-27-SetCCForCastLargerAndConstant.ll,
and close PR454.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21491 91177308-0d34-0410-b5e6-96231b3b80d8
* Properly compile this:
struct a {};
int test() {
struct a b[2];
if (&b[0] != &b[1])
abort ();
return 0;
}
to 'return 0', not abort().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19875 91177308-0d34-0410-b5e6-96231b3b80d8
The second folds operations into selects, e.g. (select C, (X+Y), (Y+Z))
-> (Y+(select C, X, Z)
This occurs a few times across spec, e.g.
select add/sub
mesa: 83 0
povray: 5 2
gcc 4 2
parser 0 22
perlbmk 13 30
twolf 0 3
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19706 91177308-0d34-0410-b5e6-96231b3b80d8
Disable the xform for < > cases. It turns out that the following is being
miscompiled:
bool %test(sbyte %S) {
%T = cast sbyte %S to uint
%V = setgt uint %T, 255
ret bool %V
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19628 91177308-0d34-0410-b5e6-96231b3b80d8
* We can now fold cast instructions into select instructions that
have at least one constant operand.
* We now optimize expressions more aggressively based on bits that are
known to be zero. These optimizations occur a lot in code that uses
bitfields even in simple ways.
* We now turn more cast-cast sequences into AND instructions. Before we
would only do this if it if all types were unsigned. Now only the
middle type needs to be unsigned (guaranteeing a zero extend).
* We transform sign extensions into zero extensions in several cases.
This corresponds to these test/Regression/Transforms/InstCombine testcases:
2004-11-22-Missed-and-fold.ll
and.ll: test28-29
cast.ll: test21-24
and-or-and.ll
cast-cast-to-and.ll
zeroext-and-reduce.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19220 91177308-0d34-0410-b5e6-96231b3b80d8
successor block. This turns cases like this:
x = a op b
if (c) {
use x
}
into:
if (c) {
x = a op b
use x
}
This triggers 3965 times in spec, and is tested by
Regression/Transforms/InstCombine/sink_instruction.ll
This appears to expose a bug in the X86 backend for 177.mesa, which I'm
looking in to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18677 91177308-0d34-0410-b5e6-96231b3b80d8
* Make sure we handle signed to unsigned conversion correctly
* Move this visitSetCondInst case to its own method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18312 91177308-0d34-0410-b5e6-96231b3b80d8
If this happens, detect it early instead of relying on instcombine to notice
it later. This can be a big speedup, because PHI nodes can have many
incoming values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17741 91177308-0d34-0410-b5e6-96231b3b80d8
This exposes subsequent optimization possiblities and reduces code size.
This triggers 1423 times in spec.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17740 91177308-0d34-0410-b5e6-96231b3b80d8
%X = alloca ...
%Y = alloca ...
X == Y
into false. This allows us to simplify some stuff in eon (and probably
many other C++ programs) where operator= was checking for self assignment.
Folding this allows us to SROA several additional structs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17735 91177308-0d34-0410-b5e6-96231b3b80d8
for (X * C1) + (X * C2) (where * can be mul or shl), allowing us to fold:
Y+Y+Y+Y+Y+Y+Y+Y
into
%tmp.8 = shl long %Y, ubyte 3 ; <long> [#uses=1]
instead of
%tmp.4 = shl long %Y, ubyte 2 ; <long> [#uses=1]
%tmp.12 = shl long %Y, ubyte 2 ; <long> [#uses=1]
%tmp.8 = add long %tmp.4, %tmp.12 ; <long> [#uses=1]
This implements add.ll:test25
Also add support for (X*C1)-(X*C2) -> X*(C1-C2), implementing sub.ll:test18
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17704 91177308-0d34-0410-b5e6-96231b3b80d8