an instruction if it can be hoisted to a common dominator of the block.
This implements: test/Regression/Transforms/TailDup/MergeTest.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16758 91177308-0d34-0410-b5e6-96231b3b80d8
* SubOne/AddOne functions always return ConstantInt, declare them as such
* Pull code for handling setcc X, cst, where cst is at the end of the range,
or cc is LE or GE up earlier in visitSetCondInst. This reduces #iterations
in some cases.
* Fold: (div X, C1) op C2 -> range check, implementing div.ll:test6 - test9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16588 91177308-0d34-0410-b5e6-96231b3b80d8
This takes something like this:
%A = phi int [ 3, %cond_false.0 ], [ 2, %endif.0.i ], [ 2, %endif.1.i ]
%B = div int %tmp.243, 4
and turns it into:
%A = phi int [ 3/4, %cond_false.0 ], [ 2/4, %endif.0.i ], [ 2/4, %endif.1.i ]
which is later simplified (in this case) into %A = 0.
This triggers thousands of times in spec, for example, 269 times in 176.gcc.
This is tested by InstCombine/add.ll:test23 and set.ll:test18.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16582 91177308-0d34-0410-b5e6-96231b3b80d8
Instcombine (setcc (truncate X), C1).
This occurs THOUSANDS of times in many benchmarks. Particularlly common
seem to be things like (seteq (cast bool X to int), int 0)
This turns it into (seteq bool %X, false), which then becomes (not %X).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16567 91177308-0d34-0410-b5e6-96231b3b80d8
This is important for several reasons:
1. Benchmarks have lots of code that looks like this (perlbmk in particular):
%tmp.2.i = setne int %tmp.0.i, 128 ; <bool> [#uses=1]
%tmp.6343 = seteq int %tmp.0.i, 1 ; <bool> [#uses=1]
%tmp.63 = and bool %tmp.2.i, %tmp.6343 ; <bool> [#uses=1]
we now fold away the setne, a clear improvement.
2. In the more important cases, such as (X >= 10) & (X < 20), we now produce
smaller code: (X-10) < 10.
3. Perhaps the nicest effect of this patch is that it really helps out the
code generators. In particular, for a 'range test' like the above,
instead of generating this on X86 (the difference on PPC is even more
pronounced):
cmp %EAX, 50
setge %CL
cmp %EAX, 100
setl %AL
and %CL, %AL
cmp %CL, 0
we now generate this:
add %EAX, -50
cmp %EAX, 50
Furthermore, this causes setcc's to be folded into branches more often.
These combinations trigger dozens of times in the spec benchmarks, particularly
in 176.gcc, 186.crafty, 253.perlbmk, 254.gap, & 099.go.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16559 91177308-0d34-0410-b5e6-96231b3b80d8
Implement (setcc (shl X, C1), C2) folding.
The second one occurs several dozen times in spec. The first was added
just in case. :)
These are tested by shift.ll:test2[12], and div.ll:test5
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16549 91177308-0d34-0410-b5e6-96231b3b80d8
This latent bug was exposed by recent changes, and is tested as:
llvm/test/Regression/Transforms/InstCombine/2004-09-28-BadShiftAndSetCC.llx
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16546 91177308-0d34-0410-b5e6-96231b3b80d8
where we folded (X & 254) -> X < 1 instead of X < 2. These problems were
latent problems exposed by the latest patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16528 91177308-0d34-0410-b5e6-96231b3b80d8
triggers often, for example:
6x in povray, 1x in gzip, 279x in gcc, 1x in crafty, 8x in eon, 11x in perlbmk,
362x in gap, 4x in vortex, 14 in m88ksim, 211x in 126.gcc, 1x in compress,
11x in ijpeg, and 4x in 147.vortex.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16521 91177308-0d34-0410-b5e6-96231b3b80d8
Move include/Config and include/Support into include/llvm/Config,
include/llvm/ADT and include/llvm/Support. From here on out, all LLVM
public header files must be under include/llvm/.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16137 91177308-0d34-0410-b5e6-96231b3b80d8
This also registers the pass with opt with a -lower-packed command line
option.
Patch contributed by Brad Jones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15987 91177308-0d34-0410-b5e6-96231b3b80d8
assumed that a constant on the RHS of a multiplication was either an
IntConstant or an FPConstant. It checked for an IntConstant and then,
if it did not find one, did a hard cast to an FPConstant. That code
would crash if the RHS were a ConstantExpr that was neither an
IntConstant nor an FPConstant. This version replaces the hard cast
with a dyn_cast. It performs the same way for IntConstants and
FPConstants but does nothing, instead of crashing, for constant
expressions.
The regression test for this change is 2004-07-27-ConstantExprMul.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15291 91177308-0d34-0410-b5e6-96231b3b80d8
a bug in DSE).
* Delete dead operand uses iteratively instead of recursively, using a
SetVector.
* Defer deletion of dead operand uses until the end of processing, which means
we don't have to bother with updating the AliasSetTracker. This speeds up
DSE substantially.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15204 91177308-0d34-0410-b5e6-96231b3b80d8
can be improved in many ways. But: stop laughing, even with -basicaa it
deletes 15% of the stores in 252.eon :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15101 91177308-0d34-0410-b5e6-96231b3b80d8
* Test for whether bits are shifted out during the optzn.
If so, the fold is illegal, though it can be handled explicitly for setne/seteq
This fixes the miscompilation of 254.gap last night, which was a latent bug
exposed by other optimizer improvements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15085 91177308-0d34-0410-b5e6-96231b3b80d8
actually care about. Someday when the cast instruction is gone, we can do
better here, but this will do for now. This implements
instcombine/cast.ll:test17/18 as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15018 91177308-0d34-0410-b5e6-96231b3b80d8
Speed up SCCP substantially by processing overdefined values quickly. This
patch speeds up SCCP by about 30-40% on large testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14861 91177308-0d34-0410-b5e6-96231b3b80d8
"load (cast foo)". This allows us to compile C++ code like this:
class Bclass {
public: virtual int operator()() { return 666; }
};
class Dclass: public Bclass {
public: virtual int operator()() { return 667; }
} ;
int main(int argc, char** argv) {
Dclass x;
return x();
}
Into this:
int %main(int %argc, sbyte** %argv) {
entry:
call void %__main( )
ret int 667
}
Instead of this:
int %main(int %argc, sbyte** %argv) {
entry:
%x = alloca "struct.std::bad_typeid" ; <"struct.std::bad_typeid"*> [#uses=3]
call void %__main( )
%tmp.1.i.i = getelementptr "struct.std::bad_typeid"* %x, uint 0, uint 0, uint 0 ; <int (...)***> [#uses=1]
store int (...)** getelementptr ([3 x int (...)*]* %vtable for Bclass, int 0, long 2), int (...)*** %tmp.1.i.i
%tmp.3.i = getelementptr "struct.std::bad_typeid"* %x, int 0, uint 0, uint 0 ; <int (...)***> [#uses=1]
store int (...)** getelementptr ([3 x int (...)*]* %vtable for Dclass, int 0, long 2), int (...)*** %tmp.3.i
%tmp.5 = load int ("struct.std::bad_typeid"*)** cast (int (...)** getelementptr ([3 x int (...)*]* %vtable for Dclass, int 0, long 2) to int
("struct.std::bad_typeid"*)**) ; <int ("struct.std::bad_typeid"*)*> [#uses=1]
%tmp.6 = call int %tmp.5( "struct.std::bad_typeid"* %x ) ; <int> [#uses=1]
ret int %tmp.6
ret int 0
}
In order words, we now resolve the virtual function call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14783 91177308-0d34-0410-b5e6-96231b3b80d8
Don't touch GEPs for which DecomposeArrayRef is not going to do anything
special (e.g., < 2 indices, or 2 indices and the last one is a constant.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14647 91177308-0d34-0410-b5e6-96231b3b80d8
Also, remove X % -1 = 0, because it's not true for unsigneds, and the
signed case is superceeded by this new handling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14637 91177308-0d34-0410-b5e6-96231b3b80d8
was processing blocks in whatever order they happened to end up in the
dominator tree data structure. Force an ordering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14248 91177308-0d34-0410-b5e6-96231b3b80d8
186.crafty, fhourstones and 132.ijpeg.
Bugpoint makes really nasty miscompilations embarassingly easy to find. It
narrowed it down to the instcombiner and this testcase (from fhourstones):
bool %l7153_l4706_htstat_loopentry_2E_4_no_exit_2E_4(int* %i, [32 x int]* %works, int* %tmp.98.out) {
newFuncRoot:
%tmp.96 = load int* %i ; <int> [#uses=1]
%tmp.97 = getelementptr [32 x int]* %works, long 0, int %tmp.96 ; <int*> [#uses=1]
%tmp.98 = load int* %tmp.97 ; <int> [#uses=2]
%tmp.99 = load int* %i ; <int> [#uses=1]
%tmp.100 = and int %tmp.99, 7 ; <int> [#uses=1]
%tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=2]
%tmp.102 = cast bool %tmp.101 to int ; <int> [#uses=0]
br bool %tmp.101, label %codeRepl4.exitStub, label %codeRepl3.exitStub
codeRepl4.exitStub: ; preds = %newFuncRoot
store int %tmp.98, int* %tmp.98.out
ret bool true
codeRepl3.exitStub: ; preds = %newFuncRoot
store int %tmp.98, int* %tmp.98.out
ret bool false
}
... which only has one combination performed on it:
$ llvm-as < t.ll | opt -instcombine -debug | llvm-dis
IC: Old = %tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=1]
New = setne int %tmp.100, 0 ; <bool>:<badref> [#uses=0]
IC: MOD = br bool %tmp.101, label %codeRepl3.exitStub, label %codeRepl4.exitStub
IC: MOD = %tmp.97 = getelementptr [32 x int]* %works, uint 0, int %tmp.96 ; <int*> [#uses=1]
It doesn't get much better than this. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14109 91177308-0d34-0410-b5e6-96231b3b80d8
collapse this:
bool %le(int %A, int %B) {
%c1 = setgt int %A, %B
%tmp = select bool %c1, int 1, int 0
%c2 = setlt int %A, %B
%result = select bool %c2, int -1, int %tmp
%c3 = setle int %result, 0
ret bool %c3
}
into:
bool %le(int %A, int %B) {
%c3 = setle int %A, %B ; <bool> [#uses=1]
ret bool %c3
}
which is handy, because the Java FE makes these sequences all over the place.
This is tested as: test/Regression/Transforms/InstCombine/JavaCompare.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14086 91177308-0d34-0410-b5e6-96231b3b80d8
This code hadn't been updated after the "structs with more than 256 elements"
related changes to the GEP instruction. Also it was not handling the
ConstantAggregateZero class.
Now it does!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13834 91177308-0d34-0410-b5e6-96231b3b80d8
into (X & (C2 << C1)) != (C3 << C1), where the shift may be either left or
right and the compare may be any one.
This triggers 1546 times in 176.gcc alone, as it is a common pattern that
occurs for bitfield accesses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13740 91177308-0d34-0410-b5e6-96231b3b80d8
in the size calculation.
This is not something you want to see:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - UNROLLING!
The problem was that 2*2147483648 == 0.
Now we get:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - TOO LARGE: 4294967296>100
Thanks to some anonymous person playing with the demo page that repeatedly
caused zion to go into swapping land. That's one way to ensure you'll get
a quick bugfix. :)
Testcase here: Transforms/LoopUnroll/2004-05-13-DontUnrollTooMuch.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13564 91177308-0d34-0410-b5e6-96231b3b80d8
%tmp.0 = getelementptr [50 x sbyte]* %ar, uint 0, int 5 ; <sbyte*> [#uses=2]
%tmp.7 = getelementptr sbyte* %tmp.0, int 8 ; <sbyte*> [#uses=1]
together. This patch actually allows us to simplify and generalize the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13415 91177308-0d34-0410-b5e6-96231b3b80d8
is only used by a cast, and the casted type is the same size as the original
allocation, it would eliminate the cast by folding it into the allocation.
Unfortunately, it was placing the new allocation instruction right before
the cast, which could pull (for example) alloca instructions into the body
of a function. This turns statically allocatable allocas into expensive
dynamically allocated allocas, which is bad bad bad.
This fixes the problem by placing the new allocation instruction at the same
place the old one was, duh. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13289 91177308-0d34-0410-b5e6-96231b3b80d8
loop. This eliminates the extra add from the previous case, but it's
not clear that this will be a performance win overall. Tommorows test
results will tell. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13103 91177308-0d34-0410-b5e6-96231b3b80d8
over its USES. If it's dead it doesn't have any uses! :)
Thanks to the fabulous and mysterious Bill Wendling for pointing this out. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13102 91177308-0d34-0410-b5e6-96231b3b80d8
structure to being dynamically computed on demand. This makes updating
loop information MUCH easier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13045 91177308-0d34-0410-b5e6-96231b3b80d8
that the exit block of the loop becomes the new entry block of the function.
This was causing a verifier assertion on 252.eon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13039 91177308-0d34-0410-b5e6-96231b3b80d8
block. The primary motivation for doing this is that we can now unroll nested loops.
This makes a pretty big difference in some cases. For example, in 183.equake,
we are now beating the native compiler with the CBE, and we are a lot closer
with LLC.
I'm now going to play around a bit with the unroll factor and see what effect
it really has.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13034 91177308-0d34-0410-b5e6-96231b3b80d8
limited. Even in it's extremely simple state (it can only *fully* unroll single
basic block loops that execute a constant number of times), it already helps improve
performance a LOT on some benchmarks, particularly with the native code generators.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13028 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of producing code like this:
Loop:
X = phi 0, X2
...
X2 = X + 1
if (X != N-1) goto Loop
We now generate code that looks like this:
Loop:
X = phi 0, X2
...
X2 = X + 1
if (X2 != N) goto Loop
This has two big advantages:
1. The trip count of the loop is now explicit in the code, allowing
the direct implementation of Loop::getTripCount()
2. This reduces register pressure in the loop, and allows X and X2 to be
put into the same register.
As a consequence of the second point, the code we generate for loops went
from:
.LBB2: # no_exit.1
...
mov %EDI, %ESI
inc %EDI
cmp %ESI, 2
mov %ESI, %EDI
jne .LBB2 # PC rel: no_exit.1
To:
.LBB2: # no_exit.1
...
inc %ESI
cmp %ESI, 3
jne .LBB2 # PC rel: no_exit.1
... which has two fewer moves, and uses one less register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12961 91177308-0d34-0410-b5e6-96231b3b80d8
This transforms code like this:
%C = or %A, %B
%D = select %cond, %C, %A
into:
%C = select %cond, %B, 0
%D = or %A, %C
Since B is often a constant, the select can often be eliminated. In any case,
this reduces the usage count of A, allowing subsequent optimizations to happen.
This xform applies when the operator is any of:
add, sub, mul, or, xor, and, shl, shr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12800 91177308-0d34-0410-b5e6-96231b3b80d8
that have a constant operand. This implements
add.ll:test19, shift.ll:test15*, and others that are not tested
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12794 91177308-0d34-0410-b5e6-96231b3b80d8
This also implements some new features for the indvars pass, including
linear function test replacement, exit value substitution, and it works with
a much more general class of induction variables and loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12620 91177308-0d34-0410-b5e6-96231b3b80d8
#1 is to unconditionally strip constantpointerrefs out of
instruction operands where they are absolutely pointless and inhibit
optimization. GRRR!
#2 is to implement InstCombine/getelementptr_const.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12519 91177308-0d34-0410-b5e6-96231b3b80d8
as it is making effectively arbitrary modifications to the CFG and we don't
have a domset/domfrontier implementations that can handle the dynamic updates.
Instead of having a bunch of code that doesn't actually work in practice,
just demote any potentially tricky values to the stack (causing the problem
to go away entirely). Later invocations of mem2reg will rebuild SSA for us.
This fixes all of the major performance regressions with tail duplication
from LLVM 1.1. For example, this loop:
---
int popcount(int x) {
int result = 0;
while (x != 0) {
result = result + (x & 0x1);
x = x >> 1;
}
return result;
}
---
Used to be compiled into:
int %popcount(int %X) {
entry:
br label %loopentry
loopentry: ; preds = %entry, %no_exit
%x.0 = phi int [ %X, %entry ], [ %tmp.9, %no_exit ] ; <int> [#uses=3]
%result.1.0 = phi int [ 0, %entry ], [ %tmp.6, %no_exit ] ; <int> [#uses=2]
%tmp.1 = seteq int %x.0, 0 ; <bool> [#uses=1]
br bool %tmp.1, label %loopexit, label %no_exit
no_exit: ; preds = %loopentry
%tmp.4 = and int %x.0, 1 ; <int> [#uses=1]
%tmp.6 = add int %tmp.4, %result.1.0 ; <int> [#uses=1]
%tmp.9 = shr int %x.0, ubyte 1 ; <int> [#uses=1]
br label %loopentry
loopexit: ; preds = %loopentry
ret int %result.1.0
}
And is now compiled into:
int %popcount(int %X) {
entry:
br label %no_exit
no_exit: ; preds = %entry, %no_exit
%x.0.0 = phi int [ %X, %entry ], [ %tmp.9, %no_exit ] ; <int> [#uses=2]
%result.1.0.0 = phi int [ 0, %entry ], [ %tmp.6, %no_exit ] ; <int> [#uses=1]
%tmp.4 = and int %x.0.0, 1 ; <int> [#uses=1]
%tmp.6 = add int %tmp.4, %result.1.0.0 ; <int> [#uses=2]
%tmp.9 = shr int %x.0.0, ubyte 1 ; <int> [#uses=2]
%tmp.1 = seteq int %tmp.9, 0 ; <bool> [#uses=1]
br bool %tmp.1, label %loopexit, label %no_exit
loopexit: ; preds = %no_exit
ret int %tmp.6
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12457 91177308-0d34-0410-b5e6-96231b3b80d8
time from 615s to 1.49s on a large testcase that has a gigantic switch statement
that all of the blocks in the function go to (an intepreter).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12442 91177308-0d34-0410-b5e6-96231b3b80d8
Fix InstCombine/2004-03-13-InstCombineInfLoop.ll which caused an infinite
loop compiling (I think) povray.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12365 91177308-0d34-0410-b5e6-96231b3b80d8
Note that this is a band-aid put over a band-aid. This just undisables
tail duplication in on very specific case that it seems to work in.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11989 91177308-0d34-0410-b5e6-96231b3b80d8
This is a really minor thing, but might help out the 'switch statement induction'
code in simplifycfg.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11900 91177308-0d34-0410-b5e6-96231b3b80d8