for (X * C1) + (X * C2) (where * can be mul or shl), allowing us to fold:
Y+Y+Y+Y+Y+Y+Y+Y
into
%tmp.8 = shl long %Y, ubyte 3 ; <long> [#uses=1]
instead of
%tmp.4 = shl long %Y, ubyte 2 ; <long> [#uses=1]
%tmp.12 = shl long %Y, ubyte 2 ; <long> [#uses=1]
%tmp.8 = add long %tmp.4, %tmp.12 ; <long> [#uses=1]
This implements add.ll:test25
Also add support for (X*C1)-(X*C2) -> X*(C1-C2), implementing sub.ll:test18
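The two folds above can be checked with a small model. This is a minimal sketch, not the InstCombine code: it assumes 64-bit wraparound arithmetic for `long`, and the helper names (as_multiplier, fold_add, fold_sub) are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # assumption: model a 64-bit `long` with wraparound


def as_multiplier(op, c):
    """Treat `shl X, C` as `mul X, 2**C` so both forms share one fold."""
    return (1 << c) if op == "shl" else c


def fold_add(x, op1, c1, op2, c2):
    """(X op1 C1) + (X op2 C2)  ->  X * (M1 + M2), modulo 2**64."""
    m = (as_multiplier(op1, c1) + as_multiplier(op2, c2)) & MASK64
    return (x * m) & MASK64


def fold_sub(x, op1, c1, op2, c2):
    """(X op1 C1) - (X op2 C2)  ->  X * (M1 - M2), modulo 2**64."""
    m = (as_multiplier(op1, c1) - as_multiplier(op2, c2)) & MASK64
    return (x * m) & MASK64


y = 0x123456789ABCDEF0
# (shl Y, 2) + (shl Y, 2) folds to Y * 8, i.e. shl Y, 3.
assert fold_add(y, "shl", 2, "shl", 2) == (y << 3) & MASK64
# Y+Y+Y+Y+Y+Y+Y+Y is the same value.
assert sum([y] * 8) & MASK64 == (y << 3) & MASK64
# (X*7) - (X*3) folds to X*4.
assert fold_sub(y, "mul", 7, "mul", 3) == (y * 7 - y * 3) & MASK64
```

Treating a shift as a multiply by a power of two keeps the mul and shl cases in a single code path, which is the reading of "where * can be mul or shl" taken here.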
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17704 91177308-0d34-0410-b5e6-96231b3b80d8
loops. This optimization is not turned on by default yet, but may be run
with the opt tool's -loop-reduce flag. There are many FIXMEs listed in the
code that will make it far more applicable to a wide range of code, but you
have to start somewhere :)
This limited version currently triggers on the following tests in the
MultiSource directory:
pcompress2: 7 times
cfrac: 5 times
anagram: 2 times
ks: 6 times
yacr2: 2 times
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17134 91177308-0d34-0410-b5e6-96231b3b80d8
change hacks off 10K of bytecode from perlbmk (.5%) even though the front-end
is not generating them yet and we are not optimizing the resultant code.
This isn't too bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17111 91177308-0d34-0410-b5e6-96231b3b80d8
exercise that I'm not interested in tackling right now. Just punt and treat them
like unwinds.
This 'fixes' test/Regression/Transforms/ADCE/unreachable-function.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17106 91177308-0d34-0410-b5e6-96231b3b80d8
pointer recurrences into expressions from this:
%P_addr.0.i.0 = phi sbyte* [ getelementptr ([8 x sbyte]* %.str_1, int 0, int 0), %entry ], [ %inc.0.i, %no_exit.i ]
%inc.0.i = getelementptr sbyte* %P_addr.0.i.0, int 1 ; <sbyte*> [#uses=2]
into this:
%inc.0.i = getelementptr sbyte* getelementptr ([8 x sbyte]* %.str_1, int 0, int 0), int %inc.0.i.rec
Actually create something nice, like this:
%inc.0.i = getelementptr [8 x sbyte]* %.str_1, int 0, int %inc.0.i.rec
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16924 91177308-0d34-0410-b5e6-96231b3b80d8
an instruction if it can be hoisted to a common dominator of the block.
This implements: test/Regression/Transforms/TailDup/MergeTest.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16758 91177308-0d34-0410-b5e6-96231b3b80d8
* SubOne/AddOne functions always return ConstantInt, declare them as such
* Move the code that handles setcc X, cst (where cst is at the end of the
range, or cc is LE or GE) earlier in visitSetCondInst. This reduces the
number of iterations in some cases.
* Fold: (div X, C1) op C2 -> range check, implementing div.ll:test6 - test9.
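As a rough illustration of the div fold, the sketch below covers only unsigned division by a positive constant. The function names are hypothetical, and the signed and overflow cases a real pass has to handle are left out.

```python
def div_eq_as_range(c1, c2):
    """Half-open range [lo, hi) of unsigned X for which X / c1 == c2 (c1 > 0)."""
    lo = c1 * c2
    return lo, lo + c1


def div_lt_as_bound(c1, c2):
    """Bound B such that X / c1 < c2 is the same as X < B (c1 > 0)."""
    return c1 * c2


# Exhaustive check over 8-bit unsigned X for a few small constants.
for c1 in range(1, 21):
    for c2 in range(0, 21):
        lo, hi = div_eq_as_range(c1, c2)
        bound = div_lt_as_bound(c1, c2)
        for x in range(256):
            assert (x // c1 == c2) == (lo <= x < hi)
            assert (x // c1 < c2) == (x < bound)
```

For example, X / 10 == 3 holds exactly when 30 <= X < 40, so the division disappears and only a range check remains.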
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16588 91177308-0d34-0410-b5e6-96231b3b80d8
This takes something like this:
%A = phi int [ 3, %cond_false.0 ], [ 2, %endif.0.i ], [ 2, %endif.1.i ]
%B = div int %A, 4
and turns it into:
%A = phi int [ 3/4, %cond_false.0 ], [ 2/4, %endif.0.i ], [ 2/4, %endif.1.i ]
which is later simplified (in this case) into %A = 0.
This triggers thousands of times in spec, for example, 269 times in 176.gcc.
This is tested by InstCombine/add.ll:test23 and set.ll:test18.
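A toy model of this fold, assuming a phi is just a list of (constant, predecessor) pairs. The function name is made up, and Python's floor division matches LLVM's div only for the non-negative values used here.

```python
import operator


def fold_binop_into_phi(incoming, op, const):
    """Apply `op(value, const)` to every constant incoming value of a phi.

    incoming: list of (constant, predecessor-label) pairs.
    Returns a single constant if all folded values agree, else the new list.
    """
    folded = [(op(value, const), pred) for value, pred in incoming]
    distinct = {value for value, _ in folded}
    # A phi whose incoming values are all the same constant is that constant.
    return distinct.pop() if len(distinct) == 1 else folded


phi = [(3, "cond_false.0"), (2, "endif.0.i"), (2, "endif.1.i")]
# 3/4, 2/4 and 2/4 all evaluate to 0, so the whole phi collapses to 0.
assert fold_binop_into_phi(phi, operator.floordiv, 4) == 0
```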
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16582 91177308-0d34-0410-b5e6-96231b3b80d8
Instcombine (setcc (truncate X), C1).
This occurs THOUSANDS of times in many benchmarks. Particularly common
seem to be things like (seteq (cast bool X to int), int 0)
This turns it into (seteq bool %X, false), which then becomes (not %X).
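The chain of rewrites can be sanity-checked over both bool values. This is only an illustration with hypothetical function names, not the pass itself.

```python
def original(x: bool) -> bool:
    # (seteq (cast bool X to int), int 0)
    return int(x) == 0


def after_cast_fold(x: bool) -> bool:
    # (seteq bool %X, false)
    return x == False  # noqa: E712, spelled out to mirror the IR


def final_form(x: bool) -> bool:
    # (not %X)
    return not x


for x in (False, True):
    assert original(x) == after_cast_fold(x) == final_form(x)
```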
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16567 91177308-0d34-0410-b5e6-96231b3b80d8
This is important for several reasons:
1. Benchmarks have lots of code that looks like this (perlbmk in particular):
%tmp.2.i = setne int %tmp.0.i, 128 ; <bool> [#uses=1]
%tmp.6343 = seteq int %tmp.0.i, 1 ; <bool> [#uses=1]
%tmp.63 = and bool %tmp.2.i, %tmp.6343 ; <bool> [#uses=1]
we now fold away the setne, a clear improvement.
2. In the more important cases, such as (X >= 10) & (X < 20), we now produce
smaller code: (X-10) < 10, where the new comparison is unsigned.
3. Perhaps the nicest effect of this patch is that it really helps out the
code generators. In particular, for a 'range test' like the above,
instead of generating this on X86 (the difference on PPC is even more
pronounced):
cmp %EAX, 50
setge %CL
cmp %EAX, 100
setl %AL
and %CL, %AL
cmp %CL, 0
we now generate this:
add %EAX, -50
cmp %EAX, 50
Furthermore, this causes setcc's to be folded into branches more often.
These combinations trigger dozens of times in the spec benchmarks, particularly
in 176.gcc, 186.crafty, 253.perlbmk, 254.gap, & 099.go.
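The range-test fold from point 2 relies on unsigned wraparound: after subtracting the lower bound, every X below it wraps to a large unsigned value and fails the single compare. A minimal sketch, assuming a 32-bit signed X and constant bounds with lo <= hi (function names are made up):

```python
RANGE_BITS = 32
RANGE_MASK = (1 << RANGE_BITS) - 1


def range_test_two_compares(x, lo, hi):
    """(X >= lo) & (X < hi), with X a signed 32-bit value."""
    return (x >= lo) and (x < hi)


def range_test_folded(x, lo, hi):
    """(X - lo) <u (hi - lo): one subtract and one unsigned compare."""
    return ((x - lo) & RANGE_MASK) < (hi - lo)


samples = [-2**31, -1, 0, 9, 10, 19, 20, 49, 50, 99, 100, 2**31 - 1]
for lo, hi in ((10, 20), (50, 100)):
    for x in samples:
        assert range_test_two_compares(x, lo, hi) == range_test_folded(x, lo, hi)
```

The (50, 100) case corresponds to the add/cmp pair in the X86 output above.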
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16559 91177308-0d34-0410-b5e6-96231b3b80d8
Implement (setcc (shl X, C1), C2) folding.
The second one occurs several dozen times in spec. The first was added
just in case. :)
These are tested by shift.ll:test2[12], and div.ll:test5
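One standard way to fold an equality compare against a left shift is sketched below. It is an assumption that the patch uses this exact form; the 8-bit width and the function name are chosen only so the check can be exhaustive.

```python
SHL_BITS = 8
SHL_MASK = (1 << SHL_BITS) - 1


def shl_eq_folded(x, c1, c2):
    """Decide (seteq (shl X, C1), C2) without performing the shift on X."""
    # The low C1 bits of (X << C1) are always zero, so if C2 has any of
    # them set the compare can never be true.
    if c2 & ((1 << c1) - 1):
        return False
    # Otherwise compare the bits of X that survive the shift against C2 >> C1.
    return (x & (SHL_MASK >> c1)) == (c2 >> c1)


# Exhaustive check against the unfolded form for every 8-bit input.
for c1 in range(SHL_BITS):
    for c2 in range(256):
        for x in range(256):
            assert (((x << c1) & SHL_MASK) == c2) == shl_eq_folded(x, c1, c2)
```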
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16549 91177308-0d34-0410-b5e6-96231b3b80d8