Do not claim to preserve the CFG. We do change the CFG by splitting critical
edges. This isn't causing us a problem now, but it likely could in the
future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22824 91177308-0d34-0410-b5e6-96231b3b80d8
loop, because an IV-dependent value was used outside of the loop and didn't
have immediate-folding capability.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22798 91177308-0d34-0410-b5e6-96231b3b80d8
a problem in LoopStrengthReduction, where it would split critical edges
and then confuse itself with outdated loop information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22776 91177308-0d34-0410-b5e6-96231b3b80d8
edge so that the code is not always executed for both operands. This
prevents LSR from inserting code into loops whose exit blocks contain
PHI uses of IV expressions (which are outside of loops). On gzip, for
example, we turn this ugly code:
.LBB_test_1: ; loopentry
add r27, r3, r28
lhz r27, 3(r27)
add r26, r4, r28
lhz r26, 3(r26)
add r25, r30, r28 ;; Only live if exiting the loop
add r24, r29, r28 ;; Only live if exiting the loop
cmpw cr0, r27, r26
bne .LBB_test_5 ; loopexit
into this:
.LBB_test_1: ; loopentry
or r27, r28, r28
add r28, r3, r27
lhz r28, 3(r28)
add r26, r4, r27
lhz r26, 3(r26)
cmpw cr0, r28, r26
beq .LBB_test_3 ; shortcirc_next.0
.LBB_test_2: ; loopentry.loopexit_crit_edge
add r2, r30, r27
add r8, r29, r27
b .LBB_test_9 ; loopexit
.LBB_test_3: ; shortcirc_next.0
...
blt .LBB_test_1
Next step: get the block out of the loop so that the loop is all
fall-throughs again.
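As a rough illustration of what splitting a critical edge means, the sketch below models a CFG with a hypothetical `Block` struct and `Cfg` container (these are illustrative stand-ins, not LLVM's `BasicBlock` API). An edge is critical when its source has more than one successor and its destination has more than one predecessor. Splitting it adds a block that runs only when that edge is taken, which is where the exit-only adds end up in the listing above.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical toy CFG node; LLVM's BasicBlock is far richer than this.
struct Block {
    std::string name;
    std::vector<Block*> succs;
    std::vector<Block*> preds;
};

// Hypothetical owner of all blocks in one function.
struct Cfg {
    std::vector<std::unique_ptr<Block>> blocks;

    Block* add(const std::string& name) {
        blocks.push_back(std::make_unique<Block>());
        blocks.back()->name = name;
        return blocks.back().get();
    }

    void link(Block* from, Block* to) {
        from->succs.push_back(to);
        to->preds.push_back(from);
    }
};

// An edge is critical if its source has several successors and its
// destination has several predecessors.
bool isCriticalEdge(const Block* from, const Block* to) {
    return from->succs.size() > 1 && to->preds.size() > 1;
}

// Insert a new block on the edge from -> to and return it. Code placed in
// the new block executes only when control actually takes that edge.
Block* splitEdge(Cfg& cfg, Block* from, Block* to) {
    assert(std::find(from->succs.begin(), from->succs.end(), to) !=
           from->succs.end());
    Block* mid = cfg.add(from->name + "." + to->name + "_crit_edge");
    std::replace(from->succs.begin(), from->succs.end(), to, mid);
    std::replace(to->preds.begin(), to->preds.end(), from, mid);
    mid->preds.push_back(from);
    mid->succs.push_back(to);
    return mid;
}
```

The sketch does not update any loop information, which is the kind of bookkeeping a real pass has to redo after changing the CFG.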
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22766 91177308-0d34-0410-b5e6-96231b3b80d8
Instead, just update the BB in-place. This is both faster, and it prevents
split-critical-edges from shuffling the PHI argument list unnecessarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22765 91177308-0d34-0410-b5e6-96231b3b80d8
into just Y. This often occurs when it separates loops that have collapsed loop
headers. This implements LoopSimplify/phi-node-simplify.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22746 91177308-0d34-0410-b5e6-96231b3b80d8
For code like this:
void foo(float *a, float *b, int n, int stride_a, int stride_b) {
int i;
for (i=0; i<n; i++)
a[i*stride_a] = b[i*stride_b];
}
we now emit:
.LBB_foo2_2: ; no_exit
lfs f0, 0(r4)
stfs f0, 0(r3)
addi r7, r7, 1
add r4, r2, r4
add r3, r6, r3
cmpw cr0, r7, r5
blt .LBB_foo2_2 ; no_exit
instead of:
.LBB_foo_2: ; no_exit
mullw r8, r2, r7 ;; multiply!
slwi r8, r8, 2
lfsx f0, r4, r8
mullw r8, r2, r6 ;; multiply!
slwi r8, r8, 2
stfsx f0, r3, r8
addi r2, r2, 1
cmpw cr0, r2, r5
blt .LBB_foo_2 ; no_exit
Loops with variable strides occur pretty often. For example, in SPECFP2K
there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp,
56 in 168.wupwise, 36 in 172.mgrid.
Now we can allow indvars to turn functions written like this:
void foo2(float *a, float *b, int n, int stride_a, int stride_b) {
int i, ai = 0, bi = 0;
for (i=0; i<n; i++)
{
a[ai] = b[bi];
ai += stride_a;
bi += stride_b;
}
}
into code like the above for better analysis. With this patch, they generate
identical code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22740 91177308-0d34-0410-b5e6-96231b3b80d8
first is a correctness issue, and the latter is an optimization issue. This
is also needed to support a future change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22720 91177308-0d34-0410-b5e6-96231b3b80d8
The termination condition actually wants to use the post-incremented value
of the loop, not a new indvar with an unusual base.
On PPC, for example, this allows us to compile
LoopStrengthReduce/exit_compare_live_range.ll to:
_foo:
li r2, 0
.LBB_foo_1: ; no_exit
li r5, 0
stw r5, 0(r3)
addi r2, r2, 1
cmpw cr0, r2, r4
bne .LBB_foo_1 ; no_exit
blr
instead of:
_foo:
li r2, 1 ;; IV starts at 1, not 0
.LBB_foo_1: ; no_exit
li r5, 0
stw r5, 0(r3)
addi r5, r2, 1
cmpw cr0, r2, r4
or r2, r5, r5 ;; Reg-reg copy, extra live range
bne .LBB_foo_1 ; no_exit
blr
This implements LoopStrengthReduce/exit_compare_live_range.ll
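A C++ rendering of the two assembly listings above may make the difference easier to see. This is only a sketch: the function names and the `trips` counter are hypothetical additions for checking, and the loop body is reduced to the single store seen in the assembly.

```cpp
#include <cassert>

// Exit test on the post-incremented counter: only `i` stays live across
// the back edge, and it starts at 0.
int storeLoopPostInc(int* p, int n) {
    assert(n > 0);
    int trips = 0;
    int i = 0;
    do {
        *p = 0;
        ++trips;
        ++i;              // increment first ...
    } while (i != n);     // ... then compare the incremented value
    return trips;
}

// Exit test on a separate IV biased by one: the old and the new value are
// both live at the compare, which costs a copy and an extra live range.
int storeLoopBiasedIV(int* p, int n) {
    assert(n > 0);
    int trips = 0;
    int iv = 1;                 // starts at 1, not 0
    bool more;
    do {
        *p = 0;
        ++trips;
        int next = iv + 1;      // new value computed into a second variable
        more = (iv != n);       // compare uses the old value
        iv = next;              // copy back
    } while (more);
    return trips;
}
```

Both functions run the body the same number of times; only the shape of the exit test differs.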
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22699 91177308-0d34-0410-b5e6-96231b3b80d8
* Teach this code to move allocas out of the loop when tail call eliminating
a call marked 'tail'. This implements TailCallElim/move_alloca_for_tail_call.ll
* Do not perform this transformation if a call is marked 'tail' and if there
are allocas that we cannot move out of the loop in #2. Doing so would increase
the stack usage of the function. This fixes PR615 and implements
TailCallElim/dont-tce-tail-marked-call.ll.
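For readers unfamiliar with the transformation, here is a minimal source-level sketch of tail call elimination. The function names are hypothetical and this is not the pass's implementation: the recursive call in tail position becomes a jump back to the top of the function, with the arguments turned into loop-carried variables. A fixed-size local would then need to sit outside that loop so it is allocated once, which is the alloca concern described above.

```cpp
#include <cassert>

// Tail-recursive form: the recursive call is the last thing the function
// does, so its stack frame is dead by the time the call happens.
unsigned sumDigitsRec(unsigned n, unsigned acc) {
    if (n == 0)
        return acc;
    return sumDigitsRec(n / 10, acc + n % 10);
}

// Conceptual result of eliminating the tail call: the arguments become
// loop-carried variables and the call becomes a branch to the loop header.
unsigned sumDigitsLoop(unsigned n, unsigned acc) {
    for (;;) {
        if (n == 0)
            return acc;
        unsigned nextN = n / 10;          // evaluate all new arguments ...
        unsigned nextAcc = acc + n % 10;
        n = nextN;                        // ... before overwriting the old ones
        acc = nextAcc;
    }
}

// Small self-check: both forms must agree on the same input.
void checkEquivalent(unsigned n) {
    assert(sumDigitsRec(n, 0) == sumDigitsLoop(n, 0));
}
```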
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22690 91177308-0d34-0410-b5e6-96231b3b80d8
BasicBlock's removePredecessor routine. This requires shuffling around
the definition and implementation of hasConstantValue from Utils.h,cpp into
Instructions.h,cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22664 91177308-0d34-0410-b5e6-96231b3b80d8
that the symbolic evaluator is not always able to use subtraction to remove
expressions. This makes the code faster, and fixes the last crash on 178.galgel.
Finally, add a statistic to see how many phi nodes are inserted.
On 178.galgel, we get the following stats:
2562 loop-reduce - Number of PHIs inserted
3927 loop-reduce - Number of GEPs strength reduced
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22662 91177308-0d34-0410-b5e6-96231b3b80d8
method.
* Fix a crash on 178.galgel, where we would insert expressions before PHI
nodes instead of into the PHI node predecessor blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22657 91177308-0d34-0410-b5e6-96231b3b80d8
for (i = 0; i < N; ++i)
A[i][foo()] = 0;
Here we still want to strength reduce the A[i] part, even though foo() is
loop-variant.
This also simplifies some of the 'CanReduce' logic.
This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll
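A hand-written sketch of the intended result, assuming a row-major matrix stored in a flat vector. The function names and the `foo` callback parameter are hypothetical stand-ins for the loop above: the `A[i]` part turns into a row pointer bumped once per iteration, while the loop-variant column index is still applied inside the body.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Indexed form: the row offset i * m is recomputed on every iteration.
void clearIndexed(std::vector<int>& A, int n, int m,
                  const std::function<int()>& foo) {
    assert(A.size() == static_cast<std::size_t>(n) * m);
    for (int i = 0; i < n; ++i)
        A[i * m + foo()] = 0;
}

// Reduced form: the A[i] part is a pointer stepped by one row per
// iteration; only the loop-variant column from foo() is added each time.
void clearReduced(std::vector<int>& A, int n, int m,
                  const std::function<int()>& foo) {
    assert(A.size() == static_cast<std::size_t>(n) * m);
    int* row = A.data();
    for (int i = 0; i < n; ++i) {
        row[foo()] = 0;
        row += m;   // ends at one past the last element, never dereferenced
    }
}
```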
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22652 91177308-0d34-0410-b5e6-96231b3b80d8
1. We only analyze instructions once, guaranteed
2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with
something much simpler.
The next step is to handle expressions that are not all indvar+loop-invariant
values (e.g. handling indvar+loopvariant).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22649 91177308-0d34-0410-b5e6-96231b3b80d8
Only emit one PHI node for IV uses with identical bases and strides (after
moving foldable immediates to the load/store instruction).
This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing
us to generate this PPC code for test1:
or r30, r3, r3
.LBB_test1_1: ; Loop
li r2, 0
stw r2, 0(r30)
stw r2, 4(r30)
bl L_pred$stub
addi r30, r30, 8
cmplwi cr0, r3, 0
bne .LBB_test1_1 ; Loop
instead of this code:
or r30, r3, r3
or r29, r3, r3
.LBB_test1_1: ; Loop
li r2, 0
stw r2, 0(r29)
stw r2, 4(r30)
bl L_pred$stub
addi r30, r30, 8 ;; Two iv's with step of 8
addi r29, r29, 8
cmplwi cr0, r3, 0
bne .LBB_test1_1 ; Loop
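The same idea at source level, as a sketch with hypothetical function names and assuming 4-byte `int` so that a step of two elements matches the 8-byte step in the listings: once the constant offsets are folded into the stores, both accesses share one base and one stride, so a single pointer is enough. The loop bound is a plain count here instead of the call seen in the assembly.

```cpp
#include <cassert>

// Before: two induction variables with the same base and the same step.
void zeroPairsTwoIVs(int* base, int pairs) {
    assert(pairs >= 0);
    int* p = base;
    int* q = base;
    for (int i = 0; i < pairs; ++i) {
        p[0] = 0;
        q[1] = 0;
        p += 2;   // two IVs, both stepping by two elements
        q += 2;
    }
}

// After: one induction variable; the differing offsets live in the stores.
void zeroPairsOneIV(int* base, int pairs) {
    assert(pairs >= 0);
    int* p = base;
    for (int i = 0; i < pairs; ++i) {
        p[0] = 0;
        p[1] = 0;
        p += 2;
    }
}
```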
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22635 91177308-0d34-0410-b5e6-96231b3b80d8
unify some parallel vectors and get field names more descriptive than
"first" and "second". This isn't Lisp after all :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22633 91177308-0d34-0410-b5e6-96231b3b80d8
map from Instruction* to SCEVHandles. When we delete instructions, we have
to tell it about them. We would run into nasty cases where new instructions
were allocated at old instruction addresses and got the old map values.
Bad bad bad :(
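The hazard is general to any cache keyed by object address. The sketch below uses hypothetical `Instr` and `ExprCache` types (not the real analysis classes) to show why an entry has to be dropped when its key object goes away: a later object at the same address would otherwise inherit the old value.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an instruction.
struct Instr {
    std::string text;
};

// Hypothetical analysis cache keyed by instruction address.
class ExprCache {
    std::unordered_map<const Instr*, std::string> cache_;

public:
    // Return the cached expression for I, computing it on first use.
    const std::string& get(const Instr* I) {
        assert(I != nullptr);
        auto it = cache_.find(I);
        if (it == cache_.end())
            it = cache_.emplace(I, "scev(" + I->text + ")").first;
        return it->second;
    }

    // Must be called before I is deleted, or a new object that lands at
    // the same address will see the stale entry.
    void forget(const Instr* I) { cache_.erase(I); }

    bool contains(const Instr* I) const { return cache_.count(I) != 0; }
};
```

In a usage like `cache.forget(I); delete I;`, the order matters only in that the cache must be told before the address can be reused.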
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22632 91177308-0d34-0410-b5e6-96231b3b80d8