mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-11-22 09:22:37 +00:00
Use a loop to simplify the runtime unrolling prologue.
Runtime unrolling creates a prologue to execute the extra
iterations that remain when the trip count is not divisible by
the unroll factor. It generates an if-then-else sequence that
jumps into a loop body unrolled (factor - 1) times, like
extraiters = tripcount % loopfactor
if (extraiters == 0) jump Loop
if (extraiters == loopfactor-1) jump L1
if (extraiters == loopfactor-2) jump L2
...
L1: LoopBody;
L2: LoopBody;
...
if tripcount < loopfactor jump End
Loop:
...
End:
This means that if the unroll factor is 4, the loop body is copied
7 times: 3 copies in the loop prologue and 4 in the unrolled loop.
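For illustration only, the if-then-else prologue can be modelled in C for an unroll factor of 4. This is a hand-written sketch, not output of the pass: the function name sum_old_prologue4 and the summing loop body are assumptions chosen for the example, and the switch fall-through stands in for the chain of conditional jumps into L1, L2, ...

```c
#include <stddef.h>

/* Hypothetical model of the old scheme: three prologue copies of the
 * loop body plus four copies in the unrolled loop, seven in total. */
int sum_old_prologue4(const int *a, size_t n) {
    size_t i = 0;
    int s = 0;

    /* extraiters = tripcount % loopfactor; jump into the prologue chain. */
    switch (n % 4) {
    case 3: s += a[i++]; /* L1: first prologue copy, falls through */
    case 2: s += a[i++]; /* L2: second prologue copy, falls through */
    case 1: s += a[i++]; /* L3: third prologue copy */
    default: break;      /* extraiters == 0: go straight to the loop */
    }

    /* Loop: skipped entirely when tripcount < loopfactor. */
    for (; i < n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    return s;
}
```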
This commit uses a loop to execute the extra iterations
in the prologue, like
extraiters = tripcount % loopfactor
if (extraiters == 0) jump Loop
else jump Prol
Prol: LoopBody;
extraiters -= 1 // Omitted if unroll factor is 2.
if (extraiters != 0) jump Prol // Omitted if unroll factor is 2.
if (tripcount < loopfactor) jump End
Loop:
...
End:
Then, when the unroll factor is 4, the loop body is copied only
5 times: 1 copy in the prologue loop and 4 in the original loop.
If the unroll factor is 2, no new loop is created, just as in
the original solution.
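As a rough illustration of the new shape, the prologue loop can be modelled in C for an unroll factor of 4. This is a hand-written sketch under assumptions, not output of the pass: the names sum_plain and sum_unrolled4 and the summing loop body are made up for the example. It contains five copies of the body, one in the prologue loop and four in the unrolled loop.

```c
#include <stddef.h>

/* Reference: the original loop before unrolling. */
int sum_plain(const int *a, size_t n) {
    int s = 0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Hypothetical model of runtime unrolling by 4 with a prologue loop. */
int sum_unrolled4(const int *a, size_t n) {
    const size_t factor = 4;        /* loopfactor */
    size_t extraiters = n % factor; /* tripcount % loopfactor */
    size_t i = 0;
    int s = 0;

    /* Prol: a single copy of the body, run extraiters times. */
    while (extraiters != 0) {
        s += a[i++];
        extraiters -= 1;
    }

    /* Loop: four copies of the body per iteration; the remaining
     * count is a multiple of 4, and the loop is skipped when
     * tripcount < loopfactor. */
    for (; i < n; i += factor) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    return s;
}
```

Both functions should return the same value for any trip count, which is the property the prologue has to preserve.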
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218604 91177308-0d34-0410-b5e6-96231b3b80d8
@@ -1,11 +1,11 @@
-; RUN: opt < %s -S -loop-unroll -unroll-runtime -unroll-count=4 | FileCheck %s
+; RUN: opt < %s -S -loop-unroll -unroll-runtime -unroll-count=2 | FileCheck %s
 
 ; This tests that setting the unroll count works
 
-; CHECK: unr.cmp:
-; CHECK: for.body.unr:
+; CHECK: for.body.prol:
+; CHECK: br label %for.body.preheader.split
 ; CHECK: for.body:
-; CHECK: br i1 %exitcond.3, label %for.end.loopexit{{.*}}, label %for.body
+; CHECK: br i1 %exitcond.1, label %for.end.loopexit.unr-lcssa, label %for.body
 ; CHECK-NOT: br i1 %exitcond.4, label %for.end.loopexit{{.*}}, label %for.body
 
 define i32 @test(i32* nocapture %a, i32 %n) nounwind uwtable readonly {