mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2026-01-05 23:23:53 +00:00
LoopVectorize: Use the dependence test utility class
We no longer need alias analysis: the cases that alias analysis would have
handled are now treated as accesses with a large dependence distance.

We can now vectorize loops with simple constant dependence distances.
  for (i = 8; i < 256; ++i) {
    a[i] = a[i+4] * a[i+8];
  }

  for (i = 8; i < 256; ++i) {
    a[i] = a[i-4] * a[i-8];
  }
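
As a rough illustration of why a constant distance matters (this is not code
from the patch), the sketch below emulates a vectorized loop by doing all loads
of a block of `vf` iterations before any of its stores, then compares the
result with the plain scalar loop. All names (`run_scalar`, `run_blocked`,
`blocked_matches_scalar`) and the use of `unsigned` elements are assumptions
made for the example. Under these assumptions the backward case with distances
4 and 8 agrees with the scalar loop for a block width of up to 4 and disagrees
at 8, while the forward case agrees at either width.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sizes: loop bounds from the examples, plus room for a[i+8]. */
enum { LEN = 264, LO = 8, HI = 256 };

/* Odd start values keep the wrapping unsigned products non-zero. */
void fill(unsigned *a) {
  for (int i = 0; i < LEN; ++i)
    a[i] = 2u * (unsigned)i + 3u;
}

/* Reference: a[i] = a[i+d1] * a[i+d2], executed in the original order. */
void run_scalar(unsigned *a, int d1, int d2) {
  for (int i = LO; i < HI; ++i)
    a[i] = a[i + d1] * a[i + d2];
}

/* Emulated vector loop: per block of vf iterations, load everything first,
 * then store everything. This is an illustration, not the vectorizer. */
void run_blocked(unsigned *a, int d1, int d2, int vf) {
  unsigned x[8], y[8];
  assert(vf >= 1 && vf <= 8 && (HI - LO) % vf == 0);
  for (int i = LO; i < HI; i += vf) {
    for (int l = 0; l < vf; ++l) {
      x[l] = a[i + l + d1];
      y[l] = a[i + l + d2];
    }
    for (int l = 0; l < vf; ++l)
      a[i + l] = x[l] * y[l];
  }
}

/* Returns 1 if the blocked execution produces the same array as the
 * scalar loop for the given offsets and block width, 0 otherwise. */
int blocked_matches_scalar(int d1, int d2, int vf) {
  unsigned s[LEN], v[LEN];
  fill(s);
  fill(v);
  run_scalar(s, d1, d2);
  run_blocked(v, d1, d2, vf);
  return memcmp(s, v, sizeof s) == 0;
}
```

Calling `blocked_matches_scalar(-4, -8, 4)` corresponds to the second loop
above at a block width of 4, and `blocked_matches_scalar(4, 8, 8)` to the first
loop at a width of 8.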

We would now be able to vectorize about 200 more loops in the test suite,
although in many cases the cost model instructs us not to. Results on x86-64
are a wash.

I have seen one degradation, in ammp. Interestingly, the function in which we
now vectorize a loop is never executed, so we are probably seeing instruction
cache effects. There is a 2% improvement in h264ref, and a few TSVC loop
kernels speed up.

radar://13681598
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184685 91177308-0d34-0410-b5e6-96231b3b80d8
@@ -12,7 +12,7 @@ target triple = "x86_64-apple-macosx10.9.0"
 ;CHECK: for.body.preheader:
 ;CHECK: br i1 %cmp.zero, label %middle.block, label %vector.memcheck
 ;CHECK: vector.memcheck:
-;CHECK: br i1 %found.conflict, label %middle.block, label %vector.ph
+;CHECK: br i1 %memcheck.conflict, label %middle.block, label %vector.ph
 ;CHECK: load <4 x float>
 define i32 @foo(float* nocapture %a, float* nocapture %b, i32 %n) nounwind uwtable ssp {
 entry: