mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-10-16 11:19:51 +00:00
update and consolidate the load pre notes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90050 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
@@ -1116,6 +1116,8 @@ later.
|
|||||||
|
|
||||||
//===---------------------------------------------------------------------===//
|
//===---------------------------------------------------------------------===//
|
||||||
|
|
||||||
|
[STORE SINKING]
|
||||||
|
|
||||||
Store sinking: This code:
|
Store sinking: This code:
|
||||||
|
|
||||||
void f (int n, int *cond, int *res) {
|
void f (int n, int *cond, int *res) {
|
||||||
@@ -1171,6 +1173,8 @@ This is GCC PR38204.
|
|||||||
|
|
||||||
//===---------------------------------------------------------------------===//
|
//===---------------------------------------------------------------------===//
|
||||||
|
|
||||||
|
[STORE SINKING]
|
||||||
|
|
||||||
GCC PR37810 is an interesting case where we should sink load/store reload
|
GCC PR37810 is an interesting case where we should sink load/store reload
|
||||||
into the if block and outside the loop, so we don't reload/store it on the
|
into the if block and outside the loop, so we don't reload/store it on the
|
||||||
non-call path.
|
non-call path.
|
||||||
@@ -1198,7 +1202,7 @@ we don't sink the store. We need partially dead store sinking.
|
|||||||
|
|
||||||
//===---------------------------------------------------------------------===//
|
//===---------------------------------------------------------------------===//
|
||||||
|
|
||||||
[LOAD PRE with NON-AVAILABLE ADDRESS]
|
[LOAD PRE CRIT EDGE SPLITTING]
|
||||||
|
|
||||||
GCC PR37166: Sinking of loads prevents SROA'ing the "g" struct on the stack
|
GCC PR37166: Sinking of loads prevents SROA'ing the "g" struct on the stack
|
||||||
leading to excess stack traffic. This could be handled by GVN with some crazy
|
leading to excess stack traffic. This could be handled by GVN with some crazy
|
||||||
@@ -1217,63 +1221,58 @@ bb3: ; preds = %bb1, %bb2, %bb
|
|||||||
|
|
||||||
%11 is partially redundant, an in BB2 it should have the value %8.
|
%11 is partially redundant, an in BB2 it should have the value %8.
|
||||||
|
|
||||||
GCC PR33344 is a similar case.
|
GCC PR33344 and PR35287 are similar cases.
|
||||||
|
|
||||||
|
|
||||||
//===---------------------------------------------------------------------===//
|
//===---------------------------------------------------------------------===//
|
||||||
|
|
||||||
|
[LOAD PRE]
|
||||||
|
|
||||||
There are many load PRE testcases in testsuite/gcc.dg/tree-ssa/loadpre* in the
|
There are many load PRE testcases in testsuite/gcc.dg/tree-ssa/loadpre* in the
|
||||||
GCC testsuite. There are many pre testcases as ssa-pre-*.c
|
GCC testsuite, ones we don't get yet are (checked through loadpre25):
|
||||||
|
|
||||||
|
[CRIT EDGE BREAKING]
|
||||||
|
loadpre3.c predcom-4.c
|
||||||
|
|
||||||
|
[PRE OF READONLY CALL]
|
||||||
|
loadpre5.c
|
||||||
|
|
||||||
|
[TURN SELECT INTO BRANCH]
|
||||||
|
loadpre14.c loadpre15.c
|
||||||
|
|
||||||
|
actually a conditional increment: loadpre18.c loadpre19.c
|
||||||
|
|
||||||
|
|
||||||
|
//===---------------------------------------------------------------------===//
|
||||||
|
|
||||||
|
[SCALAR PRE]
|
||||||
|
There are many PRE testcases in testsuite/gcc.dg/tree-ssa/ssa-pre-*.c in the
|
||||||
|
GCC testsuite.
|
||||||
|
|
||||||
//===---------------------------------------------------------------------===//
|
//===---------------------------------------------------------------------===//
|
||||||
|
|
||||||
There are some interesting cases in testsuite/gcc.dg/tree-ssa/pred-comm* in the
|
There are some interesting cases in testsuite/gcc.dg/tree-ssa/pred-comm* in the
|
||||||
GCC testsuite. For example, predcom-1.c is:
|
GCC testsuite. For example, we get the first example in predcom-1.c, but
|
||||||
|
miss the second one:
|
||||||
|
|
||||||
for (i = 2; i < 1000; i++)
|
unsigned fib[1000];
|
||||||
fib[i] = (fib[i-1] + fib[i - 2]) & 0xffff;
|
unsigned avg[1000];
|
||||||
|
|
||||||
which compiles into:
|
__attribute__ ((noinline))
|
||||||
|
void count_averages(int n) {
|
||||||
|
int i;
|
||||||
|
for (i = 1; i < n; i++)
|
||||||
|
avg[i] = (((unsigned long) fib[i - 1] + fib[i] + fib[i + 1]) / 3) & 0xffff;
|
||||||
|
}
|
||||||
|
|
||||||
bb1: ; preds = %bb1, %bb1.thread
|
which compiles into two loads instead of one in the loop.
|
||||||
%indvar = phi i32 [ 0, %bb1.thread ], [ %0, %bb1 ]
|
|
||||||
%i.0.reg2mem.0 = add i32 %indvar, 2
|
|
||||||
%0 = add i32 %indvar, 1 ; <i32> [#uses=3]
|
|
||||||
%1 = getelementptr [1000 x i32]* @fib, i32 0, i32 %0
|
|
||||||
%2 = load i32* %1, align 4 ; <i32> [#uses=1]
|
|
||||||
%3 = getelementptr [1000 x i32]* @fib, i32 0, i32 %indvar
|
|
||||||
%4 = load i32* %3, align 4 ; <i32> [#uses=1]
|
|
||||||
%5 = add i32 %4, %2 ; <i32> [#uses=1]
|
|
||||||
%6 = and i32 %5, 65535 ; <i32> [#uses=1]
|
|
||||||
%7 = getelementptr [1000 x i32]* @fib, i32 0, i32 %i.0.reg2mem.0
|
|
||||||
store i32 %6, i32* %7, align 4
|
|
||||||
%exitcond = icmp eq i32 %0, 998 ; <i1> [#uses=1]
|
|
||||||
br i1 %exitcond, label %return, label %bb1
|
|
||||||
|
|
||||||
This is basically:
|
predcom-2.c is the same as predcom-1.c
|
||||||
LOAD fib[i+1]
|
|
||||||
LOAD fib[i]
|
|
||||||
STORE fib[i+2]
|
|
||||||
|
|
||||||
instead of handling this as a loop or other xform, all we'd need to do is teach
|
|
||||||
load PRE to phi translate the %0 add (i+1) into the predecessor as (i'+1+1) =
|
|
||||||
(i'+2) (where i' is the previous iteration of i). This would find the store
|
|
||||||
which feeds it.
|
|
||||||
|
|
||||||
predcom-2.c is apparently the same as predcom-1.c
|
|
||||||
predcom-3.c is very similar but needs loads feeding each other instead of
|
predcom-3.c is very similar but needs loads feeding each other instead of
|
||||||
store->load.
|
store->load.
|
||||||
predcom-4.c seems the same as the rest.
|
|
||||||
|
|
||||||
|
|
||||||
//===---------------------------------------------------------------------===//
|
|
||||||
|
|
||||||
Other simple load PRE cases:
|
|
||||||
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35287 [LPRE crit edge splitting]
|
|
||||||
|
|
||||||
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34677 (licm does this, LPRE crit edge)
|
|
||||||
llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as | opt -mem2reg -simplifycfg -gvn | llvm-dis
|
|
||||||
|
|
||||||
//===---------------------------------------------------------------------===//
|
//===---------------------------------------------------------------------===//
|
||||||
|
|
||||||
Type based alias analysis:
|
Type based alias analysis:
|
||||||
@@ -1305,7 +1304,7 @@ Interesting missed case because of control flow flattening (should be 2 loads):
|
|||||||
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26629
|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26629
|
||||||
With: llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as |
|
With: llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as |
|
||||||
opt -mem2reg -gvn -instcombine | llvm-dis
|
opt -mem2reg -gvn -instcombine | llvm-dis
|
||||||
we miss it because we need 1) GEP PHI TRAN, 2) CRIT EDGE 3) MULTIPLE DIFFERENT
|
we miss it because we need 1) CRIT EDGE 2) MULTIPLE DIFFERENT
|
||||||
VALS PRODUCED BY ONE BLOCK OVER DIFFERENT PATHS
|
VALS PRODUCED BY ONE BLOCK OVER DIFFERENT PATHS
|
||||||
|
|
||||||
//===---------------------------------------------------------------------===//
|
//===---------------------------------------------------------------------===//
|
||||||
|
Reference in New Issue
Block a user