LLVM backend for 6502
Go to file
Chris Lattner c88d8e944d Fix the #1 code quality problem that I have seen on X86 (and it also affects
PPC and other targets).  In a particular, consider code like this:

struct Vector3 { double x, y, z; };
struct Matrix3 { Vector3 a, b, c; };
double dot(Vector3 &a, Vector3 &b) {
   return a.x * b.x  +  a.y * b.y  +  a.z * b.z;
}
Vector3 mul(Vector3 &a, Matrix3 &b) {
   Vector3 r;
   r.x = dot( a, b.a );
   r.y = dot( a, b.b );
   r.z = dot( a, b.c );
   return r;
}
void transform(Matrix3 &m, Vector3 *x, int n) {
   for (int i = 0; i < n; i++)
      x[i] = mul( x[i], m );
}

we compile transform to a loop with all of the GEP instructions for indexing
into 'm' pulled out of the loop (9 of them).  Because isel occurs a bb at a time
we are unable to fold the constant index into the loads in the loop, leading to
PPC code that looks like this:

LBB3_1: ; no_exit.preheader
        li r2, 0
        addi r6, r3, 64        ;; 9 values live across the loop body!
        addi r7, r3, 56
        addi r8, r3, 48
        addi r9, r3, 40
        addi r10, r3, 32
        addi r11, r3, 24
        addi r12, r3, 16
        addi r30, r3, 8
LBB3_2: ; no_exit
        lfd f0, 0(r30)
        lfd f1, 8(r4)
        fmul f0, f1, f0
        lfd f2, 0(r3)        ;; no constant indices folded into the loads!
        lfd f3, 0(r4)
        lfd f4, 0(r10)
        lfd f5, 0(r6)
        lfd f6, 0(r7)
        lfd f7, 0(r8)
        lfd f8, 0(r9)
        lfd f9, 0(r11)
        lfd f10, 0(r12)
        lfd f11, 16(r4)
        fmadd f0, f3, f2, f0
        fmul f2, f1, f4
        fmadd f0, f11, f10, f0
        fmadd f2, f3, f9, f2
        fmul f1, f1, f6
        stfd f0, 0(r4)
        fmadd f0, f11, f8, f2
        fmadd f1, f3, f7, f1
        stfd f0, 8(r4)
        fmadd f0, f11, f5, f1
        addi r29, r4, 24
        stfd f0, 16(r4)
        addi r2, r2, 1
        cmpw cr0, r2, r5
        or r4, r29, r29
        bne cr0, LBB3_2 ; no_exit

uh, yuck.  With this patch, we now sink the constant offsets into the loop, producing
this code:

LBB3_1: ; no_exit.preheader
        li r2, 0
LBB3_2: ; no_exit
        lfd f0, 8(r3)
        lfd f1, 8(r4)
        fmul f0, f1, f0
        lfd f2, 0(r3)
        lfd f3, 0(r4)
        lfd f4, 32(r3)       ;; much nicer.
        lfd f5, 64(r3)
        lfd f6, 56(r3)
        lfd f7, 48(r3)
        lfd f8, 40(r3)
        lfd f9, 24(r3)
        lfd f10, 16(r3)
        lfd f11, 16(r4)
        fmadd f0, f3, f2, f0
        fmul f2, f1, f4
        fmadd f0, f11, f10, f0
        fmadd f2, f3, f9, f2
        fmul f1, f1, f6
        stfd f0, 0(r4)
        fmadd f0, f11, f8, f2
        fmadd f1, f3, f7, f1
        stfd f0, 8(r4)
        fmadd f0, f11, f5, f1
        addi r6, r4, 24
        stfd f0, 16(r4)
        addi r2, r2, 1
        cmpw cr0, r2, r5
        or r4, r6, r6
        bne cr0, LBB3_2 ; no_exit

This is much nicer as it reduces register pressure in the loop a lot.  On X86,
this takes the function from having 9 spilled registers to 2.  This should help
some spec programs on X86 (gzip?)

This is currently only enabled with -enable-gep-isel-opt to allow perf testing
tonight.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24606 91177308-0d34-0410-b5e6-96231b3b80d8
2005-12-05 07:10:48 +00:00
autoconf add malloc_zone_statistics, remove mstats 2005-11-14 07:24:17 +00:00
docs attribute((used)) is now supported 2005-12-05 05:23:06 +00:00
examples When a function takes a variable number of pointer arguments, with a zero 2005-10-23 04:37:20 +00:00
include/llvm Add a flag to Module::getGlobalVariable to allow it to return vars with 2005-12-05 05:30:21 +00:00
lib Fix the #1 code quality problem that I have seen on X86 (and it also affects 2005-12-05 07:10:48 +00:00
projects unbreak the build again 2005-10-27 16:30:44 +00:00
runtime Add the remove() function from the C library. 2005-11-28 15:49:15 +00:00
test New testcase for PR660 2005-12-05 04:48:12 +00:00
tools Revert my previous patch which broke due to lazy streaming of functions 2005-12-02 19:00:22 +00:00
utils Implement PR673: for explicit register references, use type information 2005-12-05 02:36:37 +00:00
win32 Teach Visual Studio about new files. 2005-11-28 06:46:36 +00:00
Xcode Remove the lowerconstantexprs pass 2005-10-29 05:34:40 +00:00
.cvsignore Ignore the configure.out file generated by "make reconfigure" 2005-06-18 23:01:25 +00:00
configure regenearte 2005-11-14 07:25:50 +00:00
CREDITS.TXT add Evan and Jim. Please edit your entries as desired. 2005-11-29 00:57:06 +00:00
LICENSE.TXT Remove extraneous colons after program names for consistency 2005-05-12 21:39:01 +00:00
llvm.spec Onward to LLVM-1.6 and beyond! 2005-05-18 20:23:20 +00:00
llvm.spec.in Onward to LLVM-1.6 and beyond! 2005-05-18 20:23:20 +00:00
Makefile For PR614: 2005-08-25 04:59:49 +00:00
Makefile.common Update comments to reflect new variable names. Patch contributed by 2005-02-14 16:02:19 +00:00
Makefile.config.in Two changes: 2005-04-22 17:14:14 +00:00
Makefile.rules Move some constant folding code shared by Analysis and Transform passes 2005-10-27 15:54:34 +00:00
README.txt Make the text of this file a little more useful. 2004-09-02 22:49:27 +00:00

Low Level Virtual Machine (LLVM)
================================

This directory and its subdirectories contain source code for the Low Level 
Virtual Machine, a toolkit for the construction of highly optimized compilers,
optimizers, and runtime environments. 

LLVM is open source software. You may freely distribute it under the terms of
the license agreement found in LICENSE.txt.

Please see the HTML documentation provided in docs/index.html for further
assistance with LLVM.