32b: No longer pattern match fneg(fsub(fmul)) as fnmsub
Pattern match fsub a, mul(b, c) as fnmsub
Pattern match fadd a, mul(b, c) as fmadd
Those changes speed up hydro2d by 2.5%, distray by 6%, and scimark by 8%
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21161 91177308-0d34-0410-b5e6-96231b3b80d8
quotient, not the remainder. Also, make sure to remove the old div operand
from the ExprMap and let SelectExpr insert the new one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21111 91177308-0d34-0410-b5e6-96231b3b80d8
Have LegalizeDAG handle SREM and UREM for us
Codegen SDIV and UDIV by constant as a multiply by magic constant instead
of integer divide, which is very slow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21104 91177308-0d34-0410-b5e6-96231b3b80d8
indicate that it is not a boolean function.
Properly emit the pseudo instruction for conditional branch, so that we
can fix up conditional branches whose displacements are too large.
Reserve the right amount of opcode space for said pseudo instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21094 91177308-0d34-0410-b5e6-96231b3b80d8
that regalloc doesn't cleverly reuse early arg regs loading later arg regs.
This fixes almost all outstanding failures in the pattern isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21086 91177308-0d34-0410-b5e6-96231b3b80d8
Teach the SelectionDAG code how to expand and promote it
Have PPC32 LowerCallTo generate ISD::UNDEF for int arg regs used up by fp
arguments, but not shadowing their value. This allows us to do the right
thing with both fixed and vararg floating point arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20988 91177308-0d34-0410-b5e6-96231b3b80d8
18.8 to 14.8 seconds. The Pattern ISel is now often faster than the
Simple ISel, esp. on memory intensive code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20973 91177308-0d34-0410-b5e6-96231b3b80d8
generate compare immediate for integer compare with constant
fold setcc into branch
fold setcc into select
Code generation quality for Shootout is now on par with the Simple ISel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20968 91177308-0d34-0410-b5e6-96231b3b80d8
LowerCallTo.
Handle ISD::ADD in SelectAddr, allowing us to have nonzero immediates for
loads and stores, amazing!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20946 91177308-0d34-0410-b5e6-96231b3b80d8
Tell the SelectionDAG ISel to expand SEXTLOAD of i1 and i8, rather than
complicate the code in ISD::SEXTLOAD to do it by hand
Combine the FP and Int ISD::LOAD codegen
Generate better code for constant pool loads
As a result, all of Shootout, and likely many other programs are now
working.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20945 91177308-0d34-0410-b5e6-96231b3b80d8
don't support things like memcpy directly. This allows a handful of the
Shootout programs to work, yay!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20939 91177308-0d34-0410-b5e6-96231b3b80d8
the correct register class. Also remove the loading of float data into int
regs part of varargs; it will need to be implemented differently later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20857 91177308-0d34-0410-b5e6-96231b3b80d8
going on with copies between floating point and integer register files
being generated. Once that is solved, varargs will be done.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20856 91177308-0d34-0410-b5e6-96231b3b80d8
handled correctly for floating point arguments, or more than 8 arguemnts.
This does however, allow hello world to run.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20832 91177308-0d34-0410-b5e6-96231b3b80d8
1) dynamic stack alloc
2) loads
3) shifts
4) subtract
5) immediate form of add, and, or, xor
6) change flag from -pattern-isel to -enable-ppc-pattern-isel
Remove dead arguments from getGlobalBaseReg in the simple ISel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20810 91177308-0d34-0410-b5e6-96231b3b80d8
be brought up to parity with the current simple ISel in the coming days.
Currently, -pattern-isel is required to trigger it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20805 91177308-0d34-0410-b5e6-96231b3b80d8
using Function::arg_{iterator|begin|end}. Likewise Module::g* -> Module::global_*.
This patch is contributed by Gabor Greif, thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20597 91177308-0d34-0410-b5e6-96231b3b80d8
to save and restore the LR register on entry and exit of a leaf function
that needed to access globals or the constant pool. This should hopefully
fix oscar from sending the PPC tester spinning out of control.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20197 91177308-0d34-0410-b5e6-96231b3b80d8
- unsigned TrueValue = getReg(TrueVal, BB, BB->begin());
+ unsigned TrueValue = getReg(TrueVal);
Fixes the PPC regressions from last night.
The other hunk is just a clarity improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19263 91177308-0d34-0410-b5e6-96231b3b80d8
While we're at it, improve codegen of select instructions. For this
testcase:
int %test(bool %C, int %A, int %B) {
%D = select bool %C, int %A, int %B
ret int %D
}
We used to generate this code:
_test:
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
b .LBB_test_3 ;
.LBB_test_2: ;
or r5, r4, r4
.LBB_test_3: ;
or r3, r5, r5
blr
Now we emit:
_test:
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
or r4, r5, r5
.LBB_test_2: ;
or r3, r4, r4
blr
-Chris
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19214 91177308-0d34-0410-b5e6-96231b3b80d8
addi r3, r3, -1
instead of
addi r3, r3, 1
for 'sub int X, 1'.
Secondarily, this fixes several cases where we could crash given an unsigned
constant. And fixes a couple of minor missed optimization cases, such as
xor X, ~0U -> not X
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18379 91177308-0d34-0410-b5e6-96231b3b80d8
place to help bring up the PowerPC back end on Darwin. This code is no
longer serves any purpose now that the AsmPrinter does the right thing
all the time printing GlobalValues. --Cruft.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18267 91177308-0d34-0410-b5e6-96231b3b80d8
static global variables whose addresses are taken. This allows us to
convert the following code for taking the address of a static function foo
addis r2, r30, ha16(Ll1__2E_foo_2$non_lazy_ptr-"L00001$pb")
lwz r3, lo16(Ll1__2E_foo_2$non_lazy_ptr-"L00001$pb")(r2)
which also includes linker stub code emitted at the end of the .s file not
shown here, and replace it with this:
addis r2, r30, ha16(l1__2E_foo_2-"L00001$pb")
la r3, lo16(l1__2E_foo_2-"L00001$pb")(r2)
which in addition to not needing linker help, also has no load instruction.
For those not up on PowerPC mnemonics, la is shorthand for add immediate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18239 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM blocks as the keys for the branch rewriter. This fixes treeadd and
many other programs with the JIT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18223 91177308-0d34-0410-b5e6-96231b3b80d8
* Add relocations for refernces to non-lazy darwin stubs and implement
them correctly.
With this change, we can correctly references external globals, and now
all but two UnitTests and all but 1 Regression/C tests pass.
More importantly, bugpoint-jit will start giving us useful testcases,
instead of always telling us that references to external globals don't
work :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18222 91177308-0d34-0410-b5e6-96231b3b80d8
obscure problem where we were doing:
lmw r3,0(r9)
which is undefined on PPC. Now we do:
lmw r3,0(r2)
by force, not relying on the GCC register allocator for luck :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18212 91177308-0d34-0410-b5e6-96231b3b80d8
This should save all argument registers on entry and restore on exit, despite
that, simple things seem to work!!!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18161 91177308-0d34-0410-b5e6-96231b3b80d8
ones noted, which require funny PPC specific inline assembly.
If some angel felt the desire to help me, I think this is that last bit missing
for JIT support (however, generic code emitter might night work right with
the constant pool yet).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18151 91177308-0d34-0410-b5e6-96231b3b80d8
for external functions work. CompilationCallback has not been written, and
stubs for internal functions are not generated yet. This means you can call
printf and exit, and use global variables, but cannot call functions local to
a module yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18145 91177308-0d34-0410-b5e6-96231b3b80d8
reg-reg copies. The necessary conditions for this bug are a GEP that is
used outside the basic block in which it is defined, whose components
other than the pointer are all constant zero, and where the use is
selected before the definition (backwards branch to successsor block).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18084 91177308-0d34-0410-b5e6-96231b3b80d8
shouldn't be forced to coalesce for us: folded GEP operations. This too
fires thousands of times across the testsuite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17947 91177308-0d34-0410-b5e6-96231b3b80d8
directly rather than making a copy for the register allocator to coalesce.
This kills thousands of live intervals across the testsuite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17946 91177308-0d34-0410-b5e6-96231b3b80d8
and properly emitting signed short to unsigned int. This fixes the last
regression vs. the CBE, MultiSource/Applications/hbd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17942 91177308-0d34-0410-b5e6-96231b3b80d8
int test(int x) { return 32768 - x; }
Fixed by teaching the function that checks a constant's validity to be used
as an immediate argument about subtract-from instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17476 91177308-0d34-0410-b5e6-96231b3b80d8
by the recently committed rlwimi.ll test file. Also commit initial code
for bitfield extract, although it is turned off until fully debugged.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17207 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert register numbers from their opcode value to the real value, e.g.
PPC::R1 => 1 and PPC::F1 => 1
* Add correct handling of loading of global values which are PC-relative --
implement ha16() and lo16()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17190 91177308-0d34-0410-b5e6-96231b3b80d8
be listed second as that is how the instructions are usually created (and is the
correct asm syntax) so that it's assembled correctly from its constituents
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17183 91177308-0d34-0410-b5e6-96231b3b80d8
The decimal value given in the manual (8 or 9) really needs to be multiplied by
a factor of 32 because of the group of 5 zero bits after the register code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17182 91177308-0d34-0410-b5e6-96231b3b80d8
as the shift amount operand to a shift instruction. This was causing us to
emit unnecessary clear operations for code such as:
int foo(int x) { return 1 << x; }
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17175 91177308-0d34-0410-b5e6-96231b3b80d8
including registers, constants, and partial support for global addresses
* The JIT is disabled by default to allow building llvm-gcc, which wants to test
running programs during configure
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17149 91177308-0d34-0410-b5e6-96231b3b80d8
1. optional shift left
2. and x, immX
3. and y, immY
4. or z, x, y
==> rlwimi z, x, y, shift, mask begin, mask end
where immX == ~immY and immX is a run of set bits. This transformation
fires 32 times on voronoi, once on espresso, and probably several
dozen times on external benchmarks such as gcc.
To put this in terms of actual code generated for
struct B { unsigned a : 3; unsigned b : 2; };
void storeA (struct B *b, int v) { b->a = v;}
void storeB (struct B *b, int v) { b->b = v;}
Old:
_storeA:
rlwinm r2, r4, 0, 29, 31
lwz r4, 0(r3)
rlwinm r4, r4, 0, 0, 28
or r2, r4, r2
stw r2, 0(r3)
blr
_storeB:
rlwinm r2, r4, 3, 0, 28
rlwinm r2, r2, 0, 27, 28
lwz r4, 0(r3)
rlwinm r4, r4, 0, 29, 26
or r2, r2, r4
stw r2, 0(r3)
blr
New:
_storeA:
lwz r2, 0(r3)
rlwimi r2, r4, 0, 29, 31
stw r2, 0(r3)
blr
_storeB:
lwz r2, 0(r3)
rlwimi r2, r4, 3, 27, 28
stw r2, 0(r3)
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17078 91177308-0d34-0410-b5e6-96231b3b80d8
flag rotate left word immediate then mask insert (rlwimi) as a two-address
instruction, and update the ISel usage of the instruction accordingly.
This will allow us to properly schedule rlwimi, and use it to efficiently
codegen bitfield operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17068 91177308-0d34-0410-b5e6-96231b3b80d8
This transformation fires a few dozen times across the testsuite.
For example, int test2(int X) { return X ^ 0x0FF00FF0; }
Old:
_test2:
lis r2, 4080
ori r2, r2, 4080
xor r3, r3, r2
blr
New:
_test2:
xoris r3, r3, 4080
xori r3, r3, 4080
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17004 91177308-0d34-0410-b5e6-96231b3b80d8
addPassesToEmitMachineCode()
* Add support for registers and constants in getMachineOpValue()
This enables running "int main() { ret 0 }" via the PowerPC JIT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16983 91177308-0d34-0410-b5e6-96231b3b80d8
and 64-bit code emitters that cannot share code unless we use virtual
functions
* Identify components being built by tablegen with more detail by assigning them
to PowerPC, PPC32, or PPC64 more specifically; also avoids seeing 'building
PowerPC XYZ' messages twice, where one is for PPC32 and one for PPC64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16980 91177308-0d34-0410-b5e6-96231b3b80d8
of one or more 1 bits (may wrap from least significant bit to most
significant bit) as the rlwinm rather than andi., andis., or some longer
instructons sequence.
int andn4(int z) { return z & -4; }
int clearhi(int z) { return z & 0x0000FFFF; }
int clearlo(int z) { return z & 0xFFFF0000; }
int clearmid(int z) { return z & 0x00FFFF00; }
int clearwrap(int z) { return z & 0xFF0000FF; }
_andn4:
rlwinm r3, r3, 0, 0, 29
blr
_clearhi:
rlwinm r3, r3, 0, 16, 31
blr
_clearlo:
rlwinm r3, r3, 0, 0, 15
blr
_clearmid:
rlwinm r3, r3, 0, 8, 23
blr
_clearwrap:
rlwinm r3, r3, 0, 24, 7
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16832 91177308-0d34-0410-b5e6-96231b3b80d8
1. Fix an illegal argument to getClassB when deciding whether or not to
sign extend a byte load.
2. Initial addition of isLoad and isStore flags to the instruction .td file
for eventual use in a scheduler.
3. Rewrite of how constants are handled in emitSimpleBinaryOperation so
that we can emit the PowerPC shifted immediate instructions far more
often. This allows us to emit the following code:
int foo(int x) { return x | 0x00F0000; }
_foo:
.LBB_foo_0: ; entry
; IMPLICIT_DEF
oris r3, r3, 15
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16826 91177308-0d34-0410-b5e6-96231b3b80d8
loading a 32bit constant into a register whose low halfword is all zeroes.
We now omit the ori after the lis for the following C code:
int bar(int y) { return y * 0x00F0000; }
_bar:
.LBB_bar_0: ; entry
; IMPLICIT_DEF
lis r2, 15
mullw r3, r3, r2
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16825 91177308-0d34-0410-b5e6-96231b3b80d8
integers that we can use as immediate values in instructions.
Example from yacr2:
- lis r10, -1
- ori r10, r10, 65535
- add r28, r28, r10
+ addi r28, r28, -1
addi r7, r7, 1
addi r9, r9, 1
b .LBB_main_9 ; loopentry.1.i214
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16566 91177308-0d34-0410-b5e6-96231b3b80d8
the ISel to use indexed and non-zero immediate offsets for GEPs that have
more than one use. This is common for instruction sequences such as a load
followed by a modify and store to the same address.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16493 91177308-0d34-0410-b5e6-96231b3b80d8
Darwin and AIX and is not 32- or 64-bit specific
* Bring back PowerPC.td as a result, to make it use the `PowerPC' class name
* Adjust Makefile accordingly
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16174 91177308-0d34-0410-b5e6-96231b3b80d8
Move include/Config and include/Support into include/llvm/Config,
include/llvm/ADT and include/llvm/Support. From here on out, all LLVM
public header files must be under include/llvm/.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16137 91177308-0d34-0410-b5e6-96231b3b80d8
cast fp->bool
cast ulong->fp
algebraic right shift long by non-constant value
These changes tested across most of the test suite. Fixes Regression/casts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16081 91177308-0d34-0410-b5e6-96231b3b80d8
hurt a lot in others. Instead, improve branching version of SetCC and
Select instructions. The old code will be in CVS should we ever need to
dig it up again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15979 91177308-0d34-0410-b5e6-96231b3b80d8
Change int->float cast code to put conversion constants in constant pool.
Shorten code sequence for constant pool fp loads.
Remove LOADLoDirect/LOADLoIndirect psuedo instructions and tweak asmwriter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15913 91177308-0d34-0410-b5e6-96231b3b80d8
that have a frame pointer. This change fixes Burg. In addition, make
the necessary changes to floating point code gen and constant loading after
Chris Lattner's fixes to the asm writer. These changes fix MallocBench/gs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15873 91177308-0d34-0410-b5e6-96231b3b80d8