packed types with an element count of 1, although more generic support is
coming. This allows LLVM to turn the following code:
void %foo(<1 x float> * %a) {
entry:
%tmp1 = load <1 x float> * %a;
%tmp2 = add <1 x float> %tmp1, %tmp1
store <1 x float> %tmp2, <1 x float> *%a
ret void
}
Into:
_foo:
lfs f0, 0(r3)
fadds f0, f0, f0
stfs f0, 0(r3)
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24416 91177308-0d34-0410-b5e6-96231b3b80d8
alignment information appropriately. Includes code for PowerPC to support
fixed-size allocas with alignment larger than the stack. Support for
arbitrarily aligned dynamic allocas coming soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24224 91177308-0d34-0410-b5e6-96231b3b80d8
a special case hack for X86, make the hack more general: if an incoming argument
register is not used in any block other than the entry block, don't copy it to
a vreg. This helps us compile code like this:
%struct.foo = type { int, int, [0 x ubyte] }
int %test(%struct.foo* %X) {
%tmp1 = getelementptr %struct.foo* %X, int 0, uint 2, int 100
%tmp = load ubyte* %tmp1 ; <ubyte> [#uses=1]
%tmp2 = cast ubyte %tmp to int ; <int> [#uses=1]
ret int %tmp2
}
to:
_test:
lbz r3, 108(r3)
blr
instead of:
_test:
lbz r2, 108(r3)
or r3, r2, r2
blr
The (dead) copy emitted to copy r3 into a vreg for extra-block uses was
increasing the live range of r3 past the load, preventing the coallescing.
This implements CodeGen/PowerPC/reg-coallesce-simple.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24115 91177308-0d34-0410-b5e6-96231b3b80d8
allows us to lower legal return types to something else, to meet ABI
requirements (such as that i64 be returned in two i32 regs on Darwin/ppc).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23802 91177308-0d34-0410-b5e6-96231b3b80d8
Though I have done extensive testing, it is possible that this will break
things in configs I can't test. Please let me know if this causes a problem
and I'll fix it ASAP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23504 91177308-0d34-0410-b5e6-96231b3b80d8
instead of ZERO_EXTEND to eliminate extraneous extensions. This eliminates
dead zero extensions on formal arguments and other cases on PPC, implementing
the newly tightened up test/Regression/CodeGen/PowerPC/small-arguments.ll test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23205 91177308-0d34-0410-b5e6-96231b3b80d8
Nate noticed in yacr2 (and I know occurs in other places as well).
This is still rough, as the critical edge blocks are not intelligently placed
but is added to get some idea to see if this improves performance.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22825 91177308-0d34-0410-b5e6-96231b3b80d8
used to tack a register number onto the node.
Instead of doing this, make a new node, RegisterSDNode, which is a leaf
containing a register number. These three operations just become normal
DAG nodes now, instead of requiring special handling.
Note that with this change, it is no longer correct to make illegal
CopyFromReg/CopyToReg nodes. The legalizer will not touch them, and this
is bad, so don't do it. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22806 91177308-0d34-0410-b5e6-96231b3b80d8
CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf. This will make it possible for other node to use
CC's as operands in the future...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22728 91177308-0d34-0410-b5e6-96231b3b80d8
1. Pass Value*'s into lowering methods so that the proper pointers can be
added to load/stores from the valist
2. Intrinsics that return void should only return a token chain, not a token
chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@22338 91177308-0d34-0410-b5e6-96231b3b80d8
population (ctpop). Generic lowering is implemented, however only promotion
is implemented for SelectionDAG at the moment.
More coming soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21676 91177308-0d34-0410-b5e6-96231b3b80d8
(TRUNK)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode which contains the Value*. Note that if the operation is introduced by the backend, it will still have the operand, but the value* will be null.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@21599 91177308-0d34-0410-b5e6-96231b3b80d8
them up after the code has been emitted. This allows targets to select one
mbb as multiple mbb's as needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20937 91177308-0d34-0410-b5e6-96231b3b80d8
returned integer values all of the way to 64-bits (we only did it to 32-bits
leaving the top bits undefined). This causes problems for targets like alpha
whose ABI's define the top bits too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20926 91177308-0d34-0410-b5e6-96231b3b80d8
using Function::arg_{iterator|begin|end}. Likewise Module::g* -> Module::global_*.
This patch is contributed by Gabor Greif, thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@20597 91177308-0d34-0410-b5e6-96231b3b80d8
do it. This results in better code on X86 for floats (because if strict
precision is not required, we can elide some more expensive double -> float
conversions like the old isel did), and allows other targets to emit
CopyFromRegs that are not legal for arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19668 91177308-0d34-0410-b5e6-96231b3b80d8
X86/reg-pressure.ll again, and allows us to do nice things in other cases.
For example, we now codegen this sort of thing:
int %loadload(int *%X, int* %Y) {
%Z = load int* %Y
%Y = load int* %X ;; load between %Z and store
%Q = add int %Z, 1
store int %Q, int* %Y
ret int %Y
}
Into this:
loadload:
mov %EAX, DWORD PTR [%ESP + 4]
mov %EAX, DWORD PTR [%EAX]
mov %ECX, DWORD PTR [%ESP + 8]
inc DWORD PTR [%ECX]
ret
where we weren't able to form the 'inc [mem]' before. This also lets the
instruction selector emit loads in any order it wants to, which can be good
for register pressure as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19644 91177308-0d34-0410-b5e6-96231b3b80d8
the basic block that uses them if possible. This is a big win on X86, as it
lets us fold the argument loads into instructions and reduce register pressure
(by not loading all of the arguments in the entry block).
For this (contrived to show the optimization) testcase:
int %argtest(int %A, int %B) {
%X = sub int 12345, %A
br label %L
L:
%Y = add int %X, %B
ret int %Y
}
we used to produce:
argtest:
mov %ECX, DWORD PTR [%ESP + 4]
mov %EAX, 12345
sub %EAX, %ECX
mov %EDX, DWORD PTR [%ESP + 8]
.LBBargtest_1: # L
add %EAX, %EDX
ret
now we produce:
argtest:
mov %EAX, 12345
sub %EAX, DWORD PTR [%ESP + 4]
.LBBargtest_1: # L
add %EAX, DWORD PTR [%ESP + 8]
ret
This also fixes the FIXME in the code.
BTW, this occurs in real code. 164.gzip shrinks from 8623 to 8608 lines of
.s file. The stack frame in huft_build shrinks from 1644->1628 bytes,
inflate_codes shrinks from 116->108 bytes, and inflate_block from 2620->2612,
due to fewer spills.
Take that alkis. :-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19639 91177308-0d34-0410-b5e6-96231b3b80d8