following handy-dandy properties:
1. it is always correct now
2. it is much faster than before
3. it is easier to understand
This implementation builds off of the recent simplifications of the
legalizer that made it single-pass instead of iterative.
This fixes JM/lencod, JM/ldecod, and
CodeGen/Generic/2006-02-12-InsertLibcall.ll (at least on PPC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26144 91177308-0d34-0410-b5e6-96231b3b80d8
problem where it inline the map insertion call too aggressively. Before this
change it was producing a frame size of 24k for Select_store(), now it's down
to 10k (by calling this method rather than calling the map insertion operator).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26094 91177308-0d34-0410-b5e6-96231b3b80d8
Move all getTargetNode() out of SelectionDAG.h into SelectionDAG.cpp. This
prevents them from being inlined.
Change getTargetNode() so they return SDNode * instead of SDOperand to prevent
copying. It should also help compilation speed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26083 91177308-0d34-0410-b5e6-96231b3b80d8
int %rlwnm(int %A, int %B) {
%C = call int asm "rlwnm $0, $1, $2, $3, $4", "=r,r,r,n,n"(int %A, int %B, int 4, int 17)
ret int %C
}
into:
_rlwnm:
or r2, r3, r3
or r3, r4, r4
rlwnm r2, r2, r3, 4, 17 ;; note the immediates :)
or r3, r2, r2
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@25955 91177308-0d34-0410-b5e6-96231b3b80d8
store EAX -> [ss#0]
[ss#0] += 1
...
use(EAX)
In this case, it is not valid to rewrite this as:
store EAX -> [ss#0]
EAX += 1
store EAX -> [ss#0] ;;; this would also delete the store above
...
use(EAX)
... because EAX is not a dead at that point. Keep track of which registers
we are allowed to clobber, and which ones we aren't, and don't clobber the
ones we're not supposed to. :)
This should resolve the issues on X86 last night.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@25948 91177308-0d34-0410-b5e6-96231b3b80d8
and PhysRegsAvailable maps out into a new AvailableSpills struct. No
functionality change.
This paves the way for a bugfix, coming up next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@25947 91177308-0d34-0410-b5e6-96231b3b80d8
1. a target doesn't know how to fold load/stores into copies, or
2. the spiller rewrites the input to a copy to the same register as the dest
instead of to the reloaded reg.
This will be moved/improved in the near future, but allows elimination of
some ancient x86 hacks. This eliminates 92 copies from SMG2000 on X86 and
163 copies from 252.eon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@25922 91177308-0d34-0410-b5e6-96231b3b80d8
of this, and use it to our advantage (bwahahah). This allows us to eliminate another
60 instructions from smg2000 on PPC (probably significantly more on X86). A common
old-new diff looks like this:
stw r2, 3304(r1)
- lwz r2, 3192(r1)
stw r2, 3300(r1)
- lwz r2, 3192(r1)
stw r2, 3296(r1)
- lwz r2, 3192(r1)
stw r2, 3200(r1)
- lwz r2, 3192(r1)
stw r2, 3196(r1)
- lwz r2, 3192(r1)
+ or r2, r2, r2
stw r2, 3188(r1)
and
- lwz r31, 604(r1)
- lwz r13, 604(r1)
- lwz r14, 604(r1)
- lwz r15, 604(r1)
- lwz r16, 604(r1)
- lwz r30, 604(r1)
+ or r31, r30, r30
+ or r13, r30, r30
+ or r14, r30, r30
+ or r15, r30, r30
+ or r16, r30, r30
+ or r30, r30, r30
Removal of the R = R copies is coming next...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@25919 91177308-0d34-0410-b5e6-96231b3b80d8