This makes the code much simpler, and the two cases really do belong apart.
Once we do it, it's pretty obvious how flawed the logic was for A != A case,
so I fixed it (fixing PR369).
This also uses freeStackSlotAfter instead of inserting an fxchg then
popStackAfter'ing in the case where there is a dead result (unlikely, but
possible), producing better code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14139 91177308-0d34-0410-b5e6-96231b3b80d8
Get rid of separate numbering for LLVM BasicBlocks; use the automatically
generated MachineBasicBlock numbering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13567 91177308-0d34-0410-b5e6-96231b3b80d8
sized allocas in the entry block). Instead of generating code like this:
entry:
reg1024 = ESP+1234
... (much later)
*reg1024 = 17
Generate code that looks like this:
entry:
(no code generated)
... (much later)
t = ESP+1234
*t = 17
The advantage being that we DRAMATICALLY reduce the register pressure for these
silly temporaries (they were all being spilled to the stack, resulting in very
silly code). This is actually a manual implementation of rematerialization :)
I have a patch to fold the alloca address computation into loads & stores, which
will make this much better still, but just getting this right took way too much time
and I'm sleepy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13554 91177308-0d34-0410-b5e6-96231b3b80d8
compiling things like 'add long %X, 1'. The problem is that we were switching
the order of the operands for longs even though we can't fold them yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13451 91177308-0d34-0410-b5e6-96231b3b80d8
In InsertFPRegKills(), just check the MachineBasicBlock for successors
instead of its corresponding BasicBlock.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13213 91177308-0d34-0410-b5e6-96231b3b80d8
The iterator is pointing at the next instruction which should not disappear
when doing the load/store replacement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12954 91177308-0d34-0410-b5e6-96231b3b80d8
even when the "optimization" I added before is turned off. It generates this
extremely pointless code:
test:
fld QWORD PTR [%ESP + 4]
mov %AL, 0
test %AL, %AL
fcmove %ST(0), %ST(0)
ret
Good thing the optimizer will have removed this before code generation
anyway. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12939 91177308-0d34-0410-b5e6-96231b3b80d8
Fix several bugs in the intrinsics:
1. Make sure to copy the input registers before the instructions that use them
2. Make sure to copy the value returned by 'in' out of EAX into the register
it is supposed to be in.
This fixes assertions when using in/out and linear scan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12896 91177308-0d34-0410-b5e6-96231b3b80d8
of the fucom[p][p] instructions. This allows us to code generate this function
bool %test(double %X, double %Y) {
%C = setlt double %Y, %X
ret bool %C
}
... into:
test:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [%ESP + 12]
fucomip %ST(1)
fstp %ST(0)
setb %AL
movsx %EAX, %AL
ret
where before we generated:
test:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [%ESP + 12]
fucompp
** fnstsw
** sahf
setb %AL
movsx %EAX, %AL
ret
The two marked instructions (which are the ones eliminated) are very bad,
because they serialize execution of the processor. These instructions are
available on the PPRO and later, but since we already use cmov's we aren't
losing any portability.
I retained the old code for the day when we decide we want to support back
to the 386.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12852 91177308-0d34-0410-b5e6-96231b3b80d8
If the source of the cast is a load, we can just use the source memory location,
without having to create a temporary stack slot entry.
Before we code generated this:
double %int(int* %P) {
%V = load int* %P
%V2 = cast int %V to double
ret double %V2
}
into:
int:
sub %ESP, 4
mov %EAX, DWORD PTR [%ESP + 8]
mov %EAX, DWORD PTR [%EAX]
mov DWORD PTR [%ESP], %EAX
fild DWORD PTR [%ESP]
add %ESP, 4
ret
Now we produce this:
int:
mov %EAX, DWORD PTR [%ESP + 4]
fild DWORD PTR [%EAX]
ret
... which is nicer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12846 91177308-0d34-0410-b5e6-96231b3b80d8
for mul and div.
Instead of generating this:
test_divr:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [.CPItest_divr_0]
fdivrp %ST(1)
ret
We now generate this:
test_divr:
fld QWORD PTR [%ESP + 4]
fdivr QWORD PTR [.CPItest_divr_0]
ret
This code desperately needs refactoring, which will come in the next
patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12841 91177308-0d34-0410-b5e6-96231b3b80d8