this LLVM function:
int %foo() {
ret int cast (int** getelementptr (int** null, int 1) to int)
}
into:
foo:
mov %EAX, 0
lea %EAX, DWORD PTR [%EAX + 4]
ret
Now we compile it into:
foo:
mov %EAX, 4
ret
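The reason this folds away completely: a getelementptr of element 1 off a null int** base is just sizeof(int*), so the whole cast expression is a target-dependent constant. A hedged C++ analog of the idiom (mine, not part of the commit):

// Hedged C++ analog (not from the commit): the classic "sizeof via a null
// base pointer" idiom; formally murky in C++, but it is exactly the shape
// the getelementptr constant expression encodes.
#include <cstdio>

static int foo() {
  return (int)(long)&((int **)0)[1];   // == sizeof(int*)
}

int main() {
  std::printf("%d\n", foo());          // prints 4 on ILP32 targets, 8 on LP64
  return 0;
}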
This sequence is frequently generated by the MSIL front-end, and soon by the malloc lowering pass and Java front-ends as well.
-Chris
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14834 91177308-0d34-0410-b5e6-96231b3b80d8
float as a truncation by going through memory. This truncation was being
skipped, which caused 175.vpr to fail after aggressive register promotion.
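For context, a hedged C++ illustration of why the store matters (mine, not the 175.vpr code): on x87 the value sits in an extended-precision register, and the explicit round trip through a 4-byte memory slot is what performs the truncation:

#include <cstdio>

int main() {
  double d = 1.0 / 3.0;          // full double precision
  volatile float f = (float)d;   // forced truncation through a 32-bit slot
  std::printf("%.17g\n%.17g\n", d, (double)f);  // the two values differ
  return 0;
}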
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14473 91177308-0d34-0410-b5e6-96231b3b80d8
comparisons. In an 'isunordered' predicate, which looks like this at
the LLVM level:
%a = call bool %llvm.isnan(double %X)
%b = call bool %llvm.isnan(double %Y)
%COM = or bool %a, %b
We used to generate this code:
fxch %ST(1)
fucomip %ST(0), %ST(0)
setp %AL
fucomip %ST(0), %ST(0)
setp %AH
or %AL, %AH
With this patch, we generate this code:
fucomip %ST(0), %ST(1)
fstp %ST(0)
setp %AL
Which should make alkis happy. Tested as X86/compare_folding.llx:test1
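For reference, the predicate in source terms; a hedged C++ analog (mine, not the commit's test case): one unordered comparison answers it, which is why a single fucomip feeding setp suffices:

#include <cmath>

// Hedged analog: "is either operand NaN?" is a single unordered compare.
bool is_unordered(double X, double Y) {
  return std::isunordered(X, Y);   // same as std::isnan(X) || std::isnan(Y)
}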
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14148 91177308-0d34-0410-b5e6-96231b3b80d8
sized allocas in the entry block). Instead of generating code like this:
entry:
reg1024 = ESP+1234
... (much later)
*reg1024 = 17
Generate code that looks like this:
entry:
(no code generated)
... (much later)
t = ESP+1234
*t = 17
The advantage is that we DRAMATICALLY reduce the register pressure for these
silly temporaries (they were all being spilled to the stack, resulting in very
silly code). This is actually a manual implementation of rematerialization :)
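The same idea as a hedged C++ sketch (mine, not the code generator): recompute the cheap, side-effect-free address at its use instead of keeping it live across a long region:

// Before: 'addr' is live across all the unrelated work, costing a register
// or (worse) a spill slot.
void before(char *frame) {
  int *addr = reinterpret_cast<int *>(frame + 1234);
  // ... much later, lots of unrelated code keeping 'addr' live ...
  *addr = 17;
}

// After: nothing extra is live across the gap; the address is rematerialized
// right before the store.
void after(char *frame) {
  // ... much later ...
  *reinterpret_cast<int *>(frame + 1234) = 17;
}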
I have a patch to fold the alloca address computation into loads & stores, which
will make this much better still, but just getting this right took way too much time
and I'm sleepy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13554 91177308-0d34-0410-b5e6-96231b3b80d8
compiling things like 'add long %X, 1'. The problem is that we were switching
the order of the operands for longs even though we can't fold them yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13451 91177308-0d34-0410-b5e6-96231b3b80d8
In InsertFPRegKills(), just check the MachineBasicBlock for successors
instead of its corresponding BasicBlock.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13213 91177308-0d34-0410-b5e6-96231b3b80d8
The iterator points at the next instruction, which must not disappear
when doing the load/store replacement.
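A hedged C++ sketch of the iterator discipline involved (generic container code, not the actual pass):

#include <list>

// Keep the loop iterator on the *next* element and erase through a saved
// copy, so the live iterator never names an element that disappears.
void replaceMarked(std::list<int> &insts) {
  for (auto it = insts.begin(); it != insts.end();) {
    auto cur = it++;              // 'it' now points at the next instruction
    if (*cur < 0) {               // stand-in for "this load/store needs replacing"
      insts.insert(cur, -*cur);   // emit the replacement before the old one
      insts.erase(cur);           // erasing 'cur' leaves 'it' untouched
    }
  }
}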
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12954 91177308-0d34-0410-b5e6-96231b3b80d8
Fix several bugs in the intrinsics:
1. Make sure to copy the input registers before the instructions that use them
2. Make sure to copy the value returned by 'in' out of EAX into the register
it is supposed to be in.
This fixes assertions when using in/out and linear scan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12896 91177308-0d34-0410-b5e6-96231b3b80d8
of the fucom[p][p] instructions. This allows us to codegen this function:
bool %test(double %X, double %Y) {
%C = setlt double %Y, %X
ret bool %C
}
... into:
test:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [%ESP + 12]
fucomip %ST(1)
fstp %ST(0)
setb %AL
movsx %EAX, %AL
ret
where before we generated:
test:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [%ESP + 12]
fucompp
** fnstsw
** sahf
setb %AL
movsx %EAX, %AL
ret
The two marked instructions (the ones that are eliminated) are very bad,
because they serialize execution of the processor. The new fucomi[p] forms are
only available on the PPro and later, but since we already use cmovs we aren't
losing any portability.
I retained the old code for the day when we decide we want to support
processors back to the 386.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12852 91177308-0d34-0410-b5e6-96231b3b80d8
If the source of the cast is a load, we can just use the source memory location,
without having to create a temporary stack slot entry.
Before, we compiled this:
double %int(int* %P) {
%V = load int* %P
%V2 = cast int %V to double
ret double %V2
}
into:
int:
sub %ESP, 4
mov %EAX, DWORD PTR [%ESP + 8]
mov %EAX, DWORD PTR [%EAX]
mov DWORD PTR [%ESP], %EAX
fild DWORD PTR [%ESP]
add %ESP, 4
ret
Now we produce this:
int:
mov %EAX, DWORD PTR [%ESP + 4]
fild DWORD PTR [%EAX]
ret
... which is nicer.
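At the source level this is simply (a hedged C++ analog of the %int function above; the name is mine, since 'int' is reserved in C++):

// An int loaded from memory and converted to double; the improved lowering
// is a single fild straight from *P, with no temporary stack slot.
double int_to_double(int *P) {
  return static_cast<double>(*P);
}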
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12846 91177308-0d34-0410-b5e6-96231b3b80d8
for mul and div.
Instead of generating this:
test_divr:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [.CPItest_divr_0]
fdivrp %ST(1)
ret
We now generate this:
test_divr:
fld QWORD PTR [%ESP + 4]
fdivr QWORD PTR [.CPItest_divr_0]
ret
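For reference, a hedged C++ analog of test_divr (the real constant lives in the constant pool and is not shown here; 2.0 is a stand-in):

// A constant divided by the argument maps onto a single fdivr with a
// memory operand, instead of an extra fld plus fdivrp.
double test_divr(double X) {
  return 2.0 / X;   // 2.0 stands in for the constant-pool value
}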
This code desperately needs refactoring, which will come in the next
patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12841 91177308-0d34-0410-b5e6-96231b3b80d8