int -> FP casting code. Note that we don't have to set it for FP operations
that take FP values as operands: whatever produces the FP value will set the
flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19451 91177308-0d34-0410-b5e6-96231b3b80d8
evaluate the LHS or the RHS of an operation first. This causes good things
to happen. For example, instead of compiling a loop to this:
.LBBstrength_result7_1: # loopentry
movl 16(%esp), %edi
movl (%edi), %edi ;;; LOAD
movl (%ecx), %ebx
movl $2, (%eax,%ebx,4)
movl (%edx), %ebx
movl %esi, %ebp
addl $21, %ebp
addl $42, %esi
cmpl $0, %edi ;;; USE
cmovne %esi, %ebp
cmpl %ebp, %ebx
movl %ebp, %esi
jg .LBBstrength_result7_1
We now compile it to this:
.LBBstrength_result7_1: # loopentry
movl %edi, %ebx
addl $42, %ebx
addl $21, %edi
movl (%ecx), %ebp ;; LOAD
cmpl $0, %ebp ;; USE
cmovne %ebx, %edi
movl (%edx), %ebx
movl $2, (%eax,%ebx,4)
movl (%esi), %ebx
cmpl %edi, %ebx
jg .LBBstrength_result7_1
Which reduces register pressure enough (in this case) to avoid spilling in the
loop.
As another example, consider the CodeGen/X86/regpressure.ll testcase. We
used to generate this code for both cases:
regpressure1:
subl $32, %esp
movl %esi, 12(%esp)
movl %edi, 8(%esp)
movl %ebx, 4(%esp)
movl %ebp, (%esp)
movl 36(%esp), %ecx
movl (%ecx), %eax
movl 4(%ecx), %edx
movl %edx, 24(%esp)
movl 8(%ecx), %edx
movl %edx, 16(%esp)
movl 12(%ecx), %edx
movl 16(%ecx), %esi
movl 20(%ecx), %edi
movl 24(%ecx), %ebx
movl %ebx, 28(%esp)
movl 28(%ecx), %ebx
movl 32(%ecx), %ebp
movl %ebp, 20(%esp)
movl 36(%ecx), %ecx
imull 24(%esp), %eax
imull 16(%esp), %eax
imull %edx, %eax
imull %esi, %eax
imull %edi, %eax
imull 28(%esp), %eax
imull %ebx, %eax
imull 20(%esp), %eax
imull %ecx, %eax
movl (%esp), %ebp
movl 4(%esp), %ebx
movl 8(%esp), %edi
movl 12(%esp), %esi
addl $32, %esp
ret
This code is basically trying to do all of the loads first, then execute all
of the multiplies. Because we run out of registers, lots of spill code happens.
We now generate this code for both cases:
regpressure1:
movl 4(%esp), %ecx
movl (%ecx), %eax
movl 4(%ecx), %edx
imull %edx, %eax
movl 8(%ecx), %edx
imull %edx, %eax
movl 12(%ecx), %edx
imull %edx, %eax
movl 16(%ecx), %edx
imull %edx, %eax
movl 20(%ecx), %edx
imull %edx, %eax
movl 24(%ecx), %edx
imull %edx, %eax
movl 28(%ecx), %edx
imull %edx, %eax
movl 32(%ecx), %edx
imull %edx, %eax
movl 36(%ecx), %ecx
imull %ecx, %eax
ret
which is much nicer (when we fold loads into the muls it will be even better).
The old instruction selector used to produce the good code for regpressure1
but not for regpressure2, as it depended on the order of operations in the
LLVM code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19449 91177308-0d34-0410-b5e6-96231b3b80d8
of an ADDri (due to current restrictions on MachineOperand :( ). This allows
us to generate:
leal Data+16000, %edx
instead of:
movl $Data, %edx
addl $16000, %edx
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19420 91177308-0d34-0410-b5e6-96231b3b80d8
store float 123.45, float* %P
as an integer store. This adds handling of float immediate stores as integers
for arguments passed function calls.
This is now tested by CodeGen/X86/store-fp-constant.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19364 91177308-0d34-0410-b5e6-96231b3b80d8
For now, this is the default, as the current selector is missing some big pieces.
To enable the new selector, pass -disable-pattern-isel=false to llc or lli.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19335 91177308-0d34-0410-b5e6-96231b3b80d8
precisely represented as a float, put it into the constant pool as a
float.
2. Use the cbw/cwd/cdq instructions instead of an explicit SAR for signed
division.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19291 91177308-0d34-0410-b5e6-96231b3b80d8
- unsigned TrueValue = getReg(TrueVal, BB, BB->begin());
+ unsigned TrueValue = getReg(TrueVal);
Fixes the PPC regressions from last night.
The other hunk is just a clarity improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19263 91177308-0d34-0410-b5e6-96231b3b80d8
to get Visual Studio to link in X86.lib to the executables that need it. There
is another way of doing it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19252 91177308-0d34-0410-b5e6-96231b3b80d8
1. Add new instructions for checking parity flags: JP, JNP, SETP, SETNP.
2. Set the isCommutable and isPromotableTo3Address bits on several
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19246 91177308-0d34-0410-b5e6-96231b3b80d8
While we're at it, improve codegen of select instructions. For this
testcase:
int %test(bool %C, int %A, int %B) {
%D = select bool %C, int %A, int %B
ret int %D
}
We used to generate this code:
_test:
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
b .LBB_test_3 ;
.LBB_test_2: ;
or r5, r4, r4
.LBB_test_3: ;
or r3, r5, r5
blr
Now we emit:
_test:
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
or r4, r5, r5
.LBB_test_2: ;
or r3, r4, r4
blr
-Chris
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19214 91177308-0d34-0410-b5e6-96231b3b80d8
save small amounts of time for functions that don't call llvm.returnaddress
or llvm.frameaddress (which is almost all functions).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@19006 91177308-0d34-0410-b5e6-96231b3b80d8
don't support long double anyway, and this gives us FP results closer to
other targets.
This also speeds up 179.art from 41.4s to 18.32s, by eliminating a problem
with extra precision that causes an FP == comparison to fail (leading to
extra loop iterations).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18895 91177308-0d34-0410-b5e6-96231b3b80d8
ctor parameters can be defaulted.
Print the transformed llvm code input to the instruction selector
when -print-machineinstrs is on, just like V9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18794 91177308-0d34-0410-b5e6-96231b3b80d8
* Don't emit the Index * ElementSize multiply if Index is a constant.
* Use a shift, not a multiply, if ElementSize is 1/2/4/8.
* If ElementSize fits in the immediate field of SMUL, then put it there.
Fix a bug where struct offsets might be truncated (ConstantSInt::get is
now used instead of ConstantInt::get).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18792 91177308-0d34-0410-b5e6-96231b3b80d8
each pair. I think this fixes that.
One of these days, I swear I'm going to get the hang of C++ iterators.
Really.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18734 91177308-0d34-0410-b5e6-96231b3b80d8
intrinsic lowering ever introduces constants.
Rename local symbols before printing function bodies, fixing 255.vortex
with the CBE!!!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18534 91177308-0d34-0410-b5e6-96231b3b80d8
Including alloca.h on Solaris brings in the prototype of strftime(), which
breaks compilation of CBE generated code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18435 91177308-0d34-0410-b5e6-96231b3b80d8
instead of 80-bits of precision. This fixes PR467.
This change speeds up fldry on X86 with LLC from 7.32s on apoc to 4.68s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18433 91177308-0d34-0410-b5e6-96231b3b80d8
addi r3, r3, -1
instead of
addi r3, r3, 1
for 'sub int X, 1'.
Secondarily, this fixes several cases where we could crash given an unsigned
constant. And fixes a couple of minor missed optimization cases, such as
xor X, ~0U -> not X
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18379 91177308-0d34-0410-b5e6-96231b3b80d8
to Brian and the Sun compiler for pointing out that the obvious works :)
This also enables folding all long comparisons into setcc and branch
instructions: before we could only do == and !=
For example, for:
void test(unsigned long long A, unsigned long long B) {
if (A < B) foo();
}
We now generate:
test:
subl $4, %esp
movl %esi, (%esp)
movl 8(%esp), %eax
movl 12(%esp), %ecx
movl 16(%esp), %edx
movl 20(%esp), %esi
subl %edx, %eax
sbbl %esi, %ecx
jae .LBBtest_2 # UnifiedReturnBlock
.LBBtest_1: # then
call foo
movl (%esp), %esi
addl $4, %esp
ret
.LBBtest_2: # UnifiedReturnBlock
movl (%esp), %esi
addl $4, %esp
ret
Instead of:
test:
subl $12, %esp
movl %esi, 8(%esp)
movl %ebx, 4(%esp)
movl 16(%esp), %eax
movl 20(%esp), %ecx
movl 24(%esp), %edx
movl 28(%esp), %esi
cmpl %edx, %eax
setb %al
cmpl %esi, %ecx
setb %bl
cmove %ax, %bx
testb %bl, %bl
je .LBBtest_2 # UnifiedReturnBlock
.LBBtest_1: # then
call foo
movl 4(%esp), %ebx
movl 8(%esp), %esi
addl $12, %esp
ret
.LBBtest_2: # UnifiedReturnBlock
movl 4(%esp), %ebx
movl 8(%esp), %esi
addl $12, %esp
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18330 91177308-0d34-0410-b5e6-96231b3b80d8
place to help bring up the PowerPC back end on Darwin. This code is no
longer serves any purpose now that the AsmPrinter does the right thing
all the time printing GlobalValues. --Cruft.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18267 91177308-0d34-0410-b5e6-96231b3b80d8
static global variables whose addresses are taken. This allows us to
convert the following code for taking the address of a static function foo
addis r2, r30, ha16(Ll1__2E_foo_2$non_lazy_ptr-"L00001$pb")
lwz r3, lo16(Ll1__2E_foo_2$non_lazy_ptr-"L00001$pb")(r2)
which also includes linker stub code emitted at the end of the .s file not
shown here, and replace it with this:
addis r2, r30, ha16(l1__2E_foo_2-"L00001$pb")
la r3, lo16(l1__2E_foo_2-"L00001$pb")(r2)
which in addition to not needing linker help, also has no load instruction.
For those not up on PowerPC mnemonics, la is shorthand for add immediate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18239 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM blocks as the keys for the branch rewriter. This fixes treeadd and
many other programs with the JIT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18223 91177308-0d34-0410-b5e6-96231b3b80d8
* Add relocations for refernces to non-lazy darwin stubs and implement
them correctly.
With this change, we can correctly references external globals, and now
all but two UnitTests and all but 1 Regression/C tests pass.
More importantly, bugpoint-jit will start giving us useful testcases,
instead of always telling us that references to external globals don't
work :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18222 91177308-0d34-0410-b5e6-96231b3b80d8
obscure problem where we were doing:
lmw r3,0(r9)
which is undefined on PPC. Now we do:
lmw r3,0(r2)
by force, not relying on the GCC register allocator for luck :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18212 91177308-0d34-0410-b5e6-96231b3b80d8