use MachineBasicBlocks. To do this, we traverse the Machine CFG instead of
the LLVM CFG, which is also *MUCH* more efficient by having fewer levels of
indirections and mappings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13301 91177308-0d34-0410-b5e6-96231b3b80d8
is only used by a cast, and the casted type is the same size as the original
allocation, it would eliminate the cast by folding it into the allocation.
Unfortunately, it was placing the new allocation instruction right before
the cast, which could pull (for example) alloca instructions into the body
of a function. This turns statically allocatable allocas into expensive
dynamically allocated allocas, which is bad bad bad.
This fixes the problem by placing the new allocation instruction at the same
place the old one was, duh. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13289 91177308-0d34-0410-b5e6-96231b3b80d8
* Make contained ostream pointer, not reference
* Allow setting of that ostream via setStream()
class CachedWriter:
* setStream() in turn calls setStream() on the AssemblyWriter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13247 91177308-0d34-0410-b5e6-96231b3b80d8
In InsertFPRegKills(), just check the MachineBasicBlock for successors
instead of its corresponding BasicBlock.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13213 91177308-0d34-0410-b5e6-96231b3b80d8
Include SparcV9RegisterInfo.h.
Add a getRegisterInfo() accessor and SparcV9RegisterInfo instance, just like
on the X86 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13146 91177308-0d34-0410-b5e6-96231b3b80d8
functions for now). This automatically turns on the printing of machine
registers using their own real names, instead of goofy things like %mreg(42),
and allows us to migrate code incrementally to the new interface as we see fit.
The register file description it uses is hand-written, so that the register
numbers will match the ones that the SparcV9 target already uses.
Perhaps someday we'll tablegen it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13145 91177308-0d34-0410-b5e6-96231b3b80d8
This prepares us to be able to de-virtualize and de-abstract it, and
take the register allocator bits out and move them into the register allocator
proper...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13127 91177308-0d34-0410-b5e6-96231b3b80d8
documentation that this module needs to be made independent of the
register file description of the current target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13125 91177308-0d34-0410-b5e6-96231b3b80d8
because 1) the first instruction might not be a call site, and
2) CS and SF.Caller were not getting set to point to the new call site
anyway (resulting in a crash on e.g. call %llvm.memset).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13122 91177308-0d34-0410-b5e6-96231b3b80d8
the Module. The default behavior keeps functionality as before: the chosen
function is the one that remains.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13111 91177308-0d34-0410-b5e6-96231b3b80d8
loop. This eliminates the extra add from the previous case, but it's
not clear that this will be a performance win overall. Tommorows test
results will tell. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13103 91177308-0d34-0410-b5e6-96231b3b80d8
over its USES. If it's dead it doesn't have any uses! :)
Thanks to the fabulous and mysterious Bill Wendling for pointing this out. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13102 91177308-0d34-0410-b5e6-96231b3b80d8
Eventually it would be nice if CallGraph maintained an ilist of CallGraphNode's instead
of a vector of pointers to them, but today is not that day.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13100 91177308-0d34-0410-b5e6-96231b3b80d8
of IntCC, FloatCC, and Special types.
Make SparcV9RegInfo::getRegClassIDOfRegType() return the right answer
if you ask for the class corresponding to SpecialRegType.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13095 91177308-0d34-0410-b5e6-96231b3b80d8
the name of %fsr (as the comment in SparcV9RegClassInfo.h used to suggest)
you would walk off the end of the FloatCCRegName array.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13070 91177308-0d34-0410-b5e6-96231b3b80d8
structure to being dynamically computed on demand. This makes updating
loop information MUCH easier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13045 91177308-0d34-0410-b5e6-96231b3b80d8
that the exit block of the loop becomes the new entry block of the function.
This was causing a verifier assertion on 252.eon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13039 91177308-0d34-0410-b5e6-96231b3b80d8
block. The primary motivation for doing this is that we can now unroll nested loops.
This makes a pretty big difference in some cases. For example, in 183.equake,
we are now beating the native compiler with the CBE, and we are a lot closer
with LLC.
I'm now going to play around a bit with the unroll factor and see what effect
it really has.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13034 91177308-0d34-0410-b5e6-96231b3b80d8
limited. Even in it's extremely simple state (it can only *fully* unroll single
basic block loops that execute a constant number of times), it already helps improve
performance a LOT on some benchmarks, particularly with the native code generators.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13028 91177308-0d34-0410-b5e6-96231b3b80d8
operations. This allows us to compile this testcase:
int main() {
int h = 1;
do h = 3 * h + 1; while (h <= 256);
printf("%d\n", h);
return 0;
}
into this:
int %main() {
entry:
call void %__main( )
%tmp.6 = call int (sbyte*, ...)* %printf( sbyte* getelementptr ([4 x sbyte]* %.str_1, long 0, long 0), int 364 ) ; <int> [#uses=0]
ret int 0
}
This testcase was taken directly from 256.bzip2, believe it or not.
This code is not as general as I would like. Next up is to refactor it
a bit to handle more cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13019 91177308-0d34-0410-b5e6-96231b3b80d8
even if the loop is using expressions that we can't compute as a closed-form.
This allows us to calculate that this function always returns 55:
int test() {
double X;
int Count = 0;
for (X = 100; X > 1; X = sqrt(X), ++Count)
/*empty*/;
return Count;
}
And allows us to compute trip counts for loops like:
int h = 1;
do h = 3 * h + 1; while (h <= 256);
(which occurs in bzip2), and for this function, which occurs after inlining
and other optimizations:
int popcount()
{
int x = 666;
int result = 0;
while (x != 0) {
result = result + (x & 0x1);
x = x >> 1;
}
return result;
}
We still cannot compute the exit values of result or h in the two loops above,
which means we cannot delete the loop, but we are getting closer. Being able to
compute a constant trip count for these two loops will allow us to unroll them
completely though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13017 91177308-0d34-0410-b5e6-96231b3b80d8
that does not dominate all of its users, but is in the same basic block as
its users. This class of error is what caused the mysterious CBE only
failures last night.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12979 91177308-0d34-0410-b5e6-96231b3b80d8
Basically we were using SimplifyCFG as a huge sledgehammer for a simple
optimization. Because simplifycfg does so many things, we can't use it
for this purpose.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12977 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of producing code like this:
Loop:
X = phi 0, X2
...
X2 = X + 1
if (X != N-1) goto Loop
We now generate code that looks like this:
Loop:
X = phi 0, X2
...
X2 = X + 1
if (X2 != N) goto Loop
This has two big advantages:
1. The trip count of the loop is now explicit in the code, allowing
the direct implementation of Loop::getTripCount()
2. This reduces register pressure in the loop, and allows X and X2 to be
put into the same register.
As a consequence of the second point, the code we generate for loops went
from:
.LBB2: # no_exit.1
...
mov %EDI, %ESI
inc %EDI
cmp %ESI, 2
mov %ESI, %EDI
jne .LBB2 # PC rel: no_exit.1
To:
.LBB2: # no_exit.1
...
inc %ESI
cmp %ESI, 3
jne .LBB2 # PC rel: no_exit.1
... which has two fewer moves, and uses one less register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12961 91177308-0d34-0410-b5e6-96231b3b80d8
The iterator is pointing at the next instruction which should not disappear
when doing the load/store replacement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12954 91177308-0d34-0410-b5e6-96231b3b80d8
at the bottom of the loop instead of the top. This reduces the number of
overlapping live ranges a lot, for example, eliminating a spill in an important
loop in 183.equake with linear scan.
I still need to make the exit comparison of the loop use the post-incremented
version of this variable, but this is an easy first step.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12952 91177308-0d34-0410-b5e6-96231b3b80d8
even when the "optimization" I added before is turned off. It generates this
extremely pointless code:
test:
fld QWORD PTR [%ESP + 4]
mov %AL, 0
test %AL, %AL
fcmove %ST(0), %ST(0)
ret
Good thing the optimizer will have removed this before code generation
anyway. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12939 91177308-0d34-0410-b5e6-96231b3b80d8
Fix several bugs in the intrinsics:
1. Make sure to copy the input registers before the instructions that use them
2. Make sure to copy the value returned by 'in' out of EAX into the register
it is supposed to be in.
This fixes assertions when using in/out and linear scan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12896 91177308-0d34-0410-b5e6-96231b3b80d8
SCC passes much more useful. In particular, this should fix the incredibly
stupid missed inlining opportunities that the inliner suffered from.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12860 91177308-0d34-0410-b5e6-96231b3b80d8
of the fucom[p][p] instructions. This allows us to code generate this function
bool %test(double %X, double %Y) {
%C = setlt double %Y, %X
ret bool %C
}
... into:
test:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [%ESP + 12]
fucomip %ST(1)
fstp %ST(0)
setb %AL
movsx %EAX, %AL
ret
where before we generated:
test:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [%ESP + 12]
fucompp
** fnstsw
** sahf
setb %AL
movsx %EAX, %AL
ret
The two marked instructions (which are the ones eliminated) are very bad,
because they serialize execution of the processor. These instructions are
available on the PPRO and later, but since we already use cmov's we aren't
losing any portability.
I retained the old code for the day when we decide we want to support back
to the 386.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12852 91177308-0d34-0410-b5e6-96231b3b80d8
If the source of the cast is a load, we can just use the source memory location,
without having to create a temporary stack slot entry.
Before we code generated this:
double %int(int* %P) {
%V = load int* %P
%V2 = cast int %V to double
ret double %V2
}
into:
int:
sub %ESP, 4
mov %EAX, DWORD PTR [%ESP + 8]
mov %EAX, DWORD PTR [%EAX]
mov DWORD PTR [%ESP], %EAX
fild DWORD PTR [%ESP]
add %ESP, 4
ret
Now we produce this:
int:
mov %EAX, DWORD PTR [%ESP + 4]
fild DWORD PTR [%EAX]
ret
... which is nicer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12846 91177308-0d34-0410-b5e6-96231b3b80d8
for mul and div.
Instead of generating this:
test_divr:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [.CPItest_divr_0]
fdivrp %ST(1)
ret
We now generate this:
test_divr:
fld QWORD PTR [%ESP + 4]
fdivr QWORD PTR [.CPItest_divr_0]
ret
This code desperately needs refactoring, which will come in the next
patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12841 91177308-0d34-0410-b5e6-96231b3b80d8
instructions use. This doesn't change any functionality except that long
constant expressions of these operations will now magically start working.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12840 91177308-0d34-0410-b5e6-96231b3b80d8
fld QWORD PTR [%ESP + 4]
fadd QWORD PTR [.CPItest_add_0]
instead of:
fld QWORD PTR [%ESP + 4]
fld QWORD PTR [.CPItest_add_0]
faddp %ST(1)
I also intend to do this for mul & div, but it appears that I have to
refactor a bit of code before I can do so.
This is tested by: test/Regression/CodeGen/X86/fp_constant_op.llx
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12839 91177308-0d34-0410-b5e6-96231b3b80d8
1. If an incoming argument is dead, don't load it from the stack
2. Do not code gen noop copies at all (ie, cast int -> uint), not even to
a move. This should reduce register pressure for allocators that are
unable to coallesce away these copies in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12835 91177308-0d34-0410-b5e6-96231b3b80d8
This transforms code like this:
%C = or %A, %B
%D = select %cond, %C, %A
into:
%C = select %cond, %B, 0
%D = or %A, %C
Since B is often a constant, the select can often be eliminated. In any case,
this reduces the usage count of A, allowing subsequent optimizations to happen.
This xform applies when the operator is any of:
add, sub, mul, or, xor, and, shl, shr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12800 91177308-0d34-0410-b5e6-96231b3b80d8
that have a constant operand. This implements
add.ll:test19, shift.ll:test15*, and others that are not tested
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12794 91177308-0d34-0410-b5e6-96231b3b80d8
InstSelectSimple.cpp:
Change the checks for proper I/O port address size into an exit() instead
of an assertion. Assertions aren't used in Release builds, and handling
this error should be graceful (not that this counts as graceful, but it's
more graceful).
Modified the generation of the IN/OUT instructions to have 0 arguments.
X86InstrInfo.td:
Added the OpSize attribute to the 16 bit IN and OUT instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12786 91177308-0d34-0410-b5e6-96231b3b80d8
I/O port instructions on x86. The specific code sequence is tailored to
the parameters and return value of the intrinsic call.
Added the ability for implicit defintions to be printed in the Instruction
Printer.
Added the ability for RawFrm instruction to print implict uses and
defintions with correct comma output. This required adjustment to some
methods so that a leading comma would or would not be printed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12782 91177308-0d34-0410-b5e6-96231b3b80d8
The Verifier ensures that their parameters are of integral types and have
the correct sign, but it does not enforce any size restrictions because
such restrictions are platform dependent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12781 91177308-0d34-0410-b5e6-96231b3b80d8
Now we collect all of the call sites we are interested in inlining, then inline
them. This entirely avoids issues with trying to inline a call site we got by
inlining another call site. This also eliminates iterator invalidation issues.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12770 91177308-0d34-0410-b5e6-96231b3b80d8
function prologues, and fix an off-by-one in visitCallInst that was
putting call args into the wrong registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12757 91177308-0d34-0410-b5e6-96231b3b80d8
have no good way of handling this until the code generator is improved.
We should probably just emit V9 instructions in the meantime.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12745 91177308-0d34-0410-b5e6-96231b3b80d8