scaled indexes. This allows us to compile GEP's like this:
int* %test([10 x { int, { int } }]* %X, int %Idx) {
%Idx = cast int %Idx to long
%X = getelementptr [10 x { int, { int } }]* %X, long 0, long %Idx, ubyte 1, ubyte 0
ret int* %X
}
Into a single address computation:
test:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
lea %EAX, DWORD PTR [%EAX + 8*%ECX + 4]
ret
Before it generated:
test:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
shl %ECX, 3
add %EAX, %ECX
lea %EAX, DWORD PTR [%EAX + 4]
ret
This is useful for things like int/float/double arrays, as the indexing can be folded into
the loads&stores, reducing register pressure and decreasing the pressure on the decode unit.
With these changes, I expect our performance on 256.bzip2 and gzip to improve a lot. On
bzip2 for example, we go from this:
10665 asm-printer - Number of machine instrs printed
40 ra-local - Number of loads/stores folded into instructions
1708 ra-local - Number of loads added
1532 ra-local - Number of stores added
1354 twoaddressinstruction - Number of instructions added
1354 twoaddressinstruction - Number of two-address instructions
2794 x86-peephole - Number of peephole optimization performed
to this:
9873 asm-printer - Number of machine instrs printed
41 ra-local - Number of loads/stores folded into instructions
1710 ra-local - Number of loads added
1521 ra-local - Number of stores added
789 twoaddressinstruction - Number of instructions added
789 twoaddressinstruction - Number of two-address instructions
2142 x86-peephole - Number of peephole optimization performed
... and these types of instructions are often in tight loops.
Linear scan is also helped, but not as much. It goes from:
8787 asm-printer - Number of machine instrs printed
2389 liveintervals - Number of identity moves eliminated after coalescing
2288 liveintervals - Number of interval joins performed
3522 liveintervals - Number of intervals after coalescing
5810 liveintervals - Number of original intervals
700 spiller - Number of loads added
487 spiller - Number of stores added
303 spiller - Number of register spills
1354 twoaddressinstruction - Number of instructions added
1354 twoaddressinstruction - Number of two-address instructions
363 x86-peephole - Number of peephole optimization performed
to:
7982 asm-printer - Number of machine instrs printed
1759 liveintervals - Number of identity moves eliminated after coalescing
1658 liveintervals - Number of interval joins performed
3282 liveintervals - Number of intervals after coalescing
4940 liveintervals - Number of original intervals
635 spiller - Number of loads added
452 spiller - Number of stores added
288 spiller - Number of register spills
789 twoaddressinstruction - Number of instructions added
789 twoaddressinstruction - Number of two-address instructions
258 x86-peephole - Number of peephole optimization performed
Though I'm not complaining about the drop in the number of intervals. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11820 91177308-0d34-0410-b5e6-96231b3b80d8
to do analysis.
*** FOLD getelementptr instructions into loads and stores when possible,
making use of some of the crazy X86 addressing modes.
For example, the following C++ program fragment:
struct complex {
double re, im;
complex(double r, double i) : re(r), im(i) {}
};
inline complex operator+(const complex& a, const complex& b) {
return complex(a.re+b.re, a.im+b.im);
}
complex addone(const complex& arg) {
return arg + complex(1,0);
}
Used to be compiled to:
_Z6addoneRK7complex:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
*** mov %EDX, %ECX
fld QWORD PTR [%EDX]
fld1
faddp %ST(1)
*** add %ECX, 8
fld QWORD PTR [%ECX]
fldz
faddp %ST(1)
*** mov %ECX, %EAX
fxch %ST(1)
fstp QWORD PTR [%ECX]
*** add %EAX, 8
fstp QWORD PTR [%EAX]
ret
Now it is compiled to:
_Z6addoneRK7complex:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
fld QWORD PTR [%ECX]
fld1
faddp %ST(1)
fld QWORD PTR [%ECX + 8]
fldz
faddp %ST(1)
fxch %ST(1)
fstp QWORD PTR [%EAX]
fstp QWORD PTR [%EAX + 8]
ret
Other programs should see similar improvements, across the board. Note that
in addition to reducing instruction count, this also reduces register pressure
a lot, always a good thing on X86. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11819 91177308-0d34-0410-b5e6-96231b3b80d8
into a single LEA instruction. This should improve the code generated for
things like X->A.B.C[12].D.
The bigger benefit is still coming though. Note that this uses an LEA instruction
instead of an add, giving the register allocator more freedom. We should probably
never generate ADDri32's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11817 91177308-0d34-0410-b5e6-96231b3b80d8
longer was getting this #include, it always fell back on the less precise
floating point initializer values, causing some testsuite failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11803 91177308-0d34-0410-b5e6-96231b3b80d8
block into MachineBasicBlock::getFirstTerminator().
This also fixes a bug in the implementation of the above in both
RegAllocLocal and InstrSched, where instructions where added after the
terminator if the basic block's only instruction was a terminator (it
shouldn't matter for RegAllocLocal since this case never occurs in
practice).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11748 91177308-0d34-0410-b5e6-96231b3b80d8
use FP instructions. This reduces the number of instructions inserted in
176.gcc (for example) from 58074 to 101 (it doesn't use much FP, which
is typical). This reduction speeds up the entire code generator. In the
case of 176.gcc, llc went from taking 31.38s to 24.78s. The passes that
sped up the most are the register allocator and the 2 live variable analysis
passes, which sped up 2.3, 1.3, and 1.5s respectively. The asmprinter
pass also sped up because it doesn't print the instructions in comments :)
Note that this patch is likely to expose latent bugs in machine code passes,
because now basicblock can be empty, where they were never empty before. I
cleaned out regalloclocal, but who knows about linscan :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11717 91177308-0d34-0410-b5e6-96231b3b80d8
switch statements in the constructors and simplifies the
implementation of the getUseType() member function. You will have to
specify defs using MachineOperand::Def instead of MOTy::Def though
(similarly for Use and UseAndDef).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11715 91177308-0d34-0410-b5e6-96231b3b80d8
(minor) benefits right now:
1. An extra dummy MOVrr32 is gone. This move would often be coallesced by
both allocators anyway.
2. The code now uses the gep_type_iterator to walk the gep, which should future
proof it a bit. It still assumes that array indexes are Longs though.
These don't really justify rewriting the code. The big benefit will come later
though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11710 91177308-0d34-0410-b5e6-96231b3b80d8
value is a physreg and one is a virtreg. For this reason, disable copy folding
entirely for physregs. Also, use the new isMoveInstr target hook which gives us
folding of FP moves as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11700 91177308-0d34-0410-b5e6-96231b3b80d8
hacks can be banished. Also, this gives us the opportunity to emit special code
for the setjmp/longjmps which alows the elimination of one GCC warning for every
setjmp/longjmp site (which is often THOUSANDS in C++ programs). Yaay!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11484 91177308-0d34-0410-b5e6-96231b3b80d8
MRegisterInfo::getNumRegs() instead of
MRegisterInfo::FirstVirtualRegister.
Also use MRegisterInfo::is{Physical,Virtual}Register where
appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11477 91177308-0d34-0410-b5e6-96231b3b80d8
that will be responsible for the creation of MachineFunctions and will
be required by all MachineFunctionPass passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11453 91177308-0d34-0410-b5e6-96231b3b80d8
"expensive exceptions" controlled by an option. Also refactor and eliminate a bunch of cruft.
This is a temporary solution and causes millions of warnings to pour out of programs that use
exceptions, but it should fix the problem with sparc and the 'write' declaration (PR190).
Subsequent changes will make this stink much less
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11405 91177308-0d34-0410-b5e6-96231b3b80d8
ilist of MachineInstr objects. This allows constant time removal and
insertion of MachineInstr instances from anywhere in each
MachineBasicBlock. It also allows for constant time splicing of
MachineInstrs into or out of MachineBasicBlocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11340 91177308-0d34-0410-b5e6-96231b3b80d8
MO_VirtualRegister but if the register number is one of a physical
register is it considered as a physical register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11315 91177308-0d34-0410-b5e6-96231b3b80d8
I will observe that the concept of using WriteAsOperand is completely broken,
but then we all knew that, didn't we?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11255 91177308-0d34-0410-b5e6-96231b3b80d8
instead of randomly groping about inside its outEdges array.
Make SchedGraph::addDummyEdges() use getNumOutEdges() instead of
outEdges.size().
Get rid of ifdefed-out code in SchedGraph::buildGraph().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11238 91177308-0d34-0410-b5e6-96231b3b80d8
FP_REG_KILL instructions at the end of blocks involved with critical edges.
Fix a bug where FP_REG_KILL instructions weren't inserted in fall through
unconditional branches. Perhaps this will fix some linscan problems?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11019 91177308-0d34-0410-b5e6-96231b3b80d8
It's not clear why the code was looking for signed chars < 0, but it can't
matter to the assembler anyway, so the check goes away. This also fixes
compatibility with arrays of [us]byte that have constantexprs in them.
Also slightly restructure some code to be cleaner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10854 91177308-0d34-0410-b5e6-96231b3b80d8
It's not clear why the code was looking for signed chars < 0, but it can't
matter to the assembler anyway, so the check goes away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10853 91177308-0d34-0410-b5e6-96231b3b80d8
Using the SlotCalculator is total overkill for this file, a simple map
will suffice. Why doesn't this use the NameMangler interface?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10823 91177308-0d34-0410-b5e6-96231b3b80d8
instruction selector by adding a new pseudo-instruction
FP_REG_KILL. This instruction implicitly defines all x86 fp registers
and is a terminator so that passes which add machine code at the end
of basic blocks (like phi elimination) do not add instructions between
it and the branch or return instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10562 91177308-0d34-0410-b5e6-96231b3b80d8
implementation of a Target{RegInfo, InstrInfo, Machine, etc} now has a separate
header and a separate implementation file.
This means that instead of a massive SparcInternals.h that forces a
recompilation of the whole target whenever a minor detail is changed, you should
only recompile a few files.
Note that SparcInternals.h is still around; its contents should be minimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10500 91177308-0d34-0410-b5e6-96231b3b80d8
a) remove opIsUse(), opIsDefOnly(), opIsDefAndUse()
b) add isUse(), isDef()
c) rename opHiBits32() to isHiBits32(),
opLoBits32() to isLoBits32(),
opHiBits64() to isHiBits64(),
opLoBits64() to isLoBits64().
This results to much more readable code, for example compare
"op.opIsDef() || op.opIsDefAndUse()" to "op.isDef()" a pattern used
very often in the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10461 91177308-0d34-0410-b5e6-96231b3b80d8
allocaton on the X86 to add information to the machine code denoting
that our floating point stackifier cannot handle virtual point
register that are alive across basic blocks. This pass adds an
implicit def of all virtual floating point register at the end of each
basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10446 91177308-0d34-0410-b5e6-96231b3b80d8
a pointer. This evades a warning emitted by GCC when we cast from
unsigned int (32 bit) to void * (64 bit) on SparcV9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10435 91177308-0d34-0410-b5e6-96231b3b80d8
the write() system call because it returns 64 bits on Solaris 64 bit,
and an implicit return value of int says it returns 32 bits.
Admittedly, this is a bit of a hack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10375 91177308-0d34-0410-b5e6-96231b3b80d8
Eventually this pass will provide substantially better code in the interim between when we
have a crappy isel and nice isel. Unfortunately doing so requires fixing the backend to
actually SUPPORT all of the fancy addressing modes that we now generate, and writing a DCE
pass for machine code. Each of these is a fairly substantial job, so this will remain disabled
for the immediate future. :(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10276 91177308-0d34-0410-b5e6-96231b3b80d8
folding of instructions into addressing modes. This creates lots of dead
instructions, which are currently not deleted. It also creates a lot of
instructions that the X86 backend currently cannot handle. :(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10275 91177308-0d34-0410-b5e6-96231b3b80d8
* Restore registers *after* everything else to avoid any possible side effects
This fixes McCat-imp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10147 91177308-0d34-0410-b5e6-96231b3b80d8
storage duration that are local to the function containing the invocation of the
[...] setjmp macro that do not have volatile-qualified type and have been
changed between the setjmp invocation and longjmp call are indeterminate."
As such, we have to mark all variables in a function that uses 'invoke' as
volatile.
This fixes PR77
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10035 91177308-0d34-0410-b5e6-96231b3b80d8
* There is now only one pass to print out assembly instead of two
* It is a FunctionPass
* The Module-level printing of globals is now in doFinalization() method of the
FunctionPass
* The code has been reformatted to follow LLVM coding standards
* Some comments, not all, were doxygenified
* Last but not least, the function to create an instance of this pass is also no
longer a method in the UltraSparc class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9953 91177308-0d34-0410-b5e6-96231b3b80d8
externally-visible linkage, and SaveStateToModule must default to true for llc.
I don't remember why I made it const; perhaps it should be deconstified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9858 91177308-0d34-0410-b5e6-96231b3b80d8
each instruction produces as "operand" -1, and the other operands as 0
.. n, as before. PhyRegAlloc::saveState() is refactored into
PhyRegAlloc::saveStateForValue().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9842 91177308-0d34-0410-b5e6-96231b3b80d8
for the Sparc backend: breaking up constant expressions. Thus, we cannot have it
guarded by a conditional, it should never be disabled.
Also, it's now available for the JIT since it is a FunctionPass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9791 91177308-0d34-0410-b5e6-96231b3b80d8
MachineConstantPool. This involved refactoring the two classes involved in
printing out Sparc assembly. In fact, they should share all this code anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9776 91177308-0d34-0410-b5e6-96231b3b80d8
it will be converted to a MachineConstantPool index during instruction
selection
* This is now eligible to become a FunctionPass since it does not have any side
effects outside of the function it is processing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9773 91177308-0d34-0410-b5e6-96231b3b80d8
return the number of instructions added to/removed from the basic block
passed as their first argument.
Note: This is only needed because we use a std::vector instead of an
ilist to keep MachineBasicBlock instructions. Inserting an instruction
to a MachineBasicBlock invalidates all iterators to the basic
block. The return value can be used to update an index to the machine
basic block instruction vector and circumvent the iterator elimination
problem but this is really not needed if we move to a better
representation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9704 91177308-0d34-0410-b5e6-96231b3b80d8
strings with the stuff that used to print to an ostream directly. We now NEVER build
up big strings, only to print them once they are formed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9686 91177308-0d34-0410-b5e6-96231b3b80d8
* Emit bools as 1/0 instead of true/false, fixing compilation of eon and
PR 83 & Jello/2003-11-03-GlobalBool.llx
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9683 91177308-0d34-0410-b5e6-96231b3b80d8
implementing verifySavedState().
In saveState(), use the new AllocInfo::AllocStateTy enum, and increment
Insn each time through the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9617 91177308-0d34-0410-b5e6-96231b3b80d8
Prototype option to save state in a global instead of as a Constant in
the Module. (Turned off, for now, with the on/off switch welded in the off
position. You get the idea.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9500 91177308-0d34-0410-b5e6-96231b3b80d8
Make FnAllocState contain vectors of AllocInfo, instead of LLVM Constants.
Give doFinalization a method comment, and let it do the work of converting
AllocInfos to LLVM Constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9451 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix order of #includes
* Make code layout more consistent
* Eliminate extraneous whitespace and comment-lines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9433 91177308-0d34-0410-b5e6-96231b3b80d8
* Make file description more readable
* Make code layout more consistent, include comment in assert so it's visible
during execution if it hits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9430 91177308-0d34-0410-b5e6-96231b3b80d8