(16) into certain areas of the SPARC V9 back-end. I'm fairly sure the US IIIi's
dcache has 32-byte lines, so I'm not sure where the 16 came from. However, in
the interest of not breaking things any more than they already are, I'm going
to leave the constant alone.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12043 91177308-0d34-0410-b5e6-96231b3b80d8
of generating this code:
mov %EAX, 4
mov DWORD PTR [%ESP], %EAX
mov %AX, 123
movsx %EAX, %AX
mov DWORD PTR [%ESP + 4], %EAX
call Y
we now generate:
mov DWORD PTR [%ESP], 4
mov DWORD PTR [%ESP + 4], 123
call Y
Which hurts the eyes less. :)
Considering that register pressure around call sites is already high (with all
of the callee-clobbered registers and so on), this may help a lot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12028 91177308-0d34-0410-b5e6-96231b3b80d8
1) For 8-bit registers, first try to use the ones that are parts of the
same register (AL then AH). This way we only alias two 16/32-bit
registers after allocating four 8-bit variables.
2) Make EBX the last register to allocate. This will cause fewer spills
since we will have 8-bit registers available up to register
exhaustion (assuming we use the allocation order). It would be nice
if we could push all of the 8-bit aliased registers towards the end,
but we much prefer to keep callee-saved registers at the end to avoid
saving them on entry to and exit from the function.
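Roughly, the resulting orders look like this (a sketch; the enum names
are illustrative, not the actual register numbering):

    // 8-bit order: registers aliasing the same 16/32-bit register first.
    static const unsigned AllocOrder8[] = { AL, AH, CL, CH, DL, DH, BL, BH };
    // 32-bit order: EBX moved to the end (callee-saved, aliased by BL/BH).
    static const unsigned AllocOrder32[] = { EAX, ECX, EDX, /* ... */ EBX };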
For example, this gives a slight reduction in spills with linear scan
on 164.gzip.
Before:
11221 asm-printer - Number of machine instrs printed
975 spiller - Number of loads added
675 spiller - Number of stores added
398 spiller - Number of register spills
After:
11182 asm-printer - Number of machine instrs printed
952 spiller - Number of loads added
652 spiller - Number of stores added
386 spiller - Number of register spills
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11996 91177308-0d34-0410-b5e6-96231b3b80d8
their names more descriptive. A name consists of the base name and a
default operand size, followed by a character per operand with an
optional special size. For example:
ADD8rr -> add, 8-bit register, 8-bit register
IMUL16rmi -> imul, 16-bit register, 16-bit memory, 16-bit immediate
IMUL16rmi8 -> imul, 16-bit register, 16-bit memory, 8-bit immediate
MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory
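Purely to illustrate the convention (hypothetical code, not from the
tree), a name splits mechanically into its three parts:

    #include <cctype>
    #include <string>

    struct Parts { std::string Base, Size, Ops; };

    // Split e.g. "IMUL16rmi8" into "IMUL" + "16" + "rmi8" (the per-operand
    // letters keep any trailing special size attached).
    Parts decodeName(const std::string &Name) {
      size_t i = 0;
      while (i < Name.size() && !std::isdigit((unsigned char)Name[i])) ++i;
      size_t j = i;
      while (j < Name.size() && std::isdigit((unsigned char)Name[j])) ++j;
      return { Name.substr(0, i), Name.substr(i, j - i), Name.substr(j) };
    }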
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11995 91177308-0d34-0410-b5e6-96231b3b80d8
parse. The name is now I (operand size)*. For example:
Im32 -> instruction with 32-bit memory operands.
Im16i8 -> instruction with 16-bit memory operands and 8-bit immediate
operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11970 91177308-0d34-0410-b5e6-96231b3b80d8
the size of the immediate and the memory operand on instructions that
use them. This resolves problems with instructions that take both a
memory and an immediate operand but their sizes differ (e.g. ADDmi32b).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11967 91177308-0d34-0410-b5e6-96231b3b80d8
an 8-bit immediate. So mark the shifts that take immediates as taking
an 8-bit argument. The rest with the implicit use of CL are marked
appropriately.
A bug still exists:
def SHLDmri32 : I2A8 <"shld", 0xA4, MRMDestMem>, TB; // [mem32] <<= [mem32],R32 imm8
The immediate in the above instruction is 8-bit but the memory
reference is 32-bit. The printer prints this as an 8-bit reference
which confuses the assembler. Same with SHRDmri32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11931 91177308-0d34-0410-b5e6-96231b3b80d8
scaled indexes. This allows us to compile GEP's like this:
int* %test([10 x { int, { int } }]* %X, int %Idx) {
%Idx = cast int %Idx to long
%X = getelementptr [10 x { int, { int } }]* %X, long 0, long %Idx, ubyte 1, ubyte 0
ret int* %X
}
Into a single address computation:
test:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
lea %EAX, DWORD PTR [%EAX + 8*%ECX + 4]
ret
Before it generated:
test:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
shl %ECX, 3
add %EAX, %ECX
lea %EAX, DWORD PTR [%EAX + 4]
ret
This is useful for things like int/float/double arrays, as the indexing can
be folded into the loads and stores, reducing register pressure and the load
on the decode unit.
With these changes, I expect our performance on 256.bzip2 and gzip to improve a lot. On
bzip2 for example, we go from this:
10665 asm-printer - Number of machine instrs printed
40 ra-local - Number of loads/stores folded into instructions
1708 ra-local - Number of loads added
1532 ra-local - Number of stores added
1354 twoaddressinstruction - Number of instructions added
1354 twoaddressinstruction - Number of two-address instructions
2794 x86-peephole - Number of peephole optimization performed
to this:
9873 asm-printer - Number of machine instrs printed
41 ra-local - Number of loads/stores folded into instructions
1710 ra-local - Number of loads added
1521 ra-local - Number of stores added
789 twoaddressinstruction - Number of instructions added
789 twoaddressinstruction - Number of two-address instructions
2142 x86-peephole - Number of peephole optimization performed
... and these types of instructions are often in tight loops.
Linear scan is also helped, but not as much. It goes from:
8787 asm-printer - Number of machine instrs printed
2389 liveintervals - Number of identity moves eliminated after coalescing
2288 liveintervals - Number of interval joins performed
3522 liveintervals - Number of intervals after coalescing
5810 liveintervals - Number of original intervals
700 spiller - Number of loads added
487 spiller - Number of stores added
303 spiller - Number of register spills
1354 twoaddressinstruction - Number of instructions added
1354 twoaddressinstruction - Number of two-address instructions
363 x86-peephole - Number of peephole optimization performed
to:
7982 asm-printer - Number of machine instrs printed
1759 liveintervals - Number of identity moves eliminated after coalescing
1658 liveintervals - Number of interval joins performed
3282 liveintervals - Number of intervals after coalescing
4940 liveintervals - Number of original intervals
635 spiller - Number of loads added
452 spiller - Number of stores added
288 spiller - Number of register spills
789 twoaddressinstruction - Number of instructions added
789 twoaddressinstruction - Number of two-address instructions
258 x86-peephole - Number of peephole optimization performed
Though I'm not complaining about the drop in the number of intervals. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11820 91177308-0d34-0410-b5e6-96231b3b80d8
to do analysis.
*** FOLD getelementptr instructions into loads and stores when possible,
making use of some of the crazy X86 addressing modes.
For example, the following C++ program fragment:
struct complex {
double re, im;
complex(double r, double i) : re(r), im(i) {}
};
inline complex operator+(const complex& a, const complex& b) {
return complex(a.re+b.re, a.im+b.im);
}
complex addone(const complex& arg) {
return arg + complex(1,0);
}
Used to be compiled to:
_Z6addoneRK7complex:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
*** mov %EDX, %ECX
fld QWORD PTR [%EDX]
fld1
faddp %ST(1)
*** add %ECX, 8
fld QWORD PTR [%ECX]
fldz
faddp %ST(1)
*** mov %ECX, %EAX
fxch %ST(1)
fstp QWORD PTR [%ECX]
*** add %EAX, 8
fstp QWORD PTR [%EAX]
ret
Now it is compiled to:
_Z6addoneRK7complex:
mov %EAX, DWORD PTR [%ESP + 4]
mov %ECX, DWORD PTR [%ESP + 8]
fld QWORD PTR [%ECX]
fld1
faddp %ST(1)
fld QWORD PTR [%ECX + 8]
fldz
faddp %ST(1)
fxch %ST(1)
fstp QWORD PTR [%EAX]
fstp QWORD PTR [%EAX + 8]
ret
Other programs should see similar improvements, across the board. Note that
in addition to reducing instruction count, this also reduces register pressure
a lot, always a good thing on X86. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11819 91177308-0d34-0410-b5e6-96231b3b80d8
into a single LEA instruction. This should improve the code generated for
things like X->A.B.C[12].D.
The bigger benefit is still coming though. Note that this uses an LEA instruction
instead of an add, giving the register allocator more freedom. We should probably
never generate ADDri32's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11817 91177308-0d34-0410-b5e6-96231b3b80d8
block into MachineBasicBlock::getFirstTerminator().
This also fixes a bug in the implementation of the above in both
RegAllocLocal and InstrSched, where instructions were added after the
terminator if the basic block's only instruction was a terminator (it
shouldn't matter for RegAllocLocal since this case never occurs in
practice).
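Roughly (the TargetInstrInfo helper name is assumed here, not quoted
from the tree), the new method amounts to:

    // Return an iterator to the first terminator in the block, or end()
    // if there is none; new code then goes *before* this point.
    MachineBasicBlock::iterator getFirstTerminator(MachineBasicBlock &MBB,
                                                   const TargetInstrInfo &TII) {
      MachineBasicBlock::iterator I = MBB.end();
      while (I != MBB.begin()) {
        MachineBasicBlock::iterator Prev = I;
        --Prev;
        if (!TII.isTerminatorInstr(Prev->getOpcode()))
          break;
        I = Prev;   // Prev is a terminator; keep scanning backwards.
      }
      return I;
    }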
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11748 91177308-0d34-0410-b5e6-96231b3b80d8
use FP instructions. This reduces the number of instructions inserted in
176.gcc (for example) from 58074 to 101 (it doesn't use much FP, which
is typical). This reduction speeds up the entire code generator. In the
case of 176.gcc, llc went from taking 31.38s to 24.78s. The passes that
sped up the most are the register allocator and the two live variable
analysis passes, which sped up by 2.3s, 1.3s, and 1.5s respectively. The
asmprinter pass also sped up because it no longer prints those
instructions in comments :)
Note that this patch is likely to expose latent bugs in machine code
passes, because basic blocks can now be empty, where they never were
before. I cleaned out regalloclocal, but who knows about linscan :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11717 91177308-0d34-0410-b5e6-96231b3b80d8
switch statements in the constructors and simplifies the
implementation of the getUseType() member function. You will have to
specify defs using MachineOperand::Def instead of MOTy::Def though
(similarly for Use and UseAndDef).
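At a call site this is purely a spelling change (the surrounding call
is hypothetical):

    // Before:
    MI->addRegOperand(Reg, MOTy::Def);
    // After:
    MI->addRegOperand(Reg, MachineOperand::Def);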
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11715 91177308-0d34-0410-b5e6-96231b3b80d8
(minor) benefits right now:
1. An extra dummy MOVrr32 is gone. This move would often be coalesced by
both allocators anyway.
2. The code now uses the gep_type_iterator to walk the gep (see the
sketch below), which should future-proof it a bit. It still assumes
that array indexes are Longs, though.
These don't really justify rewriting the code. The big benefit will come later
though.
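A condensed sketch of the new walk (assuming the gep_type_iterator
interface of the time):

    for (gep_type_iterator I = gep_type_begin(GEPI), E = gep_type_end(GEPI);
         I != E; ++I) {
      if (isa<StructType>(*I)) {
        // Struct field: constant index, folded straight into the offset.
      } else {
        // Sequential index: still assumed to be a Long here.
      }
    }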
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11710 91177308-0d34-0410-b5e6-96231b3b80d8
value is a physreg and one is a virtreg. For this reason, disable copy folding
entirely for physregs. Also, use the new isMoveInstr target hook which gives us
folding of FP moves as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11700 91177308-0d34-0410-b5e6-96231b3b80d8
MRegisterInfo::getNumRegs() instead of
MRegisterInfo::FirstVirtualRegister.
Also use MRegisterInfo::is{Physical,Virtual}Register where
appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11477 91177308-0d34-0410-b5e6-96231b3b80d8
that will be responsible for the creation of MachineFunctions and will
be required by all MachineFunctionPass passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11453 91177308-0d34-0410-b5e6-96231b3b80d8
ilist of MachineInstr objects. This allows constant time removal and
insertion of MachineInstr instances from anywhere in each
MachineBasicBlock. It also allows for constant time splicing of
MachineInstrs into or out of MachineBasicBlocks.
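For example (the call shapes here are illustrative), both of these are
now O(1):

    MBB.erase(MI);                                 // unlink one instruction
    ToMBB.splice(ToMBB.end(), &FromMBB,
                 FromMBB.begin(), FromMBB.end());  // move a whole range over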
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11340 91177308-0d34-0410-b5e6-96231b3b80d8
FP_REG_KILL instructions at the end of blocks involved with critical edges.
Fix a bug where FP_REG_KILL instructions weren't inserted in fall-through
unconditional branches. Perhaps this will fix some linscan problems?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11019 91177308-0d34-0410-b5e6-96231b3b80d8
It's not clear why the code was looking for signed chars < 0, but it can't
matter to the assembler anyway, so the check goes away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10853 91177308-0d34-0410-b5e6-96231b3b80d8
instruction selector by adding a new pseudo-instruction
FP_REG_KILL. This instruction implicitly defines all x86 fp registers
and is a terminator so that passes which add machine code at the end
of basic blocks (like phi elimination) do not add instructions between
it and the branch or return instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10562 91177308-0d34-0410-b5e6-96231b3b80d8
a) remove opIsUse(), opIsDefOnly(), opIsDefAndUse()
b) add isUse(), isDef()
c) rename opHiBits32() to isHiBits32(),
opLoBits32() to isLoBits32(),
opHiBits64() to isHiBits64(),
opLoBits64() to isLoBits64().
This results in much more readable code; for example, compare
"op.opIsDefOnly() || op.opIsDefAndUse()" with "op.isDef()", a pattern
used very often in the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10461 91177308-0d34-0410-b5e6-96231b3b80d8
allocation on the X86 to add information to the machine code denoting
that our floating point stackifier cannot handle virtual floating
point registers that are alive across basic blocks. This pass adds an
implicit def of all virtual floating point registers at the end of
each basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10446 91177308-0d34-0410-b5e6-96231b3b80d8
a pointer. This evades a warning emitted by GCC when we cast from
unsigned int (32 bit) to void * (64 bit) on SparcV9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10435 91177308-0d34-0410-b5e6-96231b3b80d8
Eventually this pass will provide substantially better code in the interim
between the current crappy isel and a nice one. Unfortunately, doing so
requires fixing the backend to
actually SUPPORT all of the fancy addressing modes that we now generate, and writing a DCE
pass for machine code. Each of these is a fairly substantial job, so this will remain disabled
for the immediate future. :(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10276 91177308-0d34-0410-b5e6-96231b3b80d8
folding of instructions into addressing modes. This creates lots of dead
instructions, which are currently not deleted. It also creates a lot of
instructions that the X86 backend currently cannot handle. :(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@10275 91177308-0d34-0410-b5e6-96231b3b80d8
return the number of instructions added to/removed from the basic block
passed as their first argument.
Note: This is only needed because we use a std::vector instead of an
ilist to keep MachineBasicBlock instructions. Inserting an instruction
into a MachineBasicBlock invalidates all iterators into the basic
block. The return value can be used to update an index into the
machine basic block instruction vector and circumvent the iterator
invalidation problem, but this is really not needed if we move to a
better representation.
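Illustratively (the function name here is hypothetical), a caller
keeps its index valid like so:

    for (unsigned i = 0; i != MBB.size(); ++i) {
      // Returns how many instructions were added to (or, if negative,
      // removed from) the block at or before index i.
      int Delta = insertSpillCode(MBB, i);
      i += Delta;  // re-synchronize the index with the vector
    }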
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9704 91177308-0d34-0410-b5e6-96231b3b80d8
strings with the stuff that used to print to an ostream directly. We now NEVER build
up big strings, only to print them once they are formed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9686 91177308-0d34-0410-b5e6-96231b3b80d8
* Emit bools as 1/0 instead of true/false, fixing compilation of eon and
PR 83 & Jello/2003-11-03-GlobalBool.llx
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9683 91177308-0d34-0410-b5e6-96231b3b80d8
C is a constant which can be sign-extended from 8 bits without value loss,
and op is one of: add, sub, imul, and, or, xor.
This allows the JIT to emit the one byte version of the constant instead of
the two- or four-byte version. Because these instructions are very common, this
can save a LOT of code space. For example, I sampled two benchmarks, 176.gcc
and 254.gap.
BM          Old       New       Reduction
176.gcc     2673621   2548962   4.89%
254.gap      498261    475104   4.87%
Note that while the percentage is not spectacular, this did eliminate
124.6 _KILOBYTES_ of codespace from gcc. Not bad.
Note that this doesn't affect the llc version at all, because the assembler
already does this optimization.
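The test itself is tiny (sketch):

    // A 32-bit constant fits the one-byte form exactly when sign-extending
    // its low 8 bits reproduces the original value.
    static bool fitsSignExtended8(int C) { return C == (signed char)C; }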
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9284 91177308-0d34-0410-b5e6-96231b3b80d8
* Implement R1 = R2 * C where R1 and R2 are 32 or 16 bits. This avoids an
extra copy into a register, reducing register pressure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9278 91177308-0d34-0410-b5e6-96231b3b80d8
getelementptr code path for use by other code paths (like malloc and alloca).
* Optimize comparisons with zero
* Generate neg, not, inc, and dec instructions, when possible.
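Roughly, the new special cases amount to (illustrative, not the
selector code itself):

    //   x + 1  -> inc        x - 1  -> dec
    //   0 - x  -> neg        x ^ -1 -> not
    //   x == 0 -> test x, x  (instead of cmp x, 0)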
This gives some code size wins, which might translate into performance. We'll
see tomorrow in the nightly tester.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9267 91177308-0d34-0410-b5e6-96231b3b80d8
X86/linux. :( The problem is that a signal delivered while the function
is executing could clobber the function's stack. This is a partial fix
for PR41.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@9113 91177308-0d34-0410-b5e6-96231b3b80d8
much cleaner and easier.
Labeled .td as a suffix for tblgen files in Makefile.rules.
Modified build rules so that source files generated during the build are placed
in the build directory and not the source directory (and not in a Debug
directory). This makes the system cleaner and allows us to have a read-only
source tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@8424 91177308-0d34-0410-b5e6-96231b3b80d8
into the struct case.
* Extend printConstantValueOnly to print .zero's if the initializer is zero
* Delete dead isConstantFunctionPointerRef function
* Emit the appropriate assembly for the various linkage types!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@8417 91177308-0d34-0410-b5e6-96231b3b80d8
Adjusted Makefile to work with new autoconf-style object root.
Specifically, use the new -I option of tblgen to find include files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@8379 91177308-0d34-0410-b5e6-96231b3b80d8
until we implement unwinding.
Add support for the invoke instruction, which codegens just like a call with
a branch after it.
The end effect of this change is that programs using the invoke instruction,
but never unwinding, will work fine. Programs that unwind will abort until
we get unwind support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@8187 91177308-0d34-0410-b5e6-96231b3b80d8
This bug caused miscompilation of programs using 'struct stat', but only if
compiled with support for 64-bit filesystems. This could in theory affect
other things, but only if the LLVM code shared data structures with native code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7928 91177308-0d34-0410-b5e6-96231b3b80d8
function-at-a-time compilation and emission of code.
Separate addPassesToEmitAssembly from addPassesToJITCompile, because
the latter requires you to use FunctionPasses, and the former might
diverge anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7817 91177308-0d34-0410-b5e6-96231b3b80d8
consumable by the cygwin assembler. This is really just a nasty hack until we
get real target triple support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7742 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes test case test/Programs/LLVMSource/2003-08-03-ReservedWordGlobal.ll.
Also: Refactor implicit-uses printing into its own method.
Remove a couple of unused variables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7737 91177308-0d34-0410-b5e6-96231b3b80d8
* Use .zero to emit padding between struct elements
* Emit .comm symbols when we can, this dramatically reduces the amount of gunk we have to print
* Print global variable identifiers next to initializer more nicely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7551 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix bug in the createNOP method, which was not marking the operands of the
generated XCHG as useanddef. I don't think this method is actually used,
so it wasn't breaking anything, but it should be fixed anyway...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7539 91177308-0d34-0410-b5e6-96231b3b80d8
SlotCalculator in CWriter. (Unfortunately, all this means a lot of
X86/Printer's methods have to be de-constified again. Oh well.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7299 91177308-0d34-0410-b5e6-96231b3b80d8
doFinalization too except that would have made them shadow, not override,
the parent class :-P.
Allow *any* constant cast expression between pointers and longs,
or vice-versa, or any widening (not just same-size) conversion that
isLosslesslyConvertibleTo approves. This fixes oopack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7288 91177308-0d34-0410-b5e6-96231b3b80d8
Printer::doFinalization() out in the cold. Now we pass in a TargetMachine
to Printer's constructor and get the TargetData from the TargetMachine.
Don't pass TargetMachine or MRegisterInfo objects around in the Printer.
Constify TargetData references.
X86.h: Update comment and prototype of createX86CodePrinterPass().
X86TargetMachine.cpp: Update callers of createX86CodePrinterPass().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7275 91177308-0d34-0410-b5e6-96231b3b80d8
Stop passing ostreams around: we already have one perfectly good ostream
and we can all share it.
Stop stashing a pointer to TargetData in the Pass object, because that will
lead to a crash if there are no functions in the module (ouch!). Instead,
use addRequired() and getAnalysis(), like we always should have done.
Move the check for ConstantExpr up before the check for isPrimitiveType,
because we need to be able to catch e.g. ubyte (cast bool false to ubyte),
whose type is primitive but which is nevertheless a ConstantExpr, by calling
our specialized handler instead of the AsmWriter. This would result in
assembler errors when we would try to output something like ".byte (cast
bool false to ubyte)".
GC some unused variable declarations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7265 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid a fall-through in the (stubby) treatment of the longjmp intrinsic
call which causes llc & lli to core-dump.
Add a sort-of treatment of cast double to ulong. I am not really sure
what a user should expect to see upon casting a negative FP value to
unsigned long long. But with what is given here, I was able to write
a program that could cast -123.456 to ulong and back and get -123.0,
which seems like a step in the right direction. GCC seems to give you
0. I don't know if I'd consider that useful.
These cases were coming up in GNU coreutils-5.0.
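In C terms, the behavior described falls out of routing both directions
through a signed 64-bit integer (a sketch of the semantics, not the
emitted code):

    unsigned long long doubleToULong(double D) {
      return (unsigned long long)(long long)D;  // -123.456 -> 0xFF...85
    }
    double ulongToDouble(unsigned long long U) {
      return (double)(long long)U;              // 0xFF...85 -> -123.0
    }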
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7205 91177308-0d34-0410-b5e6-96231b3b80d8
out the entire llvm disassembly for the function at global constant-output
time, which caused the assembler to barf in 164.gzip. This fixes that
particular problem (though 164.gzip has other problems with X86 llc.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7168 91177308-0d34-0410-b5e6-96231b3b80d8
Fhourstones, McCat-vor, and many others...)
Printer.cpp: Print implicit uses for AddRegFrm instructions. Break gas
bug workarounds up into separate stanzas of code for each bug. Add new
workarounds for fild and fistp.
X86InstrInfo.def: Add O_ST0 implicit uses for more FP instrs where they
obviously apply. Also add PrintImplUses flags for FP instrs where they
are necessary for gas to understand the output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@7165 91177308-0d34-0410-b5e6-96231b3b80d8
SingleSource except oopack and Oscar. (Sorry, Oscar.)
include/llvm/Target/TargetInstrInfo.h: Remove virtual print method. Add
accessors for ImplicitUses/Defs.
lib/Target/TargetInstrInfo.cpp: Remove virtual print method. If you
really wanted this, just use MI->print(O, TM); instead...
lib/Target/X86:
FloatingPoint.cpp: ...like this.
X86InstrInfo.h: Remove virtual print method. Define the PrintImplUses
target-specific flag bit.
X86InstrInfo.def: Add the PrintImplUses flag to all the instructions
which implicitly use CL, because the assembler needs to see the CL in
order to generate the right instruction.
Printer.cpp: Ditch fnIndex at Chris's request. Now we use CurrentFnName
to name constants in the constant pool for each function instead. This
avoids keeping state between runOnMachineFunction() invocations, which
is a no-no. Having MangledGlobals be global is a bogon I'd like to get
rid of too, but making it a static member of Printer causes link errors
(why???).
Make NumberForBB into a member of Printer instead of a global, too.
Make printOp and printMemReference into methods of Printer.
X86InstrInfo::print is now Printer::printMachineInstruction, because
TargetInstrInfo::print is history. (Because of this, we have to qualify
the names of some TargetInstrInfo methods we call.)
Print out the ImplicitUses field of any instruction we print that has
the PrintImplUses bit set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@6924 91177308-0d34-0410-b5e6-96231b3b80d8
involves removing the [bwl] suffixes from instruction names, as well
as some other distinguishing marks (32/64/80 on fp insns, _i suffixes, etc.)
Lowercase all instr. names as well for consistency's sake.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@6790 91177308-0d34-0410-b5e6-96231b3b80d8
to print various things on a module-by-module basis (currently, only the
former is used).
Don't print < > around names. The assembler can't take it.
Print pseudoinstructions only as comments. The poor little assembler can't
take that, either.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@6789 91177308-0d34-0410-b5e6-96231b3b80d8
This avoids needing a register to hold C, which in turn speeds up the
register allocator by a lot: ~9% on 164.gzip and ~17% on 256.bzip2. This
also speeds up other passes. This also speeds up execution of the program
marginally, and makes the asm much easier to read. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@6626 91177308-0d34-0410-b5e6-96231b3b80d8
and exit of the function. This fixes bug: Jello/2003-05-06-LivenessClobber.llx
and the Fhourstones benchmark
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@6010 91177308-0d34-0410-b5e6-96231b3b80d8
This improves the performance of the power benchmark by a few percent.
This will be necessary for SSE code, which requires 16-byte alignment of
the stack.
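The rounding itself is the usual mask trick (sketch):

    // Round a frame size up to the next multiple of 16 bytes.
    unsigned alignTo16(unsigned NumBytes) {
      return (NumBytes + 15) & ~15U;
    }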
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@5320 91177308-0d34-0410-b5e6-96231b3b80d8