To enable a load of a call address to be folded with that call, this
load is moved from outside of callseq into callseq. Such a moving
adds a non-glued node (that load) into a glued sequence. This non-glue
load is only removed when DAG selection folds them into a memory form
call instruction. When such instruction selection is disabled, it breaks
DAG schedule.
To prevent that, such moving is disabled when target favors register
indirect call.
Previous workaround disabling CALL32m/CALL64m insn selection is removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178308 91177308-0d34-0410-b5e6-96231b3b80d8
There is special magic happening when returning floating point values on
the x87 stack. The RET instructions get extra f80 operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162592 91177308-0d34-0410-b5e6-96231b3b80d8
Function argument and return value registers aren't part of the
encoding, so they should be implicit operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159728 91177308-0d34-0410-b5e6-96231b3b80d8
The getPointerRegClass() hook will return GR32_TC, or whatever is
appropriate for the current function.
Patch by Yiannis Tsiouris!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156459 91177308-0d34-0410-b5e6-96231b3b80d8
The different calling conventions and call-preserved registers are
represented with regmask operands that are added dynamically.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150708 91177308-0d34-0410-b5e6-96231b3b80d8
Call instructions no longer have a list of 43 call-clobbered registers.
Instead, they get a single register mask operand with a bit vector of
call-preserved registers.
This saves a lot of memory, 42 x 32 bytes = 1344 bytes per call
instruction, and it speeds up building call instructions because those
43 imp-def operands no longer need to be added to use-def lists. (And
removed and shifted and re-added for every explicit call operand).
Passes like LiveVariables, LiveIntervals, RAGreedy, PEI, and
BranchFolding are significantly faster because they can deal with call
clobbers in bulk.
Overall, clang -O2 is between 0% and 8% faster, uniformly distributed
depending on call density in the compiled code. Debug builds using
clang -O0 are 0% - 3% faster.
I have verified that this patch doesn't change the assembly generated
for the LLVM nightly test suite when building with -disable-copyprop
and -disable-branch-fold.
Branch folding behaves slightly differently in a few cases because call
instructions have different hash values now.
Copy propagation flushes its data structures when it crosses a register
mask operand. This causes it to leave a few dead copies behind, on the
order of 20 instruction across the entire nightly test suite, including
SPEC. Fixing this properly would require the pass to use different data
structures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150638 91177308-0d34-0410-b5e6-96231b3b80d8
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
Adds a test to verify that the scheduler is working.
Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
Patch by Preston Gurd!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149558 91177308-0d34-0410-b5e6-96231b3b80d8
The Win64 calling convention has xmm6-15 as callee-saved while still
clobbering all ymm registers.
Add a YMM_HI_6_15 pseudo-register that aliases the clobbered part of the
ymm registers, and mark that as call-clobbered. This allows live xmm
registers across calls.
This hack wouldn't be necessary with RegisterMask operands representing
the call clobbers, but they are not quite operational yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149088 91177308-0d34-0410-b5e6-96231b3b80d8
prologue and epilogue if the adjustment is 8. Similarly, use pushl / popl if
the adjustment is 4 in 32-bit mode.
In the epilogue, takes care to pop to a caller-saved register that's not live
at the exit (either return or tailcall instruction).
rdar://8771137
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122783 91177308-0d34-0410-b5e6-96231b3b80d8
be more complete. These are only expected to be used by llvm-mc with assembly
source so there is no pattern, [], in the .td files. Most are being added to
X86InstrInfo.td as Chris suggested and only comments about register uses are
added. Suggestions welcome on the .td changes as I'm not sure on every detail
of the x86 records. More missing instructions will be coming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116716 91177308-0d34-0410-b5e6-96231b3b80d8
control flow stuff out to X86InstrControl.td. Move
some compiler pseudo instructions and Pat<> patterns
out to X86InstrCompiler.td
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115596 91177308-0d34-0410-b5e6-96231b3b80d8