Call instructions no longer have a list of 43 call-clobbered registers.
Instead, they get a single register mask operand with a bit vector of
call-preserved registers.
This saves a lot of memory, 42 x 32 bytes = 1344 bytes per call
instruction, and it speeds up building call instructions because those
43 imp-def operands no longer need to be added to use-def lists. (And
removed and shifted and re-added for every explicit call operand).
Passes like LiveVariables, LiveIntervals, RAGreedy, PEI, and
BranchFolding are significantly faster because they can deal with call
clobbers in bulk.
Overall, clang -O2 is between 0% and 8% faster, uniformly distributed
depending on call density in the compiled code. Debug builds using
clang -O0 are 0% - 3% faster.
I have verified that this patch doesn't change the assembly generated
for the LLVM nightly test suite when building with -disable-copyprop
and -disable-branch-fold.
Branch folding behaves slightly differently in a few cases because call
instructions have different hash values now.
Copy propagation flushes its data structures when it crosses a register
mask operand. This causes it to leave a few dead copies behind, on the
order of 20 instruction across the entire nightly test suite, including
SPEC. Fixing this properly would require the pass to use different data
structures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150638 91177308-0d34-0410-b5e6-96231b3b80d8
The existing framework for postra scheduling is library local. We want to keep it that way. Soon we will have a more general MachineScheduler interface. At that time, various bits will be exposed to targets. In the meantime, the VLIWPacketizer wants to use ScheduleDAGInstrs directly, so it needs to wrapped in a PIMPL to avoid exposing it to the target interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150633 91177308-0d34-0410-b5e6-96231b3b80d8
method. This allows the target lowering code to not have to deal with MDNodes.
Also, avoid leaking memory like a sieve by not creating a global variable for
the image info section, but just emitting the code directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150624 91177308-0d34-0410-b5e6-96231b3b80d8
Accomplished by moving the body of StringRef::edit_distance into
a separate function that accepts two ArrayRefs, and making
StringRef::edit_distance a wrapper around the new function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150621 91177308-0d34-0410-b5e6-96231b3b80d8
The c'tor list is stored as a list of 'void ()*'s, so all of the functions are
bitcast to that. However, the dyn_cast doesn't automagically look through
bitcasts. Do that for it.
<rdar://problem/10813350>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150572 91177308-0d34-0410-b5e6-96231b3b80d8
I'll put MachineLICM back before PEI. All my arm/x86 benchmarks look good, but buildbots don't like it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150568 91177308-0d34-0410-b5e6-96231b3b80d8
The llc command line options for enabling/disabling passes are local to CodeGen/Passes.cpp. This patch associates those options with standard pass IDs so they work regardless of how the target configures the passes.
A target has two ways of overriding standard passes:
1) Redefine the pass pipeline (override TargetPassConfig::add%Stage)
2) Replace or suppress individiual passes with TargetPassConfig::substitutePass.
In both cases, the command line options associated with the pass override the target default.
For example, say a target wants to disable machine instruction scheduling by default:
- The target calls disablePass(MachineSchedulerID) but otherwise does not override any TargetPassConfig methods.
- Without any llc options, no scheduler is run.
- With -enable-misched, the standard machine scheduler is run and honors the -misched=... flag to select the scheduler variant, which may be used for performance evaluation or testing.
Sorry overridePass is ugly. I haven't thought of a better way without replacing the cl::opt framework. I hope to do that one day...
I haven't figured out why CodeGen uses char& for pass IDs. AnalysisID is much easier to use and less bug prone. I'm using it wherever I can for internal implementation. Maybe later we can change the global pass ID definitions as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150563 91177308-0d34-0410-b5e6-96231b3b80d8
Pretend that regmask interference ends at the 'dead' slot, even when
there is other interference ending at the 'reg' slot of the same
instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150531 91177308-0d34-0410-b5e6-96231b3b80d8
Perform all comparisons at instruction granularity, and make sure
register masks on uses count in both gaps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150530 91177308-0d34-0410-b5e6-96231b3b80d8
Only accept register masks when looking for an 'overlapping' def. When
Overlap is not set, the function searches for a proper definition of
Reg.
This means MI->modifiesRegister() considers register masks, but
MI->definesRegister() doesn't.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150529 91177308-0d34-0410-b5e6-96231b3b80d8
When a physreg is live in to a basic block, look for any instruction in
the block that clobbers the physreg.
The instruction doesn't have to properly redefine the register, any
overlapping clobber is OK.
This slightly changes live ranges when compiling with register masks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150528 91177308-0d34-0410-b5e6-96231b3b80d8