These edges are not really necessary, but it is consistent with the
way we currently create physreg edges. Scheduler heuristics that
expect a DAG edge to the block terminator could benefit from this
change. Although in the future I hope we have a better mechanism for
modeling latency across scheduling regions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152895 91177308-0d34-0410-b5e6-96231b3b80d8
New flags: -misched-topdown, -misched-bottomup. They can be used with
the default scheduler or with -misched=shuffle. Without either
topdown/bottomup flag -misched=shuffle now alternates scheduling
direction.
LiveIntervals update is unimplemented with bottom-up scheduling, so
only -misched-topdown currently works.
Capped the ScheduleDAG hierarchy with a concrete ScheduleDAGMI class.
ScheduleDAGMI is aware of the top and bottom of the unscheduled zone
within the current region. Scheduling policy can be plugged into
the ScheduleDAGMI driver by implementing MachineSchedStrategy.
ConvergingScheduler is now the default scheduling algorithm.
It exercises the new driver but still does no reordering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152700 91177308-0d34-0410-b5e6-96231b3b80d8
ScheduleDAGInstrs will be the main interface for MI-level
schedulers. Make sure it's readable: one page of protected fields, one
page of public methids.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152258 91177308-0d34-0410-b5e6-96231b3b80d8
ScheduleDAG is responsible for the DAG: SUnits and SDeps. It provides target hooks for latency computation.
ScheduleDAGInstrs extends ScheduleDAG and defines the current scheduling region in terms of MachineInstr iterators. It has access to the target's scheduling itinerary data. ScheduleDAGInstrs provides the logic for building the ScheduleDAG for the sequence of MachineInstrs in the current region. Target's can implement highly custom schedulers by extending this class.
ScheduleDAGPostRATDList provides the driver and diagnostics for current postRA scheduling. It maintains a current Sequence of scheduled machine instructions and logic for splicing them into the block. During scheduling, it uses the ScheduleHazardRecognizer provided by the target.
Specific changes:
- Removed driver code from ScheduleDAG. clearDAG is the only interface needed.
- Added enterRegion/exitRegion hooks to ScheduleDAGInstrs to delimit the scope of each scheduling region and associated DAG. They should be used to setup and cleanup any region-specific state in addition to the DAG itself. This is necessary because we reuse the same ScheduleDAG object for the entire function. The target may extend these hooks to do things at regions boundaries, like bundle terminators. The hooks are called even if we decide not to schedule the region. So all instructions in a block are "covered" by these calls.
- Added ScheduleDAGInstrs::begin()/end() public API.
- Moved Sequence into the driver layer, which is specific to the scheduling algorithm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152208 91177308-0d34-0410-b5e6-96231b3b80d8
Added array subscript to SparseSet for convenience.
Slight reorg to make it easier to manage the def/use sets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151228 91177308-0d34-0410-b5e6-96231b3b80d8
The vast majority of virtual register definitions don't need an entry
in the DAG builder's VRegDefs set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151136 91177308-0d34-0410-b5e6-96231b3b80d8
Affect on SD scheduling and postRA scheduling:
Printing the DAG will display the nodes in top-down topological order.
This matches the order within the MBB and makes my life much easier in general.
Affect on misched:
We don't need to track virtual register uses at all. This is awesome.
I also intend to rely on the SUnit ID as a topo-sort index. So if A < B then we cannot have an edge B -> A.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151135 91177308-0d34-0410-b5e6-96231b3b80d8
Passes after RegAlloc should be able to rely on MRI->getNumVirtRegs() == 0.
This makes sharing code for pre/postRA passes more robust.
Now, to check if a pass is running before the RA pipeline begins, use MRI->isSSA().
To check if a pass is running after the RA pipeline ends, use !MRI->getNumVirtRegs().
PEI resets virtual regs when it's done scavenging.
PTX will either have to provide its own PEI pass or assign physregs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151032 91177308-0d34-0410-b5e6-96231b3b80d8
opportunities that only present themselves after late optimizations
such as tail duplication .e.g.
## BB#1:
movl %eax, %ecx
movl %ecx, %eax
ret
The register allocator also leaves some of them around (due to false
dep between copies from phi-elimination, etc.)
This required some changes in codegen passes. Post-ra scheduler and the
pseudo-instruction expansion passes have been moved after branch folding
and tail merging. They were before branch folding before because it did
not always update block livein's. That's fixed now. The pass change makes
independently since we want to properly schedule instructions after
branch folding / tail duplication.
rdar://10428165
rdar://10640363
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147716 91177308-0d34-0410-b5e6-96231b3b80d8
r0 = mov #0
r0 = moveq #1
Then the second instruction has an implicit data dependency on the first
instruction. Sadly I have yet to come up with a small test case that
demonstrate the post-ra scheduler taking advantage of this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146583 91177308-0d34-0410-b5e6-96231b3b80d8
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
prevent IT blocks from being broken apart.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146542 91177308-0d34-0410-b5e6-96231b3b80d8
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.
For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146026 91177308-0d34-0410-b5e6-96231b3b80d8
1. Added opcode BUNDLE
2. Taught MachineInstr class to deal with bundled MIs
3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs
4. Taught MachineBasicBlock methods about bundled MIs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145975 91177308-0d34-0410-b5e6-96231b3b80d8
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134021 91177308-0d34-0410-b5e6-96231b3b80d8
Instead, use simpler approach and let DBG_VALUE follow its predecessor instruction. After live debug value analysis pass, all DBG_VALUE instruction are placed at the right place. Thanks Jakob for the hint!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132483 91177308-0d34-0410-b5e6-96231b3b80d8
BuildSchedGraph was quadratic in the number of calls in the basic
block. After this fix, it keeps only a single call at the top of the
DefList so compile time doesn't blow up on large blocks. This reduces
postRA sched time on an external test case from 81s to 0.3s. Although
r130800 (reduced ARM register alias defs) also partially fixes the
issue by reducing the constant overhead of checking call interference
by an order of magnitude.
Fixes <rdar://problem/7662664> very poor compile time with post RA scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130943 91177308-0d34-0410-b5e6-96231b3b80d8
Instead encode llvm IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.
This allows memory instructions to be moved around INLINEASM instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123044 91177308-0d34-0410-b5e6-96231b3b80d8
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
"optimize for latency". Call instructions don't have the right latency and
this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
not # of micro-ops since multi-latency instructions is completely executed
even when the predicate is false. Also, some instruction will be "slower"
when they are predicated due to the register def becoming implicit input.
rdar://8598427
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8