mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-02-23 20:29:30 +00:00
[Unroll] Rework the naming and structure of the new unroll heuristics.

The new naming is (to me) much easier to understand. Here is a summary
of the new state of the world:

- '*Threshold' is the threshold for full unrolling. It is measured
  against the estimated unrolled cost as computed by getUserCost in TTI
  (or CodeMetrics, etc). We will exceed this threshold when unrolling
  loops where unrolling exposes a significant degree of simplification
  of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's
  estimated dynamic execution cost which needs to be saved by unrolling
  to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated
  unrolling cost when the dynamic savings are expected to be high.

When actually analyzing the loop, we now produce both an estimated
unrolled cost, and an estimated rolled cost. The rolled cost is notably
a dynamic estimate based on our analysis of the expected execution of
each iteration.

While we're still working to build up the infrastructure for making
these estimates, to me it is much more clear *how* to make them better
when they have reasonably descriptive names. For example, we may want to
apply estimated (from heuristics or profiles) dynamic execution weights
to the *dynamic* cost estimates. If we start doing that, we would also
need to track the static unrolled cost and the dynamic unrolled cost, as
only the latter could reasonably be weighted by profile information.

This patch is sadly not without functionality change for the new unroll
analysis logic. Buried in the heuristic management were several things
that surprised me. For example, we never subtracted the optimized
instruction count off when comparing against the unroll heuristics!
I don't know if this just got lost somewhere along the way or what, but
with the new accounting of things, this is much easier to keep track of
and we use the post-simplification cost estimate to compare to the
thresholds, and use the dynamic cost reduction ratio to select whether
we can exceed the baseline threshold.

The old values of these flags also don't necessarily make sense. My
impression is that none of these thresholds or discounts have been tuned
yet, and so they're just arbitrary placeholder numbers. As such, I've not
bothered to adjust for the fact that this is now a discount and not
a two-tier threshold model. We need to tune all these values once the
logic is ready to be enabled.

Differential Revision: http://reviews.llvm.org/D9966

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239164 91177308-0d34-0410-b5e6-96231b3b80d8
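
The three knobs summarized in this message combine into one acceptance check. The following is a minimal standalone sketch of that rule, assuming the semantics of the new `canUnrollCompletely` in this commit. `UnrollKnobs` and `mayFullyUnroll` are hypothetical names used only for illustration and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bundle of the three knobs; the field names mirror the commit,
// but this struct does not exist in LLVM.
struct UnrollKnobs {
  unsigned Threshold;                        // baseline cost threshold
  unsigned PercentDynamicCostSavedThreshold; // minimum % of dynamic cost saved
  unsigned DynamicCostSavingsDiscount;       // discount applied once the % is met
};

// Returns true if a loop with these cost estimates may be fully unrolled.
bool mayFullyUnroll(const UnrollKnobs &K, unsigned UnrolledCost,
                    unsigned RolledDynamicCost) {
  // Cheap enough on its own: no discount is needed.
  if (UnrolledCost <= K.Threshold)
    return true;

  // Past this point UnrolledCost > 0, so RolledDynamicCost is non-zero too.
  assert(RolledDynamicCost >= UnrolledCost &&
         "Cannot have a higher unrolled cost than a rolled cost!");

  // Percentage of the rolled loop's dynamic cost that unrolling saves.
  unsigned PercentSaved = static_cast<unsigned>(
      static_cast<uint64_t>(RolledDynamicCost - UnrolledCost) * 100 /
      RolledDynamicCost);

  // The discount only applies when enough dynamic cost is saved. Signed
  // arithmetic keeps a discount larger than the cost from wrapping around.
  return PercentSaved >= K.PercentDynamicCostSavedThreshold &&
         static_cast<int64_t>(UnrolledCost) -
                 static_cast<int64_t>(K.DynamicCostSavingsDiscount) <=
             static_cast<int64_t>(K.Threshold);
}
```

With the default values introduced by this patch (150, 20, 2000), an unrolled cost of 400 against a rolled dynamic cost of 1000 saves 60%, so the discount applies and the loop is accepted. An unrolled cost of 900 against the same rolled cost saves only 10% and is rejected.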
parent 8a5883aabe
commit 862b2ad204
@@ -221,19 +221,21 @@ public:
 
   /// Parameters that control the generic loop unrolling transformation.
   struct UnrollingPreferences {
-    /// The cost threshold for the unrolled loop, compared to
-    /// CodeMetrics.NumInsts aggregated over all basic blocks in the loop body.
-    /// The unrolling factor is set such that the unrolled loop body does not
-    /// exceed this cost. Set this to UINT_MAX to disable the loop body cost
+    /// The cost threshold for the unrolled loop. Should be relative to the
+    /// getUserCost values returned by this API, and the expectation is that
+    /// the unrolled loop's instructions when run through that interface should
+    /// not exceed this cost. However, this is only an estimate. Also, specific
+    /// loops may be unrolled even with a cost above this threshold if deemed
+    /// profitable. Set this to UINT_MAX to disable the loop body cost
     /// restriction.
     unsigned Threshold;
-    /// If complete unrolling could help other optimizations (e.g. InstSimplify)
-    /// to remove N% of instructions, then we can go beyond unroll threshold.
-    /// This value set the minimal percent for allowing that.
-    unsigned MinPercentOfOptimized;
-    /// The absolute cost threshold. We won't go beyond this even if complete
-    /// unrolling could result in optimizing out 90% of instructions.
-    unsigned AbsoluteThreshold;
+    /// If complete unrolling will reduce the cost of the loop below its
+    /// expected dynamic cost while rolled by this percentage, apply a discount
+    /// (below) to its unrolled cost.
+    unsigned PercentDynamicCostSavedThreshold;
+    /// The discount applied to the unrolled cost when the *dynamic* cost
+    /// savings of unrolling exceed the \c PercentDynamicCostSavedThreshold.
+    unsigned DynamicCostSavingsDiscount;
     /// The cost threshold for the unrolled loop when optimizing for size (set
     /// to UINT_MAX to disable).
     unsigned OptSizeThreshold;
@@ -38,25 +38,25 @@ using namespace llvm;
 #define DEBUG_TYPE "loop-unroll"
 
 static cl::opt<unsigned>
     UnrollThreshold("unroll-threshold", cl::init(150), cl::Hidden,
-                    cl::desc("The cut-off point for automatic loop unrolling"));
+                    cl::desc("The baseline cost threshold for loop unrolling"));
+
+static cl::opt<unsigned> UnrollPercentDynamicCostSavedThreshold(
+    "unroll-percent-dynamic-cost-saved-threshold", cl::init(20), cl::Hidden,
+    cl::desc("The percentage of estimated dynamic cost which must be saved by "
+             "unrolling to allow unrolling up to the max threshold."));
+
+static cl::opt<unsigned> UnrollDynamicCostSavingsDiscount(
+    "unroll-dynamic-cost-savings-discount", cl::init(2000), cl::Hidden,
+    cl::desc("This is the amount discounted from the total unroll cost when "
+             "the unrolled form has a high dynamic cost savings (triggered by "
+             "the '-unroll-percent-dynamic-cost-saved-threshold' flag)."));
 
 static cl::opt<unsigned> UnrollMaxIterationsCountToAnalyze(
     "unroll-max-iteration-count-to-analyze", cl::init(0), cl::Hidden,
     cl::desc("Don't allow loop unrolling to simulate more than this number of "
              "iterations when checking full unroll profitability"));
 
-static cl::opt<unsigned> UnrollMinPercentOfOptimized(
-    "unroll-percent-of-optimized-for-complete-unroll", cl::init(20), cl::Hidden,
-    cl::desc("If complete unrolling could trigger further optimizations, and, "
-             "by that, remove the given percent of instructions, perform the "
-             "complete unroll even if it's beyond the threshold"));
-
-static cl::opt<unsigned> UnrollAbsoluteThreshold(
-    "unroll-absolute-threshold", cl::init(2000), cl::Hidden,
-    cl::desc("Don't unroll if the unrolled size is bigger than this threshold,"
-             " even if we can remove big portion of instructions later."));
-
 static cl::opt<unsigned>
 UnrollCount("unroll-count", cl::init(0), cl::Hidden,
   cl::desc("Use this unroll count for all loops including those with "
@@ -82,16 +82,18 @@ namespace {
     static char ID; // Pass ID, replacement for typeid
     LoopUnroll(int T = -1, int C = -1, int P = -1, int R = -1) : LoopPass(ID) {
       CurrentThreshold = (T == -1) ? UnrollThreshold : unsigned(T);
-      CurrentAbsoluteThreshold = UnrollAbsoluteThreshold;
-      CurrentMinPercentOfOptimized = UnrollMinPercentOfOptimized;
+      CurrentPercentDynamicCostSavedThreshold =
+          UnrollPercentDynamicCostSavedThreshold;
+      CurrentDynamicCostSavingsDiscount = UnrollDynamicCostSavingsDiscount;
       CurrentCount = (C == -1) ? UnrollCount : unsigned(C);
       CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;
       CurrentRuntime = (R == -1) ? UnrollRuntime : (bool)R;
 
       UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);
-      UserAbsoluteThreshold = (UnrollAbsoluteThreshold.getNumOccurrences() > 0);
-      UserPercentOfOptimized =
-          (UnrollMinPercentOfOptimized.getNumOccurrences() > 0);
+      UserPercentDynamicCostSavedThreshold =
+          (UnrollPercentDynamicCostSavedThreshold.getNumOccurrences() > 0);
+      UserDynamicCostSavingsDiscount =
+          (UnrollDynamicCostSavingsDiscount.getNumOccurrences() > 0);
       UserAllowPartial = (P != -1) ||
                          (UnrollAllowPartial.getNumOccurrences() > 0);
       UserRuntime = (R != -1) || (UnrollRuntime.getNumOccurrences() > 0);
@@ -115,18 +117,18 @@ namespace {
 
     unsigned CurrentCount;
     unsigned CurrentThreshold;
-    unsigned CurrentAbsoluteThreshold;
-    unsigned CurrentMinPercentOfOptimized;
+    unsigned CurrentPercentDynamicCostSavedThreshold;
+    unsigned CurrentDynamicCostSavingsDiscount;
     bool CurrentAllowPartial;
     bool CurrentRuntime;
-    bool UserCount; // CurrentCount is user-specified.
-    bool UserThreshold; // CurrentThreshold is user-specified.
-    bool UserAbsoluteThreshold; // CurrentAbsoluteThreshold is
-                                // user-specified.
-    bool UserPercentOfOptimized; // CurrentMinPercentOfOptimized is
-                                 // user-specified.
-    bool UserAllowPartial; // CurrentAllowPartial is user-specified.
-    bool UserRuntime; // CurrentRuntime is user-specified.
+
+    // Flags for whether the 'current' settings are user-specified.
+    bool UserCount;
+    bool UserThreshold;
+    bool UserPercentDynamicCostSavedThreshold;
+    bool UserDynamicCostSavingsDiscount;
+    bool UserAllowPartial;
+    bool UserRuntime;
 
     bool runOnLoop(Loop *L, LPPassManager &LPM) override;
 
@@ -156,8 +158,9 @@ namespace {
     void getUnrollingPreferences(Loop *L, const TargetTransformInfo &TTI,
                                  TargetTransformInfo::UnrollingPreferences &UP) {
       UP.Threshold = CurrentThreshold;
-      UP.AbsoluteThreshold = CurrentAbsoluteThreshold;
-      UP.MinPercentOfOptimized = CurrentMinPercentOfOptimized;
+      UP.PercentDynamicCostSavedThreshold =
+          CurrentPercentDynamicCostSavedThreshold;
+      UP.DynamicCostSavingsDiscount = CurrentDynamicCostSavingsDiscount;
       UP.OptSizeThreshold = OptSizeUnrollThreshold;
       UP.PartialThreshold = CurrentThreshold;
       UP.PartialOptSizeThreshold = OptSizeUnrollThreshold;
@@ -186,8 +189,8 @@ namespace {
     void selectThresholds(const Loop *L, bool HasPragma,
                           const TargetTransformInfo::UnrollingPreferences &UP,
                           unsigned &Threshold, unsigned &PartialThreshold,
-                          unsigned &AbsoluteThreshold,
-                          unsigned &PercentOfOptimizedForCompleteUnroll) {
+                          unsigned &PercentDynamicCostSavedThreshold,
+                          unsigned &DynamicCostSavingsDiscount) {
       // Determine the current unrolling threshold. While this is
       // normally set from UnrollThreshold, it is overridden to a
       // smaller value if the current function is marked as
@@ -195,11 +198,13 @@ namespace {
       // specified.
       Threshold = UserThreshold ? CurrentThreshold : UP.Threshold;
       PartialThreshold = UserThreshold ? CurrentThreshold : UP.PartialThreshold;
-      AbsoluteThreshold = UserAbsoluteThreshold ? CurrentAbsoluteThreshold
-                                                : UP.AbsoluteThreshold;
-      PercentOfOptimizedForCompleteUnroll = UserPercentOfOptimized
-                                                ? CurrentMinPercentOfOptimized
-                                                : UP.MinPercentOfOptimized;
+      PercentDynamicCostSavedThreshold =
+          UserPercentDynamicCostSavedThreshold
+              ? CurrentPercentDynamicCostSavedThreshold
+              : UP.PercentDynamicCostSavedThreshold;
+      DynamicCostSavingsDiscount = UserDynamicCostSavingsDiscount
+                                       ? CurrentDynamicCostSavingsDiscount
+                                       : UP.DynamicCostSavingsDiscount;
 
       if (!UserThreshold &&
           L->getHeader()->getParent()->hasFnAttribute(
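
Each knob in `selectThresholds` is resolved the same way: a value passed explicitly on the command line wins, otherwise the target's `UnrollingPreferences` value is used. The following is a reduced sketch of that pattern. `Knob` and `selectKnob` are hypothetical names, and a plain struct stands in for `cl::opt<unsigned>` and its `getNumOccurrences()` count.

```cpp
// Hypothetical reduction of a command-line option: its current value plus how
// many times the flag appeared on the command line.
struct Knob {
  unsigned Value;
  unsigned NumOccurrences;
};

// Prefer an explicitly passed flag; otherwise defer to the target's preference.
unsigned selectKnob(const Knob &Flag, unsigned TargetPreference) {
  bool UserSpecified = Flag.NumOccurrences > 0;
  return UserSpecified ? Flag.Value : TargetPreference;
}
```

A flag left at its default (zero occurrences) therefore never overrides a value the target chose, which is why the pass tracks a separate `User*` boolean per setting.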
@@ -220,9 +225,9 @@ namespace {
       }
     }
     bool canUnrollCompletely(Loop *L, unsigned Threshold,
-                             unsigned AbsoluteThreshold, uint64_t UnrolledSize,
-                             unsigned NumberOfOptimizedInstructions,
-                             unsigned PercentOfOptimizedForCompleteUnroll);
+                             unsigned PercentDynamicCostSavedThreshold,
+                             unsigned DynamicCostSavingsDiscount,
+                             unsigned UnrolledCost, unsigned RolledDynamicCost);
   };
 }
 
@@ -556,11 +561,12 @@ private:
 
 namespace {
 struct EstimatedUnrollCost {
-  /// \brief Count the number of optimized instructions.
-  unsigned NumberOfOptimizedInstructions;
+  /// \brief The estimated cost after unrolling.
+  unsigned UnrolledCost;
 
-  /// \brief Count the total number of instructions.
-  unsigned UnrolledLoopSize;
+  /// \brief The estimated dynamic cost of executing the instructions in the
+  /// rolled form.
+  unsigned RolledDynamicCost;
 };
 }
 
@@ -597,8 +603,15 @@ analyzeLoopUnrollCost(const Loop *L, unsigned TripCount, ScalarEvolution &SE,
   // each iteration. This cache is lazily self-populating.
   SCEVCache SC(*L, SE);
 
-  unsigned NumberOfOptimizedInstructions = 0;
-  unsigned UnrolledLoopSize = 0;
+  // The estimated cost of the unrolled form of the loop. We try to estimate
+  // this by simplifying as much as we can while computing the estimate.
+  unsigned UnrolledCost = 0;
+  // We also track the estimated dynamic (that is, actually executed) cost in
+  // the rolled form. This helps identify cases when the savings from unrolling
+  // aren't just exposing dead control flows, but actual reduced dynamic
+  // instructions due to the simplifications which we expect to occur after
+  // unrolling.
+  unsigned RolledDynamicCost = 0;
 
   // Simulate execution of each iteration of the loop counting instructions,
   // which would be simplified.
@@ -618,17 +631,20 @@ analyzeLoopUnrollCost(const Loop *L, unsigned TripCount, ScalarEvolution &SE,
       // it. We don't change the actual IR, just count optimization
       // opportunities.
       for (Instruction &I : *BB) {
-        UnrolledLoopSize += TTI.getUserCost(&I);
+        unsigned InstCost = TTI.getUserCost(&I);
 
         // Visit the instruction to analyze its loop cost after unrolling,
-        // and if the visitor returns true, then we can optimize this
-        // instruction away.
-        if (Analyzer.visit(I))
-          NumberOfOptimizedInstructions += TTI.getUserCost(&I);
+        // and if the visitor returns false, include this instruction in the
+        // unrolled cost.
+        if (!Analyzer.visit(I))
+          UnrolledCost += InstCost;
 
+        // Also track this instruction's expected cost when executing the rolled
+        // loop form.
+        RolledDynamicCost += InstCost;
+
         // If unrolled body turns out to be too big, bail out.
-        if (UnrolledLoopSize - NumberOfOptimizedInstructions >
-            MaxUnrolledLoopSize)
+        if (UnrolledCost > MaxUnrolledLoopSize)
           return None;
       }
 
@@ -640,10 +656,10 @@ analyzeLoopUnrollCost(const Loop *L, unsigned TripCount, ScalarEvolution &SE,
 
     // If we found no optimization opportunities on the first iteration, we
     // won't find them on later ones too.
-    if (!NumberOfOptimizedInstructions)
+    if (UnrolledCost == RolledDynamicCost)
       return None;
   }
-  return {{NumberOfOptimizedInstructions, UnrolledLoopSize}};
+  return {{UnrolledCost, RolledDynamicCost}};
 }
 
 /// ApproximateLoopSize - Approximate the size of the loop.
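
The new accounting in `analyzeLoopUnrollCost` can be read in isolation: every simulated instruction adds to the rolled dynamic cost, while only instructions that survive simplification add to the unrolled cost. The following is a minimal sketch under stated assumptions. `SimulatedInst` and `estimateUnrollCost` are hypothetical, a precomputed `SimplifiedAway` flag replaces the real `Analyzer.visit(I)` call, and a `bool` return with an out-parameter replaces `Optional<EstimatedUnrollCost>`.

```cpp
#include <vector>

// Hypothetical stand-in for one simulated instruction: its TTI user cost and
// whether the analyzer expects it to be simplified away after unrolling.
struct SimulatedInst {
  unsigned Cost;
  bool SimplifiedAway;
};

// Same two fields as the new EstimatedUnrollCost in this commit.
struct EstimatedUnrollCost {
  unsigned UnrolledCost;
  unsigned RolledDynamicCost;
};

// Returns false (the analogue of returning None) when the estimate exceeds the
// budget or when simulation finds nothing to simplify.
bool estimateUnrollCost(
    const std::vector<std::vector<SimulatedInst>> &Iterations,
    unsigned MaxUnrolledLoopSize, EstimatedUnrollCost &Out) {
  unsigned UnrolledCost = 0;
  unsigned RolledDynamicCost = 0;
  for (const std::vector<SimulatedInst> &Iteration : Iterations) {
    for (const SimulatedInst &I : Iteration) {
      // Only instructions that survive simplification count toward the
      // unrolled cost.
      if (!I.SimplifiedAway)
        UnrolledCost += I.Cost;
      // Every simulated instruction counts toward the rolled dynamic cost.
      RolledDynamicCost += I.Cost;
      // Bail out as soon as the unrolled body is too big.
      if (UnrolledCost > MaxUnrolledLoopSize)
        return false;
    }
    // Equal running totals mean nothing has been simplified so far.
    if (UnrolledCost == RolledDynamicCost)
      return false;
  }
  Out = {UnrolledCost, RolledDynamicCost};
  return true;
}
```

For two iterations that each hold one surviving instruction of cost 2 and one simplified instruction of cost 3, the sketch yields an unrolled cost of 4 and a rolled dynamic cost of 10. The caller in this commit passes `Threshold + DynamicCostSavingsDiscount` as the budget.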
@@ -749,46 +765,56 @@ static void SetLoopAlreadyUnrolled(Loop *L) {
   L->setLoopID(NewLoopID);
 }
 
-bool LoopUnroll::canUnrollCompletely(
-    Loop *L, unsigned Threshold, unsigned AbsoluteThreshold,
-    uint64_t UnrolledSize, unsigned NumberOfOptimizedInstructions,
-    unsigned PercentOfOptimizedForCompleteUnroll) {
+bool LoopUnroll::canUnrollCompletely(Loop *L, unsigned Threshold,
+                                     unsigned PercentDynamicCostSavedThreshold,
+                                     unsigned DynamicCostSavingsDiscount,
+                                     unsigned UnrolledCost,
+                                     unsigned RolledDynamicCost) {
 
   if (Threshold == NoThreshold) {
     DEBUG(dbgs() << " Can fully unroll, because no threshold is set.\n");
     return true;
   }
 
-  if (UnrolledSize <= Threshold) {
-    DEBUG(dbgs() << " Can fully unroll, because unrolled size: "
-                 << UnrolledSize << "<" << Threshold << "\n");
+  if (UnrolledCost <= Threshold) {
+    DEBUG(dbgs() << " Can fully unroll, because unrolled cost: "
+                 << UnrolledCost << "<" << Threshold << "\n");
     return true;
   }
 
-  assert(UnrolledSize && "UnrolledSize can't be 0 at this point.");
-  unsigned PercentOfOptimizedInstructions =
-      (uint64_t)NumberOfOptimizedInstructions * 100ull / UnrolledSize;
+  assert(UnrolledCost && "UnrolledCost can't be 0 at this point.");
+  assert(RolledDynamicCost >= UnrolledCost &&
+         "Cannot have a higher unrolled cost than a rolled cost!");
 
-  if (UnrolledSize <= AbsoluteThreshold &&
-      PercentOfOptimizedInstructions >= PercentOfOptimizedForCompleteUnroll) {
-    DEBUG(dbgs() << " Can fully unroll, because unrolling will help removing "
-                 << PercentOfOptimizedInstructions
-                 << "% instructions (threshold: "
-                 << PercentOfOptimizedForCompleteUnroll << "%)\n");
-    DEBUG(dbgs() << " Unrolled size (" << UnrolledSize
-                 << ") is less than the threshold (" << AbsoluteThreshold
-                 << ").\n");
+  // Compute the percentage of the dynamic cost in the rolled form that is
+  // saved when unrolled. If unrolling dramatically reduces the estimated
+  // dynamic cost of the loop, we use a higher threshold to allow more
+  // unrolling.
+  unsigned PercentDynamicCostSaved =
+      (uint64_t)(RolledDynamicCost - UnrolledCost) * 100ull / RolledDynamicCost;
+
+  if (PercentDynamicCostSaved >= PercentDynamicCostSavedThreshold &&
+      (int64_t)UnrolledCost - (int64_t)DynamicCostSavingsDiscount <=
+          (int64_t)Threshold) {
+    DEBUG(dbgs() << " Can fully unroll, because unrolling will reduce the "
+                    "expected dynamic cost by " << PercentDynamicCostSaved
+                 << "% (threshold: " << PercentDynamicCostSavedThreshold
+                 << "%)\n"
+                 << " and the unrolled cost (" << UnrolledCost
+                 << ") is less than the max threshold ("
+                 << DynamicCostSavingsDiscount << ").\n");
     return true;
   }
 
   DEBUG(dbgs() << " Too large to fully unroll:\n");
-  DEBUG(dbgs() << " Unrolled size: " << UnrolledSize << "\n");
-  DEBUG(dbgs() << " Estimated number of optimized instructions: "
-               << NumberOfOptimizedInstructions << "\n");
-  DEBUG(dbgs() << " Absolute threshold: " << AbsoluteThreshold << "\n");
-  DEBUG(dbgs() << " Minimum percent of removed instructions: "
-               << PercentOfOptimizedForCompleteUnroll << "\n");
-  DEBUG(dbgs() << " Threshold for small loops: " << Threshold << "\n");
+  DEBUG(dbgs() << " Threshold: " << Threshold << "\n");
+  DEBUG(dbgs() << " Max threshold: " << DynamicCostSavingsDiscount << "\n");
+  DEBUG(dbgs() << " Percent cost saved threshold: "
+               << PercentDynamicCostSavedThreshold << "%\n");
+  DEBUG(dbgs() << " Unrolled cost: " << UnrolledCost << "\n");
+  DEBUG(dbgs() << " Rolled dynamic cost: " << RolledDynamicCost << "\n");
+  DEBUG(dbgs() << " Percent cost saved: " << PercentDynamicCostSaved
+               << "\n");
   return false;
 }
 
@@ -899,9 +925,11 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
   }
 
   unsigned Threshold, PartialThreshold;
-  unsigned AbsoluteThreshold, PercentOfOptimizedForCompleteUnroll;
+  unsigned PercentDynamicCostSavedThreshold;
+  unsigned DynamicCostSavingsDiscount;
   selectThresholds(L, HasPragma, UP, Threshold, PartialThreshold,
-                   AbsoluteThreshold, PercentOfOptimizedForCompleteUnroll);
+                   PercentDynamicCostSavedThreshold,
+                   DynamicCostSavingsDiscount);
 
   // Given Count, TripCount and thresholds determine the type of
   // unrolling which is to be performed.
@@ -910,20 +938,18 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
   if (TripCount && Count == TripCount) {
     Unrolling = Partial;
     // If the loop is really small, we don't need to run an expensive analysis.
-    if (canUnrollCompletely(
-            L, Threshold, AbsoluteThreshold,
-            UnrolledSize, 0, 100)) {
+    if (canUnrollCompletely(L, Threshold, 100, DynamicCostSavingsDiscount,
+                            UnrolledSize, UnrolledSize)) {
       Unrolling = Full;
     } else {
       // The loop isn't that small, but we still can fully unroll it if that
       // helps to remove a significant number of instructions.
       // To check that, run additional analysis on the loop.
-      if (Optional<EstimatedUnrollCost> Cost =
-              analyzeLoopUnrollCost(L, TripCount, *SE, TTI, AbsoluteThreshold))
-        if (canUnrollCompletely(L, Threshold, AbsoluteThreshold,
-                                Cost->UnrolledLoopSize,
-                                Cost->NumberOfOptimizedInstructions,
-                                PercentOfOptimizedForCompleteUnroll)) {
+      if (Optional<EstimatedUnrollCost> Cost = analyzeLoopUnrollCost(
+              L, TripCount, *SE, TTI, Threshold + DynamicCostSavingsDiscount))
+        if (canUnrollCompletely(L, Threshold, PercentDynamicCostSavedThreshold,
+                                DynamicCostSavingsDiscount, Cost->UnrolledCost,
+                                Cost->RolledDynamicCost)) {
           Unrolling = Full;
         }
     }
@@ -1,5 +1,5 @@
 ; Check that we don't crash on corner cases.
-; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-absolute-threshold=10 -unroll-threshold=10 -unroll-percent-of-optimized-for-complete-unroll=20 -o /dev/null
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=20 -o /dev/null
 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
 
 define void @foo1() {
@@ -1,8 +1,8 @@
 ; In this test we check how heuristics for complete unrolling work. We have
 ; three knobs:
 ; 1) -unroll-threshold
-; 2) -unroll-absolute-threshold and
-; 3) -unroll-percent-of-optimized-for-complete-unroll
+; 2) -unroll-percent-dynamic-cost-saved-threshold and
+; 3) -unroll-dynamic-cost-savings-discount
 ;
 ; They control loop-unrolling according to the following rules:
 ; * If size of unrolled loop exceeds the absolute threshold, we don't unroll
@@ -17,10 +17,10 @@
 ; optimizations to remove ~55% of the instructions, the loop body size is 9,
 ; and unrolled size is 65.
 
-; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-absolute-threshold=10 -unroll-threshold=10 -unroll-percent-of-optimized-for-complete-unroll=20 | FileCheck %s -check-prefix=TEST1
-; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-absolute-threshold=100 -unroll-threshold=10 -unroll-percent-of-optimized-for-complete-unroll=20 | FileCheck %s -check-prefix=TEST2
-; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-absolute-threshold=100 -unroll-threshold=10 -unroll-percent-of-optimized-for-complete-unroll=80 | FileCheck %s -check-prefix=TEST3
-; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-absolute-threshold=100 -unroll-threshold=100 -unroll-percent-of-optimized-for-complete-unroll=80 | FileCheck %s -check-prefix=TEST4
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=20 -unroll-dynamic-cost-savings-discount=0 | FileCheck %s -check-prefix=TEST1
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=20 -unroll-dynamic-cost-savings-discount=90 | FileCheck %s -check-prefix=TEST2
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=10 -unroll-percent-dynamic-cost-saved-threshold=80 -unroll-dynamic-cost-savings-discount=90 | FileCheck %s -check-prefix=TEST3
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=100 -unroll-percent-dynamic-cost-saved-threshold=80 -unroll-dynamic-cost-savings-discount=0 | FileCheck %s -check-prefix=TEST4
 
 ; If the absolute threshold is too low, or if we can't optimize away requested
 ; percent of instructions, we shouldn't unroll: