llvm-6502/lib/Analysis
Chandler Carruth 34afa06cf2 [inliner] Fix the early-exit of the inline cost analysis to correctly
model the dense vector instruction bonuses.

Previously, this code really didn't effectively compute the density of
inlined vector instructions and apply the intended inliner bonus. It
would try to compute it repeatedly while analyzing the function and
didn't handle the case where future vector instructions would tip the
scales back towards the bonus.

Instead, speculatively apply all possible bonuses to the threshold
initially. Once we *know* that a certain bonus cannot be applied,
subtract it. This should delay early bailout enough to get much more
consistent results without actually causing us to analyze huge swaths of
code. I expect some (hopefully mild) compile time hit here, and some
swings in performance, but this was definitely the intended behavior of
these bonuses.

This also dramatically simplifies the computation of the bonuses to not
interact with each other in confusing ways. The previous code didn't do
a good job of this and the values for bonuses may be surprising but are
at least now clearly written in the code.

Finally, fix code to be in line with comments and use zero as the
bailout condition.

Patch by Easwaran Raman, with some comment tweaks by me to try and
further clarify what is going on with this code.

http://reviews.llvm.org/D8267

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238276 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 02:49:05 +00:00
IPA [inliner] Fix the early-exit of the inline cost analysis to correctly 2015-05-27 02:49:05 +00:00
AliasAnalysis.cpp Test commit: Remove unnecessary spaces. 2015-05-13 15:04:14 +00:00
AliasAnalysisCounter.cpp Use 'override/final' instead of 'virtual' for overridden methods 2015-04-11 02:11:45 +00:00
AliasAnalysisEvaluator.cpp [CallSite] Make construction from Value* (or Instruction*) explicit. 2015-04-10 14:50:08 +00:00
AliasDebugger.cpp
AliasSetTracker.cpp Constify arguments in AliasSetTracker methods. NFC 2015-05-13 01:12:12 +00:00
Analysis.cpp Divergence analysis for GPU programs 2015-04-10 05:03:50 +00:00
AssumptionCache.cpp
BasicAliasAnalysis.cpp Revert r236894 "[BasicAA] Fix zext & sext handling" 2015-05-22 01:27:37 +00:00
BlockFrequencyInfo.cpp
BlockFrequencyInfoImpl.cpp Remove 4,096 loop scale limitation. 2015-04-01 17:42:27 +00:00
BranchProbabilityInfo.cpp Fix information loss in branch probability computation. 2015-05-07 17:22:06 +00:00
CaptureTracking.cpp
CFG.cpp
CFGPrinter.cpp
CFLAliasAnalysis.cpp Convert PHI getIncomingValue() to foreach over incoming_values(). NFC. 2015-05-12 20:05:31 +00:00
CGSCCPassManager.cpp
CMakeLists.txt Move IDF Calculation to a separate file, expose an interface to it. 2015-04-21 19:13:02 +00:00
CodeMetrics.cpp
ConstantFolding.cpp [ConstantFolding] Fix wrong folding of intrinsic 'convert.from.fp16'. 2015-05-14 18:01:48 +00:00
CostModel.cpp
Delinearization.cpp
DependenceAnalysis.cpp [DependenceAnalysis] Fix for PR21585: collectUpperBound triggers asserts 2015-05-15 12:17:22 +00:00
DivergenceAnalysis.cpp Divergence analysis for GPU programs 2015-04-10 05:03:50 +00:00
DominanceFrontier.cpp
DomPrinter.cpp
InstCount.cpp
InstructionSimplify.cpp [InstSimplify] Handle some overflow intrinsics in InstSimplify 2015-05-22 03:56:46 +00:00
Interval.cpp
IntervalPartition.cpp
IteratedDominanceFrontier.cpp Move IDF Calculation to a separate file, expose an interface to it. 2015-04-21 19:13:02 +00:00
IVUsers.cpp
LazyCallGraph.cpp
LazyValueInfo.cpp
LibCallAliasAnalysis.cpp
LibCallSemantics.cpp [WinEH] Start EH preparation for 32-bit x86, it uses no arguments 2015-04-29 22:49:54 +00:00
Lint.cpp
LLVMBuild.txt
Loads.cpp
LoopAccessAnalysis.cpp [LoopAccesses] If shouldRetryWithRuntimeCheck, reset InterestingDependences 2015-05-18 15:37:03 +00:00
LoopInfo.cpp Add llvm::all_of which wraps std::all_of. 2015-05-13 22:19:13 +00:00
LoopPass.cpp
Makefile
MemDepPrinter.cpp [CallSite] Make construction from Value* (or Instruction*) explicit. 2015-04-10 14:50:08 +00:00
MemDerefPrinter.cpp Test commit. Fix typo in MemDerefPrinter.cpp comment. 2015-05-21 11:57:38 +00:00
MemoryBuiltins.cpp
MemoryDependenceAnalysis.cpp Revamp PredIteratorCache interface to be cleaner. 2015-04-21 21:11:50 +00:00
ModuleDebugInfoPrinter.cpp IR: Give 'DI' prefix to debug info metadata 2015-04-29 16:38:44 +00:00
NoAliasAnalysis.cpp
PHITransAddr.cpp
PostDominators.cpp
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp
RegionPass.cpp Change range-based for-loops to be -Wrange-loop-analysis clean. 2015-04-15 01:21:15 +00:00
RegionPrinter.cpp One more -Wrange-loop-analysis cleanup. 2015-04-15 21:40:50 +00:00
ScalarEvolution.cpp [ScalarEvolution] refactor: extract interface getGEPExpr 2015-05-18 17:03:25 +00:00
ScalarEvolutionAliasAnalysis.cpp
ScalarEvolutionExpander.cpp [SCEV] Strengthen SCEVExpander::isHighCostExpansion. 2015-04-14 03:20:32 +00:00
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp
SparsePropagation.cpp
StratifiedSets.h
TargetLibraryInfo.cpp Populate list of vectorizable functions for Accelerate library. 2015-05-07 17:11:51 +00:00
TargetTransformInfo.cpp [X86] Disable loop unrolling in loop vectorization pass when VF is 1. 2015-05-06 17:12:25 +00:00
Trace.cpp
TypeBasedAliasAnalysis.cpp
ValueTracking.cpp Reapply r237539 with a fix for the Chromium build. 2015-05-20 18:41:25 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
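
The equivalence claimed above can be checked numerically. The following is a
minimal sketch, not ScalarEvolution code: the function names are hypothetical,
and it assumes (as the expanded expression suggests) that the exit value is
the recurrence evaluated at iteration %n - 1. It evaluates {a,+,b,+,c} at
iteration i as a + b*C(i,1) + c*C(i,2) and compares the result with n * n. It
also shows one way the division by 2 can stay exact in 64 bits, by halving the
even factor before multiplying.

```cpp
#include <cassert>
#include <cstdint>

// Closed form of the add recurrence {a,+,b,+,c} at iteration i:
//   a + b*C(i,1) + c*C(i,2), computed modulo 2^64.
uint64_t evalQuadraticAddRec(uint64_t a, uint64_t b, uint64_t c, uint64_t i) {
  // C(i,2) = i*(i-1)/2. Halve whichever factor is even before multiplying,
  // so the division stays exact without needing a 65th bit.
  uint64_t choose2 = (i % 2 == 0) ? (i / 2) * (i - 1) : i * ((i - 1) / 2);
  return a + b * i + c * choose2;
}

// Step the recurrence the way a loop would: the value grows by `step`,
// and `step` itself grows by c, once per iteration.
uint64_t stepQuadraticAddRec(uint64_t a, uint64_t b, uint64_t c, uint64_t i) {
  uint64_t value = a, step = b;
  for (uint64_t k = 0; k < i; ++k) {
    value += step;
    step += c;
  }
  return value;
}

// Hypothetical check: assuming the recurrence is evaluated at iteration
// n - 1 on exit, {1,+,3,+,2} yields n * n.
void checkExitValueIsSquare(uint64_t maxN) {
  for (uint64_t n = 1; n <= maxN; ++n) {
    assert(evalQuadraticAddRec(1, 3, 2, n - 1) == n * n);
    assert(stepQuadraticAddRec(1, 3, 2, n - 1) == n * n);
  }
}
```

Calling checkExitValueIsSquare(1000) exercises both the closed form and the
step-by-step evaluation; both agree with n * n under the assumption above.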

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is
forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

Because truncation distributes over multiplication by -1, the first two
terms cancel, so this could be folded to

(-1 * (trunc i64 undef to i32))
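
The fold relies on the first two terms summing to zero for every value of
%arg5. A small sketch of that arithmetic follows, using fixed-width unsigned
integers to model i64 and i32; the helper names are hypothetical and this is
not how ScalarEvolution itself would implement the simplification.

```cpp
#include <cassert>
#include <cstdint>

// Model `trunc i64 %x to i32`: keep the low 32 bits.
constexpr uint32_t trunc32(uint64_t x) { return static_cast<uint32_t>(x); }

// (trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32),
// with both the negation and the addition wrapping as in two's complement.
constexpr uint32_t firstTwoTerms(uint64_t arg5) {
  return static_cast<uint32_t>(trunc32(0 - arg5) + trunc32(arg5));
}

// Hypothetical spot check over a few representative inputs.
void checkFirstTwoTermsCancel() {
  assert(firstTwoTerms(0) == 0);
  assert(firstTwoTerms(1) == 0);
  assert(firstTwoTerms(0x0123456789ABCDEFull) == 0);
  assert(firstTwoTerms(0xFFFFFFFFFFFFFFFFull) == 0);
}
```

Since the sum is zero regardless of %arg5, only the undef term remains.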

//===---------------------------------------------------------------------===//