llvm-6502/lib/Analysis
Chandler Carruth 0bdc7cd5de [inliner] Completely change (and fix) how the inline cost analysis
handles terminator instructions.

The inline cost analysis inheritted some pretty rough handling of
terminator insts from the original cost analysis, and then made it much,
much worse by factoring all of the important analyses into a separate
instruction visitor. That instruction visitor never visited the
terminator.

This works fine for things like conditional branches, but for many other
things we simply computed The Wrong Value. First example are
unconditional branches, which should be free but were counted as full
cost. This is most significant for conditional branches where the
condition simplifies and folds during inlining. We paid a 1 instruction
tax on every branch in a straight line specialized path. =[

Oh, we also claimed that the unreachable instruction had cost.

But it gets worse. Let's consider invoke. We never applied the call
penalty. We never accounted for the cost of the arguments. Nope. Worse
still, we didn't handle the *correctness* constraints of not inlining
recursive invokes, or exception throwing returns_twice functions. Oops.
See PR18206. Sadly, PR18206 requires yet another fix, but this
refactoring is at least a huge step in that direction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197215 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-13 07:59:56 +00:00
..
IPA [inliner] Completely change (and fix) how the inline cost analysis 2013-12-13 07:59:56 +00:00
AliasAnalysis.cpp
AliasAnalysisCounter.cpp
AliasAnalysisEvaluator.cpp
AliasDebugger.cpp
AliasSetTracker.cpp
Analysis.cpp delinearization of arrays 2013-11-12 22:47:20 +00:00
BasicAliasAnalysis.cpp Use correct size for address space in BasicAA. 2013-11-16 00:36:43 +00:00
BlockFrequencyInfo.cpp Added BlockFrequencyInfo::view for displaying the block frequency propagation graph via graphviz. 2013-11-14 02:27:46 +00:00
BranchProbabilityInfo.cpp
CaptureTracking.cpp
CFG.cpp
CFGPrinter.cpp
CMakeLists.txt delinearization of arrays 2013-11-12 22:47:20 +00:00
CodeMetrics.cpp
ConstantFolding.cpp Add addrspacecast instruction. 2013-11-15 01:34:59 +00:00
CostModel.cpp
Delinearization.cpp add more comments around the delinearization of arrays 2013-11-13 22:37:58 +00:00
DependenceAnalysis.cpp add more comments around the delinearization of arrays 2013-11-13 22:37:58 +00:00
DominanceFrontier.cpp
DomPrinter.cpp
InstCount.cpp
InstructionSimplify.cpp
Interval.cpp
IntervalPartition.cpp
IVUsers.cpp
LazyValueInfo.cpp
LibCallAliasAnalysis.cpp
LibCallSemantics.cpp
Lint.cpp
LLVMBuild.txt
Loads.cpp
LoopInfo.cpp Simplify code. No functionality change. 2013-11-13 20:18:38 +00:00
LoopPass.cpp
Makefile
MemDepPrinter.cpp Fix typo. 2013-12-04 23:55:09 +00:00
MemoryBuiltins.cpp
MemoryDependenceAnalysis.cpp Fixing a heisenbug where the memory dependence analysis behaves differently 2013-11-14 01:10:52 +00:00
ModuleDebugInfoPrinter.cpp
NoAliasAnalysis.cpp
PHITransAddr.cpp Correct word hyphenations 2013-12-05 05:44:44 +00:00
PostDominators.cpp
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp
RegionPass.cpp
RegionPrinter.cpp
ScalarEvolution.cpp Annotate APInt methods where it's not clear whether they are in place with warn_unused_result. 2013-11-16 16:25:41 +00:00
ScalarEvolutionAliasAnalysis.cpp
ScalarEvolutionExpander.cpp Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces 2013-12-07 21:20:17 +00:00
ScalarEvolutionNormalization.cpp
SparsePropagation.cpp
TargetTransformInfo.cpp
Trace.cpp
TypeBasedAliasAnalysis.cpp
ValueTracking.cpp Don't speculate loads under ThreadSanitizer 2013-11-21 07:29:28 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//