llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-10-04 16:01:46 +00:00


Chandler Carruth 0bdc7cd5de [inliner] Completely change (and fix) how the inline cost analysis handles terminator instructions. The inline cost analysis inherited some pretty rough handling of terminator insts from the original cost analysis, and then made it much, much worse by factoring all of the important analyses into a separate instruction visitor. That instruction visitor never visited the terminator. This works fine for things like conditional branches, but for many other things we simply computed The Wrong Value. First example are unconditional branches, which should be free but were counted as full cost. This is most significant for conditional branches where the condition simplifies and folds during inlining. We paid a 1 instruction tax on every branch in a straight line specialized path. =[ Oh, we also claimed that the unreachable instruction had cost. But it gets worse. Let's consider invoke. We never applied the call penalty. We never accounted for the cost of the arguments. Nope. Worse still, we didn't handle the correctness constraints of not inlining recursive invokes, or exception throwing returns_twice functions. Oops. See PR18206. Sadly, PR18206 requires yet another fix, but this refactoring is at least a huge step in that direction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197215 91177308-0d34-0410-b5e6-96231b3b80d8		2013-12-13 07:59:56 +00:00
IPA	[inliner] Completely change (and fix) how the inline cost analysis	2013-12-13 07:59:56 +00:00
AliasAnalysis.cpp
AliasAnalysisCounter.cpp
AliasAnalysisEvaluator.cpp
AliasDebugger.cpp
AliasSetTracker.cpp
Analysis.cpp	delinearization of arrays	2013-11-12 22:47:20 +00:00
BasicAliasAnalysis.cpp	Use correct size for address space in BasicAA.	2013-11-16 00:36:43 +00:00
BlockFrequencyInfo.cpp	Added BlockFrequencyInfo::view for displaying the block frequency propagation graph via graphviz.	2013-11-14 02:27:46 +00:00
BranchProbabilityInfo.cpp
CaptureTracking.cpp
CFG.cpp
CFGPrinter.cpp
CMakeLists.txt	delinearization of arrays	2013-11-12 22:47:20 +00:00
CodeMetrics.cpp
ConstantFolding.cpp	Add addrspacecast instruction.	2013-11-15 01:34:59 +00:00
CostModel.cpp
Delinearization.cpp	add more comments around the delinearization of arrays	2013-11-13 22:37:58 +00:00
DependenceAnalysis.cpp	add more comments around the delinearization of arrays	2013-11-13 22:37:58 +00:00
DominanceFrontier.cpp
DomPrinter.cpp
InstCount.cpp
InstructionSimplify.cpp
Interval.cpp
IntervalPartition.cpp
IVUsers.cpp
LazyValueInfo.cpp
LibCallAliasAnalysis.cpp
LibCallSemantics.cpp
Lint.cpp
LLVMBuild.txt
Loads.cpp
LoopInfo.cpp	Simplify code. No functionality change.	2013-11-13 20:18:38 +00:00
LoopPass.cpp
Makefile
MemDepPrinter.cpp	Fix typo.	2013-12-04 23:55:09 +00:00
MemoryBuiltins.cpp
MemoryDependenceAnalysis.cpp	Fixing a heisenbug where the memory dependence analysis behaves differently	2013-11-14 01:10:52 +00:00
ModuleDebugInfoPrinter.cpp
NoAliasAnalysis.cpp
PHITransAddr.cpp	Correct word hyphenations	2013-12-05 05:44:44 +00:00
PostDominators.cpp
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp
RegionPass.cpp
RegionPrinter.cpp
ScalarEvolution.cpp	Annotate APInt methods where it's not clear whether they are in place with warn_unused_result.	2013-11-16 16:25:41 +00:00
ScalarEvolutionAliasAnalysis.cpp
ScalarEvolutionExpander.cpp	Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces	2013-12-07 21:20:17 +00:00
ScalarEvolutionNormalization.cpp
SparsePropagation.cpp
TargetTransformInfo.cpp
Trace.cpp
TypeBasedAliasAnalysis.cpp
ValueTracking.cpp	Don't speculate loads under ThreadSanitizer	2013-11-21 07:29:28 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
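
The equivalence claimed above can be checked with a small stand-alone
sketch. It assumes the usual add-recurrence semantics, where {a,+,b,+,c}
at iteration i (counting from 0) equals a + b*C(i,1) + c*C(i,2), and it
assumes the exit value is taken at iteration i = %n - 1, which is what the
expanded expression implies. The helper names are hypothetical and are not
part of LLVM.

```cpp
#include <cstdint>

// Hypothetical helper: value of {a,+,b,+,c} at iteration i (i starts at 0),
// i.e. a + b*C(i,1) + c*C(i,2). Unsigned arithmetic wraps, so only small
// inputs are meaningful here.
constexpr std::uint64_t evalQuadraticAddRec(std::uint64_t a, std::uint64_t b,
                                            std::uint64_t c, std::uint64_t i) {
  return a + b * i + c * (i * (i - 1) / 2);
}

// Hypothetical helper: the expanded form quoted above, written in plain
// 64-bit arithmetic. The product (n-2)*(n-1) can lose its top bit in 64 bits
// before the divide by 2, so this version is only valid for small n.
constexpr std::uint64_t exitValueExpanded(std::uint64_t n) {
  return (0 - std::uint64_t{2}) + 2 * (((n - 2) * (n - 1)) / 2) + 3 * n;
}

// For n = 5 the recurrence at i = n - 1 = 4, the expanded form, and n * n
// all agree.
static_assert(evalQuadraticAddRec(1, 3, 2, 4) == 25, "recurrence at i = 4");
static_assert(exitValueExpanded(5) == 25, "expanded form at n = 5");
static_assert(exitValueExpanded(5) == 5 * 5, "matches n * n");
```

The overflow caveat in exitValueExpanded is presumably why the expanded
expression widens to i65 before multiplying; (%n * %n) needs no such
widening.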

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
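
The fold is valid because truncation commutes with negation modulo 2^32,
so the first two terms cancel. A minimal sketch of that arithmetic follows.
The helper names are hypothetical, and undef is modeled as an arbitrary but
fixed value u, which is a simplification of LLVM's undef semantics.

```cpp
#include <cstdint>

// Hypothetical helper: models "trunc i64 %v to i32".
constexpr std::uint32_t trunc32(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

// The expression ScalarEvolution currently forms:
// trunc(-1 * x) + trunc(x) + (-1 * trunc(u)), in wrapping 32-bit arithmetic.
constexpr std::uint32_t unfolded(std::uint64_t x, std::uint64_t u) {
  return static_cast<std::uint32_t>(trunc32(0 - x) + trunc32(x) +
                                    (std::uint32_t{0} - trunc32(u)));
}

// The proposed folded form: -1 * trunc(u).
constexpr std::uint32_t folded(std::uint64_t u) {
  return static_cast<std::uint32_t>(std::uint32_t{0} - trunc32(u));
}

// The x terms cancel regardless of the value of x.
static_assert(unfolded(0x123456789ABCDEF0ULL, 7) == folded(7), "fold holds");
static_assert(unfolded(1, 0xFFFFFFFF00000001ULL) ==
                  folded(0xFFFFFFFF00000001ULL),
              "fold holds when u has high bits set");
```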

//===---------------------------------------------------------------------===//