llvm-6502/lib/Analysis
Arnold Schwaighofer 2ced33808e SCEVExpander: Try hard not to create derived induction variables in other loops
During LSR of one loop we can run into a situation where we have to expand the
start of a recurrence of a loop induction variable in this loop. This start
value is derived from the induction variable of a preceding loop. SCEV
has canonicalized this value to a different recurrence than the recurrence of
the preceding loop's induction variable (the type and/or step direction has
changed). When we come to instantiate this SCEV we would create a second
induction variable in this preceding loop.  This patch tries to base such
derived induction variables on the preceding loop's induction variable.

This helps twolf on arm and seems to help scimark2 on x86.

Reapply with a fix for the case of a value derived from a pointer.

radar://15970709

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201496 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-16 15:49:50 +00:00
IPA GlobalsModRef: Unify and clean up duplicated pointer analysis code. 2014-02-10 14:17:30 +00:00
AliasAnalysis.cpp [cleanup] Move the Dominators.h and Verifier.h headers into the IR 2014-01-13 09:26:24 +00:00
AliasAnalysisCounter.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
AliasAnalysisEvaluator.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
AliasDebugger.cpp
AliasSetTracker.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
Analysis.cpp [PM] Make the verifier work independently of any pass manager. 2014-01-19 02:22:18 +00:00
BasicAliasAnalysis.cpp Fix known typos 2014-01-24 17:20:08 +00:00
BlockFrequencyInfo.cpp BlockFrequencyInfo: Readded getEntryFreq. 2013-12-20 22:11:11 +00:00
BranchProbabilityInfo.cpp
CaptureTracking.cpp Make nocapture analysis work with addrspacecast 2014-01-14 19:11:52 +00:00
CFG.cpp Make succ_iterator a real random access iterator and clean up a couple of users. 2014-02-10 14:17:42 +00:00
CFGPrinter.cpp
CMakeLists.txt [PM] Add a new "lazy" call graph analysis pass for the new pass manager. 2014-02-06 04:37:03 +00:00
CodeMetrics.cpp
ConstantFolding.cpp
CostModel.cpp Reduce code duplication resulting from the ConstantVector/ConstantDataVector split. 2014-02-13 16:48:38 +00:00
Delinearization.cpp Re-sort all of the includes with ./utils/sort_includes.py so that 2014-01-07 11:48:04 +00:00
DependenceAnalysis.cpp Fix known typos 2014-01-24 17:20:08 +00:00
DominanceFrontier.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
DomPrinter.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
InstCount.cpp
InstructionSimplify.cpp InstSimplify: Make shift, select and GEP simplifications vector-aware. 2014-01-24 17:09:53 +00:00
Interval.cpp
IntervalPartition.cpp
IVUsers.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
LazyCallGraph.cpp [PM] Fix horrible typos that somehow didn't cause a failure in a C++11 2014-02-06 05:17:02 +00:00
LazyValueInfo.cpp
LibCallAliasAnalysis.cpp
LibCallSemantics.cpp
Lint.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
LLVMBuild.txt
Loads.cpp
LoopInfo.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
LoopPass.cpp Disable most IR-level transform passes on functions marked 'optnone'. 2014-02-06 00:07:05 +00:00
Makefile
MemDepPrinter.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
MemoryBuiltins.cpp Update optimization passes to handle inalloca arguments 2014-01-28 02:38:36 +00:00
MemoryDependenceAnalysis.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
ModuleDebugInfoPrinter.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
NoAliasAnalysis.cpp
PHITransAddr.cpp [cleanup] Move the Dominators.h and Verifier.h headers into the IR 2014-01-13 09:26:24 +00:00
PostDominators.cpp [PM] Pull the generic graph algorithms and data structures for dominator 2014-01-13 10:52:56 +00:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp [PM] Split DominatorTree into a concrete analysis result object which 2014-01-13 13:07:17 +00:00
RegionPass.cpp
RegionPrinter.cpp
ScalarEvolution.cpp SCEV: Cast switched values to make -Wswitch more useful. 2014-02-11 19:02:55 +00:00
ScalarEvolutionAliasAnalysis.cpp
ScalarEvolutionExpander.cpp SCEVExpander: Try hard not to create derived induction variables in other loops 2014-02-16 15:49:50 +00:00
ScalarEvolutionNormalization.cpp [cleanup] Move the Dominators.h and Verifier.h headers into the IR 2014-01-13 09:26:24 +00:00
SparsePropagation.cpp
TargetTransformInfo.cpp Make succ_iterator a real random access iterator and clean up a couple of users. 2014-02-10 14:17:42 +00:00
Trace.cpp Put the functionality for printing a value to a raw_ostream as an 2014-01-09 02:29:41 +00:00
TypeBasedAliasAnalysis.cpp
ValueTracking.cpp Allow speculating llvm.sqrt, fma and fmuladd 2014-01-31 00:09:00 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
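
As a rough sanity check of the claim above, the following Python sketch
evaluates the add recurrence with binomial coefficients and compares the
expanded expression against (%n * %n) under 64-bit wraparound. The function
names are hypothetical, and the assumption that the exit value corresponds to
iteration %n - 1 is inferred from the shape of the expanded expression, not
taken from the test file.

```python
from math import comb

MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


def chrec_at(coeffs, i):
    """Value of the add recurrence {c0,+,c1,+,c2,...} at iteration i.

    Uses the standard closed form: sum of c_k * C(i, k).
    """
    return sum(c * comb(i, k) for k, c in enumerate(coeffs))


def scev_exit_value(n):
    """Mirror the expanded expression, modelling i64 wraparound and the i65 product."""
    a = (n - 2) & MASK64               # zext i64 (-2 + %n) to i65
    b = (n - 1) & MASK64               # zext i64 (-1 + %n) to i65
    half = ((a * b) & MASK65) // 2     # i65 multiply, then /u 2
    half &= MASK64                     # trunc i65 ... to i64
    return (-2 + 2 * half + 3 * n) & MASK64


def simple_exit_value(n):
    """The simpler form suggested in the text: %n * %n in i64."""
    return (n * n) & MASK64


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 2**32, 2**64 - 1):
        assert scev_exit_value(n) == simple_exit_value(n)
    # Assumed exit iteration: n - 1.
    assert chrec_at([1, 3, 2], 9) == 10 * 10
```

Under this model {1,+,3,+,2} at iteration i is 1 + 3i + i(i-1) = (i+1)^2,
so the two forms agree; the sketch says nothing about how ScalarEvolution
itself would have to be changed to produce the simpler form.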

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
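
The fold relies on truncation commuting with negation in modular arithmetic,
so the two %arg5 terms cancel. A minimal Python sketch of that identity
follows; the helper names are hypothetical, and undef is modelled as an
arbitrary fixed value u, which is a simplifying assumption rather than the
full LLVM semantics of undef.

```python
MASK32 = (1 << 32) - 1


def trunc32(x):
    """Model 'trunc i64 x to i32' by keeping the low 32 bits."""
    return x & MASK32


def unfolded(arg5, u):
    """The expression as ScalarEvolution currently forms it, in i32 arithmetic."""
    return (trunc32(-1 * arg5) + trunc32(arg5) + (-1 * trunc32(u))) & MASK32


def folded(u):
    """The proposed folded form: (-1 * (trunc i64 u to i32))."""
    return (-1 * trunc32(u)) & MASK32


if __name__ == "__main__":
    for arg5 in (0, 1, 2**31, 2**40 + 7, 2**64 - 1):
        for u in (0, 5, 2**33 + 9):
            assert unfolded(arg5, u) == folded(u)
```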

//===---------------------------------------------------------------------===//