llvm-6502/lib/Analysis
Jingyue Wu 7d4d116067 [SCEV] Apply NSW and NUW flags via poison value analysis
Summary:
Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.

There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.

This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:

  for (int i = 0; i < limit; ++i) {
    sum += ptr[i + offset];
  }

Here ptr is a 64-bit pointer and offset is a 32-bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is slower. Fixing this is the source of the 13% speed-up noted above.
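
To illustrate why a no-signed-wrap (NSW) flag matters for this loop, here is a minimal sketch. It is not code from the patch, and the function names are hypothetical. It assumes the 32-bit index i + offset is sign-extended to 64 bits before being used to address ptr. The extension can be distributed over the addition only when the 32-bit add does not wrap, which is presumably the fact that lets the address be expressed as a recurrence in i.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: add in 32 bits with wraparound, then sign-extend.
// The unsigned-to-signed cast is modular on mainstream compilers
// (and guaranteed to be so since C++20).
std::int64_t extendAfterAdd(std::int32_t i, std::int32_t offset) {
  std::uint32_t wrapped =
      static_cast<std::uint32_t>(i) + static_cast<std::uint32_t>(offset);
  return static_cast<std::int64_t>(static_cast<std::int32_t>(wrapped));
}

// Hypothetical helper: sign-extend each operand first, then add in 64 bits.
std::int64_t addAfterExtend(std::int32_t i, std::int32_t offset) {
  return static_cast<std::int64_t>(i) + static_cast<std::int64_t>(offset);
}

// The two forms agree exactly when the 32-bit signed add does not overflow.
bool formsAgree(std::int32_t i, std::int32_t offset) {
  return extendAfterAdd(i, offset) == addAfterExtend(i, offset);
}
```

For example, formsAgree(5, 7) holds, while formsAgree(INT32_MAX, 1) does not, because the 32-bit add wraps to a negative value before it is extended.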


There are more details in this discussion on llvmdev.
June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392

Patch by Bjarke Roune

Reviewers: eliben, atrick, sanjoy

Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits

Differential Revision: http://reviews.llvm.org/D11212

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-28 18:22:40 +00:00
IPA [GMR] Teach GlobalsModRef to distinguish an important and safe case of 2015-07-28 11:11:11 +00:00
AliasAnalysis.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
AliasAnalysisCounter.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
AliasAnalysisEvaluator.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
AliasDebugger.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
AliasSetTracker.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
Analysis.cpp Create a wrapper pass for BranchProbabilityInfo. 2015-07-15 22:48:29 +00:00
AssumptionCache.cpp
BasicAliasAnalysis.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
BlockFrequencyInfo.cpp Add new constructors for LoopInfo/DominatorTree/BFI/BPI 2015-07-16 23:23:35 +00:00
BlockFrequencyInfoImpl.cpp
BranchProbabilityInfo.cpp Create a wrapper pass for BranchProbabilityInfo. 2015-07-15 22:48:29 +00:00
CaptureTracking.cpp [CaptureTracking] Avoid long compilation time on large basic blocks 2015-06-24 17:53:17 +00:00
CFG.cpp [CaptureTracking] Avoid long compilation time on large basic blocks 2015-06-24 17:53:17 +00:00
CFGPrinter.cpp
CFLAliasAnalysis.cpp
CGSCCPassManager.cpp
CMakeLists.txt Move VectorUtils from Transforms to Analysis to correct layering violation 2015-06-26 18:02:52 +00:00
CodeMetrics.cpp
ConstantFolding.cpp Fix assert when inlining a constantexpr addrspacecast 2015-07-27 18:31:03 +00:00
CostModel.cpp Roll forward r243250 2015-07-26 19:10:03 +00:00
Delinearization.cpp Move delinearization from SCEVAddRecExpr to ScalarEvolution 2015-06-29 14:42:48 +00:00
DependenceAnalysis.cpp Move delinearization from SCEVAddRecExpr to ScalarEvolution 2015-06-29 14:42:48 +00:00
DivergenceAnalysis.cpp
DominanceFrontier.cpp
DomPrinter.cpp
InstCount.cpp
InstructionSimplify.cpp [InstSimplify] Teach InstSimplify how to simplify extractelement 2015-07-13 01:15:53 +00:00
Interval.cpp
IntervalPartition.cpp
IteratedDominanceFrontier.cpp
IVUsers.cpp [LSR] don't attempt to promote ephemeral values to indvars 2015-07-13 03:28:53 +00:00
LazyCallGraph.cpp
LazyValueInfo.cpp [LVI] Cleanup whitespaces. NFC 2015-07-28 15:53:21 +00:00
LibCallAliasAnalysis.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
LibCallSemantics.cpp
Lint.cpp
LLVMBuild.txt
Loads.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
LoopAccessAnalysis.cpp [LAA] Add clarifying comments for the checking pointer grouping algorithm. NFC 2015-07-28 13:44:08 +00:00
LoopInfo.cpp Add new constructors for LoopInfo/DominatorTree/BFI/BPI 2015-07-16 23:23:35 +00:00
LoopPass.cpp
Makefile
MemDepPrinter.cpp
MemDerefPrinter.cpp
MemoryBuiltins.cpp
MemoryDependenceAnalysis.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
MemoryLocation.cpp
ModuleDebugInfoPrinter.cpp
NoAliasAnalysis.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
PHITransAddr.cpp
PostDominators.cpp
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp
RegionPass.cpp
RegionPrinter.cpp
ScalarEvolution.cpp [SCEV] Apply NSW and NUW flags via poison value analysis 2015-07-28 18:22:40 +00:00
ScalarEvolutionAliasAnalysis.cpp
ScalarEvolutionExpander.cpp [LSR] canonicalize Prod*(1<<C) to Prod<<C 2015-06-24 19:28:40 +00:00
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
SparsePropagation.cpp
StratifiedSets.h
TargetLibraryInfo.cpp
TargetTransformInfo.cpp [TargetTransformInfo][NFCI] Add TargetTransformInfo::isZExtFree. 2015-07-27 23:27:43 +00:00
Trace.cpp
TypeBasedAliasAnalysis.cpp [PM/AA] Extract the ModRef enums from the AliasAnalysis class in 2015-07-22 23:15:57 +00:00
ValueTracking.cpp [SCEV] Apply NSW and NUW flags via poison value analysis 2015-07-28 18:22:40 +00:00
VectorUtils.cpp [InstSimplify] Teach InstSimplify how to simplify extractelement 2015-07-13 01:15:53 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
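
As a sanity check on the claimed simplification, the following sketch steps the
recurrence {1,+,3,+,2} directly and compares it with %n * %n. It assumes the
exit value corresponds to iteration %n - 1, which is what the expanded
expression above suggests. The function names are hypothetical, and the check
covers only small trip counts.

```cpp
#include <cassert>
#include <cstdint>

// Value of {1,+,3,+,2} at iteration `iter`, found by stepping the recurrence.
// Unsigned 64-bit arithmetic wraps modulo 2^64.
std::uint64_t chrecByStepping(std::uint64_t iter) {
  std::uint64_t value = 1;             // start value
  std::uint64_t step = 3;              // first-order step
  const std::uint64_t stepOfStep = 2;  // second-order step
  for (std::uint64_t i = 0; i < iter; ++i) {
    value += step;
    step += stepOfStep;
  }
  return value;
}

// The simple closed form suggested in the text.
std::uint64_t closedForm(std::uint64_t n) { return n * n; }

// The expanded expression, written with plain 64-bit arithmetic instead of
// i65. This sketch only exercises it for small n.
std::uint64_t expandedForm(std::uint64_t n) {
  return (0 - std::uint64_t{2}) + 2 * (((n - 2) * (n - 1)) / 2) + 3 * n;
}

// True if all three forms agree for every n in [1, maxN].
bool formsAgreeUpTo(std::uint64_t maxN) {
  for (std::uint64_t n = 1; n <= maxN; ++n) {
    std::uint64_t stepped = chrecByStepping(n - 1);
    if (stepped != closedForm(n) || stepped != expandedForm(n))
      return false;
  }
  return true;
}
```

The agreement follows from 1 + 3*i + 2*(i*(i-1)/2) = (i + 1)^2, evaluated at
i = %n - 1.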

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is
forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
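
The fold is valid because truncation commutes with wrapping negation and
addition, so the first two terms cancel for any %arg5. A minimal sketch of that
cancellation, with hypothetical function names and with the undef term left
out:

```cpp
#include <cassert>
#include <cstdint>

// Models `trunc i64 v to i32`: keep the low 32 bits.
std::uint32_t truncToI32(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

// Models (trunc i64 (-1 * x) to i32) + (trunc i64 x to i32),
// with the final add wrapping modulo 2^32.
std::uint32_t truncNegPlusTrunc(std::uint64_t x) {
  std::uint64_t negX = 0 - x;  // -1 * x, wrapping modulo 2^64
  return static_cast<std::uint32_t>(truncToI32(negX) + truncToI32(x));
}
```

truncNegPlusTrunc returns 0 for every input, which leaves only the
(-1 * (trunc i64 undef to i32)) term.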

//===---------------------------------------------------------------------===//