llvm-6502/lib/Analysis
Hao Liu 43be1d53d1 [LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g.
     for (i = 0; i < N; i+=2) {
       a = A[i];         // load of even element
       b = A[i+1];       // load of odd element
       ...               // operations on a, b, c, d
       A[i] = c;         // store of even element
       A[i+1] = d;       // store of odd element
     }

  The loads of even and odd elements are identified as an interleaved load group, which is transformed into vectorized IR like:
     %wide.vec = load <8 x i32>, <8 x i32>* %ptr
     %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
     %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>

  The stores of even and odd elements are identified as an interleaved store group, which is transformed into vectorized IR like:
     %interleaved.vec = shufflevector <4 x i32> %vec.even, <4 x i32> %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
     store <8 x i32> %interleaved.vec, <8 x i32>* %ptr

This optimization is currently disabled by default. To try it, add '-enable-interleaved-mem-accesses=true'.
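
The lane selection done by the shufflevector masks above can be modeled in plain C++. This is only a sketch of what the masks compute for a factor-2 group with four elements per lane; the helper names extractStrided and interleave are hypothetical and are not part of the vectorizer.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers that model the shuffle masks only; they are not
// the vectorizer's implementation.

// De-interleave: select every second lane beginning at Start, i.e. the
// mask <Start, Start+2, Start+4, Start+6> applied to the wide vector.
inline std::array<int32_t, 4>
extractStrided(const std::array<int32_t, 8> &Wide, std::size_t Start) {
  std::array<int32_t, 4> Out{};
  for (std::size_t I = 0; I < 4; ++I)
    Out[I] = Wide[Start + 2 * I];
  return Out;
}

// Re-interleave: the mask <0, 4, 1, 5, 2, 6, 3, 7> applied to the
// concatenation of Even and Odd alternates lanes from the two inputs.
inline std::array<int32_t, 8>
interleave(const std::array<int32_t, 4> &Even,
           const std::array<int32_t, 4> &Odd) {
  std::array<int32_t, 8> Out{};
  for (std::size_t I = 0; I < 4; ++I) {
    Out[2 * I] = Even[I];
    Out[2 * I + 1] = Odd[I];
  }
  return Out;
}
```

Calling extractStrided(Wide, 0) and extractStrided(Wide, 1) corresponds to %vec.even and %vec.odd, and interleave applied to those two results reproduces the original wide vector.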



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239291 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 06:39:56 +00:00
..
IPA [inliner] Fix the early-exit of the inline cost analysis to correctly 2015-05-27 02:49:05 +00:00
AliasAnalysis.cpp [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and 2015-06-04 02:03:15 +00:00
AliasAnalysisCounter.cpp Use 'override/final' instead of 'virtual' for overridden methods 2015-04-11 02:11:45 +00:00
AliasAnalysisEvaluator.cpp [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and 2015-06-04 02:03:15 +00:00
AliasDebugger.cpp Make DataLayout Non-Optional in the Module 2015-03-04 18:43:29 +00:00
AliasSetTracker.cpp Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types 2015-05-29 19:43:39 +00:00
Analysis.cpp Divergence analysis for GPU programs 2015-04-10 05:03:50 +00:00
AssumptionCache.cpp [PM] Actually add the new pass manager support for the assumption cache. 2015-01-22 21:53:09 +00:00
BasicAliasAnalysis.cpp Revert r236894 "[BasicAA] Fix zext & sext handling" 2015-05-22 01:27:37 +00:00
BlockFrequencyInfo.cpp Remove superfluous .str() and replace std::string concatenation with Twine. 2015-03-27 17:51:30 +00:00
BlockFrequencyInfoImpl.cpp Remove 4,096 loop scale limitation. 2015-04-01 17:42:27 +00:00
BranchProbabilityInfo.cpp Add BranchProbabilityInfo::releaseMemory to clear the Weights field. 2015-05-28 19:43:06 +00:00
CaptureTracking.cpp [cleanup] Re-sort all the #include lines in LLVM using 2015-01-14 11:23:27 +00:00
CFG.cpp Standardize {pred,succ,use,user}_empty() 2015-01-13 03:46:47 +00:00
CFGPrinter.cpp Remove superfluous .str() and replace std::string concatenation with Twine. 2015-03-27 17:51:30 +00:00
CFLAliasAnalysis.cpp Convert PHI getIncomingValue() to foreach over incoming_values(). NFC. 2015-05-12 20:05:31 +00:00
CGSCCPassManager.cpp [PM] Remove the defunt CGSCC-specific debug flag. 2015-01-13 22:45:13 +00:00
CMakeLists.txt [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and 2015-06-04 02:03:15 +00:00
CodeMetrics.cpp Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. 2015-03-23 19:32:43 +00:00
ConstantFolding.cpp [ConstantFolding] Fix wrong folding of intrinsic 'convert.from.fp16'. 2015-05-14 18:01:48 +00:00
CostModel.cpp [multiversion] Thread a function argument through all the callers of the 2015-02-01 12:01:35 +00:00
Delinearization.cpp [PM] Split the LoopInfo object apart from the legacy pass, creating 2015-01-17 14:16:18 +00:00
DependenceAnalysis.cpp [DependenceAnalysis] Extend unifySubscriptType for handling coupled subscript groups. 2015-05-29 16:58:08 +00:00
DivergenceAnalysis.cpp Divergence analysis for GPU programs 2015-04-10 05:03:50 +00:00
DominanceFrontier.cpp
DomPrinter.cpp
InstCount.cpp
InstructionSimplify.cpp [InstCombine, InstSimplify] Move xforms from Combine to Simplify 2015-06-06 22:40:21 +00:00
Interval.cpp
IntervalPartition.cpp
IteratedDominanceFrontier.cpp Move IDF Calculation to a separate file, expose an interface to it. 2015-04-21 19:13:02 +00:00
IVUsers.cpp DataLayout is mandatory, update the API to reflect it with references. 2015-03-10 02:37:25 +00:00
LazyCallGraph.cpp Revert r225854: [PM] Move the LazyCallGraph printing functionality to 2015-01-14 00:27:45 +00:00
LazyValueInfo.cpp [ConstantRange] Split makeICmpRegion in two. 2015-03-18 00:41:24 +00:00
LibCallAliasAnalysis.cpp Make DataLayout Non-Optional in the Module 2015-03-04 18:43:29 +00:00
LibCallSemantics.cpp [WinEH] Start EH preparation for 32-bit x86, it uses no arguments 2015-04-29 22:49:54 +00:00
Lint.cpp Fix doxygen comments from r232268 2015-03-16 17:49:03 +00:00
LLVMBuild.txt Update libdeps since TLI was moved from Target to Analysis in r226078. 2015-01-15 05:21:00 +00:00
Loads.cpp DataLayout is mandatory, update the API to reflect it with references. 2015-03-10 02:37:25 +00:00
LoopAccessAnalysis.cpp [LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses. 2015-06-08 06:39:56 +00:00
LoopInfo.cpp Add llvm::all_of which wraps std::all_of. 2015-05-13 22:19:13 +00:00
LoopPass.cpp Purge unused includes throughout libSupport. 2015-03-23 18:07:13 +00:00
Makefile
MemDepPrinter.cpp [CallSite] Make construction from Value* (or Instruction*) explicit. 2015-04-10 14:50:08 +00:00
MemDerefPrinter.cpp Test commit. Fix typo in MemDerefPrinter.cpp comment. 2015-05-21 11:57:38 +00:00
MemoryBuiltins.cpp DataLayout is mandatory, update the API to reflect it with references. 2015-03-10 02:37:25 +00:00
MemoryDependenceAnalysis.cpp [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and 2015-06-04 02:03:15 +00:00
MemoryLocation.cpp [PM/AA] Start refactoring AliasAnalysis to remove the analysis group and 2015-06-04 02:03:15 +00:00
ModuleDebugInfoPrinter.cpp IR: Give 'DI' prefix to debug info metadata 2015-04-29 16:38:44 +00:00
NoAliasAnalysis.cpp Make DataLayout Non-Optional in the Module 2015-03-04 18:43:29 +00:00
PHITransAddr.cpp [PHITransAddr] Don't translate unreachable values 2015-06-01 00:15:08 +00:00
PostDominators.cpp
PtrUseVisitor.cpp Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> 2014-11-19 07:49:26 +00:00
README.txt
RegionInfo.cpp [cleanup] Re-sort all the #include lines in LLVM using 2015-01-14 11:23:27 +00:00
RegionPass.cpp Change range-based for-loops to be -Wrange-loop-analysis clean. 2015-04-15 01:21:15 +00:00
RegionPrinter.cpp One more -Wrange-loop-analysis cleanup. 2015-04-15 21:40:50 +00:00
ScalarEvolution.cpp [ScalarEvolution] refactor: extract interface getGEPExpr 2015-05-18 17:03:25 +00:00
ScalarEvolutionAliasAnalysis.cpp Make DataLayout Non-Optional in the Module 2015-03-04 18:43:29 +00:00
ScalarEvolutionExpander.cpp Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types 2015-05-29 19:43:39 +00:00
ScalarEvolutionNormalization.cpp Fix typos in comments, NFC 2014-08-29 21:53:01 +00:00
ScopedNoAliasAA.cpp Make DataLayout Non-Optional in the Module 2015-03-04 18:43:29 +00:00
SparsePropagation.cpp
StratifiedSets.h Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> 2014-11-19 07:49:26 +00:00
TargetLibraryInfo.cpp Populate list of vectorizable functions for Accelerate library. 2015-05-07 17:11:51 +00:00
TargetTransformInfo.cpp [LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses. 2015-06-08 06:39:56 +00:00
Trace.cpp
TypeBasedAliasAnalysis.cpp Teach TBAA analysis to report errors on cyclic TBAA metadata rather than hanging. 2015-03-13 07:09:33 +00:00
ValueTracking.cpp Reapply r237539 with a fix for the Chromium build. 2015-05-20 18:41:25 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
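
As a sanity check, the equivalence of the two forms can be sketched in C++. The function names are hypothetical, and the sketch assumes the exit value corresponds to the recurrence evaluated at iteration %n - 1, which is the reading under which the expressions above agree. It uses 64-bit unsigned arithmetic and is only meaningful for small n; for large n the product (n-2)*(n-1) no longer fits in 64 bits before the division, which is what the i65 arithmetic guards against.

```cpp
#include <cstdint>

// Step the chain of recurrences {1,+,3,+,2} forward Iter times: the
// value starts at 1, its increment starts at 3, and the increment
// itself grows by 2 on every iteration.
inline uint64_t evalAddRec(uint64_t Iter) {
  uint64_t Value = 1, Step = 3;
  for (uint64_t I = 0; I < Iter; ++I) {
    Value += Step;
    Step += 2;
  }
  return Value;
}

// The shape of the expression ScalarEvolution currently produces,
// written with 64-bit wrapping arithmetic. Valid here only while
// (N - 2) * (N - 1) does not overflow (assumption: small N).
inline uint64_t expandedExitValue(uint64_t N) {
  return 3 * N + 2 * (((N - 2) * (N - 1)) / 2) - 2;
}
```

For small n, both evalAddRec(n - 1) and expandedExitValue(n) equal n * n, which follows from 1 + 3k + k(k-1) = (k+1)^2 with k = n - 1.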

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is
forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
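
The fold is valid because truncation commutes with negation in modular arithmetic, so the two %arg5 terms cancel. A minimal C++ sketch follows, with hypothetical function names and with undef modeled as one arbitrary fixed value U (an assumption made only so the expression can be evaluated; it appears once in the expression):

```cpp
#include <cstdint>

// Model of 'trunc i64 ... to i32': keep the low 32 bits.
inline uint32_t trunc32(uint64_t V) { return static_cast<uint32_t>(V); }

// The expression as ScalarEvolution forms it, using wrapping unsigned
// arithmetic. U stands in for the undef operand (assumption).
inline uint32_t unfoldedSum(uint64_t Arg5, uint64_t U) {
  return trunc32(0 - Arg5) + trunc32(Arg5) + (0u - trunc32(U));
}

// The folded form: the two Arg5 terms cancel modulo 2^32.
inline uint32_t foldedSum(uint64_t U) { return 0u - trunc32(U); }
```

For any Arg5 and U, unfoldedSum(Arg5, U) equals foldedSum(U), since trunc32(0 - Arg5) + trunc32(Arg5) is 0 modulo 2^32.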

//===---------------------------------------------------------------------===//