llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 20:29:48 +00:00

Hao Liu 43be1d53d1	2015-06-08 06:39:56 +00:00

[LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.

Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g.

  for (i = 0; i < N; i+=2) {
    a = A[i];         // load of even element
    b = A[i+1];       // load of odd element
    ...               // operations on a, b, c, d
    A[i] = c;         // store of even element
    A[i+1] = d;       // store of odd element
  }

The loads of even and odd elements are identified as an interleave load group, which will be transformed into vectorized IR like:

  %wide.vec = load <8 x i32>, <8 x i32>* %ptr
  %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>

The stores of even and odd elements are identified as an interleave store group, which will be transformed into vectorized IR like:

  %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
  store <8 x i32> %interleaved.vec, <8 x i32>* %ptr

This optimization is currently disabled by default. To try it, add '-enable-interleaved-mem-accesses=true'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239291 91177308-0d34-0410-b5e6-96231b3b80d8
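The shuffle masks in the commit message above can be checked with a small model. The following is only a sketch in Python, not LLVM code: `shufflevector`, `load_interleaved_pair`, and `store_interleaved_pair` are hypothetical helper names, vectors are modeled as plain lists, and `undef` lanes are modeled as `None`.

```python
UNDEF = None  # stand-in for an LLVM undef lane (assumption of this sketch)


def shufflevector(v1, v2, mask):
    """Model of shufflevector: pick lanes from the concatenation of v1 and v2."""
    combined = list(v1) + list(v2)
    return [combined[i] for i in mask]


def load_interleaved_pair(memory, start, vf=4):
    """One wide load of 2*vf elements, split into even and odd lanes."""
    wide = memory[start:start + 2 * vf]
    undef = [UNDEF] * len(wide)
    even = shufflevector(wide, undef, range(0, 2 * vf, 2))  # <0, 2, 4, 6> for vf=4
    odd = shufflevector(wide, undef, range(1, 2 * vf, 2))   # <1, 3, 5, 7> for vf=4
    return even, odd


def store_interleaved_pair(memory, start, even, odd):
    """Re-interleave two vf-wide vectors and write them with one wide store."""
    vf = len(even)
    # For vf=4 this builds the mask <0, 4, 1, 5, 2, 6, 3, 7>.
    mask = [lane + half * vf for lane in range(vf) for half in (0, 1)]
    memory[start:start + 2 * vf] = shufflevector(even, odd, mask)


if __name__ == "__main__":
    A = list(range(100, 108))
    a, b = load_interleaved_pair(A, 0)
    c = [x + y for x, y in zip(a, b)]  # arbitrary "operations on a, b"
    d = [x - y for x, y in zip(a, b)]
    store_interleaved_pair(A, 0, c, d)
    print(A)
```

The point of the model is that two strided scalar accesses per iteration become one contiguous wide access plus lane shuffles, which is the grouping the commit describes.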
IPA	[inliner] Fix the early-exit of the inline cost analysis to correctly	2015-05-27 02:49:05 +00:00
AliasAnalysis.cpp	[PM/AA] Start refactoring AliasAnalysis to remove the analysis group and	2015-06-04 02:03:15 +00:00
AliasAnalysisCounter.cpp	Use 'override/final' instead of 'virtual' for overridden methods	2015-04-11 02:11:45 +00:00
AliasAnalysisEvaluator.cpp	[PM/AA] Start refactoring AliasAnalysis to remove the analysis group and	2015-06-04 02:03:15 +00:00
AliasDebugger.cpp	Make DataLayout Non-Optional in the Module	2015-03-04 18:43:29 +00:00
AliasSetTracker.cpp	Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types	2015-05-29 19:43:39 +00:00
Analysis.cpp	Divergence analysis for GPU programs	2015-04-10 05:03:50 +00:00
AssumptionCache.cpp	[PM] Actually add the new pass manager support for the assumption cache.	2015-01-22 21:53:09 +00:00
BasicAliasAnalysis.cpp	Revert r236894 "[BasicAA] Fix zext & sext handling"	2015-05-22 01:27:37 +00:00
BlockFrequencyInfo.cpp	Remove superfluous .str() and replace std::string concatenation with Twine.	2015-03-27 17:51:30 +00:00
BlockFrequencyInfoImpl.cpp	Remove 4,096 loop scale limitation.	2015-04-01 17:42:27 +00:00
BranchProbabilityInfo.cpp	Add BranchProbabilityInfo::releaseMemory to clear the Weights field.	2015-05-28 19:43:06 +00:00
CaptureTracking.cpp	[cleanup] Re-sort all the #include lines in LLVM using	2015-01-14 11:23:27 +00:00
CFG.cpp	Standardize {pred,succ,use,user}_empty()	2015-01-13 03:46:47 +00:00
CFGPrinter.cpp	Remove superfluous .str() and replace std::string concatenation with Twine.	2015-03-27 17:51:30 +00:00
CFLAliasAnalysis.cpp	Convert PHI getIncomingValue() to foreach over incoming_values(). NFC.	2015-05-12 20:05:31 +00:00
CGSCCPassManager.cpp	[PM] Remove the defunt CGSCC-specific debug flag.	2015-01-13 22:45:13 +00:00
CMakeLists.txt	[PM/AA] Start refactoring AliasAnalysis to remove the analysis group and	2015-06-04 02:03:15 +00:00
CodeMetrics.cpp	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.	2015-03-23 19:32:43 +00:00
ConstantFolding.cpp	[ConstantFolding] Fix wrong folding of intrinsic 'convert.from.fp16'.	2015-05-14 18:01:48 +00:00
CostModel.cpp	[multiversion] Thread a function argument through all the callers of the	2015-02-01 12:01:35 +00:00
Delinearization.cpp	[PM] Split the LoopInfo object apart from the legacy pass, creating	2015-01-17 14:16:18 +00:00
DependenceAnalysis.cpp	[DependenceAnalysis] Extend unifySubscriptType for handling coupled subscript groups.	2015-05-29 16:58:08 +00:00
DivergenceAnalysis.cpp	Divergence analysis for GPU programs	2015-04-10 05:03:50 +00:00
DominanceFrontier.cpp
DomPrinter.cpp
InstCount.cpp
InstructionSimplify.cpp	[InstCombine, InstSimplify] Move xforms from Combine to Simplify	2015-06-06 22:40:21 +00:00
Interval.cpp
IntervalPartition.cpp
IteratedDominanceFrontier.cpp	Move IDF Calculation to a separate file, expose an interface to it.	2015-04-21 19:13:02 +00:00
IVUsers.cpp	DataLayout is mandatory, update the API to reflect it with references.	2015-03-10 02:37:25 +00:00
LazyCallGraph.cpp	Revert r225854: [PM] Move the LazyCallGraph printing functionality to	2015-01-14 00:27:45 +00:00
LazyValueInfo.cpp	[ConstantRange] Split makeICmpRegion in two.	2015-03-18 00:41:24 +00:00
LibCallAliasAnalysis.cpp	Make DataLayout Non-Optional in the Module	2015-03-04 18:43:29 +00:00
LibCallSemantics.cpp	[WinEH] Start EH preparation for 32-bit x86, it uses no arguments	2015-04-29 22:49:54 +00:00
Lint.cpp	Fix doxygen comments from r232268	2015-03-16 17:49:03 +00:00
LLVMBuild.txt	Update libdeps since TLI was moved from Target to Analysis in r226078.	2015-01-15 05:21:00 +00:00
Loads.cpp	DataLayout is mandatory, update the API to reflect it with references.	2015-03-10 02:37:25 +00:00
LoopAccessAnalysis.cpp	[LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.	2015-06-08 06:39:56 +00:00
LoopInfo.cpp	Add llvm::all_of which wraps std::all_of.	2015-05-13 22:19:13 +00:00
LoopPass.cpp	Purge unused includes throughout libSupport.	2015-03-23 18:07:13 +00:00
Makefile
MemDepPrinter.cpp	[CallSite] Make construction from Value* (or Instruction*) explicit.	2015-04-10 14:50:08 +00:00
MemDerefPrinter.cpp	Test commit. Fix typo in MemDerefPrinter.cpp comment.	2015-05-21 11:57:38 +00:00
MemoryBuiltins.cpp	DataLayout is mandatory, update the API to reflect it with references.	2015-03-10 02:37:25 +00:00
MemoryDependenceAnalysis.cpp	[PM/AA] Start refactoring AliasAnalysis to remove the analysis group and	2015-06-04 02:03:15 +00:00
MemoryLocation.cpp	[PM/AA] Start refactoring AliasAnalysis to remove the analysis group and	2015-06-04 02:03:15 +00:00
ModuleDebugInfoPrinter.cpp	IR: Give 'DI' prefix to debug info metadata	2015-04-29 16:38:44 +00:00
NoAliasAnalysis.cpp	Make DataLayout Non-Optional in the Module	2015-03-04 18:43:29 +00:00
PHITransAddr.cpp	[PHITransAddr] Don't translate unreachable values	2015-06-01 00:15:08 +00:00
PostDominators.cpp
PtrUseVisitor.cpp	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>	2014-11-19 07:49:26 +00:00
README.txt
RegionInfo.cpp	[cleanup] Re-sort all the #include lines in LLVM using	2015-01-14 11:23:27 +00:00
RegionPass.cpp	Change range-based for-loops to be -Wrange-loop-analysis clean.	2015-04-15 01:21:15 +00:00
RegionPrinter.cpp	One more -Wrange-loop-analysis cleanup.	2015-04-15 21:40:50 +00:00
ScalarEvolution.cpp	[ScalarEvolution] refactor: extract interface getGEPExpr	2015-05-18 17:03:25 +00:00
ScalarEvolutionAliasAnalysis.cpp	Make DataLayout Non-Optional in the Module	2015-03-04 18:43:29 +00:00
ScalarEvolutionExpander.cpp	Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types	2015-05-29 19:43:39 +00:00
ScalarEvolutionNormalization.cpp	Fix typos in comments, NFC	2014-08-29 21:53:01 +00:00
ScopedNoAliasAA.cpp	Make DataLayout Non-Optional in the Module	2015-03-04 18:43:29 +00:00
SparsePropagation.cpp
StratifiedSets.h	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>	2014-11-19 07:49:26 +00:00
TargetLibraryInfo.cpp	Populate list of vectorizable functions for Accelerate library.	2015-05-07 17:11:51 +00:00
TargetTransformInfo.cpp	[LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.	2015-06-08 06:39:56 +00:00
Trace.cpp
TypeBasedAliasAnalysis.cpp	Teach TBAA analysis to report errors on cyclic TBAA metadata rather than hanging.	2015-03-13 07:09:33 +00:00
ValueTracking.cpp	Reapply r237539 with a fix for the Chromium build.	2015-05-20 18:41:25 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
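As a sanity check on the claim above, the following Python sketch compares the two forms numerically. It is an illustration, not ScalarEvolution's implementation: the helper names are hypothetical, i64 and i65 arithmetic is modeled with bit masks, and the exit is assumed to happen after %n - 1 trips around the loop, which is the iteration count at which {1,+,3,+,2} equals %n * %n.

```python
MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


def addrec_at(operands, iteration):
    """Value of the chain of recurrences {a,+,b,+,c,...} after `iteration` steps.

    Each step adds every operand's right-hand neighbour into it, with i64 wraparound.
    """
    vals = list(operands)
    for _ in range(iteration):
        for k in range(len(vals) - 1):
            vals[k] = (vals[k] + vals[k + 1]) & MASK64
    return vals[0]


def expanded_exit_value(n):
    """The expression ScalarEvolution currently produces, modeled bit-exactly."""
    a = (n - 2) & MASK64         # (-2 + %n) as i64, then zext to i65
    b = (n - 1) & MASK64         # (-1 + %n) as i64, then zext to i65
    prod = (a * b) & MASK65      # multiply in i65
    half = (prod // 2) & MASK64  # /u 2, then trunc to i64
    return (-2 + 2 * half + 3 * n) & MASK64


def simple_exit_value(n):
    """The form the README suggests: %n * %n in i64."""
    return (n * n) & MASK64


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 1000):
        print(n, addrec_at([1, 3, 2], n - 1), expanded_exit_value(n), simple_exit_value(n))
```

The recurrence takes the values 1, 4, 9, 16, ..., so after %n - 1 steps it is %n squared; the expanded form agrees modulo 2^64 but needs the i65 multiply and divide to get there.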

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is
forming this expression:

  ((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

  (-1 * (trunc i64 undef to i32))
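The fold works because truncation commutes with negation in modular arithmetic, so the two %arg5 terms cancel. A minimal Python sketch of that check follows, assuming hypothetical helper names and modeling `undef` as an arbitrary but fixed 64-bit value `u` on both sides (real undef semantics are looser, which only makes the fold easier to justify).

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc_i64_to_i32(v):
    """Keep the low 32 bits of an i64 value."""
    return v & MASK32


def unfolded(arg5, u):
    """The expression ScalarEvolution forms, evaluated with i64/i32 wraparound."""
    neg_arg5 = (-arg5) & MASK64  # (-1 * %arg5) in i64
    total = trunc_i64_to_i32(neg_arg5) + trunc_i64_to_i32(arg5) - trunc_i64_to_i32(u)
    return total & MASK32


def folded(u):
    """The simplified form: (-1 * (trunc i64 u to i32))."""
    return (-trunc_i64_to_i32(u)) & MASK32


if __name__ == "__main__":
    for arg5 in (0, 1, 0xFFFFFFFF, 0x123456789ABCDEF0, MASK64):
        for u in (0, 7, 0xDEADBEEFCAFEF00D):
            assert unfolded(arg5, u) == folded(u)
    print("fold holds for the sampled values")
```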

//===---------------------------------------------------------------------===//