mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2024-12-14 11:32:34 +00:00
43be1d53d1
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g. for (i = 0; i < N; i+=2) { a = A[i]; // load of even element b = A[i+1]; // load of odd element ... // operations on a, b, c, d A[i] = c; // store of even element A[i+1] = d; // store of odd element } The loads of even and odd elements are identified as an interleave load group, which will be transformed into vectorized IR like: %wide.vec = load <8 x i32>, <8 x i32>* %ptr %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6> %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7> The stores of even and odd elements are identified as an interleave store group, which will be transformed into vectorized IR like: %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> store <8 x i32> %interleaved.vec, <8 x i32>* %ptr This optimization is currently disabled by default. To try it, add '-enable-interleaved-mem-accesses=true'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239291 91177308-0d34-0410-b5e6-96231b3b80d8 |
||
---|---|---|
.. | ||
ADCE | ||
AddDiscriminators | ||
AlignmentFromAssumptions | ||
ArgumentPromotion | ||
AtomicExpand/ARM | ||
BBVectorize | ||
BDCE | ||
BranchFolding | ||
CodeExtractor | ||
CodeGenPrepare | ||
ConstantHoisting | ||
ConstantMerge | ||
ConstProp | ||
CorrelatedValuePropagation | ||
DeadArgElim | ||
DeadStoreElimination | ||
EarlyCSE | ||
Float2Int | ||
FunctionAttrs | ||
GCOVProfiling | ||
GlobalDCE | ||
GlobalOpt | ||
GVN | ||
IndVarSimplify | ||
Inline | ||
InstCombine | ||
InstMerge | ||
InstSimplify | ||
Internalize | ||
IPConstantProp | ||
IRCE | ||
JumpThreading | ||
LCSSA | ||
LICM | ||
LoadCombine | ||
LoopDeletion | ||
LoopDistribute | ||
LoopIdiom | ||
LoopInterchange | ||
LoopReroll | ||
LoopRotate | ||
LoopSimplify | ||
LoopStrengthReduce | ||
LoopUnroll | ||
LoopUnswitch | ||
LoopVectorize | ||
LowerAtomic | ||
LowerBitSets | ||
LowerExpectIntrinsic | ||
LowerInvoke | ||
LowerSwitch | ||
Mem2Reg | ||
MemCpyOpt | ||
MergeFunc | ||
MetaRenamer | ||
NaryReassociate | ||
ObjCARC | ||
PartiallyInlineLibCalls | ||
PhaseOrdering | ||
PlaceSafepoints | ||
PruneEH | ||
Reassociate | ||
Reg2Mem | ||
RewriteStatepointsForGC | ||
SampleProfile | ||
Scalarizer | ||
ScalarRepl | ||
SCCP | ||
SeparateConstOffsetFromGEP | ||
SimplifyCFG | ||
Sink | ||
SLPVectorizer | ||
SpeculativeExecution | ||
SROA | ||
StraightLineStrengthReduce | ||
StripSymbols | ||
StructurizeCFG | ||
TailCallElim | ||
TailDup | ||
Util |
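
The shuffle masks quoted in the commit message at the top of this listing follow a simple pattern. The Python model below is a minimal sketch of that pattern, not the LoopVectorize implementation: the function names `deinterleave_masks`, `interleave_mask`, and `shufflevector` are hypothetical helpers written for illustration, and `None` stands in for `undef`.

```python
def deinterleave_masks(factor, lanes):
    """Masks that split one wide vector into `factor` strided vectors.

    Hypothetical helper: mask k selects elements k, k + factor, k + 2*factor, ...
    """
    return [[i * factor + k for i in range(lanes)] for k in range(factor)]


def interleave_mask(factor, lanes):
    """Mask that zips `factor` concatenated vectors of `lanes` elements each.

    Hypothetical helper: output position i*factor + k reads lane i of vector k.
    """
    return [k * lanes + i for i in range(lanes) for k in range(factor)]


def shufflevector(v1, v2, mask):
    """Toy model of shufflevector: indices address v1 followed by v2."""
    combined = list(v1) + list(v2)
    return [combined[i] for i in mask]


# Interleave load group: one wide load, then one shuffle per member.
wide = [100, 101, 102, 103, 104, 105, 106, 107]
undef = [None] * len(wide)  # stand-in for the undef operand
even_mask, odd_mask = deinterleave_masks(factor=2, lanes=4)
vec_even = shufflevector(wide, undef, even_mask)
vec_odd = shufflevector(wide, undef, odd_mask)

# Interleave store group: one shuffle to re-interleave, then one wide store.
interleaved = shufflevector(vec_even, vec_odd, interleave_mask(factor=2, lanes=4))
```

With an interleave factor of 2 and 4 lanes, the helpers reproduce the masks in the commit message, and re-interleaving the even and odd vectors restores the original wide vector.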