ARM cost model: Add cost for gather/scather

Fixes a 35% degradation compared to unvectorized code in
MiBench/automotive-susan and an equally serious regression on a private
image processing benchmark.

radar://14351991

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186188 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
Arnold Schwaighofer
2013-07-12 19:16:04 +00:00
parent c0a11edba6
commit 4a1c764264
2 changed files with 97 additions and 0 deletions

View File

@@ -426,6 +426,15 @@ unsigned ARMTTI::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
}
unsigned ARMTTI::getAddressComputationCost(Type *Ty, bool IsComplex) const {
// Address computations in vectorized code with non-consecutive addresses will
// likely result in more instructions compared to scalar code where the
// computation can more often be merged into the index mode. The resulting
// extra micro-ops can significantly decrease throughput.
unsigned NumVectorInstToHideOverhead = 10;
if (Ty->isVectorTy() && IsComplex)
return NumVectorInstToHideOverhead;
// In many cases the address computation is not merged into the instruction
// addressing mode.
return 1;