llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 20:29:48 +00:00

Arnold Schwaighofer 5f0d9dbdf4 X86 cost model: Adjust cost for custom lowered vector multiplies		2013-03-02 04:02:52 +00:00

This matters for example in the following matrix multiply:

    int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
      int i, j, k, val;
      for (i=0; i<rows; i++) {
        for (j=0; j<cols; j++) {
          val = 0;
          for (k=0; k<cols; k++) {
            val += m1[i][k] * m2[k][j];
          }
          m3[i][j] = val;
        }
      }
      return(m3);
    }

Taken from the test-suite benchmark Shootout.

We estimate the cost of the multiply to be 2 while we generate 9 instructions for it and end up being quite a bit slower than the scalar version (48% on my machine).

Also, properly differentiate between avx1 and avx2. On avx-1 we still split the vector into 2 128bits and handle the subvector muls like above with 9 instructions. Only on avx-2 will we have a cost of 9 for v4i64.

I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an add instead of a mul because with a mul we now no longer vectorize. I did verify that the mul would be indeed more expensive when vectorized with 3 kernels:

    for (i ...)
      r += a[i] * 3;
    for (i ...)
      m1[i] = m1[i] * 3; // This matches the test case in avx1.ll

and a matrix multiply.

In each case the vectorized version was considerably slower.

radar://13304919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176403 91177308-0d34-0410-b5e6-96231b3b80d8
..
BasicAA	Use references to attribute groups on the call/invoke instructions.	2013-02-22 09:09:42 +00:00
BlockFrequencyInfo	Replace all instances of dg.exp file with lit.local.cfg, since all tests are run with LIT now and not Dejagnu. dg.exp is no longer needed.	2012-02-16 06:28:33 +00:00
BranchProbabilityInfo	BranchProb: modify the definition of an edge in BranchProbabilityInfo to handle	2012-08-24 18:14:27 +00:00
CallGraph	Now that invoke of an intrinsic is possible (for the llvm.do.nothing intrinsic)	2012-09-26 17:16:01 +00:00
CostModel	X86 cost model: Adjust cost for custom lowered vector multiplies	2013-03-02 04:02:52 +00:00
DependenceAnalysis	Modified dump() to provide a little	2012-11-30 00:44:47 +00:00
Dominators	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID	2012-12-30 02:33:22 +00:00
GlobalsModRef	MemoryDependenceAnalysis attempts to find the first memory dependency for function calls.	2012-08-13 23:03:43 +00:00
LoopInfo	Convert all tests using TCL-style quoting to use shell-style quoting.	2012-07-02 12:47:22 +00:00
PostDominators	Replace all instances of dg.exp file with lit.local.cfg, since all tests are run with LIT now and not Dejagnu. dg.exp is no longer needed.	2012-02-16 06:28:33 +00:00
Profiling	AArch64: adjust tests which rely on a default JIT	2013-02-18 11:08:37 +00:00
RegionInfo	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID	2012-12-30 02:33:22 +00:00
ScalarEvolution	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID	2012-12-30 02:33:22 +00:00
TypeBasedAliasAnalysis	Use references to attribute groups on the call/invoke instructions.	2013-02-22 09:09:42 +00:00