mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-02-23 05:29:23 +00:00
Summary: Straight-line optimizations can simplify the loop body and make LSR's cost analysis more precise. This significantly improves several Eigen3 CUDA benchmarks. With this change, EigenContractionKernel runs up to 40% faster (753ceee5f2/unsupported/Eigen/CXX11/src/Tensor/TensorContractionCuda.h (cl-502)
). EigenConvolutionKernel2D runs up to 10% faster (753ceee5f2/unsupported/Eigen/CXX11/src/Tensor/TensorConvolution.h (cl-605)
). I have some difficulties writing small tests that benefit from this reordering due to a seemingly issue with LSR (being discussed at http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088244.html). See the review thread for the compilation time impact of GVN. Reviewers: eliben, jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11304 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242982 91177308-0d34-0410-b5e6-96231b3b80d8