mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2024-11-04 06:09:05 +00:00
c9f86c1260
Summary: [NVPTX] make load on global readonly memory to use ldg Summary: As describe in [1], ld.global.nc may be used to load memory by nvcc when __restrict__ is used and compiler can detect whether read-only data cache is safe to use. This patch will try to check whether ldg is safe to use and use them to replace ld.global when possible. This change can improve the performance by 18~29% on affected kernels (ratt*_kernel and rwdot*_kernel) in S3D benchmark of shoc [2]. Patched by Xuetian Weng. [1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache [2] https://github.com/vetter/shoc Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll Reviewers: jholewinski, jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11314 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242713 91177308-0d34-0410-b5e6-96231b3b80d8 |
||
---|---|---|
.. | ||
AArch64 | ||
AMDGPU | ||
ARM | ||
BPF | ||
CPP | ||
Generic | ||
Hexagon | ||
Inputs | ||
Mips | ||
MIR | ||
MSP430 | ||
NVPTX | ||
PowerPC | ||
SPARC | ||
SystemZ | ||
Thumb | ||
Thumb2 | ||
WebAssembly | ||
WinEH | ||
X86 | ||
XCore |