llvm-6502/test/CodeGen
Jingyue Wu c9f86c1260 [NVPTX] make load on global readonly memory to use ldg
Summary:
[NVPTX] make load on global readonly memory to use ldg

Summary:
As describe in [1], ld.global.nc may be used to load memory by nvcc when
__restrict__ is used and compiler can detect whether read-only data cache
is safe to use.

This patch will try to check whether ldg is safe to use and use them to
replace ld.global when possible. This change can improve the performance
by 18~29% on affected kernels (ratt*_kernel and rwdot*_kernel) in 
S3D benchmark of shoc [2]. 

Patched by Xuetian Weng. 

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Reviewers: jholewinski, jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11314

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242713 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-20 21:28:54 +00:00
..
AArch64 [AArch64] Change EON pattern to match more often. 2015-07-20 18:42:27 +00:00
AMDGPU AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops 2015-07-20 14:28:41 +00:00
ARM ARM: Enable MachineScheduler and disable PostRAScheduler for swift. 2015-07-17 23:18:30 +00:00
BPF
CPP
Generic
Hexagon [Hexagon] Generate MUX from conditional transfers when dot-new not possible 2015-07-20 21:23:25 +00:00
Inputs
Mips [SDAG] Optimize unordered comparison in soft-float mode (patch by Anton Nadolskiy) 2015-07-15 08:39:35 +00:00
MIR MIR Serialization: Initial serialization of machine constant pools. 2015-07-20 20:51:18 +00:00
MSP430
NVPTX [NVPTX] make load on global readonly memory to use ldg 2015-07-20 21:28:54 +00:00
PowerPC Add missing test for r242296 (vec_sld) 2015-07-20 15:43:21 +00:00
SPARC [SPARC] Cleanup handling of the Y/ASR registers. 2015-07-08 16:25:12 +00:00
SystemZ
Thumb
Thumb2 ARM: Add scheduling information for LDRLIT instructions to swift scheduling model 2015-07-17 23:18:26 +00:00
WebAssembly [WebAssembly] Create a CodeGen unittest directory. 2015-07-06 23:14:57 +00:00
WinEH [WinEH] Strip the \01 character from the __CxxFrameHandler3 thunk name 2015-07-13 17:55:14 +00:00
X86 [ImplicitNullChecks] Work with implicit defs. 2015-07-20 20:31:39 +00:00
XCore