llvm-6502/test/CodeGen/NVPTX
Jingyue Wu e08f05f3a5 [NVPTX] expand extload/truncstore for vectors of floats
Summary:
According to PTX ISA:

For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type.

So, the ISA does not support load with extending/store with truncatation for floating numbers. This is reflected in setting the loadext/truncstore actions to expand in the code for floating numbers, but vectors of floating numbers are not taken care of.

As a result, loading a vector of floats followed by a fp_extend may be combined by DAGCombiner to a extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform extending, and no extending instructions are inserted. Finally, PTX instructions with mismatched types are generated, like
ld.v2.f32 {%fd3, %fd4}, [%rd2]

This patch adds the correct actions for vectors of floats, so DAGCombiner would not create loads with extending, and correct code is generated.

Patched by Gang Hu. 

Test Plan: Test case attached.

Reviewers: jingyue

Reviewed By: jingyue

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D10876

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241191 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-01 21:32:42 +00:00
..
access-non-generic.ll [NVPTX] fix a crash bug in NVPTXFavorNonGenericAddrSpaces 2015-06-09 21:50:32 +00:00
add-128bit.ll Revert revisions r234755, r234759, r234760 2015-04-13 17:47:15 +00:00
addrspacecast-gvar.ll [NVPTX] Handle addrspacecast constant expressions in aggregate initializers 2015-04-28 17:18:30 +00:00
addrspacecast.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
aggr-param.ll
annotations.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
arg-lowering.ll [NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors 2014-06-27 18:35:44 +00:00
arithmetic-fp-sm20.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
arithmetic-int.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
atomics.ll Add some tests for NVPTX lowering of cmpxchg 2014-07-21 22:54:44 +00:00
bfe.ll [NVPTX] Add isel patterns for bit-field extract (bfe) 2014-06-27 18:35:27 +00:00
bug17709.ll
bug21465.ll [NVPTX] run SROA after NVPTXFavorNonGenericAddrSpaces 2015-06-09 00:05:56 +00:00
bug22246.ll [NVPTX] Generate a more optimal sequence for select of i1 2015-01-26 19:52:20 +00:00
bug22322.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
call-with-alloca-buffer.ll Add NVPTXPeephole pass to reduce unnecessary address cast 2015-06-24 20:20:16 +00:00
callchain.ll
calling-conv.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
compare-int.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
constant-vectors.ll
convert-fp.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
convert-int-sm20.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
ctlz.ll
ctpop.ll
cttz.ll
div-ri.ll
envreg.ll [NVPTX] Add support for envreg reads 2014-06-27 18:35:21 +00:00
extloadv.ll [NVPTX] expand extload/truncstore for vectors of floats 2015-07-01 21:32:42 +00:00
fast-math.ll
fma-assoc.ll Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding. 2015-01-14 15:36:28 +00:00
fma-disable.ll
fma.ll Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding. 2015-01-14 15:36:28 +00:00
fp16.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
fp-contract.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
fp-literals.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
function-align.ll [NVPTXAsmPrinter] do not print .align on function headers 2015-03-12 01:50:30 +00:00
generic-to-nvvm.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
global-ordering.ll
globals_init.ll The constant initialization for globals in NVPTX is generated as an 2015-06-09 16:29:34 +00:00
globals_lowering.ll Force relocation mode to be default, regardless of what is passed to the backend. 2015-06-30 17:18:00 +00:00
gvar-init.ll [NVPTX] Error out if initializer is given for variable in an address space that does not support initialization 2014-06-27 18:36:01 +00:00
half.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
i1-global.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
i1-int-to-fp.ll
i1-param.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
i8-param.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
imad.ll [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns 2014-06-27 18:35:37 +00:00
implicit-def.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
inline-asm.ll [NVPTX] Add 'b' asm constraint 2014-06-27 18:36:06 +00:00
intrin-nocapture.ll Reapply 239795 - [InstCombine] Propagate non-null facts to call parameters 2015-06-16 20:24:25 +00:00
intrinsic-old.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
intrinsics.ll [NVPTX] Added missing test case for llvm.nvvm.sqrt.f NVPTX intrinsic 2015-06-23 18:22:17 +00:00
isspacep.ll [NVPTX] Add support for isspacep instruction 2014-06-27 18:35:24 +00:00
ld-addrspace.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
ld-generic.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
ldparam-v4.ll
ldu-i8.ll [NVPTX] Make the alignment an explicit argument to ldu/ldg 2014-08-29 15:30:20 +00:00
ldu-ldg.ll [NVPTX] Make the alignment an explicit argument to ldu/ldg 2014-08-29 15:30:20 +00:00
ldu-reg-plus-offset.ll [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction 2015-02-27 19:29:02 +00:00
lit.local.cfg Reduce verbiage of lit.local.cfg files 2014-06-09 22:42:55 +00:00
load-sext-i1.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
local-stack-frame.ll [NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass 2015-07-01 20:08:06 +00:00
lower-alloca.ll Add NVPTXLowerAlloca pass to convert alloca'ed memory to local address 2015-06-17 22:31:02 +00:00
lower-kernel-ptr-arg.ll [NVPTX] noop when kernel pointers are already global 2015-06-26 22:35:43 +00:00
machine-sink.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
managed.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
misaligned-vector-ldst.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
module-inline-asm.ll
mulwide.ll [NVPTX] Add some extra tests for mul.wide to test non-power-of-two source types 2014-07-23 20:23:49 +00:00
noduplicate-syncthreads.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
nounroll.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
nvcl-param-align.ll [NVPTX] Fix bugs related to isSingleValueType 2014-12-17 17:59:04 +00:00
nvvm-reflect.ll Add support for __nvvm_reflect changes in libdevice in CUDA-7.0 2015-03-19 17:05:35 +00:00
param-align.ll
pr13291-i1-store.ll [NVPTX] roll forward r239082 2015-06-04 21:28:26 +00:00
pr16278.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
pr17529.ll [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction 2015-02-27 19:29:02 +00:00
refl1.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
rotate.ll [NVPTX] Add support for efficient rotate instructions on SM 3.2+ 2014-06-27 18:35:33 +00:00
rsqrt.ll
sched1.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
sched2.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
sext-in-reg.ll
sext-params.ll
shift-parts.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
simple-call.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
sm-version-20.ll
sm-version-21.ll
sm-version-30.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-32.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-35.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-37.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-50.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-52.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-53.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
st-addrspace.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
st-generic.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
surf-read-cuda.ll [NVPTX] roll forward r239082 2015-06-04 21:28:26 +00:00
surf-read.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
surf-write-cuda.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
surf-write.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
symbol-naming.ll [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction 2015-04-16 23:24:18 +00:00
tex-read-cuda.ll [NVPTX] roll forward r239082 2015-06-04 21:28:26 +00:00
tex-read.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
texsurf-queries.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
tuple-literal.ll
vec8.ll
vec-param-load.ll
vector-args.ll
vector-call.ll [NVPTX] Add missing .v4 qualifier on vector store instruction 2014-07-17 16:58:56 +00:00
vector-compare.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
vector-global.ll [NVPTX] Fix bugs related to isSingleValueType 2014-12-17 17:59:04 +00:00
vector-loads.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
vector-return.ll [NVPTX] aligned byte-buffers for vector return types 2014-10-25 03:46:16 +00:00
vector-select.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
vector-stores.ll
weak-global.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
weak-linkage.ll [NVPTX] Do not emit .weak symbols for NVPTX 2014-12-01 21:16:17 +00:00