mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2024-12-14 11:32:34 +00:00
e08f05f3a5
Summary:
According to the PTX ISA:

    For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type.

So the ISA does not support extending loads or truncating stores for floating-point numbers. The code reflects this by setting the loadext/truncstore actions to expand for floating-point numbers, but vectors of floating-point numbers were not handled. As a result, a load of a vector of floats followed by an fp_extend may be combined by DAGCombiner into an extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform the extension, and no extending instructions are inserted. In the end, PTX instructions with mismatched types are generated, such as this f32 load into 64-bit floating-point registers:

    ld.v2.f32 {%fd3, %fd4}, [%rd2]

This patch adds the correct actions for vectors of floats, so that DAGCombiner does not create extending loads and correct code is generated.

Patched by Gang Hu.

Test Plan: Test case attached.
Reviewers: jingyue
Reviewed By: jingyue
Subscribers: llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D10876
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241191 91177308-0d34-0410-b5e6-96231b3b80d8
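The pattern described in the commit message can be sketched as a small lit test in the style of the other files in this directory. This is an illustrative reconstruction, not the contents of the attached test: the function name `@fpext_v4`, the value names, and the exact CHECK patterns are assumptions, and the typed-pointer syntax assumes an LLVM tree of the same vintage as this commit.

```llvm
; RUN: llc < %s -march=nvptx64 -mcpu=sm_35 | FileCheck %s

; Hypothetical test: load a vector of floats and extend it to doubles.
; Before the fix, DAGCombiner could fold the load and the fpext into an
; extending vector load, which was then emitted as an f32 load writing
; directly into 64-bit (%fd) registers.
define void @fpext_v4(float* nocapture readonly %in, double* nocapture %out) {
  %p = bitcast float* %in to <4 x float>*
  %v = load <4 x float>, <4 x float>* %p, align 16
  %e = fpext <4 x float> %v to <4 x double>
; The mismatched load must not appear; an explicit conversion must.
; CHECK-NOT: ld.v2.f32 {%fd
; CHECK: cvt.f64.f32
  %q = bitcast double* %out to <4 x double>*
  store <4 x double> %e, <4 x double>* %q
  ret void
}
```

With the vector loadext actions set to expand, the expectation is that the load stays a plain f32 vector load and each element is widened by a separate `cvt.f64.f32`.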
||
---|---|---|
access-non-generic.ll | ||
add-128bit.ll | ||
addrspacecast-gvar.ll | ||
addrspacecast.ll | ||
aggr-param.ll | ||
annotations.ll | ||
arg-lowering.ll | ||
arithmetic-fp-sm20.ll | ||
arithmetic-int.ll | ||
atomics.ll | ||
bfe.ll | ||
bug17709.ll | ||
bug21465.ll | ||
bug22246.ll | ||
bug22322.ll | ||
call-with-alloca-buffer.ll | ||
callchain.ll | ||
calling-conv.ll | ||
compare-int.ll | ||
constant-vectors.ll | ||
convert-fp.ll | ||
convert-int-sm20.ll | ||
ctlz.ll | ||
ctpop.ll | ||
cttz.ll | ||
div-ri.ll | ||
envreg.ll | ||
extloadv.ll | ||
fast-math.ll | ||
fma-assoc.ll | ||
fma-disable.ll | ||
fma.ll | ||
fp16.ll | ||
fp-contract.ll | ||
fp-literals.ll | ||
function-align.ll | ||
generic-to-nvvm.ll | ||
global-ordering.ll | ||
globals_init.ll | ||
globals_lowering.ll | ||
gvar-init.ll | ||
half.ll | ||
i1-global.ll | ||
i1-int-to-fp.ll | ||
i1-param.ll | ||
i8-param.ll | ||
imad.ll | ||
implicit-def.ll | ||
inline-asm.ll | ||
intrin-nocapture.ll | ||
intrinsic-old.ll | ||
intrinsics.ll | ||
isspacep.ll | ||
ld-addrspace.ll | ||
ld-generic.ll | ||
ldparam-v4.ll | ||
ldu-i8.ll | ||
ldu-ldg.ll | ||
ldu-reg-plus-offset.ll | ||
lit.local.cfg | ||
load-sext-i1.ll | ||
local-stack-frame.ll | ||
lower-alloca.ll | ||
lower-kernel-ptr-arg.ll | ||
machine-sink.ll | ||
managed.ll | ||
misaligned-vector-ldst.ll | ||
module-inline-asm.ll | ||
mulwide.ll | ||
noduplicate-syncthreads.ll | ||
nounroll.ll | ||
nvcl-param-align.ll | ||
nvvm-reflect.ll | ||
param-align.ll | ||
pr13291-i1-store.ll | ||
pr16278.ll | ||
pr17529.ll | ||
refl1.ll | ||
rotate.ll | ||
rsqrt.ll | ||
sched1.ll | ||
sched2.ll | ||
sext-in-reg.ll | ||
sext-params.ll | ||
shift-parts.ll | ||
simple-call.ll | ||
sm-version-20.ll | ||
sm-version-21.ll | ||
sm-version-30.ll | ||
sm-version-32.ll | ||
sm-version-35.ll | ||
sm-version-37.ll | ||
sm-version-50.ll | ||
sm-version-52.ll | ||
sm-version-53.ll | ||
st-addrspace.ll | ||
st-generic.ll | ||
surf-read-cuda.ll | ||
surf-read.ll | ||
surf-write-cuda.ll | ||
surf-write.ll | ||
symbol-naming.ll | ||
tex-read-cuda.ll | ||
tex-read.ll | ||
texsurf-queries.ll | ||
tuple-literal.ll | ||
vec8.ll | ||
vec-param-load.ll | ||
vector-args.ll | ||
vector-call.ll | ||
vector-compare.ll | ||
vector-global.ll | ||
vector-loads.ll | ||
vector-return.ll | ||
vector-select.ll | ||
vector-stores.ll | ||
weak-global.ll | ||
weak-linkage.ll |