The OpenCL specs say: "The vector versions of the math functions operate
component-wise. The description is per-component."
Patch by: Jan Vesely
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200773 91177308-0d34-0410-b5e6-96231b3b80d8
v2: Add ftrunc->TRUNC pattern instead of replacing int_AMDGPU_trunc
v3: move ftrunc pattern next to TRUNC definition, it's available since R600
Patch By: Jan Vesely
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197783 91177308-0d34-0410-b5e6-96231b3b80d8
This is to avoid this transformation in some cases:
fold (conv (load x)) -> (load (conv*)x)
On architectures that don't natively support some vector
loads efficiently casting the load to a smaller vector of
larger types and loading is more efficient.
Patch by Micah Villmow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194783 91177308-0d34-0410-b5e6-96231b3b80d8
The SelectionDAGBuilder was promoting vector kernel arguments to legal
types, but this won't work for R600 and SI since kernel arguments are
stored in memory and can't be promoted. In order to handle vector
arguments correctly we need to look at the original types from the LLVM IR
function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193215 91177308-0d34-0410-b5e6-96231b3b80d8
For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist.
The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take
a resource descriptor might be nicer.
The maximum number of input SGPRs is bumped to 17.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190575 91177308-0d34-0410-b5e6-96231b3b80d8
This adds minimal support to the SelectionDAG for handling address spaces
with different pointer sizes. The SelectionDAG should now correctly
lower pointer function arguments to the correct size as well as generate
the correct code when lowering getelementptr.
This patch also updates the R600 DataLayout to use 32-bit pointers for
the local address space.
v2:
- Add more helper functions to TargetLoweringBase
- Use CHECK-LABEL for tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189221 91177308-0d34-0410-b5e6-96231b3b80d8
Now that compute support is better on SI, we can't continue using v16i8
for descriptors since this is also a legal type in OpenCL.
This patch fixes numerous hangs with the piglit OpenCL test and since
we now use a target specific DAG node for LOAD_CONSTANT with the
correct MemOperandFlags, this should also fix:
https://bugs.freedesktop.org/show_bug.cgi?id=66805
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188429 91177308-0d34-0410-b5e6-96231b3b80d8
We use MVT::i32 for the vector index type, because we use 32-bit
operations to caculate offsets when dynamically indexing vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187749 91177308-0d34-0410-b5e6-96231b3b80d8
* Added R600_Reg64 class
* Added T#Index#.XY registers definition
* Added v2i32 register reads from parameter and global space
* Added f32 and i32 elements extraction from v2f32 and v2i32
* Added v2i32 -> v2f32 conversions
Tom Stellard:
- Mark vec2 operations as expand. The addition of a vec2 register
class made them all legal.
Patch by: Dmitry Cherkassov
Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187582 91177308-0d34-0410-b5e6-96231b3b80d8
This commit also implements these functions for R600 and removes a test
case that was relying on the buggy behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187007 91177308-0d34-0410-b5e6-96231b3b80d8
A side-effect of this is that now the compiler expects kernel arguments
to be 4-byte aligned.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186916 91177308-0d34-0410-b5e6-96231b3b80d8
Enough for the radeonsi driver to use it for calculating derivatives.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186012 91177308-0d34-0410-b5e6-96231b3b80d8
By default, we expand these operations for both EG and SI. Move the
duplicated code into a common space for now. If the targets ever actually
implement these operations as instructions, we can override that in the relevant
target.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184848 91177308-0d34-0410-b5e6-96231b3b80d8
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182703 91177308-0d34-0410-b5e6-96231b3b80d8
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178126 91177308-0d34-0410-b5e6-96231b3b80d8