build_vector is lowered to REG_SEQUENCE, which is something the register
allocator does a good job at optimizing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187397 91177308-0d34-0410-b5e6-96231b3b80d8
This patch prevents the following combine when the input vector is used more
than once.
insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx
=>
build_vector elt0, ..., NewEltIdx, ..., eltN
The reasons are:
- Building a vector may be expensive, so try to reuse the existing part of a
vector instead of creating a new one (think big vectors).
- elt0 to eltN now have two users instead of one. This may prevent some other
optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187396 91177308-0d34-0410-b5e6-96231b3b80d8
This commit also implements these functions for R600 and removes a test
case that was relying on the buggy behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187007 91177308-0d34-0410-b5e6-96231b3b80d8
These are really the same address space in hardware. The only
difference is that CONSTANT_ADDRESS uses a special cache for faster
access. When we are unable to use the constant kcache for some reason
(e.g. smaller types or lack of indirect addressing) then the instruction
selector must use GLOBAL_ADDRESS loads instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187006 91177308-0d34-0410-b5e6-96231b3b80d8
This increases the number of opportunites we have for folding. With the
previous implementation we were unable to fold into any instructions
other than the first when multiple instructions were selected from a
single SDNode.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186919 91177308-0d34-0410-b5e6-96231b3b80d8
A side-effect of this is that now the compiler expects kernel arguments
to be 4-byte aligned.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186916 91177308-0d34-0410-b5e6-96231b3b80d8
I'm guessing the failure had something to do with the double precision
floating point constant used in the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186191 91177308-0d34-0410-b5e6-96231b3b80d8
Enough for the radeonsi driver to use it for calculating derivatives.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186012 91177308-0d34-0410-b5e6-96231b3b80d8
Test is not included as it is several 1000 lines long.
To test this functionnality, a test case must generate at least 2 ALU clauses,
where an ALU clause is ~110 instructions long.
NOTE: This is a candidate for the stable branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185943 91177308-0d34-0410-b5e6-96231b3b80d8
The purpose of this test was to check boundary conditions for the size
of an ALU clause. This test is very sensitive to changes to the
optimizer or scheduler, because it requires an exact number of ALU
instructions in order to remain valid. It's not good to have a test
this sensitive, because it is confusing to developers who implement
optimizations and then 'break' the test.
I'm not sure if there is a good way to test these limits using lit, but
if I can come up with replacement test that isn't as sensitive I'll add
it back to the tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185084 91177308-0d34-0410-b5e6-96231b3b80d8
Note: Only adding test for evergreen, not SI yet.
When I attempted to expand vselect for SI, I got the following:
llc: /home/awatry/src/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:522:
llvm::SDValue llvm::DAGTypeLegalizer::PromoteIntRes_SETCC(llvm::SDNode*):
Assertion `SVT.isVector() == N->getOperand(0).getValueType().isVector() &&
"Vector compare must return a vector result!"' failed.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184847 91177308-0d34-0410-b5e6-96231b3b80d8
Add test cases for both vector sizes on SI and also add v2i32 test for EG.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184846 91177308-0d34-0410-b5e6-96231b3b80d8
No test/expansion for SI has been added yet. Attempts to expand this
operation for SI resulted in a stacktrace in (IIRC) LegalizeIntegerTypes
which was complaining about vector comparisons being required to return
a vector type.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184845 91177308-0d34-0410-b5e6-96231b3b80d8
Also add lit test for both cases on SI, and v2i32 for evergreen.
Note: I followed the guidance of the v4i32 EG check... UREM produces really
complex code, so let's just check that the instruction was lowered
successfully.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184844 91177308-0d34-0410-b5e6-96231b3b80d8
Also add lit test for both cases on SI, and v2i32 for evergreen.
Note: I followed the guidance of the v4i32 EG check... UDIV produces really
complex code, so let's just check that the instruction was lowered
successfully.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184843 91177308-0d34-0410-b5e6-96231b3b80d8