This fixes a couple of bugs and incorrect assumptions,
in total four more piglit tests now pass.
v2: fix small bug in the dominator updating
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174762 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by: Christian König
Intersecting loop handling was wrong.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174761 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise we sometimes produce invalid code.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174760 91177308-0d34-0410-b5e6-96231b3b80d8
same so we put in the comment field an indicator when we think we are
emitting the 16 bit version. For the direct object emitter, the difference is
important as well as for other passes which need an accurate count of
program size. There will be other similar putbacks to this for various
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174747 91177308-0d34-0410-b5e6-96231b3b80d8
Thanks to help from Nadav and Hal, I have a more reasonable (and even
correct!) approach. This specifically penalizes the insertelement
and extractelement operations for the performance hit that will occur
on PowerPC processors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174725 91177308-0d34-0410-b5e6-96231b3b80d8
Adds a function to target transform info to query for the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there, we know whether the instruction
will be scalarized or not.
Increase the penality for inserting in to D registers on swift. This becomes
necessary because we now always assume that address computation has a cost and
three is a closer value to the architecture.
radar://13097204
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174713 91177308-0d34-0410-b5e6-96231b3b80d8
allowed size for the instruction. This code uses RegScavenger to fix this.
We sometimes need 2 registers for Mips16 so we must handle things
differently than how register scavenger is normally used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174696 91177308-0d34-0410-b5e6-96231b3b80d8
Certain vector operations don't vectorize well with the current
PowerPC implementation. Element insert/extract performs poorly
without VSX support because Altivec requires going through memory.
SREM, UREM, and VSELECT all produce bad scalar code.
There's a lot of work to do for the cost model before
autovectorization will be tuned well, and this is not an attempt to
address the larger problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174660 91177308-0d34-0410-b5e6-96231b3b80d8
Remove all the unused code.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174656 91177308-0d34-0410-b5e6-96231b3b80d8
Allows nexuiz to run with radeonsi.
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174655 91177308-0d34-0410-b5e6-96231b3b80d8
20 more little piglits with radeonsi.
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174654 91177308-0d34-0410-b5e6-96231b3b80d8
The _SGPR variants where wrong.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174653 91177308-0d34-0410-b5e6-96231b3b80d8
v2: rebased on current upstream
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174652 91177308-0d34-0410-b5e6-96231b3b80d8
This is for the case when no processor is passed to the backend. This
prevents the
'' is not a recognized processor for this target (ignoring processor)
warning from being generated by clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174651 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174634 91177308-0d34-0410-b5e6-96231b3b80d8
Handle vectors of 1 to 16 integers.
Change the intrinsic names to prevent the wrong one from being selected at
runtime due to the overloading.
Patch By: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174633 91177308-0d34-0410-b5e6-96231b3b80d8
v1i32, v2i32, v8i32 and v16i32.
Only add VGPR register classes for integer vector types, to avoid attempts
copying from VGPR to SGPR registers, which is not possible.
Patch By: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174632 91177308-0d34-0410-b5e6-96231b3b80d8
Use sub0-15 everywhere.
Patch by: Michel Dänzerr
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174610 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions compare two floating point values and return an
integer true (-1) or false (0) value.
When compiling code generated by the Mesa GLSL frontend, the SET*_DX10
instructions save us four instructions for most branch decisions that
use floating-point comparisons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174609 91177308-0d34-0410-b5e6-96231b3b80d8
account. Atoms use LEA for updating SP in prologs/epilogs, and the
exact LEA opcode depends on the data model.
Also reapplying the test case which was added and then reverted
(because of Atom failures), this time specifying explicitly the CPU in
addition to the triple. The test case now checks all variations (data
mode, cpu Atom vs. Core).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174542 91177308-0d34-0410-b5e6-96231b3b80d8
Most of PPCCallingConv.td is used only by the 32-bit SVR4 ABI. Rename
things to clarify this. Also delete some code that's been commented out
for a long time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174526 91177308-0d34-0410-b5e6-96231b3b80d8
Only implemented for R600 so far. SI is missing implementations of a
few callbacks used by the Indirect Addressing pass and needs code to
handle frame indices.
At the moment R600 only supports array sizes of 16 dwords or less.
Register packing of vector types is currently disabled, which means that a
vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order
to correctly pack registers in all cases, we will need to implement an
analysis pass for R600 that determines the correct vector width for each
array.
v2:
- Add support for i8 zext load from stack.
- Coding style fixes
v3:
- Don't reserve registers for indirect addressing when it isn't
being used.
- Fix bug caused by LLVM limiting the number of SubRegIndex
declarations.
v4:
- Fix 64-bit defines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174525 91177308-0d34-0410-b5e6-96231b3b80d8
Weakly defined symbols should evaluate to 0 if they're undefined at
link-time. This is impossible to do with the usual address generation
patterns, so we should use a literal pool entry to materlialise the
address.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174518 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions are a late addition to the architecture, and may
yet end up behind an optional attribute, but for now they're available
at all times.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174496 91177308-0d34-0410-b5e6-96231b3b80d8
This adds hints to the various "prfm" instructions so that they can
affect the instruction cache as well as the data cache.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174495 91177308-0d34-0410-b5e6-96231b3b80d8
Use the validateTargetOperandClass() hook to match literal '#0' operands in
InstAlias definitions. Previously this required per-instruction C++ munging of the
operand list, but not is handled as a natural part of the matcher. Much better.
No additional tests are required, as the pre-existing tests for these instructions
exercise the new behaviour as being functionally equivalent to the old.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174488 91177308-0d34-0410-b5e6-96231b3b80d8
pointer in function prologs/epilogs. The opcodes should depend on the
data model (LP64 vs. ILP32) rather than the architecture bit-ness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174446 91177308-0d34-0410-b5e6-96231b3b80d8
is a vararg function.
The original code was examining flag OutputArg::IsFixed to determine whether
CC_MipsN_VarArg or CC_MipsN should be called. This is not correct, since this
flag is often set to false when the function being analyzed is a non-variadic
function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174442 91177308-0d34-0410-b5e6-96231b3b80d8