The AMDGPUIndirectAddressing pass was previously responsible for
lowering private loads and stores to indirect addressing instructions.
However, this pass was buggy and way too complicated. The only
advantage it had over the new simplified code was that it saved one
instruction per direct write to private memory. This optimization
likely has a minimal impact on performance, and we may be able
to duplicate it using some other transformation.
For the private address space, we now:
1. Lower private loads/store to Register(Load|Store) instructions
2. Reserve part of the register file as 'private memory'
3. After regalloc lower the Register(Load|Store) instructions to
MOV instructions that use indirect addressing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193179 91177308-0d34-0410-b5e6-96231b3b80d8
For targets that have instruction itineraries this means no change. Targets
that move over to the new schedule model will use be able the new schedule
module for instruction latencies in the if-converter (the logic is such that if
there is no itineary we will use the new sched model for the latencies).
Before, we queried "TTI->getInstructionLatency()" for the instruction latency
and the extra prediction cost. Now, we query the TargetSchedule abstraction for
the instruction latency and TargetInstrInfo for the extra predictation cost. The
TargetSchedule abstraction will internally call "TTI->getInstructionLatency" if
an itinerary exists, otherwise it will use the new schedule model.
ATTENTION: Out of tree targets!
(I will also send out an email later to LLVMDev)
This means, if your target implements
unsigned getInstrLatency(const InstrItineraryData *ItinData,
const MachineInstr *MI,
unsigned *PredCost);
and returns a value for "PredCost", you now also need to implement
unsigned getPredictationCost(const MachineInstr *MI);
(if your target uses the IfConversion.cpp pass)
radar://15077010
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191671 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes some regressions in the piglit local memory store tests
introduced by recent commits which made the scheduler aware of the trans
slot.
It's not possible to test this using lit, because there is no way to
determine from the assembly dumps whether or not an instruction is in
the trans slot.
Even if this were possible, the test would be highly sensitive to
changes in the scheduler and might generate confusing false negatives.
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190574 91177308-0d34-0410-b5e6-96231b3b80d8
* Added R600_Reg64 class
* Added T#Index#.XY registers definition
* Added v2i32 register reads from parameter and global space
* Added f32 and i32 elements extraction from v2f32 and v2i32
* Added v2i32 -> v2f32 conversions
Tom Stellard:
- Mark vec2 operations as expand. The addition of a vec2 register
class made them all legal.
Patch by: Dmitry Cherkassov
Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187582 91177308-0d34-0410-b5e6-96231b3b80d8
This increases the number of opportunites we have for folding. With the
previous implementation we were unable to fold into any instructions
other than the first when multiple instructions were selected from a
single SDNode.
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186919 91177308-0d34-0410-b5e6-96231b3b80d8
Test is not included as it is several 1000 lines long.
To test this functionnality, a test case must generate at least 2 ALU clauses,
where an ALU clause is ~110 instructions long.
NOTE: This is a candidate for the stable branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185943 91177308-0d34-0410-b5e6-96231b3b80d8
This should simplify the subtarget definitions and make it easier to
add new ones.
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183566 91177308-0d34-0410-b5e6-96231b3b80d8
Dot4 now uses 8 scalar operands instead of 2 vectors one which allows register
coalescer to remove some unneeded COPY.
This patch also defines some structures/functions that can be used to handle
every vector instructions (CUBE, Cayman special instructions...) in a similar
fashion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182126 91177308-0d34-0410-b5e6-96231b3b80d8
Only implemented for R600 so far. SI is missing implementations of a
few callbacks used by the Indirect Addressing pass and needs code to
handle frame indices.
At the moment R600 only supports array sizes of 16 dwords or less.
Register packing of vector types is currently disabled, which means that a
vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order
to correctly pack registers in all cases, we will need to implement an
analysis pass for R600 that determines the correct vector width for each
array.
v2:
- Add support for i8 zext load from stack.
- Coding style fixes
v3:
- Don't reserve registers for indirect addressing when it isn't
being used.
- Fix bug caused by LLVM limiting the number of SubRegIndex
declarations.
v4:
- Fix 64-bit defines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174525 91177308-0d34-0410-b5e6-96231b3b80d8
Use one intrinsic for all sorts of interpolation.
Use two separate unexpanded instructions to represent INTERP_XY and _ZW -
this will allow to eliminate one part if it's not used.
Track liveness of special interpolation regs instead of reserving them -
this will allow to reuse those regs, lowering reg pressure.
Patch By: Vadim Girlin
v2[Vincent Lejeune]: Rebased against current llvm master
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174394 91177308-0d34-0410-b5e6-96231b3b80d8
Remove Cxxx registers, add new special register - "ALU_CONST" and new
operand for each alu src - "sel". ALU_CONST is used to designate that the
new operand contains the value to override src.sel, src.kc_bank, src.chan
for constants in the driver.
Patch by: Vadim Girlin
Vincent Lejeune:
- Use pointers for constants
- Fold CONST_ADDRESS when possible
Tom Stellard:
- Give CONSTANT_BUFFER_0 its own address space
- Use integer types for constant loads
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173222 91177308-0d34-0410-b5e6-96231b3b80d8