Since read2 / write2 are emitted for 4-byte aligned 8-byte
accesses, these are seen by the scheduler.
The DAG scheduler is semi-deprecated, so just
ignore these for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217969 91177308-0d34-0410-b5e6-96231b3b80d8
Need to convert the 64 element offset into bytes, not just the element
size like the normal case instructions.
Noticed by inspection. This can't be hit now because
st64 instructions aren't emitted during instruction selection,
and the post-RA scheduler isn't enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217560 91177308-0d34-0410-b5e6-96231b3b80d8
We must constrain the destination register class of legalized operands
to a VGPR class or else the illegal operand may be folded back into
the instruction by the register coalescer.
This fixes a bug in add.ll that will be uncovered by future commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217249 91177308-0d34-0410-b5e6-96231b3b80d8
Also fix bug this exposed where when legalizing an immediate
operand, a v_mov_b32 would be created with a VSrc dest register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217108 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a crash in the OpenCV test:
ImgprocWarpResizeArea/Resize.Mat/16
There is no test case for this, because this failure depends on a
specific ordering of the loads, which could easily change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217040 91177308-0d34-0410-b5e6-96231b3b80d8
This will simplify the SGPR spilling and also allow us to use
MachineFrameInfo for calculating offsets, which should be more
reliable than our custom code.
This fixes a crash in some cases where a register would be spilled
in a branch such that the VGPR defined for spilling did not dominate
all the uses when restoring.
This fixes a crash in an ocl conformance test. The test requries
register spilling and is too big to include.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216217 91177308-0d34-0410-b5e6-96231b3b80d8
Ordinarily (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
is only done if the add has one use. If the resulting constant
add can be folded into an addressing mode, force this to happen
for the pointer operand.
This ends up happening a lot because of how LDS objects are allocated.
Since the globals are allocated next to each other, acessing the first
element of the second object is directly indexed by a shifted pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215739 91177308-0d34-0410-b5e6-96231b3b80d8
This saves us from having to copy a 64-bit 0 value into VGPRs for
BUFFER_* instruction which only have a 12-bit immediate offset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215399 91177308-0d34-0410-b5e6-96231b3b80d8
This currently has a noticable effect on the kernel argument loads.
LDS and global loads are more problematic, I think because of how copies
are currently inserted to ensure that the address is a VGPR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214942 91177308-0d34-0410-b5e6-96231b3b80d8
Abs/neg folding has moved out of foldOperands and into the instruction
selection phase using complex patterns. As a consequence of this
change, we now prefer to select the 64-bit encoding for most
instructions and the modifier operands have been dropped from
integer VOP3 instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214467 91177308-0d34-0410-b5e6-96231b3b80d8
We were incorrectly assuming that all VOP2 instructions can read SGPRs
in Src0, but this is not true for instructions that read carry-in from
VCC.
The old logic has been replaced with new logic which checks the defined
register classes of the VOP2 instruction to determine whether or not to
legalize the operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214465 91177308-0d34-0410-b5e6-96231b3b80d8
We were commuting the instruction by still shrinking it using the
original opcode.
NOTE: This is a candidate for the 3.5 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214463 91177308-0d34-0410-b5e6-96231b3b80d8
We can treat ds_read2_* as a single offset if the offsets are adjacent.
No test since emission of read2 instructions for partially
aligned loads isn't implemented yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214269 91177308-0d34-0410-b5e6-96231b3b80d8
This implements a solution for constant initializers suggested
by Vadim Girlin, where we store the data after the shader code
and then use the S_GETPC instruction to compute its address.
This saves use the trouble of creating a new buffer for constant data
and then having to pass the pointer to the kernel via user SGPRs or the
input buffer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213530 91177308-0d34-0410-b5e6-96231b3b80d8
I can't get VGPR spilling to work reliable, so for now just emit
an error when the register allocator tries to spill VGPRs.
v2:
- Fix build
v3:
- Added crash fix when spilling SPGRs
v4:
- Use V_MOV_B32 as a dummy instruction instead of S_NOP
Patch by: Darren Powell
https://bugs.freedesktop.org/show_bug.cgi?id=75276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210588 91177308-0d34-0410-b5e6-96231b3b80d8
The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.
v2:
- Fix calculation of lane index
- Extend VGPR liveness to end of program.
v3:
- Use SIMM16 field of S_NOP to specify multiple NOPs.
https://bugs.freedesktop.org/show_bug.cgi?id=75005
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207843 91177308-0d34-0410-b5e6-96231b3b80d8
Use scalar BFE with constant shift and offset when possible.
This is complicated by the fact that the scalar version packs
the two operands of the vector version into one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206558 91177308-0d34-0410-b5e6-96231b3b80d8