We were passing the scratch buffer address to the shaders via user sgprs,
but now we use external symbols and have the driver patch the shader
using reloc information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226586 91177308-0d34-0410-b5e6-96231b3b80d8
We don't have a good way of legalizing this if the frame index offset
is more than the 12-bits, which is size of MUBUF's offset field, so
now we store the frame index in the vaddr field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226584 91177308-0d34-0410-b5e6-96231b3b80d8
The fixes are to note that AArch64 has additional restrictions on when local
relocations can be used. In particular, ld64 requires that relocations to
cstring/cfstrings use linker visible symbols.
Original message:
In an assembly expression like
bar:
.long L0 + 1
the intended semantics is that bar will contain a pointer one byte past L0.
In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.
The solution used in ELF to use relocation with symbols if there is a non-zero
addend.
In MachO before this patch we would just keep all symbols in some sections.
This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.
This patch implements the non-zero addend logic for MachO too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226503 91177308-0d34-0410-b5e6-96231b3b80d8
Instructions with 1 operand can still use source modifiers,
so make sure we don't print an extra comma afterwards.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226226 91177308-0d34-0410-b5e6-96231b3b80d8
This removes some duplicated classes and definitions.
These instructions are defined:
_e32 // pseudo
_e32_si
_e64 // pseudo
_e64_si
_e64_vi
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226191 91177308-0d34-0410-b5e6-96231b3b80d8
v2: modify hasVALU32BitEncoding instead
v3: - add pseudoToMCOpcode helper to AMDGPUInstInfo, which is used by both
hasVALU32BitEncoding and AMDGPUMCInstLower::lower
- report an error if a pseudo can't be lowered
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226188 91177308-0d34-0410-b5e6-96231b3b80d8
utils/sort_includes.py.
I clearly haven't done this in a while, so more changed than usual. This
even uncovered a missing include from the InstrProf library that I've
added. No functionality changed here, just mechanical cleanup of the
include order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225974 91177308-0d34-0410-b5e6-96231b3b80d8
Don't do the v4i8 -> v4f32 combine if the load will need to
be expanded due to alignment. This stops adding instructions
to repack into a single register that the v_cvt_ubyteN_f32
instructions read.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225926 91177308-0d34-0410-b5e6-96231b3b80d8
Now that the source and destination types can be specified,
allow doing an expansion that doesn't use an EXTLOAD of the
result type. Try to do a legal extload to an intermediate type
and extend that if possible.
This generalizes the special case custom lowering of extloads
R600 has been using to work around this problem.
This also happens to fix a bug that would incorrectly use more
aligned loads than should be used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225925 91177308-0d34-0410-b5e6-96231b3b80d8
The machine scheduler is still disabled by default.
The schedule model is not complete yet, and could be improved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225913 91177308-0d34-0410-b5e6-96231b3b80d8
The backend now assumes that all immediates are integers. This allows
us to simplify immediate handling code, becasue we no longer need to
handle fp and integer immediates differently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225844 91177308-0d34-0410-b5e6-96231b3b80d8
This requires a new hook to prevent expanding sqrt in terms
of rsqrt and reciprocal. v_rcp_f32, v_rsq_f32, and v_sqrt_f32 are
all the same rate, so this expansion would just double the number
of instructions and cycles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225828 91177308-0d34-0410-b5e6-96231b3b80d8
Only do for f32 since I'm unclear on both what this is expecting
for the refinement steps in terms of accuracy, and what
f64 instruction actually provides.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225827 91177308-0d34-0410-b5e6-96231b3b80d8
Speculating things is generally good. SI+ has instructions for these
for 32-bit values. This is still probably better even with the expansion
for 64-bit values, although it is odd that this callback doesn't have
the size as a parameter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225822 91177308-0d34-0410-b5e6-96231b3b80d8
There are some operands which can take either immediates or registers
and we were previously using different register class to distinguish
between operands that could take immediates and those that could not.
This patch switches to using RegisterOperands which should simplify the
backend by reducing the number of register classes and also make it
easier to implement the assembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225662 91177308-0d34-0410-b5e6-96231b3b80d8
One is that AArch64 has additional restrictions on when local relocations can
be used. We have to take those into consideration when deciding to put a L
symbol in the symbol table or not.
The other is that ld64 requires the relocations to cstring to use linker
visible symbols on AArch64.
Thanks to Michael Zolotukhin for testing this!
Remove doesSectionRequireSymbols.
In an assembly expression like
bar:
.long L0 + 1
the intended semantics is that bar will contain a pointer one byte past L0.
In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.
The solution used in ELF to use relocation with symbols if there is a non-zero
addend.
In MachO before this patch we would just keep all symbols in some sections.
This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.
This patch implements the non-zero addend logic for MachO too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225644 91177308-0d34-0410-b5e6-96231b3b80d8
Its functionality has been replaced by calling
SIInstrInfo::legalizeOperands() from
SIISelLowering::AdjstInstrPostInstrSelection() and running the
SIFoldOperands and SIShrinkInstructions passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225445 91177308-0d34-0410-b5e6-96231b3b80d8
type (in addition to the memory type).
The *LoadExt* legalization handling used to only have one type, the
memory type. This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.
However, this isn't always the case. For instance, on X86, with AVX,
this is legal:
v4i32 load, zext from v4i8
but this isn't:
v4i64 load, zext from v4i8
Whereas v4i64 is (arguably) legal, even without AVX2.
Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.
Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.
Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior. The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc...).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)
No functional change intended.
Differential Revision: http://reviews.llvm.org/D6532
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225421 91177308-0d34-0410-b5e6-96231b3b80d8
Folding the same immediate into multiple instruction will increase
program size, which can hurt performance.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225405 91177308-0d34-0410-b5e6-96231b3b80d8
Use VGPR_32 register class instead. These two register classes were
identical and having separate classes was causing
SIInstrInfo::isLegalOperands() to be overly conservative in some cases.
This change is necessary to prevent future paches from missing a folding
opportunity in fneg-fabs.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225382 91177308-0d34-0410-b5e6-96231b3b80d8
In DS write instructions, the address operand comes before the value
operand(s) which is reversed from every other instruction type.
The SIInsertWait assumed that the first use for each instruction
was the value, so for DS write it was protecting the address
operand with s_waitcnt instructions when it should have been
protecting the value operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225289 91177308-0d34-0410-b5e6-96231b3b80d8
This is equivalent to the AMDGPUTargetMachine now, but it is the
starting point for separating R600 and GCN functionality into separate
targets.
It is recommened that users start using the gcn triple for GCN-based
GPUs, because using the r600 triple for these GPUs will be deprecated in
the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225277 91177308-0d34-0410-b5e6-96231b3b80d8
The dump was dependent on a feature string, which meant that it couldn't
be disabled or enable on a per compile basis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225275 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225114 91177308-0d34-0410-b5e6-96231b3b80d8
The issues was that AArch64 has additional restrictions on when local
relocations can be used. We have to take those into consideration when
deciding to put a L symbol in the symbol table or not.
Original message:
Remove doesSectionRequireSymbols.
In an assembly expression like
bar:
.long L0 + 1
the intended semantics is that bar will contain a pointer one byte past L0.
In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.
The solution used in ELF to use relocation with symbols if there is a non-zero
addend.
In MachO before this patch we would just keep all symbols in some sections.
This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.
This patch implements the non-zero addend logic for MachO too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225048 91177308-0d34-0410-b5e6-96231b3b80d8