Summary: Building FP16 constant vectors caused the FP16 data to be bitcast to i64. This patch creates a BITCAST node with the correct value, and adds a test to verify correct handling.
Reviewers: mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, jmolloy, ab, srhines, llvm-commits, rengolin, aemerson
Differential Revision: http://reviews.llvm.org/D8369
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232562 91177308-0d34-0410-b5e6-96231b3b80d8
Before this patch code wanting to create temporary labels for a given entity
(function, cu, exception range, etc) had to keep its own counter to have stable
symbol names.
createTempSymbol would still add a suffix to make sure a new symbol was always
returned, but it kept a single counter. Because of that, if we were to use
just createTempSymbol("cu_begin"), the label could change from cu_begin42 to
cu_begin43 because some other code started using temporary labels.
Simplify this by just keeping one counter per prefix and removing the various
specialized counters.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232535 91177308-0d34-0410-b5e6-96231b3b80d8
Optimize concat_vectors of truncated vectors, where the intermediate
type is illegal, to avoid said illegality, e.g.,
(v4i16 (concat_vectors (v2i16 (truncate (v2i64))),
(v2i16 (truncate (v2i64)))))
->
(v4i16 (truncate (v4i32 (concat_vectors (v2i32 (truncate (v2i64))),
(v2i32 (truncate (v2i64)))))))
This isn't really target-specific, and, as such, would best go in the
DAGCombiner. However, ISD::TRUNCATE legality isn't keyed on both input
and result type, so we might generate worse code when we don't know
better. On AArch64 we know it's fine for v2i64->v4i16 and v4i32->v8i8.
rdar://20022387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232459 91177308-0d34-0410-b5e6-96231b3b80d8
This covers essentially all of llvm's headers and libs. One or two weird
cases I wasn't sure were worth/appropriate to fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232394 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is instead of doing this in target independent code and is the last
non-functional change before targets begin to distinguish between
different memory constraints when selecting code for the ISD::INLINEASM
node.
Next, each target will individually move away from the idea that all
memory constraints behave like 'm'.
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8173
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232373 91177308-0d34-0410-b5e6-96231b3b80d8
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break
anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate
Constraint_* values.
PR22883 was caused the matching operands copying the whole of the operand flags
for the matched operand. This included the constraint id which needed to be
replaced with the operand number. This has been fixed with a conversion
function. Following on from this, matching operands also used the operand
number as the constraint id. This has been fixed by looking up the matched
operand and taking it from there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232165 91177308-0d34-0410-b5e6-96231b3b80d8
implementation. This requires a bit of scaffolding and a few fixups
that'll go away once all of the ports have been migrated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232103 91177308-0d34-0410-b5e6-96231b3b80d8
This (r232027) has caused PR22883; so it seems those bits might be used by
something else after all. Reverting until we can figure out what else to do.
Original commit message:
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232093 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8171
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232027 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
I don't know why every singled backend had to redeclare its own DataLayout.
There was a virtual getDataLayout() on the common base TargetMachine, the
default implementation returned nullptr. It was not clear from this that
we could assume at call site that a DataLayout will be available with
each Target.
Now getDataLayout() is no longer virtual and return a pointer to the
DataLayout member of the common base TargetMachine. I plan to turn it into
a reference in a future patch.
The only backend that didn't have a DataLayout previsouly was the CPPBackend.
It now initializes the default DataLayout. This commit is NFC for all the
other backends.
Test Plan: clang+llvm ninja check-all
Reviewers: echristo
Subscribers: jfb, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8243
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231987 91177308-0d34-0410-b5e6-96231b3b80d8
time. The target independent code was passing in one all the
time and targets weren't checking validity before using. Update
a few calls to pass in a MachineFunction where necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231970 91177308-0d34-0410-b5e6-96231b3b80d8
update all ports accordingly. Required a couple of small rewrites
in handling subtarget features during creation in PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231861 91177308-0d34-0410-b5e6-96231b3b80d8
This adds new node types for each intrinsic.
For instance, for addv, we have AArch64ISD::UADDV, such that:
(v4i32 (uaddv ...))
is the same as
(v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...))))
that is,
(v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)),
(i32 (int_aarch64_neon_uaddv ...)), ssub)
In a combine, we transform all such across-vector-lanes intrinsics to:
(i32 (extract_vector_elt (uaddv ...), 0))
This has one big advantage: by making the extract_element explicit, we
enable the existing patterns for lane-aware instructions to fire.
This lets us avoid needlessly going through the GPRs. Consider:
uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) {
return vmulq_n_u32(a, vaddvq_u32(b));
}
We now generate:
addv.4s s1, v1
mul.4s v0, v0, v1[0]
instead of the previous:
addv.4s s1, v1
fmov w8, s1
dup.4s v1, w8
mul.4s v0, v1, v0
rdar://20044838
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231840 91177308-0d34-0410-b5e6-96231b3b80d8
Most are redundant, and they never seem to fire.
The V128 integer patterns already exist in the INS multiclass.
The duplicates only fire when the vector index type isn't i64,
because they accept "imm" instead of an explicit "i64", as the
instruction definition patterns do.
TLI::getVectorIdxTy is i64 on AArch64, so this should never happen.
Also, one of them had a typo: for i64, INSvi32lane was used.
I noticed because I mistakenly used an explicit i32 as the idx type,
and got ins.s for an i64 vector_insert.
The V64 patterns also don't seem to ever fire, as V64 vector
extract/insert are legalized to V128.
The equivalent float patterns are unique and useful, so keep them.
No functional change intended; none exhibited on the LIT and LNT tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231838 91177308-0d34-0410-b5e6-96231b3b80d8
For inner one of nested loops, it is more likely to be a hot loop,
and the runtime check can be promoted out from patch 0001, so the
overhead is less, we can try a doubled threshold to unroll more loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231632 91177308-0d34-0410-b5e6-96231b3b80d8
In theory this allows the compiler to skip materializing the array on
the stack. In practice clang often fails to do that, but that's a
different story. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231571 91177308-0d34-0410-b5e6-96231b3b80d8
Teach the load store optimizer how to sign extend a result of a load pair when
it helps creating more pairs.
The rational is that loads are more expensive than sign extensions, so if we
gather some in one instruction this is better!
<rdar://problem/20072968>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231527 91177308-0d34-0410-b5e6-96231b3b80d8
Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT equivalent
symbol accesses. Unlike 64-bit targets, there's no GOTPCREL relocation, and
access through a non_lazy_symbol_pointers section is used instead.
-- before
_extgotequiv:
.long _extfoo
_delta:
.long _extgotequiv-_delta
-- after
_delta:
.long L_extfoo$non_lazy_ptr-_delta
.section __IMPORT,__pointers,non_lazy_symbol_pointers
L_extfoo$non_lazy_ptr:
.indirect_symbol _extfoo
.long 0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231475 91177308-0d34-0410-b5e6-96231b3b80d8
Follow up r230264 and add ARM64 support for replacing global GOT
equivalent symbol accesses by references to the GOT entry for the final
symbol instead, example:
-- before
.globl _foo
_foo:
.long 42
.globl _gotequivalent
_gotequivalent:
.quad _foo
.globl _delta
_delta:
.long _gotequivalent-_delta
-- after
.globl _foo
_foo:
.long 42
.globl _delta
Ltmp3:
.long _foo@GOT-Ltmp3
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231474 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In PNaCl, most atomic instructions have their own @llvm.nacl.atomic.* function, each one, with a few exceptions, represents a consistent behaviour across all NaCl-supported targets. Unfortunately, the atomic RMW operations nand, [u]min, and [u]max aren't directly represented by any such @llvm.nacl.atomic.* function. This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that a future `Le32TargetLowering` class can selectively inform the caller how the target desires the atomic RMW instruction to be expanded (ie via load-linked/store-conditional for ARM/AArch64, via cmpxchg for X86/others?, or not at all for Mips) if at all.
This does not represent a behavioural change and as such no tests were added.
Patch by: Richard Diamond.
Reviewers: jfb
Reviewed By: jfb
Subscribers: jfb, aemerson, t.p.northover, llvm-commits
Differential Revision: http://reviews.llvm.org/D7713
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231250 91177308-0d34-0410-b5e6-96231b3b80d8
As is described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU linkers
ld.bfd and ld.gold currently only support a subset of the whole range of AArch64
ELF TLS relocations. Furthermore, they assume that some of the code sequences to
access thread-local variables are produced in a very specific sequence.
When the sequence is not as the linker expects, it can silently mis-relaxe/mis-optimize
the instructions.
Even if that wouldn't be the case, it's good to produce the exact sequence,
as that ensures that linkers can perform optimizing relaxations.
This patch:
* implements support for 16MiB TLS area size instead of 4GiB TLS area size. Ideally clang
would grow an -mtls-size option to allow support for both, but that's not part of this patch.
* by default doesn't produce local dynamic access patterns, as even modern ld.bfd and ld.gold
linkers do not support the associated relocations. An option (-aarch64-elf-ldtls-generation)
is added to enable generation of local dynamic code sequence, but is off by default.
* makes sure that the exact expected code sequence for local dynamic and general dynamic
accesses is produced, by making use of a new pseudo instruction. The patch also removes
two (AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing AArch64-specific pseudo
SDNode instructions that are superseded by the new one (TLSDESC_CALLSEQ).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231227 91177308-0d34-0410-b5e6-96231b3b80d8
Accidentally committed a few more of these cleanup changes than
intended. Still breaking these out & tidying them up.
This reverts commit r231135.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231136 91177308-0d34-0410-b5e6-96231b3b80d8
There doesn't seem to be any need to assert that iterator assignment is
between iterators over the same node - if you want to reuse an iterator
variable to iterate another node, that's perfectly acceptable. Just
don't mix comparisons between iterators into disjoint sequences, as
usual.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231135 91177308-0d34-0410-b5e6-96231b3b80d8
uses of TM->getSubtargetImpl and propagate to all calls.
This could be a debugging regression in places where we had a
TargetMachine and/or MachineFunction but don't have it as part
of the MachineInstr. Fixing this would require passing a
MachineFunction/Function down through the print operator, but
none of the existing uses in tree seem to do this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230710 91177308-0d34-0410-b5e6-96231b3b80d8
a lookup, pass that in rather than use a naked call to getSubtargetImpl.
This involved passing down and around either a TargetMachine or
TargetRegisterInfo. Update all callers/definitions around the targets
and SelectionDAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230699 91177308-0d34-0410-b5e6-96231b3b80d8
This required plumbing a TargetRegisterInfo through computeRegisterProperties
and into findRepresentativeClass which uses it for register class
iteration. This required passing a subtarget into a few target specific
initializations of TargetLowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230583 91177308-0d34-0410-b5e6-96231b3b80d8
The reason why these large shift sizes happen is because OpaqueConstants
currently inhibit alot of DAG combining, but that has to be addressed in
another commit (like the proposal in D6946).
Differential Revision: http://reviews.llvm.org/D6940
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230355 91177308-0d34-0410-b5e6-96231b3b80d8