Currently, we're adding a uint64_t describing the current subtarget so
that matching can check whether the specified register is valid.
However, we want to move to a bitset for those bits (x86 has more than
64 of them).
This can't live in a union so it's probably better to do the checks
early (especially as there are only 3 of them).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226841 91177308-0d34-0410-b5e6-96231b3b80d8
AAPCS64 says that it's up to the platform to specify whether x18 is
reserved, and a first step on that way is to add a flag controlling
it.
From: Andrew Turner <andrew@fubar.geek.nz>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226664 91177308-0d34-0410-b5e6-96231b3b80d8
The fixes are to note that AArch64 has additional restrictions on when local
relocations can be used. In particular, ld64 requires that relocations to
cstring/cfstrings use linker visible symbols.
Original message:
In an assembly expression like
bar:
.long L0 + 1
the intended semantics is that bar will contain a pointer one byte past L0.
In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.
The solution used in ELF to use relocation with symbols if there is a non-zero
addend.
In MachO before this patch we would just keep all symbols in some sections.
This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.
This patch implements the non-zero addend logic for MachO too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226503 91177308-0d34-0410-b5e6-96231b3b80d8
utils/sort_includes.py.
I clearly haven't done this in a while, so more changed than usual. This
even uncovered a missing include from the InstrProf library that I've
added. No functionality changed here, just mechanical cleanup of the
include order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225974 91177308-0d34-0410-b5e6-96231b3b80d8
One is that AArch64 has additional restrictions on when local relocations can
be used. We have to take those into consideration when deciding to put a L
symbol in the symbol table or not.
The other is that ld64 requires the relocations to cstring to use linker
visible symbols on AArch64.
Thanks to Michael Zolotukhin for testing this!
Remove doesSectionRequireSymbols.
In an assembly expression like
bar:
.long L0 + 1
the intended semantics is that bar will contain a pointer one byte past L0.
In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.
The solution used in ELF to use relocation with symbols if there is a non-zero
addend.
In MachO before this patch we would just keep all symbols in some sections.
This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.
This patch implements the non-zero addend logic for MachO too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225644 91177308-0d34-0410-b5e6-96231b3b80d8
type (in addition to the memory type).
The *LoadExt* legalization handling used to only have one type, the
memory type. This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.
However, this isn't always the case. For instance, on X86, with AVX,
this is legal:
v4i32 load, zext from v4i8
but this isn't:
v4i64 load, zext from v4i8
Whereas v4i64 is (arguably) legal, even without AVX2.
Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.
Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.
Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior. The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc...).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)
No functional change intended.
Differential Revision: http://reviews.llvm.org/D6532
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225421 91177308-0d34-0410-b5e6-96231b3b80d8
A few loops do trickier things than just iterating on an MVT subset,
so I'll leave them be for now.
Follow-up of r225387.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225392 91177308-0d34-0410-b5e6-96231b3b80d8
Even thouh gcc produces simialr instructions as Owen pointed out the two patterns aren’t equivalent in the case
where the original subtraction could have caused an overflow.
Reverting the same.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225341 91177308-0d34-0410-b5e6-96231b3b80d8
We used to generate code similar to:
umov.b w8, v0[2]
strb w8, [x0, x1]
because the STR*ro* patterns were preferred to ST1*.
Instead, we can avoid going through GPRs, and generate:
add x8, x0, x1
st1.b { v0 }[2], [x8]
This patch increases the ST1* AddedComplexity to achieve that.
rdar://16372710
Differential Revision: http://reviews.llvm.org/D6202
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225183 91177308-0d34-0410-b5e6-96231b3b80d8
For 0-lane stores, we used to generate code similar to:
fmov w8, s0
str w8, [x0, x1, lsl #2]
instead of:
str s0, [x0, x1, lsl #2]
To correct that: for store lane 0 patterns, directly match to STR <subreg>0.
Byte-sized instructions don't have the special case for a 0 index,
because FPR8s are defined to have untyped content.
rdar://16372710
Differential Revision: http://reviews.llvm.org/D6772
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225181 91177308-0d34-0410-b5e6-96231b3b80d8
Weak externals are resolved statically, so we can actually generate the tail
call on PE/COFF targets without breaking the requirements. It is questionable
whether we want to propagate the current behaviour for MachO as the requirements
are part of the ARM ELF specifications, and it seems that prior to the SVN
r215890, we would have tail'ed the call. For now, be conservative and only
permit it on PE/COFF where the call will always be fully resolved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225119 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225114 91177308-0d34-0410-b5e6-96231b3b80d8
The issues was that AArch64 has additional restrictions on when local
relocations can be used. We have to take those into consideration when
deciding to put a L symbol in the symbol table or not.
Original message:
Remove doesSectionRequireSymbols.
In an assembly expression like
bar:
.long L0 + 1
the intended semantics is that bar will contain a pointer one byte past L0.
In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.
The solution used in ELF to use relocation with symbols if there is a non-zero
addend.
In MachO before this patch we would just keep all symbols in some sections.
This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.
This patch implements the non-zero addend logic for MachO too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225048 91177308-0d34-0410-b5e6-96231b3b80d8
In an assembly expression like
bar:
.long L0 + 1
the intended semantics is that bar will contain a pointer one byte past L0.
In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.
The solution used in ELF to use relocation with symbols if there is a non-zero
addend.
In MachO before this patch we would just keep all symbols in some sections.
This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.
This patch implements the non-zero addend logic for MachO too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224985 91177308-0d34-0410-b5e6-96231b3b80d8
Debug info marks the first instruction without the FrameSetup flag
as being the end of the function prologue. Any CFI instructions in the
middle of the function prologue would cause debug info to end the prologue
too early and worse, attach the line number of the CFI instruction, which
incidentally is often 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224294 91177308-0d34-0410-b5e6-96231b3b80d8
Add in definedness checks for shift operators, null checks when
pointers are assumed by the code to be non-null, and explicit
unreachables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224255 91177308-0d34-0410-b5e6-96231b3b80d8
Previously print+verify passes were added in a very unsystematic way, which is
annoying when debugging as you miss intermediate steps and allows bugs to stay
unnotice when no verification is performed.
To make this change practical I added the possibility to explicitely disable
verification. I used this option on all places where no verification was
performed previously (because alot of places actually don't pass the
MachineVerifier).
In the long term these problems should be fixed properly and verification
enabled after each pass. I'll enable some more verification in subsequent
commits.
This is the 2nd attempt at this after realizing that PassManager::add() may
actually delete the pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224059 91177308-0d34-0410-b5e6-96231b3b80d8
Previously print+verify passes were added in a very unsystematic way, which is
annoying when debugging as you miss intermediate steps and allows bugs to stay
unnotice when no verification is performed.
To make this change practical I added the possibility to explicitely disable
verification. I used this option on all places where no verification was
performed previously (because alot of places actually don't pass the
MachineVerifier).
In the long term these problems should be fixed properly and verification
enabled after each pass. I'll enable some more verification in subsequent
commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224042 91177308-0d34-0410-b5e6-96231b3b80d8
In the large code model we have to first get the address of the GOT entry, load
the address of the constant, and then load the constant itself.
To avoid these loads and the GOT entry alltogether this commit changes the way
how FP constants are materialized in the large code model. The constats are now
materialized in a GPR and then bitconverted/moved into the FPR.
Reviewed by Tim Northover
Fixes rdar://problem/16572564.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223941 91177308-0d34-0410-b5e6-96231b3b80d8
The load/store value type is currently not available when lowering the memcpy
intrinsic. Add the missing nullptr check to support this in 'computeAddress'.
Fixes rdar://problem/19178947.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223818 91177308-0d34-0410-b5e6-96231b3b80d8
DenseSet used to be implemented as DenseMap<Key, char>, which usually doubled
the memory footprint of the map. Now we use a compressed set so the second
element uses no memory at all. This required some surgery on DenseMap as
all accesses to the bucket now have to go through methods; this should
have no impact on the behavior of DenseMap though. The new default bucket
type for DenseMap is a slightly extended std::pair as we expose it through
DenseMap's iterator and don't want to break any existing users.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223588 91177308-0d34-0410-b5e6-96231b3b80d8
All our patterns use MVT::i64, but the ISelLowering nodes were inconsistent in
their choice.
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223551 91177308-0d34-0410-b5e6-96231b3b80d8
Use the MCAsmInfo instead of the DataLayout, and allow
specifying a custom prefix for labels specifically. HSAIL
requires that labels begin with @, but global symbols with &.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223323 91177308-0d34-0410-b5e6-96231b3b80d8
A global variable without an explicit alignment specified should be assumed to
be ABI-aligned according to its type, like on other platforms. This allows us
to use better memory operations when accessing it.
rdar://18533701
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223180 91177308-0d34-0410-b5e6-96231b3b80d8
This frequently leads to cases like:
ldr xD, [xN, :lo12:var]
add xA, xN, :lo12:var
ldr xD, [xA, #8]
where the ADD would have been needed anyway, and the two distinct addressing
modes can prevent the formation of an ldp. Because of how we handle ADRP
(aggressively forming an ADRP/ADD pseudo-inst at ISel time), this pattern also
results in duplicated ADRP instructions (one on its own to cover the ldr, and
one combined with the add).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223172 91177308-0d34-0410-b5e6-96231b3b80d8
Reduce the number of nops emitted for stackmap shadows on AArch64 by counting
non-stackmap instructions up to the next branch target towards the requested
shadow.
<rdar://problem/14959522>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223156 91177308-0d34-0410-b5e6-96231b3b80d8
The blocking code originated in ARM, which is more aggressive about casting
types to a canonical representative before doing anything else, so I missed out
most vector HFAs and broke the ABI. This should fix it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223126 91177308-0d34-0410-b5e6-96231b3b80d8
r208210 introduced an optimization that improves the vector select
codegen by doing the setcc on vectors directly.
This is a problem they the setcc operands are i1s, because the
optimization would create vectors of i1, which aren't legal.
Part of PR21549.
Differential Revision: http://reviews.llvm.org/D6308
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223075 91177308-0d34-0410-b5e6-96231b3b80d8
r213378 improved f16 bitcasts, so that they go directly through subregs,
instead of through the stack. That code now causes an assertion failure
for bitcasts from other 16-bits types (most importantly v2i8).
Correct that by doing the custom lowering for i16 bitcasts only when the
input is an f16.
Part of PR21549.
Differential Revision: http://reviews.llvm.org/D6307
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223074 91177308-0d34-0410-b5e6-96231b3b80d8
The AAPCS treats small structs and homogeneous floating (or vector) aggregates
specially, and guarantees they either get passed as a contiguous block of
registers, or prevent any future use of those registers and get passed on the
stack.
This concept can fit quite neatly into LLVM's own type system, mapping an HFA
to [N x float] and so on, and small structs to [N x i64]. Doing so allows
front-ends to emit AAPCS compliant code without having to duplicate the
register counting logic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222903 91177308-0d34-0410-b5e6-96231b3b80d8
This mostly entails adding relocations, however there are a couple of
changes to existing relocations:
1. R_AARCH64_NONE is defined to be zero rather than 256
R_AARCH64_NONE has been defined to be zero for a long time elsewhere
e.g. binutils and glibc since the submission of the AArch64 port in
2012 so this is required for compatibility.
2. R_AARCH64_TLSDESC_ADR_PAGE renamed to R_AARCH64_TLSDESC_ADR_PAGE21
I don't think there is any way for relocation names to leak out of LLVM
so this should not break anything.
Tested with check-all with no regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222821 91177308-0d34-0410-b5e6-96231b3b80d8