This is a complete re-write if the bottom-up vectorization class.
Before this commit we scanned the instruction tree 3 times. First in search of merge points for the trees. Second, for estimating the cost. And finally for vectorization.
There was a lot of code duplication and adding the DCE exposed bugs. The new design is simpler and DCE was a part of the design.
In this implementation we build the tree once. After that we estimate the cost by scanning the different entries in the constructed tree (in any order). The vectorization phase also works on the built tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185774 91177308-0d34-0410-b5e6-96231b3b80d8
For alignment purposes, the instruction array will always have an even
number of entries, with the final entry potentially unused (in which
case the array will be one longer than indicated by the count of unwind
codes field).
Reviewed by Charles Davis and Nico Rieck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185760 91177308-0d34-0410-b5e6-96231b3b80d8
data structures.
The Win64 EH data structures must be of type IMAGE_REL_AMD64_ADDR32NB
instead of IMAGE_REL_AMD64_ADDR32. This is easiely achieved by adding
the VK_COFF_IMGREL32 modifier to the symbol reference.
Change also references to start and end of the SEH range of a function
as offsets to start of the function.
Reviewed by Charles Davis and Nico Rieck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185759 91177308-0d34-0410-b5e6-96231b3b80d8
The code offset for unwind code SET_FPREG is wrong because it is set
to constant 0. The fix is to do the same as for the other unwind
codes: emit a label and later the absolute difference between the
label and the begin of the prologue.
Also enables the failing test case MC/COFF/seh.s
Reviewed by Charles Davis and Nico Rieck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185758 91177308-0d34-0410-b5e6-96231b3b80d8
ReduceLoadWidth unconditionally drops extensions from loads. Limit it to the
case when all of the bits the extension would otherwise produce are dropped by
the shrink. It would be possible to shrink the load in more cases by merging
the extensions, but this isn't trivial and a very rare case. I left a TODO for
that case.
Fixes PR16551.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185755 91177308-0d34-0410-b5e6-96231b3b80d8
This prevents the emission of DAG-generated vreg definitions after a
tail call be dropping them entirely (on the grounds that nothing could
use them anyway, and they interfere with O0 CodeGen).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185754 91177308-0d34-0410-b5e6-96231b3b80d8
functions. Make the function attributes pass add it to known library functions
and when it can deduce it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185735 91177308-0d34-0410-b5e6-96231b3b80d8
A "pkhtb x, x, y asr #num" uses the lower 16 bits of "y asr #num" and packs them
in the bottom half of "x". An arithmetic and logic shift are only equivalent in
this context if the shift amount is 16. We would be shifting in ones into the
bottom 16bits instead of zeros if "y" is negative.
radar://14338767
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185712 91177308-0d34-0410-b5e6-96231b3b80d8
The stack coloring pass has code to delete stores and loads that become
trivially dead after coloring. Extend it to cope with single instructions
that copy from one frame index to another.
The testcase happens to show an example of this kicking in at the moment.
It did occur in Real Code too though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185705 91177308-0d34-0410-b5e6-96231b3b80d8
The stack coloring pass renumbered frame indexes with a loop of the form:
for each frame index FI
for each instruction I that uses FI
for each use of FI in I
rename FI to FI'
This caused problems if an instruction used two frame indexes F0 and F1
and if F0 was renamed to F1 and F1 to F2. The first time we visited the
instruction we changed F0 to F1, then we changed both F1s to F2.
In other words, the problem was that SSRefs recorded which instructions
used an FI, but not which MachineOperands and MachineMemOperands within
that instruction used it.
This is easily fixed for MachineOperands by walking the instructions
once and processing each operand in turn. There's already a loop to
do that for dead store elimination, so it seemed more efficient to
fuse the two at the block level.
MachineMemOperands are more tricky because they can be shared between
instructions. The patch handles them by making SSRefs an array of
MachineMemOperands rather than an array of MachineInstrs. We might end
up processing the same MachineMemOperand twice, but that's OK because
we always know from the SSRefs index what the original frame index was.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185703 91177308-0d34-0410-b5e6-96231b3b80d8
...now that the problem that prompted the restriction has been fixed.
The original spill-02.py was a compromise because at the time I couldn't
find an example that actually failed without the two scavenging slots.
The version included here did.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185701 91177308-0d34-0410-b5e6-96231b3b80d8
When a target@got@tprel or target@got@tprel@l symbol variant is used in
a fixup_ppc_half16 (*not* fixup_ppc_half16ds) context, we currently fail,
since the corresponding R_PPC64_GOT_TPREL16 / R_PPC64_GOT_TPREL16_LO
relocation types do not exist.
However, since such symbol variants resolve to GOT offsets which are
always 4-aligned, we can simply instead use the _DS variants of the
relocation types, which *do* exist.
The same applies for the @got@dtprel variants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185700 91177308-0d34-0410-b5e6-96231b3b80d8
This is another prerequisite for frame-to-frame MVC copies.
I'll commit the patch that makes use of the slot separately.
The downside of trying to test many corner cases with each of the
available addressing modes is that a fair few tests need to account
for the new frame layout. I do still think it's useful to have all
these tests though, since it's something that wouldn't get much coverage
otherwise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185698 91177308-0d34-0410-b5e6-96231b3b80d8
The ppc64-fixups.s test currently fails to build with GNU as, since it
does not support plain symbols as arguments to li/lis. Rewrite the test
for R_PPC64_ADDR16 and R_PPC64_REL16 to use lwz instead.
Allowing the test case to be built with both LLVM and GNU as makes it
easier to spot unwanted difference in the output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185694 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for the last missing construct to parse TLS-related
assembler code:
add 3, 4, symbol@tls
The ADD8TLS currently hard-codes the @tls into the assembler string.
This cannot be handled by the asm parser, since @tls is parsed as
a symbol variant. This patch changes ADD8TLS to have the @tls suffix
printed as symbol variant on output too, which allows us to remove
the isCodeGenOnly marker from ADD8TLS. This in turn means that we
can add a AsmOperand to accept @tls marked symbols on input.
As a side effect, this means that the fixup_ppc_tlsreg fixup type
is no longer necessary and can be merged into fixup_ppc_nofixup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185692 91177308-0d34-0410-b5e6-96231b3b80d8
In the SelectionDAG immediate operands to inline asm are constructed as
two separate operands. The first is a constant of value InlineAsm::Kind_Imm
and the second is a constant with the value of the immediate.
In ARMDAGToDAGISel::SelectInlineAsm, if we reach an operand of Kind_Imm we
should skip over the next operand too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185688 91177308-0d34-0410-b5e6-96231b3b80d8
We were being a bit too aggresive here in classifying global variables
with no global reference or constant value to be invalid - this would
cause LLVM to not emit the DWARF description of the global variable if
it had been optimized away, which isn't helpful for users who might
benefit from the global variable's description even if there's no
location information.
This also fixes a crasher issue here that I was unable to reduce a test
case for - involving a using decl (& subsequent
DW_TAG_imported_declaration ) of such a global variable that, once
optimized away, would crash when an attempt to emit the imported
declaration was made.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185675 91177308-0d34-0410-b5e6-96231b3b80d8
This transform was originally added in r185257 but later removed in
r185415. The original transform would create instructions speculatively
and then discard them if the speculation was proved incorrect. This has
been replaced with a scheme that splits the transform into two parts:
preflight and fold. While we preflight, we build up fold actions that
inform the folding stage on how to act.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185667 91177308-0d34-0410-b5e6-96231b3b80d8
This implements a proper PPCAsmBackend::writeNopData routine
that actually writes PowerPC nop instructions.
This fixes the last remaining difference in object file output
(text section) between the integrated assembler and GNU as
that I've seen anywhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185662 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a new decoder table/namespace 'VFPV8', as these instructions have their
top 4 bits as 0b1111, while other Thumb instructions have 0b1110.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185642 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for specifying condition registers and
condition register fields via expressions using the symbols
defined by the PowerISA, like "4*cr2+eq".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185633 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to create switches even if instcombine has munged two of the
incombing compares into one and some bit twiddling. This was motivated by enum
compares that are common in clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185632 91177308-0d34-0410-b5e6-96231b3b80d8
In the ARM back-end, build_vector nodes are lowered to a target specific
build_vector that uses floating point type.
This works well, unless the inserted bitcasts survive until instruction
selection. In that case, they incur moves between integer unit and floating
point unit that may result in inefficient code.
In other words, this conversion may introduce artificial dependencies when the
code leading to the build vector cannot be completed with a floating point type.
In particular, this happens when loads are not aligned.
Before this patch, in that case, the compiler generates general purpose loads
and creates the floating point vector from them, instead of directly using the
vector unit.
The patch uses a vector friendly sequence of code when the inserted bitcasts to
floating point survived DAGCombine.
This is done by a target specific DAGCombine that changes the target specific
build_vector into a sequence of insert_vector_elt that get rid of the bitcasts.
<rdar://problem/14170854>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185587 91177308-0d34-0410-b5e6-96231b3b80d8
Before the fix Thumb2 instructions of type "add rD, rN, #imm" (T3 encoding, see ARM ARM A8.8.4) with rD and rN both being low registers (r0-r7) were classified as having the T4 encoding.
The T4 encoding doesn't have a cc_out operand so for above instructions the operand gets erroneously removed, corrupting the token stream and leading to parse errors later in the process.
This bug prevented "add r1, r7, #0xcbcbcbcb" from being assembled correctly.
Fixes <rdar://problem/14224440>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185575 91177308-0d34-0410-b5e6-96231b3b80d8
Just as with mfocrf, it is also preferable to use mtocrf instead of
mtcrf when only a single CR register is to be written.
Current code however always emits mtcrf. This probably does not matter
when using an external assembler, since the GNU assembler will in fact
automatically replace mtcrf with mtocrf when possible. It does create
inefficient code with the integrated assembler, however.
To fix this, this patch adds MTOCRF/MTOCRF8 instruction patterns and
uses those instead of MTCRF/MTCRF8 everything. Just as done in the
MFOCRF patch committed as 185556, these patterns will be converted
back to MTCRF if MTOCRF is not available on the machine.
As a side effect, this allows to modify the MTCRF pattern to accept
the full range of mask operands for the benefit of the asm parser.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185561 91177308-0d34-0410-b5e6-96231b3b80d8
It was only passing because 'grep andpd' was not finding any andpd, but
we don't fail if part of a pipe fails.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185552 91177308-0d34-0410-b5e6-96231b3b80d8
This changes behavior of -msan-poison-stack=0 flag from not poisoning stack
allocations to actively unpoisoning them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185538 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for the generic forms of mtspr/mfspr
for the asm parser. The compiler will continue to use
the specialized patters for mtlr etc. since those are
needed to correctly describe data flow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185532 91177308-0d34-0410-b5e6-96231b3b80d8
Add a mapping from register-based <INSN>R instructions to the corresponding
memory-based <INSN>. Use it to cut down on the number of spill loads.
Some instructions extend their operands from smaller fields, so this
required a new TSFlags field to say how big the unextended operand is.
This optimisation doesn't trigger for C(G)R and CL(G)R because in practice
we always combine those instructions with a branch. Adding a test for every
other case probably seems excessive, but it did catch a missed optimisation
for DSGF (fixed in r185435).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185529 91177308-0d34-0410-b5e6-96231b3b80d8
1. it should accept only 4-byte aligned addresses
2. the maximum offset should be 1020
3. it should be encoded with the offset scaled by two bits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185528 91177308-0d34-0410-b5e6-96231b3b80d8
Swift cores implement store barriers that are stronger than the ARM
specification but weaker than general barriers. They are, in fact, just about
enough to provide the ordering needed for atomic operations with release
semantics.
This patch makes use of that quirk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185527 91177308-0d34-0410-b5e6-96231b3b80d8
This implies annotating it as nounwind and its arguments as nocapture. To be
conservative, we do not annotate the arguments with noalias since some platforms
do not have restrict on the declaration for gettimeofday.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185502 91177308-0d34-0410-b5e6-96231b3b80d8
Correctly handles ref_addr depending on the Dwarf version. Emit Dwarf with
version from module flag.
TODO: turn on/off features depending on the Dwarf version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185484 91177308-0d34-0410-b5e6-96231b3b80d8
This patch now adds support for recognizing TLS call sequences in
the asm parser. This needs a new pattern BL8_TLS, which is like
BL8_NOP_TLS except without nop. That pattern is used for the
asm parser only.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185478 91177308-0d34-0410-b5e6-96231b3b80d8
The PowerPC-specific modifiers VK_PPC_TLSGD and VK_PPC_TLSLD
correspond exactly to the generic modifiers VK_TLSGD and VK_TLSLD.
This causes some confusion with the asm parser, since VK_PPC_TLSGD
is output as @tlsgd, which is then read back in as VK_TLSGD.
To avoid this confusion, this patch removes the PowerPC-specific
modifiers and uses the generic modifiers throughout. (The only
drawback is that the generic modifiers are printed in upper case
while the usual convention on PowerPC is to use lower-case modifiers.
But this is just a cosmetic issue.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185476 91177308-0d34-0410-b5e6-96231b3b80d8
This adds an implementation of getDebugThreadLocalSymbol for
(64-bit) PowerPC. This needs to return a generic MCExpr
since on ppc64, we need to add a bias of 0x8000 to the
value returned by the R_PPC64_DTPREL64 relocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185461 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes some cases where we were using full 64-bit division for (sdiv i32, i32)
and (sdiv i64, i32).
The "32" in "SDIVREM32" just refers to the second operand. The first operand
of all *DIVREM*s is a GR128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185435 91177308-0d34-0410-b5e6-96231b3b80d8
Try to use MVC when spilling the destination of a simple load or the source
of a simple store. As explained in the comment, this doesn't yet handle
the case where the load or store location is also a frame index, since
that could lead to two simultaneous scavenger spills, something the
backend can't handle yet. spill-02.py tests that this restriction kicks in,
but unfortunately I've not yet found a case that would fail without it.
The volatile trick I used for other scavenger tests doesn't work here
because we can't use MVC for volatile accesses anyway.
I'm planning on relaxing the restriction later, hopefully with a test
that does trigger the problem...
Tests @f8 and @f9 also showed that L(G)RL and ST(G)RL were wrongly
classified as SimpleBDX{Load,Store}. It wouldn't be easy to test for
that bug separately, which is why I didn't split out the fix as a
separate patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185434 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first use of D(L,B) addressing, which required a fair bit
of surgery. For that reason, the patch just adds the instruction
definition and the associated assembler and disassembler support.
A later patch will actually make use of it for codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185433 91177308-0d34-0410-b5e6-96231b3b80d8
r182680 replaced CountLeadingZeros_32 with a template function
countLeadingZeros that relies on using the correct argument type to give
the right result. The type passed in the XCore backend after this
revision was incorrect in a couple of places.
Patch by Robert Lytton.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185430 91177308-0d34-0410-b5e6-96231b3b80d8
According to ARM EHABI section 9.2, if the
__aeabi_unwind_cpp_pr1() or __aeabi_unwind_cpp_pr2() is
used, then the handler data must be emitted after the unwind
opcodes. The handler data consists of several words, and
should be terminated by zero.
In case that the .handlerdata directive is not specified by
the programmer, we should emit zero to terminate the handler
data.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185422 91177308-0d34-0410-b5e6-96231b3b80d8
DAGCombiner was counting all uses of a load node when considering whether it's
worth combining into a zextload. Really, it wants to ignore the chain and just
count real uses.
rdar://problem/13896307
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185419 91177308-0d34-0410-b5e6-96231b3b80d8
I'm reverting this commit because:
1. As discussed during review, it needs to be rewritten (to avoid creating and
then deleting instructions).
2. This is causing optimizer crashes. Specifically, I'm seeing things like
this:
While deleting: i1 %
Use still stuck around after Def is destroyed: <badref> = select i1 <badref>, i32 0, i32 1
opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed.
I'd guess that these will go away once we're no longer creating/deleting
instructions here, but just in case, I'm adding a regression test.
Because the code is bring rewritten, I've just XFAIL'd the original regression test. Original commit message:
InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms
Real world code sometimes has the denominator of a 'udiv' be a
'select'. LLVM can handle such cases but only when the 'select'
operands are symmetric in structure (both select operands are a constant
power of two or a left shift, etc.). This falls apart if we are dealt a
'udiv' where the code is not symetric or if the select operands lead us
to more select instructions.
Instead, we should treat the LHS and each select operand as a distinct
divide operation and try to optimize them independently. If we can
to simplify each operation, then we can replace the 'udiv' with, say, a
'lshr' that has a new select with a bunch of new operands for the
select.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185415 91177308-0d34-0410-b5e6-96231b3b80d8
There are a couple of (small) related changes here:
1. The printed name of the VRSAVE register has been changed from VRsave to
vrsave in order to match the name accepted by GNU binutils.
2. Support for parsing vrsave has been added to the asm parser (it seems that
there was no test case specifically covering this code, so I've added one).
3. The list of Altivec registers, which was common to all calling conventions,
has been separated out. This allows us to define the base CSR lists, and then
lists for each ABI with Altivec included. This allows SjLj, for example, to
work correctly on non-Altivec targets without using unnatural definitions of
the NoRegs CSR list.
4. VRSAVE is now always reserved on non-Darwin targets and all Altivec
registers are reserved when Altivec is disabled.
With these changes, it is now possible to compile a function containing
__builtin_unwind_init() on Linux/PPC64 with debugging information. This did not
work previously because GNU binutils assumes that all .cfi_offset offsets will
be 8-byte aligned on PPC64 (and errors out if you provide a non-8-byte-aligned
offset). This is not true for the vrsave register, however, because this
register is used only on Darwin, GCC does not bother printing a .cfi_offset
entry for it (even though there is a slot in the stack frame for it as
specified by the ABI). This change allows us to do the same: we will also not
print .cfi_offset directives for vrsave.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185409 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for TLS data relocations and modifiers:
.quad target@dtpmod
.quad target@tprel
.quad target@dtprel
Currently exploited by the asm parser only.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185394 91177308-0d34-0410-b5e6-96231b3b80d8
Restrict the current TLS support to X86 ELF for now. Test that we don't
produce it on PPC & we can flesh that test case out with the right thing
once someone implements it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185389 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for all missing condition register logical
instructions and extended mnemonics to the asm parser.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185387 91177308-0d34-0410-b5e6-96231b3b80d8
Create a dedicated register class for floating point condition code registers and
move FCC0 from register class CCR to the new register class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185373 91177308-0d34-0410-b5e6-96231b3b80d8
When phis get lowered, destination copies are inserted using an iterator that is
determined once for all phis in the block, which BuildMI interprets as a request
to insert an instruction directly before the iterator. In the case of a cyclic
phi, source copies may also be inserted directly before this iterator, which can
cause source copies to be inserted before destination copies. The fix is to keep
an iterator to the last phi and then advance it while lowering each phi in order
to insert destination copies directly after the phis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185363 91177308-0d34-0410-b5e6-96231b3b80d8
Although you can't generate this from C on PPC64, if you have a loop using a
64-bit counter on PPC32 then you can't form a CTR-based loop for it. This had
been cauing the PPCCTRLoops pass to assert.
Thanks to Joerg Sonnenberger for providing a test case!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185361 91177308-0d34-0410-b5e6-96231b3b80d8
According to the AArch64 ELF specification (4.6.8), it's the
assembler's responsibility to make sure the shift amount is correct in
relocated MOVZ/MOVK instructions.
This wasn't being obeyed by either the MCJIT CodeGen or RuntimeDyldELF
(which happened to work out well for JIT tests). This commit should
make us compliant in this area.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185360 91177308-0d34-0410-b5e6-96231b3b80d8
(2) Rename llvm-cov test inputs so the string "llvm-cov" doesn't get
substituted by lit within the input filenames on the RUN line.
(3) XFAIL llvm-cov.test because it asserts:
include/llvm/ADT/SmallVector.h:140: reference llvm::SmallVectorTemplateCommon<llvm::GCOVBlock *, void>::operator[](unsigned int) [T = llvm::GCOVBlock *]: Assertion `begin() + idx < end()' failed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185358 91177308-0d34-0410-b5e6-96231b3b80d8
Turns out I'd misread the architecture reference manual and thought
that was a load/store-store barrier, when it's not.
Thanks for pointing it out Eli!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185356 91177308-0d34-0410-b5e6-96231b3b80d8
A @got reference must always result in a relocation, so that
the linker has a chance to set up the GOT entry, even if the
symbol happens to be local.
Add a PPCELFObjectWriter::ExplicitRelSym routine that enforces
a relocation to be emitted for GOT references.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185353 91177308-0d34-0410-b5e6-96231b3b80d8
The test case had a couple of FIXMEs where the instruction is in
fact already supported by the back-end. In some other case, while
the generic form of the instruction is not yet supported, a
specialized form is. This adds tests for those already supported
instructions / instruction forms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185347 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for the "sync $L" instruction with operand,
and provides aliases for "lwsync" and "ptesync".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185344 91177308-0d34-0410-b5e6-96231b3b80d8
I believe the full "dmb ish" barrier is not required to guarantee release
semantics for atomic operations. The weaker "dmb ishst" prevents previous
operations being reordered with a store executed afterwards, which is enough.
A key point to note (fortunately already correct) is that this barrier alone is
*insufficient* for sequential consistency, no matter how liberally placed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185339 91177308-0d34-0410-b5e6-96231b3b80d8
Since we were explicitly not calling AsmPrinter::doInitialization,
any module-scope inline asm was not being printed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185336 91177308-0d34-0410-b5e6-96231b3b80d8
Fix a case where we were incorrectly sign-extending a value when we should have been zero-extending the value.
Also change some SIGN_EXTEND to ANY_EXTEND because we really dont care and may have more opportunity to fold subexpressions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185331 91177308-0d34-0410-b5e6-96231b3b80d8