%got_hi, %got_lo, %call_hi, %call_lo, %higher, and %highest are now recognised
by MipsAsmParser::getVariantKind().
To prevent future issues with missing entries in this StringSwitch, I've added
an assertion to the default case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205200 91177308-0d34-0410-b5e6-96231b3b80d8
Unlike my previous commit, don't try to remove the corresponding VK_Mips_GOT yet
even though it shares the same assembly text since that is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205196 91177308-0d34-0410-b5e6-96231b3b80d8
This allows allows us to replace ISD::EXTRACT_ELEMENT, which is lowered
using shifts, with ISD::EXTRACT_VECTOR_ELT, which is a no-op.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205187 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Where those ISA's are not currently supported, the test is run with the smallest
superset of that ISA.
Some instructions are valid but don't pass yet. These have been placed in the
valid-xfail.s's which will XPASS if _any_ instruction starts working.
The valid.s's do not verify the encoding yet. There are also no tests checking that instructions from neighbouring ISA's are not accepted.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3214
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205180 91177308-0d34-0410-b5e6-96231b3b80d8
When the loop vectorizer vectorizes code that uses the loop induction variable,
we often end up with IR like this:
%b1 = insertelement <2 x i32> undef, i32 %v, i32 0
%b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer
%i = add <2 x i32> %b2, <i32 2, i32 3>
If the add in this example is not legal (as is the case on PPC with VSX), it
will be scalarized, and we'll end up with a number of extract_vector_elt nodes
with the vector shuffle as the input operand, and that vector shuffle is fed by
one or more build_vector nodes. By the time that vector operations are
expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by
looking through the vector shuffle (to make sure that no illegal operations are
created), and so the extract_vector_elt -> vector shuffle -> build_vector is
never simplified to an operand of the build vector.
By looking at build_vectors through a shuffle we fix this particular situation,
preventing a vector from being built, only to be deconstructed again (for the
scalarized add) -- an expensive proposition when this all needs to be done via
the stack. We probably want a more comprehensive fix here where we look back
recursively through any shuffles to any build_vectors or scalar_to_vectors,
etc. but that can come later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205179 91177308-0d34-0410-b5e6-96231b3b80d8
While reviewing r204163, I noticed that the MIPS16 test only checked for a .ent
directive and didn't actually check the code emitted. Fixed this and added a
check for llvm.bswap.i32 on MIPS64 at the same time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205177 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The FileHeader mapping now accepts an optional Flags sequence that accepts
the EF_<arch>_<flag> constants. When not given, Flags defaults to zero.
Reviewers: atanasyan
Reviewed By: atanasyan
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D3213
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205173 91177308-0d34-0410-b5e6-96231b3b80d8
is not a pattern to lower this with clever instructions that zero the
register, so restrict the zero immediate legality special case to f64
and f32 (the only two sizes which fmov seems to directly support). Fixes
backend errors when building code such as libxml.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205161 91177308-0d34-0410-b5e6-96231b3b80d8
There is no direct AVX instruction to convert to unsigned. I have some ideas
how we may be able to do this with three vector instructions but the current
backend just bails on this to get it scalarized.
See the comment why we need to adjust the cost returned by BasicTTI.
The test is a bit roundabout (and checks assembly rather than bit code) because
I'd like it to work even if at some point we could vectorize this conversion.
Fixes <rdar://problem/16371920>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205159 91177308-0d34-0410-b5e6-96231b3b80d8
When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using
SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire
vector and then load the piece we want. This is fine in isolation, but
generating a new store (and corresponding stack slot) for each extraction ends
up producing code of poor quality. When we scalarize a vector operation (using
SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT
for each element in the vector. This used to generate one stored copy of the
vector for each element in the vector. Now we search the uses of the vector for
a suitable store before generating a new one, which results in much more
efficient scalarization code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205153 91177308-0d34-0410-b5e6-96231b3b80d8
sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64 node
(and similarly for v2i16 and v2i8). Even though there are no sign-extension (or
algebraic shifts) for v2i64 types, we can handle v2i32 sign extensions by
converting two and from v2i64. The small trick necessary here is to shift the
i32 elements into the right lanes before the i32 -> f64 step. This is because
of the big Endian nature of the system, we need the i32 portion in the high
word of the i64 elements.
For v2i16 and v2i8 we can do the same, but we first use the default Altivec
shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to v4i32) and
then apply the above procedure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205146 91177308-0d34-0410-b5e6-96231b3b80d8
v2i64 is a legal type under VSX, however we don't have native vector
comparisons. We can handle eq/ne by casting it to an Altivec type, but
everything else must be expanded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205106 91177308-0d34-0410-b5e6-96231b3b80d8
When LLVM is not built with zlib, nocompression.s will test
for the error message. But this test case will cause breakage
because the exit code is non-zero. This commit fix this issue
by adding "not" to the command.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205102 91177308-0d34-0410-b5e6-96231b3b80d8
Issue subject: Crash using integrated assembler with immediate arithmetic
Fix description:
Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated on asm parsing stage,
since it is impossible to resolve labels on this stage. In the end of stage we still have
expression (MCExpr).
Then, when we want to encode it, we expect it to be an immediate, but it still an expression.
Patch introduces a Fixup (MCFixup instance), that is processed after main encoding stage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205094 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a second implementation of the AArch64 architecture to LLVM,
accessible in parallel via the "arm64" triple. The plan over the
coming weeks & months is to merge the two into a single backend,
during which time thorough code review should naturally occur.
Everything will be easier with the target in-tree though, hence this
commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205090 91177308-0d34-0410-b5e6-96231b3b80d8
I started trying to fix a small issue, but this code has seen a small fix too
many.
The old code was fairly convoluted. Some of the issues it had:
* It failed to check if a symbol difference was in the some section when
converting a relocation to pcrel.
* It failed to check if the relocation was already pcrel.
* The pcrel value computation was wrong in some cases (relocation-pc.s)
* It was missing quiet a few cases where it should not convert symbol
relocations to section relocations, leaving the backends to patch it up.
* It would not propagate the fact that it had changed a relocation to pcrel,
requiring a quiet nasty work around in ARM.
* It was missing comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205076 91177308-0d34-0410-b5e6-96231b3b80d8
We had stored both f64 values and v2f64, etc. values in the VSX registers. This
worked, but was suboptimal because we would always spill 16-byte values even
through we almost always had scalar 8-byte values. This resulted in an
increase in stack-size use, extra memory bandwidth, etc. To fix this, I've
added 64-bit subregisters of the Altivec registers, and combined those with the
existing scalar floating-point registers to form a class of VSX scalar
floating-point registers. The ABI code has also been enhanced to use this
register class and some other necessary improvements have been made.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205075 91177308-0d34-0410-b5e6-96231b3b80d8
Emit 32-bit register names instead of 64-bit register names if the target does
not have 64-bit general purpose registers.
<rdar://problem/14653996>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205067 91177308-0d34-0410-b5e6-96231b3b80d8
Turns out debug_frame does use multiple fragments, so it doesn't
compress correctly with the current approach. Disable compressing it for
now while I figure out what's the best solution for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205059 91177308-0d34-0410-b5e6-96231b3b80d8
WinCOFF cannot form PC relative relocations to support absolute
MCValues. We should reenable this once WinCOFF supports emission of
IMAGE_REL_I386_REL32 relocations.
This fixes PR19272.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205058 91177308-0d34-0410-b5e6-96231b3b80d8
This is a bit of a stab in the dark, since I have zlib on my machine.
Just going to bounce it off the bots & see if it sticks.
Do we have some convention for negative REQUIRES: checks? Or do I just
need to add a feature like I've done here?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205050 91177308-0d34-0410-b5e6-96231b3b80d8
Not only did I invert the indices when I wrote the code, but I also did the
same thing when I wrote the regression test. Oops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205046 91177308-0d34-0410-b5e6-96231b3b80d8
v2[fi]64 values need to be explicitly passed in VSX registers. This is because
the code in TRI that finds the minimal register class given a register and a
value type will assert if given an Altivec register and a non-Altivec type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205041 91177308-0d34-0410-b5e6-96231b3b80d8
It was using "lc -filetype=obj" just to pass the result to
"llvm-objdupm -disassemble" and then filecheck assembly.
The CHECK-NOT would never match anyway since it was missing $.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205036 91177308-0d34-0410-b5e6-96231b3b80d8
Extract element instructions that will be removed when vectorzing lower the
cost.
Patch by Arch D. Robison!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205020 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r204912, and follow-up commit r204948.
This introduced a performance regression, and the fix is not completely
clear yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205010 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r203553, and follow-up commits r203558 and r203574.
I will follow this up on the mailinglist to do it in a way that won't
cause subtle PRE bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205009 91177308-0d34-0410-b5e6-96231b3b80d8
As explained in r204976, because of how the allocation of VSX registers
interacts with the call-lowering code, we sometimes end up generating self VSX
copies. Specifically, things like this:
%VSL2<def> = COPY %F2, %VSL2<imp-use,kill>
(where %F2 is really a sub-register of %VSL2, and so this copy is a nop)
This adds a small cleanup pass to remove these prior to post-RA scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204980 91177308-0d34-0410-b5e6-96231b3b80d8
First, v2f64 vector extract had not been declared legal (and so the existing
patterns were not being used). Second, the patterns for that, and for
scalar_to_vector, should really be a regclass copy, not a subregister
operation, because the VSX registers directly hold both the vector and scalar data.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204971 91177308-0d34-0410-b5e6-96231b3b80d8
These operations need to be expanded during legalization so that isel does not
crash. In theory, we might be able to custom lower some of these. That,
however, would need to be follow-up work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204963 91177308-0d34-0410-b5e6-96231b3b80d8
1) When creating a .debug_* section and instead create a .zdebug_
section.
2) When creating a fragment in a .zdebug_* section, make it a compressed
fragment.
3) When computing the size of a compressed section, compress the data
and use the size of the compressed data.
4) Emit the compressed bytes.
Also, check that only if a section has a compressed fragment, then that
is the only fragment in the section.
Assert-fail if the fragment's data is modified after it is compressed.
Initial review on llvm-commits by Eric Christopher and Rafael Espindola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204958 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes a miscompile introduced in r204912. It would miscompile code like
(unsigned)(a + -49) <= 5U. The transform would turn this into
(unsigned)a < 55U, which would return true for values in [0, 49], when
it should not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204948 91177308-0d34-0410-b5e6-96231b3b80d8
This adds back r204781.
Original message:
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204934 91177308-0d34-0410-b5e6-96231b3b80d8
To safe-guard BitcodeReader, this patch adds null check for all access to these list.
Patch by Dinesh Dwivedi!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204920 91177308-0d34-0410-b5e6-96231b3b80d8
Transform:
icmp X+Cst2, Cst
into:
icmp X, Cst-Cst2
when Cst-Cst2 does not overflow, and the add has nsw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204912 91177308-0d34-0410-b5e6-96231b3b80d8
Fix description:
Expressions like 'cmp r0, #(l1 - l2) >> 3' could not be evaluated on asm parsing stage,
since it is impossible to resolve labels on this stage. In the end of stage we still have
expression (MCExpr).
Then, when we want to encode it, we expect it to be an immediate, but it still an expression.
Patch introduces a Fixup (MCFixup instance), that is processed after main encoding stage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204899 91177308-0d34-0410-b5e6-96231b3b80d8
and v4i64->v4f64.
The new costs match what we did for SSE2 and reflect the reality of our codegen.
<rdar://problem/16381225>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204884 91177308-0d34-0410-b5e6-96231b3b80d8
llvm-cov tests are sensitive to line number changes, so putting this
at the end will limit churn when we fix the XFAIL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204871 91177308-0d34-0410-b5e6-96231b3b80d8
Functions may in an instrumented binary but not in the original source
when they're inserted by the compiler or the runtime. These functions
aren't meaningful to the user, so teach llvm-cov to skip over them
instead of crashing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204863 91177308-0d34-0410-b5e6-96231b3b80d8
This commit consist of two parts.
The first part fix the PR15967. The wrong conclusion was made when the MaxLookup
limit was reached. The fix introduce a out parameter (MaxLookupReached) to
DecomposeGEPExpression that the function aliasGEP can act upon.
The second part is introducing the constant MaxLookupSearchDepth to make sure
that DecomposeGEPExpression and GetUnderlyingObject use the same search depth.
This is a small cleanup to clarify the original algorithm.
Patch by Karl-Johan Karlsson!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204859 91177308-0d34-0410-b5e6-96231b3b80d8
I've not yet updated PPCTTI because I'm not sure what the actual relative cost
is compared to the aligned uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204848 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions have access to the complete VSX register file. In addition,
they "swap" the order of the elements so that element 0 (the scalar part) comes
first in memory and element 1 follows at a higher address.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204838 91177308-0d34-0410-b5e6-96231b3b80d8
In some cases it is possible for CGP to attempt to reuse a base address from
another basic block. In those cases we have to be sure that all the address
math was either done at the same bit width, or that none of it overflowed
before it was extended.
Patch by Louis Gerbarg <lgg@apple.com>
rdar://16307442
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204833 91177308-0d34-0410-b5e6-96231b3b80d8
> For functions where esi is used as base pointer, we would previously fall ba
> from lowering memcpy with "rep movs" because that clobbers esi.
>
> With this patch, we just store esi in another physical register, and restore
> it afterwards. This adds a little bit of register preassure, but the more
> efficient memcpy should be worth it.
>
> Differential Revision: http://llvm-reviews.chandlerc.com/D2968
This didn't work. I was ending up with code like this:
lea edi,[esi+38h]
mov ecx,0Fh
mov edx,esi
mov esi,ebx
rep movs dword ptr es:[edi],dword ptr [esi]
lea ecx,[esi+74h] <-- Ooops, we're now using esi before restoring it from edx.
add ebx,3Ch
mov esi,edx
I guess if we want to do this we need stronger glue or something, or doing the expansion
much later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204829 91177308-0d34-0410-b5e6-96231b3b80d8
v2i64 needs to be a legal VSX type because it is the SetCC result type from
v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations.
This fixes the lowering for v2f64 VSELECT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204828 91177308-0d34-0410-b5e6-96231b3b80d8
This enables TableGen to generate an additional two operand matcher
for our ArithLogicR class of instructions (constituted by 3 register operands).
E.g.: and $1, $2 <=> and $1, $1, $2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204826 91177308-0d34-0410-b5e6-96231b3b80d8
The '.dword' directive accepts a list of expressions and emits
them in 8-byte chunks in successive locations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204822 91177308-0d34-0410-b5e6-96231b3b80d8
The '.set mips64' directive enables the feature Mips:FeatureMips64
from assembly. Note that it doesn't modify the ELF header as opposed
to the use of -mips64 from the command-line. The reason for this
is that we want to be as compatible as possible with existing assemblers
like GAS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204817 91177308-0d34-0410-b5e6-96231b3b80d8
The '.set mips64r2' directive enables the feature Mips:FeatureMips64r2
from assembly. Note that it doesn't modify the ELF header as opposed
to the use of -mips64r2 from the command-line. The reason for this
is that we want to be as compatible as possible with existing assemblers
like GAS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204815 91177308-0d34-0410-b5e6-96231b3b80d8
We've already got versions without the barriers, so this just adds IR-level
support for generating the new v8 ones.
rdar://problem/16227836
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204813 91177308-0d34-0410-b5e6-96231b3b80d8
The directive '.option pic2' enables PIC from assembly source.
At the moment none of the macros/directives check the PIC bit
but that's going to be fixed relatively soon. For example, the
expansion of macros like 'la' depend on the relocation model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204803 91177308-0d34-0410-b5e6-96231b3b80d8
Implementing the LLVM part of the call to __builtin___clear_cache
which translates into an intrinsic @llvm.clear_cache and is lowered
by each target, either to a call to __clear_cache or nothing at all
incase the caches are unified.
Updating LangRef and adding some tests for the implemented architectures.
Other archs will have to implement the method in case this builtin
has to be compiled for it, since the default behaviour is to bail
unimplemented.
A Clang patch is required for the builtin to be lowered into the
llvm intrinsic. This will be done next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204802 91177308-0d34-0410-b5e6-96231b3b80d8
With VSX there is a real vector select instruction, and so we should use it.
Note that VSELECT will still scalarize for v2f64 because the corresponding
SetCC result type (v2i64) is not currently a legal type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204801 91177308-0d34-0410-b5e6-96231b3b80d8
These are aliases of t4-t7 and are provided for compatibility with both the
original ABI documentation (using t4-t7) and GNU As (using t0-t3)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204797 91177308-0d34-0410-b5e6-96231b3b80d8
This reveals a small mistake in mips-register-names.s ($sp is tested twice and
$s8 is not tested) which will be fixed in a follow-up commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204792 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r204781.
I will follow up to with msan folks to see what is what they
were trying to do with aliases to weak aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204784 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions are essentially the same as their Altivec counterparts, but
have access to the larger VSX register file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204782 91177308-0d34-0410-b5e6-96231b3b80d8
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204781 91177308-0d34-0410-b5e6-96231b3b80d8
Allows this test to pass on COFF platforms so we don't need to restrict
this test to a single target anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204780 91177308-0d34-0410-b5e6-96231b3b80d8
The logic was incorrect for variables, causing them to end up in the wrong
section if the section had an index >= 0xff00.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204771 91177308-0d34-0410-b5e6-96231b3b80d8
Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table.
That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are
across domain (int <-> float).
In particular, prior to this patch we were generating:
vpbroadcastd LCPI1_0(%rip), %ymm2
vpand %ymm2, %ymm0, %ymm0
vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty
Now, we generate the following nice sequence where everything is in the float
domain:
vbroadcastss LCPI1_0(%rip), %ymm2
vandps %ymm2, %ymm0, %ymm0
vmaxps %ymm1, %ymm0, %ymm0
<rdar://problem/16354675>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204770 91177308-0d34-0410-b5e6-96231b3b80d8
We need .symtab_shndxr if and only if a symbol references a section with an
index >= 0xff00.
The old code was trying to figure out if the section was needed ahead of time,
making it a fairly dependent on the code actually writing the table. It was
also somewhat conservative and would create the section in cases where it was
not needed.
If I remember correctly, the old structure was there so that the sections were
created in the same order gas creates them. That was valuable when MC's support
for ELF was new and we tested with elf-dump.py.
This patch refactors the symbol table creation to another class and makes it
obvious that .symtab_shndxr is really only created when we are about to output
a reference to a section index >= 0xff00.
While here, also improve the tests to use macros. One file is one section
short of needing .symtab_shndxr, the second one has just the right number.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204769 91177308-0d34-0410-b5e6-96231b3b80d8
The VSX instruction set has two types of FMA instructions: A-type (where the
addend is taken from the output register) and M-type (where one of the product
operands is taken from the output register). This adds a small pass that runs
just after MI scheduling (and, thus, just before register allocation) that
mutates A-type instructions (that are created during isel) into M-type
instructions when:
1. This will eliminate an otherwise-necessary copy of the addend
2. One of the product operands is killed by the instruction
The "right" moment to make this decision is in between scheduling and register
allocation, because only there do we know whether or not one of the product
operands is killed by any particular instruction. Unfortunately, this also
makes the implementation somewhat complicated, because the MIs are not in SSA
form and we need to preserve the LiveIntervals analysis.
As a simple example, if we have:
%vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
%vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
%RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
...
%vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19,
%RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19
...
We can eliminate the copy by changing from the A-type to the
M-type instruction. This means:
%vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
%RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
is replaced by:
%vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9,
%RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9
and we remove: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204768 91177308-0d34-0410-b5e6-96231b3b80d8
This used to resort to splitting the 256-bit operation into two 128-bit
shuffles and then recombining the results.
Fixes <rdar://problem/16167303>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204735 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously the code didn't check if the before and after types for the
store were pointers to different address spaces. This resulted in
instcombine using a bitcast to convert between pointers to different
address spaces, causing an assertion due to the invalid cast.
It is not be appropriate to use addrspacecast this case because it is
not guaranteed to be a no-op cast. Instead bail out and do not do the
transformation.
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D3117
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204733 91177308-0d34-0410-b5e6-96231b3b80d8
This is supposed to have the same store size and alignment as <4 x i32>,
but currently is split into a 64-bit and 32-bit store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204729 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a loop of the form
for (unsigned n = 0; n != (k & -32); n += 32) {}
then we know that n is always divisible by 32 and the loop must
terminate. Even if we have a condition where the loop counter will
overflow it'll always hold this invariant.
PR19183. Our loop vectorizer creates this pattern and it's also
occasionally formed by loop counters derived from pointers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204728 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Remove the XFAIL added in my previous commit and correct the test such that
it correctly tests the expansion of the assembler temporary.
Also added a test to check that $at is always $1 when written by the
user.
Corrected the new assembler temporary warnings so that they are emitted for
numeric registers too.
Differential Revision: http://llvm-reviews.chandlerc.com/D3169
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204711 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The assembler temporary is normally $at ($1) but can be reassigned using
'.set at=$reg'. Regardless of which register is nominated as the assembler
temporary, $at remains $1 when written by the user.
Adds warnings under the following conditions:
* The register nominated as the assembler temporary is used by the user.
* '.set noat' is in effect and $at is used by the user.
Both of these only work for named registers. I have a follow up commit that makes it work for numeric registers as well.
XFAIL set-at-directive.s since it incorrectly tests that $at is redefined by
'.set at=$reg'. Testcases will follow in a separate commit.
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
Differential Revision: http://llvm-reviews.chandlerc.com/D3167
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204710 91177308-0d34-0410-b5e6-96231b3b80d8
This is a pretty straight forward translation for COFF, we just need to
stick the data in a COMDAT section marked as
IMAGE_COMDAT_SELECT_NODUPLICATES.
N.B. We must be careful to avoid sticking entities with private linkage
in COMDAT groups. COFF is pretty hostile to the renaming of entities so
we must be careful to disallow GlobalVariables with unstable names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204703 91177308-0d34-0410-b5e6-96231b3b80d8
Implement debug_loc.dwo, as well as llvm-dwarfdump support for dumping
this section.
Outlined in the DWARF5 spec and http://gcc.gnu.org/wiki/DebugFission the
debug_loc.dwo section has more variation than the standard debug_loc,
allowing 3 different forms of entry (plus the end of list entry). GCC
seems to, and Clang certainly, only use one form, so I've just
implemented dumping support for that for now.
It wasn't immediately obvious that there was a good refactoring to share
the implementation of dumping support between debug_loc and
debug_loc.dwo, so they're separate for now - ideas welcome or I may come
back to it at some point.
As per a comment in the code, we could choose different forms that may
reduce the number of debug_addr entries we emit, but that will require
further study.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204697 91177308-0d34-0410-b5e6-96231b3b80d8
When register allocator's stage is RS_Spill, we choose spill over using the CSR
for the first time, if the spill cost is lower than CSRCost.
When register allocator's stage is < RS_Split, we choose pre-splitting over
using the CSR for the first time, if the cost of splitting is lower than
CSRCost.
CSRCost is set with command-line option "regalloc-csr-first-time-cost". The
default value is 0 to generate the same codes as before this commit.
With a value of 15 (1 << 14 is the entry frequency), I measured performance
gain of 3% on 253.perlbmk and 1.7% on 197.parser, with instrumented PGO,
on an arm device.
rdar://16162005
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204690 91177308-0d34-0410-b5e6-96231b3b80d8
This is used to avoid relocations in the dwo file by allowing
DW_AT_ranges specified in debug_info.dwo to be relative to this base
address. (r204667 implements the base-relative DW_AT_ranges side of
this)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204672 91177308-0d34-0410-b5e6-96231b3b80d8
This removes the debug_ranges relocations from debug_info.dwo (but
doesn't implement the DW_AT_GNU_ranges_base which is also necessary for
correct functioning)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204668 91177308-0d34-0410-b5e6-96231b3b80d8
Try to match scalar and first like the other instructions.
Expand 64-bit ands to a pair of 32-bit ands since that is not
available on the VALU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204660 91177308-0d34-0410-b5e6-96231b3b80d8
As a first step towards real little-endian code generation, this patch
changes the PowerPC MC layer to actually generate little-endian object
files. This involves passing the little-endian flag through the various
layers, including down to createELFObjectWriter so we actually get basic
little-endian ELF objects, emitting instructions in little-endian order,
and handling fixups and relocations as appropriate for little-endian.
The bulk of the patch is to update most test cases in test/MC/PowerPC
to verify both big- and little-endian encodings. (The only test cases
*not* updated are those that create actual big-endian ABI code, like
the TLS tests.)
Note that while the object files are now little-endian, the generated
code itself is not yet updated, in particular, it still does not adhere
to the ELFv2 ABI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204634 91177308-0d34-0410-b5e6-96231b3b80d8
Those patterns are used when the load cannot be folded into the related broadcast
during the select phase.
This happens when the load gets additional uses that were not anticipated during
the previous lowering phases (constant vector to constant load, then constant
load reused) or when selection DAG is not able to prove that folding the load
will not create a cycle in the DAG.
<rdar://problem/16074331>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204631 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
- If only two registers are passed to a three-register operation, then the
first argument is both source and destination register.
- If a non-register is passed as the last argument, generate the immediate
version of the instruction.
Also mark DADD commutative and add scheduling information (to the generic
scheduler), and implement DSUB.
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
CC: theraven
Differential Revision: http://llvm-reviews.chandlerc.com/D3148
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204605 91177308-0d34-0410-b5e6-96231b3b80d8
This is a pretty straight forward translation for COFF, we just need to
stick the function in a COMDAT section marked as
IMAGE_COMDAT_SELECT_NODUPLICATES.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204565 91177308-0d34-0410-b5e6-96231b3b80d8
When VSX is available, these instructions should be used in preference to the
older variants that only have access to the scalar floating-point registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204559 91177308-0d34-0410-b5e6-96231b3b80d8
Since the profile can come from 32-bit machines, we need to check the
pointer size. Change the magic number to facilitate this.
Adds tests for reading 32-bit and 64-bit binaries (both big- and
little-endian). The tests write a binary using printf in RUN lines
(like raw-magic-but-no-header.test). Assuming the bots don't complain,
this seems like a better way forward for testing RawInstrProfReader than
committing binary files.
<rdar://problem/16400648>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204557 91177308-0d34-0410-b5e6-96231b3b80d8
This is similar, but not identical to what gas does. The logic in MC is to just
compute the symbol table after parsing the entire file. GAS is mixed, given
.type b, @object
a = b
b:
.type b, @function
It will propagate the change and make 'a' a function. Given
.type b, @object
b:
a = b
.type b, @function
the type of 'a' is still object.
Since we do the computation in the end, we produce a function in both cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204555 91177308-0d34-0410-b5e6-96231b3b80d8
Some text shows up on stderr when using guard malloc, and this test
was trying to treat that as input to llvm-profdata show. There's no
reason to pipe stderr into show at all here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204549 91177308-0d34-0410-b5e6-96231b3b80d8
When a label is parsed, check if there is type information available for the
label. If so, check if the symbol is a function. If the symbol is a function
and we are in thumb mode and no explicit thumb_func has been emitted, adjust the
symbol data to indicate that the function definition is a thumb function.
The application of this inferencing is improved value handling in the object
file (the required thumb bit is set on symbols which are thumb functions). It
also helps improve compatibility with binutils.
The one complication that arises from this handling is the MCAsmStreamer. The
default implementation of getOrCreateSymbolData in MCStreamer does not support
tracking the symbol data. In order to support the semantics of thumb functions,
track symbol data in assembly streamer. Although O(n) in number of labels in
the TU, this is already done in various other streamers and as such the memory
overhead is not a practical concern in this scenario.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204544 91177308-0d34-0410-b5e6-96231b3b80d8
v2f64 values, like other 128-bit values, are returned under VSX in register
vs34 (Altivec register v2).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204543 91177308-0d34-0410-b5e6-96231b3b80d8
A PHI node usually has only one value/basic block pair per incoming basic block.
In the case of a switch statement it is possible that a following PHI node may
have more than one such pair per incoming basic block. E.g.:
%0 = phi i64 [ 123456, %case2 ], [ 654321, %Entry ], [ 654321, %Entry ]
This is valid and the verfier doesn't complain, because both values are the
same.
Constant hoisting materializes the constant for each operand separately and the
value is still the same, but the variable names have changed. As a result the
verfier can't recognize anymore that they are the same value and complains.
This fix adds special update code for PHI node in constant hoisting to prevent
this corner case.
This fixes <rdar://problem/16394449>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204537 91177308-0d34-0410-b5e6-96231b3b80d8
This patch renames method 'isConstantSplat' as 'getConstantSplatValue'
(mainly for consistency reasons), and rewrites its logic to ensure
that we always perform a legal 'cast<ConstantSDNode>'.
Added test shift-combine-crash.ll to verify that DAGCombiner no longer crashes with an assertion failure in the attempt to simplify a vector shift by a vector of all undef counts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204536 91177308-0d34-0410-b5e6-96231b3b80d8
sym_a:
sym_d = sym_a + 1
This is the smallest fix I was able to extract from what got reverted in
r204203.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204527 91177308-0d34-0410-b5e6-96231b3b80d8
We make sure a spill is not hoisted to a hotter outer loop by adding
a condition. Hoist a spill to outer loop if there are multiple dependents
(it can be beneficial if more than one dependents are hoisted) or
if DepSV (the hoisting source) is hotter than SV (the hoisting destination).
rdar://16268194
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204522 91177308-0d34-0410-b5e6-96231b3b80d8
Cleanup the current binary testcase for profile data.
- Rename it to something more specific.
- Remove the text comparison.
- Check the output of llvm-profdata show.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204518 91177308-0d34-0410-b5e6-96231b3b80d8
Include non-text characters in the magic number so that text files can't
match.
<rdar://problem/15950346>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204513 91177308-0d34-0410-b5e6-96231b3b80d8
Type units have no addresses, so there's no need for DW_AT_addr_base.
This removes another relocation from every skeletal type unit and brings
LLVM's skeletal type units in line with GCC's (containing only
GNU_dwo_name (strp), comp_dir (strp), and GNU_pubnames (flag_present)).
Cary's got some ideas about using str_index in the .o file to reduce
those last two relocations (well, replace two relocations with one
relocation (pointing to the string index) and two indicies)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204506 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, only regular AArch64 instructions were annotated with SchedRW lists.
This patch does the same for NEON enabling these instructions to be scheduled by
the MIScheduler. Additionally, store operations are now modeled and a few
SchedRW lists were updated for bug fixes (e.g. multiple def operands).
Reviewers: apazos, mcrosier, atrick
Patch by Dave Estes <cestes@codeaurora.org>!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204505 91177308-0d34-0410-b5e6-96231b3b80d8
Read a raw binary profile that corresponds to a memory dump from the
runtime profile.
The test is a binary file generated from
cfe/trunk/test/Profile/c-general.c with the new compiler-rt runtime and
the matching text version of the input. It includes instructions on how
to regenerate.
<rdar://problem/15950346>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204496 91177308-0d34-0410-b5e6-96231b3b80d8
This isn't a format we'll want to write out in practice, but moving it
to the writer library simplifies llvm-profdata and isolates it from
further changes to the format.
This also allows us to update the tests to not rely on the text output
format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204489 91177308-0d34-0410-b5e6-96231b3b80d8
This introduces the ProfileData library and updates llvm-profdata to
use this library for reading profiles. InstrProfReader is an abstract
base class that will be subclassed for both the raw instrprof data
from compiler-rt and the efficient instrprof format that will be used
for PGO.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204482 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
VECTOR_SHUFFLE concatenates the vectors in an vectorwise fashion.
<0b00, 0b01> + <0b10, 0b11> -> <0b00, 0b01, 0b10, 0b11>
VSHF concatenates the vectors in a bitwise fashion:
<0b00, 0b01> + <0b10, 0b11> ->
0b0100 + 0b1110 -> 0b01001110
<0b10, 0b11, 0b00, 0b01>
We must therefore swap the operands to get the correct result.
The test case that discovered the issue was MultiSource/Benchmarks/nbench.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3142
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204480 91177308-0d34-0410-b5e6-96231b3b80d8
Extend the target hook to take also the operand index into account when
calculating the cost of the constant materialization.
Related to <rdar://problem/16381500>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204435 91177308-0d34-0410-b5e6-96231b3b80d8
Originally the algorithm would search for expensive constants and track their
users, which could be instructions and constant expressions. This change only
tracks the constants for instructions, but constant expressions are indirectly
covered too. If an operand is an constant expression, then we look through the
expression to find anny expensive constants.
The algorithm keep now track of the instruction and the operand index where the
constant is used. This allows more precise hoisting of constant materialization
code for PHI instructions, because we only hoist to the basic block of the
incoming operand. Before we had to find the idom of all PHI operands and hoist
the materialization code there.
This also makes updating of instructions easier. Before we had to keep track of
the original constant, find it in the instructions, and then replace it. Now we
can just simply update the operand.
Related to <rdar://problem/16381500>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204433 91177308-0d34-0410-b5e6-96231b3b80d8
The commit r203762 introduced silent failure for complext SO expression, and it's even worse than compiler crash.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204427 91177308-0d34-0410-b5e6-96231b3b80d8
.data_region is only used in Darwin, so it shouldn't be generated
for other OS. Currently AArch64 doesn't support darwin yet, so
I removed it from AArch64. When Darwin is supported someday, we can
add it back and associate it with Darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204424 91177308-0d34-0410-b5e6-96231b3b80d8
NumberOfRelocations field in COFF section table is only 16-bit wide. If an
object has more than 65535 relocations, the number of relocations is stored
to VirtualAddress field in the first relocation field, and a special flag
(IMAGE_SCN_LNK_NRELOC_OVFL) is set to Characteristics field.
In test we cheated a bit. I made up a test file so that it has
IMAGE_SCN_LNK_NRELOC_OVFL flag but the number of relocations is much smaller
than 65535. This is to avoid checking in a large test file just to test a
file with many relocations.
Differential Revision: http://llvm-reviews.chandlerc.com/D3139
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204418 91177308-0d34-0410-b5e6-96231b3b80d8
Sicne MBB->computeRegisterLivenes() returns Dead for sub regs like s0,
d0 is used in vpop instead of updating sp, which causes s0 dead before
its use.
This patch checks the liveness of each subreg to make sure the reg is
actually dead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204411 91177308-0d34-0410-b5e6-96231b3b80d8
The function exists to force an expression to be absolute, but there it is not
possible to force a symbol reference since
a = b
.long a
means something else.
This is an alternative fix for pr9951 that uses an assert. It then deletes
the old pr9951 test that was testing nothing already.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204399 91177308-0d34-0410-b5e6-96231b3b80d8
This commit extends the coverage of the constant hoisting pass, adds additonal
debug output and updates the function names according to the style guide.
Related to <rdar://problem/16381500>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204389 91177308-0d34-0410-b5e6-96231b3b80d8
This option caused LowerInvoke to generate code using SJLJ-based
exception handling, but there is no code left that interprets the
jmp_buf stack that the resulting code maintained (llvm.sjljeh.jblist).
This option has been obsolete for a while, and replaced by
SjLjEHPrepare.
This leaves the default behaviour of LowerInvoke, which is to convert
invokes to calls.
Differential Revision: http://llvm-reviews.chandlerc.com/D3136
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204388 91177308-0d34-0410-b5e6-96231b3b80d8
Use the range machinery for DW_AT_ranges and DW_AT_high/lo_pc.
This commit moves us from a single range per subprogram to extending
ranges if we are:
a) In the same section, and
b) In the same enclosing CU.
This means we have more fine grained ranges for compile units, and fewer
ranges overall when we have multiple functions in the same CU
adjacent to each other in the object file.
Also remove all of the earlier hacks around this functionality for
function sections etc. Also update all of the testcases to take into
account the merging functionality.
with a fix for location entries in the debug_loc section:
Make sure that debug loc entries are relative to the low_pc
of the compile unit. This means that when we only have a single
range that the offset should be just relative to the low_pc
of the unit, for multiple ranges for a CU this means that we'll be
relative to 0 which we emit along with DW_AT_ranges.
This mostly shows up with linked binaries, so add a testcase with
multiple CUs so that our location is going to be offset of a CU
with a non-zero low_pc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204377 91177308-0d34-0410-b5e6-96231b3b80d8
None of the existing tests for LowerInvoke check LowerInvoke's output,
and all but one use "-enable-correct-eh-support", which is obsolete,
so those tests will be removed when that option is removed.
To make sure LowerInvoke will still have test coverage, this adds a
test for its default mode which converts invokes to calls.
Differential Revision: http://llvm-reviews.chandlerc.com/D3124
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204344 91177308-0d34-0410-b5e6-96231b3b80d8
The Octeon cpu from Cavium Networks is mips64r2 based and has an extended
instruction set. In order to utilize this with LLVM, a new cpu feature "octeon"
and a subtarget feature "cnmips" is added. A small set of new instructions
(baddu, dmul, pop, dpop, seq, sne) is also added. LLVM generates dmul, pop and
dpop instructions with option -mcpu=octeon or -mattr=+cnmips.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204337 91177308-0d34-0410-b5e6-96231b3b80d8
After the -asan pass had been split into -asan (function-level) and -asan-module (module-level) some of the
tests have silently stopped working, because they didn't instrument the globals anymore.
We've decided to have every test using both passes, irrespective of the presence of globals in it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204335 91177308-0d34-0410-b5e6-96231b3b80d8
obj2yaml would emit the NUL bytes padding the auxiliary file symbol
records. Trimming them looks nicer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204314 91177308-0d34-0410-b5e6-96231b3b80d8
Given
bar = foo + 4
.long bar
MC would eat the 4. GNU as includes it in the relocation. The rule seems to be
that a variable that defines a symbol is used in the relocation and one that
does not define a symbol is evaluated and the result included in the relocation.
Fixing this unfortunately required some other changes:
* Since the variable is now evaluated, it would prevent the ELF writer from
noticing the weakref marker the elf streamer uses. This patch then replaces
that with a VariantKind in MCSymbolRefExpr.
* Using VariantKind then requires us to look past other VariantKind to see
.weakref bar,foo
call bar@PLT
doing this also fixes
zed = foo +2
call zed@PLT
so that is a good thing.
* Looking past VariantKind means that the relocation selection has to use
the fixup instead of the target.
This is a reboot of the previous fixes for MC. I will watch the sanitizer
buildbot and wait for a build before adding back the previous fixes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204294 91177308-0d34-0410-b5e6-96231b3b80d8
This appears to trigger failures with optimization and function arguments somehow.
This reverts commit r204277.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204286 91177308-0d34-0410-b5e6-96231b3b80d8
This commit moves us from a single range per subprogram to extending
ranges if we are:
a) In the same section, and
b) In the same enclosing CU.
This means we have more fine grained ranges for compile units, and fewer
ranges overall when we have multiple functions in the same CU
adjacent to each other in the object file.
Also remove all of the earlier hacks around this functionality for
function sections etc. Also update all of the testcases to take into
account the merging functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204277 91177308-0d34-0410-b5e6-96231b3b80d8
It isn't actually used now, and probably never will be, plus it makes
tests less annoying. I also think SC prints GDS instructions as a
separate instruction name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204270 91177308-0d34-0410-b5e6-96231b3b80d8
The current state of affairs has auxiliary symbols described as a big
bag of bytes. This is less than satisfying, it detracts from the YAML
file as being human readable.
Instead, allow for symbols to optionally contain their auxiliary data.
This allows us to have a much higher level way of describing things like
weak symbols, function definitions and section definitions.
This depends on D3105.
Differential Revision: http://llvm-reviews.chandlerc.com/D3092
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204214 91177308-0d34-0410-b5e6-96231b3b80d8
This isn't a complete fix - it falls back to non-comp_dir when multiple
compile units are in play. Adding a map of comp_dir to table is part of
the more general solution, but I gave up (in the short term) when I
realized I'd also have to calculate the size of each type unit so as to
produce correct DW_AT_stmt_list attributes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204202 91177308-0d34-0410-b5e6-96231b3b80d8
The "noduplicate" function attribute exists to prevent certain optimizations
from duplicating calls to the function. This is important on platforms where
certain function call duplications are unsafe (for example execution barriers
for CUDA and OpenCL).
This patch makes it possible to specify intrinsics as "noduplicate" and
translates that to the appropriate function attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204200 91177308-0d34-0410-b5e6-96231b3b80d8
The use_iterator redesign in r203364 introduced an increment past the
end of a range in -objc-arc-contract. Added an explicit check for the
end of the range.
<rdar://problem/16333235>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204195 91177308-0d34-0410-b5e6-96231b3b80d8
Allow object files to be tagged with a version-min load command for iOS
or MacOSX.
Teach macho-dump to understand the version-min load commands for
testcases.
rdar://11337778
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204190 91177308-0d34-0410-b5e6-96231b3b80d8
For functions where esi is used as base pointer, we would previously fall back
from lowering memcpy with "rep movs" because that clobbers esi.
With this patch, we just store esi in another physical register, and restore
it afterwards. This adds a little bit of register preassure, but the more
efficient memcpy should be worth it.
Differential Revision: http://llvm-reviews.chandlerc.com/D2968
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204174 91177308-0d34-0410-b5e6-96231b3b80d8