Summary:
This prevents the discriminator generation pass from triggering if
the DWARF version being used in the module is prior to 4.
Reviewers: echristo, dblaikie
CC: llvm-commits
Differential Revision: http://reviews.llvm.org/D3413
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206507 91177308-0d34-0410-b5e6-96231b3b80d8
Change the command line vector-insertion.ll to explicitly set the neon syntax
to apple so that buildbots that default to other syntaxes won't fail.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206502 91177308-0d34-0410-b5e6-96231b3b80d8
Having i128 as a legal type complicates the legalization phase. v4i32
is already a legal type, so we will use that instead.
This fixes several piglit tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206500 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves the performance of vector creation in caseiswhere where
several of the lanes in the vector are a constant floating point value. It
also includes new patterns to fold together some of the instructions when the
value is 0.0f. Test cases included.
rdar://16349427
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206496 91177308-0d34-0410-b5e6-96231b3b80d8
Update the SXT[BHW]/UXTW instruction aliases and the shifted reg addressing
mode handling.
PR19455 and rdar://16650642
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206495 91177308-0d34-0410-b5e6-96231b3b80d8
After some discussions the preferred semantics of
the always_inline attribute is
inline always when the compiler can determine
that it it safe to do so.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206487 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, SSPBufferSize was assigned the value of the "stack-protector-buffer-size"
attribute after all uses of SSPBufferSize. The effect was that the default
SSPBufferSize was always used during analysis. I moved the check for the
attribute before the analysis; now --param ssp-buffer-size= works correctly again.
Differential Revision: http://reviews.llvm.org/D3349
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206486 91177308-0d34-0410-b5e6-96231b3b80d8
Still only 32-bit ARM using it at this stage, but the promotion allows
direct testing via opt and is a reasonably self-contained patch on the
way to switching ARM64.
At this point, other targets should be able to make use of it without
too much difficulty if they want. (See ARM64 commit coming soon for an
example).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206485 91177308-0d34-0410-b5e6-96231b3b80d8
The commit of r205855:
Author: Arnold Schwaighofer <aschwaighofer@apple.com>
Date: Wed Apr 9 14:20:47 2014 +0000
SLPVectorizer: Only vectorize intrinsics whose operands are widened equally
The vectorizer only knows how to vectorize intrinics by widening all operands by
the same factor.
Patch by Tyler Nowicki!
exposed a backend bug causing a regression (Cannot select ctpop).
The commit msg is a bit confusing because the patch actually changes the
behavior for the loop-vectorizer as well. As things got refactored into a
helper ctpop got snuck in to the trivially-vectorizable helper which is now
used by both vectorizers. In other words, we started seeing vector-ctpops in
the backend.
This change makes ctpop LegalizeAction::Expand for the types not supported by
the byte-only CNT instruction. We may be able to custom-lower these later to
a single CNT but this is to fix the compiler crash first.
Fixes <rdar://problem/16578951>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206433 91177308-0d34-0410-b5e6-96231b3b80d8
is set even when it contains a indirect branch.
The attribute overrules correctness concerns
like the escape of a local block address.
This is for rdar://16501761
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206429 91177308-0d34-0410-b5e6-96231b3b80d8
This enables TableGen to generate an additional two operand
matcher for our shift_rotate_imm and shift_rotate_reg class of instructions.
The tests were also updated so that they include now encoding information
for all affected instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206398 91177308-0d34-0410-b5e6-96231b3b80d8
This is so that EF_MIPS_NAN2008 is set if we are using IEEE 754-2008
NaN encoding (-mnan=2008). This patch also adds support for parsing
'.nan legacy' and '.nan 2008' assembly directives. The handling of
these directives should match GAS' behaviour i.e., the last directive
in use sets the ELF header bit (EF_MIPS_NAN2008).
Differential Revision: http://reviews.llvm.org/D3346
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206396 91177308-0d34-0410-b5e6-96231b3b80d8
These ones used completely different sets of intrinsics, so the only way to do
it is create a separate ARM64 copy and change them all.
Other than that, CodeGen was straightforward, no deficiencies detected here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206392 91177308-0d34-0410-b5e6-96231b3b80d8
This should fix the ninja-x64-msvc-RA-centos6 builder.
I suspect the check in MipsSubtarget.cpp is incorrect and is really trying to
check for a bare-metal target rather and anything other than linux. I'll
investigate this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206385 91177308-0d34-0410-b5e6-96231b3b80d8
Now that Linux is trying to reparse all inline asm it chokes on the different
comment character in this test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206382 91177308-0d34-0410-b5e6-96231b3b80d8
The most important part here is that we should actuall emit the stubs we refer
to in the exception table, but as a side issue this uses more sensible & GCC
compatible representations for some of the bits of information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206380 91177308-0d34-0410-b5e6-96231b3b80d8
If we know that a particular 64-bit constant has all high bits zero, then we
can rely on the fact that 32-bit ARM64 instructions automatically zero out the
high bits of an x-register. This gives the expansion logic less constraints to
satisfy and so sometimes allows it to pick better sequences.
Came up while porting test/CodeGen/AArch64/movw-consts.ll: this will allow a
32-bit MOVN to be used in @test8 soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206379 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
I had difficulty finding tests for the N32 and N64 ABI so I've added a
collection of calling convention tests based on the document MIPS ABIs
Described (MD00305), the MIPSpro N32 Handbook, and the SYSV ABI. Where the
documents/implementations disagree, I've used GCC to resolve the conflict.
A few interesting details:
* For N32, LLVM uses 64-bit pointers when saving $ra despite pointers being
32-bit. I've yet to find a supporting statement in the ABI documentation but
the current behaviour matches GCC.
* For O32, the non-variable portion of a varargs argument list is also subject
to the rule that floating-point is passed via GPR's (on N32/N64 only the
variable portion is subject to this rule). This agrees with GCC's behaviour
and the SYSV ABI but contradicts part of the MIPSpro N32 Handbook which talks about O32's behaviour.
* The N32 implementation has the wrong callee-saved register list.
(I already have a fix for this but will commit it as a follow-up).
I've left RUN-TODO lines in for O32 on MIPS64. I don't plan to support this case
for now but we should revisit it.
Reviewers: matheusalmeida, vmedic
Reviewed By: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3339
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206370 91177308-0d34-0410-b5e6-96231b3b80d8
The second half of a split i128 was ending up in x7, which is not a good thing.
This is another part of PR19432.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206366 91177308-0d34-0410-b5e6-96231b3b80d8
This particular DAG combine is designed to kick in when both ConstantFPs will
end up being loaded via a litpool, however those nodes have a semi-legal
status, dictated by isFPImmLegal so in some cases there wouldn't have been a
litpool in the first place. Don't try to be clever in those circumstances.
Picked up while merging some AArch64 tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206365 91177308-0d34-0410-b5e6-96231b3b80d8
Adjust the tests to validate the number of auxiliary entries used to store the
filename.
Thanks to majnemer's sharp eye for catching the missing - 1 in the round up
calculation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206359 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for emitting .file records. This is mostly a quality of
implementation change (more complete support for COFF file emission) that was
noticed while working on COFF file emission for Windows on ARM.
A .file record is emitted as a symbol with storage class FILE (103) and the name
".file". A series of auxiliary format 4 records follow which contain the file
name. The filename is stored as an ANSI string and is padded with NULL if the
length is not a multiple of COFF::SymbolSize (18).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206355 91177308-0d34-0410-b5e6-96231b3b80d8
Print in decimal for inline immediates, and hex otherwise. Use hex
always for offsets in addressing offsets.
This approximately matches what the shader compiler does.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206335 91177308-0d34-0410-b5e6-96231b3b80d8
handles Intrinsic::trap if TargetOptions::TrapFuncName is set.
This fixes a bug in which the trap function was not taken into consideration
when a program was compiled without optimization (at -O0).
<rdar://problem/16291933>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206323 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the backend how to efficiently lower logical and
arithmetic packed shifts on both SSE and AVX/AVX2 machines.
When possible, instead of scalarizing a vector shift, the backend should try
to expand the shift into a sequence of two packed shifts by immedate count
followed by a MOVSS/MOVSD.
Example
(v4i32 (srl A, (build_vector < X, Y, Y, Y>)))
Can be rewritten as:
(v4i32 (MOVSS (srl A, <Y,Y,Y,Y>), (srl A, <X,X,X,X>)))
[with X and Y ConstantInt]
The advantage is that the two new shifts from the example would be lowered into
X86ISD::VSRLI nodes. This is always cheaper than scalarizing the vector into
four scalar shifts plus four pairs of vector insert/extract.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206316 91177308-0d34-0410-b5e6-96231b3b80d8
Sometimes we need emit the bits that would actually be a MOVN when producing a
relocated MOVZ instruction (don't ask). But not always, a check which ARM64 got
wrong until now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206289 91177308-0d34-0410-b5e6-96231b3b80d8
I've left the MachO CodeGen as it is, there's a reasonable chance it should use
the GOT like ConstPools, but I'm not certain.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206288 91177308-0d34-0410-b5e6-96231b3b80d8
This brings it into line with the AArch64 behaviour and should open the way for
certain OpenCL features.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206286 91177308-0d34-0410-b5e6-96231b3b80d8
Code is mostly copied directly across, with a slight extension of the
ISelDAGToDAG function so that it can cope with the floating-point constants
being behind a litpool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206285 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, we bind those directives with the last symbol, so if none
has been defined, this would lead to a crash of the compiler.
<rdar://problem/15939159>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206236 91177308-0d34-0410-b5e6-96231b3b80d8
Thanks to dblaikie for updating the testcase!
Debug info: (bugfix) C++ C/Dtors can be compiled to multiple functions,
therefore, their declaration cannot have one DW_AT_linkage_name.
The specific instances however can and should have that attribute.
This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE()
to emit linkage names for C/Dtors.
rdar://problem/16362674.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206210 91177308-0d34-0410-b5e6-96231b3b80d8
In rare cases the dead definition elimination pass code can cause illegal cmn
instructions when it replaces dead registers on instructions that use
unmaterialized frame indexes. This patch disables the dead definition
optimization for instructions which include frame index operands.
rdar://16438284
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206208 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds a -arm64-dead-def-elimination flag so that it is possible to
disable dead definition elimination. Includes test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206207 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, BranchProbabilityInfo::calcLoopBranchHeuristics would determine the weights of basic blocks inside loops even when it didn't have enough information to estimate the branch probabilities correctly. This patch fixes the function to exit early if it doesn't see any exit edges or back edges and let the later heuristics determine the weights.
This fixes PR18705 and <rdar://problem/15991090>.
Differential Revision: http://reviews.llvm.org/D3363
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206194 91177308-0d34-0410-b5e6-96231b3b80d8
This allows correct relocations to be generated for a symbolic
address that is being adjusted by a negative constant. Since r204294,
such expressions have triggered undefined behavior when LLVM was built
without assertions.
Credit goes to Rafael for this patch; I'm submitting it on his behalf
as he is on vacation this week.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206192 91177308-0d34-0410-b5e6-96231b3b80d8
Once the auxiliary fields relating to the filename have been inspected, any
following auxiliary fields need not be visited as they have been consumed (the
following fields comprise the filepath as a single unit).
Adjust the test to catch this even if ASAN is not enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206190 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Two exceptions to this:
test/CodeGen/Mips/octeon.ll
test/CodeGen/Mips/octeon_popcnt.ll
these test extensions to MIPS64
One test is altered for MIPS-IV:
test/CodeGen/Mips/mips64countleading.ll
Tests dclo/dclz which were added in MIPS64. The MIPS-IV version tests
that dclo/dclz are not emitted.
Four tests fail and are not in this patch:
test/CodeGen/Mips/abicalls.ll
test/CodeGen/Mips/fcopysign-f32-f64.ll
test/CodeGen/Mips/fcopysign.ll
test/CodeGen/Mips/stack-alignment.ll
Depends on D3343
Reviewers: matheusalmeida, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3344
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206185 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
- Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64
- ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
I've added additional testcases to cover as much of the codegen changes
affecting MIPS-IV as I can. Where I've been unable to find an existing
MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering
ISD::GlobalAddress and similar), I at least agree that MIPS-IV should
behave like MIPS64. Further testcases that are fixed by this patch will follow
in my next commit. The testcases from that commit that fail for MIPS-IV without
this patch are:
LLVM :: CodeGen/Mips/2010-07-20-Switch.ll
LLVM :: CodeGen/Mips/cmov.ll
LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll
LLVM :: CodeGen/Mips/largeimmprinting.ll
LLVM :: CodeGen/Mips/longbranch.ll
LLVM :: CodeGen/Mips/mips64-f128.ll
LLVM :: CodeGen/Mips/mips64directive.ll
LLVM :: CodeGen/Mips/mips64ext.ll
LLVM :: CodeGen/Mips/mips64fpldst.ll
LLVM :: CodeGen/Mips/mips64intldst.ll
LLVM :: CodeGen/Mips/mips64load-store-left-right.ll
LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll
Reviewers: dsanders
Reviewed By: dsanders
CC: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206183 91177308-0d34-0410-b5e6-96231b3b80d8
Code change is because optimizeCompareInstr didn't know how to pull the
condition code out of FCSEL instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206171 91177308-0d34-0410-b5e6-96231b3b80d8
There was one definite issue in ARM64 (the off-by-1 check for whether
a shift could be folded in) and one difference that is probably
correct: ARM64 didn't fold nodes with multiple uses into the
arithmetic operations unless optimising for code size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206168 91177308-0d34-0410-b5e6-96231b3b80d8
This transformation is only valid when being used for an EQ or NE
comparison since the flags change otherwise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206167 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously loadImmediate() would produce MKMSK instructions with invalid
immediate values such as mkmsk r0, 9. Fix this by checking the mask size
is valid.
Reviewers: robertlytton
Reviewed By: robertlytton
CC: llvm-commits
Differential Revision: http://reviews.llvm.org/D3289
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206163 91177308-0d34-0410-b5e6-96231b3b80d8
BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting
AllowUnknown=true with TLI->getSimpleValueType is not sufficient because, for
example, non-power-of-two vector types return non-simple EVTs (not MVT::Other).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206150 91177308-0d34-0410-b5e6-96231b3b80d8
If a filename is a multiple of 18 characters, there will be no null-terminator.
This will result in an invalid access by the constructed StringRef. Add a test
case to exercise this and fix that handling. Address this same vulnerability in
llvm-readobj as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206145 91177308-0d34-0410-b5e6-96231b3b80d8
Implements the various TTI functions to enable constant hoisting on PPC. The
only significant test-suite change is this:
MultiSource/Benchmarks/VersaBench/bmm/bmm - 20% speedup
(which essentially reverses the slowdown from r206120).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206141 91177308-0d34-0410-b5e6-96231b3b80d8
If multiplication involves zero-extended arguments and the result is
compared as in the patterns:
%mul32 = trunc i64 %mul64 to i32
%zext = zext i32 %mul32 to i64
%overflow = icmp ne i64 %mul64, %zext
or
%overflow = icmp ugt i64 %mul64 , 0xffffffff
then the multiplication may be replaced by call to umul.with.overflow.
This change fixes PR4917 and PR4918.
Differential Revision: http://llvm-reviews.chandlerc.com/D2814
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206137 91177308-0d34-0410-b5e6-96231b3b80d8
We had been using the known-zero values of the operand of the or to construct
the mask for an rlwimi; this is not quite correct, but fine when the mask is
constant. When the mask is constant, then the known zeros of the operand must
be a superset of the zeros in the mask. However, when the mask is not a
constant, then there might be bits in the operand that are not known to be zero
that, at runtime, might be zero in the mask. Therefore, we check that any bits
not known to be zero *are* known to be one in the mask. Otherwise, we can't
fold the mask with the or and shift.
This was revealed as a miscompile of
MultiSource/Benchmarks/BitBench/drop3/drop3 when I started experimenting with
constant hoisting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206136 91177308-0d34-0410-b5e6-96231b3b80d8
I found this from a particular GDB test suite case of inlining
(something similar is provided as a test case) but came across a few
other related cases (other callers of the same functions, and one other
instance of the same coding mistake in a separate function).
I'm not sure what the best way to test this is (let alone to cover the
other cases I discovered), so hopefully this sufficies - open to ideas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206130 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for file auxiliary symbol entries in COFF symbol tables. A COFF
symbol table with a FILE entry is followed by sizeof(__FILE__) / 18 auxiliary
symbol records which contain the filename. Read them and form the original
filename that the record contains. Then display the name in the output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206126 91177308-0d34-0410-b5e6-96231b3b80d8
There is no need to check if we want to hoist the immediate value of an
shift instruction. Simply return TCC_Free right away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206101 91177308-0d34-0410-b5e6-96231b3b80d8
Originally the cost model would give up for large constants and just return the
maximum cost. This is not what we want for constant hoisting, because some of
these constants are large in bitwidth, but are still cheap to materialize.
This commit fixes the cost model to either return TCC_Free if the cost cannot be
determined, or accurately calculate the cost even for large constants
(bitwidth > 128).
This fixes <rdar://problem/16591573>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206100 91177308-0d34-0410-b5e6-96231b3b80d8
therefore, their declaration cannot have one DW_AT_linkage_name.
The specific instances however can and should have that attribute.
This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE()
to emit linkage names for C/Dtors.
rdar://problem/16362674.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206096 91177308-0d34-0410-b5e6-96231b3b80d8
We had disabled use of TBAA during CodeGen (even when otherwise using AA)
because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to
miss basic type punning that it should catch (and, thus, we'd fail to override
TBAA when we should).
However, when AA is in use during CodeGen, CGP now uses normal GEPs and
bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a
result, BasicAA should be able to make us do the right thing in the face of
type-punning, and it seems safe to enable use of TBAA again. self-hosting seems
fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle.
Note: We still don't update TBAA when merging stack slots, although because
BasicAA should now catch all such cases, this is no longer a blocking issue.
Nevertheless, I plan to commit code to deal with this properly in the near
future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206093 91177308-0d34-0410-b5e6-96231b3b80d8