The compiler-rt functions __adddf3vfp and so on exist purely to allow Thumb1
code to make use of VFP instructions by switching back to ARM mode, they make
no sense for M-class processors which don't even have an ARM mode.
Given that justification, in practice this is a platform ABI decision so the
actual check is based on that rather than CPU features.
rdar://problem/15302004
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193327 91177308-0d34-0410-b5e6-96231b3b80d8
When generating the IfTrue basic block during the F128CSEL pseudo-instruction
handling, the NZCV live-in for the newly created BB wasn't being added. This
caused a fault during MI-sched/live range calculation when the predecessor
for the fall-through BB didn't have a live-in for phys-reg as expected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193316 91177308-0d34-0410-b5e6-96231b3b80d8
On sandy bridge (PR17654) we now get
vpxor %xmm1, %xmm1, %xmm1
vpunpckhbw %xmm1, %xmm0, %xmm2
vpunpcklbw %xmm1, %xmm0, %xmm0
vinsertf128 $1, %xmm2, %ymm0, %ymm0
On haswell it's a simple
vpmovzxbw %xmm0, %ymm0
There is a maze of duplicated and dead transforms and patterns in this
area. Remove the dead custom lowering of zext v8i16 to v8i32, that's
already handled by LowerAVXExtend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193262 91177308-0d34-0410-b5e6-96231b3b80d8
- Skip instructions added in prolog. For specific targets, prolog may
insert helper function calls (e.g. _chkstk will be called when
there're more than 4K bytes allocated on stack). However, these
helpers don't use/def YMM/XMM registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193261 91177308-0d34-0410-b5e6-96231b3b80d8
The SelectionDAGBuilder was promoting vector kernel arguments to legal
types, but this won't work for R600 and SI since kernel arguments are
stored in memory and can't be promoted. In order to handle vector
arguments correctly we need to look at the original types from the LLVM IR
function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193215 91177308-0d34-0410-b5e6-96231b3b80d8
The AMDGPUIndirectAddressing pass was previously responsible for
lowering private loads and stores to indirect addressing instructions.
However, this pass was buggy and way too complicated. The only
advantage it had over the new simplified code was that it saved one
instruction per direct write to private memory. This optimization
likely has a minimal impact on performance, and we may be able
to duplicate it using some other transformation.
For the private address space, we now:
1. Lower private loads/store to Register(Load|Store) instructions
2. Reserve part of the register file as 'private memory'
3. After regalloc lower the Register(Load|Store) instructions to
MOV instructions that use indirect addressing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193179 91177308-0d34-0410-b5e6-96231b3b80d8
the instruction defenitions and ISEL reflect this.
Prior to this patch these instructions took an i32i8imm, and the high bits were
dropped during encoding. This led to incorrect behavior for shifts by
immediates higher than 255. This patch fixes that issue by detecting large
immediate shifts and returning constant zero (for logical shifts) or capping
the shift amount at an encodable value (for arithmetic shifts).
Fixes <rdar://problem/14968098>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193096 91177308-0d34-0410-b5e6-96231b3b80d8
The second parameter of the SLD intrinsic is the number of columns (GPR) to
slide left the source array.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193076 91177308-0d34-0410-b5e6-96231b3b80d8
This ensures that the prefix data is treated as part of the function for
the purpose of debug info. This provides a better debugging experience,
among other things by allowing a debug info client to correctly look up
a function in debug info given a function pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193042 91177308-0d34-0410-b5e6-96231b3b80d8
PR17168 describes a test case that fails when compiling for debug with
fast-isel. Investigation showed that the test was failing because a DBG_VALUE
machine instruction was placed prior to a PHI.
For this problem to occur requires the following:
* Compile for debug
* Compile with fast-isel
* In a block B, fast-isel must partially succeed before punting to DAG-isel
* B must start with a PHI
* The first unhandled node in the DAG must not generate a machine instruction
* A debug value with an order less than that of that first node exists
When all of these circumstances apply, the existing test that an instruction
was not inserted won't fire. Currently it tests whether the block is empty,
or whether the last instruction generated is a phi. When fast-isel has
partially succeeded, the last instruction generated will not be a phi.
Instead, we need to check whether the current insert position is immediately
following a phi. This patch adds that check, and adds the test case from the
PR as a regression test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192976 91177308-0d34-0410-b5e6-96231b3b80d8
This caused the clang-native-mingw32-win7 buildbot to break.
The assembler was complaining about the following lines that were showing up
in the asm for CrashRecoveryContext.cpp:
movl $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax)
calll "_AddVectoredExceptionHandler@8"
.def "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4";
"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4":
calll "_RemoveVectoredExceptionHandler@4"
Reverting for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192940 91177308-0d34-0410-b5e6-96231b3b80d8
This commit implements the correct lowering of the
COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets.
Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the
post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not
have the post-increment form of these instructions so the generated
assembly contained invalid instructions.
Passing the generated assembly to gcc caused it to complain with an
error like this:
Error: cannot honor width suffix -- `ldrb r3,[r0],#1'
and the integrated assembler would generate an object file with an
invalid instruction encoding.
This commit contains a small test case that demonstrates the problem
with thumb1 targets as well as an expanded test case that more
throughly tests the lowering of byval struct passing for arm,
thumb1, and thumb2 targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192916 91177308-0d34-0410-b5e6-96231b3b80d8
class. The instruction class includes the signed saturating doubling
multiply-add long, signed saturating doubling multiply-subtract long, and
the signed saturating doubling multiply long instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192908 91177308-0d34-0410-b5e6-96231b3b80d8
When canonicalizing dags according to the rule
(shl (zext (shr X, c1) ), c1) ==> (zext (shl (shr X, c1), c1))
remember to add the new shl dag to the DAGCombiner worklist of nodes.
If we don't explicitly add it to the worklist of nodes to visit, we
may not trigger later on the rule that folds the shift left + logical
shift right into a AND instruction with bitmask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192883 91177308-0d34-0410-b5e6-96231b3b80d8
Consider the following:
typedef unsigned short ushort4U __attribute__((ext_vector_type(4),
aligned(2)));
typedef unsigned short ushort4 __attribute__((ext_vector_type(4)));
typedef unsigned short ushort8 __attribute__((ext_vector_type(8)));
typedef int int4 __attribute__((ext_vector_type(4)));
int4 __bbase_cvt_int(ushort4 v) {
ushort8 a;
a.lo = v;
return _mm_cvtepu16_epi32(a);
}
This generates the, not unreasonable, IR:
define <4 x i32> @foo0(double %v.coerce) nounwind ssp {
%tmp = bitcast double %v.coerce to <4 x i16>
%tmp1 = shufflevector <4 x i16> %tmp, <4 x i16> undef, <8 x i32> <i32
%0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
%tmp2 = tail call <4 x i32> @llvm.x86.sse41.pmovzxwd(<8 x i16> %tmp1)
ret <4 x i32> %tmp2
}
The problem is when type legalization gets hold of the v4i16. It
legalizes that by spilling to the stack, then doing a zero-extending
load. Things go even more silly from there, ending up with something
like:
_foo0:
movsd %xmm0, -8(%rsp) <== Spill to the stack.
movq -8(%rsp), %xmm0 <== Reload it right back out.
pmovzxwd %xmm0, %xmm1 <== Here's what we actually asked for.
pblendw $1, %xmm1, %xmm0 <== We don't need this at all
pmovzxwd %xmm0, %xmm0 <== We already did this
ret
The v8i8 to v8i16 zext intrinsic gives even worse results, with two
table lookups via pshufb instructions(!!).
To avoid all that, we can move the bitcasting until after we've formed
the wider (legal) vector type. Then our normal codegen flows along
nicely and we get the expected:
_foo0:
pmovzxwd %xmm0, %xmm0
ret
rdar://15245794
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192866 91177308-0d34-0410-b5e6-96231b3b80d8
The reason this got reverted was that the @feat.00 symbol which was emitted
for every TU became quoted, and on cygwin/mingw we use the gas assembler which
couldn't handle the quotes.
This commit fixes the problem by only emitting @feat.00 for win32, where we use
clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no
loss there.
With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since
it uses the Itanium ABI, which doesn't put these funny characters in symbols.
> Because of win32 mangling, we produce symbol and section names with
> funny characters in them, most notably @ characters.
>
> MC would choke on trying to parse its own assembly output. This patch addresses
> that by:
>
> - Making @ trigger quoting of symbol names
> - Also quote section names in the same way
> - Just parse section names like other identifiers (to allow for quotes)
> - Don't assume @ signifies a symbol variant if it is in a string.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192859 91177308-0d34-0410-b5e6-96231b3b80d8
We were calling llvm_unreachable() when failing to optimize the
branch into if case. However, it is still possible for us
to structurize the CFG by duplicating blocks even if this optimization
fails.
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192813 91177308-0d34-0410-b5e6-96231b3b80d8
This happens e.g. with <2 x i64> -1 on x86_32. It cannot be generated directly
because i64 is illegal. It would be nice if getNOT would handle this
transparently, but I don't see a way to generate a legal constant there right
now. Fixes PR17487.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192795 91177308-0d34-0410-b5e6-96231b3b80d8
The input to an RxSBG operation can be narrower as long as the upper bits
are don't care. This fixes a FIXME added in r192783.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192790 91177308-0d34-0410-b5e6-96231b3b80d8
We previously used the default expansion to SELECT_CC, which in turn would
expand to "LHI; BRC; LHI". In most cases it's better to use an IPM-based
sequence instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192784 91177308-0d34-0410-b5e6-96231b3b80d8
This is really an extension of the current (shl (shr ...)) -> shl optimization.
The main difference is that certain upper bits must also not be demanded.
The motivating examples are the first two in the testcase, which occur
in llvmpipe output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192783 91177308-0d34-0410-b5e6-96231b3b80d8
GNU AS didn't like quotes in symbol names.
Error: junk at end of line, first unrecognized character is `"'
.def "@feat.00";
"@feat.00" = 1
Reproduced on Cygwin's 2.23.52.20130309 and mingw32's 2.20.1.20100303.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192775 91177308-0d34-0410-b5e6-96231b3b80d8
Because of win32 mangling, we produce symbol and section names with
funny characters in them, most notably @ characters.
MC would choke on trying to parse its own assembly output. This patch addresses
that by:
- Making @ trigger quoting of symbol names
- Also quote section names in the same way
- Just parse section names like other identifiers (to allow for quotes)
- Don't assume @ signifies a symbol variant if it is in a string.
Differential Revision: http://llvm-reviews.chandlerc.com/D1945
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192758 91177308-0d34-0410-b5e6-96231b3b80d8
This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed saving
a nice chunk of compile time.
Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.
On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.
Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.
The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.
Unit tests are updated so they don't depend on SD scheduling heuristics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192750 91177308-0d34-0410-b5e6-96231b3b80d8
- Type of index used in extract_vector_elt or insert_vector_elt supposes
to be TLI.getVectorIdxTy() which is pointer type on most targets. It'd
better to truncate (or zero-extend in case it's changed later) it to
mask element type to guarantee they are matching instead of asserting
that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192722 91177308-0d34-0410-b5e6-96231b3b80d8
- Lower signed division by constant powers-of-2 to target-independent
DAG operators instead of target-dependent ones to support them better
on targets where vector types are legal but shift operators on that
types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
though <16 x i16> is a legal type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192721 91177308-0d34-0410-b5e6-96231b3b80d8
rdar:15221834 False AVX register dependencies cause 5x slowdown on
flops-5/6 and significant slowdown on several others.
This was blocking the switch to MI-Sched.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192669 91177308-0d34-0410-b5e6-96231b3b80d8
through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in that same basic block, otherwise
we may reference variables that were not live accross basic blocks
resulting in undefined virtual registers.
The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.
<rdar://problem/15192473>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192636 91177308-0d34-0410-b5e6-96231b3b80d8
a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
to be generated.
I've also added a fixme for the test case that we should improve
the code generated, it should look something like is documented
in the tls abi document.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192631 91177308-0d34-0410-b5e6-96231b3b80d8
Per original comment, the intention of this loop
is to go ahead and break the critical edge
(in order to sink this instruction) if there's
reason to believe doing so might "unblock" the
sinking of additional instructions that define
registers used by this one. The idea is that if
we have a few instructions to sink "together"
breaking the edge might be worthwhile.
This commit makes a few small changes
to help better realize this goal:
First, modify the loop to ignore registers
defined by this instruction. We don't
sink definitions of physical registers,
and sinking an SSA definition isn't
going to unblock an upstream instruction.
Second, ignore uses of physical registers.
Instructions that define physical registers are
rejected for sinking, and so moving this one
won't enable moving any defining instructions.
As an added bonus, while virtual register
use-def chains are generally small due
to SSA goodness, iteration over the uses
and definitions (used by hasOneNonDBGUse)
for physical registers like EFLAGS
can be rather expensive in practice.
(This is the original reason for looking at this)
Finally, to keep things simple continue
to only consider this trick for registers that
have a single use (via hasOneNonDBGUse),
but to avoid spuriously breaking critical edges
only do so if the definition resides
in the same MBB and therefore this one directly
blocks it from being sunk as well.
If sinking them together is meant to be,
let the iterative nature of this pass
sink the definition into this block first.
Update tests to accomodate this change,
add new testcase where sinking avoids pipeline stalls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192608 91177308-0d34-0410-b5e6-96231b3b80d8
The alignment of allocated space was wrong, see Bugzila 17345.
Done by Zvi Rackover <zvi.rackover@intel.com>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192573 91177308-0d34-0410-b5e6-96231b3b80d8
When if converting something like:
true:
... = R0<kill>
false:
... = R0<kill>
then the instructions of the true block must not have a <kill> flag
anymore, as the instruction of the false block follow and do still read
the R0 value.
Specifically this patch determines the set of register live-in in the
false block (possibly after simulating the liveness changes of the
duplicated instructions). Each of these live-in registers mustn't be
killed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192482 91177308-0d34-0410-b5e6-96231b3b80d8
This should fix the buildbots.
Original commit message:
[DAGCombiner] Slice a big load in two loads when the element are next to each
other in memory and the target has paired load and performs post-isel loads
combining.
E.g., this optimization will transform something like this:
a = load i64* addr
b = trunc i64 a to i32
c = lshr i64 a, 32
d = trunc i64 c to i32
into:
b = load i32* addr1
d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.
One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.
<rdar://problem/14477220>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192476 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts r192454
Apparently FileCheck isn't as smart as I though and does not enforce a
topological order between variable defs+uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192472 91177308-0d34-0410-b5e6-96231b3b80d8
other in memory and the target has paired load and performs post-isel loads
combining.
E.g., this optimization will transform something like this:
a = load i64* addr
b = trunc i64 a to i32
c = lshr i64 a, 32
d = trunc i64 c to i32
into:
b = load i32* addr1
d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.
One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.
<rdar://problem/14477220>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192471 91177308-0d34-0410-b5e6-96231b3b80d8
For NVPTX, this fixes a crash where the emitImplicitDef implementation was expecting physical registers,
while NVPTX uses virtual registers (with a couple of exceptions). Now, the implicit def comment will be
emitted as a true PTX register name. Other targets can use this to customize the output of implicit def
comments.
Fixes PR17519
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192444 91177308-0d34-0410-b5e6-96231b3b80d8
When a ConstantExpr which uses a thread local is part of a PHI node
instruction, the insruction that replaces the ConstantExpr must
be inserted in the predecessor block, in front of the terminator instruction.
If the predecessor block has multiple successors, the edge is first split.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192432 91177308-0d34-0410-b5e6-96231b3b80d8
We can't enable the verifier for tests with SI_IF and SI_ELSE, because
these instructions are always followed by a COPY which copies their
result to the next basic block. This violates the machine verifier's
rule that non-terminators can not folow terminators.
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192366 91177308-0d34-0410-b5e6-96231b3b80d8
When we had a sequence like:
s1 = VLDRS [r0, 1], Q0<imp-def>
s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def>
s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def>
s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def>
we were gathering the {s0, s1} loads below the s3 load. This is fine,
but confused the verifier since now the s3 load had Q0<imp-use> with
no definition above it.
This should mark such uses <undef> as well. The liveness structure at
the beginning and end of the block is unaffected, and the true sN
definitions should prevent any dodgy reorderings being introduced
elsewhere.
rdar://problem/15124449
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192344 91177308-0d34-0410-b5e6-96231b3b80d8
Substantial SelectionDAG scheduling is going away soon, and is
interfering with Hao's attempts to implement LDn/STn instructions, so
I say we make the leap first.
There were a few reorderings (inevitably) which broke some tests. I
tried to replace them with CHECK-DAG variants mostly, but some too
complex for that to be useful and I just reordered them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192282 91177308-0d34-0410-b5e6-96231b3b80d8
Mips16 will try and create a stub for it and this will
result in a link error because that function does not exist in libc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192223 91177308-0d34-0410-b5e6-96231b3b80d8
from struct byval to registers.
We used to pass 0 which means the alignment of PtrVT. Even when the alignment
of the struct is smaller than 4, the LOADs would have alignment of 4, and
further optimizations could combine the LOADs into a ldm, which would
cause crash.
The fix is to pass the alignment of the struct byval.
rdar://problem/15144402
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192126 91177308-0d34-0410-b5e6-96231b3b80d8
Bitcasting everything to i8* won't work. Autoupgrade the old
intrinsic declarations to use the new mangling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192117 91177308-0d34-0410-b5e6-96231b3b80d8
Regalloc can emit unaligned spills nowadays, but we can't fold the
spills into SSE ops if we can't guarantee alignment. PR12250.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192064 91177308-0d34-0410-b5e6-96231b3b80d8
This is required because i64 is a legal type but addxcc/subxcc reads icc carry bit, which are 32 bit conditional codes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192054 91177308-0d34-0410-b5e6-96231b3b80d8
The most likely case where this error happens is when the user specifies
too many register operands. Don't make it look like an internal LLVM bug
when we can see that the error is coming from an inline asm instruction.
For other instructions we keep the "ran out of registers" error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192041 91177308-0d34-0410-b5e6-96231b3b80d8
optimizeSelect folds (predicated) copy instructions, it must not ignore
the original register class of the operand when replacing the register
with the copies dest register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191963 91177308-0d34-0410-b5e6-96231b3b80d8
The jump doesn't really kill the registers, the following call does but
we never get back anyway.
This avoids some verify-machineinstrs problems when TAILJUMPs are
if-converted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191962 91177308-0d34-0410-b5e6-96231b3b80d8
Copy over the whole register machine operand instead of creating a new one
with an incomplete set of flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191961 91177308-0d34-0410-b5e6-96231b3b80d8
Thanks for Dimitry Andric, Rafael Espindola, and Benjamin Kramer
for providing and progressively reducing the test case!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191782 91177308-0d34-0410-b5e6-96231b3b80d8
There are no corresponding patterns for small immediates because they would
prevent the use of fused compare-and-branch instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191775 91177308-0d34-0410-b5e6-96231b3b80d8
This function-attribute modifies the callee-saved register list and function
epilogue (specifically the return instruction) so that a routine is suitable
for use as an interrupt-handler of the specified type without disrupting
user-mode applications.
rdar://problem/14207019
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191766 91177308-0d34-0410-b5e6-96231b3b80d8
Similar to low words, we can use the shorter LLIHL and LLIHH if it turns
out that the other half of the GR64 isn't live.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191750 91177308-0d34-0410-b5e6-96231b3b80d8
This just adds the basics necessary for allocating the upper words to
virtual registers (move, load and store). The move support is parameterised
in a way that makes it easy to handle zero extensions, but the associated
zero-extend patterns are added by a later patch.
The easiest way of testing this seemed to be add a new "h" register
constraint for high words. I don't expect the constraint to be useful
in real inline asms, but it should work, so I didn't try to hide it
behind an option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191739 91177308-0d34-0410-b5e6-96231b3b80d8
We were completely ignoring the unorder/ordered attributes of condition
codes and also incorrectly lowering seto and setuo.
Reviewed-by: Vincent Lejeune<vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191603 91177308-0d34-0410-b5e6-96231b3b80d8
of loops.
Previously, two consecutive calls to function "func" would result in the
following sequence of instructions:
1. load $16, %got(func)($gp) // load address of lazy-binding stub.
2. move $25, $16
3. jalr $25 // jump to lazy-binding stub.
4. nop
5. move $25, $16
6. jalr $25 // jump to lazy-binding stub again.
With this patch, the second call directly jumps to func's address, bypassing
the lazy-binding resolution routine:
1. load $25, %got(func)($gp) // load address of lazy-binding stub.
2. jalr $25 // jump to lazy-binding stub.
3. nop
4. load $25, %got(func)($gp) // load resolved address of func.
5. jalr $25 // directly jump to func.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191591 91177308-0d34-0410-b5e6-96231b3b80d8
Remove the command line argument "struct-path-tbaa" since we should not depend
on command line argument to decide which format the IR file is using. Instead,
we check the first operand of the tbaa tag node, if it is a MDNode, we treat
it as struct-path aware TBAA format, otherwise, we treat it as scalar TBAA
format.
When clang starts to use struct-path aware TBAA format no matter whether
struct-path-tbaa is no, and we can auto-upgrade existing bc files, the support
for scalar TBAA format can be dropped.
Existing testing cases are updated to use the struct-path aware TBAA format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191538 91177308-0d34-0410-b5e6-96231b3b80d8
The backend tries to use block operations like MVC, NC, OC and XC for
simple scalar operations. For correctness reasons, it rejects any case
in which the regions might partially overlap. However, for performance
reasons, it should also reject cases where the regions might be equal,
since the instruction might then not use the fast path.
This fixes a performance regression seen in bzip2. We may want to limit
the optimisation even more in future, or even remove it entirely, but I'll
try with this for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191525 91177308-0d34-0410-b5e6-96231b3b80d8
The backend previously folded offsets into PC-relative addresses
whereever possible. That's the right thing to do when the address
can be used directly in a PC-relative memory reference (using things
like LRL). But if we have a register-based memory reference and need
to load the PC-relative address separately, it's better to use an anchor
point that could be shared with other accesses to the same area of the
variable.
Fixes a FIXME.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191524 91177308-0d34-0410-b5e6-96231b3b80d8
This intrinsic is lowered into an equivalent INSERT_VECTOR_ELT which is
further lowered into a sequence of insert.w's on MIPS32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191521 91177308-0d34-0410-b5e6-96231b3b80d8
This intrinsic is lowered into an equivalent BUILD_VECTOR which is further
lowered into a sequence of insert.w's on MIPS32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191519 91177308-0d34-0410-b5e6-96231b3b80d8
For v4f32 and v2f64, INSERT_VECTOR_ELT is matched by a pseudo-insn which is
later expanded to appropriate insve.[wd] insns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191515 91177308-0d34-0410-b5e6-96231b3b80d8
For v4f32 and v2f64, EXTRACT_VECTOR_ELT is matched by a pseudo-insn which may
be expanded to subregister copies and/or instructions as appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191514 91177308-0d34-0410-b5e6-96231b3b80d8