ReconstructShuffle() may wrongly creat a CONCAT_VECTOR trying to
concat 2 of v2i32 into v4i16. This commit is to fix this issue and
try to generate UZP1 instead of lots of MOV and INS.
Patch is initalized by Kevin Qin, and refactored by Tim Northover.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211144 91177308-0d34-0410-b5e6-96231b3b80d8
To make sure branches are in range, we need to do a better job of estimating
the length of an inline assembly block than "it's probably 1 instruction, who'd
write asm with more than that?".
Fortunately there's already a (highly suspect, see how many ways you can think
of to break it!) callback for this purpose, which is used by the other targets.
rdar://problem/17277590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211095 91177308-0d34-0410-b5e6-96231b3b80d8
Define an intrinsic for the frontend to use and pattern match it to
the RBIT instruction.
rdar://9283021
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211058 91177308-0d34-0410-b5e6-96231b3b80d8
There's probably no acatual change in behaviour here, just updating
the LowerFP_TO_INT function to be more similar to the reverse
implementation and updating costs to current CodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210985 91177308-0d34-0410-b5e6-96231b3b80d8
inverted condition codes (CINC, CINV, CNEG, CSET, and CSETM).
Matching aliases based on "immediate classes", when disassembling,
wasn't previously supported, hence adding MCOperandPredicate
into class Operand, and implementing the support for it
in AsmWriterEmitter.
The parsing for those aliases was already custom, so just adding
the missing condition into AArch64AsmParser::parseCondCode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210528 91177308-0d34-0410-b5e6-96231b3b80d8
As Ana Pazos pointed out, these have to be restored to their incoming values
before a function returns; i.e. before the tail call. So they can't be used
correctly as the destination register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210525 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we were abandonning the attempt, leading to some combination of
extra work (when selection of a load/store fails completely) and inferior code
(when this leads to a real memcpy call instead of inlining).
rdar://problem/17187463
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210520 91177308-0d34-0410-b5e6-96231b3b80d8
We were hitting an assert if FastISel couldn't create the load or store we
requested. Currently this happens for large frame-local addresses, though
CodeGen could be improved there.
rdar://problem/17187463
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210519 91177308-0d34-0410-b5e6-96231b3b80d8
I saw at least a memory leak or two from inspection (on probably
untested error paths) and r206991, which was the original inspiration
for this change.
I ran this idea by Jim Grosbach a few weeks ago & he was OK with it.
Since it's a basically mechanical patch that seemed sufficient - usual
post-commit review, revert, etc, as needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210427 91177308-0d34-0410-b5e6-96231b3b80d8
This means the output of LowerFormalArguments returns a lowered
SDValue with the correct type (expected in SelectionDAGBuilder).
Without this, an assertion under a DEBUG macro triggers when those
types are passed on the stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210102 91177308-0d34-0410-b5e6-96231b3b80d8
The C and C++ semantics for compare_exchange require it to return a bool
indicating success. This gets mapped to LLVM IR which follows each cmpxchg with
an icmp of the value loaded against the desired value.
When lowered to ldxr/stxr loops, this extra comparison is redundant: its
results are implicit in the control-flow of the function.
This commit makes two changes: it replaces that icmp with appropriate PHI
nodes, and then makes sure earlyCSE is called after expansion to actually make
use of the opportunities revealed.
I've also added -{arm,aarch64}-enable-atomic-tidy options, so that
existing fragile tests aren't perturbed too much by the change. Many
of them either rely on undef/unreachable too pervasively to be
restored to something well-defined (particularly while making sure
they test the same obscure assert from many years ago), or depend on a
particular CFG shape, which is disrupted by SimplifyCFG.
rdar://problem/16227836
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209883 91177308-0d34-0410-b5e6-96231b3b80d8
This matches gcc's behavior. It also seems natural given that aliases
contain other properties that govern how it is accessed (linkage,
visibility, dll storage).
Clang still has to be updated to expose this feature to C.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209759 91177308-0d34-0410-b5e6-96231b3b80d8
A test in test/Generic creates a DAG where the NZCV output of an ADCS is used
by multiple nodes. This makes LLVM want to save a copy of NZCV for later, which
it couldn't do before.
This should be the last fix required for the aarch64 buildbot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209651 91177308-0d34-0410-b5e6-96231b3b80d8
These are tested by test/CodeGen/Generic, so we should probably know
how to deal with them. Fortunately generic code does it if asked.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209646 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is debatable. There are two possible approaches, neither
of which is really satisfactory:
1. Use "@foo(i1 zeroext)" to mean an extension to 32-bits on Darwin,
and 8 bits otherwise.
2. Redefine "@foo(i1)" to mean that the i1 is extended by the caller
to 8 bits. This goes against the spirit of "zeroext" I think, but
it's a bit of a vague construct anyway (by definition you're going
to extend to the amount required by the ABI, that's why it's the
ABI!).
This implements option 2. The DAG machinery really isn't setup for the
first (there's a fairly strong assumption that "zeroext" goes to at
least the smallest register size), and even if it was the resulting
DAG looks like it would be inferior in many cases.
Theoretically we could add AssertZext nodes in the consumers of
ABI-passed values too now, but this actually seems to make the code
worse in practice by making truncation proceed in two steps. The code
produced is equally valid if we continue to assume only the low bit is
defined.
Should fix PR19850
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209637 91177308-0d34-0410-b5e6-96231b3b80d8
We can eliminate the custom C++ code in favour of some TableGen to
check the same things. Functionality should be identical, except for a
buffer overrun that was present in the C++ code and meant webkit
failed if any small argument needed to be passed on the stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209636 91177308-0d34-0410-b5e6-96231b3b80d8
The code emitted is what would be expected for the small model, so it
shouldn't be used when objects can be the full 64-bits away.
This fixes MCJIT tests on Linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209585 91177308-0d34-0410-b5e6-96231b3b80d8