This reverts commit 5b55a47e94.
A test case was found to crash after this was applied. I'll file a bug with the test case to track fixing this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212550 91177308-0d34-0410-b5e6-96231b3b80d8
This changes the implementation of atomic NAND operations
from "a & ~b" (compatible with GCC < 4.4) to actual "~(a & b)"
(compatible with GCC >= 4.4).
This is in line with the common-code and ARM back-end change
implemented in r212433.
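For illustration, the difference between the two semantics in plain C++ (a minimal sketch, not the backend code itself):
int nand_old(int a, int b) { return a & ~b; }   // GCC < 4.4 behaviour
int nand_new(int a, int b) { return ~(a & b); } // GCC >= 4.4 behaviour
// The two differ as soon as 'a' has any zero bits, e.g. for a = 0, b = 0:
// nand_old(0, 0) == 0, while nand_new(0, 0) == ~0 (all bits set).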
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212547 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the DAG combiner how to fold a shuffle according to the rule:
shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2)
We do this only if the resulting mask M2 is legal; this is to avoid introducing
illegal shuffles that are potentially expanded into a sub-optimal sequence
of target-specific dag nodes.
This patch has the advantage of being target independent, since it works on ISD
nodes. Therefore, all targets (not only x86) can take advantage of this rule.
The idea behind this patch is that most shuffle pairs can be safely combined
before we run the legalizer on vector operations. This allows us to
combine/simplify dag nodes earlier in the process and not only immediately
before instruction selection stage.
That said, this patch is not meant to replace any existing target-specific
combine rules; backends might still introduce new shuffles during the
legalization stage. Also, this rule is very simple and avoids aggressively
optimizing shuffles.
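As a hedged illustration of the mask composition, here is a minimal C++ sketch (the helper name is hypothetical, not the actual DAGCombiner code):
#include <vector>
// shuffle(shuffle(x, undef, M0), undef, M1) selects x[M0[M1[i]]] for
// each output lane i, so the combined mask is M2[i] = M0[M1[i]].
// A negative index denotes an undef lane, as in ISD shuffle masks.
std::vector<int> composeMasks(const std::vector<int> &M0,
                              const std::vector<int> &M1) {
  std::vector<int> M2(M1.size());
  for (size_t i = 0, e = M1.size(); i != e; ++i)
    M2[i] = M1[i] < 0 ? -1 : M0[M1[i]]; // undef in M1 stays undef
  return M2;
}
// The fold is applied only if the target reports M2 as a legal mask.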
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212539 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Follow on to r212519 to improve the encapsulation and limit the scope of the enums.
Also merged two very similar parser functions, fixed a bug where ASEs
were not being reported, and marked CPR1s as being 128-bit when MSA is
enabled.
Differential Revision: http://reviews.llvm.org/D4384
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212522 91177308-0d34-0410-b5e6-96231b3b80d8
aggressively from the x86 shuffle lowering to the generic SDAG vector
shuffle formation code.
This code already tried to fold away shuffles of splats! It just had
lots of bugs and couldn't handle the case my new x86 shuffle lowering
needed.
First, it failed to correctly compute whether N2 was undef: it
precomputed this, then performed transformations which could *make* N2
undef, and never reconsidered the precomputed state.
Second, it didn't look through bitcasts at all, even in the safe cases
where they are just element-type bitcasts with no change to the number
of elements.
Third, it didn't handle all-zero bitcasts nicely the way my code in the
x86 side of things did, which is essential to getting good zext-shuffle
lowerings.
But all of these issues are generic. I just ported the code down to this layer
and fixed the surrounding bugs. Tests exercising this in the x86 backend
still pass and some silly code in widen_cast-6.ll gets better. I updated
that test to be a bit more precise but it's still pretty unclear what
the value of the test is in this day and age.
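For the bitcast point above, a rough sketch of the kind of safety check involved (a hypothetical helper, not the actual SelectionDAG code):
#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;
// A bitcast is safe to look through for this fold when it only changes
// the element type and keeps the number of elements the same, e.g.
// v4i32 <-> v4f32, but not v4i32 <-> v2i64.
static bool isElementTypeBitcast(SDValue V) {
  if (V.getOpcode() != ISD::BITCAST)
    return false;
  EVT DstVT = V.getValueType();
  EVT SrcVT = V.getOperand(0).getValueType();
  return DstVT.isVector() && SrcVT.isVector() &&
         DstVT.getVectorNumElements() == SrcVT.getVectorNumElements();
}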
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212517 91177308-0d34-0410-b5e6-96231b3b80d8
around the handling of UNDEF lanes in boolean vector content analysis.
The code before my changes here also failed to check for non-constant
splats in a buildvector. I have no idea how to trigger this; I just
spotted it by inspection when trying to understand the code. It seems
extremely unlikely to be worth the trouble to teach the only caller of
this code (DAG combining setcc patterns) how to cleverly handle undef
lanes, so I've just commented more thoroughly that we're giving up
there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212515 91177308-0d34-0410-b5e6-96231b3b80d8
nodes about whether they are splats. This is factored out and improved
from r212324 which got reverted as it was far too aggressive. The new
API should help more conservatively handle buildvectors that are
a mixture of splatted and undef values.
No functionality change at this point. The hope is to slowly
re-introduce the undef-tolerant optimization of splats, but each time
being forced to make a conscious decision about how to handle the undefs
in a way that doesn't lead to contradicting assumptions about the
collapsed value.
Hal has pointed out in discussions that this may not end up being the
desired API and instead it may be more convenient to get a mask of the
undef elements or something similar. I'm starting simple and will expand
the API as I adapt actual callers and see exactly what they need.
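As a rough sketch of what an undef-tolerant splat query can look like (simplified; not the actual BuildVectorSDNode API):
#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;
// Ignore undef lanes; all remaining lanes must agree on one value.
static SDValue getSplatValueIgnoringUndef(const SDNode *BV) {
  SDValue Splat;
  for (unsigned i = 0, e = BV->getNumOperands(); i != e; ++i) {
    SDValue Op = BV->getOperand(i);
    if (Op.getOpcode() == ISD::UNDEF)
      continue;                 // tolerated: contributes nothing
    if (!Splat.getNode())
      Splat = Op;               // first defined lane seen
    else if (Op != Splat)
      return SDValue();         // two distinct defined values: no splat
  }
  return Splat;                 // null if every lane was undef
}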
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212514 91177308-0d34-0410-b5e6-96231b3b80d8
All blacklisting logic is now moved to the frontend (Clang).
If a function (or source file it is in) is blacklisted, it doesn't
get sanitize_address attribute and is therefore not instrumented.
If a global variable (or source file it is in) is blacklisted, it is
reported to be blacklisted by the entry in llvm.asan.globals metadata,
and is not modified by the instrumentation.
The latter may lead to certain false positives - not all the globals
created by Clang are described in llvm.asan.globals metadata (e.g.,
RTTI descriptors are not), so we may start reporting errors on them
even if the module they appear in is blacklisted. We assume it's fine
to take this risk:
1) errors on these globals are rare and usually indicate wild memory access;
2) we can lazily add descriptors for these globals into llvm.asan.globals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212505 91177308-0d34-0410-b5e6-96231b3b80d8
As a destination, k0 is allowed, but not as a predicate/writemask.
I also modified the test to allow checking of error messages by the assembler.
I applied a similar approach to the test ret.s in the same directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212504 91177308-0d34-0410-b5e6-96231b3b80d8
When combining a sequence of two PSHUFD dag nodes into a single PSHUFD,
make sure that we assign the correct type to the resulting PSHUFD.
X86ISD::PSHUFD dag nodes can be either MVT::v4i32 or MVT::v4f32.
Before this change, an assertion failure was triggered in method
'DAGCombinerInfo::CombineTo' when trying to combine the shuffles from the test
below into a single PSHUFD.
define <4 x float> @test1(<4 x float> %V) {
%1 = shufflevector <4 x float> %V, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1>
%2 = shufflevector <4 x float> %1, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1>
ret <4 x float> %2
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212498 91177308-0d34-0410-b5e6-96231b3b80d8
Add custom lowering code for signed multiply instruction selection, because the
default FastISel instruction selection for ISD::MUL will use unsigned multiply
for the i8 type and signed multiply for all other types. This would set the
incorrect flags for the overflow check.
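To see why this matters for i8, consider 2 * 64 = 128: it fits an unsigned byte but overflows a signed one (a small standalone C++ illustration, not the FastISel code):
#include <cstdint>
#include <cstdio>
int main() {
  int8_t a = 2, b = 64;
  int wide = int(a) * int(b);                               // 128
  bool signed_ovf = wide < INT8_MIN || wide > INT8_MAX;     // true
  bool unsigned_ovf = uint8_t(a) * uint8_t(b) > UINT8_MAX;  // false
  std::printf("signed: %d, unsigned: %d\n", signed_ovf, unsigned_ovf);
}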
This fixes <rdar://problem/17549300>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212493 91177308-0d34-0410-b5e6-96231b3b80d8
Currently AArch64FastISel crashes if it tries to extend an integer into an
MVT::i128. This can happen by creating 128-bit integers like so:
typedef unsigned int uint128_t __attribute__((mode(TI)));
typedef int sint128_t __attribute__((mode(TI)));
This patch makes EmitIntExt check for their presence and then fall back to
SelectionDAG.
Tests included.
rdar://17516686
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212492 91177308-0d34-0410-b5e6-96231b3b80d8
This patch extends an existing loop over phi nodes in SimplifyCondBranchToCondBranch() to check for trapping ops and bail out of the optimization if one is found.
The test cases verify that trapping ops are not hoisted and non-trapping ops are still optimized as expected.
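A hedged sketch of the kind of check this adds, using LLVM's isSafeToSpeculativelyExecute (the loop structure is simplified; not the verbatim SimplifyCFG code):
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;
// If any incoming value of a phi is an instruction that cannot be
// speculated (e.g. a division that may trap), bail out rather than
// hoisting it past the conditional branch.
static bool anyIncomingValueMayTrap(const PHINode *PN) {
  for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i)
    if (const Instruction *I = dyn_cast<Instruction>(PN->getIncomingValue(i)))
      if (!isSafeToSpeculativelyExecute(I))
        return true; // potentially trapping: skip the optimization
  return false;
}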
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212490 91177308-0d34-0410-b5e6-96231b3b80d8
According to a FIXME in ARMMCTargetDesc.cpp the ARM version parsing should be
in the Triple helper class.
Patch by: Gabor Ballabas
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212479 91177308-0d34-0410-b5e6-96231b3b80d8
Arguments passed as "byval align" should get the specified alignment
in the parameter save area. There was some code in PPCISelLowering.cpp
that attempted to implement this, but it didn't work correctly:
while the code did update the ArgOffset value, it neglected to update
the PtrOff value (which was already computed from the old ArgOffset),
and it also neglected to update GPR_idx -- fields skipped due to
alignment in the save area must likewise be skipped in GPRs.
This patch fixes and simplifies this logic by:
- handling argument offset alignment right at the beginning
of argument processing, using a new helper routine
CalculateStackSlotAlignment (this avoids having to update
PtrOff and other derived values later on)
- not tracking GPR_idx separately, but always computing the
correct GPR_idx for each argument *from* its ArgOffset
- removing some redundant computation in LowerFormalArguments:
MinReservedArea must equal ArgOffset after argument processing,
so there's no use in computing it twice.
[This doesn't change the behavior of the current clang front-end,
since that never creates "byval align" arguments at the moment.
This will change with a follow-on patch, however.]
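For illustration, a minimal sketch of the offset-first approach (the names and the ParamAreaStart constant are hypothetical, not the actual PPCISelLowering code):
// Round the running offset up *before* deriving anything from it, so
// PtrOff and friends never see a stale, unaligned value.
static unsigned alignOffset(unsigned Offset, unsigned Align) {
  return (Offset + Align - 1) & ~(Align - 1); // Align is a power of two
}
// With 8-byte GPRs, the register index then follows from the offset:
//   GPR_idx = (ArgOffset - ParamAreaStart) / 8;
// so fields skipped for alignment in the save area are automatically
// skipped in GPRs as well, with no separate counter to keep in sync.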
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212476 91177308-0d34-0410-b5e6-96231b3b80d8
lanes in vector splats.
The core problem here is that undef lanes can't *unilaterally* be
considered to contribute to splats. Their handling needs to be more
cautious. There is also a reported failure of the nightly testers
(thanks Tobias!) that may well stem from the same core issue. I'm going
to fix this theoretical issue, factor the APIs a bit better, and then
verify that I don't see anything bad with Tobias's reduction from the
test suite before recommitting.
Original commit message for r212324:
[x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for
any constant, constant FP, or undef splat and to tolerate any undef
lanes in a splat, then replace all uses of isSplatVector in X86's
lowering with it.
This fixes issues where undef lanes in an otherwise splat vector would
prevent the splat logic from firing. It is a touch more awkward to use
this interface, but it is much more accurate. Suggestions for better
interface structuring welcome.
With this fix, the code generated with the widening legalization
strategy for widen_cast-4.ll is *dramatically* improved as the special
lowering strategies for a v16i8 SRA kick in even though the high lanes
are undef.
We also get a slightly different choice for broadcasting an aligned
memory location, and use vpshufd instead of vbroadcastss. This looks
like a minor win for pipelining and domain crossing, but a minor loss
for the number of micro-ops. I suspect it's a wash, but folks can
easily tweak the lowering if they want.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212475 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes various bugs with reordering loads and stores.
Scalarized vector loads weren't collecting the chains
at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212473 91177308-0d34-0410-b5e6-96231b3b80d8
Generate entire ASan asm instrumentation in MC without
relying on runtime helper functions.
Patch by Yuri Gorshenin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212455 91177308-0d34-0410-b5e6-96231b3b80d8
essentially a DAG combine that never gets a chance to run.
We might typically expect DAG combining to remove shuffles-of-splats and
other similar patterns, but we don't get a chance to run the DAG
combiner when we recursively form sub-shuffles during the lowering of
a shuffle. So instead hand-roll a really important combine directly into
the lowering code to detect shuffles-of-splats, especially shuffles of
an all-zero splat which needn't even have the same element width, etc.
This lets the new vector shuffle lowering handle shuffles which
implement things like zero-extension really nicely. This will become
even more important when I wire the legalization of zero-extension to
vector shuffles with the new widening legalization strategy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212444 91177308-0d34-0410-b5e6-96231b3b80d8
We've been performing the wrong operation on ARM for "atomicrmw nand" for
years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b".
This bled over into the generic expansion pass.
So I assume no one has ever actually tried to do an atomic nand in the real
world. Oh well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212443 91177308-0d34-0410-b5e6-96231b3b80d8
This completes the handling for DLL import storage symbols when lowering
instructions. A DLL import storage symbol must have an additional load
performed prior to use. This is applicable to variables and functions.
This is particularly important for non-function symbols as it is possible to
handle function references by emitting a thunk which performs the translation
from the unprefixed __imp_ symbol to the proper symbol (although this is a
non-optimal lowering). For a variable symbol, no such thunk can be
accommodated.
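To illustrate the extra indirection (plain C++ with a hypothetical symbol; MSVC-style source):
__declspec(dllimport) extern int imported_var;
int get() {
  // Conceptually lowered as: tmp = load __imp_imported_var; return *tmp;
  return imported_var;
}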
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212431 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for tracking DLLImport storage class information on a per symbol
basis in the ARM instruction selection. Use that information to correctly
mangle the symbol (dllimport symbols are referenced via *__imp_<name>).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212430 91177308-0d34-0410-b5e6-96231b3b80d8
Ensure that all paths that retrieve the symbol name go through GetARMGVSymbol
rather than getSymbol. This is desirable so that any global symbol mangling can
be centralised to this function. The motivation for this is handling of symbols
that are marked as having dllimport storage. Such a symbol requires an
extra load that is currently handled in the backend and a __imp_ prefix on the
symbol name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212429 91177308-0d34-0410-b5e6-96231b3b80d8