Note that I followed the AVX2 convention here and didn't add LLVM intrinsics
for stores. These can be generated with the nontemporal hint on LLVM IR
stores (see new test). The GCC builtins are lowered directly into nontemporal
stores.
<rdar://problem/17082571>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211176 91177308-0d34-0410-b5e6-96231b3b80d8
It looks like there are two versions of LowerCallTo here: the
SelectionDAGBuilder one is designed to operate on LLVM IR, and the
TargetLowering one in the case where everything is at DAG level.
Previously, only the SelectionDAGBuilder variant could handle demoting
an impossible return to sret semantics (before delegating to the
TargetLowering version), but this functionality is also useful for
certain libcalls (e.g. 128-bit operations on 32-bit x86). So this
commit moves the sret handling down a level.
rdar://problem/17242889
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211155 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is a follow up to r211040 & r211052. Rather than bailing out of fast
isel this patch will generate an alternate instruction (movabsq) instead of the
leaq. While this will always have enough room to handle the 64 bit displacment
it is generally over kill for internal symbols (most displacements will be
within 32 bits) but since we have no way of communicating the code model to the
the assmebler in order to avoid flagging an absolute leal/leaq as illegal when
using a symbolic displacement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211130 91177308-0d34-0410-b5e6-96231b3b80d8
This optimizes predicates for certain compares, such as fcmp oeq %x, %x to
fcmp ord %x, %x. The latter one is more efficient to generate.
The same optimization is applied to conditional branches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211126 91177308-0d34-0410-b5e6-96231b3b80d8
Make use of helper functions to simplify the branch and compare instruction
selection in FastISel. Also add test cases for compare and conditonal branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211077 91177308-0d34-0410-b5e6-96231b3b80d8
On x86_86 the lea instruction can only use a 32 bit immediate value. When
the code is compiled statically the RIP register is not used, meaning the
immediate is all that can be used for the relocation, which is not sufficient
in the case of targets more than +/- 2GB away. This patch bails out of fast
isel in those cases and reverts to DAG which does the right thing.
Test case included.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211040 91177308-0d34-0410-b5e6-96231b3b80d8
This doesn't fix the abstract variable handling yet, but it introduces a
similar delay mechanism as was added for subprograms, causing
DW_AT_location to be reordered to the beginning of the attribute list
for local variables, and fixes all the test fallout for that.
A subsequent commit will remove the abstract variable handling in
DbgVariable and just do the abstract variable lookup at module end to
ensure that abstract variables introduced after their concrete
counterparts are appropriately referenced by the concrete variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210943 91177308-0d34-0410-b5e6-96231b3b80d8
Lowering this new node allows us to fold the almost universal
comparison for success before it's even formed. Instead we can create
a copy from EFLAGS and an X86ISD::SETCC operation since all "cmpxchg"
instructions set the zero-flag to the correct value.
rdar://problem/13201607
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210923 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.
As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.
At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.
By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.
Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.
Summary for out of tree users:
------------------------------
+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210903 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for the cvttss2si/cvttsd2si intrinsics. Preceding
insertelement instructions are folded into the conversion instruction (if
possible).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210870 91177308-0d34-0410-b5e6-96231b3b80d8
Add branch weights to branch instructions, so that the following passes can
optimize based on it (i.e. basic block ordering).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210863 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds MachineMemOperands to load and store instructions. This allows
the peephole optimizer to fold load instructions. Unfortunatelly the peephole
optimizer currently doesn't run at -O0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210858 91177308-0d34-0410-b5e6-96231b3b80d8
Recommit with fixed argument attribute checking code, which is required to bail
out of all the cases we don't handle yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210815 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r210540, adds a testcase for the regression it
caused, and marks the R600 test it was supposed to fix as XFAIL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210792 91177308-0d34-0410-b5e6-96231b3b80d8
This patch moves part of the logic implemented by the target specific
combine rules added at r210477 to a separate helper function.
This should make easier to add more rules for matching AVX/AVX2 horizontal
adds/subs.
This patch also fixes a problem caused by a wrong check performed on indices
of extract_vector_elt dag nodes in input to the scalar adds/subs.
New tests have been added to verify that we correctly check indices of
extract_vector_elt dag nodes when selecting a horizontal operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210644 91177308-0d34-0410-b5e6-96231b3b80d8
This patch slightly changes the algorithm introduced at revision 210477
to fix a problem where the algorithm was producing incorrect code for
the VEX.256 encoded versions of horizontal add/sub.
For these cases, we now try to split the two 256-bit vectors into
128-bit chunks before emitting horizontal add/sub dag nodes.
Added a new test case into haddsub-2.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210545 91177308-0d34-0410-b5e6-96231b3b80d8
The C++ and C semantics of the compare_and_swap operations actually
require us to return a boolean "success" value. In LLVM terms this
means a second comparison of the output of "cmpxchg" against the input
desired value.
However, x86's "cmpxchg" instruction sets all flags for the comparison
formed, so we can skip any secondary comparison. (N.b. this isn't true
for cmpxchg8b/16b, which only set ZF).
rdar://problem/13201607
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210523 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds new target specific combine rules to identify horizontal
add/sub idioms from BUILD_VECTOR dag nodes.
This patch also teaches the DAGCombiner how to canonicalize sequences of
insert_vector_elt dag nodes according to the following rule:
(insert_vector_elt (insert_vector_elt A, I0), I1) ->
(insert_vecto_elt (insert_vector_elt A, I1), I0)
This new canonicalization rule only triggers if the inner insert_vector
dag node has exactly one use; also, both indices must be known constants,
and I1 < I0.
This last rule made it possible to write a simpler algorithm to identify
horizontal add/sub patterns because now we don't have to worry about the
ordering of insert_vector_elt dag nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210477 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the backend how to check for the 'NoSignedWrap' flag on
binary operations to improve the emission of 'test' instructions.
If the result of a binary operation is known not to overflow we know that
resetting the Overflow flag is unnecessary and so we can avoid emitting
the test instruction.
Patch by Marcello Maggioni.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210468 91177308-0d34-0410-b5e6-96231b3b80d8
This patch modifies SelectionDAGBuilder to construct SDNodes with associated
NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator
instructions.
Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing
nsw/nuw/exact flags during codegen.
Patch by Marcello Maggioni.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210467 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We were being too strict and not accounting for undefs.
Added a test case and fixed another one where we improved codegen.
Reviewers: grosbach, nadav, delena
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4039
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210361 91177308-0d34-0410-b5e6-96231b3b80d8
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.
This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210280 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.
This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like
@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
i32 ptrtoint (i32* @bar to i32)) to i32*)
An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).
Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just uses "const char*" for that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210062 91177308-0d34-0410-b5e6-96231b3b80d8
When lowering a ISD::BRCOND into a test+branch, make sure that we
always use the correct condition code to emit the test operation.
This fixes PR19858: "i8 checked mul is wrong on x86".
Patch by Keno Fisher!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210032 91177308-0d34-0410-b5e6-96231b3b80d8
Following the lead set by r209324, I'm making these tests match the whole
instruction, so we can be sure we're lowering them correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209947 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the backend how to simplify/canonicalize dag node
sequences normally introduced by the backend when promoting certain dag nodes
with illegal vector type.
This patch adds two new combine rules:
1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) ->
(shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>)
2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) ->
(shuffle (BINOP A, B), Undef, <Mask>).
Both rules are only triggered on the type-legalized DAG.
In particular, rule 1. is a target specific combine rule that attempts
to sink a bitconvert into the operands of a binary operation.
Rule 2. is a target independet rule that attempts to move a shuffle
immediately after a binary operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209930 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If both vector args to vselect are concat_vectors and the condition is
constant and picks half a vector from each argument, convert the vselect
into a concat_vectors.
Added a test.
The ConvertSelectToConcatVector is assuming it doesn't get vselects with
arguments of, for example, <undef, undef, true, true>. Those get taken
care of in the checks above its call.
Reviewers: nadav, delena, grosbach, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3916
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209929 91177308-0d34-0410-b5e6-96231b3b80d8