All insertf*/extractf* functions replaced with insert/extract since we have insertf and inserti forms.
Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
Added a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187491 91177308-0d34-0410-b5e6-96231b3b80d8
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.
This also adds a test case for NVPTX that depends on this custom
legalization.
Differential Revision: http://llvm-reviews.chandlerc.com/D1195
Attempt to fix the buildbots by making the X86 test I just added platform independent
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187202 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 187198. It broke the bots.
The soft float test probably needs a -triple because of name differences.
On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of
"vroundss $1, %xmm0, %xmm0, %xmm0".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187201 91177308-0d34-0410-b5e6-96231b3b80d8
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.
This also adds a test case for NVPTX that depends on this custom
legalization.
Differential Revision: http://llvm-reviews.chandlerc.com/D1195
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187198 91177308-0d34-0410-b5e6-96231b3b80d8
This removes the need to store the asm variant in each row of the single table that existed before. Shaves ~16K off the size of X86AsmParser.o.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187026 91177308-0d34-0410-b5e6-96231b3b80d8
This makes them consistent with 'bt' which already had this handling. gas has the same behavior. There have been discussions on the mailing list about determining size based on the immediate, but my goal here was just to remove the inconsistency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186904 91177308-0d34-0410-b5e6-96231b3b80d8
It only didn't use it before because it seems InstAlias handling in the asm printer fails to count tied operands so it tried to find an xor with 2 operands instead of the 3 it wfails to count tied.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186900 91177308-0d34-0410-b5e6-96231b3b80d8
Use PMIN/PMAX for UGE/ULE vector comparions to reduce the number of required
instructions. This trick also works for UGT/ULT, but there is no advantage in
doing so. It wouldn't reduce the number of instructions and it would actually
reduce performance.
Reviewer: Ben
radar:5972691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186432 91177308-0d34-0410-b5e6-96231b3b80d8
In particular:
movsbw %al, %ax --> cbtw
movswl %ax, %eax --> cwtl
movslq %eax, %rax --> cltq
According to Intel's manual those have the same performance characteristics but
come with a smaller encoding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186174 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch adds explicit calling convention types for the Win64 and
System V/x86-64 ABIs. This allows code to override the default, and use
the Win64 convention on a target that wants to use SysV (and
vice-versa). This is needed to implement the `ms_abi` and `sysv_abi` GNU
attributes.
Reviewers:
CC:
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186144 91177308-0d34-0410-b5e6-96231b3b80d8
in-tree implementations of TargetLoweringBase::isFMAFasterThanMulAndAdd in
order to resolve the following issues with fmuladd (i.e. optional FMA)
intrinsics:
1. On X86(-64) targets, ISD::FMA nodes are formed when lowering fmuladd
intrinsics even if the subtarget does not support FMA instructions, leading
to laughably bad code generation in some situations.
2. On AArch64 targets, ISD::FMA nodes are formed for operations on fp128,
resulting in a call to a software fp128 FMA implementation.
3. On PowerPC targets, FMAs are not generated from fmuladd intrinsics on types
like v2f32, v8f32, v4f64, etc., even though they promote, split, scalarize,
etc. to types that support hardware FMAs.
The function has also been slightly renamed for consistency and to force a
merge/build conflict for any out-of-tree target implementing it. To resolve,
see comments and fixed in-tree examples.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185956 91177308-0d34-0410-b5e6-96231b3b80d8
Explicit references to %AH for an i8 remainder instruction can lead to
references to %AH in a REX prefixed instruction, which causes things to
blow up. Do the same thing in FastISel as we do for DAG isel and instead
shift %AX right by 8 bits and then extract the 8-bit subreg from that
result.
rdar://14203849
http://llvm.org/bugs/show_bug.cgi?id=16105
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185899 91177308-0d34-0410-b5e6-96231b3b80d8