That's what it actually means, and with 16-bit support it's going to be
a little more relevant since in a few corner cases we may actually want
to distinguish between 16-bit and 32-bit mode (for example, the bare 'push'
aliases to pushw/pushl, etc.).
Patch by David Woodhouse
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197768 91177308-0d34-0410-b5e6-96231b3b80d8
Added scalar compare VCMPSS, VCMPSD.
Implemented LowerSELECT for scalar FP operations.
Replaced FSETCCss and FSETCCsd with a single node type, FSETCCs.
Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1.
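As a hedged illustration (not taken from the patch), the kind of scalar FP
select that LowerSELECT now handles:

  /* Minimal sketch, assuming an AVX-512 target: a scalar FP select.
     The compare can map onto VCMPSD and the select onto a blend/masked
     move instead of a branch. */
  double select_min(double a, double b) {
    return (a < b) ? a : b;
  }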
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197384 91177308-0d34-0410-b5e6-96231b3b80d8
a vector packed single/double fp operation followed by a vector insert.
The effect is that the backend converts the packed fp instruction
followed by a vector insert into an SSE or AVX scalar fp instruction.
For example, given the following code:
__m128 foo(__m128 a, __m128 b) {
  __m128 c = a + b;
  return (__m128) {c[0], a[1], a[2], a[3]};
}
previously we generated:
addps %xmm0, %xmm1
movss %xmm1, %xmm0
we now generate:
addss %xmm1, %xmm0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197145 91177308-0d34-0410-b5e6-96231b3b80d8
immediately after SSE scalar fp instructions like addss or mulss.
Added patterns to select SSE scalar fp arithmetic instructions from a scalar
fp operation followed by a blend.
For example, given the following code:
__m128 foo(__m128 A, __m128 B) {
  A[0] += B[0];
  return A;
}
previously we generated:
addss %xmm0, %xmm1
movss %xmm1, %xmm0
now we generate:
addss %xmm1, %xmm0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196925 91177308-0d34-0410-b5e6-96231b3b80d8
- When selecting BLEND from vselect, the operands need swapping due to the
difference between vselect and SSE/AVX's BLEND insn (see the sketch below)
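A hedged illustration with SSE4.1 intrinsics (not from the patch): vselect
yields its first value operand where the mask is true, while BLENDV takes that
value as its second source, hence the swap.

  #include <immintrin.h>

  /* Sketch of the operand-order difference: vselect(mask, a, b) produces a
     where the mask is true; _mm_blendv_ps takes the "true" value as its
     second source operand. */
  __m128 select_like_vselect(__m128 mask, __m128 a, __m128 b) {
    return _mm_blendv_ps(b, a, mask); /* a where mask's sign bit is set */
  }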
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193900 91177308-0d34-0410-b5e6-96231b3b80d8
On Sandy Bridge (PR17654) we now get
vpxor %xmm1, %xmm1, %xmm1
vpunpckhbw %xmm1, %xmm0, %xmm2
vpunpcklbw %xmm1, %xmm0, %xmm0
vinsertf128 $1, %xmm2, %ymm0, %ymm0
On Haswell it's a simple
vpmovzxbw %xmm0, %ymm0
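For reference, a hedged sketch in plain C of the widening involved (the
original reproducer may differ):

  /* Zero-extend 16 unsigned bytes to 16 unsigned shorts (v16i8 -> v16i16).
     With AVX2 this can become a single vpmovzxbw; without AVX2 it is emulated
     with vpunpcklbw/vpunpckhbw against zero plus a vinsertf128. */
  void zext16(const unsigned char *src, unsigned short *dst) {
    for (int i = 0; i < 16; ++i)
      dst[i] = src[i];
  }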
There is a maze of duplicated and dead transforms and patterns in this
area. Remove the dead custom lowering of zext v8i16 to v8i32; that is
already handled by LowerAVXExtend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193262 91177308-0d34-0410-b5e6-96231b3b80d8
the instruction definitions and ISEL reflect this.
Prior to this patch these instructions took an i32i8imm, and the high bits were
dropped during encoding. This led to incorrect behavior for shifts by
immediates higher than 255. This patch fixes that issue by detecting large
immediate shifts and returning constant zero (for logical shifts) or capping
the shift amount at an encodable value (for arithmetic shifts).
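An illustrative (hypothetical) case, not taken from the new tests:

  #include <immintrin.h>

  /* A logical shift-by-immediate whose count does not fit in the imm8 field.
     Semantically the result is all zeros (the count exceeds the element
     width); before this patch the count was truncated to 8 bits during
     encoding, so 256 became a shift by 0, i.e. no shift at all. */
  __m128i shift_too_far(__m128i v) {
    return _mm_slli_epi32(v, 256);
  }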
Fixes <rdar://problem/14968098>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193096 91177308-0d34-0410-b5e6-96231b3b80d8
This allows the instruction to be encoded using the 2-byte VEX form instead of the 3-byte VEX form. The GNU assembler has similar behavior and instruction selection already does this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192088 91177308-0d34-0410-b5e6-96231b3b80d8
Add llvm.x86.* intrinsics for all of the Intel SHA Extensions instructions, as
well as tests. Also remove mayLoad and hasSideEffects, which can be inferred
from the instruction patterns.
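For instance, a hedged C-level counterpart (the intrinsic shown is the usual
immintrin.h wrapper; the llvm.x86.* names themselves are not reproduced here):

  #include <immintrin.h>

  /* Sketch: one SHA-1 round group via the intrinsic that maps onto sha1rnds4.
     Requires the SHA feature (e.g. -msha). */
  __m128i sha1_round(__m128i abcd, __m128i msg) {
    return _mm_sha1rnds4_epu32(abcd, msg, 0); /* function constant 0..3 */
  }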
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190864 91177308-0d34-0410-b5e6-96231b3b80d8
Implements instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.
Auto-detects SLM.
Turns on the post-RA scheduler when generating code for SLM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190717 91177308-0d34-0410-b5e6-96231b3b80d8
Add basic assembly/disassembly support for the first Intel SHA
instruction 'sha1rnds4'. Also includes a feature flag and test cases.
Support for the remaining instructions will follow in a separate patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190611 91177308-0d34-0410-b5e6-96231b3b80d8
All insertf*/extractf* functions were replaced with insert/extract, since we have both insertf and inserti forms.
Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
Added a test.
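A hedged sketch (AVX-512F intrinsic names assumed, not taken from the patch)
of the operations the new lowering covers:

  #include <immintrin.h>

  /* EXTRACT_SUBVECTOR: take the low 256-bit half of a 512-bit vector. */
  __m256d low_half(__m512d v) {
    return _mm512_extractf64x4_pd(v, 0);
  }

  /* EXTRACT_VECTOR_ELT: read a single lane of a 512-bit vector. */
  double lane3(__m512d v) {
    return v[3]; /* GNU C vector subscript, as in the examples above */
  }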
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187491 91177308-0d34-0410-b5e6-96231b3b80d8