llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-10-02 02:55:35 +00:00

Author	SHA1	Message	Date
Andrea Di Biagio	042bee88f3	[X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks. This patch improves the folding of vector AND nodes into blend operations for targets that feature SSE4.1. A vector AND node where one of the operands is a constant build_vector with elements that are either zero or all-ones can be converted into a blend. This allows for example to simplify the following code: define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) { %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1> %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0> %3 = or <4 x i32> %1, %2 ret <4 x i32> %3 } Before this patch llc (-mcpu=corei7) generated: andps LCPI1_0(%rip), %xmm0, %xmm0 andps LCPI1_1(%rip), %xmm1, %xmm1 orps %xmm1, %xmm0, %xmm0 retq With this patch we generate a single 'vpblendw'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221343 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 13:04:14 +00:00
Simon Pilgrim	4ad0654bb4	[X86][SSE] Enable commutation for SSE immediate blend instructions Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd instructions to be commuted - swaps the src registers and inverts the blend mask. This is primarily to improve memory folding (see new tests), but it also improves the quality of shuffles (see modified tests). Differential Revision: http://reviews.llvm.org/D6015 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221313 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-04 23:25:08 +00:00
Andrea Di Biagio	c270e8abbe	[X86] Add 'FeatureSlowSHLD' to cpu 'bdver3'. Also explicit set FeatureAVX and FeatureSSE4A for all the bdver* cpus. This patch adds 'FeatureSlowSHLD' to 'bdver3'. According to the official AMD optimization guide for amdfam15: "Using alternative code in place of SHLD achieves lower overall latency and requires fewer execution resources. The 32-bit and 64-bit forms of ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath instructions, while SHLD is a VectorPath instruction." This patch also explicitly sets feature AVX and SSE4A for all the bdver* cpus. This part of the patch is a non-functional change and it is mainly done for clarity reasons (Both XOP and FMA4 already imply AVX and SSE4A). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221296 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-04 21:18:09 +00:00
Ahmed Bougacha	38728dc043	[X86] Add debug print name for X86ISD::[US]MUL8. NFC-ish. The opcodes were added in r220516, but I forgot to add the print names. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221185 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 21:25:18 +00:00
Ahmed Bougacha	40453da779	[X86] 8bit divrem: Improve codegen for AH register extraction. For 8-bit divrems where the remainder is used, we used to generate: divb %sil shrw $8, %ax movzbl %al, %eax That was to avoid an H-reg access, which is problematic mainly because it isn't possible in REX-prefixed instructions. This patch optimizes that to: divb %sil movzbl %ah, %eax To do that, we explicitly extend AH, and extract the L-subreg in the resulting register. The extension is done using the NOREX variants of MOVZX. To support signed operations, MOVSX_NOREX is also added. Further, this introduces a new SDNode type, [us]divrem_ext_hreg, which is then lowered to a sequence containing a single zext (rather than 2). Differential Revision: http://reviews.llvm.org/D6064 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221176 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 20:26:35 +00:00
Rafael Espindola	5793838fc8	Remove redundant calls to isMaterializable. This removes calls to isMaterializable in the following cases: * It was redundant with a call to isDeclaration now that isDeclaration returns the correct answer for materializable functions. * It was followed by a call to Materialize. Just call Materialize and check EC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221050 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 16:46:18 +00:00
Adrian Prantl	bdec4aee7b	Revert "Temporarily revert r220777 to sort out build bot breakage." This reverts commit r221028. Later commits depend on this and reverting just this one causes even more bots to fail. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221041 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 03:19:45 +00:00
NAKAMURA Takumi	5ea912ac8d	Revert r220779, "[AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT tranforms to AND in PerformSELECTCombine." Since r221028 (reverting r220777), this caused failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221040 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 01:36:14 +00:00
Adrian Prantl	a4e1564971	Temporarily revert r220777 to sort out build bot breakage. "[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros." git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221028 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 00:26:59 +00:00
Reid Kleckner	e1a4787d5d	Work around bugs in MSVC "14" CTP 3's conversion logic It appears to ignore or find ambiguous MachineInstrBuilder's conversion operators that allow conversion to MachineInstr* and MachineBasicBlock::bundle_iterator. As a workaround, add an explicit way to get the MachineInstr. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221017 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 23:19:46 +00:00
Robert Khasanov	7d18d46ef2	[AVX512] Added VBROADCAST{SS/SD} encoding for VL subset. Refactored through AVX512_maskable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220908 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-30 14:21:47 +00:00
Robert Khasanov	63c2f3292e	[AVX512] Implemented AVX512VL FP bnary packed instructions (VADDP, VSUBP, VMULP, VDIVP, VMAXP, VMINP) Refactored through AVX512_maskable Added encoding tests for them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220858 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-29 15:43:02 +00:00
Robert Khasanov	e1610162fb	[AVX512] Fix VSQRT packed instructions internal names. No functional change git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220808 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 18:22:41 +00:00
Robert Khasanov	9371efbcdb	[AVX512] Extended avx512_sqrt_packed (sqrt instructions) to VL subset. Refactored through AVX512_maskable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220806 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 18:15:20 +00:00
Robert Khasanov	59cb03d329	[AVX-512] Expanded rsqrt/rcp instructions to VL subset. Refactored multiclass through AVX512_maskable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220783 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 16:37:13 +00:00
Robert Khasanov	d4345dd85f	[AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT tranforms to AND in PerformSELECTCombine. No functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220779 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 16:17:14 +00:00
Robert Khasanov	edf556ec1f	[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros. This transformation worked if selector is produced by SETCC, however SETCC is needed only if we consider to swap operands. So I replaced SETCC check for this case. Added tests for vselect of <X x i1> values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220777 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 15:59:40 +00:00
Robert Khasanov	4a52493457	[AVX512] Bring back vector-shuffle lowering support through broadcasts Ffter commit at rev219046 512-bit broadcasts lowering become non-optimal. Most of tests on broadcasting and embedded broadcasting were changed and they doesn’t produce efficient code. Example below is from commit changes (it’s the first test from test/CodeGen/X86/avx512-vbroadcast.ll): define <16 x i32> @_inreg16xi32(i32 %a) { ; CHECK-LABEL: _inreg16xi32: ; CHECK: ## BB#0: -; CHECK-NEXT: vpbroadcastd %edi, %zmm0 +; CHECK-NEXT: vmovd %edi, %xmm0 +; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0 +; CHECK-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0 ; CHECK-NEXT: retq %b = insertelement <16 x i32> undef, i32 %a, i32 0 %c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer ret <16 x i32> %c } Here, 256-bit broadcast was generated instead of 512-bit one. In this patch 1) I added vector-shuffle lowering through broadcasts 2) Removed asserts and branches likes because this is incorrect - assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI"); 3) Fixed lowering tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220774 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 12:28:51 +00:00
Reid Kleckner	d5de327da0	X86: Implement the vectorcall calling convention This is a Microsoft calling convention that supports both x86 and x86_64 subtargets. It passes vector and floating point arguments in XMM0-XMM5, and passes them indirectly once they are consumed. Homogenous vector aggregates of up to four elements can be passed in sequential vector registers, but this part is not implemented in LLVM and will be handled in Clang. On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as integer register parameters and is callee cleanup. On x86_64, it delegates to the normal win64 calling convention. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D5943 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220745 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 01:29:26 +00:00
Adam Nemet	6bc8d95153	[AVX512] Add vpermil variable version This is implemented via a multiclass that derives from the vperm imm multiclass. Fixes <rdar://problem/18426089> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220737 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 23:08:40 +00:00
Adam Nemet	5c76721372	[AVX512] Clean up avx512_perm_imm to use X86VectorVTInfo No functionality change. No change in X86.td.expanded except that we only set the CD8 attributes for the memory variants. (This shouldn't be used unless we have a memory operand.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220736 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 23:08:37 +00:00
Adam Nemet	7ba4de2ccc	[AVX512] Derive vpermil* from avx512_perm_imm This used to derive from avx512_pshuf_imm which is confusing. NFC. Compared X86.td.expanded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220735 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 23:08:34 +00:00
Adam Nemet	d0ee9ada16	[AVX512] Fix copy-and-paste bugs in vpermil 1) i512mem -> f512mem (this is the packed FP input being permuted) 2) element size is 64 bits in EVEX_CD8 for PD. (A good illustration why X86VectorVTInfo is useful) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220734 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 23:08:31 +00:00
Pete Cooper	68aeef61f4	Fix a stackmap bug introduced in r220710. For a call to not return in to the stackmap shadow, the shadow must end with the call. To do this, we must insert any required nops before the call, and not after it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220728 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 22:38:45 +00:00
Pete Cooper	7476f9c513	Stackmap shadows should consider call returns a branch target. To avoid emitting too many nops, a stackmap shadow can include emitted instructions in the shadow, but these must not include branch targets. A return from a call should count as a branch target as patching over the instructions after the call would lead to incorrect behaviour for threads currently making that call, when they return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220710 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 19:40:35 +00:00
Yuri Gorshenin	75bb472c06	[asan-asm-instrumentation] Added comment describing how asm instrumentation works. Summary: [asan-asm-instrumentation] Added comment describing how asm instrumentation works. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5970 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220670 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 08:38:54 +00:00
Elena Demikhovsky	9e19cf1ffd	AVX-512: Fixed encoding of VPBROADCASTM and added SKX forms of this instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220638 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-26 09:52:24 +00:00
Simon Pilgrim	c31aaa5a3f	[X86][SSE] Vector integer/float conversion memory folding Tidied up some entries in the folding tables so that they are under the correct comment section (they were categorised as AVX2 instructions when they're AVX1). Minor patch agreed with qcolombet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220613 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 08:11:20 +00:00
Kevin Enderby	44ccedc273	Fix a Mach-O assembler segfault for a subtraction expression with an undefined symbol. In a Mach-O object file a relocatable expression of the form SymbolA - SymbolB + constant is allowed when both symbols are defined in a section. But when either symbol is undefined it is an error. The code was crashing when it had an undefined symbol in this case. And should have printed a error message using the location information in the relocation entry. rdar://18678402 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220599 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 22:39:40 +00:00
Simon Pilgrim	44efa200e2	[X86][SSE] Bitcast assertion in XFormVExtractWithShuffleIntoLoad Minor patch to fix an issue in XFormVExtractWithShuffleIntoLoad where a load is unary shuffled, then bitcast (to a type with the same number of elements) before extracting an element. An undef was created for the second shuffle operand using the original (post-bitcasted) vector type instead of the pre-bitcasted type like the rest of the shuffle node - this was then causing an assertion on the different types later on inside SelectionDAG::getVectorShuffle. Differential Revision: http://reviews.llvm.org/D5917 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220592 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 21:04:41 +00:00
Sanjay Patel	51fa1bcef3	Allow AVX vrsqrtps generation. This is a follow-on to r220570 that allows a 256-bit (v8f32) version of vrsqrtps to be generated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220579 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 17:59:18 +00:00
Sanjay Patel	a46f06efe2	Use rsqrt (X86) to speed up reciprocal square root calcs This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate().. See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220570 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 17:02:16 +00:00
Adam Nemet	7f7bf0da6a	[AVX512] FMA support for the 231 variants This is asm/diasm-only support, similar to AVX. For ISeling the register variant, they are no different from 213 other than whether the multiplication or the addition operand is destructed. For ISeling the memory variant, i.e. to fold a load, they are no different than the 132 variant. The addition operand (op3) in both cases can come from memory. Again the ony difference is which operand is destructed. There could be a post-RA pass that would convert a 213 or 132 into a 231. Part of <rdar://problem/17082571> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220540 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 00:03:00 +00:00
Adam Nemet	97a3cd4383	[AVX512] Introduce fma3p_forms from AVX This multiclass generates the different forms: 213, 231, 132 in AVX. 132 in AVX512 is a separate class but I am planning to use this same multiclass to generate 231 relying on the nice the null_frag trick from AVX to disable codegen pattern for 231. No functionality change, no change in X86.td.expanded except for the different instruction definition names. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220539 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 00:02:55 +00:00
Ahmed Bougacha	4d962b05ca	[X86] Improve mul w/ overflow codegen, to MUL8+SETO. Currently, @llvm.smul.with.overflow.i8 expands to 9 instructions, where 3 are really needed. This adds X86ISD::UMUL8/SMUL8 SD nodes, and custom lowers them to MUL8/IMUL8 + SETO. i8 is a special case because there is no two/three operand variants of (I)MUL8, so the first operand and return value need to go in AL/AX. Also, we can't write patterns for these instructions: TableGen refuses patterns where output operands don't match SDNode results. In this case, instructions where the output operand is an implicitly defined register. A related special case (and FIXME) exists for MUL8 (X86InstrArith.td): // FIXME: Used for 8-bit mul, ignore result upper 8 bits. // This probably ought to be moved to a def : Pat<> if the // syntax can be accepted. [(set AL, (mul AL, GR8:$src)), (implicit EFLAGS)] Ideally, these go away with UMUL8, but we still need to improve TableGen support of implicit operands in patterns. Before this change: movsbl %sil, %eax movsbl %dil, %ecx imull %eax, %ecx movb %cl, %al sarb $7, %al movzbl %al, %eax movzbl %ch, %esi cmpl %eax, %esi setne %al After: movb %dil, %al imulb %sil seto %al Also, remove a made-redundant testcase for PR19858, and enable more FastISel ALU-overflow tests for SelectionDAG too. Differential Revision: http://reviews.llvm.org/D5809 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220516 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-23 21:55:31 +00:00
Matt Arsenault	015776f38c	Add minnum / maxnum codegen git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220342 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 23:01:01 +00:00
NAKAMURA Takumi	c65b578e25	X86AsmInstrumentation.cpp: Dissolve initializer-ranged-for. MSC17 disliked it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220301 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 16:22:52 +00:00
Yuri Gorshenin	7ffc5bb51a	[asan-asm-instrumentation] Fixed memory accesses with rbp as a base or an index register. Summary: Fixed memory accesses with rbp as a base or an index register. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5819 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220283 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 10:22:27 +00:00
Rafael Espindola	45968c54e9	Fix a bit of confusion about .set and produce more readable assembly. Every target we support has support for assembly that looks like a = b - c .long a What is special about MachO is that the above combination suppresses the production of a relocation. With this change we avoid producing the intermediary labels when they don't add any value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220256 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-21 01:17:30 +00:00
Quentin Colombet	a37862e2de	[X86] Fix a bug in the lowering of the mask of VSELECT. X86 code to lower VSELECT messed a bit with the bits set in the mask of VSELECT when it knows it can be lowered into BLEND. Indeed, only the high bits need to be set for those and it optimizes those accordingly. However, when the mask is a compile time constant, the lowering will be handled by the generic optimizer and those modifications will generate bad code in the generic optimizer. This patch fixes that by preventing the optimization if the VSELECT will be handled by the generic optimizer. <rdar://problem/18675020> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220242 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 23:13:30 +00:00
Simon Pilgrim	0d1978b813	[X86] Memory folding for commutative instructions (updated) This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning. Updated version of r219584 (reverted in r219595) - the commutation attempt now explicitly ensures that neither of the commuted source operands are tied to the destination operand / register, which was the source of all the regressions that occurred with the original patch attempt. Added additional regression test case provided by Joerg Sonnenberger. Differential Revision: http://reviews.llvm.org/D5818 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220239 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-20 22:14:22 +00:00
Andrea Di Biagio	5512b50db5	[X86] Fix missed selection of non-temporal store of zero vector. When the input to a store instruction was a zero vector, the backend always selected a normal vector store regardless of the non-temporal hint. This is fixed by this patch. This fixes PR19370. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220054 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:27:06 +00:00
Adam Nemet	fb9d61a8d6	[AVX512] Add DQ subvector inserts In AVX512f we support 64x2 and 32x8 inserts via matching them to 32x4 and 64x4 respectively. These are matched by "Alt" Pat<>'s (Alt stands for alternative VTs). Since DQ has native support for these intructions, I peeled off the non-"Alt" part of the baseclass into vinsert_for_size_no_alt. The DQ instructions are derived from this multiclass. The "Alt" Pat<>'s are disabled with DQ. Fixes <rdar://problem/18426089> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219874 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:17 +00:00
Adam Nemet	ccebe7258e	[AVX512] Two new attributes in X86VectorVTInfo for subvector insert The new attributes are NumElts and the CD8TupleForm. This prepares the code to enable x8 and x2 inserts. NFC, no change in X86.td.expanded except for the new attributes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219871 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:09 +00:00
Adam Nemet	80b9e006aa	[AVX512] Rename arg from Opcode32/64 to Opcode128/256 in vinsert_for_size It's the W bit that selects between 32 or 64 elt type and not the opcode. The opcode selects between the width of the insert (128 or 256). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219870 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:04 +00:00
Rafael Espindola	90ce9f70e2	Simplify handling of --noexecstack by using getNonexecutableStackSection. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219799 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 16:12:52 +00:00
Rafael Espindola	b510f8d08c	Move getNonexecutableStackSection up to the base ELF class. The .note.GNU-stack section is not SystemZ/X86 specific. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219796 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 15:44:16 +00:00
Simon Pilgrim	84a3feea38	[X86][SSE] pslldq/psrldq shuffle mask decodes Patch to provide shuffle decodes and asm comments for the sse pslldq/psrldq SSE2/AVX2 byte shift instructions. Differential Revision: http://reviews.llvm.org/D5598 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219738 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-14 22:31:34 +00:00
Hans Wennborg	76806748d4	[x86 asm] allow fwait alias in both At&t and Intel modes (PR21208) Differential Revision: http://reviews.llvm.org/D5741 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219725 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-14 21:41:17 +00:00
Robert Khasanov	ad5d223cb5	[AVX512] Extended avx512_binop_rm to DQ/VL subsets. Added encoding tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219686 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-14 15:13:56 +00:00

1 2 3 4 5 ...

10881 Commits