Commit Graph

2578 Commits

Reid Kleckner
bc1fd917f0 Move the segmented stack switch to a function attribute
This removes the -segmented-stacks command line flag in favor of a
per-function "split-stack" attribute.
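
As an illustration (hypothetical IR, not from the patch), a function
now opts in like this:

    define void @foo() #0 {
      ret void
    }

    ; The per-function attribute replaces the global -segmented-stacks flag.
    attributes #0 = { "split-stack" }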

Patch by Luqman Aden and Alex Crichton!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205997 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-10 22:58:43 +00:00
Elena Demikhovsky
0d5d656524 AVX-512: insert element to mask vector; store i1 data
Implemented the INSERT_VECTOR_ELT operation for v16i1 and v8i1 vectors;
implemented "store" for the i1 type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205850 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-09 12:37:50 +00:00
Elena Demikhovsky
dbcb670605 AVX-512: Added fp_to_uint and uint_to_fp patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205754 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-08 07:24:02 +00:00
Matt Arsenault
a134cec06b Add DAG parameter to ComputeNumSignBitsForTargetNode
This way, you can check the number of sign bits in the
operands. The depth parameter it already has is pretty useless
without this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205649 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-04 20:13:13 +00:00
Craig Topper
84f7f350c3 Make consistent use of MCPhysReg instead of uint16_t throughout the tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205610 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-04 05:16:06 +00:00
Yaron Keren
9ee14e3522 Added isTargetWindowsMSVC(), renamed isTargetMingw() to isTargetWindowsGNU()
and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the
four Windows environments in Triple.h.

Suggestion by Saleem Abdulrasool!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205393 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 04:27:51 +00:00
Yaron Keren
f2dc47ce99 If isKnownWindowsMSVCEnvironment() holds, then getOS() == Triple::Win32 and
Environment == Triple::MSVC, so the target can never be MinGW or Cygwin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 18:52:55 +00:00
Yaron Keren
bec4ff5781 isTargetWindows() renamed to isTargetKnownWindowsMSVC()
to reflect its current functionality.

Based on a suggestion by Takumi NAKAMURA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205338 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 18:15:34 +00:00
Aaron Ballman
77d76519dc Attempting to fix r205124, which hit assertion failures when built with MSVC.
Suggestion from Yaron Keren.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205313 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 13:56:35 +00:00
Alexey Volkov
1d75829ddc [x86] Do not convert to cmp32 for Atom arch by Sergey Okunev
Differential Revision: http://llvm-reviews.chandlerc.com/D2824

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205288 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 08:13:07 +00:00
Rafael Espindola
f165cf7ce8 Prevent alias from pointing to weak aliases.
This adds back r204781.

Original message:

Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given

define void @my_func() {
  ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias

We produce without this patch:

        .weak   my_alias
my_alias = my_func
        .globl  my_alias2
my_alias2 = my_alias

That is, in the resulting ELF file my_alias, my_func and my_alias2 are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a

@my_alias = alias void ()* @other_func

would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.

There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-27 15:26:56 +00:00
Rafael Espindola
72db10a995 Revert "Prevent alias from pointing to weak aliases."
This reverts commit r204781.

I will follow up with the msan folks to see what they were trying to
do with aliases to weak aliases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-26 06:14:40 +00:00
Rafael Espindola
33845aa8c4 Prevent alias from pointing to weak aliases.
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given

define void @my_func() {
  ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias

We produce without this patch:

        .weak   my_alias
my_alias = my_func
        .globl  my_alias2
my_alias2 = my_alias

That is, in the resulting ELF file my_alias, my_func and my_alias2 are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a

@my_alias = alias void ()* @other_func

would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.

There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204781 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-26 04:48:47 +00:00
Adam Nemet
6f4f46cf11 [X86] Generate VPSHUFB for in-place v16i16 shuffles
This used to resort to splitting the 256-bit operation into two 128-bit
shuffles and then recombining the results.

Fixes <rdar://problem/16167303>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204735 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-25 17:47:06 +00:00
Adam Nemet
9526911809 [X86] Factor out new helper getPSHUFB
I found three implementations of this.  This splits it out into a new function
and uses it from the three places.

My plan is to add a fourth use when lowering a vector_shuffle:v16i16.

Compared the assembly output of test/CodeGen/X86 before and after.

The only change is due to how the first PSHUFB was generated in
LowerVECTOR_SHUFFLEv8i16.  If the shuffle mask specified undef (i.e. -1), the
old implementation would write -1 * 2 and -1 * 2 + 1 (254 and 255) in the
control mask.  Now we write 0x80.  These are of course interchangeable since
bit 7 decides if a constant zero is written in the result byte.  The other
instances of this code use 0x80 consistently.

Related to <rdar://problem/16167303>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-25 17:47:03 +00:00
Adam Nemet
a1b54dd1ff [X86] Fix non-determinism in LowerVectorAllZeroTest
This can be observed with the old testcase of CodeGen/X86/pr12312.ll:

47c47
<       vorps   %ymm0, %ymm1, %ymm0
---
>       vorps   %ymm1, %ymm0, %ymm0
97c97
<       vorps   %ymm1, %ymm0, %ymm0
---
>       vorps   %ymm0, %ymm1, %ymm0

The vector VecIns is populated with all the values from VecInMap. This is done
while iterating VecInMap.  VecInMap uses a hash of pointer values so the
resulting order can vary depending on the memory layout.

The fix is to populate the vector VecIns earlier as VecInMap is populated.
This is done in DAG traversal order.

Fixes <rdar://problem/16398806>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204623 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-24 16:52:08 +00:00
Craig Topper
59ae7294ef Prune includes in X86 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204216 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-19 06:53:25 +00:00
Adam Nemet
131ab020c3 [X86] Fix unused variable warning with NDEBUG from r204058
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204063 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-17 17:32:53 +00:00
Adam Nemet
8c8fe42a0d [VectorLegalizer/X86] Don't unvectorize fp_to_uint for v8f32->v8i16
Rather than LegalizeAction::Expand, this needs LegalizeAction::Promote to get
promoted to fp_to_sint v8f32->v8i32.  This is a legal operation on AVX.

For that to work properly, we also need to teach the legalizer about the
specific promotion required here.  The default vector promotion uses
bitcasting to a vector type of the same total size.  We want to promote the
vector element type, effectively widening the operation and then truncating
the result.  This is analogous to the current logic of how int_to_fp is
promoted.

The change also factors out some code from the int_to_fp promotion code to
ValueType::widenIntegerVectorElementType.  This is now shared between
int_to_fp and fp_to_int.

There is no longer a need for the custom lowering of fp_to_sint v8f32->v8i16 in
X86.  It can now go through the new target-independent fp_to_*int promotion
logic.

I also checked that no other target uses Promote for these ops yet, so there
shouldn't be any unexpected change in behavior.
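
As a sketch (ignoring the unsigned-range caveats the legalizer takes
care of), the v8f32->v8i16 case now conceptually becomes:

    ; before: fptoui <8 x float> -> <8 x i16> was scalarized
    ; after: promote the element type, then truncate the result
    %wide   = fptosi <8 x float> %x to <8 x i32>
    %narrow = trunc <8 x i32> %wide to <8 x i16>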

Fixes <rdar://problem/16202247>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204058 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-17 17:06:14 +00:00
Arnaud A. de Grandmaison
3c143dde40 Remove some dead assignments found by scan-build
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204013 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-15 22:13:15 +00:00
Owen Anderson
bf63022492 Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing
operator* on the by-operand iterators to return a MachineOperand& rather than
a MachineInstr&.  At this point they almost behave like normal iterators!

Again, this requires making some existing loops more verbose, but should pave
the way for the big range-based for-loop cleanups in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203865 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-13 23:12:04 +00:00
Hans Wennborg
1332459dbb X86: Don't generate 64-bit movd after cmpneqsd in 32-bit mode (PR19059)
This fixes the bug where we would bitcast the 64-bit floating point result
of cmpneqsd to a 64-bit integer even on 32-bit targets.

Differential Revision: http://llvm-reviews.chandlerc.com/D3009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203581 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-11 15:49:24 +00:00
Tim Northover
ca396e391e IR: add a second ordering operand to cmpxchg for failure
The syntax for "cmpxchg" should now look something like:

	cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic

where the second ordering argument gives the required semantics in the case
that no exchange takes place. It should be no stronger than the first ordering
constraint and cannot be either "release" or "acq_rel" (since no store will
have taken place).

rdar://problem/15996804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203559 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-11 10:48:52 +00:00
Jim Grosbach
7a37166a7a X86: Enable ISel of 16-bit MOVBE instructions.
When the MOVBE instructions are available, use them for 16-bit endian
swapping as well as for 32 and 64 bit.

The patterns were already present on the instructions, but weren't being
matched because the operation was unconditionally marked as 'Expand'.
Change that to be conditional on whether the MOVBE instructions are
available. Use 'rolw' to implement the in-register version (32 and 64
bit have the dedicated 'bswap' instruction for that).
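
Illustrative IR that can now select MOVBE (a byte-swapping store is
shown here; the in-register form uses 'rolw'):

    declare i16 @llvm.bswap.i16(i16)

    define void @swap_store(i16 %x, i16* %p) {
      %r = call i16 @llvm.bswap.i16(i16 %x)
      store i16 %r, i16* %p
      ret void
    }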

Patch by Louis Gerbarg <lgg@apple.com>.

rdar://15479984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203524 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-11 00:44:14 +00:00
Cameron McInally
f3ff7c32f7 Lower AVX v4i64->v4i32 truncate to one shuffle.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202996 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-05 19:41:16 +00:00
Chandler Carruth
4bbfbdf7d7 [Modules] Move CallSite into the IR library where it belongs. It is
abstracting between a CallInst and an InvokeInst, both of which are IR
concepts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202816 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-04 11:01:28 +00:00
Benjamin Kramer
d628f19f5d [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202636 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-02 12:27:27 +00:00
Elena Demikhovsky
a9fe27ffb3 AVX-512: Fixed extract_vector_elt for v8i1 vector
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202624 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-02 09:19:44 +00:00
Quentin Colombet
685b0d9315 Lower unsigned vsetcc to psubus in certain cases
The current approach to lower a vsetult is to flip the sign bit of the
operands, swap the operands and then use a (signed) pcmpgt.  psubus (unsigned
saturating subtract) can be used to emulate a vsetult more efficiently:

+    case ISD::SETULT: {
+      // If the comparison is against a constant we can turn this into a
+      // setule.  With psubus, setule does not require a swap.  This is
+      // beneficial because the constant in the register is no longer
+      // destructed as the destination so it can be hoisted out of a loop.

I also enable lowering via psubus in a few other cases where it's clearly
beneficial: setule and setuge if minu/maxu cannot be used.
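
A hypothetical input that can now take the psubus path:

    ; ult against a constant can be rewritten as ule, which psubus
    ; implements without swapping the operands
    %cmp = icmp ult <8 x i16> %x, <i16 42, i16 42, i16 42, i16 42,
                                   i16 42, i16 42, i16 42, i16 42>
    %res = sext <8 x i1> %cmp to <8 x i16>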
    
rdar://problem/14338765

Patch by Adam Nemet <anemet@apple.com>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202301 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 21:39:12 +00:00
Tim Northover
44697f3fc1 X86 CodeGenPrep: sink shufflevectors before shifts
On x86, shifting a vector by a scalar is significantly cheaper than shifting a
vector by another fully general vector. Unfortunately, because SelectionDAG
operates on just one basic block at a time, the shufflevector instruction that
reveals whether the right-hand side of a shift *is* really a scalar is often
not visible to CodeGen when it's needed.

This adds another handler to CodeGenPrepare, to sink any useful shufflevector
instructions down to the basic block where they're used, predicated on a target
hook (since on other architectures, doing so will often just introduce extra
real work).
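
For instance (illustrative IR), the splat below can be sunk into the
block containing the shift, so ISel sees that the shift amount is
really a scalar:

    define <4 x i32> @f(<4 x i32> %v, i32 %amt, i1 %c) {
    entry:
      %ins   = insertelement <4 x i32> undef, i32 %amt, i32 0
      %splat = shufflevector <4 x i32> %ins, <4 x i32> undef,
                             <4 x i32> zeroinitializer
      br i1 %c, label %shift, label %done

    shift:                                    ; %splat is sunk here
      %sh = shl <4 x i32> %v, %splat
      ret <4 x i32> %sh

    done:
      ret <4 x i32> %v
    }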

rdar://problem/16063505

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201655 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-19 10:02:43 +00:00
Tim Northover
d729dfc96e X86: use vpsllvd (& friends) for 16-bit shifts on Haswell
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201558 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-18 11:15:32 +00:00
Elena Demikhovsky
f280c65b32 AVX-512: simplified BUILD_VECTOR for masks; fixed cmp/test sequence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201487 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-16 11:34:23 +00:00
Andrea Di Biagio
8887371782 [X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it.
Instead of expanding a packed shift into a sequence of scalar shifts,
the backend now tries (when possible) to convert the vector shift into a
vector multiply.

Before this change, a shift of a MVT::v8i16 vector by a
build_vector of constants was always scalarized into a long sequence of "vector
extracts + scalar shifts + vector insert".
With this change, if there is SSE2 support, we emit a single vector multiply.

This change also affects SSE4.1, AVX, AVX2 shifts:
 - A shift of an MVT::v4i32 vector by a build_vector of non-uniform constants
is now lowered when possible into a single SSE4.1 vector multiply.
 - A packed v16i16 shift left by a constant build_vector is now lowered when
possible into a single AVX2 vpmullw.
This change also improves the lowering of AVX512f vector shifts.

Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected
by this change.
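
For example (a sketch, not the committed test), a v8i16 shift by a
constant build_vector:

    ; shl by per-element constants ...
    %r = shl <8 x i16> %x, <i16 1, i16 2, i16 3, i16 4,
                            i16 1, i16 2, i16 3, i16 4>
    ; ... can be emitted with SSE2 as a single pmullw by
    ; <2, 4, 8, 16, 2, 4, 8, 16>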

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201271 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-12 23:42:28 +00:00
Elena Demikhovsky
e9d5f6e387 AVX: fixed a bug in LowerVECTOR_SHUFFLE
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201140 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-11 10:21:53 +00:00
Elena Demikhovsky
e4092e9895 AVX-512: Optimized BUILD_VECTOR pattern;
fixed encoding of VEXTRACTPS instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201134 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-11 07:25:59 +00:00
Elena Demikhovsky
27ef6eec41 AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201066 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-10 07:02:39 +00:00
Tim Northover
c0fc62c2f9 X86: deduplicate V[SZ]EXT_MOVL and V[SZ]EXT nodes
I believe VZEXT_MOVL means "zero all vector elements except the first" (and
should have identical input & output types) whereas VZEXT means "zero extend
each element of a vector (discarding higher elements if necessary)".

For example:
    (v4i32 (vzext (v16i8 ...)))

should zero extend the low 4 bytes of the incoming vector to 32-bits,
discarding higher bytes.

However, somewhere in the past, these two concepts had become confused, even
leading to a nonsensical VSEXT_MOVL.

This re-merges the nodes where appropriate (all VSEXT_MOVL -> VSEXT, VZEXT_MOVL
-> VZEXT when it's an actual extension).

rdar://problem/15981990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200918 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-06 09:54:51 +00:00
Matt Arsenault
bb7bf85f3c Add address space argument to allowsUnalignedMemoryAccess.
On R600, some address spaces have more strict alignment
requirements than others.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200887 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-05 23:15:53 +00:00
Elena Demikhovsky
c341b7c0ef AVX-512: optimized icmp -> sext -> icmp pattern
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200849 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-05 16:17:36 +00:00
Craig Topper
0e8eceffbf Move matching for x86 BMI BLSI/BLSMSK/BLSR instructions to isel patterns instead of DAG combine.
This weakens the ability to fold loads with them, because we aren't able
to match patterns that load the same thing twice. But maybe we should fix
that if we care. The peephole optimizer will be able to fold some loads
in its absence.
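
For reference, BLSI matches the usual lowest-set-bit idiom
(illustrative IR):

    ; blsi: isolate the lowest set bit, i.e. x & -x
    %neg  = sub i32 0, %x
    %blsi = and i32 %x, %neg
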
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200824 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-05 07:09:40 +00:00
Reid Kleckner
8a24e83550 Implement inalloca codegen for x86 with the new inalloca design
Calls with inalloca are lowered by skipping all stores for arguments
passed in memory and the initial stack adjustment to allocate argument
memory.

Now the frontend is responsible for the memory layout, and the backend
doesn't have to do any work.  As a result these changes are pretty
minimal.
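
A rough sketch of the IR shape involved (names illustrative; exact
syntax follows the inalloca design of the time):

    ; the frontend lays out the argument memory itself ...
    %argmem = alloca <{ i32 }>
    %p = getelementptr <{ i32 }>* %argmem, i32 0, i32 0
    store i32 42, i32* %p
    ; ... so call lowering skips the argument stores and the initial
    ; stack adjustment
    call void @f(<{ i32 }>* inalloca %argmem)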

Reviewers: echristo

Differential Revision: http://llvm-reviews.chandlerc.com/D2637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200596 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-31 23:50:57 +00:00
Reid Kleckner
4fa3492f97 x86: Rename NumBytesForCalleeToPush to ...Pop for accuracy
If we have a callee cleanup convention, the callee is going to pop the
arguments off the stack, not push them on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200566 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-31 19:07:18 +00:00
Andrea Di Biagio
106b79744b [X86] Add extra rules for combining vselect dag nodes into movsd.
This improves the fix committed at revision 199683 adding the
following new target specific combine rules:

1) fold (v4i32: vselect <0,0,-1,-1>, A, B) ->
        (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) ))

2) fold (v4f32: vselect <0,0,-1,-1>, A, B) ->
        (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) ))

3) fold (v4i32: vselect <-1,-1,0,0>, A, B) ->
        (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) ))

4) fold (v4f32: vselect <-1,-1,0,0>, A, B) ->
        (v4f32 (bitcast (movsd (v2f64 (bitcast B)), (v2f64 (bitcast A))) ))
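
In IR terms, a select of this shape (illustrative) now becomes a
single movsd:

    ; keeps the low 64 bits of one operand and the high 64 bits of
    ; the other -- the movsd pattern (rule 1, mask <0,0,-1,-1>)
    %r = select <4 x i1> <i1 false, i1 false, i1 true, i1 true>,
                <4 x i32> %a, <4 x i32> %b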

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200324 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-28 18:14:21 +00:00
Juergen Ributzka
efbb39740c [TLI] Add a new hook to TargetLowering to query the target whether a load of a constant should be converted to simply the constant itself.
Before this patch we used getIntImmCost from TargetTransformInfo to determine if
a load of a constant should be converted to just a constant, but the threshold
for this was set to an arbitrary value. This value works well for the two
targets (X86 and ARM) that implement this target-hook, but it isn't
target-independent at all.

Now targets can decide directly whether this optimization should be
performed. The default value is set to false to preserve the current
behavior. The target hook has been moved to TargetLowering, which removed the
last use and need of TargetTransformInfo in SelectionDAG.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200271 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-28 01:20:14 +00:00
Juergen Ributzka
fe08a38a2c [X86] Prevent the creation of redundant ops for sadd and ssub with overflow.
This commit teaches the X86 backend to create the same X86 instructions when it
lowers an sadd/ssub with overflow intrinsic and a conditional branch that uses
that overflow result. This allows SelectionDAG to recognize and remove one of
the redundant operations.
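
The pattern in question looks like this in IR (illustrative):

    declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

    define i32 @checked_add(i32 %a, i32 %b) {
      %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue { i32, i1 } %res, 0
      %ovf = extractvalue { i32, i1 } %res, 1
      ; the branch can now reuse the flags set by the add
      br i1 %ovf, label %overflow, label %normal

    normal:
      ret i32 %sum

    overflow:
      ret i32 0
    }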

This fixes <rdar://problem/15874016> and <rdar://problem/15661073>.

Reviewed by Nadav

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199976 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-24 06:47:57 +00:00
Lang Hames
6f1f795717 Add a few missing cases from r199933. Testcase coming shortly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199938 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 21:27:27 +00:00
Lang Hames
d8f4348cab Replace vfmaddxx213 instructions with their 231-type equivalents in accumulator
loops. Writing back to the accumulator (231-type) allows the coalescer to
eliminate an extra copy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199933 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 20:23:36 +00:00
Elena Demikhovsky
e1a621d84f AVX-512: added VPERM2D, VPERM2Q, VPERM2PS and VPERM2PD instructions;
they give better sequences than VPERMI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199893 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 14:27:26 +00:00
Andrea Di Biagio
825b93b2df [X86] Teach how to combine a vselect into a movss/movsd
Add target specific rules for combining vselect dag nodes into movss/movsd
when possible.

If the vector type of the vselect dag node in input is either MVT::v4i32 or
MVT::v4f32, then try to fold according to rules:

  1) fold (vselect (build_vector (0, -1, -1, -1)), A, B) -> (movss A, B)
  2) fold (vselect (build_vector (-1, 0, 0, 0)), A, B) -> (movss B, A)

If the vector type of the vselect dag node in input is either MVT::v2i64 or
MVT::v2f64 (and we have SSE2), then try to fold according to rules:

  3) fold (vselect (build_vector (0, -1)), A, B) -> (movsd A, B)
  4) fold (vselect (build_vector (-1, 0)), A, B) -> (movsd B, A)
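
For instance, rule 1 corresponds to IR of this shape (illustrative):

    ; keeps lane 0 from one operand and lanes 1-3 from the other --
    ; the movss pattern (rule 1, mask <0,-1,-1,-1>)
    %r = select <4 x i1> <i1 false, i1 true, i1 true, i1 true>,
                <4 x float> %a, <4 x float> %b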

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199683 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 19:35:22 +00:00
Michael Gottesman
ee804f423d Move the retrieval of VT to after all of the early exits from PerformOrCombine that do not use VT. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199612 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 21:06:00 +00:00