llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-12-20 03:17:48 +00:00

Author	SHA1	Message	Date
Chandler Carruth	88897b7c05	[x86] Clean up some unused variables, especially in release builds. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211894 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 12:04:18 +00:00
Chandler Carruth	c5114dbcc3	[x86] Teach the target combine step to aggressively fold pshufd insturcions. Summary: This allows it to fold pshufd instructions across intervening half-shuffles and other noise. This pattern actually shows up in the generic lowering tests, but I've also added direct tests using intrinsics to make sure that the specific desired functionality is working even if the lowering stuff changes in the future. Differential Revision: http://reviews.llvm.org/D4292 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211892 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 11:40:13 +00:00
Chandler Carruth	4363b0729b	[x86] Teach the target-specific combining how to aggressively fold half-shuffles, even looking through intervening instructions in a chain. Summary: This doesn't happen to show up with any test cases I've found for the current shuffle lowering, but previous attempts would benefit from this and it seems generally useful. I've tested it directly using intrinsics, which also shows that it will work with hand vectorized code as well. Note that even though pshufd isn't directly used in these tests, it gets exercised because we combine some of the half shuffles into a pshufd first, and then merge them. Differential Revision: http://reviews.llvm.org/D4291 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211890 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 11:34:40 +00:00
Chandler Carruth	f91161874e	[x86] Teach the X86 backend to DAG-combine SSE2 shuffles that are trivially redundant. This fixes several cases in the new vector shuffle lowering algorithm which would generate redundant shuffle instructions for the sake of simplicity. I'm also deleting a testcase which was somewhat ridiculous. It was checking for a bug in 2007 about incorrectly transforming shuffles by looking for the string "-86" in the output of a pretty substantial function. This test case doesn't seem to have any value at this point. Differential Revision: http://reviews.llvm.org/D4240 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211889 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 11:27:52 +00:00
Chandler Carruth	050d187bc8	[x86] Begin a significant overhaul of how vector lowering is done in the x86 backend. This sketches out a new code path for vector lowering, hidden behind an off-by-default flag while it is under development. The fundamental idea behind the new code path is to aggressively break down the problem space in ways that ease selecting the odd set of instructions available on x86, and carefully avoid scalarizing code even when forced to use older ISAs. Notably, this starts off restricting itself to SSE2 and implements the complete vector shuffle and blend space for 128-bit vectors in SSE2 without scalarizing. The plan is to layer on top of this ISA extensions where we can bail out of the complex SSE2 lowering and opt for a cheaper, specialized instruction (or set of instructions). It also needs to be generalized to AVX and AVX512 vector widths. Currently, this does a decent but not perfect job for SSE2. There are some specific shortcomings that I plan to address: - We need a peephole combine to fold together shuffles where possible. There are cases where a previous shuffle could be modified slightly to arrange for elements to be in the correct position and a later shuffle eliminated. Doing this eagerly added quite a bit of complexity, and so my plan is to combine away these redundancies afterward. - There are a lot more clever ways to use unpck and pack that need to be added. This is essential for real world shuffles as it turns out... Once SSE2 is polished a bit I should be able to get interesting numbers on performance improvements on benchmarks conducive to vectorization. All of this will be off by default until it is functionally equivalent of course. Differential Revision: http://reviews.llvm.org/D4225 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211888 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 11:23:44 +00:00
Andrea Di Biagio	fce47fe80c	Silence a warning due to a comparison between signed and unsigned. No functional change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211782 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-26 13:41:10 +00:00
Andrea Di Biagio	817812f61c	[X86] Improve the selection of SSE3/AVX addsub instructions. This patch teaches the backend how to canonicalize a shuffle vectors according to the rule: - (shuffle (FADD A, B), (FSUB A, B), Mask) -> (shuffle (FSUB A, -B), (FADD A, -B), Mask) Where 'Mask' is: <0,5,2,7> ;; for v4f32 and v4f64 shuffles. <0,3> ;; for v2f64 shuffles. <0,9,2,11,4,13,6,15> ;; for v8f32 shuffles. In general, ISel only knows how to pattern-match a canonical 'fadd + fsub + blendi' dag node sequence into an ADDSUB instruction. This new rule allows to convert a non-canonical dag sequence into a canonical one that will be matched by a single ADDSUB at ISel stage. The idea of converting a non-canonical ADDSUB into a canonical one by swapping the first two operands of the shuffle, and then negating the second operand of the FADD and FSUB, was originally proposed by Hal Finkel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211771 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-26 10:45:21 +00:00
Andrea Di Biagio	cae1ea691d	[X86] Always prefer to lower a VECTOR_SHUFFLE into a BLENDI instead of SHUFP (or VPERM2X128). This patch teaches method 'LowerVECTOR_SHUFFLE' to give higher precedence to the check for 'isBlendMask'; the idea is that, when possible, we should firstly check if a shuffle performs a blend, and in case, try to lower it into a BLENDI instead of selecting a SHUFP or (worse) a VPERM2X128. In general: - AVX VBLENDPS/D always have better latency and throughput than VPERM2F128; - BLENDPS/D instructions tend to always have better 'reciprocal throughput' than the equivalent SHUFPS/D; - Both BLENDPS/D and SHUFPS/D are often decoded into the same number of m-ops; however, a m-op obtained from a BLENDPS/D can be scheduled to more than one execution port. This patch: - Moves the check for 'isBlendMask' immediately before the check for 'isSHUFPMask' within method 'LowerVECTOR_SHUFFLE'; - Updates existing tests for sse/avx shuffle/blend instructions to verify that we select (v)blendps/d when possible (instead of (v)shufps/d or vperm2f128). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211720 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-25 17:41:58 +00:00
Chandler Carruth	2edf5e45ec	[x86] Add intrinsics for the pshufd, pshuflw, and pshufhw instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211694 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-25 13:12:54 +00:00
NAKAMURA Takumi	b720a3d15c	Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter. -- This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211691 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-25 12:41:52 +00:00
Andrea Di Biagio	3e5582cc15	[X86] Add target combine rule to select ADDSUB instructions from a build_vector This patch teaches the backend how to combine a build_vector that implements an 'addsub' between packed float vectors into a sequence of vector add and vector sub followed by a VSELECT. The new VSELECT is expected to be lowered into a BLENDI. At ISel stage, the sequence 'vector add + vector sub + BLENDI' is pattern-matched against ISel patterns added at r211427 to select 'addsub' instructions. Added three more ISel patterns for ADDSUB. Added test sse3-avx-addsub-2.ll to verify that we correctly emit 'addsub' instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211679 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-25 10:02:21 +00:00
Robert Khasanov	031ad1b930	vpblend intrinsics combines as shifts intrinsics due to absence return stmt between them Fix PR20088 Differential Revision: http://reviews.llvm.org/D4277 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211617 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-24 18:08:04 +00:00
NAKAMURA Takumi	9124b45918	Revert r211399, "Generate native unwind info on Win64" It broke Legacy JIT Tests on x86_64-{mingw32\|msvc}, aka Windows x64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211480 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-22 22:00:56 +00:00
Filipe Cabecinhas	7798d5992a	Fix PR20087 by using the source index when changing the vector load git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211472 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-22 17:21:37 +00:00
Reid Kleckner	5b8e73ef81	Generate native unwind info on Win64 This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211399 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-20 20:35:47 +00:00
Chandler Carruth	c577e71bf5	[x86] Make the x86 PACKSSWB, PACKSSDW, PACKUSWB, and PACKUSDW instructions available as synthetic SDNodes PACKSS and PACKUS that will select to the correct instruction variants based on the return type. This allows us to use these rather important instructions when lowering vector shuffles. Also moves the relevant instruction definitions to be split out from the fully generic multiclasses to allow them to match these new SDNodes in the same way that the UNPCK instructions do. No functionality should actually be changed here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211332 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-20 01:05:28 +00:00
Andrea Di Biagio	cfdf805286	[X86] Teach how to combine horizontal binop even in the presence of undefs. Before this change, the backend was unable to fold a build_vector dag node with UNDEF operands into a single horizontal add/sub. This patch teaches how to combine a build_vector with UNDEF operands into a horizontal add/sub when possible. The algorithm conservatively avoids to combine a build_vector with only a single non-UNDEF operand. Added test haddsub-undef.ll to verify that we correctly fold horizontal binop even in the presence of UNDEFs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211265 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-19 10:29:41 +00:00
Cameron McInally	5d57928a32	Hook up vector int_ctlz for AVX512. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211024 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-16 14:12:28 +00:00
Tim Northover	eee7a7a836	X86: lower ATOMIC_CMP_SWAP_WITH_SUCCESS directly Lowering this new node allows us to fold the almost universal comparison for success before it's even formed. Instead we can create a copy from EFLAGS and an X86ISD::SETCC operation since all "cmpxchg" instructions set the zero-flag to the correct value. rdar://problem/13201607 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210923 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-13 17:29:39 +00:00
Tim Northover	8f2a85e099	IR: add "cmpxchg weak" variant to support permitted failure. This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity all cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210903 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-13 14:24:07 +00:00
Andrea Di Biagio	371446a7b8	[X86] Teach how to dump the name of target node RDTSCP_DAG. When I originally added node RDTSCP_DAG (r207127) I forgot to add a string name for it in method 'getTargetNodeName'. No functional change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210769 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-12 11:37:24 +00:00
Andrea Di Biagio	bf4e625cf1	[X86] Teach how to combine AVX and AVX2 horizontal binop on packed 256-bit vectors. This patch adds target combine rules to match: - [AVX] Horizontal add/sub of packed single/double precision floating point values from 256-bit vectors; - [AVX2] Horizontal add/sub of packed integer values from 256-bit vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210761 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-12 10:53:48 +00:00
Tim Northover	55c8ec6588	X86: add stringy name for X86ISD::LCMPXCHG16_DAG I don't know what "target specific node #383" is, and I don't want to have to. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210663 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-11 17:04:08 +00:00
Andrea Di Biagio	a069e64112	[X86] Refactor the logic to select horizontal adds/subs to a helper function. This patch moves part of the logic implemented by the target specific combine rules added at r210477 to a separate helper function. This should make easier to add more rules for matching AVX/AVX2 horizontal adds/subs. This patch also fixes a problem caused by a wrong check performed on indices of extract_vector_elt dag nodes in input to the scalar adds/subs. New tests have been added to verify that we correctly check indices of extract_vector_elt dag nodes when selecting a horizontal operation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210644 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-11 07:57:50 +00:00
Eric Christopher	f85ae2a8c2	Use the TargetMachine on the DAG or the MachineFunction instead of using the cached TargetMachine. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210589 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-10 21:25:13 +00:00
Eric Christopher	a1c71aa78d	Add a FIXME. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210559 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-10 18:31:18 +00:00
Andrea Di Biagio	c0edcf7de8	[X86] Improved target combine rules for selecting horizontal add/sub. This patch slightly changes the algorithm introduced at revision 210477 to fix a problem where the algorithm was producing incorrect code for the VEX.256 encoded versions of horizontal add/sub. For these cases, we now try to split the two 256-bit vectors into 128-bit chunks before emitting horizontal add/sub dag nodes. Added a new test case into haddsub-2.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210545 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-10 16:42:57 +00:00
Tom Stellard	102d0f3e3f	SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CC The SelectionDAG bad a special case for ISD::SELECT_CC, where it would allow targets to specify: setOperationAction(ISD::SELECT_CC, MVT::Other, Expand); to indicate that they wanted to expand ISD::SELECT_CC for all types. This wasn't applied correctly everywhere, and it makes writing new DAG patterns with ISD::SELECT_CC difficult. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210541 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-10 16:01:29 +00:00
Tim Northover	efbf7d1ceb	Revert "X86: elide comparisons after cmpxchg instructions." This reverts commit r210523. It was committed prematurely without waiting for review. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210524 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-10 10:50:11 +00:00
Tim Northover	984ee65445	X86: elide comparisons after cmpxchg instructions. The C++ and C semantics of the compare_and_swap operations actually require us to return a boolean "success" value. In LLVM terms this means a second comparison of the output of "cmpxchg" against the input desired value. However, x86's "cmpxchg" instruction sets all flags for the comparison formed, so we can skip any secondary comparison. (N.b. this isn't true for cmpxchg8b/16b, which only set ZF). rdar://problem/13201607 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210523 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-10 10:49:07 +00:00
Eric Christopher	fdea941583	Move all of the x86 subtarget initialized variables down into the x86 subtarget from the x86 target machine. Should be no functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210479 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-09 17:08:19 +00:00
Andrea Di Biagio	9b6992ddc2	[X86] Add target combine rules for horizontal add/sub. This patch adds new target specific combine rules to identify horizontal add/sub idioms from BUILD_VECTOR dag nodes. This patch also teaches the DAGCombiner how to canonicalize sequences of insert_vector_elt dag nodes according to the following rule: (insert_vector_elt (insert_vector_elt A, I0), I1) -> (insert_vecto_elt (insert_vector_elt A, I1), I0) This new canonicalization rule only triggers if the inner insert_vector dag node has exactly one use; also, both indices must be known constants, and I1 < I0. This last rule made it possible to write a simpler algorithm to identify horizontal add/sub patterns because now we don't have to worry about the ordering of insert_vector_elt dag nodes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210477 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-09 16:54:41 +00:00
Andrea Di Biagio	592d439efe	[X86] Avoid emitting unnecessary test instructions. This patch teaches the backend how to check for the 'NoSignedWrap' flag on binary operations to improve the emission of 'test' instructions. If the result of a binary operation is known not to overflow we know that resetting the Overflow flag is unnecessary and so we can avoid emitting the test instruction. Patch by Marcello Maggioni. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210468 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-09 12:34:50 +00:00
Alexey Volkov	a2bc6951a0	[X86] Use ADD/SUB instead of INC/DEC for Silvermont According to Intel Software Optimization Manual on Silvermont INC or DEC instructions require an additional uop to merge the flags. As a result, a branch instruction depending on an INC or a DEC instruction incurs a 1 cycle penalty. Differential Revision: http://reviews.llvm.org/D3990 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210466 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-09 11:40:41 +00:00
Craig Topper	b177041dfa	[C++11] Use 'nullptr'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210442 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-08 22:29:17 +00:00
Benjamin Kramer	0da5960e5b	X86: Don't turn shifts into ands if there's another use that may not check for equality. Fixes PR19964. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210371 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-06 21:08:55 +00:00
Filipe Cabecinhas	78cf19b9b9	Fixed a bug in lowering shuffle_vectors to insertps Summary: We were being too strict and not accounting for undefs. Added a test case and fixed another one where we improved codegen. Reviewers: grosbach, nadav, delena Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4039 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210361 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-06 18:07:06 +00:00
Andrea Di Biagio	0bedfa456a	[X86] Fix checked arithmetic for i8 on X86. When lowering a ISD::BRCOND into a test+branch, make sure that we always use the correct condition code to emit the test operation. This fixes PR19858: "i8 checked mul is wrong on x86". Patch by Keno Fisher! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210032 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-02 16:00:27 +00:00
Eric Christopher	c55e193cdd	Have the TLOF creation take a Triple rather than needing a subtarget. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209937 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-31 00:07:32 +00:00
Andrea Di Biagio	1726e2ff15	[X86] Add two combine rules to simplify dag nodes introduced during type legalization when promoting nodes with illegal vector type. This patch teaches the backend how to simplify/canonicalize dag node sequences normally introduced by the backend when promoting certain dag nodes with illegal vector type. This patch adds two new combine rules: 1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) -> (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>) 2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) -> (shuffle (BINOP A, B), Undef, <Mask>). Both rules are only triggered on the type-legalized DAG. In particular, rule 1. is a target specific combine rule that attempts to sink a bitconvert into the operands of a binary operation. Rule 2. is a target independet rule that attempts to move a shuffle immediately after a binary operation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209930 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-30 23:17:53 +00:00
Filipe Cabecinhas	94141a42ed	Separate the check for blend shuffle_vector masks Summary: Separate the check for blend shuffle_vector masks into isBlendMask. This function will also be used to check if a vector shuffle is legal. No change in functionality was intended, but we ended up improving codegen on two tests, which were being (more) optimized only if the resulting shuffle was legal. Reviewers: nadav, delena, andreadb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3964 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209923 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-30 21:31:21 +00:00
Rafael Espindola	dc3ce836da	Delete dead code. GV is never used past this point. This was probably a copy and paste error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209518 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-23 15:07:51 +00:00
Andrea Di Biagio	3957d4245f	[X86] Improve the lowering of BITCAST from MVT::f64 to MVT::v4i16/MVT::v8i8. This patch teaches the x86 backend how to efficiently lower ISD::BITCAST dag nodes from MVT::f64 to MVT::v4i16 (and vice versa), and from MVT::f64 to MVT::v8i8 (and vice versa). This patch extends the logic from revision 208107 to also handle MVT::v4i16 and MVT::v8i8. Also, this patch correctly propagates Undef values when performing the widening of a vector (example: when widening from v2i32 to v4i32, the upper 64bits of the resulting vector are 'undef'). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209451 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-22 16:21:39 +00:00
Quentin Colombet	fd0096a42c	[X86] Fix a bug in the lowering of BLENDI introduced in r209043. ISD::VSELECT mask uses 1 to identify the first argument and 0 to identify the second argument. On the other hand, BLENDI uses 0 to identify the first argument and 1 to identify the second argument. Fix the generation of the blend mask to account for this difference. The bug did not show up with r209043, because we were not checking for the actual arguments of the blend instruction! This commit also fixes the test cases. Note: The same mask works for the BLENDr variant because the arguments are swapped during instruction selection (see the BLENDXXrr patterns). <rdar://problem/16975435> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209324 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-21 22:00:39 +00:00
Simon Atanasyan	57a98baa07	Add parentheses to suppress the gcc warning '-Wparentheses'. No functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209203 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-20 10:23:04 +00:00
Filipe Cabecinhas	ca162faee2	Added more insertps optimizations Summary: When inserting an element that's coming from a vector load or a broadcast of a vector (or scalar) load, combine the load into the insertps instruction. Added PerformINSERTPSCombine for the case where we need to fix the load (load of a vector + insertps with a non-zero CountS). Added patterns for the broadcasts. Also added tests for SSE4.1, AVX, and AVX2. Reviewers: delena, nadav, craig.topper Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3581 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209156 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-19 19:45:57 +00:00
Benjamin Kramer	bb81d9d5fa	SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not. - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though. - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal. - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209123 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-19 13:12:38 +00:00
Saleem Abdulrasool	82b1114fef	Target: remove old constructors for CallLoweringInfo This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209082 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-17 21:50:17 +00:00
Chandler Carruth	2ff4a49344	[x86] Fix a bad predicate I spotted by inspection -- pshufhw and pshuflw were added in SSE2, no SSSE3. Found this while auditing all uses of SSSE3 in the X86 target. I don't actually expect this to make a significant difference on anything and I don't have any detailed test cases but I updated the existing test cases that already covered some of this code path. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209056 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-17 03:29:20 +00:00
Filipe Cabecinhas	d77c1c4465	Implemented special cases for PerformVSELECTCombine. vselects with constant masks, after legalization, will get turned into specialized shuffle_vectors so they can be matched to blend+imm instructions. Fixed some tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209044 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-16 22:47:54 +00:00

... 4 5 6 7 8 ...

2884 Commits