llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-02-24 12:29:33 +00:00

Author	SHA1	Message	Date
Chandler Carruth	04402a6c13	[x86] Undo a flawed transform I added to form UNPCK instructions when AVX is available, and generally tidy up things surrounding UNPCK formation. Originally, I was thinking that the only advantage of PSHUFD over UNPCK instruction variants was its free copy, and otherwise we should use the shorter encoding UNPCK instructions. This isn't right though, there is a larger advantage of being able to fold a load into the operand of a PSHUFD. For UNPCK, the operand must be in a register so it can be the second input. This removes the UNPCK formation in the target-specific DAG combine for v4i32 shuffles. It also lifts the v8 and v16 cases out of the AVX-specific check as they are potentially replacing multiple instructions with a single instruction and so should always be valuable. The floating point checks are simplified accordingly. This also adjusts the formation of PSHUFD instructions to attempt to match the shuffle mask to one which would fit an UNPCK instruction variant. This was originally motivated to allow it to match the UNPCK instructions in the combiner, but clearly won't now. Eventually, we should add a MachineCombiner pass that can form UNPCK instructions post-RA when the operand is known to be in a register and thus there is no loss. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217755 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-15 10:35:41 +00:00
Chandler Carruth	886f0101a7	[x86] Fix the very broken formation of vpunpck instructions in the target-specific shuffle DAG combines. We were recognizing the paired shuffles backwards. This code needs to be replaced anyways as we have the same functionality elsewhere, but I'll do the refactoring in a follow-up, this is the minimal fix to the behavior. In addition to fixing miscompiles with the new vector shuffle lowering, it also causes the canonicalization to kick in much better, selecting the smaller encoding variants in lots of places in the new AVX path. This still isn't quite ideal as we don't need both the shufpd and the punpck instructions, but that'll get fixed in a follow-up patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215690 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 03:54:49 +00:00
Benjamin Kramer	bb41c75ab5	X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate. Also update the cost model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193270 91177308-0d34-0410-b5e6-96231b3b80d8	2013-10-23 21:06:07 +00:00
Arnaud A. de Grandmaison	d9e70873f3	Cleanup: test source files do not need to be executable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180003 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-22 08:02:43 +00:00
Nadav Rotem	b05130e1b2	Optimize sext <4 x i8> and <4 x i16> to <4 x i64>. Patch by Ahmad, Muhammad T <muhammad.t.ahmad@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177421 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-19 18:38:27 +00:00
Elena Demikhovsky	52981c4b60	I optimized the following patterns: sext <4 x i1> to <4 x i64>, sext <4 x i8> to <4 x i64>, sext <4 x i16> to <4 x i64>. I'm running Combine on SIGN_EXTEND_IN_REG and reverting SEXT patterns: (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT))). The sext_in_reg (v4i32 x) may be lowered to shl+sar operations. The "sar" does not exist as a 64-bit operation, so lowering sext_in_reg (v4i64 x) has no vector solution. I also added the cost of these operations to the AVX costs table. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175619 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-20 12:42:54 +00:00
Nadav Rotem	0c8607ba6a	Revert 172708. The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends. This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical. Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume that there is only one SEXT node. The AVX mask optimization is one example. Additionally, this optimization does not update the cost model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172968 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-20 08:35:56 +00:00
Nadav Rotem	ba95865441	On Sandybridge split unaligned 256bit stores into two xmm-sized stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172894 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-19 08:38:41 +00:00
Elena Demikhovsky	6c327f92a5	Optimization for the following SIGN_EXTEND pairs: v8i8 -> v8i64, v8i8 -> v8i32, v4i8 -> v4i64, v4i16 -> v4i64 for AVX and AVX2. Bug 14865. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172708 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-17 09:59:53 +00:00
Benjamin Kramer	17347912b4	X86: Emit vector sext as shuffle + sra if vpmovsx is not available. Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better than scalarized loads. Fixes PR14590. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170984 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-22 11:34:28 +00:00
Elena Demikhovsky	4b977312c7	Optimized load + SIGN_EXTEND patterns in the X86 backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170506 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 07:50:20 +00:00
Elena Demikhovsky	dcabc7bca9	Optimization for SIGN_EXTEND operation on AVX. Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32 extensions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149600 91177308-0d34-0410-b5e6-96231b3b80d8	2012-02-02 09:10:43 +00:00

12 Commits