llvm-6502/test/CodeGen
Adam Nemet 49f31255be [AVX512] Fix miscompile for unpack
r189189 implemented AVX512 unpack by essentially performing a 256-bit unpack
between the low and the high 256 bits of src1 into the low part of the
destination and another unpack of the low and high 256 bits of src2 into the
high part of the destination.

I don't think that's how unpack works.  AVX512 unpack simply has more 128-bit
lanes but other than it works the same way as AVX.  So in each 128-bit lane,
we're always interleaving certain parts of both operands rather different
parts of one of the operands.

E.g. for this:
__v16sf a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
__v16sf b = { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 };
__v16sf c = __builtin_shufflevector(a, b, 0, 8, 1, 9, 4, 12, 5, 13, 16,
	    			       	     24, 17, 25, 20, 28, 21, 29);

we generated punpcklps (notice how the elements of a and b are not interleaved
in the shuffle).  In turn, c was set to this:

  0 16 1 17 4 20 5 21 8 24 9 25 12 28 13 29

Obviously this should have just returned the mask vector of the shuffle
vector.

I mostly reverted this change and made sure the original AVX code worked
for 512-bit vectors as well.

Also updated the tests because they matched the logic from the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217602 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-11 16:51:10 +00:00
..
AArch64 [AArch64] Reenable the PBQP test now that the leak issue has been fixed. 2014-09-11 10:39:52 +00:00
ARM [ARM] Add Thumb-2 code size optimization regression test for LSR (register). 2014-09-11 10:45:50 +00:00
CPP
Generic Add a regression test to sanity check the PBQP allocator. 2014-09-03 18:04:10 +00:00
Hexagon DebugInfo: Assert that any CU for which debug_loc lists are emitted, has at least one range. 2014-08-06 00:21:25 +00:00
Inputs
Mips Replace -use-init-array with -use-ctors. 2014-09-02 13:54:53 +00:00
MSP430 Drop the W postfix on the 16-bit registers. 2014-09-10 06:58:14 +00:00
NVPTX [MachineSink] Use the real post dominator tree 2014-09-01 03:47:25 +00:00
PowerPC Enable splitting indexing from loads with TargetConstants 2014-09-02 16:05:23 +00:00
R600 R600: Test local atomics for evergreen 2014-09-11 15:02:52 +00:00
SPARC
SystemZ
Thumb Check-label a bit more specific 2014-09-03 13:32:08 +00:00
Thumb2 ARM / x86_64 varargs: Don't save regparms in prologue without va_start 2014-08-22 21:59:26 +00:00
X86 [AVX512] Fix miscompile for unpack 2014-09-11 16:51:10 +00:00
XCore llvm/test/CodeGen/XCore/dwarf_debug.ll: Fix not to be affected by *-win32. 2014-07-04 11:58:03 +00:00