[x86] Fix PR20540 where the x86 shuffle DAG combiner had completely

broken logic for merging shuffle masks in the face of SM_SentinelZero
mask operands.

While these are '-1' they don't mean 'undef' the way '-1' means in the
pre-legalized shuffle masks. Instead, they mean that the shuffle
operation is forcibly zeroing that lane. Reflect this and explicitly
handle it in a bunch of places. In one place the effect is equivalent
but much more clear. In the rest it was really weirdly broken.

Also, rewrite the entire merging thing to be a more directy operation
with a single loop and just doing math to map the indices through the
various masks.

Also add a bunch of asserts to try to make in extremely clear what the
different masks can possibly look like.

Finally, add some comments to clarify that we're merging shuffle masks
*up* here rather than *down* as we do everywhere else, and thus the
logic is quite confusing.

Thanks to several different people for sending test cases, and for
Robert Khasanov for an initial attempt at fixing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215687 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
Chandler Carruth
2014-08-15 02:43:18 +00:00
parent feb45e3f0f
commit 477f28c48d
2 changed files with 53 additions and 22 deletions

View File

@@ -313,3 +313,15 @@ entry:
ret <16 x i8> %s.12.4
}
define <16 x i8> @PR20540(<8 x i8> %a) {
; SSSE3-LABEL: @PR20540
; SSSE3: # BB#0:
; SSSE3-NEXT: pxor %xmm1, %xmm1
; SSSE3-NEXT: pshufb {{.*}} # xmm1 = zero,zero,zero,zero,zero,zero,zero,zero,xmm1[0,0,0,0,0,0,0,0]
; SSSE3-NEXT: pshufb {{.*}} # xmm0 = xmm0[0,2,4,6,8,10,12,14],zero,zero,zero,zero,zero,zero,zero,zero
; SSSE3-NEXT: por %xmm1, %xmm0
; SSSE3-NEXT: retq
%shuffle = shufflevector <8 x i8> %a, <8 x i8> zeroinitializer, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8>
ret <16 x i8> %shuffle
}