25505 Commits

Author SHA1 Message Date
Rafael Espindola
936a1b573b Add a test showing the interaction of linker scripts and plugin.
In particular, the linker script is processed early enough for function g
to be internalized.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214916 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 19:56:53 +00:00
Chandler Carruth
fadc91beec [x86] Fix a crasher due to shuffles which cancel each other out and add
a test case.

We also miscompile this test case which is showing a serious flaw in the
single-input v8i16 shuffle code. I've left the specific instruction
checks FIXME-ed out until I can address the bug in the single-input
code, but I wanted to separate out a significant functionality change to
produce correct code from a very simple and targeted crasher fix.

The miscompile problem stems from keeping track of inputs by value
rather than by index. As a consequence of doing this, we can't reliably
update those inputs because they might swap and we can't detect this
without copying the mask.

The blend code now uses indices for the input lists and this seems
strictly better. It also should make it easier to sort things and do
other cleanups. I think the time has come to simplify The Great Lambda
here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214914 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 18:45:49 +00:00
Philip Reames
b835f3446f Remove dead zero store to calloc initialized memory
Optimize the following IR:

%1 = tail call noalias i8* @calloc(i64 1, i64 4)
%2 = bitcast i8* %1 to i32*
; This store is dead and should be removed
store i32 0, i32* %2, align 4

Memory returned by calloc is guaranteed to be zero initialized. If the value being stored is the constant zero (and the store is not otherwise observable across threads), we can delete the store.  If the store is to an out of bounds address, it is undefined and thus also removable.

Reviewed By: nicholas

Differential Revision: http://reviews.llvm.org/D3942




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214897 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:48:20 +00:00
Jonathan Roelofs
c2feb6bb3a Revert r214881 because it broke lots of build-bots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214893 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:36:05 +00:00
Sanjay Patel
6902c687b0 Optimize vector fabs of bitcasted constant integer values.
Allow vector fabs operations on bitcasted constant integer values to be optimized
in the same way that we already optimize scalar fabs.

So for code like this:
%bitcast = bitcast i64 18446744069414584320 to <2 x float> ; 0xFFFF_FFFF_0000_0000
%fabs = call <2 x float> @llvm.fabs.v2f32(<2 x float> %bitcast)
%ret = bitcast <2 x float> %fabs to i64

Instead of generating something like this:

movabsq (constant pool loadi of mask for sign bits)
vmovq   (move from integer register to vector/fp register)
vandps  (mask off sign bits)
vmovq   (move vector/fp register back to integer return register)

We should generate:

mov     (put constant value in return register)

I have also removed a redundant clause in the first 'if' statement:
N0.getOperand(0).getValueType().isInteger()

is the same thing as:
IntVT.isInteger()

Testcases for x86 and ARM added to existing files that deal with vector fabs.
One existing testcase for x86 removed because it is no longer ideal.

For more background, please see:
http://reviews.llvm.org/D4770

And:
http://llvm.org/bugs/show_bug.cgi?id=20354

Differential Revision: http://reviews.llvm.org/D4785


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214892 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:35:22 +00:00
Adam Nemet
545f89213d [AVX512] Add masking variant and intrinsics for valignd/q
This is similar to what I did with the two-source permutation recently.  (It's
almost too similar so that we should consider generating the masking variants
with some tablegen help.)

Both encoding and intrinsic tests are added as well.  For the latter, this is
what the IR that the intrinsic test on the clang side generates.

Part of <rdar://problem/17688758>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:23:04 +00:00
Jonathan Roelofs
77327b8520 Fix return sequence on armv4 thumb
POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214881 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 17:13:17 +00:00
David Blaikie
effcb72a40 Improve test for merged global debug info by using llvm-dwarfdump.
It's a bit of a tradeoff, since llvm-dwarfdump doesn't print the name of
the global symbol being used as an address in the addressing mode, but
this avoids the dependence on hardcoded set labels that keep changing
(5+ commits over the last few years that each update the set label as it
changes due to other, unrelated differences in output). This could've,
instead, been changed to match the set name then match the name in the
string pool but that would present other issues (needing to skip over
the sets that weren't of interest, etc) and checking that the addresses
(granted, without relocations applied - so it's not the whole story)
match in the two variable location descriptions seems sufficient and
fairly stable here.

There are a few similar other tests with similar label dependence that
I'll update soonish.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214878 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 16:20:25 +00:00
Joerg Sonnenberger
2888b08b44 Add accessors for the PPC 403 bank registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214875 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 15:45:15 +00:00
Renato Golin
84bd563975 Add tests for cp10/cp11 on ARMv5/6
Tests for ARMv7/8 are already on diagnostics.s

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214872 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 15:29:41 +00:00
Keith Walker
f2115a0b0d Specify that the thumb setend and blx <immed> instructions are not valid on an m-class target
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214871 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 15:11:59 +00:00
Keith Walker
966fc9344b Define stc2/stc2l/ldc2/ldc2l as thumb2 instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214868 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:58:05 +00:00
Joerg Sonnenberger
bb97134ffe Accessors for SSR2 and SSR3 on PPC 403.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214867 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:53:05 +00:00
Tom Stellard
94dfb8818d R600/SI: Update MUBUF assembly string to match AMD proprietary compiler
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214866 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:48:12 +00:00
Tom Stellard
9a7e35aecc R600/SI: Avoid generating REGISTER_LOAD instructions.
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214865 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:40:52 +00:00
Joerg Sonnenberger
deaa09e169 Add dci/ici instructions for PPC 476 and friends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214864 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:40:32 +00:00
Joerg Sonnenberger
2bdd960ae3 Add mftblo and mftbhi for PPC 4xx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214863 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 14:18:16 +00:00
Joerg Sonnenberger
0f365741f3 Add lswi / stswi for assembler use with a warning to not add patterns
for them.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214862 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 13:34:01 +00:00
Yi Kong
68b3d680f2 AArch64: Add support for instruction prefetch intrinsic
Instruction prefetch is not implemented for AArch64, it is incorrectly
translated into data prefetch instruction.

Differential Revision: http://reviews.llvm.org/D4777


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214860 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 12:46:47 +00:00
James Molloy
72035e9a8e Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost.
Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214859 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 12:30:34 +00:00
Joerg Sonnenberger
c754b579ae Allow binary and for tblgen math.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214851 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 09:43:25 +00:00
Chandler Carruth
e6329cf303 [x86] Fix a crash and wrong-code bug in the new vector lowering all
found by a single test reduced out of a failure on llvm-stress.

The start of the problem (and the crash) came when we tried to use
a find of a non-used slot in the move-to half of the move-mask as the
target for two bad-half inputs. While if lucky this will be the first of
a pair of slots which we can place the bad-half inputs into, it isn't
actually guaranteed. This really isn't surprising, not sure what I was
thinking. The correct way to find the two unused slots is to look for
one of the *used* slots. We know it isn't that pair, and we can use some
modular arithmetic to find the other pair by masking off the odd bit and
adding 2 modulo 4. With this, we reliably found a viable pair of slots
for the bad-half inputs.

Sadly, that wasn't enough. We also had a wrong code bug that surfaced
when I reduced the test case for this where we would use the same slot
twice for the two bad inputs. This is because both of the bad inputs
could be in odd slots originally and thus the mod-2 mapping would
actually be the same. The whole point of the weird indexing into the
pair of empty slots was to try to leverage when the end result needed
the two bad-half inputs to be paired in a dword and pre-pair them in the
correct orrientation. This is less important with the powerful combining
we're now doing, and also easier and more reliable to achieve be noting
that we add the bad-half inputs in order. Thus, if they are in a dword
pair, the low part of that will be the first input in the sequence.
Always putting that in the low element will just do the right thing in
addition to computing the correct result.

Test case added. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214849 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 08:19:21 +00:00
Juergen Ributzka
eee659a076 [FastISel][AArch64] Implement the FastLowerArguments hook.
This implements basic argument lowering for AArch64 in FastISel. It only
handles a small subset of the C calling convention. It supports simple
arguments that can be passed in GPR and FPR registers.

This should cover most of the trivial cases without falling back to
SelectionDAG.

This fixes <rdar://problem/17890986>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:48 +00:00
Kevin Qin
6739735812 Revert "r214832 - MachineCombiner Pass for selecting faster instruction"
It broke compiling of most Benchmark and internal test, as clang got
clashed by segmentation fault or assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214845 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:47 +00:00
Juergen Ributzka
7e9c0bc511 [FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214844 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:44 +00:00
Gerolf Hoflehner
c2328d552c MachineCombiner Pass for selecting faster instruction
sequence on AArch64

Re-commit of r214669 without changes to test cases
LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and
LLVM:: CodeGen/AArch64/dp-3source.ll
This resolves the reported compfails of the original commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 01:16:13 +00:00
Joerg Sonnenberger
e2f9c8d663 Add TCR register access
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214826 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:53:42 +00:00
Joerg Sonnenberger
7c5b978254 Add PPC 603's tlbld and tlbli instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214825 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:49:45 +00:00
Bill Schmidt
84fef1f55d [PPC64LE] Fix wrong IR for vec_sld and vec_vsldoi
My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR.  Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.

This patch and a companion patch for Clang correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end.  Several test cases are also modified to reflect the
now-correct LLVM IR.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214800 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:21:01 +00:00
Kevin Enderby
2a78a0cd47 Enable Darwin vararg parameters support in assembler macros.
Duplicate the vararg tests for linux and add a tests which mixed
vararg arguments with darwin positional parameters.

Patch by: Janne Grunau <j@jannau.net>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 23:14:37 +00:00
Joerg Sonnenberger
db3ce56a58 Add simplified aliases for access to DCCR, ICCR, DEAR and ESR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214797 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 22:56:42 +00:00
Juergen Ributzka
2c68cde701 [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.
This fix changes the parameters #r and #s that are passed to the UBFM/SBFM
instruction to get the zero/sign-extension for free.

The original problem was that the shift left would use the 32-bit shift even for
i8/i16 value types, which could leave the upper bits set with "garbage" values.

The arithmetic shift right on the other side would use the wrong MSB as sign-bit
to determine what bits to shift into the value.

This fixes <rdar://problem/17907720>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214788 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:49:51 +00:00
Chandler Carruth
8a74e2fc07 [SDAG] Fix a really, really terrible bug in the DAG combiner.
This code is completely wrong. It is also dead, as if it were to *ever*
run, it would crash. Fortunately, after my work to the combiner, it is
at least *possible* to reach the code, and llvm-stress has found a test
case. Thanks to Patrick for reporting.

It would be really good if anyone who remembers how this code works and
what it was intended to do could add some more obvious test coverage
instead of my completely contrived and reduced test case. My test case
was so brittle I left a bread crumb comment in it to help the next
person to stumble on it and not know what it was actually testing for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214785 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:29:59 +00:00
Joerg Sonnenberger
25c8b4774b tlbre / tlbwe / tlbsx / tlbsx. variants for the PPC 4xx CPUs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:28:22 +00:00
Chad Rosier
82c93451f3 [AArch64] Extend the number of scalar instructions supported in the AdvSIMD
scalar integer instruction pass.

This is a patch I had lying around from a few months ago.  The pass is
currently disabled by default, so nothing to interesting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:20:25 +00:00
Joerg Sonnenberger
a770bd6cae MC uses .lcomm now, so adjust.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:06:00 +00:00
Reid Kleckner
0b04cf6e71 Fix failure to invoke exception handler on Win64
When the last instruction prior to a function epilogue is a call, we
need to emit a nop so that the return address is not in the epilogue IP
range.  This is consistent with MSVC's behavior, and may be a workaround
for a bug in the Win64 unwinder.

Differential Revision: http://reviews.llvm.org/D4751

Patch by Vadim Chugunov!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214775 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:05:27 +00:00
Joerg Sonnenberger
355845b437 Recognize mftbl as alias for mftb, for symmetry with mttb.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214769 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 20:28:34 +00:00
Joerg Sonnenberger
e8f23a8a5d Allow .lcomm with alignment on ELF targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214754 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 18:45:10 +00:00
Joerg Sonnenberger
df64464ad2 Add support for m[ft][di]bat[ul] instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214731 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 17:07:41 +00:00
Joerg Sonnenberger
977b978f93 Add features for PPC 4xx and e500/e500mc instructions.
Move the test cases for them into separate files.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214724 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 15:47:38 +00:00
Ulrich Weigand
29ec7479a1 [PowerPC] Add target triple to vec_urem_const.ll test case
This should hopefully fix build bots on other architectures.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214721 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 14:55:26 +00:00
Robert Khasanov
7017934668 [SKX] Enabling load/store instructions: encoding
Instructions: VMOVAPD, VMOVAPS, VMOVDQA8, VMOVDQA16, VMOVDQA32,VMOVDQA64, VMOVDQU8, VMOVDQU16, VMOVDQU32,VMOVDQU64, VMOVUPD, VMOVUPS,

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214719 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 14:35:15 +00:00
Ulrich Weigand
c568629589 [PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endian
In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl*
by swapping the input arguments.  As it turns out, the exact same fix
is also required for the vpkuhum/vpkuwum patterns.

This fixes another regression in llvmpipe when vector support is
enabled.

Reviewed by Bill Schmidt.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214718 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 13:53:40 +00:00
Ulrich Weigand
d5e9497c88 [PowerPC] MULHU/MULHS are not legal for vector types
I ran into some test failures where common code changed vector division
by constant into a multiply-high operation (MULHU).  But these are not
implemented by the back-end, so we failed to recognize the insn.

Fixed by marking MULHU/MULHS as Expand for vector types.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 13:27:12 +00:00
Ulrich Weigand
3b7a193521 [PowerPC] Fix and improve vector comparisons
This patch refactors code generation of vector comparisons.

This fixes a wrong code-gen bug for ISD::SETGE for floating-point types,
and improves generated code for vector comparisons in general.

Specifically, the patch moves all logic deciding how to implement vector
comparisons into getVCmpInst, which gets two extra boolean outputs
indicating to its caller whether its needs to swap the input operands
and/or negate the result of the comparison.  Apart from implementing
these two modifications as directed by getVCmpInst, there is no need
to ever implement vector comparisons in any other manner; in particular,
there is never a need to perform two separate comparisons (e.g. one for
equal and one for greater-than, as code used to do before this patch).

Reviewed by Bill Schmidt.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214714 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 13:13:57 +00:00
Daniel Sanders
98b419bff7 [mips] Add assembler support for '.set mipsX'.
Summary:
This patch also fixes an issue with the way the Mips assembler enables/disables architecture
features. Before this patch, the assembler never disabled feature bits. For example,
.set mips64
.set mips32r2

would result in the 'OR' of mips64 with mips32r2 feature bits which isn't right.
Unfortunately this isn't trivial to fix because there's not an easy way to clear
feature bits as the algorithm in MCSubtargetInfo (ToggleFeature) only clears the bits
that imply the feature being cleared and not the implied bits by the feature (there's a
better explanation to the code I added).

Patch by Matheus Almeida and updated by Toma Tabacu

Reviewers: vmedic, matheusalmeida, dsanders

Reviewed By: dsanders

Subscribers: tomatabacu, llvm-commits

Differential Revision: http://reviews.llvm.org/D4123


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214709 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 12:20:00 +00:00
Chandler Carruth
48593d7934 [x86] Just unilaterally prefer SSSE3-style PSHUFB lowerings over clever
use of PACKUS. It's cleaner that way.

I looked at implementing clever combine-based folding of PACKUS chains
into PSHUFB but it is quite hard and doesn't seem likely to be worth it.
The most annoying part would be detecting that the correct masking had
been done to use PACKUS-style instructions as a blend operation rather
than there being any saturating as is indicated by its name. We generate
really nice code for what few test cases I've come up with that aren't
completely contrived for this by just directly prefering PSHUFB and so
let's go with that strategy for now. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214707 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 10:17:35 +00:00
Chandler Carruth
93f5d9f093 [x86] Implement more aggressive use of PACKUS chains for lowering common
patterns of v16i8 shuffles.

This implements one of the more important FIXMEs for the SSE2 support in
the new shuffle lowering. We now generate the optimal shuffle sequence
for truncate-derived shuffles which show up essentially everywhere.

Unfortunately, this exposes a weakness in other parts of the shuffle
logic -- we can no longer form PSHUFB here. I'll add the necessary
support for that and other things in a subsequent commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214702 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 09:40:02 +00:00
Kevin Qin
534100b31e Revert "r214669 - MachineCombiner Pass for selecting faster instruction"
This commit broke "make check" for several hours, so get it reverted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214697 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 05:10:33 +00:00