llvm-6502/test/Transforms
Filipe Cabecinhas c5f611404c Convert some X86 blendv* intrinsics into IR.
Summary:
Implemented an InstCombine transformation that takes a blendv* intrinsic
call and translates it into an IR select, if the mask is constant.

This will eventually get lowered into blends with immediates if possible,
or pblendvb (with an option to further optimize if we can transform the
pblendvb into a blend+immediate instruction, depending on the selector).
It will also enable optimizations by the IR passes, which give up on
sight of the intrinsic.

Both the transformation and the lowering of its result to asm got shiny
new tests.

The transformation is a bit convoluted because of blendvp[sd]'s
definition:

Its mask is a floating point value! This forces us to convert it and get
the highest bit. I suppose this happened because the mask has type
__m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.

I will send an email to llvm-dev to discuss if we want to change this or
not.

Reviewers: grosbach, delena, nadav

Differential Revision: http://reviews.llvm.org/D3859

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-27 03:42:20 +00:00
..
ADCE
AddDiscriminators Fix bug 19437 - Only add discriminators for DWARF 4 and above. 2014-04-17 22:33:50 +00:00
ArgumentPromotion IR: Conservatively verify inalloca arguments 2014-04-30 17:22:00 +00:00
AtomicExpandLoadLinked/ARM Atomics: promote ARM's IR-based atomics pass to CodeGen. 2014-04-17 18:22:47 +00:00
BBVectorize Allow vectorization of bit intrinsics in BB Vectorizer. 2014-04-25 03:33:48 +00:00
BranchFolding
CodeExtractor
CodeGenPrepare Similar to bitcast, treat addrspacecast as a foldable operand. 2014-05-22 00:02:52 +00:00
ConstantHoisting AArch64/ARM64: move ARM64 into AArch64's place 2014-05-24 12:50:23 +00:00
ConstantMerge
ConstProp Teach the constant folder to look through bitcast constant expressions 2014-05-15 09:56:28 +00:00
CorrelatedValuePropagation
DeadArgElim
DeadStoreElimination
DebugIR
EarlyCSE
FunctionAttrs
GCOVProfiling
GlobalDCE Fix most of PR10367. 2014-05-16 19:35:39 +00:00
GlobalMerge AArch64/ARM64: move ARM64 into AArch64's place 2014-05-24 12:50:23 +00:00
GlobalOpt Add comdat key field to llvm.global_ctors and llvm.global_dtors 2014-05-16 20:39:27 +00:00
GVN [GVN] Pass the phi-translated address of a load instead of the untranslated 2014-05-02 17:59:17 +00:00
IndVarSimplify ScalarEvolution: Fix handling of AddRecs in isKnownPredicate 2014-05-23 00:06:56 +00:00
Inline Add support for missed and analysis optimization remarks. 2014-05-22 14:19:46 +00:00
InstCombine Convert some X86 blendv* intrinsics into IR. 2014-05-27 03:42:20 +00:00
InstSimplify Teach isKnownNonNull that a nonnull return is not null. Add a test for this case as well as the case of a nonnull attribute (already handled but not tested). 2014-05-20 05:13:21 +00:00
Internalize Fix most of PR10367. 2014-05-16 19:35:39 +00:00
IPConstantProp
JumpThreading
LCSSA
LICM
LoopDeletion
LoopIdiom
LoopReroll
LoopRotate
LoopSimplify
LoopStrengthReduce AArch64/ARM64: move ARM64 into AArch64's place 2014-05-24 12:50:23 +00:00
LoopUnroll Move late partial-unrolling thresholds into the processor definitions 2014-05-08 09:14:44 +00:00
LoopUnswitch
LoopVectorize AArch64/ARM64: move ARM64 into AArch64's place 2014-05-24 12:50:23 +00:00
LowerAtomic
LowerExpectIntrinsic
LowerInvoke Remove LowerInvoke's obsolete "-enable-correct-eh-support" option 2014-03-20 19:54:47 +00:00
LowerSwitch
Mem2Reg
MemCpyOpt Treat lifetime.start'd memory like we treat freshly alloca'd memory. Patch by Björn Steinbrink! 2014-03-26 23:45:15 +00:00
MergeFunc IR: Don't allow non-default visibility on local linkage 2014-05-07 22:57:20 +00:00
MetaRenamer
ObjCARC
PhaseOrdering
PruneEH
Reassociate
Reg2Mem
SampleProfile
Scalarizer
ScalarRepl
SCCP
SeparateConstOffsetFromGEP/NVPTX Add the extracted constant offset using GEP 2014-05-23 18:39:40 +00:00
SimplifyCFG Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCost 2014-05-09 17:02:46 +00:00
Sink Sink: Don't sink static allocas from the entry block 2014-03-21 15:51:51 +00:00
SLPVectorizer AArch64/ARM64: move ARM64 into AArch64's place 2014-05-24 12:50:23 +00:00
SROA
StripSymbols
StructurizeCFG
TailCallElim Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'. 2014-05-05 23:59:03 +00:00
TailDup