llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 04:30:12 +00:00

Andrea Di Biagio 029a76b0a2	2014-02-12 23:43:47 +00:00

[Vectorizer] Add a new 'OperandValueKind' in TargetTransformInfo called 'OK_NonUniformConstValue' to identify operands which are constants but not constant splats.

The cost model now allows returning 'OK_NonUniformConstValue' for non splat operands that are instances of ConstantVector or ConstantDataVector.

With this change, targets are now able to compute different costs for instructions with non-uniform constant operands. For example, on X86 the cost of a vector shift may vary depending on whether the second operand is a uniform or non-uniform constant.

This patch applies the following changes:
 - The cost model computation now takes into account non-uniform constants;
 - The cost of vector shift instructions has been improved in X86TargetTransformInfo analysis pass;
 - BBVectorize, SLPVectorizer and LoopVectorize now know how to distinguish between non-uniform and uniform constant operands.

Added a new test to verify that the output of opt '-cost-model -analyze' is valid in the following configurations: SSE2, SSE4.1, AVX, AVX2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201272 91177308-0d34-0410-b5e6-96231b3b80d8
IPA	GlobalsModRef: Unify and clean up duplicated pointer analysis code.	2014-02-10 14:17:30 +00:00
AliasAnalysis.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
AliasAnalysisCounter.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
AliasAnalysisEvaluator.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
AliasDebugger.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
AliasSetTracker.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
Analysis.cpp	[PM] Make the verifier work independently of any pass manager.	2014-01-19 02:22:18 +00:00
BasicAliasAnalysis.cpp	Fix known typos	2014-01-24 17:20:08 +00:00
BlockFrequencyInfo.cpp	BlockFrequencyInfo: Readded getEntryFreq.	2013-12-20 22:11:11 +00:00
BranchProbabilityInfo.cpp	[block-freq] Teach branch probability how to return the edge weight in between a BasicBlock and one of its successors.	2013-12-14 02:24:25 +00:00
CaptureTracking.cpp	Make nocapture analysis work with addrspacecast	2014-01-14 19:11:52 +00:00
CFG.cpp	Make succ_iterator a real random access iterator and clean up a couple of users.	2014-02-10 14:17:42 +00:00
CFGPrinter.cpp	Use the new script to sort the includes of every file under lib.	2012-12-03 16:50:05 +00:00
CMakeLists.txt	[PM] Add a new "lazy" call graph analysis pass for the new pass manager.	2014-02-06 04:37:03 +00:00
CodeMetrics.cpp	Begin fleshing out an interface in TTI for modelling the costs of	2013-01-22 11:26:02 +00:00
ConstantFolding.cpp	Add addrspacecast instruction.	2013-11-15 01:34:59 +00:00
CostModel.cpp	[Vectorizer] Add a new 'OperandValueKind' in TargetTransformInfo called	2014-02-12 23:43:47 +00:00
Delinearization.cpp	Re-sort all of the includes with ./utils/sort_includes.py so that	2014-01-07 11:48:04 +00:00
DependenceAnalysis.cpp	Fix known typos	2014-01-24 17:20:08 +00:00
DominanceFrontier.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
DomPrinter.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
InstCount.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
InstructionSimplify.cpp	InstSimplify: Make shift, select and GEP simplifications vector-aware.	2014-01-24 17:09:53 +00:00
Interval.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
IntervalPartition.cpp
IVUsers.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
LazyCallGraph.cpp	[PM] Fix horrible typos that somehow didn't cause a failure in a C++11	2014-02-06 05:17:02 +00:00
LazyValueInfo.cpp	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size.	2013-07-04 01:31:24 +00:00
LibCallAliasAnalysis.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
LibCallSemantics.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
Lint.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
LLVMBuild.txt	LLVMBuild: Introduce a common section which currently has a list of the	2011-12-12 22:45:54 +00:00
Loads.cpp	Change GetPointerBaseWithConstantOffset's DataLayout argument from a	2013-01-31 02:00:45 +00:00
LoopInfo.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
LoopPass.cpp	Disable most IR-level transform passes on functions marked 'optnone'.	2014-02-06 00:07:05 +00:00
Makefile
MemDepPrinter.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
MemoryBuiltins.cpp	Update optimization passes to handle inalloca arguments	2014-01-28 02:38:36 +00:00
MemoryDependenceAnalysis.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
ModuleDebugInfoPrinter.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
NoAliasAnalysis.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
PHITransAddr.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
PostDominators.cpp	[PM] Pull the generic graph algorithms and data structures for dominator	2014-01-13 10:52:56 +00:00
PtrUseVisitor.cpp	Hoist the GEP constant address offset computation to a common home on	2012-12-11 10:29:10 +00:00
README.txt
RegionInfo.cpp	[PM] Split DominatorTree into a concrete analysis result object which	2014-01-13 13:07:17 +00:00
RegionPass.cpp	Remove the block_node_iterator of Region, replace it by the block_iterator.	2012-08-27 13:49:24 +00:00
RegionPrinter.cpp	Use the new script to sort the includes of every file under lib.	2012-12-03 16:50:05 +00:00
ScalarEvolution.cpp	SCEV: Cast switched values to make -Wswitch more useful.	2014-02-11 19:02:55 +00:00
ScalarEvolutionAliasAnalysis.cpp	Use the new script to sort the includes of every file under lib.	2012-12-03 16:50:05 +00:00
ScalarEvolutionExpander.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
ScalarEvolutionNormalization.cpp	[cleanup] Move the Dominators.h and Verifier.h headers into the IR	2014-01-13 09:26:24 +00:00
SparsePropagation.cpp	Move all of the header files which are involved in modelling the LLVM IR	2013-01-02 11:36:10 +00:00
TargetTransformInfo.cpp	Make succ_iterator a real random access iterator and clean up a couple of users.	2014-02-10 14:17:42 +00:00
Trace.cpp	Put the functionality for printing a value to a raw_ostream as an	2014-01-09 02:29:41 +00:00
TypeBasedAliasAnalysis.cpp	TBAA: fix PR17620.	2013-10-22 01:40:25 +00:00
ValueTracking.cpp	Allow speculating llvm.sqrt, fma and fmuladd	2014-01-31 00:09:00 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
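
The equivalence claimed above can be checked numerically. What follows is a
minimal Python sketch, not LLVM code. The helper names (chrec_at,
expanded_exit_value, simple_exit_value) are hypothetical, and the sketch
assumes the loop exits at iteration %n - 1, which is what the (-1 + %n) and
(-2 + %n) terms in the expanded form suggest. i64 and i65 arithmetic are
modelled with bit masks; math.comb needs Python 3.8 or later.

```python
from math import comb

MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


def chrec_at(coeffs, i):
    """Evaluate the add recurrence {c0,+,c1,+,c2,...} at iteration i.

    Uses the standard closed form sum(c_k * C(i, k)).
    """
    return sum(c * comb(i, k) for k, c in enumerate(coeffs))


def expanded_exit_value(n):
    """Mirror the expression ScalarEvolution currently produces, in i64."""
    a = (n - 2) & MASK64          # zext i64 (-2 + %n) to i65
    b = (n - 1) & MASK64          # zext i64 (-1 + %n) to i65
    prod = (a * b) & MASK65       # i65 multiply keeps the bit that /u 2 needs
    half = (prod // 2) & MASK64   # /u 2, then trunc to i64
    return (-2 + 2 * half + 3 * n) & MASK64


def simple_exit_value(n):
    """The proposed simpler form: %n * %n in i64."""
    return (n * n) & MASK64


# Assumption: the exit iteration is n - 1, so the recurrence yields n * n.
for n in range(1, 100):
    assert chrec_at([1, 3, 2], n - 1) == n * n

# The expanded and simple forms agree, including at wraparound boundaries.
for n in (0, 1, 2, 3, 1000, 1 << 32, 1 << 63, MASK64):
    assert expanded_exit_value(n) == simple_exit_value(n)
```

The extra bit in the i65 product is what makes the unsigned division by 2
exact after truncation; the simple form needs only a single i64 multiply.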

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is
forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
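
The fold rests on truncation commuting with negation in modular arithmetic,
so the two %arg5 terms cancel. The following Python sketch illustrates this
under stated assumptions: the function names are hypothetical, and undef is
modelled as a single arbitrary i64 value u, which is a simplification of
LLVM's undef semantics.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc_i64_to_i32(x):
    """Model 'trunc i64 x to i32' by keeping the low 32 bits."""
    return x & MASK32


def unfolded(arg5, u):
    """The expression ScalarEvolution currently forms, evaluated in i32."""
    neg_arg5 = (-1 * arg5) & MASK64            # (-1 * %arg5) in i64
    term1 = trunc_i64_to_i32(neg_arg5)         # trunc i64 (-1 * %arg5) to i32
    term2 = trunc_i64_to_i32(arg5)             # trunc i64 %arg5 to i32
    term3 = (-1 * trunc_i64_to_i32(u)) & MASK32
    return (term1 + term2 + term3) & MASK32


def folded(u):
    """The proposed folded form: -1 * (trunc i64 u to i32)."""
    return (-1 * trunc_i64_to_i32(u)) & MASK32


# The %arg5 terms cancel for every value tried, whatever u is.
for arg5 in (0, 1, 12345, 1 << 31, 1 << 32, 1 << 63, MASK64):
    for u in (0, 7, 1 << 40, MASK64):
        assert unfolded(arg5, u) == folded(u)
```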

//===---------------------------------------------------------------------===//