llvm-6502/test/Transforms
Diego Novillo 563b29f8db SampleProfileLoader pass. Initial setup.
This adds a new scalar pass that reads a file with samples generated
by 'perf' during runtime. The samples read from the profile are
incorporated and emitted as IR metadata reflecting that profile.

The profile file is assumed to have been generated by an external
profile source. The profile information is converted into IR metadata,
which is later used by the analysis routines to estimate block
frequencies, edge weights and other related data.

External profile information files have no fixed format; each profiler
is free to define its own. This includes both the on-disk representation
of the profile and the kind of profile information stored in the file.
A common kind of profile is based on sampling (e.g., perf), which
essentially counts how many times each line of the program has been
executed during the run.

The SampleProfileLoader pass is organized as a scalar transformation.
On startup, it reads the file given in -sample-profile-file to
determine what kind of profile it contains.  This file is assumed to
contain profile information for the whole application. The profile
data in the file is read and incorporated into the internal state of
the corresponding profiler.

To facilitate testing, I've organized the profilers to support two file
formats: text and native. The native format is whatever on-disk
representation the profiler wants to support. I think this will mostly
be bitcode files, but it could be anything the profiler wants to
support. To do this, every profiler must implement the
SampleProfile::loadNative() function.

The text format is mostly meant for debugging. Records are separated by
newlines, but each profiler is free to interpret records as it sees fit.
Profilers must implement the SampleProfile::loadText() function.

Finally, the pass will call SampleProfile::emitAnnotations() for each
function in the current translation unit. This function needs to
translate the loaded profile into IR metadata, which the analyzer will
later be able to use.

This patch implements the first steps towards the above design. I've
implemented a sample-based flat profiler. The format of the profile is
fairly simplistic. Each sampled function contains a list of relative
line locations (from the start of the function) together with a count
representing how many samples were collected at that line during
execution. I generate this profile using perf and a separate converter
tool.
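
To make the shape of such a flat profile concrete, here is a small sketch of
a loader. The record layout "<function> <line-offset> <count>" is a
hypothetical one chosen for the example, and the names FlatProfile and
parseFlatProfile are likewise assumptions. The text format accepted by this
patch may differ.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Function name -> (line offset from function start -> sample count).
using FlatProfile = std::map<std::string, std::map<unsigned, uint64_t>>;

// Parses records of the hypothetical form "<function> <line-offset> <count>",
// one per line. Empty lines are skipped. Returns false on the first line
// that does not have the three expected fields.
bool parseFlatProfile(std::istream &In, FlatProfile &Profile) {
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.empty())
      continue;
    std::istringstream Fields(Line);
    std::string Function;
    unsigned Offset;
    uint64_t Count;
    if (!(Fields >> Function >> Offset >> Count))
      return false;
    // Repeated records for the same line accumulate their samples.
    Profile[Function][Offset] += Count;
  }
  return true;
}
```

Keying the samples by offset from the start of the function, rather than by
absolute line number, matches the relative line locations described above.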

Currently, I have only implemented a text format for these profiles. I
am interested in initial feedback to the whole approach before I send
the other parts of the implementation for review.

This patch implements:

- The SampleProfileLoader pass.
- The base ExternalProfile class with the core interface.
- A SampleProfile sub-class using the above interface. The profiler
  generates branch weight metadata on every branch instruction that
  matches the profile.
- A text loader class to assist the implementation of
  SampleProfile::loadText().
- Basic unit tests for the pass.

Additionally, the patch uses profile information to compute branch
weights based on instruction samples.

This patch converts instruction samples into branch weights. It
does a fairly simplistic conversion:

Given a multi-way branch instruction, it calculates the weight of
each branch based on the maximum sample count gathered from each
target basic block.
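
The conversion can be sketched as follows. This is a simplified model, not
the code in the patch: a basic block is reduced to the list of line offsets
of its instructions, and the names LineSamples, BlockLines, blockWeight and
branchWeights are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// Line offset -> sample count for one function.
using LineSamples = std::map<unsigned, uint64_t>;
// A basic block, reduced to the line offsets of its instructions.
using BlockLines = std::vector<unsigned>;

// Weight of a block: the largest sample count seen on any of its lines,
// or 0 when none of its lines were sampled.
uint64_t blockWeight(const BlockLines &Block, const LineSamples &Samples) {
  uint64_t Max = 0;
  for (unsigned Line : Block) {
    auto It = Samples.find(Line);
    if (It != Samples.end())
      Max = std::max(Max, It->second);
  }
  return Max;
}

// One weight per successor of a branch, in successor order.
std::vector<uint64_t> branchWeights(const std::vector<BlockLines> &Successors,
                                    const LineSamples &Samples) {
  std::vector<uint64_t> Weights;
  for (const BlockLines &Block : Successors)
    Weights.push_back(blockWeight(Block, Samples));
  return Weights;
}
```

Because a weight depends only on the target block, every branch that jumps
to the same block receives the same weight in this model.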

Note that this assignment of branch weights is somewhat lossy and can be
misleading. If a basic block has more than one incoming branch, all the
incoming branches will get the same weight. In reality, it may be that
only one of them is the most heavily taken branch.

I will adjust this assignment in subsequent patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194566 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-13 12:22:21 +00:00
ADCE
ArgumentPromotion
BBVectorize Prevent LoopVectorizer and SLPVectorizer running if the target has no vector registers. 2013-09-18 12:43:35 +00:00
BranchFolding
CodeExtractor
CodeGenPrepare
ConstantMerge Correctly merge constants with explicit and implicit alignments. 2013-11-12 20:21:43 +00:00
ConstProp
CorrelatedValuePropagation
DeadArgElim Fix a bug in Dead Argument Elimination. 2013-10-09 17:21:44 +00:00
DeadStoreElimination
DebugIR Use right pointer type in DebugIR 2013-09-27 22:26:25 +00:00
EarlyCSE
FunctionAttrs
GCOVProfiling
GlobalDCE
GlobalOpt Quote potential shell expansions found in tests 2013-10-28 23:37:45 +00:00
GVN Fix PR17952. 2013-11-11 22:00:23 +00:00
IndVarSimplify Add test case for PR12377, it was fixed by r194116. 2013-11-06 11:55:41 +00:00
Inline Rename testing case to use - instead of _. 2013-11-04 18:52:06 +00:00
InstCombine Fold (iszero(A&K1) | iszero(A&K2)) -> (A&(K1|K2)) != (K1|K2) if we know that K1 and K2 are 'one-hot' (only one bit is on). 2013-11-12 22:38:59 +00:00
InstSimplify Add a test that large offsets on GEPs on 32 bits targets are handled correctly. 2013-09-28 21:27:49 +00:00
Internalize Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list". 2013-10-31 20:51:58 +00:00
IPConstantProp
JumpThreading Don't eliminate a partially redundant load if it's in a landing pad. 2013-10-21 04:09:17 +00:00
LCSSA
LICM Debug Info: In DIBuilder, the derived-from field of a DW_TAG_pointer_type 2013-10-05 01:43:03 +00:00
LoopDeletion
LoopIdiom Teach loop-idiom about address space pointer sizes 2013-09-11 05:09:42 +00:00
LoopRotate
LoopSimplify UpdatePHINodes in BasicBlockUtils should not crash on duplicate predecessors 2013-10-04 23:41:05 +00:00
LoopStrengthReduce Fix "existant" typos 2013-10-29 02:35:28 +00:00
LoopUnroll Implement TTI getUnrollingPreferences for PowerPC 2013-09-11 21:20:40 +00:00
LoopUnswitch
LoopVectorize Scalarize select vector arguments when extracted. 2013-11-04 20:36:06 +00:00
LowerAtomic
LowerExpectIntrinsic
LowerInvoke
LowerSwitch
Mem2Reg
MemCpyOpt
MergeFunc Teach MergeFunctions about address spaces 2013-11-10 01:44:37 +00:00
MetaRenamer
ObjCARC [objc-arc] Convert the one directional retain/release relation assert to a conditional check + fail. 2013-11-05 16:02:40 +00:00
PhaseOrdering
PruneEH
Reassociate
Reg2Mem
SampleProfile SampleProfileLoader pass. Initial setup. 2013-11-13 12:22:21 +00:00
ScalarRepl Teach scalarrepl about address spaces 2013-10-30 22:54:58 +00:00
SCCP
SimplifyCFG FoldBranchToCommonDest merges branches into a single branch with or/and of the condition. It has a heuristics for estimating when some of the dependencies are processed by out-of-order processors. This patch adds another rule to the heuristics that says that if the "BonusInstruction" that we speculatively execute is used by the condition of the second branch then it is okay to hoist it. This change exposes more opportunities for other passes to transform the code. It does not matter that much that we if-convert the code because the selectiondag builder splits or/and branches into multiple branches when profitable. 2013-11-12 22:37:16 +00:00
Sink
SLPVectorizer Add llvm/test/Transforms/SLPVectorizer/ARM/lit.local.cfg. Tests there require ARM in targets. 2013-10-29 02:46:00 +00:00
SROA SROA: Handle casts involving vectors of pointers and integer scalars. 2013-09-21 20:36:04 +00:00
StripSymbols
StructurizeCFG StructurizeCFG: Add dependency on LowerSwitch pass 2013-10-02 17:04:59 +00:00
TailCallElim
TailDup