<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Chapter 19. Profile Mode</title><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><meta name="keywords" content="C++, library, profile" /><meta name="keywords" content="ISO C++, library" /><meta name="keywords" content="ISO C++, runtime, library" /><link rel="home" href="../index.html" title="The GNU C++ Library" /><link rel="up" href="extensions.html" title="Part III. Extensions" /><link rel="prev" href="parallel_mode_test.html" title="Testing" /><link rel="next" href="profile_mode_design.html" title="Design" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Chapter 19. Profile Mode</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="parallel_mode_test.html">Prev</a> </td><th width="60%" align="center">Part III.
Extensions
</th><td width="20%" align="right"> <a accesskey="n" href="profile_mode_design.html">Next</a></td></tr></table><hr /></div><div class="chapter"><div class="titlepage"><div><div><h2 class="title"><a id="manual.ext.profile_mode"></a>Chapter 19. Profile Mode</h2></div></div></div><div class="toc"><p><strong>Table of Contents</strong></p><dl class="toc"><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.intro">Intro</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.using">Using the Profile Mode</a></span></dt><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.tuning">Tuning the Profile Mode</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_design.html">Design</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.wrapper">Wrapper Model</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.instrumentation">Instrumentation</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.rtlib">Run Time Behavior</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.analysis">Analysis and Diagnostics</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.cost-model">Cost Model</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.reports">Reports</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.testing">Testing</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_api.html">Extensions for Custom Containers</a></span></dt><dt><span class="section"><a href="profile_mode_cost_model.html">Empirical Cost Model</a></span></dt><dt><span class="section"><a 
href="profile_mode_impl.html">Implementation Issues</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.stack">Stack Traces</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.symbols">Symbolization of Instruction Addresses</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.concurrency">Concurrency</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.stdlib-in-proflib">Using the Standard Library in the Instrumentation Implementation</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.malloc-hooks">Malloc Hooks</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.construction-destruction">Construction and Destruction of Global Objects</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_devel.html">Developer Information</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_devel.html#manual.ext.profile_mode.developer.bigpic">Big Picture</a></span></dt><dt><span class="section"><a href="profile_mode_devel.html#manual.ext.profile_mode.developer.howto">How To Add A Diagnostic</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html">Diagnostics</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.template">Diagnostic Template</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.containers">Containers</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_too_small">Hashtable Too Small</a></span></dt><dt><span class="section"><a 
href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_too_large">Hashtable Too Large</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.inefficient_hash">Inefficient Hash</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_too_small">Vector Too Small</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_too_large">Vector Too Large</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_to_hashtable">Vector to Hashtable</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_to_vector">Hashtable to Vector</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_to_list">Vector to List</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.list_to_vector">List to Vector</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.list_to_slist">List to Forward List (Slist)</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.assoc_ord_to_unord">Ordered to Unordered Associative Container</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.algorithms">Algorithms</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.algorithms.sort">Sort Algorithm Performance</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality">Data Locality</a></span></dt><dd><dl><dt><span class="section"><a 
href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality.sw_prefetch">Need Software Prefetch</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality.linked">Linked Structure Locality</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread">Multithreaded Data Access</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread.ddtest">Data Dependence Violations at Container Level</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread.false_share">False Sharing</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.statistics">Statistics</a></span></dt></dl></dd><dt><span class="bibliography"><a href="profile_mode.html#profile_mode.biblio">Bibliography</a></span></dt></dl></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="manual.ext.profile_mode.intro"></a>Intro</h2></div></div></div><p>
<span class="emphasis"><em>Goal: </em></span>Give performance improvement advice based on
recognition of suboptimal usage patterns of the standard library.
</p><p>
<span class="emphasis"><em>Method: </em></span>Wrap the standard library code. Insert
calls to an instrumentation library to record the internal state of
various components at interesting entry/exit points to/from the standard
library. Process the trace, recognize suboptimal patterns, and give advice.
For details, see the
<a class="link" href="https://ieeexplore.ieee.org/document/4907670/" target="_top">Perflint
paper presented at CGO 2009</a>.
</p><p>
<span class="emphasis"><em>Strengths: </em></span>
</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
Unintrusive solution. The application code does not require any
modification.
</p></li><li class="listitem"><p> The advice is call-context sensitive, and can therefore
precisely identify interesting dynamic performance behavior.
</p></li><li class="listitem"><p>
The overhead model is pay-per-view. When you turn off a diagnostic class
at compile time, its overhead disappears.
</p></li></ul></div><p>
</p><p>
<span class="emphasis"><em>Drawbacks: </em></span>
</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
You must recompile the application code with custom options.
</p></li><li class="listitem"><p>You must run the application on representative input.
The advice is input dependent.
</p></li><li class="listitem"><p>
The execution time will increase, in some cases severalfold.
</p></li></ul></div><p>
</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.using"></a>Using the Profile Mode</h3></div></div></div><p>
This is the anticipated common workflow for program <code class="code">foo.cc</code>:
</p><pre class="programlisting">
$ cat foo.cc
#include &lt;vector&gt;
int main() {
  std::vector&lt;int&gt; v;
  for (int k = 0; k &lt; 1024; ++k) v.insert(v.begin(), k);
}

$ g++ -D_GLIBCXX_PROFILE foo.cc
$ ./a.out
$ cat libstdcxx-profile.txt
vector-to-list: improvement = 5: call stack = 0x804842c ...
    : advice = change std::vector to std::list
vector-size: improvement = 3: call stack = 0x804842c ...
    : advice = change initial container size from 0 to 1024
</pre><p>
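The two pieces of advice in this report can be tried independently. The following is a minimal sketch, not part of the profile mode itself: the helper names <code class="code">fill_list_front</code> and <code class="code">fill_vector_reserved</code> are hypothetical, and reading "change initial container size" as a call to <code class="code">reserve</code> is an assumption about how one might act on that advice.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: acts on "change std::vector to std::list".
// push_front on a list is constant time, so no elements are shifted.
std::list<int> fill_list_front(int n) {
  assert(n >= 0);
  std::list<int> l;
  for (int k = 0; k < n; ++k) l.push_front(k);
  return l;
}

// Hypothetical helper: acts on "change initial container size from 0 to n".
// ASSUMPTION: the advice is applied with reserve(), which allocates once up
// front. Front insertion still shifts the elements already stored.
std::vector<int> fill_vector_reserved(int n) {
  assert(n >= 0);
  std::vector<int> v;
  v.reserve(static_cast<std::size_t>(n));
  for (int k = 0; k < n; ++k) v.insert(v.begin(), k);
  return v;
}
```

<p>Both helpers produce the same element order as the loop in <code class="code">foo.cc</code>; only the cost of building the container changes.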
</p><p>
Anatomy of a warning:
</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
Warning id. This is a short descriptive string for the class
that this warning belongs to, e.g., "vector-to-list".
</p></li><li class="listitem"><p>
Estimated improvement. This is an approximation of the benefit expected
from implementing the change suggested by the warning. It is given on
a log10 scale. Negative values mean that the alternative would actually
do worse than the current choice.
In the example above, 5 comes from the fact that the overhead of
inserting at the beginning of a vector vs. a list is around 1024 * 1024 / 2,
which is on the order of 10<sup>5</sup>. The improvement from setting the initial size to
1024 is on the order of 10<sup>3</sup>, since the overhead of dynamic resizing is
linear in this case.
</p></li><li class="listitem"><p>
Call stack. Currently, the addresses are printed without
symbol name or code location attribution.
Users are expected to postprocess the output using, for instance, addr2line.
</p></li><li class="listitem"><p>
The warning message. For some warnings, this is static text, e.g.,
"change vector to list". For other warnings, such as the vector-size warning above,
the message contains numeric advice, e.g., the suggested initial size
of the vector.
</p></li></ul></div><p>
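The two magnitudes quoted for the estimated improvement can be checked with simple counting. The sketch below is a back-of-envelope illustration, not the profile mode's actual cost model: the function names are hypothetical, and the capacity-doubling growth policy in <code class="code">doubling_copies</code> is an assumption made for the illustration.</p>

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: element shifts caused by inserting n elements one at
// a time at the front of a vector. The k-th insertion moves the k elements
// already stored, so the total is 0 + 1 + ... + (n - 1) = n * (n - 1) / 2.
long front_insert_shifts(long n) {
  assert(n >= 0);
  long shifts = 0;
  for (long k = 0; k < n; ++k) shifts += k;
  return shifts;
}

// Hypothetical helper: element copies caused by reallocation while a vector
// grows from empty to n elements, ASSUMING the capacity doubles when full.
long doubling_copies(long n) {
  assert(n >= 0);
  long size = 0, capacity = 0, copies = 0;
  for (long k = 0; k < n; ++k) {
    if (size == capacity) {
      copies += size;  // existing elements move to the new buffer
      capacity = (capacity == 0) ? 1 : capacity * 2;
    }
    ++size;
  }
  return copies;
}

// Order of magnitude of a positive count on a log10 scale, rounded down.
int magnitude(long count) {
  assert(count > 0);
  return static_cast<int>(std::floor(std::log10(static_cast<double>(count))));
}
```

<p>With n = 1024 the first count is 523776 and the second is 1023, which round down to 5 and 3 on a log10 scale, consistent with the improvement values in the sample report.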
</p><p>Three files are generated. <code class="code">libstdcxx-profile.txt</code>
contains human readable advice. <code class="code">libstdcxx-profile.raw</code>
contains implementation specific data about each diagnostic.
Its format is not documented, but it is sufficient to generate
all the advice given in <code class="code">libstdcxx-profile.txt</code>. The advantage
of keeping this raw format is that traces from multiple executions can
be aggregated simply by concatenating the raw traces. We intend to
offer an external utility program that can issue advice from a trace.
<code class="code">libstdcxx-profile.conf.out</code> lists the actual diagnostic
parameters used. To alter parameters, edit this file and rename it to
<code class="code">libstdcxx-profile.conf</code>.
</p><p>Advice is given regardless of whether the transformation is valid.
For instance, we advise changing a map to an unordered_map even if the
application semantics require that data be ordered.
We believe such warnings can help users understand the performance
behavior of their application better, which can lead to changes
at a higher abstraction level.
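</p><p>To see why such advice may not be directly applicable, consider code that relies on iteration order. The following sketch is only an illustration; the helper names <code class="code">keys_in_iteration_order</code>, <code class="code">map_keys</code> and <code class="code">unordered_map_keys</code> are hypothetical and the sample data is made up.</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: collect the keys of an associative container in the
// order in which its iterators visit them.
template <typename Container>
std::vector<int> keys_in_iteration_order(const Container& c) {
  std::vector<int> keys;
  for (const auto& entry : c) keys.push_back(entry.first);
  assert(keys.size() == c.size());
  return keys;
}

// std::map visits keys in sorted order regardless of insertion order.
std::vector<int> map_keys() {
  std::map<int, std::string> m;
  m[3] = "c"; m[1] = "a"; m[2] = "b";
  return keys_in_iteration_order(m);
}

// std::unordered_map visits the same keys, but in an unspecified order.
std::vector<int> unordered_map_keys() {
  std::unordered_map<int, std::string> m;
  m[3] = "c"; m[1] = "a"; m[2] = "b";
  return keys_in_iteration_order(m);
}
```

<p>A caller that depends on the sorted sequence returned by <code class="code">map_keys</code> cannot switch containers without also sorting the keys itself, which is the kind of higher-level change the advice is meant to prompt.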
</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.tuning"></a>Tuning the Profile Mode</h3></div></div></div><p>Compile time switches and environment variables (see also file
profiler.h). Unless specified otherwise, they can be set at compile time
using -D_&lt;name&gt; or by setting variable &lt;name&gt;
in the environment where the program is run, before starting execution.
</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
<code class="code">_GLIBCXX_PROFILE_NO_&lt;diagnostic&gt;</code>:
disable specific diagnostics.
See section Diagnostics for possible values.
(Environment variables not supported.)
</p></li><li class="listitem"><p>
<code class="code">_GLIBCXX_PROFILE_TRACE_PATH_ROOT</code>: set an alternative root
path for the output files.
</p></li><li class="listitem"><p><code class="code">_GLIBCXX_PROFILE_MAX_WARN_COUNT</code>: set it to the maximum
number of warnings desired. The default value is 10.</p></li><li class="listitem"><p>
<code class="code">_GLIBCXX_PROFILE_MAX_STACK_DEPTH</code>: if set to 0,
the advice will
be collected and reported for the program as a whole, and not for each
call context.
This could also be used in continuous regression tests, where you
just need to know whether there is a regression or not.
The default value is 32.
</p></li><li class="listitem"><p>
<code class="code">_GLIBCXX_PROFILE_MEM_PER_DIAGNOSTIC</code>:
set a limit on how much memory to use for the accounting tables for each
diagnostic type. When this limit is reached, new events are ignored
until the memory usage decreases under the limit. Generally, this means
that newly created containers will not be instrumented until some
live containers are deleted. The default is 128 MB.
</p></li><li class="listitem"><p>
<code class="code">_GLIBCXX_PROFILE_NO_THREADS</code>:
Make the library not use threads. If thread local storage (TLS) is not
available, you will get a preprocessor error asking you to set
-D_GLIBCXX_PROFILE_NO_THREADS if your program is single-threaded.
Multithreaded execution without TLS is not supported.
(Environment variable not supported.)
</p></li><li class="listitem"><p>
<code class="code">_GLIBCXX_HAVE_EXECINFO_H</code>:
This name should be defined automatically at library configuration time.
If your library was configured without <code class="code">execinfo.h</code>, but
you have it in your include path, you can define it explicitly. Without
it, advice is collected for the program as a whole, and not for each
call context.
(Environment variable not supported.)
</p></li></ul></div><p>
</p></div></div><div class="bibliography"><div class="titlepage"><div><div><h2 class="title"><a id="profile_mode.biblio"></a>Bibliography</h2></div></div></div><div class="biblioentry"><a id="id-1.3.5.6.9.2"></a><p><span class="citetitle"><em class="citetitle">
Perflint: A Context Sensitive Performance Advisor for C++ Programs
</em>. </span><span class="author"><span class="firstname">Lixia</span> <span class="surname">Liu</span>. </span><span class="author"><span class="firstname">Silvius</span> <span class="surname">Rus</span>. </span><span class="copyright">Copyright © 2009 . </span><span class="publisher"><span class="publishername">
Proceedings of the 2009 International Symposium on Code Generation
and Optimization
. </span></span></p></div></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="parallel_mode_test.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="extensions.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="profile_mode_design.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Testing </td><td width="20%" align="center"><a accesskey="h" href="../index.html">Home</a></td><td width="40%" align="right" valign="top"> Design</td></tr></table></div></body></html>