mirror of
https://github.com/Zenithsiz/ftmemsim-valgrind.git
synced 2026-02-03 18:13:01 +00:00
We currently use a mix of <option> and <computeroutput> tags for command
line options. This commit changes them to all <option>.

Also make consistent how options with multiple names (eg. -h --help) are
shown.

Also, remove section describing --help and --version in Callgrind's chapter;
these aren't necessary and are presumably a hangover from when Callgrind was
a separate tool.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10659
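The tag substitution described in the commit message is mechanical, so it could in principle be scripted. The following is only a sketch, not the method used for this commit (the message does not say how the edits were made). The helper name `retag_options` is hypothetical, and "content starts with a dash" is an assumed heuristic for "looks like a command line option"; other uses of `<computeroutput>`, such as event names, are left untouched.

```python
import re

# Assumed heuristic: a <computeroutput> element whose content starts with "-"
# is a command line option (e.g. -g or --tool=cachegrind).
_OPTION_LIKE = re.compile(r"<computeroutput>(-[^<]*)</computeroutput>")


def retag_options(xml_text: str) -> str:
    """Rewrite option-like <computeroutput> elements as <option> elements."""
    return _OPTION_LIKE.sub(r"<option>\1</option>", xml_text)


if __name__ == "__main__":
    print(retag_options("<computeroutput>--tool=cachegrind</computeroutput> on the"))
    print(retag_options("<computeroutput>Ir</computeroutput> counts"))
```

A heuristic like this would still need manual review: it does nothing about the spelling of multi-name options such as `-h, --help`, which the commit also makes consistent.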
parent b8d3c302c0
commit ac7761261a
@@ -8,7 +8,7 @@
 <title>Cachegrind: a cache and branch-prediction profiler</title>

 <para>To use this tool, you must specify
-<computeroutput>--tool=cachegrind</computeroutput> on the
+<option>--tool=cachegrind</option> on the
 Valgrind command line.</para>

 <sect1 id="cg-manual.overview" xreflabel="Overview">
@@ -55,7 +55,7 @@ instruction executed, you can find out how many instructions are
 executed per line, which can be useful for traditional profiling.</para>

 <para>Branch profiling is not enabled by default. To use it, you must
-additionally specify <computeroutput>--branch-sim=yes</computeroutput>
+additionally specify <option>--branch-sim=yes</option>
 on the command line.</para>


@@ -64,7 +64,7 @@ on the command line.</para>

 <para>First off, as for normal Valgrind use, you probably want to
 compile with debugging info (the
-<computeroutput>-g</computeroutput> flag). But by contrast with
+<option>-g</option> flag). But by contrast with
 normal Valgrind use, you probably <command>do</command> want to turn
 optimisation on, since you should profile your program as it will
 be normally run.</para>
@@ -83,7 +83,7 @@ be normally run.</para>

 <para>Branch prediction statistics are not collected by default.
 To do so, add the flag
-<computeroutput>--branch-sim=yes</computeroutput>.
+<option>--branch-sim=yes</option>.
 </para>

 <para>This step should be done every time you want to collect
@@ -98,7 +98,7 @@ be normally run.</para>
 files to annotate can be specified manually, or manually on
 the command line, or "interesting" source files can be
 annotated automatically with the
-<computeroutput>--auto=yes</computeroutput> option. You can
+<option>--auto=yes</option> option. You can
 annotate C/C++ files or assembly language files equally
 easily.</para>

@@ -175,9 +175,9 @@ Cachegrind will fall back to using a default configuration (that
 of a model 3/4 Athlon). Cachegrind will tell you if this
 happens. You can manually specify one, two or all three levels
 (I1/D1/L2) of the cache from the command line using the
-<computeroutput>--I1</computeroutput>,
-<computeroutput>--D1</computeroutput> and
-<computeroutput>--L2</computeroutput> options.
+<option>--I1</option>,
+<option>--D1</option> and
+<option>--L2</option> options.
 For cache parameters to be valid for simulation, the number
 of sets (with associativity being the number of cache lines in
 each set) has to be a power of two.</para>
@@ -186,9 +186,9 @@ each set) has to be a power of two.</para>
 Cachegrind cannot automatically
 determine the cache configuration, so you will
 need to specify it with the
-<computeroutput>--I1</computeroutput>,
-<computeroutput>--D1</computeroutput> and
-<computeroutput>--L2</computeroutput> options.</para>
+<option>--I1</option>,
+<option>--D1</option> and
+<option>--L2</option> options.</para>


 <para>Other noteworthy behaviour:</para>
@@ -356,7 +356,7 @@ file:</para>
 <listitem>
 <para>To use an output file name other than the default
 <computeroutput>cachegrind.out</computeroutput>,
-use the <computeroutput>--cachegrind-out-file</computeroutput>
+use the <option>--cachegrind-out-file</option>
 switch.</para>
 </listitem>
 <listitem>
@@ -371,7 +371,7 @@ file:</para>
 on the output file name serves two purposes. Firstly, it means you
 don't have to rename old log files that you don't want to overwrite.
 Secondly, and more importantly, it allows correct profiling with the
-<computeroutput>--trace-children=yes</computeroutput> option of
+<option>--trace-children=yes</option> option of
 programs that spawn child processes.</para>

 </sect2>
@@ -465,8 +465,8 @@ configuration, or failing that, via defaults).</para>
 <para>Enables or disables collection of branch instruction and
 misprediction counts. By default this is disabled as it
 slows Cachegrind down by approximately 25%. Note that you
-cannot specify <computeroutput>--cache-sim=no</computeroutput>
-and <computeroutput>--branch-sim=no</computeroutput>
+cannot specify <option>--cache-sim=no</option>
+and <option>--branch-sim=no</option>
 together, as that would leave Cachegrind with no
 information to collect.</para>
 </listitem>
@@ -615,7 +615,7 @@ Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw file:function
 <listitem>
 <para>Events shown: the events shown, which is a subset of the events
 gathered. This can be adjusted with the
-<computeroutput>--show</computeroutput> option.</para>
+<option>--show</option> option.</para>
 </listitem>

 <listitem>
@@ -626,12 +626,12 @@ Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw file:function
 <computeroutput>Ir</computeroutput> counts, they will then be
 sorted by <computeroutput>I1mr</computeroutput> counts, and
 so on. This order can be adjusted with the
-<computeroutput>--sort</computeroutput> option.</para>
+<option>--sort</option> option.</para>

 <para>Note that this dictates the order the functions appear.
 It is <command>not</command> the order in which the columns
 appear; that is dictated by the "events shown" line (and can
-be changed with the <computeroutput>--show</computeroutput>
+be changed with the <option>--show</option>
 option).</para>
 </listitem>

@@ -644,7 +644,7 @@ Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw file:function
 <computeroutput>Ir</computeroutput> is chosen as the
 threshold event since it is the primary sort event. The
 threshold can be adjusted with the
-<computeroutput>--threshold</computeroutput>
+<option>--threshold</option>
 option.</para>
 </listitem>

@@ -655,7 +655,7 @@ Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw file:function

 <listitem>
 <para>Auto-annotation: whether auto-annotation was requested
-via the <computeroutput>--auto=yes</computeroutput>
+via the <option>--auto=yes</option>
 option. In this case no.</para>
 </listitem>

@@ -676,7 +676,7 @@ instructions that write to memory). The name
 and/or function name could not be determined from debugging
 information. If most of the entries have the form
 <computeroutput>???:???</computeroutput> the program probably
-wasn't compiled with <computeroutput>-g</computeroutput>. If any
+wasn't compiled with <option>-g</option>. If any
 code was invalidated (either due to self-modifying code or
 unloading of shared objects) its counts are aggregated into a
 single cost centre written as
@@ -688,7 +688,7 @@ and from libraries (eg. <filename>getc.c</filename>)</para>

 <para>There are two ways to annotate source files -- by choosing
 them manually, or with the
-<computeroutput>--auto=yes</computeroutput> option. To do it
+<option>--auto=yes</option> option. To do it
 manually, just specify the filenames as additional arguments to
 cg_annotate. For example, the
 output from running <filename>cg_annotate <filename>
@@ -736,7 +736,7 @@ terminal is clearly useful.)</para>
 (<computeroutput>User-annotated source</computeroutput>) as
 having been chosen manually for annotation. If the file was
 found in one of the directories specified with the
-<computeroutput>-I / --include</computeroutput> option, the directory
+<option>-I</option>/<option>--include</option> option, the directory
 and file are both given.</para>

 <para>Each line is annotated with its event counts. Events not
@@ -757,7 +757,7 @@ part of a file the shown code comes from, eg:</para>
 (figures and code for line 878)]]></programlisting>

 <para>The amount of context to show around annotated lines is
-controlled by the <computeroutput>--context</computeroutput>
+controlled by the <option>--context</option>
 option.</para>

 <para>To get automatic annotation, run
@@ -765,8 +765,8 @@ option.</para>
 cg_annotate will automatically annotate every source file it can
 find that is mentioned in the function-by-function summary.
 Therefore, the files chosen for auto-annotation are affected by
-the <computeroutput>--sort</computeroutput> and
-<computeroutput>--threshold</computeroutput> options. Each
+the <option>--sort</option> and
+<option>--threshold</option> options. Each
 source file is clearly marked (<computeroutput>Auto-annotated
 source</computeroutput>) as being chosen automatically. Any
 files that could not be found are mentioned at the end of the
@@ -785,9 +785,9 @@ usually compiled with debugging information, but the source files
 are often not present on a system. If a file is chosen for
 annotation <command>both</command> manually and automatically, it
 is marked as <computeroutput>User-annotated
-source</computeroutput>. Use the <computeroutput>-I /
---include</computeroutput> option to tell Valgrind where to look
-for source files if the filenames found from the debugging
+source</computeroutput>. Use the
+<option>-I</option>/<option>--include</option> option to tell Valgrind where
+to look for source files if the filenames found from the debugging
 information aren't specific enough.</para>

 <para>Beware that cg_annotate can take some time to digest large
@@ -839,27 +839,25 @@ cg_annotate.</para>
 <itemizedlist>

 <listitem>
-<para><computeroutput>-h, --help</computeroutput></para>
-<para><computeroutput>-v, --version</computeroutput></para>
+<para><option>-h --help</option></para>
+<para><option>-v --version</option></para>
 <para>Help and version, as usual.</para>
 </listitem>

 <listitem id="sort">
-<para><computeroutput>--sort=A,B,C</computeroutput> [default:
+<para><option>--sort=A,B,C</option> [default:
 order in
 <computeroutput>cachegrind.out.<pid></computeroutput>]</para>
 <para>Specifies the events upon which the sorting of the
 function-by-function entries will be based. Useful if you
 want to concentrate on eg. I cache misses
-(<computeroutput>--sort=I1mr,I2mr</computeroutput>), or D
-cache misses
-(<computeroutput>--sort=D1mr,D2mr</computeroutput>), or L2
-misses
-(<computeroutput>--sort=D2mr,I2mr</computeroutput>).</para>
+(<option>--sort=I1mr,I2mr</option>), or D cache misses
+(<option>--sort=D1mr,D2mr</option>), or L2 misses
+(<option>--sort=D2mr,I2mr</option>).</para>
 </listitem>

 <listitem id="show">
-<para><computeroutput>--show=A,B,C</computeroutput> [default:
+<para><option>--show=A,B,C</option> [default:
 all, using order in
 <computeroutput>cachegrind.out.<pid></computeroutput>]</para>
 <para>Specifies which events to show (and the column
@@ -869,7 +867,7 @@ cg_annotate.</para>
 </listitem>

 <listitem id="threshold">
-<para><computeroutput>--threshold=X</computeroutput>
+<para><option>--threshold=X</option>
 [default: 99%]</para>
 <para>Sets the threshold for the function-by-function
 summary. Functions are shown that account for more than X%
@@ -878,24 +876,23 @@ cg_annotate.</para>

 <para>Note: thresholds can be set for more than one of the
 events by appending any events for the
-<computeroutput>--sort</computeroutput> option with a colon
+<option>--sort</option> option with a colon
 and a number (no spaces, though). E.g. if you want to see
 the functions that cover 99% of L2 read misses and 99% of L2
 write misses, use this option:</para>
-<para><computeroutput>--sort=D2mr:99,D2mw:99</computeroutput></para>
+<para><option>--sort=D2mr:99,D2mw:99</option></para>
 </listitem>

 <listitem id="auto">
-<para><computeroutput>--auto=no</computeroutput> [default]</para>
-<para><computeroutput>--auto=yes</computeroutput></para>
+<para><option>--auto=no</option> [default]</para>
+<para><option>--auto=yes</option></para>
 <para>When enabled, automatically annotates every file that
 is mentioned in the function-by-function summary that can be
 found. Also gives a list of those that couldn't be found.</para>
 </listitem>

 <listitem id="context">
-<para><computeroutput>--context=N</computeroutput> [default:
-8]</para>
+<para><option>--context=N</option> [default: 8]</para>
 <para>Print N lines of context before and after each
 annotated line. Avoids printing large sections of source
 files that were not executed. Use a large number
@@ -903,9 +900,8 @@ cg_annotate.</para>
 </listitem>

 <listitem id="include">
-<para><computeroutput>-I<dir>,
---include=<dir></computeroutput> [default: empty
-string]</para>
+<para><option>-I<dir>, --include=<dir></option>
+[default: empty string]</para>
 <para>Adds a directory to the list in which to search for
 files. Multiple -I/--include options can be given to add
 multiple directories.</para>
@@ -1046,7 +1042,7 @@ cg_annotate issues warnings.</para>

 <listitem>
 <para>If you compile some files with
-<computeroutput>-g</computeroutput> and some without, some
+<option>-g</option> and some without, some
 events that take place in a file without debug info could be
 attributed to the last line of a file with debug info
 (whichever one gets placed before the non-debug-info file in
@@ -8,7 +8,7 @@


 <para>To use this tool, you must specify
-<computeroutput>--tool=callgrind</computeroutput> on the
+<option>--tool=callgrind</option> on the
 Valgrind command line.</para>

 <sect1 id="cl-manual.use" xreflabel="Overview">
@@ -61,7 +61,7 @@ of the profiling, two command line tools are provided:</para>
 </variablelist>

 <para>To use Callgrind, you must specify
-<computeroutput>--tool=callgrind</computeroutput> on the Valgrind
+<option>--tool=callgrind</option> on the Valgrind
 command line.</para>

 <sect2 id="cl-manual.functionality" xreflabel="Functionality">
@@ -498,8 +498,7 @@ callgrind.out.<emphasis>pid</emphasis>.<emphasis>part</emphasis>-<emphasis>threa
 <title>Command line option reference</title>

 <para>
-In the following, options are grouped into classes, in the same order as
-the output of <computeroutput>callgrind --help</computeroutput>.
+In the following, options are grouped into classes.
 </para>
 <para>
 Some options allow the specification of a function/symbol name, such as
@@ -513,30 +512,6 @@ shell. This feature is important especially for C++, as without wildcard
 usage, the function would have to be specified in full extent, including
 parameter signature. </para>

-<sect2 id="cl-manual.options.misc"
-xreflabel="Miscellaneous options">
-<title>Miscellaneous options</title>
-
-<variablelist id="cl.opts.list.misc">
-
-<varlistentry>
-<term><option>--help</option></term>
-<listitem>
-<para>Show summary of options. This is a short version of this
-manual section.</para>
-</listitem>
-</varlistentry>
-
-<varlistentry>
-<term><option>--version</option></term>
-<listitem>
-<para>Show version of callgrind.</para>
-</listitem>
-</varlistentry>
-
-</variablelist>
-</sect2>
-
 <sect2 id="cl-manual.options.creation"
 xreflabel="Dump creation options">
 <title>Dump creation options</title>
@@ -750,9 +725,9 @@ Also see <xref linkend="cl-manual.limits"/>.</para>
 option <xref linkend="opt.toggle-collect"/>. If you use this flag,
 collection
 state should be switched off at the beginning. Note that the
-specification of <computeroutput>--toggle-collect</computeroutput>
+specification of <option>--toggle-collect</option>
 implicitly sets
-<computeroutput>--collect-state=no</computeroutput>.</para>
+<option>--collect-state=no</option>.</para>
 <para>Collection state can be toggled also by inserting the client request
 <computeroutput><xref linkend="cr.toggle-collect"/>;</computeroutput>
 at the needed code positions.</para>
@@ -48,7 +48,7 @@ included below.</para>
 <variablelist remap="TP">

 <varlistentry>
-<term><option>-h, --help</option></term>
+<term><option>-h --help</option></term>
 <listitem>
 <para>Show summary of options.</para>
 </listitem>
@@ -49,7 +49,7 @@ included below.</para>
 <variablelist remap="TP">

 <varlistentry>
-<term><option>-h, --help</option></term>
+<term><option>-h --help</option></term>
 <listitem>
 <para>Show summary of options.</para>
 </listitem>
@@ -56,8 +56,8 @@ program with any extra supporting libraries.</para>
 on x86, amd64, ppc32 and ppc64, the overhead is 6 simple integer instructions
 and is probably undetectable except in tight loops.
 However, if you really wish to compile out the client requests, you can
-compile with <computeroutput>-DNVALGRIND</computeroutput> (analogous to
-<computeroutput>-DNDEBUG</computeroutput>'s effect on
+compile with <option>-DNVALGRIND</option> (analogous to
+<option>-DNDEBUG</option>'s effect on
 <computeroutput>assert()</computeroutput>).
 </para>

@@ -103,7 +103,7 @@ tool-specific macros).</para>
 once.</para>
 <para>
 Alternatively, for transparent self-modifying-code support,
-use<computeroutput>--smc-check=all</computeroutput>, or run
+use<option>--smc-check=all</option>, or run
 on ppc32/Linux or ppc64/Linux.
 </para>
 </listitem>
@@ -504,7 +504,7 @@ will honour only the first one.</para>

 <para>Figuring out what's going on given the dynamic nature of wrapping
 can be difficult. The
-<computeroutput>--trace-redir=yes</computeroutput> flag makes
+<option>--trace-redir=yes</option> flag makes
 this possible
 by showing the complete state of the redirection subsystem after
 every
@@ -536,10 +536,10 @@ sections. The active binding set is (conceptually) recomputed from
 the specifications, and all known symbol names, following any change
 to the specification set.</para>

-<para><computeroutput>--trace-redir=yes</computeroutput> shows the contents
+<para><option>--trace-redir=yes</option> shows the contents
 of both sets following any such event.</para>

-<para><computeroutput>-v</computeroutput> prints a line of text each
+<para><option>-v</option> prints a line of text each
 time an active specification is used for the first time.</para>

 <para>Hence for maximum debugging effectiveness you will need to use both
@@ -555,7 +555,7 @@ However, to make the implementation more robust, the two kinds
 of interception (wrapping vs replacement) are treated differently.
 </para>

-<para><computeroutput>--trace-redir=yes</computeroutput> shows
+<para><option>--trace-redir=yes</option> shows
 specifications and bindings for both
 replacement and wrapper functions. To differentiate the
 two, replacement bindings are printed using
@@ -113,20 +113,20 @@ already, if you intended to debug your program with GNU gdb, or some
 other debugger.</para>

 <para>If you are planning to use Memcheck: On rare
-occasions, compiler optimisations (at <computeroutput>-O2</computeroutput>
-and above, and sometimes <computeroutput>-O1</computeroutput>) have been
+occasions, compiler optimisations (at <option>-O2</option>
+and above, and sometimes <option>-O1</option>) have been
 observed to generate code which fools Memcheck into wrongly reporting
 uninitialised value errors, or missing uninitialised value errors. We have
 looked in detail into fixing this, and unfortunately the result is that
 doing so would give a further significant slowdown in what is already a slow
 tool. So the best solution is to turn off optimisation altogether. Since
 this often makes things unmanageably slow, a reasonable compromise is to use
-<computeroutput>-O</computeroutput>. This gets you the majority of the
+<option>-O</option>. This gets you the majority of the
 benefits of higher optimisation levels whilst keeping relatively small the
 chances of false positives or false negatives from Memcheck. Also, you
-should compile your code with <computeroutput>-Wall</computeroutput> because
+should compile your code with <option>-Wall</option> because
 it can identify some or all of the problems that Valgrind can miss at the
-higher optimisation levels. (Using <computeroutput>-Wall</computeroutput>
+higher optimisation levels. (Using <option>-Wall</option>
 is also a good idea in general.) All other tools (as far as we know) are
 unaffected by optimisation level.</para>

@@ -631,7 +631,7 @@ categories.</para>
 </varlistentry>

 <varlistentry id="opt.quiet" xreflabel="--quiet">
-<term><option>-q --quiet</option></term>
+<term><option>-q</option>, <option>--quiet</option></term>
 <listitem>
 <para>Run silently, and only print error messages. Useful if you
 are running regression tests or have some other automated test
@@ -640,7 +640,7 @@ categories.</para>
 </varlistentry>

 <varlistentry id="opt.verbose" xreflabel="--verbose">
-<term><option>-v --verbose</option></term>
+<term><option>-v</option>, <option>--verbose</option></term>
 <listitem>
 <para>Be more verbose. Gives extra information on various aspects
 of your program, such as: the shared objects loaded, the
@@ -1525,7 +1525,7 @@ following entry in <literal>~/.valgrindrc</literal>:</para>
 run. Without the <computeroutput>memcheck:</computeroutput>
 part, this will cause problems if you select other tools that
 don't understand
-<computeroutput>--leak-check=yes</computeroutput>.</para>
+<option>--leak-check=yes</option>.</para>

 </sect2>

@@ -1589,7 +1589,7 @@ able to cope with any POSIX-compliant use of signals.</para>
 <para>If you're using signals in clever ways (for example, catching
 SIGSEGV, modifying page state and restarting the instruction), you're
 probably relying on precise exceptions. In this case, you will need
-to use <computeroutput>--vex-iropt-precise-memory-exns=yes</computeroutput>.
+to use <option>--vex-iropt-precise-memory-exns=yes</option>.
 </para>

 <para>If your program dies as a result of a fatal core-dumping signal,
@@ -1961,7 +1961,7 @@ shipped.</para>
 <title>Warning Messages You Might See</title>

 <para>Most of these only appear if you run in verbose mode
-(enabled by <computeroutput>-v</computeroutput>):</para>
+(enabled by <option>-v</option>):</para>

 <itemizedlist>

@@ -44,13 +44,13 @@ documentation of Memcheck and the other tools, please read the User Manual.

 <para>Compile your program with <option>-g</option> to include debugging
 information so that Memcheck's error messages include exact line
-numbers. Using <computeroutput>-O0</computeroutput> is also a good
+numbers. Using <option>-O0</option> is also a good
 idea, if you can tolerate the slowdown. With
-<computeroutput>-O1</computeroutput> line numbers in error messages can
+<option>-O1</option> line numbers in error messages can
 be inaccurate, although generally speaking running Memcheck on code compiled
-at <computeroutput>-O1</computeroutput> works fairly well.
+at <option>-O1</option> works fairly well.
 Use of
-<computeroutput>-O2</computeroutput> and above is not recommended as
+<option>-O2</option> and above is not recommended as
 Memcheck occasionally reports uninitialised-value errors which don't
 really exist.</para>

@@ -17,8 +17,12 @@ xml to html markup transformations:

 <programlisting> --> <pre class="programlisting">
 <screen> --> <pre class="screen">
-<computeroutput> --> <tt class="computeroutput">
-<literal> --> <tt>
+<option> --> <code class="option">
+<filename> --> <code class="filename">
+<function> --> <code class="function">
+<literal> --> <code class="literal">
+<varname> --> <code class="varname">
+<computeroutput> --> <code class="computeroutput">
 <emphasis> --> <i>
 <command> --> <b class="command">
 <blockquote> --> <div class="blockquote">
@@ -8,7 +8,7 @@
 <title>DRD: a thread error detector</title>

 <para>To use this tool, you must specify
-<computeroutput>--tool=drd</computeroutput>
+<option>--tool=drd</option>
 on the Valgrind command line.</para>


@@ -653,7 +653,7 @@ The above report has the following meaning:
 displayed. For dynamically allocated data the allocation call
 stack is shown. For static variables and stack variables the
 allocation context is only shown when the option
-<computeroutput>--read-var-info=yes</computeroutput> has been
+<option>--read-var-info=yes</option> has been
 specified. Otherwise DRD will print <computeroutput>Allocation
 context: unknown</computeroutput>.
 </para>
@@ -6,7 +6,7 @@
 <title>BBV: an experimental basic block vector generation tool</title>

 <para>To use this tool, you must specify
-<computeroutput>--tool=exp-bbv</computeroutput> on the Valgrind
+<option>--tool=exp-bbv</option> on the Valgrind
 command line.</para>

 <sect1 id="bbv-manual.overview" xreflabel="Overview">
@@ -202,7 +202,7 @@ command line.</para>
 <para>
 The Basic Block Vector is dumped at fixed intervals. This
 is commonly done every 100 million instructions; the
-<computeroutput>--interval-size</computeroutput> option can be
+<option>--interval-size</option> option can be
 used to change this.
 </para>

@@ -252,7 +252,7 @@ T:18:45 :12:135353 :56:78 314:4324263]]></programlisting>
 BBV vectors will be different than those generated by other tools.
 In practice this does not seem to affect the accuracy of the
 SimPoint results. We do internally force the
-<computeroutput>--vex-guest-chase-thresh=0</computeroutput>
+<option>--vex-guest-chase-thresh=0</option>
 option to Valgrind which forces a more basic-block like
 behavior.
 </para>
@@ -9,7 +9,7 @@
 <title>Ptrcheck: an experimental heap, stack & global array overrun detector</title>

 <para>To use this tool, you must specify
-<computeroutput>--tool=exp-ptrcheck</computeroutput> on the Valgrind
+<option>--tool=exp-ptrcheck</option> on the Valgrind
 command line.</para>


@@ -161,7 +161,7 @@ possibly be a valid pointer.</para>
 <title>How Ptrcheck Works: Stack and Global Checks</title>

 <para>When a source file is compiled
-with <computeroutput>-g</computeroutput>, the compiler attaches DWARF3
+with <option>-g</option>, the compiler attaches DWARF3
 debugging information which describes the location of all stack and
 global arrays in the file.</para>

@@ -8,7 +8,7 @@
 <title>Helgrind: a thread error detector</title>

 <para>To use this tool, you must specify
-<computeroutput>--tool=helgrind</computeroutput> on the Valgrind
+<option>--tool=helgrind</option> on the Valgrind
 command line.</para>


@@ -7,7 +7,7 @@
 <title>Lackey: an example tool</title>

 <para>To use this tool, you must specify
-<computeroutput>--tool=lackey</computeroutput> on the Valgrind
+<option>--tool=lackey</option> on the Valgrind
 command line.</para>


@@ -26,7 +26,7 @@ over performance.</para>

 <listitem>
 <para>When command line option
-<computeroutput>--basic-counts=yes</computeroutput> is specified,
+<option>--basic-counts=yes</option> is specified,
 it prints the following statistics and information about the execution of
 the client program:</para>

@@ -38,7 +38,7 @@ over performance.</para>
 function in glibc's dynamic linker that resolves function
 references to shared objects.</para>
 <para>You can change the name of the function tracked with command line
-option <computeroutput>--fnname=<name></computeroutput>.</para>
+option <option>--fnname=<name></option>.</para>
 </listitem>

 <listitem>
@@ -72,7 +72,7 @@ over performance.</para>

 <listitem>
 <para>When command line option
-<computeroutput>--detailed-counts=yes</computeroutput> is
+<option>--detailed-counts=yes</option> is
 specified, a table is printed with counts of loads, stores and ALU
 operations for various types of operands.</para>

@@ -82,7 +82,7 @@ over performance.</para>

 <listitem>
 <para>When command line option
-<computeroutput>--trace-mem=yes</computeroutput> is
+<option>--trace-mem=yes</option> is
 specified, it prints out the size and address of almost every load and
 store made by the program. See the comments at the top of the file
 <computeroutput>lackey/lk_main.c</computeroutput> for details about
@@ -92,7 +92,7 @@ over performance.</para>

 <listitem>
 <para>When command line option
-<computeroutput>--trace-superblocks=yes</computeroutput> is
+<option>--trace-superblocks=yes</option> is
 specified, it prints out the address of every superblock
 (extended basic block) executed by the program. This is
 primarily of interest to Valgrind developers. See the comments at
@ -104,14 +104,14 @@ over performance.</para>
|
||||
</orderedlist>
|
||||
|
||||
<para>Note that Lackey runs quite slowly, especially when
|
||||
<computeroutput>--detailed-counts=yes</computeroutput> is specified.
|
||||
<option>--detailed-counts=yes</option> is specified.
|
||||
It could be made to run a lot faster by doing a slightly more
|
||||
sophisticated job of the instrumentation, but that would undermine
|
||||
its role as a simple example tool. Hence we have chosen not to do
|
||||
so.</para>
|
||||
|
||||
<para>Note also that <computeroutput>--trace-mem=yes</computeroutput>
|
||||
and <computeroutput>--trace-superblocks=yes</computeroutput> create
|
||||
<para>Note also that <option>--trace-mem=yes</option>
|
||||
and <option>--trace-superblocks=yes</option> create
|
||||
immense amounts of output. If you are saving the output in a file,
|
||||
you can eat up tens of gigabytes of disk space very quickly.
|
||||
As a result of printing out so much stuff, they also cause the program
|
||||
|
||||
@ -8,7 +8,7 @@
|
||||
<title>Massif: a heap profiler</title>
|
||||
|
||||
<para>To use this tool, you must specify
|
||||
<computeroutput>--tool=massif</computeroutput> on the Valgrind
|
||||
<option>--tool=massif</option> on the Valgrind
|
||||
command line.</para>
|
||||
|
||||
<sect1 id="ms-manual.overview" xreflabel="Overview">
|
||||
@ -54,7 +54,7 @@ which parts of your program are responsible for allocating the heap memory.
|
||||
|
||||
|
||||
<para>First off, as for the other Valgrind tools, you should compile with
|
||||
debugging info (the <computeroutput>-g</computeroutput> flag). It shouldn't
|
||||
debugging info (the <option>-g</option> flag). It shouldn't
|
||||
matter much what optimisation level you compile your program with, as this
|
||||
is unlikely to affect the heap memory usage.</para>
|
||||
|
||||
@ -188,7 +188,7 @@ For very short-run programs such as the example, most of the executed
|
||||
instructions involve the loading and dynamic linking of the program. The
|
||||
execution of <computeroutput>main</computeroutput> (and thus the heap
|
||||
allocations) only occur at the very end. For a short-running program like
|
||||
this, we can use the <computeroutput>--time-unit=B</computeroutput> option
|
||||
this, we can use the <option>--time-unit=B</option> option
|
||||
to specify that we want the time unit to instead be the number of bytes
|
||||
allocated/deallocated on the heap and stack(s).</para>
|
||||
|
||||
@ -232,7 +232,7 @@ taking snapshots for every heap allocation/deallocation, but as a program
|
||||
runs for longer, it takes snapshots less frequently. It also discards older
|
||||
snapshots as the program goes on; when it reaches the maximum number of
|
||||
snapshots (100 by default, although changeable with the
|
||||
<computeroutput>--max-snapshots</computeroutput> option) half of them are
|
||||
<option>--max-snapshots</option> option) half of them are
|
||||
deleted. This means that a reasonable number of snapshots are always
|
||||
maintained.</para>
|
||||
|
||||
@ -246,7 +246,7 @@ shortly. Detailed snapshots are represented in the graph by bars consisting
|
||||
of '@' characters. The text at the bottom show that 3 detailed
|
||||
snapshots were taken for this program (snapshots 9, 14 and 24). By default,
|
||||
every 10th snapshot is detailed, although this can be changed via the
|
||||
<computeroutput>--detailed-freq</computeroutput> option.</para>
|
||||
<option>--detailed-freq</option> option.</para>
|
||||
|
||||
<para>Finally, there is at most one <emphasis>peak</emphasis> snapshot. The
|
||||
peak snapshot is a detailed snapshot, and records the point where memory
|
||||
@ -260,7 +260,7 @@ at every allocation, i.e. it is <emphasis>not</emphasis> just the peak among
|
||||
the regular snapshots. However, recording the true peak is expensive, and
|
||||
so by default Massif records a peak whose size is within 1% of the size of
|
||||
the true peak. See the description of the
|
||||
<computeroutput>--peak-inaccuracy</computeroutput> option below for more
|
||||
<option>--peak-inaccuracy</option> option below for more
|
||||
details.</para>
|
||||
|
||||
<para>The following graph is from an execution of Konqueror, the KDE web
|
||||
@ -331,7 +331,7 @@ a small amount of information is recorded for each one:</para>
|
||||
|
||||
<listitem><para>The time it was taken. In this case, the time unit is
|
||||
bytes, due to the use of
|
||||
<computeroutput>--time-unit=B</computeroutput>.</para></listitem>
|
||||
<option>--time-unit=B</option>.</para></listitem>
|
||||
|
||||
<listitem><para>The total memory consumption at that point.</para></listitem>
|
||||
|
||||
@ -347,14 +347,14 @@ a small amount of information is recorded for each one:</para>
|
||||
The exact number of administrative bytes depends on the details of the
|
||||
allocator. By default Massif assumes 8 bytes per block, as can be seen
|
||||
from the example, but this number can be changed via the
|
||||
<computeroutput>--heap-admin</computeroutput> option.</para>
|
||||
<option>--heap-admin</option> option.</para>
|
||||
|
||||
<para>Second, allocators often round up the number of bytes asked for to a
|
||||
larger number. By default, if N bytes are asked for, Massif rounds N up
|
||||
to the nearest multiple of 8 that is equal to or greater than N. This is
|
||||
typical behaviour for allocators, and is required to ensure that elements
|
||||
within the block are suitably aligned. The rounding size can be changed
|
||||
with the <computeroutput>--alignment</computeroutput> option, although it
|
||||
with the <option>--alignment</option> option, although it
|
||||
cannot be less than 8, and must be a power of two.</para></listitem>
|
||||
|
||||
<listitem><para>The size of the stack(s). By default, stack profiling is
|
||||
@ -379,7 +379,7 @@ functions, and so all 9,000 useful bytes (which is 99.21% of all allocated
|
||||
bytes) go through them. But how were <function>malloc</function> and new
|
||||
called? At this point, every allocation so far has been due to line 21
|
||||
inside <function>main</function>, hence the second line in the tree. The
|
||||
<computeroutput>-></computeroutput> indicates that main (line 20) called
|
||||
<option>-></option> indicates that main (line 20) called
|
||||
<function>malloc</function>.</para>
|
||||
|
||||
<para>Let's see what the subsequent output shows happened next:</para>
|
||||
@ -491,7 +491,7 @@ only prints the details for code locations responsible for more than 1%.
|
||||
The entries that do not meet this threshold are aggregated. This avoids
|
||||
filling up the output with large numbers of unimportant entries. The
|
||||
thresholds can be changed with the
|
||||
<computeroutput>--threshold</computeroutput> option that both Massif and
|
||||
<option>--threshold</option> option that both Massif and
|
||||
ms_print support.</para>
|
||||
|
||||
</sect2>
|
||||
@ -617,7 +617,7 @@ operator new[](unsigned long, std::nothrow_t const&)
|
||||
<listitem>
|
||||
<para>Any direct heap allocation (i.e. a call to
|
||||
<function>malloc</function>, <function>new</function>, etc, or a call
|
||||
to a function name in a <computeroutput>--alloc-fn</computeroutput>
|
||||
to a function name in a <option>--alloc-fn</option>
|
||||
option) that occurs in a function specified by this option will be
|
||||
ignored. This is mostly useful for testing purposes. This option can
|
||||
be specified multiple times on the command line, to name multiple
|
||||
@ -632,7 +632,7 @@ operator new[](unsigned long, std::nothrow_t const&)
|
||||
</para>
|
||||
|
||||
<para>Note that overloaded C++ names must be written in full, as for
|
||||
<computeroutput>--alloc-fn</computeroutput> above.
|
||||
<option>--alloc-fn</option> above.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
@ -685,7 +685,7 @@ operator new[](unsigned long, std::nothrow_t const&)
|
||||
</term>
|
||||
<listitem>
|
||||
<para>Frequency of detailed snapshots. With
|
||||
<computeroutput>--detailed-freq=1</computeroutput>, every snapshot is
|
||||
<option>--detailed-freq=1</option>, every snapshot is
|
||||
detailed.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
@ -741,14 +741,14 @@ operator new[](unsigned long, std::nothrow_t const&)
|
||||
<itemizedlist>
|
||||
|
||||
<listitem>
|
||||
<para><computeroutput>-h, --help</computeroutput></para>
|
||||
<para><computeroutput>-v, --version</computeroutput></para>
|
||||
<para><option>-h --help</option></para>
|
||||
<para><option>-v --version</option></para>
|
||||
<para>Help and version, as usual.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para><option><![CDATA[--threshold=<m.n>]]></option> [default: 1.0]</para>
|
||||
<para>Same as Massif's <computeroutput>--threshold</computeroutput>, but
|
||||
<para>Same as Massif's <option>--threshold</option>, but
|
||||
applied after profiling rather than during.</para>
|
||||
</listitem>
|
||||
|
||||
|
||||
@ -157,7 +157,7 @@ difficult-to-diagnose crashes.</para>
|
||||
lost" and "possibly lost" blocks. When enabled, the leak detector also
|
||||
shows "reachable" and "indirectly lost" blocks. (In other words, it
|
||||
shows all blocks, except suppressed ones, so
|
||||
<computeroutput>--show-all</computeroutput> would be a better name for
|
||||
<option>--show-all</option> would be a better name for
|
||||
it.)</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
@ -764,12 +764,12 @@ LEAK SUMMARY:
|
||||
suppressed: 0 bytes in 0 blocks.
|
||||
]]></programlisting>
|
||||
|
||||
<para>If <computeroutput>--leak-check=full</computeroutput> is specified,
|
||||
<para>If <option>--leak-check=full</option> is specified,
|
||||
Memcheck will give details for each definitely lost or possibly lost block,
|
||||
including where it was allocated. (Actually, it merges results for all
|
||||
blocks that have the same category and sufficiently similar stack traces
|
||||
into a single "loss record". The
|
||||
<computeroutput>--leak-resolution</computeroutput> lets you control the
|
||||
<option>--leak-resolution</option> lets you control the
|
||||
meaning of "sufficiently similar".) It cannot tell you when or how or why
|
||||
the pointer to a leaked block was lost; you have to work that out for
|
||||
yourself. In general, you should attempt to ensure your programs do not
|
||||
@ -795,7 +795,7 @@ bytes in other blocks are indirectly lost because of this lost block.
|
||||
The loss records are not presented in any notable order, so the loss record
|
||||
numbers aren't particularly meaningful.</para>
|
||||
|
||||
<para>If you specify <computeroutput>--show-reachable=yes</computeroutput>,
|
||||
<para>If you specify <option>--show-reachable=yes</option>,
|
||||
reachable and indirectly lost blocks will also be shown, as the following
|
||||
two examples show.</para>
|
||||
|
||||
@ -1289,7 +1289,7 @@ arguments.</para>
|
||||
|
||||
<listitem>
|
||||
<para><varname>VALGRIND_DO_LEAK_CHECK</varname>: does a full memory leak
|
||||
check (like <computeroutput>--leak-check=full</computeroutput> right now.
|
||||
check (like <option>--leak-check=full</option> right now.
|
||||
This is useful for incrementally checking for leaks between arbitrary
|
||||
places in the program's execution. It has no return value.</para>
|
||||
</listitem>
|
||||
@ -1297,7 +1297,7 @@ arguments.</para>
|
||||
<listitem>
|
||||
<para><varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>: like
|
||||
<varname>VALGRIND_DO_LEAK_CHECK</varname>, except it produces only a leak
|
||||
summary (like <computeroutput>--leak-check=summary</computeroutput>).
|
||||
summary (like <option>--leak-check=summary</option>).
|
||||
It has no return value.</para>
|
||||
</listitem>
|
||||
|
||||
@ -1580,7 +1580,7 @@ the same <computeroutput>mpicc</computeroutput> you use to build the
|
||||
MPI application you want to debug. By default, Valgrind tries
|
||||
<computeroutput>mpicc</computeroutput>, but you can specify a
|
||||
different one by using the configure-time flag
|
||||
<computeroutput>--with-mpicc=</computeroutput>. Currently the
|
||||
<option>--with-mpicc=</option>. Currently the
|
||||
wrappers are only buildable with
|
||||
<computeroutput>mpicc</computeroutput>s which are based on GNU
|
||||
<computeroutput>gcc</computeroutput> or Intel's
|
||||
@ -1704,7 +1704,7 @@ valgrind MPI wrappers 16386: Try MPIWRAP_DEBUG=help for possible options
|
||||
</itemizedlist>
|
||||
|
||||
<para> If you want to use Valgrind's XML output facility
|
||||
(<computeroutput>--xml=yes</computeroutput>), you should pass
|
||||
(<option>--xml=yes</option>), you should pass
|
||||
<computeroutput>quiet</computeroutput> in
|
||||
<computeroutput>MPIWRAP_DEBUG</computeroutput> so as to get rid of any
|
||||
extraneous printing from the wrappers.</para>
|
||||
|
||||
@ -87,7 +87,7 @@ trick, one which I assume the
|
||||
support.</para>
|
||||
|
||||
<para><filename>valgrind.so</filename> is linked with the
|
||||
<computeroutput>-z initfirst</computeroutput> flag, which
|
||||
<option>-z initfirst</option> flag, which
|
||||
requests that its initialisation code is run before that of any
|
||||
other object in the executable image. When this happens,
|
||||
valgrind gains control. The real CPU becomes "trapped" in
|
||||
@ -489,8 +489,8 @@ result:</para>
|
||||
entirely.</para>
|
||||
|
||||
<para>To find out which glibc symbols are used by Valgrind,
|
||||
reinstate the link flags <computeroutput>-nostdlib
|
||||
-Wl,-no-undefined</computeroutput>. This causes linking to
|
||||
reinstate the link flags <option>-nostdlib
|
||||
-Wl,-no-undefined</option>. This causes linking to
|
||||
fail, but will tell you what you depend on. I have mostly,
|
||||
but not entirely, got rid of the glibc dependencies; what
|
||||
remains is, IMO, fairly harmless. AFAIK the current
|
||||
|
||||
@ -8,7 +8,7 @@
|
||||
<title>Nulgrind: the minimal Valgrind tool</title>
|
||||
|
||||
<para>To use this tool, you must specify
|
||||
<computeroutput>--tool=none</computeroutput> on the Valgrind
|
||||
<option>--tool=none</option> on the Valgrind
|
||||
command line.</para>
|
||||
|
||||
<sect1 id="ms-manual.overview" xreflabel="Overview">
|
||||
|
||||