mirror of
https://github.com/Zenithsiz/ftmemsim-valgrind.git
synced 2026-02-03 18:13:01 +00:00
- There were detailed descriptions of all the tools in the Quick Start Guide, the Manual introduction, and the start of each tool chapter. To avoid duplication/overlap, I removed these altogether from the Quick Start Guide, and shortened them in the intro. - Improved the description of what errors Memcheck can find. - Made all tool chapters start with "Overview" section, for consistency. - Made the "run with --tool=XXX" bit consistent in each tool chapter. - Made all tool chapter titles match the description given when running them. - Added BBV to the User Manual intro. - Generally clarified, updated, and future-proofed various bits of text in the Quick Start Guide and User Manual introduction. Also: - Changed Nulgrind's start-up description to "the minimal Valgrind tool". - Fixed some punctuation in the usage message. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10652
781 lines
34 KiB
XML
781 lines
34 KiB
XML
<?xml version="1.0"?> <!-- -*- sgml -*- -->
|
|
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
|
|
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
|
|
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
|
|
|
|
|
|
<chapter id="ms-manual" xreflabel="Massif: a heap profiler">
|
|
<title>Massif: a heap profiler</title>
|
|
|
|
<para>To use this tool, you must specify
|
|
<computeroutput>--tool=massif</computeroutput> on the Valgrind
|
|
command line.</para>
|
|
|
|
<sect1 id="ms-manual.overview" xreflabel="Overview">
|
|
<title>Overview</title>
|
|
|
|
<para>Massif is a heap profiler. It measures how much heap memory your
|
|
program uses. This includes both the useful space, and the extra bytes
|
|
allocated for book-keeping and alignment purposes. It can also
|
|
measure the size of your program's stack(s), although it does not do so by
|
|
default.</para>
|
|
|
|
<para>Heap profiling can help you reduce the amount of memory your program
|
|
uses. On modern machines with virtual memory, this provides the following
|
|
benefits:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem><para>It can speed up your program -- a smaller
|
|
program will interact better with your machine's caches and
|
|
avoid paging.</para></listitem>
|
|
|
|
<listitem><para>If your program uses lots of memory, it will
|
|
reduce the chance that it exhausts your machine's swap
|
|
space.</para></listitem>
|
|
</itemizedlist>
|
|
|
|
<para>Also, there are certain space leaks that aren't detected by
|
|
traditional leak-checkers, such as Memcheck's. That's because
|
|
the memory isn't ever actually lost -- a pointer remains to it --
|
|
but it's not in use. Programs that have leaks like this can
|
|
unnecessarily increase the amount of memory they are using over
|
|
time. Massif can help identify these leaks.</para>
|
|
|
|
<para>Importantly, Massif tells you not only how much heap memory your
|
|
program is using, it also gives very detailed information that indicates
|
|
which parts of your program are responsible for allocating the heap memory.
|
|
</para>
|
|
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="ms-manual.using" xreflabel="Using Massif">
|
|
<title>Using Massif</title>
|
|
|
|
|
|
<para>First off, as for the other Valgrind tools, you should compile with
|
|
debugging info (the <computeroutput>-g</computeroutput> flag). It shouldn't
|
|
matter much what optimisation level you compile your program with, as this
|
|
is unlikely to affect the heap memory usage.</para>
|
|
|
|
<para>Then, to gather heap profiling information about the program
|
|
<computeroutput>prog</computeroutput>, type:</para>
|
|
<screen><![CDATA[
|
|
% valgrind --tool=massif prog
|
|
]]></screen>
|
|
|
|
<para>The program will execute (slowly). Upon completion, no summary
|
|
statistics are printed to Valgrind's commentary; all of Massif's profiling
|
|
data is written to a file. By default, this file is called
|
|
<filename>massif.out.<pid></filename>, where
|
|
<filename><pid></filename> is the process ID.</para>
|
|
|
|
<para>To see the information gathered by Massif in an easy-to-read form, use
|
|
the ms_print script. If the output file's name is
|
|
<filename>massif.out.12345</filename>, type:</para>
|
|
<screen><![CDATA[
|
|
% ms_print massif.out.12345]]></screen>
|
|
|
|
<para>ms_print will produce (a) a graph showing the memory consumption over
|
|
the program's execution, and (b) detailed information about the responsible
|
|
allocation sites at various points in the program, including the point of
|
|
peak memory allocation. The use of a separate script for presenting the
|
|
results is deliberate: it separates the data gathering from its
|
|
presentation, and means that new methods of presenting the data can be added in
|
|
the future.</para>
|
|
|
|
<sect2 id="ms-manual.anexample" xreflabel="An Example">
|
|
<title>An Example Program</title>
|
|
|
|
<para>An example will make things clear. Consider the following C program
|
|
(annotated with line numbers) which allocates a number of different blocks
|
|
on the heap.</para>
|
|
|
|
<screen><![CDATA[
|
|
1 #include <stdlib.h>
|
|
2
|
|
3 void g(void)
|
|
4 {
|
|
5 malloc(4000);
|
|
6 }
|
|
7
|
|
8 void f(void)
|
|
9 {
|
|
10 malloc(2000);
|
|
11 g();
|
|
12 }
|
|
13
|
|
14 int main(void)
|
|
15 {
|
|
16 int i;
|
|
17 int* a[10];
|
|
18
|
|
19 for (i = 0; i < 10; i++) {
|
|
20 a[i] = malloc(1000);
|
|
21 }
|
|
22
|
|
23 f();
|
|
24
|
|
25 g();
|
|
26
|
|
27 for (i = 0; i < 10; i++) {
|
|
28 free(a[i]);
|
|
29 }
|
|
30
|
|
31 return 0;
|
|
32 }
|
|
]]></screen>
|
|
|
|
</sect2>
|
|
|
|
|
|
<sect2 id="ms-manual.theoutputpreamble" xreflabel="The Output Preamble">
|
|
<title>The Output Preamble</title>
|
|
|
|
<para>After running this program under Massif, the first part of ms_print's
|
|
output contains a preamble which just states how the program, Massif and
|
|
ms_print were each invoked:</para>
|
|
|
|
<screen><![CDATA[
|
|
--------------------------------------------------------------------------------
|
|
Command: example
|
|
Massif arguments: (none)
|
|
ms_print arguments: massif.out.12797
|
|
--------------------------------------------------------------------------------
|
|
]]></screen>
|
|
|
|
</sect2>
|
|
|
|
|
|
<sect2 id="ms-manual.theoutputgraph" xreflabel="The Output Graph">
|
|
<title>The Output Graph</title>
|
|
|
|
<para>The next part is the graph that shows how memory consumption occurred
|
|
as the program executed:</para>
|
|
|
|
<screen><![CDATA[
|
|
KB
|
|
19.63^ #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| #
|
|
| :#
|
|
| :#
|
|
| :#
|
|
0 +----------------------------------------------------------------------->ki 0 113.4
|
|
|
|
|
|
Number of snapshots: 25
|
|
Detailed snapshots: [9, 14 (peak), 24]
|
|
]]></screen>
|
|
|
|
<para>Why is most of the graph empty, with only a couple of bars at the very
|
|
end? By default, Massif uses "instructions executed" as the unit of time.
|
|
For very short-run programs such as the example, most of the executed
|
|
instructions involve the loading and dynamic linking of the program. The
|
|
execution of <computeroutput>main</computeroutput> (and thus the heap
|
|
allocations) only occur at the very end. For a short-running program like
|
|
this, we can use the <computeroutput>--time-unit=B</computeroutput> option
|
|
to specify that we want the time unit to instead be the number of bytes
|
|
allocated/deallocated on the heap and stack(s).</para>
|
|
|
|
<para>If we re-run the program under Massif with this option, and then
|
|
re-run ms_print, we get this more useful graph:</para>
|
|
|
|
<screen><![CDATA[
|
|
19.63^ ###
|
|
| #
|
|
| # ::
|
|
| # : :::
|
|
| :::::::::# : : ::
|
|
| : # : : : ::
|
|
| : # : : : : :::
|
|
| : # : : : : : ::
|
|
| ::::::::::: # : : : : : : :::
|
|
| : : # : : : : : : : ::
|
|
| ::::: : # : : : : : : : : ::
|
|
| @@@: : : # : : : : : : : : : @
|
|
| ::@ : : : # : : : : : : : : : @
|
|
| :::: @ : : : # : : : : : : : : : @
|
|
| ::: : @ : : : # : : : : : : : : : @
|
|
| ::: : : @ : : : # : : : : : : : : : @
|
|
| :::: : : : @ : : : # : : : : : : : : : @
|
|
| ::: : : : : @ : : : # : : : : : : : : : @
|
|
| :::: : : : : : @ : : : # : : : : : : : : : @
|
|
| ::: : : : : : : @ : : : # : : : : : : : : : @
|
|
0 +----------------------------------------------------------------------->KB 0 29.48
|
|
|
|
Number of snapshots: 25
|
|
Detailed snapshots: [9, 14 (peak), 24]
|
|
]]></screen>
|
|
|
|
<para>Each vertical bar represents a snapshot, i.e. a measurement of the
|
|
memory usage at a certain point in time. If the next snapshot is more than
|
|
one column away, a horizontal line of characters is drawn from the top of
|
|
the snapshot to just before the next snapshot column. The text at the
|
|
bottom show that 25 snapshots were taken for this program, which is one per
|
|
heap allocation/deallocation, plus a couple of extras. Massif starts by
|
|
taking snapshots for every heap allocation/deallocation, but as a program
|
|
runs for longer, it takes snapshots less frequently. It also discards older
|
|
snapshots as the program goes on; when it reaches the maximum number of
|
|
snapshots (100 by default, although changeable with the
|
|
<computeroutput>--max-snapshots</computeroutput> option) half of them are
|
|
deleted. This means that a reasonable number of snapshots are always
|
|
maintained.</para>
|
|
|
|
<para>Most snapshots are <emphasis>normal</emphasis>, and only basic
|
|
information is recorded for them. Normal snapshots are represented in the
|
|
graph by bars consisting of ':' characters.</para>
|
|
|
|
<para>Some snapshots are <emphasis>detailed</emphasis>. Information about
|
|
where allocations happened are recorded for these snapshots, as we will see
|
|
shortly. Detailed snapshots are represented in the graph by bars consisting
|
|
of '@' characters. The text at the bottom show that 3 detailed
|
|
snapshots were taken for this program (snapshots 9, 14 and 24). By default,
|
|
every 10th snapshot is detailed, although this can be changed via the
|
|
<computeroutput>--detailed-freq</computeroutput> option.</para>
|
|
|
|
<para>Finally, there is at most one <emphasis>peak</emphasis> snapshot. The
|
|
peak snapshot is a detailed snapshot, and records the point where memory
|
|
consumption was greatest. The peak snapshot is represented in the graph by
|
|
a bar consisting of '#' characters. The text at the bottom shows
|
|
that snapshot 14 was the peak. Note that for tiny programs that never
|
|
deallocate heap memory, Massif will not record a peak snapshot.</para>
|
|
|
|
<para>Some more details about the peak: the peak is determined by looking
|
|
at every allocation, i.e. it is <emphasis>not</emphasis> just the peak among
|
|
the regular snapshots. However, recording the true peak is expensive, and
|
|
so by default Massif records a peak whose size is within 1% of the size of
|
|
the true peak. See the description of the
|
|
<computeroutput>--peak-inaccuracy</computeroutput> option below for more
|
|
details.</para>
|
|
|
|
<para>The following graph is from an execution of Konqueror, the KDE web
|
|
browser. It shows what graphs for larger programs look like.</para>
|
|
<screen><![CDATA[
|
|
MB
|
|
3.952^ #
|
|
| @#:
|
|
| :@@#:
|
|
| @@::::@@#:
|
|
| @ :: :@@#::
|
|
| @@@ :: :@@#::
|
|
| @@:@@@ :: :@@#::
|
|
| :::@ :@@@ :: :@@#::
|
|
| : :@ :@@@ :: :@@#::
|
|
| :@: :@ :@@@ :: :@@#::
|
|
| @@:@: :@ :@@@ :: :@@#:::
|
|
| : :: ::@@:@: :@ :@@@ :: :@@#:::
|
|
| :@@: ::::: ::::@@@:::@@:@: :@ :@@@ :: :@@#:::
|
|
| ::::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
|
|
| @: ::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
|
|
| @: ::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
|
|
| @: ::@@:::::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
|
|
| ::@@@: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
|
|
| :::::@ @: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
|
|
| @@:::::@ @: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#:::
|
|
0 +----------------------------------------------------------------------->Mi
|
|
0 626.4
|
|
|
|
Number of snapshots: 63
|
|
Detailed snapshots: [3, 4, 10, 11, 15, 16, 29, 33, 34, 36, 39, 41,
|
|
42, 43, 44, 49, 50, 51, 53, 55, 56, 57 (peak)]
|
|
]]></screen>
|
|
|
|
<para>Note that the larger size units are KB, MB, GB, etc. As is typical
|
|
for memory measurements, these are based on a multiplier of 1024, rather
|
|
than the standard SI multiplier of 1000. Strictly speaking, they should be
|
|
written KiB, MiB, GiB, etc.</para>
|
|
|
|
</sect2>
|
|
|
|
|
|
<sect2 id="ms-manual.thesnapshotdetails" xreflabel="The Snapshot Details">
|
|
<title>The Snapshot Details</title>
|
|
|
|
<para>Returning to our example, the graph is followed by the detailed
|
|
information for each snapshot. The first nine snapshots are normal, so only
|
|
a small amount of information is recorded for each one:</para>
|
|
<screen><![CDATA[
|
|
--------------------------------------------------------------------------------
|
|
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
|
|
--------------------------------------------------------------------------------
|
|
0 0 0 0 0 0
|
|
1 1,008 1,008 1,000 8 0
|
|
2 2,016 2,016 2,000 16 0
|
|
3 3,024 3,024 3,000 24 0
|
|
4 4,032 4,032 4,000 32 0
|
|
5 5,040 5,040 5,000 40 0
|
|
6 6,048 6,048 6,000 48 0
|
|
7 7,056 7,056 7,000 56 0
|
|
8 8,064 8,064 8,000 64 0
|
|
]]></screen>
|
|
|
|
<para>Each normal snapshot records several things.</para>
|
|
|
|
<itemizedlist>
|
|
<listitem><para>Its number.</para></listitem>
|
|
|
|
<listitem><para>The time it was taken. In this case, the time unit is
|
|
bytes, due to the use of
|
|
<computeroutput>--time-unit=B</computeroutput>.</para></listitem>
|
|
|
|
<listitem><para>The total memory consumption at that point.</para></listitem>
|
|
|
|
<listitem><para>The number of useful heap bytes allocated at that point.
|
|
This reflects the number of bytes asked for by the
|
|
program.</para></listitem>
|
|
|
|
<listitem><para>The number of extra heap bytes allocated at that point.
|
|
This reflects the number of bytes allocated in excess of what the program
|
|
asked for. There are two sources of extra heap bytes.</para>
|
|
|
|
<para>First, every heap block has administrative bytes associated with it.
|
|
The exact number of administrative bytes depends on the details of the
|
|
allocator. By default Massif assumes 8 bytes per block, as can be seen
|
|
from the example, but this number can be changed via the
|
|
<computeroutput>--heap-admin</computeroutput> option.</para>
|
|
|
|
<para>Second, allocators often round up the number of bytes asked for to a
|
|
larger number. By default, if N bytes are asked for, Massif rounds N up
|
|
to the nearest multiple of 8 that is equal to or greater than N. This is
|
|
typical behaviour for allocators, and is required to ensure that elements
|
|
within the block are suitably aligned. The rounding size can be changed
|
|
with the <computeroutput>--alignment</computeroutput> option, although it
|
|
cannot be less than 8, and must be a power of two.</para></listitem>
|
|
|
|
<listitem><para>The size of the stack(s). By default, stack profiling is
|
|
off as it slows Massif down greatly. Therefore, the stack column is zero
|
|
in the example.</para></listitem>
|
|
</itemizedlist>
|
|
|
|
<para>The next snapshot is detailed. As well as the basic counts, it gives
|
|
an allocation tree which indicates exactly which pieces of code were
|
|
responsible for allocating heap memory:</para>
|
|
|
|
<screen><![CDATA[
|
|
9 9,072 9,072 9,000 72 0
|
|
99.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
|
|
->99.21% (9,000B) 0x804841A: main (example.c:20)
|
|
]]></screen>
|
|
|
|
<para>The allocation tree can be read from the top down. The first line
|
|
indicates all heap allocation functions such as <function>malloc</function>
|
|
and C++ <function>new</function>. All heap allocations go through these
|
|
functions, and so all 9,000 useful bytes (which is 99.21% of all allocated
|
|
bytes) go through them. But how were <function>malloc</function> and new
|
|
called? At this point, every allocation so far has been due to line 21
|
|
inside <function>main</function>, hence the second line in the tree. The
|
|
<computeroutput>-></computeroutput> indicates that main (line 20) called
|
|
<function>malloc</function>.</para>
|
|
|
|
<para>Let's see what the subsequent output shows happened next:</para>
|
|
|
|
<screen><![CDATA[
|
|
--------------------------------------------------------------------------------
|
|
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
|
|
--------------------------------------------------------------------------------
|
|
10 10,080 10,080 10,000 80 0
|
|
11 12,088 12,088 12,000 88 0
|
|
12 16,096 16,096 16,000 96 0
|
|
13 20,104 20,104 20,000 104 0
|
|
14 20,104 20,104 20,000 104 0
|
|
99.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
|
|
->49.74% (10,000B) 0x804841A: main (example.c:20)
|
|
|
|
|
->39.79% (8,000B) 0x80483C2: g (example.c:5)
|
|
| ->19.90% (4,000B) 0x80483E2: f (example.c:11)
|
|
| | ->19.90% (4,000B) 0x8048431: main (example.c:23)
|
|
| |
|
|
| ->19.90% (4,000B) 0x8048436: main (example.c:25)
|
|
|
|
|
->09.95% (2,000B) 0x80483DA: f (example.c:10)
|
|
->09.95% (2,000B) 0x8048431: main (example.c:23)
|
|
]]></screen>
|
|
|
|
<para>The first four snapshots are similar to the previous ones. But then
|
|
the global allocation peak is reached, and a detailed snapshot is taken.
|
|
Its allocation tree shows that 20,000B of useful heap memory has been
|
|
allocated, and the lines and arrows indicate that this is from three
|
|
different code locations: line 20, which is responsible for 10,000B
|
|
(49.74%); line 5, which is responsible for 8,000B (39.79%); and line 10,
|
|
which is responsible for 2,000B (9.95%).</para>
|
|
|
|
<para>We can then drill down further in the allocation tree. For example,
|
|
of the 8,000B asked for by line 5, half of it was due to a call from line
|
|
11, and half was due to a call from line 25.</para>
|
|
|
|
<para>In short, Massif collates the stack trace of every single allocation
|
|
point in the program into a single tree, which gives a complete picture at
|
|
a particular point in time of how and why all heap memory was
|
|
allocated.</para>
|
|
|
|
<para>Note that the tree entries correspond not to functions, but to
|
|
individual code locations. For example, if function <function>A</function>
|
|
calls <function>malloc</function>, and function <function>B</function> calls
|
|
<function>A</function> twice, once on line 10 and once on line 11, then
|
|
the two calls will result in two distinct stack traces in the tree. In
|
|
contrast, if <function>B</function> calls <function>A</function> repeatedly
|
|
from line 15 (e.g. due to a loop), then each of those calls will be
|
|
represented by the same stack trace in the tree.</para>
|
|
|
|
<para>Note also that each tree entry with children in the example satisfies an
|
|
invariant: the entry's size is equal to the sum of its children's sizes.
|
|
For example, the first entry has size 20,000B, and its children have sizes
|
|
10,000B, 8,000B, and 2,000B. In general, this invariant almost always
|
|
holds. However, in rare circumstances stack traces can be malformed, in
|
|
which case a stack trace can be a sub-trace of another stack trace. This
|
|
means that some entries in the tree may not satisfy the invariant -- the
|
|
entry's size will be greater than the sum of its children's sizes. This is
|
|
not a big problem, but could make the results confusing. Massif can
|
|
sometimes detect when this happens; if it does, it issues a warning:</para>
|
|
|
|
<screen><![CDATA[
|
|
Warning: Malformed stack trace detected. In Massif's output,
|
|
the size of an entry's child entries may not sum up
|
|
to the entry's size as they normally do.
|
|
]]></screen>
|
|
|
|
<para>However, Massif does not detect and warn about every such occurrence.
|
|
Fortunately, malformed stack traces are rare in practice.</para>
|
|
|
|
<para>Returning now to ms_print's output, the final part is similar:</para>
|
|
|
|
<screen><![CDATA[
|
|
--------------------------------------------------------------------------------
|
|
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
|
|
--------------------------------------------------------------------------------
|
|
15 21,112 19,096 19,000 96 0
|
|
16 22,120 18,088 18,000 88 0
|
|
17 23,128 17,080 17,000 80 0
|
|
18 24,136 16,072 16,000 72 0
|
|
19 25,144 15,064 15,000 64 0
|
|
20 26,152 14,056 14,000 56 0
|
|
21 27,160 13,048 13,000 48 0
|
|
22 28,168 12,040 12,000 40 0
|
|
23 29,176 11,032 11,000 32 0
|
|
24 30,184 10,024 10,000 24 0
|
|
99.76% (10,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
|
|
->79.81% (8,000B) 0x80483C2: g (example.c:5)
|
|
| ->39.90% (4,000B) 0x80483E2: f (example.c:11)
|
|
| | ->39.90% (4,000B) 0x8048431: main (example.c:23)
|
|
| |
|
|
| ->39.90% (4,000B) 0x8048436: main (example.c:25)
|
|
|
|
|
->19.95% (2,000B) 0x80483DA: f (example.c:10)
|
|
| ->19.95% (2,000B) 0x8048431: main (example.c:23)
|
|
|
|
|
->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
|
|
]]></screen>
|
|
|
|
<para>The final detailed snapshot shows how the heap looked at termination.
|
|
The 00.00% entry represents the code locations for which memory was
|
|
allocated and then freed (line 20 in this case, the memory for which was
|
|
freed on line 28). However, no code location details are given for this
|
|
entry; by default, Massif only records the details for code locations
|
|
responsible for more than 1% of useful memory bytes, and ms_print likewise
|
|
only prints the details for code locations responsible for more than 1%.
|
|
The entries that do not meet this threshold are aggregated. This avoids
|
|
filling up the output with large numbers of unimportant entries. The
|
|
thresholds can be changed with the
|
|
<computeroutput>--threshold</computeroutput> option that both Massif and
|
|
ms_print support.</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="ms-manual.forkingprograms" xreflabel="Forking Programs">
|
|
<title>Forking Programs</title>
|
|
<para>If your program forks, the child will inherit all the profiling data that
|
|
has been gathered for the parent.</para>
|
|
|
|
<para>If the output file format string (controlled by
|
|
<option>--massif-out-file</option>) does not contain <option>%p</option>, then
|
|
the outputs from the parent and child will be intermingled in a single output
|
|
file, which will almost certainly make it unreadable by ms_print.</para>
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="ms-manual.options" xreflabel="Massif Options">
|
|
<title>Massif Options</title>
|
|
|
|
<para>Massif-specific options are:</para>
|
|
|
|
<!-- start of xi:include in the manpage -->
|
|
<variablelist id="ms.opts.list">
|
|
|
|
<varlistentry id="opt.heap" xreflabel="--heap">
|
|
<term>
|
|
<option><![CDATA[--heap=<yes|no> [default: yes] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Specifies whether heap profiling should be done.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.heap-admin" xreflabel="--heap-admin">
|
|
<term>
|
|
<option><![CDATA[--heap-admin=<number> [default: 8] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>If heap profiling is enabled, gives the number of administrative
|
|
bytes per block to use. This should be an estimate of the average,
|
|
since it may vary. For example, the allocator used by
|
|
<computeroutput>glibc</computeroutput> requires somewhere between 4 to
|
|
15 bytes per block, depending on various factors. It also requires
|
|
admin space for freed blocks, although Massif does not account
|
|
for this.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.stacks" xreflabel="--stacks">
|
|
<term>
|
|
<option><![CDATA[--stacks=<yes|no> [default: yes] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Specifies whether stack profiling should be done. This option
|
|
slows Massif down greatly, and so is off by default. Note that Massif
|
|
assumes that the main stack has size zero at start-up. This is not
|
|
true, but measuring the actual stack size is not easy, and it reflects
|
|
the size of the part of the main stack that a user program actually
|
|
has control over.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.depth" xreflabel="--depth">
|
|
<term>
|
|
<option><![CDATA[--depth=<number> [default: 30] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Maximum depth of the allocation trees recorded for detailed
|
|
snapshots. Increasing it will make Massif run somewhat more slowly,
|
|
use more memory, and produce bigger output files.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.alloc-fn" xreflabel="--alloc-fn">
|
|
<term>
|
|
<option><![CDATA[--alloc-fn=<name> ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Functions specified with this option will be treated as though
|
|
they were a heap allocation function such as
|
|
<function>malloc</function>. This is useful for functions that are
|
|
wrappers to <function>malloc</function> or <function>new</function>,
|
|
which can fill up the allocation trees with uninteresting information.
|
|
This option can be specified multiple times on the command line, to
|
|
name multiple functions.</para>
|
|
|
|
<para>Note that overloaded C++ names must be written in full. Single
|
|
quotes may be necessary to prevent the shell from breaking them up.
|
|
For example:
|
|
<screen><![CDATA[
|
|
--alloc-fn='operator new(unsigned, std::nothrow_t const&)'
|
|
]]></screen>
|
|
</para>
|
|
|
|
<para>
|
|
The full list of functions and operators that are by default
|
|
considered allocation functions is as follows.</para>
|
|
<screen><
|
|
operator new[](unsigned long)
|
|
operator new(unsigned, std::nothrow_t const&)
|
|
operator new[](unsigned, std::nothrow_t const&)
|
|
operator new(unsigned long, std::nothrow_t const&)
|
|
operator new[](unsigned long, std::nothrow_t const&)
|
|
]]></screen>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.ignore-fn" xreflabel="--ignore-fn">
|
|
<term>
|
|
<option><![CDATA[--ignore-fn=<name> ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Any direct heap allocation (i.e. a call to
|
|
<function>malloc</function>, <function>new</function>, etc, or a call
|
|
to a function name in a <computeroutput>--alloc-fn</computeroutput>
|
|
option) that occurs in a function specified by this option will be
|
|
ignored. This is mostly useful for testing purposes. This option can
|
|
be specified multiple times on the command line, to name multiple
|
|
functions.
|
|
</para>
|
|
|
|
<para>Any <function>realloc</function> of an ignored block will
|
|
also be ignored, even if the <function>realloc</function> call does
|
|
not occur in an ignored function. This avoids the possibility of
|
|
negative heap sizes if ignored blocks are shrunk with
|
|
<function>realloc</function>.
|
|
</para>
|
|
|
|
<para>Note that overloaded C++ names must be written in full, as for
|
|
<computeroutput>--alloc-fn</computeroutput> above.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.threshold" xreflabel="--threshold">
|
|
<term>
|
|
<option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>The significance threshold for heap allocations, as a
|
|
percentage. Allocation tree entries that account for less than this
|
|
will be aggregated. Note that this should be specified in tandem with
|
|
ms_print's option of the same name.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.peak-inaccuracy" xreflabel="--peak-inaccuracy">
|
|
<term>
|
|
<option><![CDATA[--peak-inaccuracy=<m.n> [default: 1.0] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Massif does not necessarily record the actual global memory
|
|
allocation peak; by default it records a peak only when the global
|
|
memory allocation size exceeds the previous peak by at least 1.0%.
|
|
This is because there can be many local allocation peaks along the way,
|
|
and doing a detailed snapshot for every one would be expensive and
|
|
wasteful, as all but one of them will be later discarded. This
|
|
inaccuracy can be changed (even to 0.0%) via this option, but Massif
|
|
will run drastically slower as the number approaches zero.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.time-unit" xreflabel="--time-unit">
|
|
<term>
|
|
<option><![CDATA[--time-unit=i|ms|B [default: i] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>The time unit used for the profiling. There are three
|
|
possibilities: instructions executed (i), which is good for most
|
|
cases; real (wallclock) time (ms, i.e. milliseconds), which is
|
|
sometimes useful; and bytes allocated/deallocated on the heap and/or
|
|
stack (B), which is useful for very short-run programs, and for
|
|
testing purposes, because it is the most reproducible across different
|
|
machines.</para> </listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.detailed-freq" xreflabel="--detailed-freq">
|
|
<term>
|
|
<option><![CDATA[--detailed-freq=<n> [default: 10] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Frequency of detailed snapshots. With
|
|
<computeroutput>--detailed-freq=1</computeroutput>, every snapshot is
|
|
detailed.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.max-snapshots" xreflabel="--max-snapshots">
|
|
<term>
|
|
<option><![CDATA[--max-snapshots=<n> [default: 100] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>The maximum number of snapshots recorded. If set to N, for all
|
|
programs except very short-running ones, the final number of snapshots
|
|
will be between N/2 and N.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.massif-out-file" xreflabel="--massif-out-file">
|
|
<term>
|
|
<option><![CDATA[--massif-out-file=<file> [default: massif.out.%p] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Write the profile data to <computeroutput>file</computeroutput>
|
|
rather than to the default output file,
|
|
<computeroutput>massif.out.<pid></computeroutput>. The
|
|
<option>%p</option> and <option>%q</option> format specifiers can be
|
|
used to embed the process ID and/or the contents of an environment
|
|
variable in the name, as is the case for the core option
|
|
<option>--log-file</option>. See <xref
|
|
linkend="manual-core.basicopts"/> for details.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.alignment" xreflabel="--alignment">
|
|
<term>
|
|
<option><![CDATA[--alignment=<n> [default: 1.0] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>The minimum alignment (and thus size) of heap blocks.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
<!-- end of xi:include in the manpage -->
|
|
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="ms-manual.ms_print-options" xreflabel="ms_print Options">
|
|
<title>ms_print Options</title>
|
|
|
|
<para>ms_print's options are:</para>
|
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
<para><computeroutput>-h, --help</computeroutput></para>
|
|
<para><computeroutput>-v, --version</computeroutput></para>
|
|
<para>Help and version, as usual.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para><option><![CDATA[--threshold=<m.n>]]></option> [default: 1.0]</para>
|
|
<para>Same as Massif's <computeroutput>--threshold</computeroutput>, but
|
|
applied after profiling rather than during.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para><option><![CDATA[--x=<m.n>]]></option> [default: 72]</para>
|
|
<para>Width of the graph, in columns.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para><option><![CDATA[--y=<n>]]></option> [default: 20]</para>
|
|
<para>Height of the graph, in rows.</para>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="ms-manual.fileformat" xreflabel="fileformat">
|
|
<title>Massif's output file format</title>
|
|
<para>Massif's file format is plain text (i.e. not binary) and deliberately
|
|
easy to read for both humans and machines. Nonetheless, the exact format
|
|
is not described here. This is because the format is currently very
|
|
Massif-specific. We plan to make the format more general, and thus suitable
|
|
for possible use with other tools. Once this has been done, the format will
|
|
be documented here.</para>
|
|
|
|
</sect1>
|
|
|
|
</chapter>
|