mirror of
https://github.com/Zenithsiz/ftmemsim-valgrind.git
synced 2026-02-03 18:13:01 +00:00
noaccess, writable, readable, other
Now they are:
noaccess, undefined, defined, partdefined
As a result, the following names:
make_writable, make_readable,
check_writable, check_readable, check_defined
have become:
make_mem_undefined, make_mem_defined,
check_mem_is_addressable, check_mem_is_defined, check_value_is_defined
(and likewise for the upper-case versions for client request macros).
The old MAKE_* and CHECK_* macros still work for backwards compatibility.
This is much better, because the old names were subtly misleading. For
example:
- "readable" really meant "readable and writable".
- "writable" really meant "writable and maybe readable, depending on how
the read value is used".
- "check_writable" really meant "check writable or readable"
The new names avoid these problems.
The recently-added macro which was called MAKE_DEFINED is now
MAKE_MEM_DEFINED_IF_ADDRESSABLE.
I also corrected the spelling of "addressable" in numerous places in
memcheck.h.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5802
1076 lines
42 KiB
XML
<?xml version="1.0"?> <!-- -*- sgml -*- -->
|
|
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
|
|
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
|
|
|
|
|
|
<chapter id="mc-manual" xreflabel="Memcheck: a heavyweight memory checker">
|
|
<title>Memcheck: a heavyweight memory checker</title>
|
|
|
|
<para>To use this tool, you may specify <option>--tool=memcheck</option>
|
|
on the Valgrind command line. You don't have to, though, since Memcheck
|
|
is the default tool.</para>
|
|
|
|
|
|
<sect1 id="mc-manual.bugs"
|
|
xreflabel="Kinds of bugs that Memcheck can find">
|
|
<title>Kinds of bugs that Memcheck can find</title>
|
|
|
|
<para>Memcheck is Valgrind's heavyweight memory checking tool. All
|
|
reads and writes of memory are checked, and calls to
|
|
malloc/new/free/delete are intercepted. As a result, Memcheck can detect
|
|
the following problems:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>Use of uninitialised memory</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Reading/writing memory after it has been free'd</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Reading/writing off the end of malloc'd blocks</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Reading/writing inappropriate areas on the stack</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Memory leaks - where pointers to malloc'd blocks are
|
|
lost forever</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Mismatched use of malloc/new/new [] vs
|
|
free/delete/delete []</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Overlapping <computeroutput>src</computeroutput> and
|
|
<computeroutput>dst</computeroutput> pointers in
|
|
<function>memcpy()</function> and related
|
|
functions</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="mc-manual.flags"
|
|
xreflabel="Command-line flags specific to Memcheck">
|
|
<title>Command-line flags specific to Memcheck</title>
|
|
|
|
<!-- start of xi:include in the manpage -->
|
|
<variablelist id="mc.opts.list">
|
|
|
|
<varlistentry id="opt.leak-check" xreflabel="--leak-check">
|
|
<term>
|
|
<option><![CDATA[--leak-check=<no|summary|yes|full> [default: summary] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>When enabled, search for memory leaks when the client
|
|
program finishes. A memory leak means a malloc'd block, which has
|
|
not yet been free'd, but to which no pointer can be found. Such a
|
|
block can never be free'd by the program, since no pointer to it
|
|
exists. If set to <varname>summary</varname>, it says how many
|
|
leaks occurred. If set to <varname>full</varname> or
|
|
<varname>yes</varname>, it gives details of each individual
|
|
leak.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.show-reachable" xreflabel="--show-reachable">
|
|
<term>
|
|
<option><![CDATA[--show-reachable=<yes|no> [default: no] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>When disabled, the memory leak detector only shows blocks
to which it cannot find any pointer at all, or to which it can only find
a pointer into the middle. These blocks are prime candidates for
|
|
memory leaks. When enabled, the leak detector also reports on
|
|
blocks which it could find a pointer to. Your program could, at
|
|
least in principle, have freed such blocks before exit. Contrast
|
|
this to blocks for which no pointer, or only an interior pointer
|
|
could be found: they are more likely to indicate memory leaks,
|
|
because you do not actually have a pointer to the start of the
|
|
block which you can hand to <function>free</function>, even if you
|
|
wanted to.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.leak-resolution" xreflabel="--leak-resolution">
|
|
<term>
|
|
<option><![CDATA[--leak-resolution=<low|med|high> [default: low] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>When doing leak checking, determines how willing
|
|
<constant>memcheck</constant> is to consider different backtraces to
|
|
be the same. When set to <varname>low</varname>, only the first
|
|
two entries need match. When <varname>med</varname>, four entries
|
|
have to match. When <varname>high</varname>, all entries need to
|
|
match.</para>
|
|
|
|
<para>For hardcore leak debugging, you probably want to use
|
|
<option>--leak-resolution=high</option> together with
|
|
<option>--num-callers=40</option> or some such large number. Note
|
|
however that this can give an overwhelming amount of information,
|
|
which is why the defaults are 4 callers and low-resolution
|
|
matching.</para>
|
|
|
|
<para>Note that the <option>--leak-resolution=</option> setting
|
|
does not affect <constant>memcheck's</constant> ability to find
|
|
leaks. It only changes how the results are presented.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.freelist-vol" xreflabel="--freelist-vol">
|
|
<term>
|
|
<option><![CDATA[--freelist-vol=<number> [default: 5000000] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>When the client program releases memory using
|
|
<function>free</function> (in <literal>C</literal>) or delete
|
|
(<literal>C++</literal>), that memory is not immediately made
|
|
available for re-allocation. Instead, it is marked inaccessible
|
|
and placed in a queue of freed blocks. The purpose is to defer as
|
|
long as possible the point at which freed-up memory comes back
|
|
into circulation. This increases the chance that
|
|
<constant>memcheck</constant> will be able to detect invalid
|
|
accesses to blocks for some significant period of time after they
|
|
have been freed.</para>
|
|
|
|
<para>This flag specifies the maximum total size, in bytes, of the
|
|
blocks in the queue. The default value is five million bytes.
|
|
Increasing this increases the total amount of memory used by
|
|
<constant>memcheck</constant> but may detect invalid uses of freed
|
|
blocks which would otherwise go undetected.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.workaround-gcc296-bugs" xreflabel="--workaround-gcc296-bugs">
|
|
<term>
|
|
<option><![CDATA[--workaround-gcc296-bugs=<yes|no> [default: no] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>When enabled, Memcheck assumes that reads and writes some small
|
|
distance below the stack pointer are due to bugs in gcc 2.96, and
|
|
does not report them. The "small distance" is 256 bytes by
|
|
default. Note that gcc 2.96 is the default compiler on some older
|
|
Linux distributions (RedHat 7.X) and so you may need to use this
|
|
flag. Do not use it if you do not have to, as it can cause real
|
|
errors to be overlooked. A better alternative is to use a more
|
|
recent gcc/g++ in which this bug is fixed.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
|
|
<term>
|
|
<option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Controls how <constant>memcheck</constant> handles word-sized,
|
|
word-aligned loads from addresses for which some bytes are
|
|
addressable and others are not. When <varname>yes</varname>, such
|
|
loads do not elicit an address error. Instead, the loaded V bytes
|
|
corresponding to the illegal addresses indicate Undefined, and
|
|
those corresponding to legal addresses are loaded from shadow
|
|
memory, as usual.</para>
|
|
|
|
<para>When <varname>no</varname>, loads from partially invalid
|
|
addresses are treated the same as loads from completely invalid
|
|
addresses: an illegal-address error is issued, and the resulting V
|
|
bytes indicate valid data.</para>
|
|
|
|
<para>Note that code that behaves in this way is in violation of
|
|
the ISO C/C++ standards, and should be considered broken. If
|
|
at all possible, such code should be fixed. This flag should be
|
|
used only as a last resort.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.undef-value-errors" xreflabel="--undef-value-errors">
|
|
<term>
|
|
<option><![CDATA[--undef-value-errors=<yes|no> [default: yes] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>Controls whether <constant>memcheck</constant> detects
dangerous uses of undefined values. When
<varname>no</varname>, Memcheck behaves like Addrcheck, a lightweight
memory-checking tool that used to be part of Valgrind, which didn't
detect undefined value errors. Set this to <varname>no</varname> if you
don't want to see undefined value errors.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
<!-- end of xi:include in the manpage -->
|
|
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="mc-manual.errormsgs"
|
|
xreflabel="Explanation of error messages from Memcheck">
|
|
<title>Explanation of error messages from Memcheck</title>
|
|
|
|
<para>Despite considerable sophistication under the hood, Memcheck can
|
|
only really detect two kinds of errors: use of illegal addresses, and
|
|
use of undefined values. Nevertheless, this is enough to help you
|
|
discover all sorts of memory-management nasties in your code. This
|
|
section presents a quick summary of what error messages mean. The
|
|
precise behaviour of the error-checking machinery is described in
|
|
<xref linkend="mc-manual.machine"/>.</para>
|
|
|
|
|
|
<sect2 id="mc-manual.badrw"
|
|
xreflabel="Illegal read / Illegal write errors">
|
|
<title>Illegal read / Illegal write errors</title>
|
|
|
|
<para>For example:</para>
|
|
<programlisting><![CDATA[
|
|
Invalid read of size 4
|
|
at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
|
|
by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
|
|
by 0x40B07FF4: read_png_image__FP8QImageIO (kernel/qpngio.cpp:326)
|
|
by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
|
|
Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
|
|
]]></programlisting>
|
|
|
|
<para>This happens when your program reads or writes memory at a place
|
|
which Memcheck reckons it shouldn't. In this example, the program did a
|
|
4-byte read at address 0xBFFFF0E0, somewhere within the system-supplied
|
|
library libpng.so.2.1.0.9, which was called from somewhere else in the
|
|
same library, called from line 326 of <filename>qpngio.cpp</filename>,
|
|
and so on.</para>
|
|
|
|
<para>Memcheck tries to establish what the illegal address might relate
|
|
to, since that's often useful. So, if it points into a block of memory
|
|
which has already been freed, you'll be informed of this, and also where
|
|
the block was free'd at. Likewise, if it should turn out to be just off
|
|
the end of a malloc'd block, a common result of off-by-one-errors in
|
|
array subscripting, you'll be informed of this fact, and also where the
|
|
block was malloc'd.</para>
|
|
|
|
<para>In this example, Memcheck can't identify the address. Actually
|
|
the address is on the stack, but, for some reason, this is not a valid
|
|
stack address -- it is below the stack pointer and that isn't allowed.
|
|
In this particular case it's probably caused by gcc generating invalid
|
|
code, a known bug in some ancient versions of gcc.</para>
|
|
|
|
<para>Note that Memcheck only tells you that your program is about to
|
|
access memory at an illegal address. It can't stop the access from
|
|
happening. So, if your program makes an access which normally would
|
|
result in a segmentation fault, your program will still suffer the same
|
|
fate -- but you will get a message from Memcheck immediately prior to
|
|
this. In this particular example, reading junk on the stack is
|
|
non-fatal, and the program stays alive.</para>
|
|
|
|
</sect2>
|
|
|
|
|
|
|
|
<sect2 id="mc-manual.uninitvals"
|
|
xreflabel="Use of uninitialised values">
|
|
<title>Use of uninitialised values</title>
|
|
|
|
<para>For example:</para>
|
|
<programlisting><![CDATA[
|
|
Conditional jump or move depends on uninitialised value(s)
|
|
at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
|
|
by 0x402E8476: _IO_printf (printf.c:36)
|
|
by 0x8048472: main (tests/manuel1.c:8)
|
|
]]></programlisting>
|
|
|
|
<para>An uninitialised-value use error is reported when your program
|
|
uses a value which hasn't been initialised -- in other words, is
|
|
undefined. Here, the undefined value is used somewhere inside the
|
|
printf() machinery of the C library. This error was reported when
|
|
running the following small program:</para>
|
|
<programlisting><![CDATA[
|
|
#include <stdio.h>

int main()
{
  int x;   /* deliberately left uninitialised */
  printf ("x = %d\n", x);
}]]></programlisting>
|
|
|
|
<para>It is important to understand that your program can copy around
|
|
junk (uninitialised) data as much as it likes. Memcheck observes this
|
|
and keeps track of the data, but does not complain. A complaint is
|
|
issued only when your program attempts to make use of uninitialised
|
|
data. In this example, x is uninitialised. Memcheck observes the value
|
|
being passed to <literal>_IO_printf</literal> and thence to
|
|
<literal>_IO_vfprintf</literal>, but makes no comment. However,
|
|
_IO_vfprintf has to examine the value of x so it can turn it into the
|
|
corresponding ASCII string, and it is at this point that Memcheck
|
|
complains.</para>
|
|
|
|
<para>Sources of uninitialised data tend to be:</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>Local variables in procedures which have not been initialised,
|
|
as in the example above.</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>The contents of malloc'd blocks, before you write something
|
|
there. In C++, the new operator is a wrapper round malloc, so if
|
|
you create an object with new, its fields will be uninitialised
|
|
until you (or the constructor) fill them in, which is only Right and
|
|
Proper.</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
</sect2>
|
|
|
|
|
|
|
|
<sect2 id="mc-manual.badfrees" xreflabel="Illegal frees">
|
|
<title>Illegal frees</title>
|
|
|
|
<para>For example:</para>
|
|
<programlisting><![CDATA[
|
|
Invalid free()
|
|
at 0x4004FFDF: free (vg_clientmalloc.c:577)
|
|
by 0x80484C7: main (tests/doublefree.c:10)
|
|
Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
|
|
at 0x4004FFDF: free (vg_clientmalloc.c:577)
|
|
by 0x80484C7: main (tests/doublefree.c:10)
|
|
]]></programlisting>
|
|
|
|
<para>Memcheck keeps track of the blocks allocated by your program with
|
|
malloc/new, so it knows exactly whether the argument to
free/delete is legitimate. Here, this test program has freed the
|
|
same block twice. As with the illegal read/write errors, Memcheck
|
|
attempts to make sense of the address free'd. If, as here, the address
|
|
is one which has previously been freed, you will be told that -- making
|
|
duplicate frees of the same block easy to spot.</para>
|
|
|
|
</sect2>
|
|
|
|
|
|
<sect2 id="mc-manual.rudefn"
|
|
xreflabel="When a block is freed with an inappropriate deallocation
|
|
function">
|
|
<title>When a block is freed with an inappropriate deallocation
|
|
function</title>
|
|
|
|
<para>In the following example, a block allocated with
|
|
<function>new[]</function> has wrongly been deallocated with
|
|
<function>free</function>:</para>
|
|
<programlisting><![CDATA[
|
|
Mismatched free() / delete / delete []
|
|
at 0x40043249: free (vg_clientfuncs.c:171)
|
|
by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
|
|
by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
|
|
by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
|
|
Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
|
|
at 0x4004318C: __builtin_vec_new (vg_clientfuncs.c:152)
|
|
by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
|
|
by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
|
|
by 0x4C21788F: OLEFilter::convert(QCString const &) (olefilter.cc:272)
|
|
]]></programlisting>
|
|
|
|
<para>In <literal>C++</literal> it's important to deallocate memory in a
|
|
way compatible with how it was allocated. The deal is:</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>If allocated with
|
|
<function>malloc</function>,
|
|
<function>calloc</function>,
|
|
<function>realloc</function>,
|
|
<function>valloc</function> or
|
|
<function>memalign</function>, you must
|
|
deallocate with <function>free</function>.</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>If allocated with <function>new[]</function>, you must
|
|
deallocate with <function>delete[]</function>.</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>If allocated with <function>new</function>, you must deallocate
|
|
with <function>delete</function>.</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>The worst thing is that on Linux apparently it doesn't matter if
|
|
you do muddle these up, and it all seems to work ok, but the same
|
|
program may then crash on a different platform, Solaris for example. So
|
|
it's best to fix it properly. According to the KDE folks "it's amazing
|
|
how many C++ programmers don't know this".</para>
|
|
|
|
<para>Pascal Massimino adds the following clarification:
|
|
<function>delete[]</function> must be used for objects allocated by
|
|
<function>new[]</function> because the compiler stores the size of the
|
|
array and the pointer-to-member to the destructor of the array's content
|
|
just before the pointer actually returned. This implies a
|
|
variable-sized overhead in what's returned by <function>new</function>
|
|
or <function>new[]</function>.</para>
|
|
|
|
</sect2>
|
|
|
|
|
|
|
|
<sect2 id="mc-manual.badperm"
|
|
xreflabel="Passing system call parameters with
|
|
inadequate read/write permissions">
|
|
<title>Passing system call parameters with inadequate read/write
|
|
permissions</title>
|
|
|
|
<para>Memcheck checks all parameters to system calls:
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>It checks all the direct parameters themselves.</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Also, if a system call needs to read from a buffer provided by
|
|
your program, Memcheck checks that the entire buffer is addressable
|
|
and has valid data, ie, it is readable.</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>Also, if the system call needs to write to a user-supplied
|
|
buffer, Memcheck checks that the buffer is addressable.</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>After the system call, Memcheck updates its tracked information to
|
|
precisely reflect any changes in memory permissions caused by the system
|
|
call.</para>
|
|
|
|
<para>Here's an example of two system calls with invalid parameters:</para>
|
|
<programlisting><![CDATA[
|
|
#include <stdlib.h>
#include <unistd.h>
int main( void )
{
  char* arr  = malloc(10);
  int*  arr2 = malloc(sizeof(int));
  write( 1 /* stdout */, arr, 10 );
  exit(arr2[0]);
}
|
|
]]></programlisting>
|
|
|
|
<para>You get these complaints ...</para>
|
|
<programlisting><![CDATA[
|
|
Syscall param write(buf) points to uninitialised byte(s)
|
|
at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
|
|
by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
|
|
by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
|
|
Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
|
|
at 0x259852B0: malloc (vg_replace_malloc.c:130)
|
|
by 0x80483F1: main (a.c:5)
|
|
|
|
Syscall param exit(error_code) contains uninitialised byte(s)
|
|
at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
|
|
by 0x8048426: main (a.c:8)
|
|
]]></programlisting>
|
|
|
|
<para>... because the program has (a) tried to write uninitialised junk
|
|
from the malloc'd block to the standard output, and (b) passed an
|
|
uninitialised value to <function>exit</function>. Note that the first
|
|
error refers to the memory pointed to by
|
|
<computeroutput>buf</computeroutput> (not
|
|
<computeroutput>buf</computeroutput> itself), but the second error
|
|
refers to the argument <computeroutput>error_code</computeroutput>
|
|
itself.</para>
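<para>For comparison, here is a minimal sketch of the same two operations
with every byte defined before the kernel sees it. The helper names
<function>write_filled</function> and
<function>defined_status</function> are hypothetical and do not come from
the test program above; the sketch only assumes a POSIX system.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Writes n initialised bytes to fd.  Returns write()'s result, or -1 if
   the allocation fails. */
static ssize_t write_filled(int fd, size_t n)
{
  char *arr = malloc(n);
  if (arr == NULL)
    return -1;
  memset(arr, '.', n);     /* define every byte before the syscall reads it */
  ssize_t written = write(fd, arr, n);
  free(arr);
  return written;
}

/* Returns an exit status that was read from initialised memory. */
static int defined_status(void)
{
  int *arr2 = calloc(1, sizeof *arr2);   /* calloc zero-fills the block */
  assert(arr2 != NULL);
  int status = arr2[0];
  free(arr2);
  return status;
}
```
]]></programlisting>
<para>Calling <function>write_filled(1, 10)</function> and then
<function>exit(defined_status())</function> performs the same system
calls as the original program, but the buffer passed to
<function>write</function> and the value passed to
<function>exit</function> are both initialised, so neither complaint
should appear.</para>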
|
|
|
|
</sect2>
|
|
|
|
|
|
<sect2 id="mc-manual.overlap"
|
|
xreflabel="Overlapping source and destination blocks">
|
|
<title>Overlapping source and destination blocks</title>
|
|
|
|
<para>The following C library functions copy some data from one
|
|
memory block to another (or something similar):
|
|
<function>memcpy()</function>,
|
|
<function>strcpy()</function>,
|
|
<function>strncpy()</function>,
|
|
<function>strcat()</function>,
|
|
<function>strncat()</function>.
|
|
The blocks pointed to by their <computeroutput>src</computeroutput> and
|
|
<computeroutput>dst</computeroutput> pointers aren't allowed to overlap.
|
|
Memcheck checks for this.</para>
|
|
|
|
<para>For example:</para>
|
|
<programlisting><![CDATA[
|
|
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
|
|
==27492== at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
|
|
==27492== by 0x804865A: main (overlap.c:40)
|
|
==27492==
|
|
]]></programlisting>
|
|
|
|
<para>You don't want the two blocks to overlap because one of them could
|
|
get partially trashed by the copying.</para>
|
|
|
|
<para>You might think that Memcheck is being overly pedantic reporting
|
|
this in the case where <computeroutput>dst</computeroutput> is less than
|
|
<computeroutput>src</computeroutput>. For example, the obvious way to
|
|
implement <function>memcpy()</function> is by copying from the first
|
|
byte to the last. However, the optimisation guides of some
|
|
architectures recommend copying from the last byte down to the first.
|
|
Also, some implementations of <function>memcpy()</function> zero
|
|
<computeroutput>dst</computeroutput> before copying, because zeroing the
|
|
destination's cache line(s) can improve performance.</para>
|
|
|
|
<para>The moral of the story is: if you want to write truly portable
|
|
code, don't make any assumptions about the language
|
|
implementation.</para>
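<para>When the two blocks may legitimately overlap, the standard C
function <function>memmove()</function> is the one defined to handle it.
The following sketch shows one way to test for overlap and choose
accordingly. The helpers <function>blocks_overlap</function> and
<function>copy_bytes</function> are hypothetical names used for
illustration; this is not the check Memcheck itself performs.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the n-byte blocks starting at dst and src share any byte. */
static int blocks_overlap(const void *dst, const void *src, size_t n)
{
  uintptr_t d = (uintptr_t)dst;
  uintptr_t s = (uintptr_t)src;
  if (n == 0)
    return 0;
  return (d < s) ? (s - d < n) : (d - s < n);
}

/* Copies n bytes, using memmove() whenever the blocks overlap. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
  assert(dst != NULL && src != NULL);
  if (blocks_overlap(dst, src, n))
    return memmove(dst, src, n);
  return memcpy(dst, src, n);
}
```
]]></programlisting>
<para>In practice it is simpler to call <function>memmove()</function>
directly wherever overlap is possible; the explicit test is spelled out
here only to make the condition visible.</para>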
|
|
|
|
</sect2>
|
|
|
|
|
|
<sect2 id="mc-manual.leaks" xreflabel="Memory leak detection">
|
|
<title>Memory leak detection</title>
|
|
|
|
<para>Memcheck keeps track of all memory blocks issued in response to
|
|
calls to malloc/calloc/realloc/new. So when the program exits, it knows
|
|
which blocks have not been freed.
|
|
</para>
|
|
|
|
<para>If <option>--leak-check</option> is set appropriately, for each
|
|
remaining block, Memcheck scans the entire address space of the process,
|
|
looking for pointers to the block. Each block fits into one of the
|
|
three following categories.</para>
|
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
<para>Still reachable: A pointer to the start of the block is found.
|
|
This usually indicates programming sloppiness. Since the block is
|
|
still pointed at, the programmer could, at least in principle, free
|
|
it before program exit. Because these are very common and arguably
|
|
not a problem, Memcheck won't report such blocks unless
|
|
<option>--show-reachable=yes</option> is specified.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Possibly lost, or "dubious": A pointer to the interior of the
|
|
block is found. The pointer might originally have pointed to the
|
|
start and have been moved along, or it might be entirely unrelated.
|
|
Memcheck deems such a block as "dubious", because it's unclear
|
|
whether or not a pointer to it still exists.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Definitely lost, or "leaked": The worst outcome is that no
|
|
pointer to the block can be found. The block is classified as
|
|
"leaked", because the programmer could not possibly have freed it at
|
|
program exit, since no pointer to it exists. This is likely a
|
|
symptom of having lost the pointer at some earlier point in the
|
|
program.</para>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
|
|
<para>For each block mentioned, Memcheck will also tell you where the
|
|
block was allocated. It cannot tell you how or why the pointer to a
|
|
leaked block has been lost; you have to work that out for yourself. In
|
|
general, you should attempt to ensure your programs do not have any
|
|
leaked or dubious blocks at exit.</para>
|
|
|
|
<para>For example:</para>
|
|
<programlisting><![CDATA[
|
|
8 bytes in 1 blocks are definitely lost in loss record 1 of 14
|
|
at 0x........: malloc (vg_replace_malloc.c:...)
|
|
by 0x........: mk (leak-tree.c:11)
|
|
by 0x........: main (leak-tree.c:39)
|
|
|
|
88 (8 direct, 80 indirect) bytes in 1 blocks are definitely lost
|
|
in loss record 13 of 14
|
|
at 0x........: malloc (vg_replace_malloc.c:...)
|
|
by 0x........: mk (leak-tree.c:11)
|
|
by 0x........: main (leak-tree.c:25)
|
|
]]></programlisting>
|
|
|
|
<para>The first message describes a simple case of a single 8 byte block
|
|
that has been definitely lost. The second case mentions both "direct"
|
|
and "indirect" leaks. The distinction is that a direct leak is a block
|
|
which has no pointers to it. An indirect leak is a block which is only
|
|
pointed to by other leaked blocks. Both kinds of leak are bad.</para>
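<para>The source of <filename>leak-tree.c</filename> is not reproduced
here, so the following is only a sketch of the kind of structure that
produces such a report. The <varname>struct node</varname> layout and
the helpers other than <function>mk</function> are assumptions.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdlib.h>

struct node {
  struct node *left;
  struct node *right;
};

/* Allocates one node; returns NULL if malloc fails. */
static struct node *mk(struct node *left, struct node *right)
{
  struct node *n = malloc(sizeof *n);
  if (n != NULL) {
    n->left = left;
    n->right = right;
  }
  return n;
}

static size_t count_nodes(const struct node *n)
{
  return n == NULL ? 0 : 1 + count_nodes(n->left) + count_nodes(n->right);
}

/* Frees the children before the parent, so no pointer is lost on the way. */
static void free_tree(struct node *n)
{
  if (n == NULL)
    return;
  free_tree(n->left);
  free_tree(n->right);
  free(n);
}

/* Builds a three-node tree and releases it properly.  Overwriting `root`
   without calling free_tree() would instead leave the root block directly
   lost and the two children indirectly lost. */
static size_t build_and_release(void)
{
  struct node *root = mk(mk(NULL, NULL), mk(NULL, NULL));
  assert(root != NULL);
  size_t n = count_nodes(root);
  free_tree(root);
  return n;
}
```
]]></programlisting>
<para>Fixing the direct leak, by keeping and eventually freeing the
pointer to the root, normally resolves the indirect leaks reachable from
it as well.</para>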
|
|
|
|
<para>The precise area of memory in which Memcheck searches for pointers
|
|
is: all naturally-aligned machine-word-sized words for which all A bits
|
|
indicate addressability and all V bits indicate that the stored value
|
|
is actually valid.</para>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="mc-manual.suppfiles" xreflabel="Writing suppression files">
|
|
<title>Writing suppression files</title>
|
|
|
|
<para>The basic suppression format is described in
|
|
<xref linkend="manual-core.suppress"/>.</para>
|
|
|
|
<para>The suppression (2nd) line should have the form:</para>
|
|
<programlisting><![CDATA[
|
|
Memcheck:suppression_type]]></programlisting>
|
|
|
|
<para>The Memcheck suppression types are as follows:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para><varname>Value1</varname>,
|
|
<varname>Value2</varname>,
|
|
<varname>Value4</varname>,
|
|
<varname>Value8</varname>,
|
|
<varname>Value16</varname>,
|
|
meaning an uninitialised-value error when
|
|
using a value of 1, 2, 4, 8 or 16 bytes.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Or: <varname>Cond</varname> (or its old
|
|
name, <varname>Value0</varname>), meaning use
|
|
of an uninitialised CPU condition code.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Or: <varname>Addr1</varname>,
|
|
<varname>Addr2</varname>,
|
|
<varname>Addr4</varname>,
|
|
<varname>Addr8</varname>,
|
|
<varname>Addr16</varname>,
|
|
meaning an invalid address during a
|
|
memory access of 1, 2, 4, 8 or 16 bytes respectively.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Or: <varname>Param</varname>, meaning an
|
|
invalid system call parameter error.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Or: <varname>Free</varname>, meaning an
|
|
invalid or mismatching free.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Or: <varname>Overlap</varname>, meaning a
|
|
<computeroutput>src</computeroutput> /
|
|
<computeroutput>dst</computeroutput> overlap in
|
|
<function>memcpy()</function> or a similar function.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Or: <varname>Leak</varname>, meaning
|
|
a memory leak.</para>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
|
|
<para>The extra information line: for Param errors, it is the name of the
|
|
offending system call parameter. No other error kinds have this extra
|
|
line.</para>
|
|
|
|
<para>The first line of the calling context: for Value and Addr errors,
|
|
it is either the name of the function in which the error occurred, or,
|
|
failing that, the full path of the .so file or executable containing the
|
|
error location. For Free errors, it is the name of the function doing the
freeing (eg, <function>free</function>,
<function>__builtin_vec_delete</function>, etc). For Overlap errors, it is
the name of the function with the overlapping arguments (eg,
|
|
<function>memcpy()</function>, <function>strcpy()</function>,
|
|
etc).</para>
|
|
|
|
<para>Lastly, there's the rest of the calling context.</para>
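<para>Putting the pieces together, a suppression for the
<computeroutput>write(buf)</computeroutput> complaint shown earlier in
this chapter might look roughly like the sketch below. The name on the
first line is made up for illustration, and the surrounding braces and
the <computeroutput>fun:</computeroutput> prefixes follow the basic
format described in the core manual rather than anything specific to
Memcheck.</para>
<programlisting><![CDATA[
```text
{
   example-write-uninit-buffer
   Memcheck:Param
   write(buf)
   fun:__write_nocancel
   fun:__libc_start_main
}
```
]]></programlisting>
<para>Here the second line gives the suppression type, the third is the
extra information line that only Param errors have, and the remaining
lines are the calling context.</para>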
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="mc-manual.machine"
|
|
xreflabel="Details of Memcheck's checking machinery">
|
|
<title>Details of Memcheck's checking machinery</title>
|
|
|
|
<para>Read this section if you want to know, in detail, exactly
|
|
what and how Memcheck is checking.</para>
|
|
|
|
|
|
<sect2 id="mc-manual.value" xreflabel="Valid-value (V) bit">
|
|
<title>Valid-value (V) bits</title>
|
|
|
|
<para>It is simplest to think of Memcheck implementing a synthetic CPU
|
|
which is identical to a real CPU, except for one crucial detail. Every
|
|
bit (literally) of data processed, stored and handled by the real CPU
|
|
has, in the synthetic CPU, an associated "valid-value" bit, which says
|
|
whether or not the accompanying bit has a legitimate value. In the
|
|
discussions which follow, this bit is referred to as the V (valid-value)
|
|
bit.</para>
|
|
|
|
<para>Each byte in the system therefore has 8 V bits which follow it
|
|
wherever it goes. For example, when the CPU loads a word-size item (4
|
|
bytes) from memory, it also loads the corresponding 32 V bits from a
|
|
bitmap which stores the V bits for the process' entire address space.
|
|
If the CPU should later write the whole or some part of that value to
|
|
memory at a different address, the relevant V bits will be stored back
|
|
in the V-bit bitmap.</para>
|
|
|
|
<para>In short, each bit in the system has an associated V bit, which
|
|
follows it around everywhere, even inside the CPU. Yes, all the CPU's
|
|
registers (integer, floating point, vector and condition registers) have
|
|
their own V bit vectors.</para>
|
|
|
|
<para>Copying values around does not cause Memcheck to check for, or
|
|
report on, errors. However, when a value is used in a way which might
|
|
conceivably affect the outcome of your program's computation, the
|
|
associated V bits are immediately checked. If any of these indicate
|
|
that the value is undefined, an error is reported.</para>
|
|
|
|
<para>Here's an (admittedly nonsensical) example:</para>
|
|
<programlisting><![CDATA[
|
|
int i, j;
|
|
int a[10], b[10];
|
|
for ( i = 0; i < 10; i++ ) {
  j = a[i];
  b[i] = j;
}]]></programlisting>
|
|
|
|
<para>Memcheck emits no complaints about this, since it merely copies
|
|
uninitialised values from <varname>a[]</varname> into
|
|
<varname>b[]</varname>, and doesn't use them in any way. However, if
|
|
the loop is changed to:</para>
|
|
<programlisting><![CDATA[
|
|
for ( i = 0; i < 10; i++ ) {
  j += a[i];
}
if ( j == 77 )
  printf("hello there\n");
|
|
]]></programlisting>
|
|
|
|
<para>then Memcheck will complain, at the
<computeroutput>if</computeroutput>, that the condition depends on
uninitialised values.  Note that it <command>doesn't</command> complain
at the <varname>j += a[i];</varname>, since at that point the
undefinedness is not "observable".  It's only when a decision has to be
made as to whether or not to do the <function>printf</function> -- an
observable action of your program -- that Memcheck complains.</para>

<para>Most low-level operations, such as adds, cause Memcheck to use the
V bits for the operands to calculate the V bits for the result.  Even if
the result is partially or wholly undefined, it does not
complain.</para>

<para>Checks on definedness only occur in three places: when a value is
used to generate a memory address, when a control flow decision needs to
be made, and when a system call is detected, at which point Memcheck
checks the definedness of the parameters as required.</para>

<para>If a check should detect undefinedness, an error message is
issued.  The resulting value is subsequently regarded as well-defined.
To do otherwise would give long chains of error messages.  In effect, we
say that undefined values are non-infectious.</para>

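<para>The propagate-then-check rule can be illustrated with a toy
model.  The sketch below is not Memcheck's implementation: the type
<literal>shadow_u32</literal> and the <literal>toy_*</literal> functions
are hypothetical names invented for this illustration, the convention
that a set V bit means "undefined" is an assumption of the toy, and the
rule used for addition is deliberately crude (any undefined input bit
makes the whole result undefined).  It shows copies and adds carrying V
bits along silently, and a conditional branch acting as the check point
which reports once and then treats the value as defined.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Toy shadow value: a 32-bit value plus its 32 V bits.
   Assumed convention for this sketch: V bit 1 = undefined, 0 = defined. */
typedef struct {
    uint32_t value;
    uint32_t vbits;
} shadow_u32;

/* Number of errors the toy checker has reported so far. */
static int toy_error_count = 0;

/* A fully defined constant. */
static shadow_u32 toy_defined(uint32_t v) {
    shadow_u32 s = { v, 0 };
    return s;
}

/* An uninitialised word: every V bit says "undefined". */
static shadow_u32 toy_undefined(void) {
    shadow_u32 s = { 0, 0xFFFFFFFFu };
    return s;
}

/* Copying: the V bits travel with the value, and nothing is checked. */
static shadow_u32 toy_copy(shadow_u32 s) {
    return s;
}

/* Adding: result V bits are computed from the operand V bits.
   No error is reported, even if the result is undefined. */
static shadow_u32 toy_add(shadow_u32 a, shadow_u32 b) {
    shadow_u32 r;
    r.value = a.value + b.value;
    r.vbits = (a.vbits | b.vbits) ? 0xFFFFFFFFu : 0;
    return r;
}

/* Conditional branch: this is a check point.  Report an error if any
   V bit is undefined, then mark the value as defined so that later
   uses do not produce a chain of further errors. */
static int toy_branch_if_equal(shadow_u32 *a, uint32_t k) {
    if (a->vbits != 0) {
        toy_error_count++;
        fprintf(stderr, "toy: conditional depends on undefined value\n");
        a->vbits = 0;
    }
    return a->value == k;
}
```
]]></programlisting>
<para>With these helpers, adding <literal>toy_undefined()</literal> to a
defined value leaves the error count unchanged.  The first
<function>toy_branch_if_equal</function> call on the result raises the
count by one, and a second call on the same value reports nothing
further.</para>
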
<para>This sounds overcomplicated.  Why not just check all reads from
memory, and complain if an undefined value is loaded into a CPU
register?  Well, that doesn't work well, because perfectly legitimate C
programs routinely copy uninitialised values around in memory, and we
don't want endless complaints about that.  Here's the canonical example.
Consider a struct like this:</para>
<programlisting><![CDATA[
struct S { int x; char c; };
struct S s1, s2;
s1.x = 42;
s1.c = 'z';
s2 = s1;
]]></programlisting>

<para>The question to ask is: how large is <varname>struct S</varname>,
in bytes?  An <varname>int</varname> is 4 bytes and a
<varname>char</varname> one byte, so perhaps a <varname>struct
S</varname> occupies 5 bytes?  Wrong.  All (non-toy) compilers we know
of will round the size of <varname>struct S</varname> up to a whole
number of words, in this case 8 bytes.  Not doing this forces compilers
to generate truly appalling code for subscripting arrays of
<varname>struct S</varname>'s.</para>

<para>So <varname>s1</varname> occupies 8 bytes, yet only 5 of them will
be initialised.  For the assignment <varname>s2 = s1</varname>,
<literal>gcc</literal> generates code to copy all 8 bytes wholesale into
<varname>s2</varname> without regard for their meaning.  If Memcheck
simply checked values as they came out of memory, it would yelp every
time a structure assignment like this happened.  So the more complicated
semantics described above are necessary.  This allows
<literal>gcc</literal> to copy <varname>s1</varname> into
<varname>s2</varname> any way it likes, and a warning will only be
emitted if the uninitialised values are later used.</para>

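<para>The size and padding of <varname>struct S</varname> can be
inspected directly.  The following is a minimal sketch; the helper names
<function>padding_bytes_in_S</function> and
<function>print_layout_of_S</function> are hypothetical, and the exact
numbers depend on the compiler and platform.  On a typical platform with
a 4-byte, 4-byte-aligned <varname>int</varname> we would expect a size
of 8 and 3 bytes of trailing padding, which are the bytes that are
copied by <varname>s2 = s1</varname> without ever being
initialised.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct S { int x; char c; };

/* Bytes of padding after the last member, c, up to the end of the struct. */
static size_t padding_bytes_in_S(void) {
    return sizeof(struct S) - (offsetof(struct S, c) + sizeof(char));
}

/* Print the layout chosen by this compiler for struct S. */
static void print_layout_of_S(void) {
    printf("sizeof(struct S)   = %zu\n", sizeof(struct S));
    printf("offsetof(S, c)     = %zu\n", offsetof(struct S, c));
    printf("trailing padding   = %zu\n", padding_bytes_in_S());
}
```
]]></programlisting>
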
</sect2>


<sect2 id="mc-manual.vaddress" xreflabel=" Valid-address (A) bits">
<title>Valid-address (A) bits</title>

<para>Notice that the previous subsection describes how the validity of
values is established and maintained without having to say whether the
program does or does not have the right to access any particular memory
location.  We now consider the latter issue.</para>

<para>As described above, every bit in memory or in the CPU has an
associated valid-value (V) bit.  In addition, all bytes in memory, but
not in the CPU, have an associated valid-address (A) bit.  This
indicates whether or not the program can legitimately read or write that
location.  It does not give any indication of the validity of the data
at that location -- that's the job of the V bits -- only whether or not
the location may be accessed.</para>

<para>Every time your program reads or writes memory, Memcheck checks
the A bits associated with the address.  If any of them indicate an
invalid address, an error is emitted.  Note that the reads and writes
themselves do not change the A bits, only consult them.</para>

<para>So how do the A bits get set/cleared?  Like this:</para>

<itemizedlist>
  <listitem>
    <para>When the program starts, all the global data areas are
    marked as accessible.</para>
  </listitem>

  <listitem>
    <para>When the program does malloc/new, the A bits for exactly the
    area allocated, and not a byte more, are marked as accessible.  Upon
    freeing the area the A bits are changed to indicate
    inaccessibility.</para>
  </listitem>

  <listitem>
    <para>When the stack pointer register (<literal>SP</literal>) moves
    up or down, A bits are set.  The rule is that the area from
    <literal>SP</literal> up to the base of the stack is marked as
    accessible, and below <literal>SP</literal> is inaccessible.  (If
    that sounds illogical, bear in mind that the stack grows down, not
    up, on almost all Unix systems, including GNU/Linux.)  Tracking
    <literal>SP</literal> like this has the useful side-effect that the
    section of stack used by a function for local variables etc. is
    automatically marked accessible on function entry and inaccessible
    on exit.</para>
  </listitem>

  <listitem>
    <para>When doing system calls, A bits are changed appropriately.
    For example, mmap() magically makes files appear in the process'
    address space, so the A bits must be updated if mmap()
    succeeds.</para>
  </listitem>

  <listitem>
    <para>Optionally, your program can tell Valgrind about such changes
    explicitly, using the client request mechanism described
    above.</para>
  </listitem>

</itemizedlist>

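<para>The life cycle of the A bits can be sketched with a toy model.
This is an illustration only and not how Memcheck stores its A bits: the
64-byte "address space" and every <literal>toy_*</literal> name are
hypothetical.  It shows allocation and freeing events changing the A
bits for exactly the affected range, while an access merely consults
them.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MEM_SIZE 64   /* hypothetical tiny address space, in bytes */

/* One A bit per byte, held one per array element for clarity.
   1 = accessible, 0 = inaccessible.  Everything starts inaccessible. */
static unsigned char toy_abits[TOY_MEM_SIZE];

/* Mark the range [addr, addr+len) as accessible or inaccessible. */
static void toy_set_abits(size_t addr, size_t len, int accessible) {
    assert(addr <= TOY_MEM_SIZE && len <= TOY_MEM_SIZE - addr);
    memset(toy_abits + addr, accessible ? 1 : 0, len);
}

/* Done on every read or write: the A bits are consulted, never changed.
   Returns 1 if every byte in the range is accessible, 0 otherwise. */
static int toy_access_ok(size_t addr, size_t len) {
    size_t i;
    if (addr > TOY_MEM_SIZE || len > TOY_MEM_SIZE - addr)
        return 0;
    for (i = 0; i < len; i++)
        if (!toy_abits[addr + i])
            return 0;
    return 1;
}

/* malloc-like event: exactly the allocated area becomes accessible. */
static void toy_on_malloc(size_t addr, size_t len) {
    toy_set_abits(addr, len, 1);
}

/* free-like event: the area becomes inaccessible again. */
static void toy_on_free(size_t addr, size_t len) {
    toy_set_abits(addr, len, 0);
}
```
]]></programlisting>
<para>After <literal>toy_on_malloc(16, 10)</literal>, an access of 10
bytes at offset 16 is accepted but an access of 11 bytes is rejected,
since the A bits cover "not a byte more" than the allocation.  After
<literal>toy_on_free(16, 10)</literal>, any access to that range is
rejected.</para>
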
</sect2>


<sect2 id="mc-manual.together" xreflabel="Putting it all together">
<title>Putting it all together</title>

<para>Memcheck's checking machinery can be summarised as
follows:</para>

<itemizedlist>
  <listitem>
    <para>Each byte in memory has 8 associated V (valid-value) bits,
    saying whether or not the byte has a defined value, and a single A
    (valid-address) bit, saying whether or not the program currently has
    the right to read/write that address.</para>
  </listitem>

  <listitem>
    <para>When memory is read or written, the relevant A bits are
    consulted.  If they indicate an invalid address, Valgrind emits an
    Invalid read or Invalid write error.</para>
  </listitem>

  <listitem>
    <para>When memory is read into the CPU's registers, the relevant V
    bits are fetched from memory and stored in the simulated CPU.  They
    are not consulted.</para>
  </listitem>

  <listitem>
    <para>When a register is written out to memory, the V bits for that
    register are written back to memory too.</para>
  </listitem>

  <listitem>
    <para>When values in CPU registers are used to generate a memory
    address, or to determine the outcome of a conditional branch, the V
    bits for those values are checked, and an error emitted if any of
    them are undefined.</para>
  </listitem>

  <listitem>
    <para>When values in CPU registers are used for any other purpose,
    Valgrind computes the V bits for the result, but does not check
    them.</para>
  </listitem>

  <listitem>
    <para>Once the V bits for a value in the CPU have been checked, they
    are then set to indicate validity.  This avoids long chains of
    errors.</para>
  </listitem>

  <listitem>
    <para>When values are loaded from memory, Valgrind checks the A bits
    for that location and issues an illegal-address warning if needed.
    In that case, the V bits loaded are forced to indicate Valid,
    despite the location being invalid.</para>

    <para>This apparently strange choice reduces the amount of confusing
    information presented to the user.  It avoids the unpleasant
    phenomenon in which memory is read from a place which is both
    unaddressable and contains invalid values, and, as a result, you get
    not only an invalid-address (read/write) error, but also a
    potentially large set of uninitialised-value errors, one for every
    time the value is used.</para>

    <para>There is a hazy boundary case to do with multi-byte loads from
    addresses which are partially valid and partially invalid.  See the
    description of the flag <option>--partial-loads-ok</option> for
    details.</para>
  </listitem>

</itemizedlist>


<para>Memcheck intercepts calls to malloc, calloc, realloc, valloc,
memalign, free, new, new[], delete and delete[].  The behaviour you get
is:</para>

<itemizedlist>

  <listitem>
    <para>malloc/new/new[]: the returned memory is marked as addressable
    but not having valid values.  This means you have to write to it
    before you can read it.</para>
  </listitem>

  <listitem>
    <para>calloc: returned memory is marked both addressable and valid,
    since calloc() clears the area to zero.</para>
  </listitem>

  <listitem>
    <para>realloc: if the new size is larger than the old, the new
    section is addressable but invalid, as with malloc.</para>

    <para>If the new size is smaller, the dropped-off section is marked
    as unaddressable.  You may only pass to realloc a pointer previously
    issued to you by malloc/calloc/realloc.</para>
  </listitem>

  <listitem>
    <para>free/delete/delete[]: you may only pass to these functions a
    pointer previously issued to you by the corresponding allocation
    function.  Otherwise, Valgrind complains.  If the pointer is indeed
    valid, Valgrind marks the entire area it points at as unaddressable,
    and places the block in the freed-blocks-queue.  The aim is to defer
    as long as possible reallocation of this block.  Until that happens,
    all attempts to access it will elicit an invalid-address error, as
    you would hope.</para>
  </listitem>

</itemizedlist>

</sect2>
</sect1>



<sect1 id="mc-manual.clientreqs" xreflabel="Client requests">
<title>Client Requests</title>

<para>The following client requests are defined in
<filename>memcheck.h</filename>.
See <filename>memcheck.h</filename> for exact details of their
arguments.</para>

<itemizedlist>

  <listitem>
    <para><varname>VALGRIND_MAKE_MEM_NOACCESS</varname>,
    <varname>VALGRIND_MAKE_MEM_UNDEFINED</varname> and
    <varname>VALGRIND_MAKE_MEM_DEFINED</varname>.
    These mark address ranges as completely inaccessible,
    accessible but containing undefined data, and accessible and
    containing defined data, respectively.  Subsequent errors may
    have their faulting addresses described in terms of these
    blocks.  Each returns a "block handle", or zero when not run
    on Valgrind.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE</varname>.
    This is just like <varname>VALGRIND_MAKE_MEM_DEFINED</varname> but only
    affects those bytes that are already addressable.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DISCARD</varname>: At some point you may
    want Valgrind to stop reporting errors in terms of the blocks
    defined by the three <varname>VALGRIND_MAKE_MEM_NOACCESS</varname>,
    <varname>VALGRIND_MAKE_MEM_UNDEFINED</varname> and
    <varname>VALGRIND_MAKE_MEM_DEFINED</varname> macros.  To do this,
    those macros return a small-integer "block handle".  You can pass
    this block handle to <varname>VALGRIND_DISCARD</varname>.  After
    doing so, Valgrind will no longer be able to relate addressing
    errors to the user-defined block associated with the handle.  The
    permissions settings associated with the handle remain in place;
    this just affects how errors are reported, not whether they are
    reported.  Returns 1 for an invalid handle and 0 for a valid handle
    (although passing invalid handles is harmless).  Always returns 0
    when not run on Valgrind.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_CHECK_MEM_IS_ADDRESSABLE</varname> and
    <varname>VALGRIND_CHECK_MEM_IS_DEFINED</varname>: check immediately
    whether or not the given address range has the relevant property,
    and if not, print an error message.  Also, for the convenience of
    the client, returns zero if the relevant property holds; otherwise,
    the returned value is the address of the first byte for which the
    property is not true.  Always returns 0 when not run on
    Valgrind.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_CHECK_VALUE_IS_DEFINED</varname>: a quick and easy
    way to find out whether Valgrind thinks a particular value
    (lvalue, to be precise) is addressable and defined.  Prints an error
    message if not.  Returns no value.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_DO_LEAK_CHECK</varname>: run the memory leak
    detector right now.  Returns no value.  This could be used to
    incrementally check for leaks between arbitrary places in the
    program's execution.  Warning: not properly tested!</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_COUNT_LEAKS</varname>: fills in the four
    arguments with the number of bytes of memory found by the previous
    leak check to be leaked, dubious, reachable and suppressed.  Useful
    in test harness code, after calling
    <varname>VALGRIND_DO_LEAK_CHECK</varname>.</para>
  </listitem>

  <listitem>
    <para><varname>VALGRIND_GET_VBITS</varname> and
    <varname>VALGRIND_SET_VBITS</varname>: allow you to get and set the
    V (validity) bits for an address range.  You should probably only
    set V bits that you have got with
    <varname>VALGRIND_GET_VBITS</varname>.  Only for those who really
    know what they are doing.</para>
  </listitem>

</itemizedlist>

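<para>As an illustration of how some of these requests might be used,
here is a minimal sketch of a bump allocator that tells Memcheck which
parts of its pool are in use.  The <literal>pool_*</literal> names are
hypothetical and exist only for this example.  The include path
<filename>valgrind/memcheck.h</filename> and the no-op fallback macros
(used only so that the sketch compiles where the header is not
installed) are assumptions, and the fallbacks mimic the documented
behaviour of returning zero when not run on Valgrind.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Use the real client requests if the header can be found (assumed path). */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif

#ifndef HAVE_MEMCHECK_H
/* Fallbacks for this sketch only: do nothing and yield 0. */
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len)    ((void)(addr), (void)(len), 0)
#  define VALGRIND_MAKE_MEM_UNDEFINED(addr, len)   ((void)(addr), (void)(len), 0)
#  define VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) ((void)(addr), (void)(len), 0)
#endif

#define POOL_SIZE 256

static unsigned char pool[POOL_SIZE];
static size_t pool_used = 0;

/* Nothing has been handed out yet, so the whole pool is off limits. */
static void pool_init(void) {
    pool_used = 0;
    (void)VALGRIND_MAKE_MEM_NOACCESS(pool, POOL_SIZE);
}

/* Hand out len bytes: addressable but undefined, as with malloc.
   Returns NULL if the pool cannot satisfy the request. */
static void *pool_alloc(size_t len) {
    void *p;
    if (len > POOL_SIZE - pool_used)
        return NULL;
    p = pool + pool_used;
    pool_used += len;
    (void)VALGRIND_MAKE_MEM_UNDEFINED(p, len);
    return p;
}

/* Release everything at once; stale pointers become invalid accesses. */
static void pool_reset(void) {
    pool_init();
}

/* Returns 1 if Memcheck considers the range addressable and defined.
   Always 1 when not running on Valgrind, since the request yields 0. */
static int pool_is_defined(const void *p, size_t len) {
    return VALGRIND_CHECK_MEM_IS_DEFINED(p, len) == 0;
}
```
]]></programlisting>
<para>When run on Valgrind, reading from a block returned by
<function>pool_alloc</function> before writing to it should then be
treated like reading uninitialised malloc'd memory, and touching the
pool after <function>pool_reset</function> should be reported as an
invalid access.  When not run on Valgrind the requests have no effect
on the program's behaviour.</para>
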
</sect1>
</chapter>