mirror of
https://github.com/Zenithsiz/ftmemsim-valgrind.git
synced 2026-02-04 02:18:37 +00:00
511 lines
20 KiB
XML
511 lines
20 KiB
XML
<?xml version="1.0"?> <!-- -*- sgml -*- -->
|
|
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
|
|
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
|
|
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
|
|
|
|
|
|
<chapter id="pc-manual"
|
|
xreflabel="Ptrcheck: an experimental heap, stack and global array overrun detector">
|
|
<title>Ptrcheck: an experimental heap, stack and global array overrun detector</title>
|
|
|
|
<para>To use this tool, you must specify
|
|
<option>--tool=exp-ptrcheck</option> on the Valgrind
|
|
command line.</para>
|
|
|
|
|
|
|
|
|
|
<sect1 id="pc-manual.overview" xreflabel="Overview">
|
|
<title>Overview</title>
|
|
|
|
<para>Ptrcheck is a tool for finding overruns of heap, stack
|
|
and global arrays. Its functionality overlaps somewhat with
|
|
Memcheck's, but it is able to catch invalid accesses in a number of
|
|
cases that Memcheck would miss. A detailed comparison against
|
|
Memcheck is presented below.</para>
|
|
|
|
<para>Ptrcheck is composed of two almost completely independent tools
|
|
that have been glued together. One part,
|
|
in <computeroutput>h_main.[ch]</computeroutput>, checks accesses
|
|
through heap-derived pointers. The other part, in
|
|
<computeroutput>sg_main.[ch]</computeroutput>, checks accesses to
|
|
stack and global arrays. The remaining
|
|
files <computeroutput>pc_{common,main}.[ch]</computeroutput>, provide
|
|
common error-management and coordination functions, so as to make it
|
|
appear as a single tool.</para>
|
|
|
|
<para>The heap-check part is an extensively-hacked (largely rewritten)
|
|
version of the experimental "Annelid" tool developed and described by
|
|
Nicholas Nethercote and Jeremy Fitzhardinge. The stack- and global-
|
|
check part uses a heuristic approach derived from an observation about
|
|
the likely forms of stack and global array accesses, and, as far as is
|
|
known, is entirely novel.</para>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
|
|
<sect1 id="pc-manual.options" xreflabel="Ptrcheck Command-line Options">
|
|
<title>Ptrcheck Command-line Options</title>
|
|
|
|
<para>Ptrcheck-specific command-line options are:</para>
|
|
|
|
<!-- start of xi:include in the manpage -->
|
|
<variablelist id="pc.opts.list">
|
|
|
|
<varlistentry id="opt.enable-sg-checks" xreflabel="--enable-sg-checks">
|
|
<term>
|
|
<option><![CDATA[--enable-sg-checks=no|yes
|
|
[default: yes] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>By default, Ptrcheck checks for overruns of stack, global
|
|
and heap arrays.
|
|
With <varname>--enable-sg-checks=no</varname>, the stack and
|
|
global array checks are omitted, and only heap checking is
|
|
performed. This can be useful because the stack and global
|
|
checks are quite expensive, so omitting them speeds Ptrcheck up
|
|
a lot.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
|
|
<term>
|
|
<option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
|
|
</term>
|
|
<listitem>
|
|
<para>This option has the same meaning as it does for
|
|
Memcheck.</para>
|
|
<para>Controls how Ptrcheck handles word-sized, word-aligned
|
|
loads which partially overlap the end of heap blocks -- that is,
|
|
some of the bytes in the word are validly addressable, but
|
|
others are not. When <varname>yes</varname>, such loads do not
|
|
produce an address error. When <varname>no</varname> (the
|
|
default), loads from partially invalid addresses are treated the
|
|
same as loads from completely invalid addresses: an illegal heap
|
|
access error is issued.
|
|
</para>
|
|
<para>Note that code that behaves in this way is in violation of
|
|
the the ISO C/C++ standards, and should be considered broken. If
|
|
at all possible, such code should be fixed. This option should be
|
|
used only as a last resort.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
<!-- end of xi:include in the manpage -->
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
|
|
<sect1 id="pc-manual.how-works.heap-checks"
|
|
xreflabel="How Ptrcheck Works: Heap Checks">
|
|
<title>How Ptrcheck Works: Heap Checks</title>
|
|
|
|
<para>Ptrcheck can check for invalid uses of heap pointers, including
|
|
out of range accesses and accesses to freed memory. The mechanism is
|
|
however completely different from Memcheck's, and the checking is more
|
|
powerful.</para>
|
|
|
|
<para>For each pointer in the program, Ptrcheck keeps track of which
|
|
heap block (if any) it was derived from. Then, when an access is made
|
|
through that pointer, Ptrcheck compares the access address with the
|
|
bounds of the associated block, and reports an error if the address is
|
|
out of bounds, or if the block has been freed.</para>
|
|
|
|
<para>Of course it is rarely the case that one wants to access a block
|
|
only at the exact address returned by <function>malloc</function> et al.
|
|
Ptrcheck understands that adding or subtracting offsets from a pointer to a
|
|
block results in a pointer to the same block.</para>
|
|
|
|
<para>At a fundamental level, this scheme works because a correct
|
|
program cannot make assumptions about the addresses returned by
|
|
<function>malloc</function> et al. In particular it cannot make any
|
|
assumptions about the differences in addresses returned by subsequent calls
|
|
to <function>malloc</function> et al. Hence there are very few ways to take
|
|
an address returned by <function>malloc</function>, modify it, and still
|
|
have a valid address. In short, the only allowable operations are adding
|
|
and subtracting other non-pointer values. Almost all other operations
|
|
produce a value which cannot possibly be a valid pointer.</para>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="pc-manual.how-works.sg-checks"
|
|
xreflabel="How Ptrcheck Works: Stack and Global Checks">
|
|
<title>How Ptrcheck Works: Stack and Global Checks</title>
|
|
|
|
<para>When a source file is compiled
|
|
with <option>-g</option>, the compiler attaches DWARF3
|
|
debugging information which describes the location of all stack and
|
|
global arrays in the file.</para>
|
|
|
|
<para>Checking of accesses to such arrays would then be relatively
|
|
simple, if the compiler could also tell us which array (if any) each
|
|
memory referencing instruction was supposed to access. Unfortunately
|
|
the DWARF3 debugging format does not provide a way to represent such
|
|
information, so we have to resort to a heuristic technique to
|
|
approximate the same information. The key observation is that
|
|
<emphasis>
|
|
if a memory referencing instruction accesses inside a stack or
|
|
global array once, then it is highly likely to always access that
|
|
same array</emphasis>.</para>
|
|
|
|
<para>To see how this might be useful, consider the following buggy
|
|
fragment:</para>
|
|
<programlisting><![CDATA[
|
|
{ int i, a[10]; // both are auto vars
|
|
for (i = 0; i <= 10; i++)
|
|
a[i] = 42;
|
|
}
|
|
]]></programlisting>
|
|
|
|
<para>At run time we will know the precise address
|
|
of <computeroutput>a[]</computeroutput> on the stack, and so we can
|
|
observe that the first store resulting from <computeroutput>a[i] =
|
|
42</computeroutput> writes <computeroutput>a[]</computeroutput>, and
|
|
we will (correctly) assume that that instruction is intended always to
|
|
access <computeroutput>a[]</computeroutput>. Then, on the 11th
|
|
iteration, it accesses somewhere else, possibly a different local,
|
|
possibly an un-accounted for area of the stack (eg, spill slot), so
|
|
Ptrcheck reports an error.</para>
|
|
|
|
<para>There is an important caveat.</para>
|
|
|
|
<para>Imagine a function such as <function>memcpy</function>, which is used
|
|
to read and write many different areas of memory over the lifetime of the
|
|
program. If we insist that the read and write instructions in its memory
|
|
copying loop only ever access one particular stack or global variable, we
|
|
will be flooded with errors resulting from calls to
|
|
<function>memcpy</function>.</para>
|
|
|
|
<para>To avoid this problem, Ptrcheck instantiates fresh likely-target
|
|
records for each entry to a function, and discards them on exit. This
|
|
allows detection of cases where (e.g.) <function>memcpy</function> overflows
|
|
its source or destination buffers for any specific call, but does not carry
|
|
any restriction from one call to the next. Indeed, multiple threads may be
|
|
multiple simultaneous calls to (e.g.) <function>memcpy</function> without
|
|
mutual interference.</para>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
|
|
<sect1 id="pc-manual.cmp-w-memcheck"
|
|
xreflabel="Comparison with Memcheck">
|
|
<title>Comparison with Memcheck</title>
|
|
|
|
<para>Memcheck does not do any access checks for stack or global arrays, so
|
|
the presence of those in Ptrcheck is a straight win. (But see
|
|
"Limitations" below).</para>
|
|
|
|
<para>Memcheck and Ptrcheck use different approaches for checking heap
|
|
accesses. Memcheck maintains bitmaps telling it which areas of memory
|
|
are accessible and which are not. If a memory access falls in an
|
|
unaccessible area, it reports an error. By marking the 16 bytes
|
|
before and after an allocated block unaccessible, Memcheck is able to
|
|
detect small over- and underruns of the block. Similarly, by marking
|
|
freed memory as unaccessible, Memcheck can detect all accesses to
|
|
freed memory.</para>
|
|
|
|
<para>Memcheck's approach is simple. But it's also weak. It can't
|
|
catch block overruns beyond 16 bytes. And, more generally, because it
|
|
focusses only on the question "is the target address accessible", it
|
|
fails to detect invalid accesses which just happen to fall within some
|
|
other valid area. This is not improbable, especially in crowded areas
|
|
of the process' address space.</para>
|
|
|
|
<para>Ptrcheck's approach is to keep track of pointers derived from
|
|
heap blocks. It tracks pointers which are derived directly from calls
|
|
to <function>malloc</function> et al, but also ones derived indirectly, by
|
|
adding or subtracting offsets from the directly-derived pointers. When a
|
|
pointer is finally used to access memory, Ptrcheck compares the access
|
|
address with that of the block it was originally derived from, and
|
|
reports an error if the access address is not within the block
|
|
bounds.</para>
|
|
|
|
<para>Consequently Ptrcheck can detect any out of bounds access
|
|
through a heap-derived pointer, no matter how far from the original
|
|
block it is.</para>
|
|
|
|
<para>A second advantage is that Ptrcheck is better at detecting
|
|
accesses to blocks freed very far in the past. Memcheck can detect
|
|
these too, but only for blocks freed relatively recently. To detect
|
|
accesses to a freed block, Memcheck must make it inaccessible, hence
|
|
requiring a space overhead proportional to the size of the block. If
|
|
the blocks are large, Memcheck will have to make them available for
|
|
re-allocation relatively quickly, thereby losing the ability to detect
|
|
invalid accesses to them.</para>
|
|
|
|
<para>By contrast, Ptrcheck has a constant per-block space requirement
|
|
of four machine words, for detection of accesses to freed blocks. A
|
|
freed block can be reallocated immediately, yet Ptrcheck can still
|
|
detect all invalid accesses through any pointers derived from the old
|
|
allocation, providing only that the four-word descriptor for the old
|
|
allocation is stored. For example, on a 64-bit machine, to detect
|
|
accesses in any of the most recently freed 10 million blocks, Ptrcheck
|
|
will require only 320MB of extra storage. Achieving the same level of
|
|
detection with Memcheck is close to impossible and would likely
|
|
involve several gigabytes of extra storage.</para>
|
|
|
|
<para>Having said all that, remember that Memcheck performs uninitialised
|
|
value checking, invalid and mismatched free checking, overlap checking, and
|
|
leak checking, none of which Ptrcheck do. Memcheck has also benefitted from
|
|
years of refinement, tuning, and experience with production-level usage, and
|
|
so is much faster than Ptrcheck as it currently stands.
|
|
</para>
|
|
|
|
<para>Consequently we recommend you first make your programs run Memcheck
|
|
clean. Once that's done, try Ptrcheck to see if you can shake out any
|
|
further heap, global or stack errors.</para>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
|
|
|
|
<sect1 id="pc-manual.limitations"
|
|
xreflabel="Limitations">
|
|
<title>Limitations</title>
|
|
|
|
<para>This is an experimental tool, which relies rather too heavily on some
|
|
not-as-robust-as-I-would-like assumptions on the behaviour of correct
|
|
programs. There are a number of limitations which you should be aware
|
|
of.</para>
|
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
<para>Heap checks: Ptrcheck can occasionally lose track of, or
|
|
become confused about, which heap block a given pointer has been
|
|
derived from. This can cause it to falsely report errors, or to
|
|
miss some errors. This is not believed to be a serious
|
|
problem.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Heap checks: Ptrcheck only tracks pointers that are stored
|
|
properly aligned in memory. If a pointer is stored at a misaligned
|
|
address, and then later read again, Ptrcheck will lose track of
|
|
what it points at. Similar problem if a pointer is split into
|
|
pieces and later reconsitituted.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Heap checks: Ptrcheck needs to "understand" which system
|
|
calls return pointers and which don't. Many, but not all system
|
|
calls are handled. If an unhandled one is encountered, Ptrcheck will
|
|
abort. Fortunately, adding support for a new syscall is very
|
|
easy.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Stack checks: It follows from the description above (<xref
|
|
linkend="pc-manual.how-works.sg-checks"/>) that the first access by a
|
|
memory referencing instruction to a stack or global array creates an
|
|
association between that instruction and the array, which is checked on
|
|
subsequent accesses by that instruction, until the containing function
|
|
exits. Hence, the first access by an instruction to an array (in any
|
|
given function instantiation) is not checked for overrun, since Ptrcheck
|
|
uses that as the "example" of how subsequent accesses should
|
|
behave.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Stack checks: Similarly, and more serious, it is clearly
|
|
possible to write legitimate pieces of code which break the basic
|
|
assumption upon which the stack/global checking rests. For
|
|
example:</para>
|
|
|
|
<programlisting><![CDATA[
|
|
{ int a[10], b[10], *p, i;
|
|
for (i = 0; i < 10; i++) {
|
|
p = /* arbitrary condition */ ? &a[i] : &b[i];
|
|
*p = 42;
|
|
}
|
|
}
|
|
]]></programlisting>
|
|
|
|
<para>In this case the store sometimes
|
|
accesses <computeroutput>a[]</computeroutput> and
|
|
sometimes <computeroutput>b[]</computeroutput>, but in no cases is
|
|
the addressed array overrun. Nevertheless the change in target
|
|
will cause an error to be reported.</para>
|
|
|
|
<para>It is hard to see how to get around this problem. The only
|
|
mitigating factor is that such constructions appear very rare, at
|
|
least judging from the results using the tool so far. Such a
|
|
construction appears only once in the Valgrind sources (running
|
|
Valgrind on Valgrind) and perhaps two or three times for a start
|
|
and exit of Firefox. The best that can be done is to suppress the
|
|
errors.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Performance: the stack/global checks require reading all of
|
|
the DWARF3 type and variable information on the executable and its
|
|
shared objects. This is computationally expensive and makes
|
|
startup quite slow. You can expect debuginfo reading time to be in
|
|
the region of a minute for an OpenOffice sized application, on a
|
|
2.4 GHz Core 2 machine. Reading this information also requires a
|
|
lot of memory. To make it viable, Ptrcheck goes to considerable
|
|
trouble to compress the in-memory representation of the DWARF3
|
|
data, which is why the process of reading it appears slow.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Performance: Ptrcheck runs slower than Memcheck. This is
|
|
partly due to a lack of tuning, but partly due to algorithmic
|
|
difficulties. The heap-check side is potentially quite fast. The
|
|
stack and global checks can sometimes require a number of range
|
|
checks per memory access, and these are difficult to short-circuit
|
|
(despite considerable efforts having been made).
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Coverage: the heap checking is relatively robust, requiring
|
|
only that Ptrcheck can see calls to <function>malloc</function> et al.
|
|
In that sense it has debug-info requirements comparable with Memcheck,
|
|
and is able to heap-check programs even with no debugging information
|
|
attached.</para>
|
|
|
|
<para>Stack/global checking is much more fragile. If a shared
|
|
object does not have debug information attached, then Ptrcheck will
|
|
not be able to determine the bounds of any stack or global arrays
|
|
defined within that shared object, and so will not be able to check
|
|
accesses to them. This is true even when those arrays are accessed
|
|
from some other shared object which was compiled with debug
|
|
info.</para>
|
|
|
|
<para>At the moment Ptrcheck accepts objects lacking debuginfo
|
|
without comment. This is dangerous as it causes Ptrcheck to
|
|
silently skip stack and global checking for such objects. It would
|
|
be better to print a warning in such circumstances.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Coverage: Ptrcheck checks that the areas read or written by
|
|
system calls do not overrun heap blocks. But it doesn't currently
|
|
check them for overruns stack and global arrays. This would be
|
|
easy to add.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Platforms: the stack/global checks won't work properly on any
|
|
PowerPC platforms, only on x86 and amd64 targets. That's because
|
|
the stack and global checking requires tracking function calls and
|
|
exits reliably, and there's no obvious way to do it with the PPC
|
|
ABIs. (In comparison, with the x86 and amd64 ABIs this is relatively
|
|
straightforward.)</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Robustness: related to the previous point. Function
|
|
call/exit tracking for x86/amd64 is believed to work properly even
|
|
in the presence of longjmps within the same stack (although this
|
|
has not been tested). However, code which switches stacks is
|
|
likely to cause breakage/chaos.</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
|
|
|
|
<sect1 id="pc-manual.todo-user-visible"
|
|
xreflabel="Still To Do: User-visible Functionality">
|
|
<title>Still To Do: User-visible Functionality</title>
|
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
<para>Extend system call checking to work on stack and global arrays.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Print a warning if a shared object does not have debug info
|
|
attached, or if, for whatever reason, debug info could not be
|
|
found, or read.</para>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
|
|
<sect1 id="pc-manual.todo-implementation"
|
|
xreflabel="Still To Do: Implementation Tidying">
|
|
<title>Still To Do: Implementation Tidying</title>
|
|
|
|
<para>Items marked CRITICAL are considered important for correctness:
|
|
non-fixage of them is liable to lead to crashes or assertion failures
|
|
in real use.</para>
|
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
<para>h_main.c: make N_FREED_SEGS command-line configurable.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para> sg_main.c: Improve the performance of the stack / global
|
|
checks by doing some up-front filtering to ignore references in
|
|
areas which "obviously" can't be stack or globals. This will
|
|
require using information that m_aspacemgr knows about the address
|
|
space layout.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>h_main.c: get rid of the last_seg_added hack; add suitable
|
|
plumbing to the core/tool interface to do this cleanly.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>h_main.c: move vast amounts of arch-dependent ugliness
|
|
(get_IntRegInfo et al) to its own source file, a la
|
|
mc_machine.c.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>h_main.c: make the lossage-check stuff work again, as a way
|
|
of doing quality assurance on the implementation.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>h_main.c: schemeEw_Atom: don't generate a call to
|
|
nonptr_or_unknown, this is really stupid, since it could be done at
|
|
translation time instead.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>CRITICAL: h_main.c: h_instrument (main instrumentation fn):
|
|
generate shadows for word-sized temps defined in the block's
|
|
preamble. (Why does this work at all, as it stands?)</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>sg_main.c: fix compute_II_hash to make it a bit more sensible
|
|
for ppc32/64 targets (except that sg_ doesn't work on ppc32/64
|
|
targets, so this is a bit academic at the moment).</para>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
</chapter>
|