mirror of https://github.com/Zenithsiz/ftmemsim-valgrind.git (synced 2026-02-03 18:13:01 +00:00)
Minor HTML fixes in docs, thanks to Arnaud Desitter.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@1522
parent 7332367dab
commit 3cc0c8f8fa
@@ -54,8 +54,7 @@ The two steps are:
 This step should be done every time you want to collect
 information about a new program, a changed program, or about the
 same program with different input.
-</li>
-<p>
+</li><p>
 <li>Generate a function-by-function summary, and possibly annotate
 source files, using the supplied
 <code>cg_annotate</code> program. Source files to annotate can be
@@ -66,14 +65,14 @@ The two steps are:
 <p>
 This step can be performed as many times as you like for each
 Step 2. You may want to do multiple annotations showing
-different information each time.<p>
-</li>
+different information each time.
+</li><p>
 </ol>

-The steps are described in detail in the following sections.<p>
+The steps are described in detail in the following sections.


-<h4>4.3 Cache simulation specifics</h3>
+<h3>4.3 Cache simulation specifics</h3>

 Cachegrind uses a simulation for a machine with a split L1 cache and a unified
 L2 cache. This configuration is used for all (modern) x86-based machines we
@@ -85,20 +84,22 @@ The more specific characteristics of the simulation are as follows.
 <ul>
 <li>Write-allocate: when a write miss occurs, the block written to
 is brought into the D1 cache. Most modern caches have this
-property.</li><p>
-
+property.<p>
+</li>
+<p>
 <li>Bit-selection hash function: the line(s) in the cache to which a
 memory block maps is chosen by the middle bits M--(M+N-1) of the
 byte address, where:
 <ul>
 <li> line size = 2^M bytes </li>
 <li>(cache size / line size) = 2^N bytes</li>
-</ul> </li><p>
-
+</ul>
+</li>
+<p>
 <li>Inclusive L2 cache: the L2 cache replicates all the entries of
 the L1 cache. This is standard on Pentium chips, but AMD
 Athlons use an exclusive L2 cache that only holds blocks evicted
-from L1. Ditto AMD Durons and most modern VIAs.</li><p>
+from L1. Ditto AMD Durons and most modern VIAs.</li>
 </ul>

 The cache configuration simulated (cache size, associativity and line size) is
@@ -108,8 +109,9 @@ an early incarnation that doesn't give any cache information, then Cachegrind
 will fall back to using a default configuration (that of a model 3/4 Athlon).
 Cachegrind will tell you if this happens. You can manually specify one, two or
 all three levels (I1/D1/L2) of the cache from the command line using the
-<code>--I1</code>, <code>--D1</code> and <code>--L2</code> options.<p>
+<code>--I1</code>, <code>--D1</code> and <code>--L2</code> options.

+<p>
 Other noteworthy behaviour:

 <ul>
@@ -118,14 +120,15 @@ Other noteworthy behaviour:
 <li>If both blocks hit --> counted as one hit</li>
 <li>If one block hits, the other misses --> counted as one miss</li>
 <li>If both blocks miss --> counted as one miss (not two)</li>
-</ul><p></li>
+</ul>
+</li>

 <li>Instructions that modify a memory location (eg. <code>inc</code> and
 <code>dec</code>) are counted as doing just a read, ie. a single data
 reference. This may seem strange, but since the write can never cause a
 miss (the read guarantees the block is in the cache) it's not very
-interesting.<p>
-
+interesting.
+<p>
 Thus it measures not the number of times the data cache is accessed, but
 the number of times a data cache miss could occur.<p>
 </li>
@@ -170,14 +173,14 @@ that look like this will be printed:
 Cache accesses for instruction fetches are summarised first, giving the
 number of fetches made (this is the number of instructions executed, which
 can be useful to know in its own right), the number of I1 misses, and the
-number of L2 instruction (<code>L2i</code>) misses.<p>
-
+number of L2 instruction (<code>L2i</code>) misses.
+<p>
 Cache accesses for data follow. The information is similar to that of the
 instruction fetches, except that the values are also shown split between reads
 and writes (note each row's <code>rd</code> and <code>wr</code> values add up
-to the row's total).<p>
-
-Combined instruction and data figures for the L2 cache follow that.<p>
+to the row's total).
+<p>
+Combined instruction and data figures for the L2 cache follow that.


 <h3>4.5 Output file</h3>
@@ -194,8 +197,7 @@ Things to note about the <code>cachegrind.out.<i>pid</i></code> file:
 is run, and will overwrite any existing
 <code>cachegrind.out.<i>pid</i></code> in the current directory (but
 that won't happen very often because it takes some time for process ids
-to be recycled).</li>
-<p>
+to be recycled).</li><p>
 <li>It can be huge: <code>ls -l</code> generates a file of about
 350KB. Browsing a few files and web pages with a Konqueror
 built with full debugging information generates a file
@@ -175,7 +175,7 @@ three spaces in which program code executes:
 <a name="writingaskin"></a>
 <h2>2 Writing a Skin</h2>

-<a name="whywriteaskin"</a>
+<a name="whywriteaskin"></a>
 <h3>2.1 Why write a skin?</h3>

 Before you write a skin, you should have some idea of what it should do. What
@@ -209,7 +209,7 @@ the number of times a particular function is called) to very intrusive (e.g.
 memcheck's memory checking).


-<a name="suggestedskins"</a>
+<a name="suggestedskins"></a>
 <h3>2.2 Suggested skins</h3>

 Here is a list of ideas we have had for skins that should not be too hard to
@@ -279,7 +279,7 @@ implement.

 We would love to hear from anyone who implements these or other skins.

-<a name="howskinswork"</a>
+<a name="howskinswork"></a>
 <h3>2.3 How skins work</h3>

 Skins must define various functions for instrumenting programs that are called
@@ -299,7 +299,7 @@ This magic is all done for you; the shared object used is chosen with the
 <code>--skin</code> option to the <code>valgrind</code> startup script. The
 default skin used is <code>memcheck</code>, Valgrind's original memory checker.

-<a name="gettingcode"</a>
+<a name="gettingcode"></a>
 <h3>2.4 Getting the code</h3>

 To write your own skin, you'll need to check out a copy of Valgrind from the
@@ -325,7 +325,7 @@ cvs -z3 -d:pserver:anonymous@cvs.valgrind.sourceforge.net:/cvsroot/valgrind co -
 where <code><i>TAG</i></code> has the form <code>VALGRIND_X_Y_Z</code> for
 version X.Y.Z.

-<a name="gettingstarted"</a>
+<a name="gettingstarted"></a>
 <h3>2.5 Getting started</h3>

 Valgrind uses GNU <code>automake</code> and <code>autoconf</code> for the
@@ -532,7 +532,7 @@ These just prepend longer strings in front of names to avoid potential
 namespace clashes. We strongly recommend using the <code>SK_</code> macro
 for any global functions and variables in your skin.<p>

-<a name="wordsofadvice"</a>
+<a name="wordsofadvice"></a>
 <h3>2.11 Words of Advice</h3>

 Writing and debugging skins is not trivial. Here are some suggestions for
@@ -556,6 +556,7 @@ Valgrind with some effort:
 for (p = 0; p < 50000; p++)
 for (q = 0; q < 50000; q++) ;
 }
+</pre>
 </li><p>
 and rebuild Valgrind.

@@ -594,7 +595,7 @@ The other debugging command line options can be useful too (run <code>valgrind
 Once a skin becomes more complicated, there are some extra things you may
 want/need to do.

-<a name="suppressions"</a>
+<a name="suppressions"></a>
 <h3>3.1 Suppressions</h3>

 If your skin reports errors and you want to suppress some common ones, you can
@@ -603,7 +604,7 @@ add suppressions to the suppression files. The relevant files are
 these files by combining the relevant <code>.supp</code> files depending on the
 versions of linux, X and glibc on a system.

-<a name="documentation"</a>
+<a name="documentation"></a>
 <h3>3.2 Documentation</h3>

 If you are feeling conscientious and want to write some HTML documentation for
@@ -636,7 +637,7 @@ name again):
 </li><p>
 </ol>

-<a name="regressiontests"</a>
+<a name="regressiontests"></a>
 <h3>3.3 Regression tests</h3>

 Valgrind has some support for regression tests. If you want to write
@@ -673,7 +674,7 @@ regression tests for your skin:
 </li><p>
 </ol>

-<a name="profiling"</a>
+<a name="profiling"></a>
 <h3>3.4 Profiling</h3>

 To do simple tick-based profiling of a skin, include the line
@@ -691,7 +692,7 @@ core profiling event numbers. See <code>include/vg_skin.h</code> for details
 and the ``memcheck'' skin for an example.


-<a name="othermakefilehackery"</a>
+<a name="othermakefilehackery"></a>
 <h3>3.5 Other makefile hackery</h3>

 If you add any directories under <code>valgrind/foobar/</code>, you will
@@ -704,7 +705,7 @@ add them to the <code>bin_SCRIPTS</code> variable in
 <code>valgrind/foobar/Makefile.am</code>.<p>


-<a name="interfaceversions"</a>
+<a name="interfaceversions"></a>
 <h3>3.5 Core/skin interface versions</h3>

 In order to allow for the core/skin interface to evolve over time, Valgrind
@@ -32,7 +32,7 @@ Detailed technical notes for hackers, maintainers and the
 overly-curious<br>
 These notes pertain to snapshot 20020306<br>
 <p>
-<a href="mailto:jseward@acm.org">jseward@acm.org<br>
+<a href="mailto:jseward@acm.org">jseward@acm.org</a><br>
 <a href="http://developer.kde.org/~sewardj">http://developer.kde.org/~sewardj</a><br>
 Copyright © 2000-2002 Julian Seward
 <p>
@@ -363,7 +363,7 @@ performance or functionality. As a result:
 <li>The main dispatch loop, in <code>VG_(dispatch)</code>, checks
 that translations do not set <code>%ebp</code> to any value
 different from <code>VG_EBP_DISPATCH_CHECKED</code> or
-<code>& VG_(baseBlock)</code>. In effect this test is free,
+<code>&amp; VG_(baseBlock)</code>. In effect this test is free,
 and is permanently engaged.
 <p>
 <li>There are a couple of ifdefed-out consistency checks I
@@ -762,7 +762,7 @@ junk faster than you can possibly imagine.
 <h3>UCode operand tags: type <code>Tag</code></h3>

 UCode is, more or less, a simple two-address RISC-like code. In
-keeping with the x86 AT&T assembly syntax, generally speaking the
+keeping with the x86 AT&amp;T assembly syntax, generally speaking the
 first operand is the source operand, and the second is the destination
 operand, which is modified when the uinstr is notionally executed.

@@ -1725,7 +1725,7 @@ Every 1000 basic blocks, we see if more signals have arrived. If so,
 <code>VG_(deliver_signals)</code> builds signal delivery frames on the
 client's stack, and allows their handlers to be run. Valgrind places
 in these signal delivery frames a bogus return address,
-</code>VG_(signalreturn_bogusRA)</code>, and checks all jumps to see
+<code>VG_(signalreturn_bogusRA)</code>, and checks all jumps to see
 if any jump to it. If so, this is a sign that a signal handler is
 returning, and if so Valgrind removes the relevant signal frame from
 the client's stack, restores the from the signal frame the simulated
@@ -2051,9 +2051,9 @@ void fooble ( void )
 int spacer1;
 int b[10];
 int spacer2;
-VALGRIND_MAKE_NOACCESS(&spacer0, sizeof(int));
-VALGRIND_MAKE_NOACCESS(&spacer1, sizeof(int));
-VALGRIND_MAKE_NOACCESS(&spacer2, sizeof(int));
+VALGRIND_MAKE_NOACCESS(&amp;spacer0, sizeof(int));
+VALGRIND_MAKE_NOACCESS(&amp;spacer1, sizeof(int));
+VALGRIND_MAKE_NOACCESS(&amp;spacer2, sizeof(int));
 a[10] = 99;
 }
 </pre>