Add a section to the cachegrind manual suggesting how to act on the results.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@6853
Nicholas Nethercote 2007-09-17 22:28:21 +00:00
parent 5771d4fcc6
commit c7a4bb81a5

@@ -1226,28 +1226,31 @@ fail these checks.</para>
 <para>
 So, you've managed to profile your program with Cachegrind. Now what?
 What's the best way to actually act on the information it provides to speed
-up your program?</para>
+up your program? Here are some rules of thumb that we have found to be
+useful.</para>
 <para>
 First of all, the global hit/miss rate numbers are not that useful. If you
 have multiple programs or multiple runs of a program, comparing the numbers
-might identify if any are outliers. Otherwise, they're not enough to act
-on.</para>
+might identify if any are outliers and worthy of closer investigation.
+Otherwise, they're not enough to act on.</para>
 <para>
-The source code annotations are much more useful. In our experience, the
-best place to start is by looking at the <computeroutput>Ir</computeroutput>
-numbers. They simply measure how many instructions were executed for each
-line, and don't include any cache information, but they can still be very
-useful for identifying bottlenecks.</para>
+The line-by-line source code annotations are much more useful. In our
+experience, the best place to start is by looking at the
+<computeroutput>Ir</computeroutput> numbers. They simply measure how many
+instructions were executed for each line, and don't include any cache
+information, but they can still be very useful for identifying
+bottlenecks.</para>
 <para>
 After that, we have found that L2 misses are typically a much bigger source
 of slow-downs than L1 misses. So it's worth looking for any snippets of
-code that cause a lot of L2 misses. If you find any, it's still not always
-easy to work out how to improve things. You need to have a reasonable
-understanding of how caches work, the principles of locality, and your
-program's data access patterns. </para>
+code that cause a high proportion of the L2 misses. If you find any, it's
+still not always easy to work out how to improve things. You need to have a
+reasonable understanding of how caches work, the principles of locality, and
+your program's data access patterns. Improving things may require
+redesigning a data structure, for example.</para>
 <para>
 In short, Cachegrind can tell you where some of the bottlenecks in your code