Valgrind FAQ, version 2.1.2
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Last revised 18 July 2004
~~~~~~~~~~~~~~~~~~~~~~~~~

1. Background
2. Compiling, installing and configuring
3. Valgrind aborts unexpectedly
4. Valgrind behaves unexpectedly
5. Memcheck doesn't find my bug
6. Miscellaneous


-----------------------------------------------------------------
1. Background
-----------------------------------------------------------------

1.1. How do you pronounce "Valgrind"?

The "Val" as in the word "value". The "grind" is pronounced with a
short 'i' -- ie. "grinned" (rhymes with "tinned") rather than "grined"
(rhymes with "find").

Don't feel bad: almost everyone gets it wrong at first.

-----------------------------------------------------------------

1.2. Where does the name "Valgrind" come from?

From Nordic mythology. Originally (before release) the project was
named Heimdall, after the watchman of the Nordic gods. He could "see a
hundred miles by day or night, hear the grass growing, see the wool
growing on a sheep's back" (etc). This would have been a great name,
but it was already taken by a security package "Heimdal".

Keeping with the Nordic theme, Valgrind was chosen. Valgrind is the
name of the main entrance to Valhalla (the Hall of the Chosen Slain in
Asgard). Over this entrance there resides a wolf and over it there is
the head of a boar and on it perches a huge eagle, whose eyes can see to
the far regions of the nine worlds. Only those judged worthy by the
guardians are allowed to pass through Valgrind. All others are refused
entrance.

It's not short for "value grinder", although that's not a bad guess.


-----------------------------------------------------------------
2. Compiling, installing and configuring
-----------------------------------------------------------------

2.1. When I try building Valgrind, 'make' dies partway with an
     assertion failure, something like this:

       make: expand.c:489: allocated_variable_append: Assertion
       `current_variable_set_list->next != 0' failed.

It's probably a bug in 'make'. Some, but not all, instances of version 3.79.1
have this bug; see www.mail-archive.com/bug-make@gnu.org/msg01658.html. Try
upgrading to a more recent version of 'make'. Alternatively, we have heard
that unsetting the CFLAGS environment variable avoids the problem.


-----------------------------------------------------------------
3. Valgrind aborts unexpectedly
-----------------------------------------------------------------

3.1. Programs run OK on Valgrind, but at exit produce a bunch of errors a bit
     like this

  ==20755== Invalid read of size 4
  ==20755==    at 0x40281C8A: _nl_unload_locale (loadlocale.c:238)
  ==20755==    by 0x4028179D: free_mem (findlocale.c:257)
  ==20755==    by 0x402E0962: __libc_freeres (set-freeres.c:34)
  ==20755==    by 0x40048DCC: vgPlain___libc_freeres_wrapper
                                                  (vg_clientfuncs.c:585)
  ==20755==    Address 0x40CC304C is 8 bytes inside a block of size 380 free'd
  ==20755==    at 0x400484C9: free (vg_clientfuncs.c:180)
  ==20755==    by 0x40281CBA: _nl_unload_locale (loadlocale.c:246)
  ==20755==    by 0x40281218: free_mem (setlocale.c:461)
  ==20755==    by 0x402E0962: __libc_freeres (set-freeres.c:34)

     and then die with a segmentation fault.

When the program exits, Valgrind runs the procedure __libc_freeres() in
glibc. This is a hook for memory debuggers, so they can ask glibc to
free up any memory it has used. Doing that is needed to ensure that
Valgrind doesn't incorrectly report space leaks in glibc.

The problem is that running __libc_freeres() in older glibc versions
causes this crash.

WORKAROUND FOR 1.1.X and later versions of Valgrind: use the
--run-libc-freeres=no flag. You may then get space leak reports for
glibc allocations (please _don't_ report these to the glibc people,
since they are not real leaks), but at least the program runs.

-----------------------------------------------------------------

3.2. My (buggy) program dies like this:
       valgrind: vg_malloc2.c:442 (bszW_to_pszW):
                 Assertion `pszW >= 0' failed.

If Memcheck (the memory checker) shows any invalid reads, invalid writes
or invalid frees in your program, the above may happen. The reason is
that your program may trash Valgrind's low-level memory manager, which
then dies with the above assertion, or something like it. The cure is to
fix your program so that it doesn't do any illegal memory accesses. The
above failure will hopefully go away after that.

-----------------------------------------------------------------

3.3. My program dies, printing a message like this along the way:

  disInstr: unhandled instruction bytes: 0x66 0xF 0x2E 0x5

Older versions did not support some x86 instructions, particularly
SSE/SSE2 instructions. Try a newer Valgrind; we now support almost all
instructions. If it still happens with newer versions and the failing
instruction is an SSE/SSE2 instruction, you might be able to recompile
your program without it by using gcc's -march flag. Either way,
let us know and we'll try to fix it.

Another possibility is that your program has a bug and erroneously jumps
to a non-code address, in which case you'll get a SIGILL signal.
Memcheck/Addrcheck may issue a warning just before this happens, but they
might not if the jump happens to land in addressable memory.

-----------------------------------------------------------------

3.4. My program dies like this:

  error: /lib/librt.so.1: symbol __pthread_clock_settime, version
  GLIBC_PRIVATE not defined in file libpthread.so.0 with link time
  reference

This is a total swamp. Nevertheless there is a way out. It's a problem
which is not easy to fix. Really the problem is that /lib/librt.so.1
refers to some symbols __pthread_clock_settime and
__pthread_clock_gettime in /lib/libpthread.so which are not intended to
be exported, ie. they are private.

The best solution is to ensure your program does not use /lib/librt.so.1.

However, since you're probably not using it directly, or even
knowingly, that's hard to do. You might instead be able to fix it by
playing around with coregrind/vg_libpthread.vs. Things to try:

Remove this

  GLIBC_PRIVATE {
     __pthread_clock_gettime;
     __pthread_clock_settime;
  };

or maybe remove this

  GLIBC_2.2.3 {
     __pthread_clock_gettime;
     __pthread_clock_settime;
  } GLIBC_2.2;

or maybe add this

  GLIBC_2.2.4 {
     __pthread_clock_gettime;
     __pthread_clock_settime;
  } GLIBC_2.2;

  GLIBC_2.2.5 {
     __pthread_clock_gettime;
     __pthread_clock_settime;
  } GLIBC_2.2;

or some combination of the above. After each change you need to delete
coregrind/libpthread.so and do make && make install.

I just don't know if any of the above will work. If you can find a
solution which works, I would be interested to hear it.

To which someone replied:

  I deleted this:

  GLIBC_2.2.3 {
     __pthread_clock_gettime;
     __pthread_clock_settime;
  } GLIBC_2.2;

  and it worked.


-----------------------------------------------------------------
4. Valgrind behaves unexpectedly
-----------------------------------------------------------------

4.1. I try running "valgrind my_program", but my_program runs normally,
     and Valgrind doesn't emit any output at all.

For versions prior to 2.1.1:

Valgrind doesn't work out-of-the-box with programs that are entirely
statically linked. It does a quick test at startup, and if it detects
that the program is statically linked, it aborts with an explanation.

This test may fail in some obscure cases, eg. if you run a script under
Valgrind and the script interpreter is statically linked.

If you still want static linking, you can ask gcc to link certain
libraries statically. Try the following options:

  -Wl,-Bstatic -lmyLibrary1 -lotherLibrary -Wl,-Bdynamic

Just make sure you end with -Wl,-Bdynamic so that libc is dynamically
linked.

If you absolutely cannot use dynamic libraries, you can try statically
linking together all the .o files in coregrind/, all the .o files of the
tool of your choice (eg. those in memcheck/), and the .o files of your
program. You'll end up with a statically linked binary that runs
permanently under Valgrind's control. Note that we haven't tested this
procedure thoroughly.


For versions 2.1.1 and later:

Valgrind does now work with static binaries, although beware that some
of the tools won't operate as well as normal, because they have access
to less information about how the program runs. Eg. Memcheck will miss
some errors that it would otherwise find. This is because Valgrind
doesn't replace malloc() and friends with its own versions. It's best
if your program is dynamically linked with glibc.

-----------------------------------------------------------------

4.2. My threaded server process runs unbelievably slowly on Valgrind.
     So slowly, in fact, that at first I thought it had completely
     locked up.

We are not completely sure about this, but one possibility is that
laptops with power management fool Valgrind's timekeeping mechanism,
which is (somewhat in error) based on the x86 RDTSC instruction. A
"fix" which is claimed to work is to run some other cpu-intensive
process at the same time, so that the laptop's power-management
clock-slowing does not kick in. We would be interested in hearing more
feedback on this.

Another possible cause is that versions prior to 1.9.6 did not support
threading on glibc 2.3.X systems well. Hopefully the situation is much
improved with 1.9.6 and later versions.

-----------------------------------------------------------------

4.3. My program uses the C++ STL and string classes. Valgrind
     reports 'still reachable' memory leaks involving these classes
     at the exit of the program, but there should be none.

First of all: relax, it's probably not a bug, but a feature. Many
implementations of the C++ standard libraries use their own memory pool
allocators. Memory for quite a number of destructed objects is not
immediately freed and given back to the OS, but kept in the pool(s) for
later re-use. Because the pools are not freed at the exit() of the
program, Valgrind reports this memory as still reachable. Not freeing
the pools at exit() could be called a bug in the library, though.

Using gcc, you can force the STL to use malloc and to free memory as
soon as possible by globally disabling memory caching. Beware! Doing
so will probably slow down your program, sometimes drastically.

- With gcc 2.91, 2.95, 3.0 and 3.1, compile all source using the STL
  with -D__USE_MALLOC. Beware! This is removed from gcc starting with
  version 3.3.

- With gcc 3.2.2 and later, you should export the environment variable
  GLIBCPP_FORCE_NEW before running your program.

There are other ways to disable memory pooling: using the malloc_alloc
template with your objects (not portable, but should work for gcc) or
even writing your own memory allocators. But all this goes beyond the
scope of this FAQ. Start by reading
http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#3 if you
absolutely want to do that. But beware:

1) there are currently changes underway for gcc which are not totally
   reflected in the docs right now ("now" == 26 Apr 03)

2) allocators belong to the more messy parts of the STL and people went
   to great lengths to make it portable across platforms. Chances are
   good that your solution will work on your platform, but not on
   others.

-----------------------------------------------------------------

4.4. The stack traces given by Memcheck (or another tool) aren't helpful.
     How can I improve them?

If they're not long enough, use --num-callers to make them longer.

If they're not detailed enough, make sure you are compiling with -g to add
debug information. And don't strip symbol tables (programs should be
unstripped unless you run 'strip' on them; some libraries ship stripped).

Also, -fomit-frame-pointer and -fstack-check can make stack traces worse.

Some example sub-traces:

 With debug information and unstripped (best):

    Invalid write of size 1
       at 0x80483BF: really (malloc1.c:20)
       by 0x8048370: main (malloc1.c:9)

 With no debug information, unstripped:

    Invalid write of size 1
       at 0x80483BF: really (in /auto/homes/njn25/grind/head5/a.out)
       by 0x8048370: main (in /auto/homes/njn25/grind/head5/a.out)

 With no debug information, stripped:

    Invalid write of size 1
       at 0x80483BF: (within /auto/homes/njn25/grind/head5/a.out)
       by 0x8048370: (within /auto/homes/njn25/grind/head5/a.out)
       by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so)
       by 0x80482CC: (within /auto/homes/njn25/grind/head5/a.out)

 With debug information and -fomit-frame-pointer:

    Invalid write of size 1
       at 0x80483C4: really (malloc1.c:20)
       by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so)
       by 0x80482CC: ??? (start.S:81)

-----------------------------------------------------------------
5. Memcheck doesn't find my bug
-----------------------------------------------------------------

5.1. I try running "valgrind --tool=memcheck my_program" and get
     Valgrind's startup message, but I don't get any errors and I know
     my program has errors.

By default, Valgrind only traces the top-level process. So if your
program spawns children, they won't be traced by Valgrind by default.
Also, if your program is started by a shell script, Perl script, or
something similar, Valgrind will trace the shell, or the Perl
interpreter, or equivalent.

To trace child processes, use the --trace-children=yes option.

If you are tracing large trees of processes, it can be less disruptive
to have the output sent over the network. Give Valgrind the flag
--log-socket=127.0.0.1:12345 (if you want logging output sent to port
12345 on localhost). You can use the valgrind-listener program to
listen on that port:

  valgrind-listener 12345

Obviously you have to start the listener process first. See the
documentation for more details.

-----------------------------------------------------------------

5.2. Why doesn't Memcheck find the array overruns in this program?

  /* Note: 'static' is a C keyword, so the arrays need other names. */
  int static_array[5];

  int main(void)
  {
    int stack_array[5];

    static_array[5] = 0;   /* one past the end of the static array */
    stack_array [5] = 0;   /* one past the end of the stack array */

    return 0;
  }

Unfortunately, Memcheck doesn't do bounds checking on static or stack
arrays. We'd like to, but it's just not possible to do in a reasonable
way that fits with how Memcheck works. Sorry.

-----------------------------------------------------------------

5.3. My program dies with a segmentation fault, but Memcheck doesn't give
     any error messages before it, or none that look related.

One possibility is that your program accesses memory with
inappropriate permissions set, such as writing to read-only memory.
Maybe your program is writing to a string literal like this:

  char* s = "hello";
  s[0] = 'j';

or something similar. Writing to read-only memory can also apparently
make LinuxThreads behave strangely.


-----------------------------------------------------------------
6. Miscellaneous
-----------------------------------------------------------------

6.1. I tried writing a suppression but it didn't work. Can you
     write my suppression for me?

Yes! Use the --gen-suppressions=yes feature to spit out suppressions
automatically for you. You can then edit them if you like, eg.
combining similar automatically generated suppressions using wildcards
like '*'.

If you really want to write suppressions by hand, read the manual
carefully. Note particularly that C++ function names must be _mangled_.

-----------------------------------------------------------------

6.2. With Memcheck/Addrcheck's memory leak detector, what's the
     difference between "definitely lost", "possibly lost", "still
     reachable", and "suppressed"?

The details are in section 3.6 of the manual.

In short:

  - "definitely lost" means your program is leaking memory -- fix it!

  - "possibly lost" means your program is probably leaking memory,
    unless you're doing funny things with pointers.

  - "still reachable" means your program is probably ok -- it didn't
    free some memory it could have. This is quite common and often
    reasonable. Don't use --show-reachable=yes if you don't want to see
    these reports.

  - "suppressed" means that a leak error has been suppressed. There are
    some suppressions in the default suppression files. You can ignore
    suppressed errors.

-----------------------------------------------------------------

(this is the end of the FAQ.)