It only ever worked on x86 and amd64, and even on those it had a high false
positive rate and was slow. Everything it does, ASan can do faster, better,
and on more architectures. So there's no reason to keep this tool any more.
This patch changes the option parsing framework to allow a set of
core or tool (currently only memcheck) options to be changed dynamically.
Here is a summary of the new functionality (extracted from NEWS):
* It is now possible to dynamically change the value of many command
line options while your program (or its children) are running under
Valgrind.
To list the dynamically changeable options, run
valgrind --help-dyn-options
You can change the options from the shell by using vgdb to launch
the monitor command "v.clo <clo option>...".
The same monitor command can be used from a gdb connected
to the valgrind gdbserver.
Your program can also change the dynamically changeable options using
the client request VALGRIND_CLO_CHANGE(option).
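For example, a program could change a (dynamically changeable) option from
its own code like this (a minimal sketch; the wrapper function is
hypothetical, and it assumes --show-possibly-lost has been made dynamically
changeable, as done later in this patch):
  #include "valgrind.h"  /* provides the VALGRIND_CLO_CHANGE client request */

  static void enter_quiet_phase(void)
  {
     /* This is a no-op when the program is not running under Valgrind. */
     VALGRIND_CLO_CHANGE("--show-possibly-lost=no");
  }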
Here is a brief description of the code changes.
* the command line option parsing macros now check a 'parsing' mode
to decide whether the given option must be handled or not
(more about the parsing mode below).
* the 'main' command line option parsing code has been split into a function
'process_option' that can now be called by:
- early_process_cmd_line_options
(looping over args, calling process_option in mode "Early")
- main_process_cmd_line_options
(looping over args, calling process_option in mode "Processing")
- the new function VG_(process_dynamic_option) called from
gdbserver or from VALGRIND_CLO_CHANGE (calling
process_option in mode "Dynamic" or "Help")
* So, during startup, process_option is called twice for each arg:
- once during the Early phase
- once during normal Processing
process_option can then be called again during execution.
The parsing mode is thus defined so that the option parsing code
behaves differently (e.g. handles the option or not)
depending on the mode.
// Command line option parsing happens in the following modes:
// cloE : Early processing, used by coregrind m_main.c to parse the
// command line options that must be handled early on.
// cloP : Processing, used by coregrind and tools during startup, when
// doing command line options Processing.
// cloD : Dynamic, used to dynamically change options after startup.
// A subset of the command line options can be changed dynamically
// after startup.
// cloH : Help, special mode to produce the list of dynamically changeable
// options for --help-dyn-options.
typedef
enum {
cloE = 1,
cloP = 2,
cloD = 4,
cloH = 8
} Clo_Mode;
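The mode values are bit flags, so combined modes are possible; the cloPD
mode used further below is presumably just the bitwise OR of cloP and cloD
(a sketch; the exact definition in pub_tool_options.h may differ):
  /* Assumed definition: option accepted both during startup Processing
     and Dynamically after startup. */
  #define cloPD (cloP | cloD)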
The option parsing macros in pub_tool_options.h now all have a new variant
*_CLOM taking the mode(s) in which the given option is accepted.
The old variant is kept and calls the new variant with mode cloP.
The function VG_(check_clom) in the macro compares the current mode
with the modes allowed for the option, and returns True if qq_arg
should be further processed.
For example:
  // Bool argument, eg. --foo=yes or --foo=no
  #define VG_BOOL_CLOM(qq_mode, qq_arg, qq_option, qq_var) \
     (VG_(check_clom) \
         (qq_mode, qq_arg, qq_option, \
          VG_STREQN(VG_(strlen)(qq_option)+1, qq_arg, qq_option"=")) && \
      ({const HChar* val = &(qq_arg)[ VG_(strlen)(qq_option)+1 ]; \
        if VG_STREQ(val, "yes") (qq_var) = True; \
        else if VG_STREQ(val, "no") (qq_var) = False; \
        else VG_(fmsg_bad_option)(qq_arg, "Invalid boolean value '%s'" \
                                  " (should be 'yes' or 'no')\n", val); \
        True; }))
and the kept old variant is simply:
  #define VG_BOOL_CLO(qq_arg, qq_option, qq_var) \
     VG_BOOL_CLOM(cloP, qq_arg, qq_option, qq_var)
To make an option dynamically changeable, it is typically enough to replace
VG_BOOL_CLO(...)
by
VG_BOOL_CLOM(cloPD, ...)
For example:
- else if VG_BOOL_CLO(arg, "--show-possibly-lost", tmp_show) {
+ else if VG_BOOL_CLOM(cloPD, arg, "--show-possibly-lost", tmp_show) {
cloPD means the option value is set/changed during the main command
Processing (P) and Dynamically during execution (D).
Note that the 'body/further processing' of an option is only executed when
the option is recognised and the current parsing mode is acceptable for this option.
This commit thoroughly overhauls DHAT, moving it out of the
"experimental" ghetto. It makes moderate changes to DHAT itself,
including dumping profiling data to a JSON format output file. It also
implements a new data viewer (as a web app, in dhat/dh_view.html).
The main benefits over the old DHAT are as follows.
- The separation of data collection and presentation means you can run a
program once under DHAT and then sort the data in various ways. Also,
full data is in the output file, and the viewer chooses what to omit.
- The data can be sorted in more ways than previously. Some of these
sorts involve useful filters such as "short-lived" and "zero reads or
zero writes".
- The tree structure view avoids the need to choose stack trace depth.
This avoids both the problem of not enough depth (when records that
should be distinct are combined, and may not contain enough
information to be actionable) and the problem of too much depth (when
records that should be combined are separated, making them seem less
important than they really are).
- Byte and block measures are shown with a percentage relative to the
global count, which helps gauge relative significance of different
parts of the profile.
- Byte and block measures are also shown with an allocation rate
(bytes and blocks per million instructions), which enables comparisons
across multiple profiles, even if those profiles represent different
workloads.
- Both global and per-node measurements are taken at the global heap
peak ("At t-gmax"), which gives Massif-like insight into the point of
peak memory use.
- The final/lifetimes stats are a bit more useful than the old deaths
stats. (E.g. the old deaths stats didn't take into account lifetimes
of unfreed blocks.)
- The handling of realloc() has changed. The sequence `p = malloc(100);
realloc(p, 200);` now increases the total block count by 2 and the
total byte count by 300. Previously it increased them by 1 and 200.
The new handling is a more operational view that better reflects the
effect of allocations on performance. It makes a significant
difference in the results, giving paths involving reallocation (e.g.
repeated pushing to a growing vector) more prominence.
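As a concrete illustration of the new accounting (a standalone sketch, not
taken from the test suite):
  #include <stdlib.h>

  int main(void)
  {
     char *p = malloc(100);   /* counted: +1 block, +100 bytes */
     p = realloc(p, 200);     /* counted: +1 block, +200 bytes */
     /* New DHAT total: 2 blocks, 300 bytes.
        Old DHAT total: 1 block, 200 bytes. */
     free(p);
     return 0;
  }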
Other things of note:
- There is now testing, both regression tests that run within the
standard test suite, and viewer-specific tests that cannot run within
the standard test suite. The latter are run by loading
dh_view.html?test=1 in a web browser.
- The commit puts all tool lists in Makefiles (and similar files) in the
following consistent order: memcheck, cachegrind, callgrind, helgrind,
drd, massif, dhat, lackey, none; exp-sgcheck, exp-bbv.
- A lot of fields in dh_main.c have been given more descriptive names.
Those names now match those used in dh_view.js.
Author: Nicholas Nethercote <nnethercote@mozilla.com>
Use inlined frames in Massif XTree output.
This makes Massif's output much easier to follow.
The commit also removes a -1 used on all Massif stack frame addresses.
There was a big comment questioning the presence of that -1, and with it
gone the addresses now match those produced by DHAT.
* New command line options --xtree-leak=no|yes and --xtree-leak-file=<file>
to produce the end-of-execution leak report in an xtree callgrind format
file.
* New option 'xtleak' in the memcheck leak_check monitor command, to
produce the leak report in an xtree file.
* File name template arguments (such as --log-file, --xtree-memory-file, ...)
have a new %n format letter that is replaced by a sequence number.
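For example (illustrative template only), with
  --xtree-leak-file=xtleak.kcg.%p.%n
%p is replaced by the process pid and %n by a sequence number, so that
successive reports get distinct file names.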
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16205
Final patch of the xtree series, which provides the documentation.
The xtree concept was committed in the revisions
16120 : Support pool of unique string in pub_tool_deduppoolalloc.h
16121 : Implement a cache 'address -> symbol name' in m_debuginfo.c
16122 : Add VG_(strIsMemberXA) in pub_tool_xarray.h
16123 : Addition of the pub_tool_xtree.h and pub_tool_xtmemory.h modules, and of the --xtree-memory* options
16124 : Addition of the options --xtree-memory and --xtree-memory-file
16125 : Small changes in callgrind_annotate and callgrind manual
16126 : Locally define vgPlain_scrcmp in 2 unit tests
16127 : Support for xtree memory profiling and xtmemory gdbsrv monitor command in helgrind
16128 : Support for xtree memory profiling and xtmemory gdbsrv monitor command in memcheck
16129 : Update massif implementation to xtree
Some smaller follow-up patches are to be expected, to add some regtests
and refine the documentation.
Thanks to Ivo, Julian and Josef for the review comments.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16131
The message telling that brk cannot be extended confuses some users,
so improve the documentation and have the message reference the doc.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15880
from libstdc++ when available, similar to existing __libc_freeres().
The new option --run-cxx-freeres=<yes|no> can be used to control whether
this cleanup function is called or not.
Note that __gnu_cxx::__freeres() is currently available
only in gcc 6. It is not yet decided what to do about
libstdc++ from gcc 5.
Tracked under https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69945
for libstdc++.
Fixes BZ#345307 (partially).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15840
This implements the interception of all globally public allocation
functions by default. It works by adding a flag to the spec to say that
the interception only applies to global functions; this flag is set for
the somalloc spec. The library path to match is set to "*" unless the user
overrides it. Each DiSym now keeps track of whether the symbol is local
or global. For a spec which has isGlobal set, only isGlobal symbols will
match.
Note that, because of padding to keep the addresses in DiSym aligned, the
addition of the extra bool isGlobal doesn't actually grow the struct.
The comments explain how the struct could be made more compact on 32-bit
systems, but this isn't as easy on 64-bit systems, so I didn't try to do
that in this patch.
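To illustrate the padding point (a hypothetical layout, not the real DiSym
definition):
  #include <stdbool.h>
  #include <stdio.h>

  /* Hypothetical 64-bit layout: the pointer members force 8-byte alignment,
     so the struct already carries trailing padding after the flags; adding
     one more bool lands in that padding and sizeof() does not change. */
  struct sym_before { void *addr; const char *name; unsigned size;
                      bool isText; bool isIFunc; };
  struct sym_after  { void *addr; const char *name; unsigned size;
                      bool isText; bool isIFunc; bool isGlobal; };

  int main(void)
  {
     printf("%zu %zu\n", sizeof(struct sym_before), sizeof(struct sym_after));
     return 0;   /* typically prints the same value twice on LP64 systems */
  }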
For ELF symbols, keeping track of which are global is trivial. For PDB I
had to guess and made only the "Public" symbols global. I don't know
how/if Mach-O keeps track of global symbols. For now I just mark
all of them local (which just means things work as previously on platforms
that use Mach-O: no non-system symbols are matched by default for somalloc
unless the user explicitly says which library name to match).
Included are two testcases, for shared library (wrapmalloc) and statically
linked (wrapmallocstatic) malloc/free overrides, that depend on the new
default. One existing testcase (new_override) was adjusted to explicitly
not use the new somalloc default, because it depends on a user-defined
new implementation that has side effects and should explicitly not be
intercepted.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15726
to restrict the change to those architectures that do provide automatic
D-I coherence (x86, amd64, s390x). This commit restores the default
value for all other architectures back to its pre r15601 state, so as not
to burden those architectures unnecessarily with =all-non-file.
Also, this rewrites the relevant manual section.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15602
the size of the translation table sectors, either to gain memory
or to avoid too many retranslations.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15016
This allows memory usage to be decreased when using many threads,
if Valgrind does not need a big stack size.
If needed (e.g. for demangling big C++ symbols), the Valgrind stack size
can be increased.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15004
for the number of connected processes. This is still lame but better
than asking her to recompile.
Part of fixing BZ #337869.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14838
* This option can be used to mark the beginning/end of errors in textual
output mode, to facilitate searching for/extracting errors in output files
that mix valgrind errors with program output.
* Use the new option in various existing regtests to test the various
possible usages.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14714
the man page to contain:
></programlisting>
<para>And here are the same errors with
<option>--read-var-info=yes</option>:</para>
<programlisting><![CDATA[
Rather than the nicely formatted 'And here ...' paragraph.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14561
Activating this hint using --sim-hints=no-nptl-pthread-stackcache
means the glibc nptl stack cache will be disabled.
Disabling this stack/tls cache avoids helgrind false positive race
condition errors when using __thread variables.
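For reference, the kind of code affected is plain use of __thread variables
from several threads (a minimal standalone sketch, unrelated to the actual
tls_threads test); without the hint, helgrind can report false races when
glibc recycles such thread-local storage for later threads:
  #include <pthread.h>

  static __thread int tls_counter;   /* each thread gets its own copy */

  static void *worker(void *arg)
  {
     (void)arg;
     for (int i = 0; i < 1000; i++)
        tls_counter++;               /* no real race: storage is per-thread */
     return NULL;
  }

  int main(void)
  {
     for (int i = 0; i < 2; i++) {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);  /* later threads may reuse  */
        pthread_join(t, NULL);                   /* cached stack/TLS storage */
     }
     return 0;
  }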
Note: disabling the stack cache is done by a kludge, dependent on
internal knowledge of glibc code, and using libpthread debug info.
So, this kludge might be broken with newer glibc versions.
This has been tested on various platforms and various glibc versions
(2.11, 2.16 and 2.18).
To check if the disabling works, you can do:
valgrind --tool=helgrind --sim-hints=no-nptl-pthread-stackcache -d -v ./helgrind/tests/tls_threads |& grep kludge
If you see the two lines below, then hopefully the stack cache has been disabled.
--12624-- deactivate nptl pthread stackcache via kludge: found symbol stack_cache_actsize at addr 0x3AF178
--12624:1:sched pthread stack cache size disabling done via kludge
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14313
* Mention --read-inline-info=yes as an alternative to compiling without inlining.
* Mention that the stabs debuginfo reader has not been working since 3.9.0.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14160
showing inlined function calls.
See bug 278972 ("valgrind stacktraces and suppression do not handle inlined function call debuginfo").
Reading the inlined dwarf call info is activated using the new command line
option --read-inline-info=yes.
The default is currently no, but an objective is to optimise the performance
and memory usage in order to possibly enable it by default
(see the discussion about performance below).
Basically, the patch provides the following pieces:
1. Implement a new dwarf3 reader that reads the inlined call info
2. Some performance improvements done for this new parser, and
on some common code between the new parser and the var info parser.
3. Use the parsed inlined info to produce stack traces showing inlined calls
4. Use the parsed inlined info in the suppression matching and suppression generation
5. and of course, some reg tests
1. new dwarf3 reader:
---------------------
Two options were possible: add the reading of the inlined info
in the current var info dwarf reader, or add a 2nd reader.
The 2nd approach was preferred, for the following reasons:
The var info reader is slow, memory hungry and quite complex.
Having a separate parsing phase for the inlined information
is simpler/faster when just reading the inlined info.
Possibly, a single parser would be faster when using both
--read-var-info=yes and --read-inline-info=yes.
However, var-info being extremely memory/cpu hungry, it is unlikely
to be used often, and having a separate parsing phase for inlined info
does not make much difference in any case.
(--read-var-info=yes is also now less interesting thanks to commit
r13991, which provides a fast and low-memory "reasonable" location
for an address.)
The inlined info parser reads the dwarf info to make calls
to priv_storage.h ML_(addInlInfo).
2. performance optimisations
----------------------------
* The abbrev cache has been improved in revision r14035.
* The new parser skips the uninteresting DIEs
(the var-info parser has no logic to skip uninteresting DIEs).
* Some other minor perf optimisations here and there.
In total now, on a big executable, 15 CPU seconds are needed to
create the inlined info (on my slow x86 Pentium).
With regard to memory, the dinfo arena:
with inlined info: 172281856/121085952 max/curr mmap'd
without : 157892608/106721280 max/curr mmap'd
So, basically, inlined information costs about 15Mb of memory for
my big executable (compared to the first version of the patch, this
already uses less memory, thanks to the strpool deduppoolalloc).
The needed memory can probably be decreased somewhat more.
3. produce better stack traces
------------------------------
VG_(describe_IP) has a new argument InlIPCursor *iipc, which allows
inlined function calls to be described by making repeated calls to
describe_IP. See pub_tool_debuginfo.h for a description.
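The intended usage pattern on the tool side is roughly the following
(a sketch only; the exact prototypes, in particular whether VG_(describe_IP)
takes a caller-supplied buffer, are assumptions to be checked against
pub_tool_debuginfo.h):
  #include "pub_tool_basics.h"
  #include "pub_tool_debuginfo.h"
  #include "pub_tool_libcprint.h"

  /* Print one line per (possibly inlined) call described for 'ip'. */
  static void print_ip_with_inlined_calls ( Addr ip )
  {
     HChar buf[256];
     InlIPCursor *iipc = VG_(new_IIPC)(ip);      /* assumed constructor */
     do {
        VG_(describe_IP)(ip, buf, sizeof buf, iipc);
        VG_(printf)("%s\n", buf);
     } while (VG_(next_IIPC)(iipc));             /* more inlined calls? */
     VG_(delete_IIPC)(iipc);
  }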
4. suppression generation and matching
--------------------------------------
* suppression generation now also uses an InlIPCursor *iipc
to generate a line for each inlined fn call.
* suppression matching: to allow suppression matching to
match one IP to several function calls in a suppression entry,
the 'inputCompleter' object (which allows function or object names
for a stacktrace to be generated lazily when matching an error with
a suppression) has been generalised a little more, to also generate
the input sequence lazily.
VG_(generic_match) has been updated so as to be more generic
with respect to the input completer: when an input completer is
provided, VG_(generic_match) no longer needs to produce/compute any
input itself; this is all delegated to the input completer.
5. various regtests
-------------------
to test stack traces with inlined calls, and suppressions
of (some of) these errors using inlined fn call matching.
Work still to do:
-----------------
* improve parsing performance
* reduce the memory overhead
* handling the directory names for the files of inlined function calls
is not yet done (this probably implies refactoring some code)
* see if the m_errormgr.c *offsets arrays cannot be managed via an xarray
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14036
The current manual and man page tell:
--tool=<toolname> [default: memcheck]
Run the Valgrind tool called toolname, e.g. Memcheck, Cachegrind, etc.
where the toolname examples do not list all the tools, and use an uppercase
first letter, which valgrind does not understand.
So, use lower case, and give the list of all known tools.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13920