C++ function names can contain substrings like "{lambda()#1}". But
callgrind_annotate and cg_annotate interpret the '#'-character as a
comment marker anywhere on each input line, and thus truncate such names
there.
On the other hand, the documentation in docs/cl-format.xml states:
Everywhere, comments on own lines starting with '#' are allowed.
This seems to imply that a comment line must start with '#' in the first
column. Thus skip exactly such lines in the input file and don't handle
'#' as a comment marker anywhere else.
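
The intended rule can be sketched as follows. This is a minimal
illustration in Python, not the code in the annotate scripts; the helper
name strip_comment_lines and the sample input lines are made up for the
example.

```python
def strip_comment_lines(lines):
    """Drop lines whose first column is '#'; leave all other lines untouched."""
    return [line for line in lines if not line.startswith("#")]


# Hypothetical input: one comment line, one function name containing '#'.
sample = [
    "# a comment on its own line",
    "fn=main::{lambda()#1}",
    "16 400",
]
kept = strip_comment_lines(sample)
```

With this rule the '#' inside "{lambda()#1}" survives, because only a '#'
in the first column marks a comment.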
Signed-off-by: Philippe Waroquiers <philippe.waroquiers@skynet.be>
Sync VEX/LICENSE.GPL with top-level COPYING file. We used 3 different
addresses for writing to the FSF to receive a copy of the GPL. Replace
all different variants with a URL <http://www.gnu.org/licenses/>.
The following files might still have some slightly different (L)GPL
copyright notice because they were derived from other programs:
- files under coregrind/m_demangle which come from libiberty:
cplus-dem.c, d-demangle.c, demangle.h, rust-demangle.c,
safe-ctype.c and safe-ctype.h
- coregrind/m_demangle/dyn-string.[hc] derived from GCC.
- coregrind/m_demangle/ansidecl.h derived from glibc.
- VEX files for FMA derived from glibc:
host_generic_maddf.h and host_generic_maddf.c
- files under coregrind/m_debuginfo derived from LZO:
lzoconf.h, lzodefs.h, minilzo-inl.c and minilzo.h
- files under coregrind/m_gdbserver derived from GDB:
gdb/signals.h, inferiors.c, regcache.c, regcache.h,
regdef.h, remote-utils.c, server.c, server.h, signals.c,
target.c, target.h and utils.c
Plus the following test files:
- none/tests/ppc32/testVMX.c derived from testVMX.
- ppc tests derived from QEMU: jm-insns.c, ppc64_helpers.h
and test_isa_3_0.c
- tests derived from bzip2 (with embedded GPL text in code):
hackedbz2.c, origin5-bz2.c, varinfo6.c
- tests derived from glibc: str_tester.c, pth_atfork1.c
- test derived from GCC libgomp: tc17_sembar.c
- performance tests derived from bzip2 or tinycc (with embedded GPL
text in code): bz2.c, test_input_for_tinycc.c and tinycc.c
Because it's very useful. As part of this, the "percentage of events
annotated" numbers at the bottom of the output are changed to "events
annotated" so that --show-percs doesn't compute a percentage of a
percentage.
Example output lines:
```
4,967,137,442 (100.0%) PROGRAM TOTALS
4,543 (25.23%) 17,566 ( 0.43%) 47,993 ( 0.92%) /build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c
1 ( 0.01%) 2,000,001 (49.29%) 3,000,004 (57.36%) for (int i = 0; i < 1000000; i++) {
```
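
A rough sketch of how a count and its percentage could be rendered in
this style. The function name with_perc and the width rule (two decimals,
falling back to one when the number would exceed five characters) are
assumptions inferred from the example lines above, not taken from the
script.

```python
def with_perc(count, total):
    """Format count with thousands separators and its share of total."""
    if total == 0:
        # No meaningful percentage without a total (assumed behaviour).
        return f"{count:,}"
    perc = count * 100 / total
    text = f"{perc:.2f}"
    if len(text) > 5:
        # e.g. "100.00" does not fit in five columns; use one decimal.
        text = f"{perc:.1f}"
    return f"{count:,} ({text:>5}%)"


whole = with_perc(4967137442, 4967137442)
small = with_perc(1, 10000)
```

Percentages are always taken against the program total, which is why the
summary at the bottom reports plain event counts rather than a
percentage that would itself get a percentage.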
The commit also adds some much-needed tests for cg_annotate and
callgrind_annotate.
precisely the name of the profile data file it should use (instead of
assuming cachegrind.out.<pid> where <pid> is specified by the --<pid>
flag). The old mechanism is still supported though.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@6573
Cachegrind docs.
Removed the Cachegrind tech docs, because they're so out of date as to be
useless. My PhD dissertation gives a much better description of how
Cachegrind works. (I mentioned this in the Cachegrind user manual.) The
only still-useful part of Cachegrind's tech docs, the output file format
description, I moved into the Cachegrind user manual.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@6332
was set to zero and so no annotation was done.
Also put the file format into this file, and some other tiny changes.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5396
changes from r4341 through r4787 inclusive). That branch is now dead.
Please do not commit anything else to it.
For the most part the merge was not troublesome. The main areas of
uncertainty are:
- build system: I had to import by hand Makefile.core-AM_CPPFLAGS.am
and include it in a couple of places. Building etc seems to still
work, but I haven't tried building the documentation.
- syscall wrappers: Following analysis by Greg & Nick, a whole lot of
stuff was moved from -generic to -linux after the branch was created.
I think that is satisfactorily glued back together now.
- Regtests: although this appears to work, no .out files appear, which
is strange, and makes it hard to diagnose regtest failures. In
particular memcheck/tests/x86/scalar.stderr.exp remains in a
conflicted state.
- amd64 is broken (slightly), and ppc32 will be unbuildable. I'll
attend to the former shortly.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@4789
scheme, there are two main structures:
1. The CC table holds a cost centre (CC) for every distinct source code
line, as found using debug/symbol info. It's arranged by files, then
functions, then lines.
2. The instr-info-table holds certain important pieces of info about
each instruction -- instr_addr, instr_size, data_size, its line-CC.
A pointer to the instr's info is passed to the simulation functions,
which is shorter and quicker than passing the pieces individually.
This is nice and simple. Previously, there was a single data structure
(the BBCC table) which mingled the two purposes (maintaining CCs and
caching instruction info). The CC stuff was done at the level of
instructions, and there were different CC types for different kinds of
instructions, and it was pretty yucky. The two simple data structures
together are much less complex than the original single data structure.
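
To make the split concrete, here is a small model of the two structures
in Python. It is only a sketch: the real code is C, and the event fields
(accesses, misses) and helper names are hypothetical. Only instr_addr,
instr_size, data_size and the line-CC pointer come from the description
above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LineCC:
    """Cost centre for one source line (event fields are hypothetical)."""
    accesses: int = 0
    misses: int = 0


@dataclass
class InstrInfo:
    """Per-instruction info, carrying a reference to its line's CC."""
    instr_addr: int
    instr_size: int
    data_size: int
    line_cc: LineCC


# CC table: file -> function -> line number -> LineCC.
cc_table = defaultdict(lambda: defaultdict(lambda: defaultdict(LineCC)))


def make_instr_info(addr, size, data_size, file, fn, line):
    """Look up (or create) the line's CC and attach it to the instruction."""
    return InstrInfo(addr, size, data_size, cc_table[file][fn][line])


def simulate_access(info, missed):
    """Stand-in for a simulation function: it only needs the info record."""
    info.line_cc.accesses += 1
    if missed:
        info.line_cc.misses += 1


# Two instructions from the same source line share one CC.
a = make_instr_info(0x1000, 3, 4, "foo.c", "main", 12)
b = make_instr_info(0x1003, 2, 0, "foo.c", "main", 12)
simulate_access(a, missed=True)
simulate_access(b, missed=False)
cc = cc_table["foo.c"]["main"][12]
# Dropping the instr-info records leaves the CC and its counts in place.
del a, b
```

Because the counts live in the CC table and not in the per-instruction
records, throwing away instruction info does not lose any profile data.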
As a result, we have the following general improvements:
- Previously, when code was unloaded all its hit/miss counts were stuck
in a single "discard" CC, and so that code would not be annotated. Now
this code is profiled and annotatable just like all other code.
- Source code size is 27% smaller. cg_main.c is now 1472 lines, down
from 2174. Some (1/3?) of this is from removing the special handling
of JIFZ and general compaction, but most is from the data structure
changes. Happily, a lot of the removed code was nasty.
- Object code size (vgskin_cachegrind.so) is 15% smaller.
- cachegrind.out.pid size is about 90+% smaller(!) Annotation time is
accordingly *much* faster. Doing cost-centres at the level of source
code lines rather than instructions makes a big difference, since
there are typically 2--3 instructions per source line. Even better,
when debug info is not present, entire functions (and even files) get
collapsed into a single "???" CC. (This behaviour is no different
to what happened before, it's just the collapsing used to occur in the
annotation script, rather than within Cachegrind.) This is a huge win
for stripped libraries.
- Memory consumption is about 10--20% less, due to fewer CCs.
- Speed is not much changed -- the changes were not in the intensive
parts, so the only likely change is a cache improvement due to using
less memory. SPEC experiments go -3 -- 10% faster, with the "average"
being unchanged or perhaps a tiny bit faster.
I've tested it reasonably thoroughly; it seems to give extremely similar
results to the old version, which is highly encouraging. (The results aren't
quite the same, because they are so sensitive to memory layout; even
tiny changes to Cachegrind affect the results slightly.)
Some particularly nice changes that happened:
- No longer need an instrumentation prepass; this is because CCs are not
stored grouped by BB, and they're all the same size now. (This makes
various bits of code much simpler than before).
- The actions to take when a BB translation is discarded (due to the
translation table getting full) are much easier -- just chuck all the
instr-info nodes for the BB, without touching the CCs.
- Dumping the cachegrind.out.pid file at the end is much simpler, just
because the CC data structure is much neater.
Some other, specific changes:
- Removed the JIFZ special handling, which never did what it was
intended to do and just complicated things. This changes the results
for REP-prefixed instructions very slightly, but it's not important.
- Abbreviated the FP/MMX/SSE crap by being slightly laxer with size
checking -- not an issue, since this checking was just a pale
imitation of the stricter checking done in codegen anyway.
- Removed "fi" and "fe" handling from cg_annotate, no longer needed due
to neatening of the CC-table.
- Factorised out some code a bit, so fewer monolithic slabs,
particularly in SK_(instrument)().
- Just improved formatting and compacted code in general in various
places.
- Removed the long-commented-out sanity checking code at the bottom.
Phew.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2458
the places that normal users will see:
- command line: --tool=foo (although --skin=foo still works)
- docs: removed all traces (including renaming coregrind_skins.html to
coregrind_tools.html)
- in the usage messages
- in error messages
Also did it in some places that I judged were unlikely to cause clashes with
existing workspaces:
- in the header comments of many files (eg. "This file is part of Memcheck, a
Valgrind tool for...")
- in the regtests script
- in the .supp files
- in AUTHORS
- in README_MISSING_SYSCALL_OR_IOCTL
Also update the AUTHORS file to mention Jeremy.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2027
- don't keel over if run on an empty file
- abort if the "summary:" line is missing; previously it gave a warning
and tried to keep going but then other things broke.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@1572
- fixed a bug that was breaking the --threshold option.
vg_cachesim.c:
- fixed a bug that meant instructions that didn't have a line number in the
debug info were being written in cachegrind.out with whatever was the
last known line number. Now using 0.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@403
automatic cache configuration detection using the CPUID instruction.
This can be overridden from the command-line if necessary.
vg_include.h:
- added the cache_t type and UNDEFINED_CACHE macro
- added command line args (of type cache_t) allowing manual override of
I1/D1/L2 configuration
- added log2(), which is generally useful
vg_main.c, valgrind.in, cachegrind.in:
- added handling of the new --{I1,D1,L2}=<size>,<assoc>,<line_size>
options
vg_cachesim.c:
- lots of stuff for auto-detecting cache configuration with CPUID.
Only handles Intel and AMD chips at the moment, and possibly not all of
them. Falls back onto defaults if anything goes wrong, and the configs
can be manually overridden from the command line anyway.
- now not printing cache summary stats if verbosity == 0. Still writing
cachegrind.out, though.
vg_cachesim_gen.c:
- new file containing stuff shared by the I1/D1/L2 simulations
vg_cachesim_{I1,D1,L2}:
- removed most of it; each now just calls a macro defined in
vg_cachesim_gen.c
vg_cachegen:
- has been cvs removed as it is no longer needed.
Makefile.am:
- added vg_cachesim_gen.c
- removed vg_cachegen
configure.in:
- removed vg_cachegen
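
For illustration, parsing an option of the form
--{I1,D1,L2}=<size>,<assoc>,<line_size> might look like the Python sketch
below. The real parsing is done in C; the names CacheConfig,
parse_cache_opt and log2_exact, the error handling, and the sample values
are all assumptions. In particular, returning -1 for a value that is not
a power of two is a guess at what a log2() helper might do, not a
description of the one added here.

```python
from typing import NamedTuple


class CacheConfig(NamedTuple):
    """One cache level's configuration (units as given on the command line)."""
    size: int
    assoc: int
    line_size: int


def log2_exact(n):
    """Return log2(n) if n is a positive power of two, else -1."""
    if n <= 0 or n & (n - 1):
        return -1
    return n.bit_length() - 1


def parse_cache_opt(arg):
    """Parse '--I1=<size>,<assoc>,<line_size>' into (level, CacheConfig)."""
    name, _, value = arg.partition("=")
    if name not in ("--I1", "--D1", "--L2"):
        raise ValueError(f"unknown cache option: {arg}")
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected <size>,<assoc>,<line_size>: {arg}")
    return name[2:], CacheConfig(*(int(p) for p in parts))


# Sample values chosen for the example only.
level, cfg = parse_cache_opt("--I1=16384,4,32")
```

A manual override like this would simply replace whatever the CPUID
detection (or the fallback defaults) produced for that cache level.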
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@400
do things like "show functions covering 99% of all D2mr events *and* 99% of all
D2mw events" - before you could only choose the threshold for one.
Useful for me, but probably no-one else. Still mentioned it in the docs,
though.
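
One way to read "functions covering 99% of event X and 99% of event Y" is
sketched below in Python. The function name, the data layout, and the
choice to show the union of the per-event selections are assumptions for
illustration; the script may combine thresholds differently.

```python
def functions_to_show(costs, thresholds):
    """costs: {fn: {event: count}}; thresholds: {event: percent to cover}."""
    shown = set()
    for event, percent in thresholds.items():
        total = sum(c.get(event, 0) for c in costs.values())
        # Most expensive functions first for this event.
        ranked = sorted(costs, key=lambda fn: costs[fn].get(event, 0),
                        reverse=True)
        covered = 0
        for fn in ranked:
            if covered * 100 >= percent * total:
                break  # threshold for this event already reached
            shown.add(fn)
            covered += costs[fn].get(event, 0)
    return shown


# Made-up counts: f and g dominate D2mr, h dominates D2mw.
costs = {
    "f": {"D2mr": 90, "D2mw": 1},
    "g": {"D2mr": 9, "D2mw": 0},
    "h": {"D2mr": 1, "D2mw": 99},
}
one_event = functions_to_show(costs, {"D2mr": 99})
two_events = functions_to_show(costs, {"D2mr": 99, "D2mw": 99})
```

With a single threshold on D2mr, h would be hidden even though it
accounts for nearly all D2mw events; a second threshold brings it back.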
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@269
- vg_cachesim.c
- vg_cachesim_{I1,D1,L2}.c
- vg_annotate.in
- vg_cachegen.in
Changes to existing files:
- valgrind/valgrind.in, added option:
--cachesim=no|yes [no]
- Makefile/Makefile.am:
* added vg_cachesim.c to valgrind_so_SOURCES var
* added vg_cachesim_I1.c, vg_cachesim_D1.c, vg_cachesim_L2.c to
noinst_HEADERS var
* added vg_annotate, vg_cachegen to 'bin_SCRIPTS' var, and added empty
targets for them
- vg_main.c:
* added two offsets for cache sim functions (put in positions 17a,17b)
* added option handling (detection of --cachesim=yes, which turns off
--instrument)
* added calls to cachesim initialisation/finalisation functions
- vg_mylibc: added some system call wrappers (for chmod, open_write, etc) for
file writing
- vg_symtab2.c:
* allow it to read symbols if either of --instrument or --cachesim is
used
* made vg_symtab2.c:vg_what_{line,fn}_is_this extern, renaming it as
VG_(what_line_is_this) (and added to vg_include.h)
* completely rewrote the read loop in vg_read_lib_symbols, fixing
several bugs. Much better now, although probably not perfect. It's
also relatively fragile -- I'm using the "die immediately if anything
unexpected happens" approach.
- vg_to_ucode.c:
* in VG_(disBB), patching the x86 instruction size into the extra4b field
of JMP instructions at the end of basic blocks if --cachesim=yes.
Shifted things around to do this; also had to fiddle around with
single-step stuff to get this to work, by not sticking extra JMPs on
the end of the single-instruction block if there was already one
there (to avoid breaking an assertion in vg_cachesim.c). Did a
similar thing to avoid an extra JMP on huge basic blocks that are
split.
- vg_translate.c:
* if --cachesim=yes call the cachesim instrumentation phase
* made some functions extern and renamed:
allocCodeBlock() --> VG_(allocCodeBlock)()
freeCodeBlock() --> VG_(freeCodeBlock)()
copyUInstr() --> VG_(copyUInstr)()
(added to vg_include.h too)
- vg_include.c: declared
* cachesim offsets
* exports of vg_cachesim.c
* added four new profiling events (increasing VGP_M_CCS to 24 -- I kept
the spare ones)
* added comment about UInstr.extra4b field being used for instr size in
JMPs for cache simulation
- docs/manual.html:
* Added --cachesim option to section 2.5.
* Added cache profiling stuff as section 7.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@168