include/pub_tool_deduppoolalloc.h
coregrind/pub_core_deduppoolalloc.h
coregrind/m_deduppoolalloc.c
and uses it (currently only) for the strings in m_debuginfo/storage.c
The idea is that such ddup pool allocator will also be used for other
highly duplicated information (e.g. the DiCFSI information), where
significant gains can also be achieved.
The dedup pool for strings also decreases significantly the memory
needed by the read inline information (patch still to be committed,
see bug 278972).
When testing with a big executable (tacot_process),
this reduces the size of the dinfo arena from
trunk: 158941184/109760512 max/curr mmap'd, 156775944/107882728 max/curr,
to
ddup: 157892608/106614784 max/curr mmap'd, 156362160/101414712 max/curr
(so 3Mb less mmap-ed once debug info is read, 1Mb less mmap-ed in peak,
6Mb less allocated once debug info is read).
This is all gained due to the string which changes from:
trunk: 17,434,704 in 266: di.storage.addStr.1
to
ddup: 10,966,608 in 750: di.storage.addStr.1
(6.5Mb less memory used by strings)
The gain in mmap-ed memory is smaller due to fragmentation.
Probably one could decrease the fragmentation by using bigger
size for the dedup pool, but then we would lose memory on the last
allocated pool (and for small libraries, we often do not use much
of a big pool block).
Solution might be to increase the pool size but have a "shrink_block"
operation. To be looked at in the future.
In terms of performance, startup of a big executable (on an old pentium)
is not influenced significantly (something like 0.1 seconds on 15 seconds
startup for a big executable, on a slow pentium).
The dedup pool uses a hash table. The hash function used currently
is the VG_(adler32) check sum. It is reported (and visible also here)
that this checksum is not a very good hash function (many collisions).
To have statistics about collisions, use --stats -v -v -v
As an example of the collisions, on the strings in debug info of memcheck tool on x86,
one obtain:
--4789-- dedupPA:di.storage.addStr.1 9983 allocs (8174 uniq) 11 pools (4820 bytes free in last pool)
--4789-- nr occurences of chains of len N, N-plicated keys, N-plicated elts
--4789-- N: 0 : nr chain 6975, nr keys 0, nr elts 0
--4789-- N: 1 : nr chain 3670, nr keys 6410, nr elts 8174
--4789-- N: 2 : nr chain 1070, nr keys 226, nr elts 0
--4789-- N: 3 : nr chain 304, nr keys 100, nr elts 0
--4789-- N: 4 : nr chain 104, nr keys 84, nr elts 0
--4789-- N: 5 : nr chain 72, nr keys 42, nr elts 0
--4789-- N: 6 : nr chain 44, nr keys 34, nr elts 0
--4789-- N: 7 : nr chain 18, nr keys 13, nr elts 0
--4789-- N: 8 : nr chain 17, nr keys 8, nr elts 0
--4789-- N: 9 : nr chain 4, nr keys 6, nr elts 0
--4789-- N:10 : nr chain 9, nr keys 4, nr elts 0
--4789-- N:11 : nr chain 1, nr keys 0, nr elts 0
--4789-- N:13 : nr chain 1, nr keys 1, nr elts 0
--4789-- total nr of unique chains: 12289, keys 6928, elts 8174
which shows that on 8174 different strings, we have only 6410 strings which have
a unique hash value. As other examples, N:13 line shows we have 13 strings
mapping to the same key. N:14 line shows we have 4 groups of 10 strings mapping to the
same key, etc.
So, adler32 is definitely a bad hash function.
Trials have been done with another hash function, giving a much lower
collision rate. So, a better (but still fast) hash function would probably
be beneficial. To be looked at ...
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14029
This only works in non-bi arch mode. If ever aarch64+arm
are compiled bi-arch, then some more work is needed to have
a 64 bits vgdb able to ptrace invoke a 32 bits valgrind.
Note also that PTRACE_GETREGSET is defined on other platforms
(e.g. ppc64 fedora 18 defines it), but it is not used on
these platforms, as again, PTRACE_GETREGSET implies some
work for bi-arch to work properly.
So, on all platforms except arm64, we use PTRACE_GETREGS
or PTRACE_PEEKUSER.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13981
of memcheck and helgrind in a common module:
pub_tool_addrinfo.h pub_core_addrinfo.h m_addrinfo.c
At the same time, the factorised code is made usable by other
tools also (and is used by the gdbserver command 'v.info location'
which replaces the helgrind 'describe addr' introduced 1 week ago
and which is now callable by all tools).
The new address description code can describe more addresses
(e.g. for memcheck, if the block is not on the free list anymore,
but is in an arena free list, this will also be described).
Similarly, helgrind address description can now describe more addresses
when --read-var-info=no is given (e.g. global symbols are
described, or addresses on the stack are described as
being on the stack, freed blocks in the arena free list are
described, ...).
See e.g. the change in helgrind/tests/annotate_rwlock.stderr.exp
or locked_vs_unlocked2.stderr.exp
The patch touches many files, but is basically a lot of improvements
in helgrind output files.
The code changes are mostly refactorisation of existing code.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13965
Patch by Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>.
Partial fix. make && make check now works with builddir != srcdir.
But make regtest doesn't yet.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13949
VALGRIND_DISABLE_ADDR_ERROR_REPORTING_IN_RANGE and
VALGRIND_ENABLE_ADDR_ERROR_REPORTING_IN_RANGE
and supporting machinery for managing whole-address-space sparse
mappings. n-i-bz. In support of
https://bugzilla.mozilla.org/show_bug.cgi?id=970643
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13884
vgdb.c contained (conditionally compiled) "invoker" code to have ptrace syscalls
used to allow gdb/vgdb to connect to a valgrind process blocked in a syscall.
This "invoker" code is ptrace based.
Not all platforms are using ptrace.
=> refactor vgdb so as allow invoker code to be added more cleanly
for non ptrace based platforms (e.g. Darwin, Solaris).
* add file vgdb.h for:
- definition of the vgdb-invoker interface
- common declarations between vgdb.c and vgdb-invoker implementations
* move ptrace related code from vgdb.c to new file vgdb-invoker-ptrace.c
* new file vgdb-invoker-none.c containing an empty invoker implementation
used on platforms that do not (yet) have a invoker implementation
(e.g. android and darwin).
* modified Makefile.am to use one of the vgdb-invoker-*.c file depending
on the platform.
* small changes related to changing ptraceinvoker to invoker in various files:
gdbserver_tests/make_local_links, gdbserver_tests/nlcontrolc.vgtest,
gdbserver_tests/mcinvokeRU.vgtest, gdbserver_tests/nlsigvgdb.vgtest
gdbserver_tests/mcinvokeWS.vgtest, coregrind/m_gdbserver/README_DEVELOPERS
Patch from Ivo Raisr, slightly modified
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13743
Necessary changes to Valgrind to support MIPS64LE on Linux.
Minor cleanup/style changes embedded in the patch as well.
The change corresponds to r2687 in VEX.
Patch written by Dejan Jevtic and Petar Jovanovic.
More information about this issue:
https://bugs.kde.org/show_bug.cgi?id=313267
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13292
--profile-flags=00000000 now prints summary statistics, one line per
profiled block, but with no translation details. Previously it had
no effect.
--profile-interval=<number> is a new flag that causes the profile data
to be dumped and zeroed every <number> event checks. This makes it
possible to get profile data without waiting for runs to end, and to
get profile data which depends on the current workload etc. If
--profile-interval=0 or is unset, the profile is printed only once, at
the end of the run, as before.
--profile-flags=XXXXXXXX (for at least one nonzero X) prints the
summary lines both at the start and end of the profile, so you don't
have to scroll back up to the top to see the summary.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13213
It's reorg only. No new cache autodetection stuff has been added.
coregrind
pub_tool_cpuid.h is removed as it is no longer exposed to tools.
Its contents has moved to pub_core_cpuid.h.
New file: coregrind/m_cache.c to contain the autodetect code for
cache configurations and define other cache characteristics that
cannot be autodetected (i.e. icaches_maintain_coherence). Most of
cg-arch/x86-amd64.c was moved here. The cache detection code for
x86-64 needs to be fixed to properly initialise VexCacheInfo. It
currently has cachegrind bias.
m_cache.c exports a single function (to coregrind):
VG_(machine_get_cache_info)(VexArchInfo *vai)
This function is called from VG_(machine_get_hwcaps) after hwcaps have
been detected.
cachegrind
Remove cachegrind/cg-{ppc32,ppc43,arm,mips32,s390x,x86-amd64}.c
With the exception of x86/mamd64 those were only establishing a
default cache configuration and that is so small a code snippet that
a separate file is no longer warranted. So, the code was moved to
cg-arch.c. Code was added to extract the relevant info from
x86-amd64.
New function maybe_tweak_LLc which captures the code to massage the
LLc cache configuration into something the simulator can handle. This
was originally in cg-x86-amd64.c but should be used to all architectures.
Changed warning message about missing cache auto-detect feature
to be more useful. Adapted filter-stderr scripts accordingly.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13028
From: Ian Campbell <Ian.Campbell@citrix.com>
Under Xen the toolstack is responsible for managing the domains in
the system, e.g. creating, destroying, and otherwise manipulating
them.
To do this it uses a number of ioctls on the /proc/xen/privcmd
device. Most of these (the MMAPBATCH ones) simply set things up such
that a subsequenct mmap call will map the desired guest memory. Since
valgrind has no way of knowing what the memory contains we assume
that it is all initialised (to do otherwise would require valgrind to
be observing the complete state of the system and not just the given
process).
The most interesting ioctl is XEN_IOCTL_PRIVCMD_HYPERCALL which
allows the toolstack to make arbitrary hypercalls. Although the
mechanism here is specific to the OS of the guest running the
toolstack the hypercalls themselves are defined solely by the
hypervisor. Therefore I have split support for this ioctl into a part
in syswrap-linux.c which handles the ioctl itself and passes things
onto a new syswrap-xen.c which handles the specifics of the
hypercalls themselves. Porting this to another OS should just be a
matter of wiring up syswrap-$OS.c to decode the ioctl and call into
syswrap-xen.c. In the future we may want to split this into
syswrap-$ARCH-xen.c but for now this is x86 only.
The hypercall coverage here is pretty small but is enough to get
reasonable(-ish) results out of the xl toolstack when listing,
creating and destroying domains.
One issue is that the hypercalls which are exlusively used by the
toolstacks (as opposed to those used by guest operating systems) are
not considered a stable ABI, since the hypervisor and the lowlevel
tools are considered a matched pair. This covers the sysctl and
domctl hypercalls which are a fairly large chunk of the support
here. I'm not sure how to solve this without invoking a massive
amount of duplication. Right now this targets the Xen unstable
interface (which will shortly be released as Xen 4.2), perhaps I can
get away with deferring this problem until the first change .
On the plus side the vast majority of hypercalls are not of interest
to the toolstack (they are used by guests) so we can get away without
implementing them.
Note: a hypercall only reads as many words from the ioctl arg
struct as there are actual arguments to that hypercall and the
toolstack only initialises the arguments which are used. However
there is no space in the DEFN_PRE_TEMPLATE prototype to allow this to
be communicated from syswrap-xen.c back to syswrap-linux.c. Since a
hypercall can have at most 5 arguments I have hackily stolen ARG8 for
this purpose.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12963
This implies to change the interface between the
arch independent gdbserver files and the arch dependent files
as AVX implies a choice of xml files at run time.
In valgrind-low-amd64.c, the xml files and the nr of registers
are different depending on AVX support or not.
Other platforms still have a fully static nr of registers.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12581
AVX support implies to have target xml files which are selected
according to the machine hwcaps.
This change improves the structure of the gdbserver software layering
to prepare for this.
Basically, the protocol files (e.g. server.c) are now calling directly
the valgrind target operations which are now defined in target.h/target.c
(before, there was a level of indirection inheritated from the GDB
structure which was useless for valgrind gdbserver).
+ clarified some comments
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12579
once place. This was breaking vg-in-place on platforms
needing gdbserver target description files.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12401
* new files include/pub_tool_groupalloc.h and coregrind/m_groupalloc.c
implementing a group allocator (based on helgrind group alloc).
* include/Makefile.am coregrind/Makefile.am : added pub_tool_groupalloc.h
and m_groupalloc.c
* helgrind/libhb_core.c : use pub_tool_groupalloc.h/m_groupalloc.c
instead of the local implementation.
* include/pub_tool_oset.h coregrind/m_oset.c : new function
allowing to create an oset that will use a pool allocator.
new function allowing to clone an oset (so as to share the pool alloc)
* memcheck/tests/unit_oset.c drd/tests/unit_bitmap.c : modified
so that it compiles with the new m_oset.c
* memcheck/mc_main.c : use group alloc for MC_Chunk
memcheck/mc_include.h : declare the MC_Chunk group alloc
* memcheck/mc_main.c : use group alloc for the nodes of the secVBitTable OSet
* include/pub_tool_hashtable.h coregrind/m_hashtable.c : pass the free node
function in the VG_(HT_destruct).
(needed as the hashtable user can allocate a node with its own alloc,
the hash table destroy must be able to free the nodes with the user
own free).
* coregrind/m_gdbserver/m_gdbserver.c : pass free function to VG_(HT_destruct)
* memcheck/mc_replace_strmem.c memcheck/mc_machine.c
memcheck/mc_malloc_wrappers.c memcheck/mc_leakcheck.c
memcheck/mc_errors.c memcheck/mc_translate.c : new include needed
due to group alloc.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12341
* initial support for Pandaboard/Linaro
* on Android/ARM, ask for non-executable stacks in the executables
* disable Memcheck's strcasestr intercept; its use of tolower()
causes the dynamic linker to fail.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12234
Android. Making that work will require a bit of extra effort due to
minor glibc-vs-bionic differences.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11885
__builtin_setjmp and __builtin_longjmp so that they can be selectively
replaced, on a platform by platform basis. Does not change any
functionality. Related to #259977.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11687
in certain circumstances. This works around a bug in the linker that
ships in Xcode 4.0.0 and 4.0.1 causing the 64-bit tool executables to
segfault at startup. Fixes#267997.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11686
svn merge -r11143:HEAD svn://svn.valgrind.org/valgrind/branches/MACOSX106
There were some easy-to-resolve conflicts.
Then I had to fix up coregrind/link_tool_exe*.in -- those files had been
added independently on both the trunk and the branch, AFAICT. I just
overwrote the trunk versions with the branch versions.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11194
executables. Gets rid of the linker script kludgery and uniformly
uses -Ttext=0x38000000 (or whatever) on Linux, so as to accomodate
both traditional ld and gold. Should fix#193413 although I have
been unable to test it. Using a whole new program seems like
overkill, but this is infrastructure to support static linking of
the tool executables on MacOS too.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11141
the changes to do with reading and using ELF and DWARF3 info.
This breaks all targets except amd64-linux and x86-linux.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10982
following improvements:
- Arch/OS/platform-specific files are now included/excluded via the
preprocessor, rather than via the build system. This is more consistent
(we use the pre-processor for small arch/OS/platform-specific chunks
within files) and makes the build system much simpler, as the sources for
all programs are the same on all platforms.
- Vast amounts of cut+paste Makefile.am code has been factored out. If a
new platform is implemented, you need to add 11 extra Makefile.am lines.
Previously it was over 100 lines.
- Vex has been autotoolised. Dependency checking now works in Vex (no more
incomplete builds). Parallel builds now also work. --with-vex no longer
works; it's little use and a pain to support. VEX/Makefile is still in
the Vex repository and gets overwritten at configure-time; it should
probably be renamed Makefile-gcc to avoid possible problems, such as
accidentally committing a generated Makefile. There's a bunch of hacky
copying to deal with the fact that autotools don't handle same-named files
in different directories. Julian plans to rename the files to avoid this
problem.
- Various small Makefile.am things have been made more standard automake
style, eg. the use of pkginclude/pkglib prefixes instead of rolling our
own.
- The existing five top-level Makefile.am include files have been
consolidated into three.
- Most Makefile.am files now are structured more clearly, with comment
headers separating sections, declarations relating to the same things next
to each other, better spacing and layout, etc.
- Removed the unused exp-ptrcheck/tests/x86 directory.
- Renamed some XML files.
- Factored out some duplicated dSYM handling code.
- Split auxprogs/ into auxprogs/ and mpi/, which allowed the resulting
Makefile.am files to be much more standard.
- Cleaned up m_coredump by merging a bunch of files that had been
overzealously separated.
The net result is 630 fewer lines of Makefile.am code, or 897 if you exclude
the added Makefile.vex.am, or 997 once the hacky file copying for Vex is
removed. And the build system is much simpler.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10364
- Introduced VG_SYSNUM_STRING and VG_SYSNUM_STRING_EXTRA which factor out
differences in the way syscall numbers are printed on different platforms.
This gets rid of seven "DDD" fixme-style comments.
- This also meant that Darwin syscall numbers are now printed in a
non-ambiguous way -- previously Unix, machine-dependent and diagnostic
syscalls were all printed the same way, even though their numbers overlap.
Now each number is prefixed with "unix", "mdep", etc. And Mach trap
numbers aren't printed as negative numbers now that they have a "mach"
prefix.
- Split each of pub_core_vkiscnums.h and pub_tool_vkiscnums.h into two
parts, one suitable for inclusion in asm files, one suitable for inclusion
in C files; in both cases the latter includes the former. This makes
this module more like other modules that have asm-only components (eg.
m_transtab); it also allows the hacky VG_IN_ASSEMBLY_SOURCE macros and
tests to be removed.
- Removed some of the VG_DARWIN_SYSNO_* macros that were no longer needed,
and renamed some of the existing ones to make their meanings clearer.
- Added comments on the encoding of Darwin syscall numbers so it's
possible for mortals to understand without reading the kernel code..
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10218