8 Commits

Author SHA1 Message Date
Philippe Waroquiers
53df23f0a6 This patch adds a 'de-duplicating memory pool allocator':
include/pub_tool_deduppoolalloc.h
  coregrind/pub_core_deduppoolalloc.h
  coregrind/m_deduppoolalloc.c
and uses it (currently only) for the strings in m_debuginfo/storage.c
The idea is that such ddup pool allocator will also be used for other
highly duplicated information (e.g. the DiCFSI information), where
significant gains can also be achieved.
The dedup pool for strings also decreases significantly the memory
needed by the read inline information (patch still to be committed,
see bug 278972).

When testing with a big executable (tacot_process),
this reduces the size of the dinfo arena from
  trunk:  158941184/109760512  max/curr mmap'd, 156775944/107882728 max/curr,
to
  ddup:   157892608/106614784  max/curr mmap'd, 156362160/101414712 max/curr

(so 3Mb less mmap-ed once debug info is read, 1Mb less mmap-ed in peak,
6Mb less allocated once debug info is read).

This is all gained due to the string which changes from:
  trunk:   17,434,704 in       266: di.storage.addStr.1
to
  ddup:    10,966,608 in       750: di.storage.addStr.1
(6.5Mb less memory used by strings)
The gain in mmap-ed memory is smaller due to fragmentation.
Probably one could decrease the fragmentation by using bigger
size for the dedup pool, but then we would lose memory on the last
allocated pool (and for small libraries, we often do not use much
of a big pool block).
Solution might be to increase the pool size but have a "shrink_block"
operation. To be looked at in the future.

In terms of performance, startup of a big executable (on an old pentium)
is not influenced significantly (something like 0.1 seconds on 15 seconds
startup for a big executable, on a slow pentium).

The dedup pool uses a hash table. The hash function used currently
is the VG_(adler32) check sum. It is reported (and visible also here)
that this checksum is not a very good hash function (many collisions).

To have statistics about collisions, use  --stats -v -v -v

As an example of the collisions, on the strings in debug info of memcheck tool on x86,
one obtain:
   --4789-- dedupPA:di.storage.addStr.1 9983 allocs (8174 uniq) 11 pools (4820 bytes free in last pool)
   --4789-- nr occurences of chains of len N, N-plicated keys, N-plicated elts
   --4789-- N: 0 : nr chain   6975, nr keys      0, nr elts      0
   --4789-- N: 1 : nr chain   3670, nr keys   6410, nr elts   8174
   --4789-- N: 2 : nr chain   1070, nr keys    226, nr elts      0
   --4789-- N: 3 : nr chain    304, nr keys    100, nr elts      0
   --4789-- N: 4 : nr chain    104, nr keys     84, nr elts      0
   --4789-- N: 5 : nr chain     72, nr keys     42, nr elts      0
   --4789-- N: 6 : nr chain     44, nr keys     34, nr elts      0
   --4789-- N: 7 : nr chain     18, nr keys     13, nr elts      0
   --4789-- N: 8 : nr chain     17, nr keys      8, nr elts      0
   --4789-- N: 9 : nr chain      4, nr keys      6, nr elts      0
   --4789-- N:10 : nr chain      9, nr keys      4, nr elts      0
   --4789-- N:11 : nr chain      1, nr keys      0, nr elts      0
   --4789-- N:13 : nr chain      1, nr keys      1, nr elts      0
   --4789-- total nr of unique   chains:  12289, keys   6928, elts   8174
which shows that on 8174 different strings, we have only 6410 strings which have
a unique hash value. As other examples, N:13 line shows we have 13 strings
mapping to the same key. N:14 line shows we have 4 groups of 10 strings mapping to the
same key, etc.
So, adler32 is definitely a bad hash function.
Trials have been done with another hash function, giving a much lower
collision rate. So, a better (but still fast) hash function would probably
be beneficial. To be looked at ...




git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14029
2014-06-14 16:30:09 +00:00
Julian Seward
dbf9b63605 Update copyright dates (20XY-2012 ==> 20XY-2013)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13658
2013-10-18 14:27:36 +00:00
Florian Krohm
a6c7a2893c Fix coregrind header files such that they can be included without
having to worry what other header files may have to be included
beforehand.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13549
2013-09-15 13:54:34 +00:00
Florian Krohm
de4c6ffc54 Fix include guard.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13547
2013-09-15 09:56:18 +00:00
Philippe Waroquiers
a44190db93 Fix misleading comment, as VG_(releasePA) is deleting the PA when zero ref count
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13519
2013-08-30 15:48:16 +00:00
Florian Krohm
25b18b0aa1 Char/HChar and constness fixes. Mostly cost center
on allocators which is always a const HChar *


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13089
2012-10-27 23:07:42 +00:00
Bart Van Assche
deb44ff6f0 Pool allocator: change the semantics of VG_(releasePA)() according to comment http://bugs.kde.org/show_bug.cgi?id=282230#c50.
This change also makes the semantics of releasePA match the semantics of
other release functions, e.g. those in XPCOM (see also http://developer.mozilla.org/en/XPCOM_Interface_Reference/nsISupports#Release%28%29).


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12343
2012-01-18 08:12:16 +00:00
Philippe Waroquiers
be97cddd7a Fixes 282230 group allocator for small fixed size, use it for MC_Chunk/SEc vbit
* new files include/pub_tool_groupalloc.h and coregrind/m_groupalloc.c
  implementing a group allocator (based on helgrind group alloc).
* include/Makefile.am coregrind/Makefile.am : added pub_tool_groupalloc.h
  and m_groupalloc.c
* helgrind/libhb_core.c : use pub_tool_groupalloc.h/m_groupalloc.c
  instead  of the local implementation.
* include/pub_tool_oset.h coregrind/m_oset.c : new function
  allowing to create an oset that will use a pool allocator.
  new function allowing to clone an oset (so as to share the pool alloc)
* memcheck/tests/unit_oset.c drd/tests/unit_bitmap.c : modified
  so that it compiles with the new m_oset.c
* memcheck/mc_main.c : use group alloc for MC_Chunk
  memcheck/mc_include.h : declare the MC_Chunk group alloc
* memcheck/mc_main.c : use group alloc for the nodes of the secVBitTable OSet
* include/pub_tool_hashtable.h coregrind/m_hashtable.c : pass the free node
  function in the VG_(HT_destruct).
  (needed as the hashtable user can allocate a node with its own alloc,
  the hash table destroy must be able to free the nodes with the user
  own free).
* coregrind/m_gdbserver/m_gdbserver.c : pass free function to VG_(HT_destruct)
* memcheck/mc_replace_strmem.c memcheck/mc_machine.c
  memcheck/mc_malloc_wrappers.c memcheck/mc_leakcheck.c
  memcheck/mc_errors.c memcheck/mc_translate.c : new include needed
  due to group alloc.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12341
2012-01-17 21:16:30 +00:00