17 Commits

Author SHA1 Message Date
Julian Seward
adc2dafee9 Update copyright dates, to include 2015. No functional change.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15577
2015-08-21 11:32:26 +00:00
Philippe Waroquiers
f4dde903ab This patch significantly decreases the memory needed for OldRef
and slightly increases the performance. It also moderately increases
the number of cases where helgrind can provide the stack trace of the old
access (when using the same amount of memory for the OldRef entries).
The patch also provides a new helgrind monitor command to show
the recorded accesses for an address+len, and adds an optional argument
lock_address to the monitor command 'info locks', to show the info
about just this lock.

Currently, OldRefs are maintained in a sparse WA that points to N
entries, as specified by --conflict-cache-size=N.
Each entry (associated with an address) holds the last 5 accesses.

Old entries are recycled in exact LRU order.
But inside an entry, we could have one recent access and 4 very
old accesses that are kept 'alive' by a single thread repeatedly
accessing the address shared with the 4 other old accesses.


The patch replaces the sparse WA that maintains the OldRefs
with a hash table.
Each OldRef now also maintains only a single access for an address.
As an OldRef now maintains only one access, all the entries are now
recycled in strict LRU order.
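
The structure described above can be sketched as follows. This is a
minimal illustration, not the helgrind code: the names Ref, RefCache,
cache_record and cache_lookup, the thread_id field, and the tiny sizes
N_ENTRIES and N_CHAINS are all assumptions made for the example. Each
entry holds a single access, is found through a hash chain, and is
recycled in strict LRU order once the table is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, kept tiny for illustration. N_ENTRIES plays the
   role of --conflict-cache-size. */
#define N_ENTRIES 4
#define N_CHAINS  8

typedef struct Ref {
    uintptr_t   addr;         /* accessed address (the hash key) */
    int         thread_id;    /* the single access kept for this address */
    struct Ref *hnext;        /* next entry in the same hash chain */
    struct Ref *prev, *next;  /* LRU list links, head = most recent */
} Ref;

/* Must be zero-initialised before use, e.g. RefCache c = {0}; */
typedef struct {
    Ref  pool[N_ENTRIES];
    Ref *chains[N_CHAINS];
    Ref *head, *tail;
    int  n_used;
} RefCache;

static size_t chain_of(uintptr_t addr) { return (size_t)(addr % N_CHAINS); }

static void lru_unlink(RefCache *c, Ref *r)
{
    if (r->prev) r->prev->next = r->next; else c->head = r->next;
    if (r->next) r->next->prev = r->prev; else c->tail = r->prev;
    r->prev = r->next = NULL;
}

static void lru_push_front(RefCache *c, Ref *r)
{
    r->prev = NULL;
    r->next = c->head;
    if (c->head) c->head->prev = r; else c->tail = r;
    c->head = r;
}

static void hash_remove(RefCache *c, Ref *r)
{
    Ref **pp = &c->chains[chain_of(r->addr)];
    while (*pp != r) {
        assert(*pp != NULL);  /* r must be present in its chain */
        pp = &(*pp)->hnext;
    }
    *pp = r->hnext;
}

/* Returns the entry recorded for addr, or NULL if there is none. */
Ref *cache_lookup(RefCache *c, uintptr_t addr)
{
    for (Ref *r = c->chains[chain_of(addr)]; r != NULL; r = r->hnext)
        if (r->addr == addr) return r;
    return NULL;
}

/* Records one access. An existing entry for addr is overwritten;
   otherwise a free slot is used or the least recently used entry
   is recycled. */
void cache_record(RefCache *c, uintptr_t addr, int thread_id)
{
    Ref *r = cache_lookup(c, addr);
    if (r != NULL) {
        lru_unlink(c, r);
    } else {
        if (c->n_used < N_ENTRIES) {
            r = &c->pool[c->n_used++];
        } else {
            r = c->tail;          /* strict LRU: the oldest entry goes */
            lru_unlink(c, r);
            hash_remove(c, r);
        }
        r->addr = addr;
        r->hnext = c->chains[chain_of(addr)];
        c->chains[chain_of(addr)] = r;
    }
    r->thread_id = thread_id;
    lru_push_front(c, r);
}
```

Because every entry carries exactly one access, a repeatedly accessed
address can only keep its own entry alive, never a group of stale ones.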

Memory used for OldRef
-----------------------
For the trunk, an OldRef has a size of 72 bytes (on 32-bit archs),
maintaining up to 5 accesses to the same address.
On 64-bit archs, an OldRef is 104 bytes.

With the patch, an OldRef has a size of 32 bytes (on 32-bit archs)
or 56 bytes (on 64-bit archs).

So, for one single access, the new code needs 32 bytes (on 32 bits),
while the trunk needs only 14.4 bytes (72 bytes / 5 accesses).
However, that comparison is the worst case for the patch: it assumes
that the 5 entries in the accs array are all used.
Looking at 2 big apps (one of them being firefox), we see that
very few OldRef entries have all 5 entries occupied.
On a firefox startup, of the 5x1,000,000 access slots, only
1,406,939 are used.
So, on average, the trunk in reality uses around 52 bytes per access.
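
The per-access figures above follow from the OldRef size times the
number of OldRefs, divided by the number of access slots actually
used. A minimal sketch of that arithmetic (avg_bytes_per_access is a
hypothetical helper, not part of helgrind):

```c
#include <assert.h>

/* Hypothetical helper, for illustration only: average bytes spent per
   recorded access when n_oldrefs entries of oldref_bytes each are
   allocated, but only used_accesses access slots are occupied. */
double avg_bytes_per_access(double oldref_bytes, double n_oldrefs,
                            double used_accesses)
{
    assert(used_accesses > 0.0);
    return oldref_bytes * n_oldrefs / used_accesses;
}
```

With the 32-bit trunk figures, avg_bytes_per_access(72, 1000000, 5000000)
gives 14.4, and avg_bytes_per_access(72, 1000000, 1406939) gives a
little over 51, close to the estimate quoted above.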

The default value for --conflict-cache-size has been doubled to 2000000.
This keeps the memory used for the OldRefs more or less the
same as on the trunk (104MB for OldRef entries).

Memory used for sparseWA versus hashtable
-----------------------------------------
Looking at 2 big apps (one of them being firefox), we see
big variations in the size of the WA: it can go in a few
seconds from 10MB to 250MB, or decrease back to 10MB.
This all depends on where the last N accesses were done: if well
localised, the WA will be small.
If the last N accesses were distributed over a big address space,
then the WA will be big: the last level of WA (the biggest memory consumer)
uses slightly more than 1KB (2KB on 64 bits) for each '256 bytes' memory
zone where there is an OldRef. So, in the worst case, on 32 bits, we
need more than 1_000_000_000 bytes of sparseWA memory to keep
1_000_000 OldRefs.

The hash table has an overhead of between 1 and 2 words per OldRef
(as the chain array is roughly doubled each time the hash table is full).
So, unless the OldRefs are extremely localised, the overhead of the
hash table will be significantly less.
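
The two overheads compared above can be put side by side with a small
calculation. This is only a sketch under the figures quoted in this
message; both function names are hypothetical, and the 1024-byte leaf
is a lower bound for "slightly more than 1KB".

```c
#include <assert.h>
#include <stdint.h>

/* Worst case for the sparse WA: every OldRef sits in its own 256-byte
   zone, so each one pays for a whole last-level node of leaf_bytes. */
uint64_t sparsewa_worst_case_bytes(uint64_t n_oldrefs, uint64_t leaf_bytes)
{
    assert(leaf_bytes > 0);
    return n_oldrefs * leaf_bytes;
}

/* Hash table overhead: words_per_oldref machine words of word_bytes
   each, per OldRef. */
uint64_t hashtable_overhead_bytes(uint64_t n_oldrefs,
                                  uint64_t words_per_oldref,
                                  uint64_t word_bytes)
{
    assert(word_bytes > 0);
    return n_oldrefs * words_per_oldref * word_bytes;
}
```

On 32 bits, 1_000_000 OldRefs with a 1024-byte leaf give 1_024_000_000
bytes in the worst case, while 2 words of 4 bytes per OldRef give
8_000_000 bytes of hash table overhead.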

With the patch, the core arena total alloc is:
  5299535/1201448632 totalloc-blocks/bytes
The trunk is
  6693111/3959050280 totalloc-blocks/bytes
(so, around 1.20GB versus 3.95GB).
This big difference is due to the fact that the sparseWA repeatedly
allocates then frees Level0 or LevelN when the OldRefs in the region
covered by the Level0/N have all been recycled.

In terms of CPU
---------------
With the patch, on amd64, a firefox startup seems slightly faster (around 1%).
The peak memory mmapped/used decreases by 200MB.
For a libreoffice test, the memory decreases by 230MB. CPU also decreases
slightly (1%).


In terms of correctness
-----------------------
The trunk could fail to show the most recent access
to the memory of a race: the first OldRef entry matching the raced-upon
address was used, while a following OldRef entry could hold a more
recent access. In other words, the trunk only guaranteed to find the
most recent access within an OldRef, but not among the several OldRefs
that could cover the raced-upon address.
So, assuming it is important to show the most recent access, this patch
ensures we really show the most recent access, even in the presence of
overlapping accesses.
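
The difference between "first matching entry" and "most recent among
all matching entries" can be illustrated as follows. This is a sketch
only: the Access type, the stamp field and both function names are
assumptions for the example, and a plain array scan stands in for the
real lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t     addr;   /* start address of the recorded access */
    size_t        size;   /* size of the access in bytes */
    unsigned long stamp;  /* larger value = more recent */
} Access;

/* True if the recorded access overlaps the range [addr, addr+size). */
static int overlaps(const Access *a, uintptr_t addr, size_t size)
{
    return a->addr < addr + size && addr < a->addr + a->size;
}

/* Old behaviour in this sketch: stop at the first overlapping entry. */
const Access *first_match(const Access *tab, size_t n,
                          uintptr_t addr, size_t size)
{
    assert(size > 0);
    for (size_t i = 0; i < n; i++)
        if (overlaps(&tab[i], addr, size)) return &tab[i];
    return NULL;
}

/* New behaviour in this sketch: examine every overlapping entry and
   keep the one with the largest stamp. */
const Access *most_recent_match(const Access *tab, size_t n,
                                uintptr_t addr, size_t size)
{
    const Access *best = NULL;
    assert(size > 0);
    for (size_t i = 0; i < n; i++)
        if (overlaps(&tab[i], addr, size)
            && (best == NULL || tab[i].stamp > best->stamp))
            best = &tab[i];
    return best;
}
```

With an old 4-byte access at 0x100 and a newer 1-byte access at 0x102,
a query for 0x102 finds the old access first, while the second
function returns the newer one.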



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15289
2015-05-25 17:24:27 +00:00
Philippe Waroquiers
4328dff7d1 This patch decreases the memory used by the helgrind SecMap,
by implementing garbage collection for the SecMap.

The basic change is that freed memory is now marked as noaccess
(previously, it kept its earlier marking, on the basis that
non-buggy applications do not access freed memory in any case).
Keeping the previous marking avoided the CPU/memory cost of
marking it noaccess.

However, marking freed memory noaccess and garbage collecting the
SecMap reduces the memory used by big apps.
For example, a firefox test needs 220MB less (out of about 2.06GB).
A similar reduction is seen for a libreoffice batch (260MB less out of
1.09GB).
On such applications, the performance with the patch is similar to the trunk.

There is a performance decrease for applications that repeatedly do
a lot of malloc/free: e.g. on some perf tests, an increase
in CPU of up to 15% has been observed.

Several performance optimisations can be done afterwards so as not to
lose too much performance. The decrease of memory is expected to produce
in any case a significant benefit in memory-constrained environments
(e.g. android phones).

So, after discussion with Julian, it was decided to commit as-is
and (re-)gain (part of) performance in follow-up commits.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15207
2015-05-10 22:19:31 +00:00
Julian Seward
dbf9b63605 Update copyright dates (20XY-2012 ==> 20XY-2013)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13658
2013-10-18 14:27:36 +00:00
Julian Seward
4a3633e266 Update copyright dates to include 2012.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12843
2012-08-05 15:46:46 +00:00
Julian Seward
c96096ab24 Update all copyright dates, from 20xy-2010 to 20xy-2011.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12206
2011-10-23 07:32:08 +00:00
Julian Seward
3e9f7836c5 Merge the contents of the HGDEV2 branch into trunk:
* performance and scalability improvements
* show locks held by both threads in a race
* show all 4 locks involved in a lock order violation
* better delimited error messages



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11824
2011-06-24 10:09:41 +00:00
Julian Seward
aca925fd10 When handling client munmaps and mprotects with r=0 & w=0, actually
paint the relevant address range as NoAccess rather than ignoring the
event.  This is important for avoiding VTS leaks in libhb_core.
More details in comments in the code.

Also rename the _noaccess_ painters that do nothing to make it clearer
that they do nothing :-)



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11654
2011-03-17 19:39:55 +00:00
Julian Seward
f701e93f04 Followup to r11619: more tidying up w.r.t. the renaming of
'struct _Thr :: opaque' to 'struct _Thr :: hgthread'.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11623
2011-03-10 21:34:21 +00:00
Julian Seward
f4b08e78ed Minor cleanup (no functional change): rename 'struct _Thr :: opaque'
to 'hgthread', since that's what it is really.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11619
2011-03-10 15:14:37 +00:00
Julian Seward
9b0574dff8 Update copyright dates to 2010.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11121
2010-05-03 21:37:12 +00:00
Julian Seward
6f885d9f81 Rollup fixes for Helgrind:
* tracking of barriers: add support for resizable barriers

* resync TSan-compatible client requests with latest changes

* add direct access to the client requests used in hg_intercepts.c

* add a client request pair to disable and re-enable tracking
  of arbitrary address ranges



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11062
2010-03-03 23:03:40 +00:00
Julian Seward
6334c4c3f9 Various improvements:
* rename many functions to do with shadow memory handling, to
  more clearly differentiate reads and writes directly of the
  shadow state from client reads and writes, each of which
  generate both a read and a write of the client state.  It was
  getting confusing (== hard to verify) in there.

* use idempotency of memory state machine transition rules to
  speed up long sequential sections, speedups in range 0% to 28%

* remove 4-way Pord (EQ, LT, GT, UN) and associated machinery,
  and replace it with something that merely computes LEQ in the
  partial ordering, since that's all that is necessary, and
  this simplifies some fast-case paths.

* add optional approx history mechanism a la DRD (start/end stack
  of conflicting segment), much faster if you don't need exact
  conflicting-access details

* libhb_so_recv: tick the VTS in the receiving thread; don't just
  join with the VC in the SO.  It's probably correct without this
  modification, but that correctness is fragile and depends on
  complex properties of how SOs are used/created.  Much better to
  be completely safe.  (Needs cache-isation).

* get rid of unnecessary shadow memory state "SVal_NOACCESS"
  and simplify associated fast-case paths in msmc{read,write}



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10589
2009-07-24 08:45:08 +00:00
Nicholas Nethercote
2001629c3f Updated copyright years.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@9344
2009-03-10 22:02:09 +00:00
Julian Seward
5da3926707 * In the conflicting-event mechanism, also record the size and
  read-or-writeness of each access, so that these can be displayed in
  error messages.

* Use recorded read-or-writeness info to avoid producing error
  messages that claim two reads race against each other -- this
  is clearly silly.  For each pair of racing accesses now reported, at
  least one of them will (should!) always now be a write, and (as
  previously ensured) they will be from different threads.

* Lookups in the conflicting-access map are expensive, so don't do them
  as soon as a race is detected.  Instead wait until the update_extra
  method is called.
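
The reporting rule stated in the list above (different threads, and at
least one write) amounts to a small predicate. A minimal sketch,
assuming a hypothetical AccessInfo record; the real data structures in
helgrind differ:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one access, for illustration only. */
typedef struct {
    int    thread_id;  /* thread that made the access */
    bool   is_write;   /* read-or-writeness */
    size_t size;       /* size in bytes, shown in the error message */
} AccessInfo;

/* A pair is worth reporting only if it comes from two different
   threads and at least one side is a write. */
bool is_reportable_race(const AccessInfo *a, const AccessInfo *b)
{
    assert(a != NULL && b != NULL);
    return a->thread_id != b->thread_id && (a->is_write || b->is_write);
}
```

Two reads from different threads, or any two accesses from the same
thread, are rejected by this check.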



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@8809
2008-12-07 01:41:46 +00:00
Julian Seward
a500c38d13 A bit of tidying up:
* get rid of 'struct _EC' (a.k.a 'struct EC_') and use ExeContext
  everywhere

* remove stacktrace_to_EC and call
  VG_(make_ExeContext_from_StackTrace) directly

* comment out some unused code



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@8749
2008-11-08 20:36:26 +00:00
Julian Seward
35c28b721f Merge Helgrind from branches/YARD into the trunk. Also includes some
minor changes to make stack unwinding on amd64-linux approximately
twice as fast as it was before.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@8707
2008-10-25 16:22:41 +00:00