This change slightly increases the performance. It also moderately improves the number of cases where helgrind can provide the stack trace of the old access (when using the same amount of memory for the OldRef entries). The patch also provides a new helgrind monitor command to show the recorded accesses for an address+len, and adds an optional lock_address argument to the monitor command 'info locks', to show the info about just that lock.

Currently, oldrefs are maintained in a sparse WA that points to N entries, as specified by --conflict-cache-size=N. Each entry (associated to an address) holds the last 5 accesses. Old entries are recycled in exact LRU order. But inside an entry, we could have one recent access and 4 very old accesses that are kept alive by a single thread repetitively accessing the address it shares with the 4 other old entries.

The patch replaces the sparse WA that maintains the OldRef entries with a hash table, and each OldRef now maintains one single access for an address. As an OldRef holds only one access, all entries are now strictly in LRU order.

Memory used for OldRef
----------------------
In the trunk, an OldRef has a size of 72 bytes (on 32-bit archs), maintaining up to 5 accesses to the same address. On 64-bit archs, an OldRef is 104 bytes. With the patch, an OldRef has a size of 32 bytes (on 32-bit archs) or 56 bytes (on 64-bit archs).

So, for one single access, the new code needs 32 bytes (on 32 bits) while the trunk needs only 14.4 bytes (72 bytes / 5 accesses). However, that is the most favourable case for the trunk, as it assumes all 5 entries in the accs array are used. Looking at 2 big apps (one of them being firefox), very few OldRef entries have all 5 slots occupied. On a firefox startup, of the 5x1,000,000 access slots, only 1,406,939 are used. So, on average, the trunk really uses around 52 bytes per access (72 bytes x 1,000,000 entries / 1,406,939 accesses).

The default value for --conflict-cache-size has been doubled to 2000000. This ensures that the memory used for the OldRef entries stays more or less the same as in the trunk (104MB for OldRef entries).

Memory used for sparseWA versus hash table
------------------------------------------
Looking at the same 2 big apps, there are big variations in the size of the WA: it can go in a few seconds from 10MB to 250MB, or decrease back to 10MB. This all depends on where the last N accesses were done: if well localised, the WA will be small. If the last N accesses were distributed over a big address space, the WA will be big: the last level of the WA (the biggest memory consumer) uses slightly more than 1KB (2KB on 64 bits) for each 256-byte memory zone containing an oldref. So, in the worst case, on 32 bits, we need more than 1_000_000_000 bytes of sparseWA memory to keep 1_000_000 OldRef entries.

The hash table has between 1 and 2 words of overhead per OldRef (as the chain array is roughly doubled each time the hash table is full). So, unless the OldRef entries are extremely localised, the overhead of the hash table will be significantly smaller.

With the patch, the core arena total alloc is 5299535/1201448632 totalloc-blocks/bytes; the trunk is 6693111/3959050280 totalloc-blocks/bytes (so, around 1.20GB versus 3.95GB). This big difference is due to the fact that the sparseWA repetitively allocates and then frees a Level0 or LevelN once all the OldRef entries in the region covered by that Level0/N have been recycled.

In terms of CPU
---------------
With the patch, on amd64, a firefox startup seems slightly faster (around 1%). The peak memory mmaped/used decreases by 200MB. For a libreoffice test, the memory decreases by 230MB. CPU also decreases slightly (1%).

In terms of correctness
-----------------------
The trunk could potentially show something other than the most recent access to the memory involved in a race: the first OldRef entry matching the raced-upon address was used, while a more recent access could exist in a following OldRef entry. In other words, the trunk only guaranteed to find the most recent access within an OldRef, but not among the several OldRef entries that could cover the raced-upon address. So, assuming it is important to show the most recent access, this patch ensures we really show the most recent access, even in the presence of overlapping accesses.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15289
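The core of the change is easy to picture. Below is a minimal standalone sketch of the scheme the message describes: one access per OldRef, a hash table keyed on the guest address, and exact LRU recycling once the --conflict-cache-size limit is reached. This is not the helgrind code; every name in it is hypothetical, and where the real patch grows the table (the chain array is roughly doubled when full), the sketch fixes the bucket count for brevity.

#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct OldRef {
   uintptr_t      ga;        /* guest address of the recorded access */
   struct OldRef* ht_next;   /* hash-chain link                      */
   struct OldRef* lru_prev;  /* exact-LRU doubly-linked list         */
   struct OldRef* lru_next;
   /* ... plus the single access: thread, size, R/W, stack trace ... */
} OldRef;

#define N_BUCKETS (1u << 20)
static OldRef* buckets[N_BUCKETS];
static OldRef* lru_head;              /* most recently used    */
static OldRef* lru_tail;              /* least recently used   */
static size_t  n_live;
static size_t  capacity = 2000000;    /* --conflict-cache-size */

static size_t hash_ga ( uintptr_t ga ) { return (ga >> 2) & (N_BUCKETS - 1); }

static void lru_unlink ( OldRef* o )
{
   if (o->lru_prev) o->lru_prev->lru_next = o->lru_next;
   else             lru_head = o->lru_next;
   if (o->lru_next) o->lru_next->lru_prev = o->lru_prev;
   else             lru_tail = o->lru_prev;
}

static void lru_push_front ( OldRef* o )
{
   o->lru_prev = NULL;
   o->lru_next = lru_head;
   if (lru_head) lru_head->lru_prev = o;
   else          lru_tail = o;
   lru_head = o;
}

/* Record (or refresh) the single access stored for address 'ga'. */
static void oldref_record ( uintptr_t ga )
{
   size_t  h = hash_ga(ga);
   OldRef* o = buckets[h];
   while (o && o->ga != ga) o = o->ht_next;
   if (o) {
      lru_unlink(o);               /* hit: just refresh its LRU position */
   } else {
      if (n_live == capacity) {
         /* Recycle the strictly least-recently-used entry. */
         OldRef*  victim = lru_tail;
         OldRef** p      = &buckets[hash_ga(victim->ga)];
         while (*p != victim) p = &(*p)->ht_next;
         *p = victim->ht_next;     /* unhook from its old hash chain */
         lru_unlink(victim);
         o = victim;               /* reuse the 32/56-byte node */
      } else {
         o = calloc(1, sizeof(OldRef));   /* sketch: no OOM handling */
         n_live++;
      }
      o->ga      = ga;
      o->ht_next = buckets[h];
      buckets[h] = o;
   }
   lru_push_front(o);
   /* ... also (re)store thread, size, R/W, stack trace here ... */
}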
/*--------------------------------------------------------------------*/
/*--- LibHB: a library for implementing and checking               ---*/
/*--- the happens-before relationship in concurrent programs.      ---*/
/*---                                                      libhb.h ---*/
/*--------------------------------------------------------------------*/

/*
   This file is part of LibHB, a library for implementing and checking
   the happens-before relationship in concurrent programs.

   Copyright (C) 2008-2013 OpenWorks Ltd
      info@open-works.co.uk

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License as
   published by the Free Software Foundation; either version 2 of the
   License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307, USA.

   The GNU General Public License is contained in the file COPYING.
*/

#ifndef __LIBHB_H
#define __LIBHB_H

/* Abstract to user: thread identifiers */
/* typedef struct _Thr Thr; */ /* now in hg_lock_n_thread.h */

/* Abstract to user: synchronisation objects */
/* typedef struct _SO SO; */ /* now in hg_lock_n_thread.h */

/* Initialise library; returns Thr* for root thread.  'shadow_alloc'
   should never return NULL; instead it should simply not return if
   it encounters an out-of-memory condition. */
Thr* libhb_init (
        void        (*get_stacktrace)( Thr*, Addr*, UWord ),
        ExeContext* (*get_EC)( Thr* )
     );
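
/* A hypothetical usage sketch (not part of this interface): how a
   client tool might supply the two callbacks.  'my_unwind_thread' and
   'my_context_for_thread' are stand-ins for whatever mechanism the
   tool uses internally to walk stacks and intern ExeContexts. */
#if 0
static void my_get_stacktrace ( Thr* thr, Addr* frames, UWord nRequest )
{
   /* Fill 'frames' with up to 'nRequest' return addresses for 'thr'. */
   my_unwind_thread( thr, frames, nRequest );
}

static ExeContext* my_get_EC ( Thr* thr )
{
   /* Must never return NULL (see the comment above). */
   return my_context_for_thread( thr );
}

static void my_tool_init ( void )
{
   Thr* root_thr = libhb_init( my_get_stacktrace, my_get_EC );
   /* remember root_thr as the Thr* of the root thread */
}
#endif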

/* Shut down the library, and print stats (in fact that's _all_
   this is for). */
void libhb_shutdown ( Bool show_stats );

/* Thread creation: returns Thr* for new thread */
Thr* libhb_create ( Thr* parent );

/* Thread async exit */
void libhb_async_exit ( Thr* exitter );
void libhb_joinedwith_done ( Thr* exitter );

/* Synchronisation objects (abstract to caller) */

/* Allocate a new one (alloc'd by library) */
SO* libhb_so_alloc ( void );

/* Dealloc one */
void libhb_so_dealloc ( SO* so );

/* Send a message via a sync object.  If strong_send is True, the
   resulting inter-thread dependency seen by a future receiver of this
   message will be a dependency on this thread only.  That is, in a
   strong send, the VC inside the SO is replaced by the clock of the
   sending thread.  For a weak send, the sender's VC is joined into
   that already in the SO, if any.  This subtlety is needed to model
   rwlocks: a strong send corresponds to releasing a rwlock that had
   been w-held (or releasing a standard mutex).  A weak send
   corresponds to releasing a rwlock that had been r-held.

   (rationale): Since in general many threads may hold a rwlock in
   r-mode, a weak send facility is necessary in order that the final
   SO reflects the join of the VCs of all the threads releasing the
   rwlock, rather than merely holding the VC of the most recent thread
   to release it. */
void libhb_so_send ( Thr* thr, SO* so, Bool strong_send );
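
/* A toy vector-clock sketch of the strong/weak distinction described
   above (illustration only; this is not the representation libhb
   uses).  'N_THR' and 'VC' are invented for the example. */
#if 0
#define N_THR 4
typedef struct { unsigned long ts[N_THR]; } VC;

static void so_send_strong ( VC* so_vc, const VC* thr_vc )
{
   *so_vc = *thr_vc;            /* SO's VC becomes the sender's clock */
}

static void so_send_weak ( VC* so_vc, const VC* thr_vc )
{
   /* sender's VC is joined (pointwise max) into the SO's VC */
   for (int i = 0; i < N_THR; i++)
      if (thr_vc->ts[i] > so_vc->ts[i])
         so_vc->ts[i] = thr_vc->ts[i];
}
#endif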

/* Recv a message from a sync object.  If strong_recv is True, the
   resulting inter-thread dependency is considered adequate to induce
   a h-b ordering on both reads and writes.  If it is False, the
   implied h-b ordering exists only for reads, not writes.  This
   subtlety is required in order to support reader-writer locks: a
   thread doing a write-acquire of a rwlock (or acquiring a normal
   mutex) models this by doing a strong receive.  A thread doing a
   read-acquire of a rwlock models this by doing a !strong_recv. */
void libhb_so_recv ( Thr* thr, SO* so, Bool strong_recv );
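
/* Putting send and recv together: how a tool might map reader-writer
   lock events onto these two calls, per the comments above.  'so' is
   the SO the tool has associated with the lock (illustration only). */
#if 0
static void model_rwlock_unlock_wheld ( Thr* thr, SO* so )
{  /* releasing a w-held rwlock (or a plain mutex): strong send */
   libhb_so_send( thr, so, True/*strong_send*/ );
}
static void model_rwlock_unlock_rheld ( Thr* thr, SO* so )
{  /* releasing an r-held rwlock: weak send, so all readers' VCs join */
   libhb_so_send( thr, so, False/*!strong_send*/ );
}
static void model_rwlock_wlock ( Thr* thr, SO* so )
{  /* write-acquire (or mutex acquire): strong receive */
   libhb_so_recv( thr, so, True/*strong_recv*/ );
}
static void model_rwlock_rlock ( Thr* thr, SO* so )
{  /* read-acquire: !strong_recv, h-b ordering for reads only */
   libhb_so_recv( thr, so, False/*!strong_recv*/ );
}
#endif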

/* Has this SO ever been sent on? */
Bool libhb_so_everSent ( SO* so );

/* Memory accesses (1/2/4/8 byte size).  They report a race if one is
   found. */
#define LIBHB_CWRITE_1(_thr,_a)    zsm_sapply08_f__msmcwrite((_thr),(_a))
#define LIBHB_CWRITE_2(_thr,_a)    zsm_sapply16_f__msmcwrite((_thr),(_a))
#define LIBHB_CWRITE_4(_thr,_a)    zsm_sapply32_f__msmcwrite((_thr),(_a))
#define LIBHB_CWRITE_8(_thr,_a)    zsm_sapply64_f__msmcwrite((_thr),(_a))
#define LIBHB_CWRITE_N(_thr,_a,_n) zsm_sapplyNN_f__msmcwrite((_thr),(_a),(_n))

#define LIBHB_CREAD_1(_thr,_a)    zsm_sapply08_f__msmcread((_thr),(_a))
#define LIBHB_CREAD_2(_thr,_a)    zsm_sapply16_f__msmcread((_thr),(_a))
#define LIBHB_CREAD_4(_thr,_a)    zsm_sapply32_f__msmcread((_thr),(_a))
#define LIBHB_CREAD_8(_thr,_a)    zsm_sapply64_f__msmcread((_thr),(_a))
#define LIBHB_CREAD_N(_thr,_a,_n) zsm_sapplyNN_f__msmcread((_thr),(_a),(_n))
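
/* A hypothetical instrumentation hook (illustration only) showing how
   a tool might drive these macros from its memory-event callbacks. */
#if 0
static void on_guest_read ( Thr* thr, Addr a, SizeT szB )
{
   switch (szB) {
      case 1:  LIBHB_CREAD_1(thr, a);       break;
      case 2:  LIBHB_CREAD_2(thr, a);       break;
      case 4:  LIBHB_CREAD_4(thr, a);       break;
      case 8:  LIBHB_CREAD_8(thr, a);       break;
      default: LIBHB_CREAD_N(thr, a, szB);  break;
   }
}
#endif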

void zsm_sapply08_f__msmcwrite ( Thr* thr, Addr a );
void zsm_sapply16_f__msmcwrite ( Thr* thr, Addr a );
void zsm_sapply32_f__msmcwrite ( Thr* thr, Addr a );
void zsm_sapply64_f__msmcwrite ( Thr* thr, Addr a );
void zsm_sapplyNN_f__msmcwrite ( Thr* thr, Addr a, SizeT len );

void zsm_sapply08_f__msmcread ( Thr* thr, Addr a );
void zsm_sapply16_f__msmcread ( Thr* thr, Addr a );
void zsm_sapply32_f__msmcread ( Thr* thr, Addr a );
void zsm_sapply64_f__msmcread ( Thr* thr, Addr a );
void zsm_sapplyNN_f__msmcread ( Thr* thr, Addr a, SizeT len );

void libhb_Thr_resumes ( Thr* thr );

/* Set memory address ranges to new (freshly allocated), or noaccess
   (no longer accessible).  NB: "AHAE" == "Actually Has An Effect" :-) */
void libhb_srange_new     ( Thr*, Addr, SizeT );
void libhb_srange_untrack ( Thr*, Addr, SizeT );
void libhb_srange_noaccess_NoFX ( Thr*, Addr, SizeT ); /* IS IGNORED */
void libhb_srange_noaccess_AHAE ( Thr*, Addr, SizeT ); /* IS NOT IGNORED */

/* Counts the number of addressable bytes in the range [a, a+len[
   (a+len excluded) and returns that count.  If abits != NULL, abits
   must point to a block of memory of length len.  In this array, each
   addressable byte is indicated with 0xff, and each non-addressable
   byte with 0x00. */
UWord libhb_srange_get_abits (Addr a, /*OUT*/UChar *abits, SizeT len);
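
/* A hypothetical use of libhb_srange_get_abits (illustration only):
   check whether every byte of a buffer is addressable.  The cost-centre
   string passed to VG_(malloc) is invented for the example. */
#if 0
static Bool range_fully_addressable ( Addr a, SizeT len )
{
   UChar* abits = VG_(malloc)( "sketch.abits", len );
   UWord  n_ok  = libhb_srange_get_abits( a, abits, len );
   VG_(free)( abits );
   return n_ok == len;   /* 0xff was written for each addressable byte */
}
#endif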

/* Get and set the hgthread (pointer to corresponding Thread
   structure). */
Thread* libhb_get_Thr_hgthread ( Thr* );
void libhb_set_Thr_hgthread ( Thr*, Thread* );

/* Low level copy of shadow state from [src,src+len) to [dst,dst+len).
   Overlapping moves are checked for and asserted against. */
void libhb_copy_shadow_state ( Thr* thr, Addr src, Addr dst, SizeT len );

/* Call this periodically to give libhb the opportunity to
   garbage-collect its internal data structures. */
void libhb_maybe_GC ( void );

/* Extract info from the conflicting-access machinery. */
Bool libhb_event_map_lookup ( /*OUT*/ExeContext** resEC,
                              /*OUT*/Thr**        resThr,
                              /*OUT*/SizeT*       resSzB,
                              /*OUT*/Bool*        resIsW,
                              /*OUT*/WordSetID*   locksHeldW,
                              Thr* thr, Addr a, SizeT szB, Bool isW );
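
/* A hypothetical caller of libhb_event_map_lookup (illustration only):
   when a race on [a, a+szB) is detected, ask the conflicting-access
   machinery for the previous access, if it is still recorded. */
#if 0
static void describe_old_access ( Thr* thr, Addr a, SizeT szB, Bool isW )
{
   ExeContext* where;
   Thr*        other;
   SizeT       oldSzB;
   Bool        oldIsW;
   WordSetID   locks;
   if (libhb_event_map_lookup( &where, &other, &oldSzB, &oldIsW,
                               &locks, thr, a, szB, isW )) {
      /* 'where' is the stack of the conflicting access by 'other'.
         If the entry was recycled, fall back to a generic message. */
      VG_(pp_ExeContext)( where );
   }
}
#endif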

typedef void (*Access_t) ( StackTrace ips, UInt n_ips,
                           Thr*       Thr_a,
                           Addr       ga,
                           SizeT      SzB,
                           Bool       isW,
                           WordSetID  locksHeldW );

/* Call fn for each recorded access that overlaps the range
   [a, a+szB[.  fn is called first for the oldest access. */
void libhb_event_map_access_history ( Addr a, SizeT szB, Access_t fn );
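
/* A hypothetical Access_t callback (illustration only): print each
   recorded access overlapping the queried range, oldest first. */
#if 0
static void show_one_access ( StackTrace ips, UInt n_ips,
                              Thr* Thr_a, Addr ga, SizeT SzB,
                              Bool isW, WordSetID locksHeldW )
{
   VG_(printf)( "%s of size %lu at %p\n",
                isW ? "write" : "read",
                (unsigned long)SzB, (void*)ga );
   VG_(pp_StackTrace)( ips, n_ips );
}
/* ... then: libhb_event_map_access_history( a, szB, show_one_access ); */
#endif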

/* ------ Exported from hg_main.c ------ */
/* Yes, this is a horrible tangle.  Sigh. */

/* Get the univ_lset (universe for locksets) from hg_main.c.  Sigh. */
WordSetU* HG_(get_univ_lsets) ( void );

/* Get the header pointer for the doubly-linked list of locks
   (admin_locks). */
Lock* HG_(get_admin_locks) ( void );

#endif /* __LIBHB_H */

/*--------------------------------------------------------------------*/
/*--- end                                                  libhb.h ---*/
/*--------------------------------------------------------------------*/