686 Commits

Author SHA1 Message Date
Philippe Waroquiers
ed9721f0f4 Addition of helgrind client request VALGRIND_HG_GNAT_DEPENDENT_MASTER_JOIN
See helgrind.h for description


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16158
2016-11-28 18:16:27 +00:00
Petar Jovanovic
4b49c4ee66 make bar_bad tests more deterministic
Canceling the thread slp2 before the case 5 makes behaviour of this test
more deterministic.
Also, as Philippe W. pointed out, adding --fair-sched=try seems to avoid
variable and sometimes very long run time for these tests.

Related BZ #358213


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16154
2016-11-23 17:38:29 +00:00
Philippe Waroquiers
482d87b925 Comments change only: add the profile of the hook called by the gnat runtime
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16151
2016-11-22 21:16:37 +00:00
Philippe Waroquiers
d513fcfe77 xtree: some documentation and --help-debug fine tuning
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16146
2016-11-20 11:41:25 +00:00
Philippe Waroquiers
8c86667f1a Support for xtree memory profiling and xtmemory gdbsrv monitor command in helgrind
* helgrind will produce xtree memory profiling according to the options
  --xtree-memory.
* addition of the xtmemory gdbserver monitor command.

(this is the first real xtree functional difference)



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16127
2016-11-11 14:33:27 +00:00
Mark Wielaard
ee8cfbc434 Add libc_test to workaround pth_cond_destroy_busy test hangs.
This is a workaround for bug #371396. It adds a new test program
that can be used to skip tests given a specific libc implementation
and optionally a specific minimum version. Currently only glibc
is recognized. This is used for the drd and helgrind tests
pth_cond_destroy_busy to be skipped on glibc 2.24.90+.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16097
2016-10-21 00:02:10 +00:00
Petar Jovanovic
ddc3b67f41 mips: replace use of (d)addi with (d)addiu
Replace use of daddi/addi with daddiu/addiu.
This is more R6-friendly and we actually want to use the instructions
that do not cause integer overflow exception.

Patch by Vicente Olivert Riera.

Related issue - BZ#356112.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16018
2016-10-05 14:16:25 +00:00
Philippe Waroquiers
fef93fffb2 Avoid unused variable warning.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15975
2016-09-21 21:06:04 +00:00
Mark Wielaard
1ea3342d69 Add missing file for bug #358213 workaround.
svn commit r15962 missed adding bar_bad.stderr.exp-destroy-hang.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15966
2016-09-19 19:25:33 +00:00
Mark Wielaard
5f37e4dcde Workaround bar_bad testcase hanging with newer glibc in helgrind/drd tests.
This is a workaround for bug #358213 helgrind/drd pthread_barrier tests
hangs with new glibc pthread barrier implementation. This makes sure that
the tests don't hang anymore. It does this by creating new threads that
sleep and kill the other threads after some time. But this introduces
some non-determinism that might cause the tests to occasionally fail
(both against old and new glibc implementations).

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15962
2016-09-19 14:16:35 +00:00
Carl Love
4f2ab6749e Add tc06_two_races_xml.exp output for ppc64
Update xml filter to suppress pthread_create_WRK frame.  Update the filter_xml
filter to suppress the frame containing the pthread_create_WRK function.  This
allows the tc06_two_races_xml test to complete reliably on power.

This change also adds the ability to suppress the printf that generates a
"pthread_create_WRK...pthread_create" entry to replace the suppressed frame.

This is conceptually a follow-up from r13983, which suppresses the
pthread_create_WRK entry from non-xml outputs.

Patch submitted by Will Schmidt  <will_schmidt@vnet.ibm.com>

Bugzilla 368416
  

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15956
2016-09-14 16:43:27 +00:00
Petar Jovanovic
f2e1eb2d48 [mips] update stderr exp file for tc19_shadowmem
A few changes have had impact on expected output of tc19_shadowmem recently.
These are:
- r14175 (added extra "Block was alloc'd by thread #x" output)
- r13983 (removed "pthread_create_WRK (hg_intercepts.c:" output)
- r13965 (a few empty lines removed)

However, expected stderr file for mips32 has not been updated accordingly.
Update it now. This fixes helgrind/tests/tc19_shadowmem failure on mips32.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15832
2016-03-21 14:05:23 +00:00
Philippe Waroquiers
f2e1687cb8 Fix misplaced closing parenthesis in various VG_(....) calls
At many places, we have:
   VG_(fun(a,b,c))
instead of
   VG_(fun)(a,b,c)
So, fix these cases, found using:
grep -n -i -e 'VG_([a-z][a-z0-9_]*[^a-z0-9_)]' *.c */*.c */*/*.c
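The misplaced form still compiles because the macro only pastes a prefix onto the first token of its argument. A minimal sketch of that effect, assuming a simplified one-line stand-in for VG_() (the real macro may reach the same paste through a helper macro) and a hypothetical function add3:

```c
/* Simplified stand-in for the VG_() name-mangling macro (assumption). */
#define VG_(name) vgPlain_##name

/* Hypothetical function, named as the mangled form of "add3". */
static int vgPlain_add3(int a, int b, int c) { return a + b + c; }

/* Misplaced parenthesis: the whole call is the macro argument.
   The paste still yields vgPlain_add3 followed by (1, 2, 3). */
static int call_misplaced(void) { return VG_(add3(1, 2, 3)); }

/* Intended form: only the function name is the macro argument. */
static int call_intended(void) { return VG_(add3)(1, 2, 3); }
```

Both forms expand to the same call, which is why the compiler never flagged these sites and a grep was needed to find them.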



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15776
2016-01-27 22:35:14 +00:00
Ivo Raisr
da7db302c1 Fix expected output of helgrind/tests/tc20_verifywrap on Solaris.
n-i-bz


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15758
2016-01-13 05:37:36 +00:00
Ivo Raisr
702c19e525 Fix typo in Helgrind's wrapper of pthread_spin_destroy().
Patch provided by: Jason Dillaman <dillaman@redhat.com>.
Fixes BZ #357871.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15756
2016-01-12 20:31:15 +00:00
Petar Jovanovic
13e817c2ea mips: update exp files for helgrind/tests/tc20_verifywrap
Some recent changes, starting from r15426, have modified the test and
its expected output. The exp files have been only partially updated for
MIPS. We complete that with this change.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15752
2015-12-23 18:48:18 +00:00
Petar Jovanovic
a5f0b51ef3 mips: update expected output for helgrind/tests/tc18_semabuse
r15620 changed the test and the expected output for tc18_semabuse,
r15630 fixed the expected output file for other architectures but not
for mips.
Now we update it for mips as well.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15751
2015-12-22 16:06:07 +00:00
Ivo Raisr
0d30686d21 When searching for global public symbols (like for the somalloc
synonym symbols), exclude the dynamic (runtime) linker as it is very
special.
Fixes BZ#355454


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15728
2015-11-18 20:38:37 +00:00
Philippe Waroquiers
6aa9e95b75 If --history-level=full was not provided at startup, report an error in
helgrind accesshistory monitor command

As accesshistory will never show anything unless this option is given.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15723
2015-11-08 10:42:06 +00:00
Mark Wielaard
887805da64 Correct expected output of tc18 and tc20 helgrind tests.
The addition of the safe wrapper in r15620 introduced an extra
output frame in the backtrace of helgrind/tests/tc18_semabuse and
helgrind/tests/tc20_verifywrap.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15630
2015-09-05 20:45:04 +00:00
Mark Wielaard
235c116f2d Add safe sem_post handler and glibc-2.21 expected output for helgrind tests.
This fixes the tc18 and tc20 testcases.

On some bad semaphores glibc now might just abort, we catch the SIGABRT
and turn it into a EINVAL. The program will see this, but the helgrind
wrapper won't. Which works for tc18 since there is an alternate exp file
with that result (silent bad sem_post). We add a similar alternative exp
file for tc20.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15620
2015-09-04 09:41:42 +00:00
Mark Wielaard
cba6bd0b31 Add safe-pthread.h to helgrind/tests/Makefile.am noinst_HEADERS.
Otherwise the header file won't show up in the dist tar ball.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15589
2015-08-25 13:07:42 +00:00
Tom Hughes
c5a1912be8 Use sigjmp_buf
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15588
2015-08-24 19:26:56 +00:00
Tom Hughes
781dec8f80 Restore signal masks when recovering from xend related signals
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15587
2015-08-24 19:10:06 +00:00
Julian Seward
adc2dafee9 Update copyright dates, to include 2015. No functional change.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15577
2015-08-21 11:32:26 +00:00
Julian Seward
a87df80edf Remove non-ASCII characters from this file. No functional change.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15576
2015-08-21 11:04:48 +00:00
Ivo Raisr
8372cfdb0f Follow-up fix for r15565.
Expected output of some helgrind tests slightly differed on Solaris.
n-i-bz


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15571
2015-08-20 05:50:49 +00:00
Mark Wielaard
9b322bb026 Also install sigsegv handler in safe-pthread tests wrapper.
In case we do recognize the xend, but detect it is invalid
(used outside a transaction) we generate a sigsegv instead
of a sigill. Handle that in the same way in the test case.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15570
2015-08-19 13:26:28 +00:00
Rhys Kidd
b9389efb87 Follow-up fix for r15565: sa_restorer should not be used. n-i-bz.
It is obsolete and not specified by POSIX. See man sigaction on Linux.
No regressions reported.

The following error may be seen on platforms that don't implement this extension:

depbase=`echo tc12_rwl_trivial.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
    gcc -DHAVE_CONFIG_H -I. -I../..  -I../.. -I../../include -I../../coregrind -I../../include -I../../VEX/pub -I../../VEX/pub -DVGA_amd64=1 -DVGO_darwin=1 -DVGP_amd64_darwin=1 -DVGPV_amd64_darwin_vanilla=1 -DVGA_SEC_x86=1 -DVGP_SEC_amd64_darwin=1  -Winline -Wall -Wshadow -Wno-long-long -g -fno-stack-protector  -Wno-format-extra-args -Wno-literal-range -Wno-tautological-constant-out-of-range-compare -Wno-self-assign -Wno-string-plus-int -Wno-uninitialized -Wno-unused-value  -arch x86_64  -MT tc12_rwl_trivial.o -MD -MP -MF $depbase.Tpo -c -o tc12_rwl_trivial.o tc12_rwl_trivial.c &&\
    mv -f $depbase.Tpo $depbase.Po
In file included from tc12_rwl_trivial.c:8:
./safe-pthread.h:37:7: error: no member named 'sa_restorer' in 'struct sigaction'
   sa.sa_restorer = NULL;
   ~~ ^
1 error generated.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15569
2015-08-19 12:18:31 +00:00
Tom Hughes
7678f91cd1 Install the SIGILL handler everywhere so we get consistent
stacks and don't have to worry about __GLIBC_PREREQ not being
defined on all platforms.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15567
2015-08-19 08:27:06 +00:00
Tom Hughes
b22d60778f Attempt to work around issues with xend being executed unconditionally
when a pthread_rwlock is used in an invalid way.

Recent glibcs use transactional memory instructions to do lock elision
but will sometimes, when locks are used in an invalid way, make calls to
xend on systems which don't support it, on the grounds that the program
is invalid anyway.

So we try and catch and ignore the resulting SIGILL in our tests that
deliberately work with invalid locks.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15565
2015-08-18 10:29:20 +00:00
Philippe Waroquiers
b7380c7e2d Remove duplicate definition of VALGRIND_HG_ENABLE_CHECKING, wrongly introduced in rev 15207
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15513
2015-08-09 14:43:33 +00:00
Philippe Waroquiers
7844f70805 (try to) avoid tc09_bad_unlock random failure
tc09_bad_unlock fails randomly for the following reason:
Thread 1 is creating a lock in a stack variable and locks it.
It then clones a Thread 2 that will unlock this lock.
The test fails if the Thread 2 unlocks the lock while the
main thread is still just after the clone syscall:
There is no unwind info in this area, and so doing a stacktrace
implies a nasty hack (see hg_main.c evh__pre_thread_ll_create).

There is no such hack when describing the address of the lock
(as there is no logic in the 'normal' stack trace to detect we are
in the clone syscall code).
In such a case, the unwind fails, and the lock address description
lacks the frame nr derived from the captured stack trace.

Adding --fair-sched=yes seems to make a more reproducible test.

Note that the proper solution to all these 'racy helgrind regtests'
would be to add some synchronisation operations between threads
that helgrind does not observe (e.g. using a technique similar to
the pipe big lock) and have correct (but invisible to helgrind) synchronisation
between the threads actions needed for a reproducible regtest.

Not very cheap to develop, --fair-sched=try is cheap and easy
so use that till someone courageous implements non visible synchronisation



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15497
2015-08-05 17:43:11 +00:00
Florian Krohm
7bd7811604 The number of elements in a hash table cannot be negative.
Let the return type of VG_(HT_count_nodes) reflect that.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15490
2015-08-05 11:26:10 +00:00
Florian Krohm
ad32052369 Fix printf format inconsistencies as pointed out by gcc -Wformat-signedness.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15480
2015-08-03 21:21:42 +00:00
Bart Van Assche
d138ed7448 regression tests: Remove superfluous backticks
Backticks are not needed around a shell statement that does not produce
any output.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15439
2015-07-23 02:47:42 +00:00
Julian Seward
ac60633d65 Bug 345248 - add support for Solaris OS in valgrind
Authors of this port:
    Petr Pavlu         setup@dagobah.cz
    Ivo Raisr          ivosh@ivosh.net
    Theo Schlossnagle  theo@omniti.com
            


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15426
2015-07-21 14:44:28 +00:00
Rhys Kidd
449347505f Block the running of a known hanging regression test on OS X. Partial fix for bz#344416, and related to BZ#216837.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15417
2015-07-19 07:19:54 +00:00
Florian Krohm
7a474c9455 Fix typos in source code. Patch by Dmitriy (olshevskiy87@bk.ru).
Fixes BZ #349874


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15394
2015-07-05 21:53:33 +00:00
Philippe Waroquiers
cc8f7a352f Make some numbers in helgrind stats use , separators, as the numbers can be big
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15294
2015-05-26 21:27:36 +00:00
Philippe Waroquiers
f4dde903ab This patch decreases significantly the memory needed for OldRef and
slightly increases the performance. It also moderately improves
the nr of cases where helgrind can provide the stack trace of the old
access (when using the same amount of memory for the OldRef entries).
The patch also provides a new helgrind monitor command to show
the recorded accesses for an address+len, and adds an optional argument
lock_address to the monitor command 'info locks', to show the info
about just this lock.

Currently, oldref are maintained in a sparse WA, that points to N
entries, as specified by --conflict-cache-size=N.
For each entry (associated to an address), we have the last 5 accesses.

Old entries are recycled in an exact LRU order.
But inside an entry, we could have a recent access, and 4 very
old accesses that are kept 'alive' by a single thread accessing
repetitively the address shared with the 4 other old entries.


The attached patch replaces the sparse WA that maintains the OldRef
by a hash table.
Each OldRef now also only maintains one single access for an address.
As an OldRef now maintains only one access, all the entries are now
strictly in LRU mode.

Memory used for OldRef
-----------------------
For the trunk, an OldRef has a size of 72 bytes (on 32 bits archs)
maintaining up to 5 accesses to the same address.
On 64 bits arch, an OldRef is 104 bytes.

With the patch, an OldRef has a size of 32 bytes (on 32 bits archs)
or 56 bytes (on 64 bits archs).

So, for one single access, the new code needs (on 32 bits)
32 bytes, while the trunk needs only 14.4 bytes.
However, that is the worst case, assuming that the 5 entries in the
accs array are all used.
Looking on 2 big apps (one of them being firefox), we see that
we have very few OldRef entries that have the 5 entries occupied.
On a firefox startup, of the 5x1,000,000 accesses, we only have
1,406,939 accesses that are used.
So, on average, the trunk uses in reality around 52 bytes per access.

The default value for --conflict-cache-size has been doubled to 2000000.
This ensures that the memory used for the OldRef is more or less the
same as the trunk (104Mb for OldRef entries).
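As a rough check of the figures above (a back-of-the-envelope sketch; it assumes the previous default of 1,000,000 OldRef entries, which the doubling to 2000000 implies but the text does not state outright):

```latex
\begin{align*}
\text{trunk, all 5 slots used (32 bits):} \quad & 72 / 5 = 14.4 \text{ bytes per access} \\
\text{trunk, firefox startup (32 bits):} \quad & \frac{1{,}000{,}000 \times 72}{1{,}406{,}939} \approx 51.2 \text{ bytes per access} \\
\text{trunk, OldRef memory (64 bits):} \quad & 1{,}000{,}000 \times 104 = 104{,}000{,}000 \text{ bytes} \\
\text{patch, OldRef memory (64 bits):} \quad & 2{,}000{,}000 \times 56 = 112{,}000{,}000 \text{ bytes}
\end{align*}
```

The second line is in the same range as the "around 52 bytes" quoted, and the last two lines show why doubling the entry count keeps the total close to the trunk's 104Mb.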

Memory used for sparseWA versus hashtable
-----------------------------------------
Looking on 2 big apps (one of them being firefox), we see that
there are big variations in the size of the WA : it can go in a few
seconds from 10MB to 250MB, or can decrease back to 10 MB.
This all depends where the last N accesses were done: if well localised,
the WA will be small.
If the last N accesses were distributed over a big address space,
then the WA will be big: the last level of WA (the biggest memory consumer)
uses slightly more than 1KB (2KB on 64 bits) for each '256 bytes' memory
zone where there is an oldref. So, in the worst case, on 32 bits, we
need > 1_000_000_000 sparseWA memory to keep 1_000_000 OldRef.

The hash table has between 1 and 2 Words of overhead per OldRef
(as the chain array is +- doubled each time the hash table is full).
So, unless the OldRef are extremely localised, the overhead of the
hash table will be significantly less.

With the patch, the core arena total alloc is:
  5299535/1201448632 totalloc-blocks/bytes
The trunk is
  6693111/3959050280 totalloc-blocks/bytes
(so, around 1.20Gb versus 3.95Gb).
This big difference is due to the fact that the sparseWA repetitively
allocates then frees Level0 or LevelN when OldRef in the region covered
by the Level0/N have all been recycled.

In terms of CPU
---------------
With the patch, on amd64, a firefox startup seems slightly faster (around 1%).
The peak memory mmaped/used decreases by 200Mb.
For a libreoffice test, the memory decreases by 230Mb. CPU also decreases
slightly (1%).


In terms of correctness:
-----------------------
The trunk could potentially show not the most recent access
to the memory of a race : the first OldRef entry matching the raced upon
address was used, while we could have a more recent access in a following
OldRef entry. In other words, the trunk only guaranteed to find the
most recent access in an OldRef, but not between the several OldRef that
could cover the raced upon address.
So, assuming it is important to show the most recent access, this patch
ensures we really show the most recent access, even in presence of overlapping
accesses.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15289
2015-05-25 17:24:27 +00:00
Philippe Waroquiers
50f5deb159 helgrind stats: show the total nr of thr_n_rcec
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15285
2015-05-23 15:47:35 +00:00
Philippe Waroquiers
2550bcbf3e helgrind stats: give the memory occupied by the OldRef
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15284
2015-05-23 15:35:29 +00:00
Philippe Waroquiers
5ae8b759d4 Add stats in helgrind for oldref history found versus not found
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15283
2015-05-23 12:25:22 +00:00
Philippe Waroquiers
d28004b4f3 Follow up to r15253:
Having a one elt free lineF cache avoids many PA calls.
This seems to slightly improve (a few %) a firefox startup.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15254
2015-05-17 21:36:05 +00:00
Philippe Waroquiers
435d51c1c9 This patch reduces the memory needed for the linesF.
Currently, each SecMap has an array of linesF, referenced by the linesZ
of the secmap that needs a lineF, via an index stored in dict[1].
When the array is full, its size is doubled.
The linesF array of a secmap is freed when the SecMap is GC-ed.
The above strategy has the following consequences:
  A. on average, 25% of the LinesF are unused.
  B. if a SecMap has 'temporarily' a need for linesF, but afterwards,
     these linesF are converted to normal lineZ representation, the linesF
     will not be recuperated unless the SecMap is GC-ed (i.e. fully marked
     no access).

The patch replaces the linesF array private per SecMap
by a pool allocator of LinesF shared between all SecMap.
A lineZ that needs a lineF will directly point to its lineF (using a pointer
stored in dict[1]), instead of having in dict[1] the index in the SecMap
linesF array.
When a lineZ needs a lineF, it is allocated from the pool allocator.
When a lineZ does not need anymore a lineF, it is returned back to the
pool allocator.
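The shared-pool idea can be sketched with a simple free list. This is an illustration only, not the Valgrind pool allocator: the names Pool, pool_alloc and pool_free are hypothetical, the 64-byte payload size is arbitrary, and fresh elements come from malloc one at a time rather than in chunks.

```c
#include <stddef.h>
#include <stdlib.h>

/* A free element stores the link to the next free element in its own
   storage, so the free list costs no extra memory. */
typedef union Elt {
    union Elt *next_free;
    unsigned char payload[64];   /* arbitrary element size (assumption) */
} Elt;

typedef struct {
    Elt *free_list;   /* elements returned by their users, ready for reuse */
    size_t in_use;    /* elements currently handed out */
} Pool;

/* Take an element: reuse a returned one if possible, else get a new one. */
static void *pool_alloc(Pool *p) {
    Elt *e = p->free_list;
    if (e != NULL) {
        p->free_list = e->next_free;
    } else {
        e = malloc(sizeof *e);
        if (e == NULL) return NULL;
    }
    p->in_use++;
    return e;
}

/* Give an element back: it becomes available to any other user. */
static void pool_free(Pool *p, void *ptr) {
    Elt *e = ptr;
    e->next_free = p->free_list;
    p->free_list = e;
    p->in_use--;
}
```

Because every user draws from the same list, an element released by one owner is immediately reusable by another, which is the property the per-SecMap arrays lacked.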

On a firefox startup, the above strategy reduces the memory for linesF
by about 42Mb. It seems that the more firefox is used (e.g. to visit
a few websites), the bigger the memory gain.
After opening the home page of valgrind, wikipedia and google, the memory
gain is about 94Mb:
trunk:
  linesF:    392,181 allocd ( 203,934,120 bytes occupied) (   173,279 used)
patch:
  linesF:    212,966 allocd ( 109,038,592 bytes occupied) (   170,252 used)

There is also less alloc/free operations in core arena with the patch:
trunk:
  core    :   810,680,320/  802,291,712 max/curr mmap'd, 17/19 unsplit/split sb unmmap'd,    759,441,224/  703,191,896 max/curr,    40631760/16376828248 totalloc-blocks/bytes,   188015696 searches 8 rzB
patch:
  core    :   701,628,416/  690,753,536 max/curr mmap'd, 12/29 unsplit/split sb unmmap'd,    643,041,944/  577,793,712 max/curr,    32050040/14056017712 totalloc-blocks/bytes,   174097728 searches 8 rzB


In terms of performance, no CPU impact detected on Firefox startup.
Note we have no representative reproducible (and preferably small)
perf test that uses extensively linesF. Firefox is a good heavy lineF
user but is far from reproducible, and very far from small.

Theoretically, in terms of CPU performance, the patch might have some
small benefits here and there for read operations, as the lineF pointer
is directly retrieved from the lineZ, rather than retrieved via an indirection
in the linesF array.
For write operations, the patch might need a little bit more CPU,
as we replace an
  assignment to lineF inUse boolean to False (and then probably back to True
  when the cacheline is written back)
by 
  a call to pool allocator VG_(freeEltPA) (and then probably a call to
  VG_(allocEltPA) when the cacheline is written back).
These PA functions are small, so cost should be ok.
We might however still maintain in clear_LineF_of_Z the last cleared lineF
and re-use it in alloc_LineF_for_Z. Not sure how many calls to the PA functions
would be avoided by this '1 elt cache' (and the needed 'if elt == NULL'
check in both clear_LineF_of_Z and alloc_LineF_for_Z).
This possible optimisation will be looked at later.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15253
2015-05-17 19:32:42 +00:00
Philippe Waroquiers
00ef870633 When process dies due to a signal, show the signal and the stacktrace
at default verbosity


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15251
2015-05-17 18:31:55 +00:00
Carl Love
d97d1f6cf3 Patch 5 in a revised series of cleanup patches from Will Schmidt
Add a .exp for the pth_cond_destroy_busy for PPC64 big endian.
This is specifically to cover the last line of output as
seen on ppc64BE, which is "ERROR SUMMARY: X errors from 3 contexts",
where X is 6, versus 3 as seen on other architectures.
The additional errors show up on BE during the "Thread #1:
pthread_cond_destroy: destruction of condition variable being waited upon."

Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>

This patch fixes Valgrind bugzilla 347686


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15239
2015-05-15 20:09:05 +00:00
Philippe Waroquiers
758450e623 Add statistics about the nr of used linesF
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15237
2015-05-15 13:17:17 +00:00
Philippe Waroquiers
6d0f2f35c0 This patch (re-)gains performance in helgrind, following revision 15207, that
reduced memory use doing SecMap GC, but was slowing down some workloads
(typically, workloads doing a lot of malloc/free).

A significant part of the slowdown came from the clear of the filter,
that was not optimised for big ranges: the filter was working byte
by byte up to an 8-byte alignment, then 8 bytes at a time.

With the patch, the filter clear is done the following way:
   * all the bytes till 8 alignment are done together
   * then 8 bytes at a time till filter_line alignment (32 bytes)
   * then 32 bytes at a time.
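The three phases above can be sketched as a function that splits an address range into those steps. This is only an illustration of the alignment arithmetic, not the helgrind filter code: the names Steps and split_range are hypothetical, and leaving the final partial line as a plain byte count is an assumption, since the text does not describe the tail.

```c
#include <stddef.h>
#include <stdint.h>

enum { WORD = 8, LINE = 32 };   /* 8-byte words, 32-byte filter lines */

typedef struct {
    size_t head;    /* bytes before the first 8-aligned address, done together */
    size_t words;   /* 8-byte steps taken until 32-byte alignment */
    size_t lines;   /* 32-byte steps */
    size_t tail;    /* leftover bytes after the last full line (assumption) */
} Steps;

static Steps split_range(uintptr_t a, size_t len) {
    Steps s = { 0, 0, 0, 0 };

    /* Phase 1: all bytes up to the next 8-byte boundary in one go. */
    size_t head = (WORD - a % WORD) % WORD;
    if (head > len) head = len;
    s.head = head;
    a += head;
    len -= head;

    /* Phase 2: 8 bytes at a time until the address is 32-byte aligned. */
    while (len >= WORD && a % LINE != 0) {
        s.words++;
        a += WORD;
        len -= WORD;
    }

    /* Phase 3: 32 bytes at a time. */
    while (len >= LINE) {
        s.lines++;
        a += LINE;
        len -= LINE;
    }

    s.tail = len;
    return s;
}
```

For a 100-byte range starting at address 5, this gives a 3-byte head, three 8-byte steps, two 32-byte steps and 9 leftover bytes.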

Moreover, as the filter cache is small (1024 lines of 32 bytes),
clearing filter for ranges bigger than 32Kb was uselessly checking
several times the same entry. This is now avoided by using a range
check rather than a tag equality check.

As the new filter clear is significantly more complex than the previous simple
algorithm, the old algorithm is kept and used to check the new algorithm
when CHECK_ZSM is defined as 1.

The patch also contains a few micro optimisations and
disables 
   // VG_(track_die_mem_stack)       ( evh__die_mem );
as this had no effect and was somewhat costly.

With this patch, we have almost reached for all perf tests the same
performance as we had before revision 15207. Some tests are still
slightly slower than before the SecMap GC (max 2% difference).
Some tests are now significantly faster (e.g. sarp).
For almost all tests, we are now faster than valgrind 3.10.1.
Details below.

Regtested on x86/amd64/ppc64 (and regtested with all compile time
checks set).
I have also regtested with libreoffice and firefox.
(with firefox, also with CHECK_ZSM set to 1).


Details about performance:
hgtrace = this patch
trunk_untouched = trunk
base_secmap = trunk before secmap GC
valgrind 3.10.1 included for comparison
Measured on core i5 2.53GHz


-- Running  tests in perf ----------------------------------------------
-- bigcode1 --
bigcode1 hgtrace   :0.14s  he: 2.6s (18.4x, -----)
bigcode1 trunk_untouched:0.14s  he: 2.6s (18.4x, -0.4%)
bigcode1 base_secmap:0.14s  he: 2.6s (18.6x, -1.2%)
bigcode1 valgrind-3.10.1:0.14s  he: 2.8s (19.8x, -7.8%)
-- bigcode2 --
bigcode2 hgtrace   :0.14s  he: 6.3s (44.7x, -----)
bigcode2 trunk_untouched:0.14s  he: 6.2s (44.6x,  0.2%)
bigcode2 base_secmap:0.14s  he: 6.3s (45.0x, -0.6%)
bigcode2 valgrind-3.10.1:0.14s  he: 6.6s (47.1x, -5.4%)
-- bz2 --
bz2      hgtrace   :0.64s  he:11.3s (17.7x, -----)
bz2      trunk_untouched:0.64s  he:11.7s (18.2x, -3.2%)
bz2      base_secmap:0.64s  he:11.1s (17.3x,  1.9%)
bz2      valgrind-3.10.1:0.64s  he:12.6s (19.7x,-11.3%)
-- fbench --
fbench   hgtrace   :0.29s  he: 3.4s (11.8x, -----)
fbench   trunk_untouched:0.29s  he: 3.4s (11.7x,  0.6%)
fbench   base_secmap:0.29s  he: 3.6s (12.4x, -5.0%)
fbench   valgrind-3.10.1:0.29s  he: 3.5s (12.2x, -3.5%)
-- ffbench --
ffbench  hgtrace   :0.26s  he: 9.8s (37.7x, -----)
ffbench  trunk_untouched:0.26s  he:10.0s (38.4x, -1.9%)
ffbench  base_secmap:0.26s  he: 9.8s (37.8x, -0.2%)
ffbench  valgrind-3.10.1:0.26s  he:10.0s (38.4x, -1.9%)
-- heap --
heap     hgtrace   :0.11s  he: 9.2s (84.0x, -----)
heap     trunk_untouched:0.11s  he: 9.6s (87.1x, -3.7%)
heap     base_secmap:0.11s  he: 9.0s (81.9x,  2.5%)
heap     valgrind-3.10.1:0.11s  he: 9.1s (82.9x,  1.3%)
-- heap_pdb4 --
heap_pdb4 hgtrace   :0.13s  he:10.7s (82.3x, -----)
heap_pdb4 trunk_untouched:0.13s  he:11.0s (84.8x, -3.0%)
heap_pdb4 base_secmap:0.13s  he:10.5s (80.8x,  1.8%)
heap_pdb4 valgrind-3.10.1:0.13s  he:10.6s (81.8x,  0.7%)
-- many-loss-records --
many-loss-records hgtrace   :0.01s  he: 1.5s (152.0x, -----)
many-loss-records trunk_untouched:0.01s  he: 1.6s (157.0x, -3.3%)
many-loss-records base_secmap:0.01s  he: 1.6s (158.0x, -3.9%)
many-loss-records valgrind-3.10.1:0.01s  he: 1.7s (167.0x, -9.9%)
-- many-xpts --
many-xpts hgtrace   :0.03s  he: 2.8s (91.7x, -----)
many-xpts trunk_untouched:0.03s  he: 2.8s (94.7x, -3.3%)
many-xpts base_secmap:0.03s  he: 2.8s (94.0x, -2.5%)
many-xpts valgrind-3.10.1:0.03s  he: 2.9s (97.7x, -6.5%)
-- memrw --
memrw    hgtrace   :0.06s  he: 7.3s (121.2x, -----)
memrw    trunk_untouched:0.06s  he: 7.2s (120.3x,  0.7%)
memrw    base_secmap:0.06s  he: 7.1s (117.7x,  2.9%)
memrw    valgrind-3.10.1:0.06s  he: 8.1s (135.2x,-11.6%)
-- sarp --
sarp     hgtrace   :0.02s  he: 7.6s (378.5x, -----)
sarp     trunk_untouched:0.02s  he: 8.4s (422.0x,-11.5%)
sarp     base_secmap:0.02s  he: 8.6s (431.0x,-13.9%)
sarp     valgrind-3.10.1:0.02s  he: 8.8s (442.0x,-16.8%)
-- tinycc --
tinycc   hgtrace   :0.20s  he:12.4s (62.0x, -----)
tinycc   trunk_untouched:0.20s  he:12.6s (63.2x, -1.9%)
tinycc   base_secmap:0.20s  he:12.6s (63.0x, -1.6%)
tinycc   valgrind-3.10.1:0.20s  he:12.7s (63.5x, -2.3%)
-- Finished tests in perf ----------------------------------------------

== 12 programs, 48 timings =================




git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15236
2015-05-15 11:41:54 +00:00