ftmemsim-valgrind

mirror of https://github.com/Zenithsiz/ftmemsim-valgrind.git synced 2026-02-04 02:18:37 +00:00

Author	SHA1	Message	Date
Ivo Raisr	702c19e525	Fix typo in Helgrind's wrapper of pthread_spin_destroy(). Patch provided by: Jason Dillaman <dillaman@redhat.com>. Fixes BZ #357871. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15756	2016-01-12 20:31:15 +00:00
Petar Jovanovic	13e817c2ea	mips: update exp files for helgrind/tests/tc20_verifywrap Some recent changes, starting from r15426, have modified the test and its expected output. The exp files have been only partially updated for MIPS. We complete that with this change. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15752	2015-12-23 18:48:18 +00:00
Petar Jovanovic	a5f0b51ef3	mips: update expected output for helgrind/tests/tc18_semabuse r15620 changed the test and the expected output for tc18_semabuse, r15630 fixed the expected output file for other architectures but not for mips. Now we update it for mips as well. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15751	2015-12-22 16:06:07 +00:00
Ivo Raisr	0d30686d21	When searching for global public symbols (like for the somalloc synonym symbols), exclude the dynamic (runtime) linker as it is very special. Fixes BZ#355454 git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15728	2015-11-18 20:38:37 +00:00
Philippe Waroquiers	6aa9e95b75	If --history-level=full was not provided at startup, report an error in helgrind accesshistory monitor command As accesshistory will never show anything unless this option is given. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15723	2015-11-08 10:42:06 +00:00
Mark Wielaard	887805da64	Correct expected output of tc18 and tc20 helgrind tests. The addition of the safe wrapper in r15620 introduced an extra output frame in the backtrace of helgrind/tests/tc18_semabuse and helgrind/tests/tc20_verifywrap. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15630	2015-09-05 20:45:04 +00:00
Mark Wielaard	235c116f2d	Add safe sem_post handler and glibc-2.21 expected output for helgrind tests. This fixes the tc18 and tc20 testcases. On some bad semaphores glibc now might just abort, we catch the SIGABRT and turn it into an EINVAL. The program will see this, but the helgrind wrapper won't. Which works for tc18 since there is an alternate exp file with that result (silent bad sem_post). We add a similar alternative exp file for tc20. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15620	2015-09-04 09:41:42 +00:00
Mark Wielaard	cba6bd0b31	Add safe-pthread.h to helgrind/tests/Makefile.am noinst_HEADERS. Otherwise the header file won't show up in the dist tar ball. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15589	2015-08-25 13:07:42 +00:00
Tom Hughes	c5a1912be8	Use sigjmp_buf git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15588	2015-08-24 19:26:56 +00:00
Tom Hughes	781dec8f80	Restore signal masks when recovering from xend related signals git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15587	2015-08-24 19:10:06 +00:00
Julian Seward	adc2dafee9	Update copyright dates, to include 2015. No functional change. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15577	2015-08-21 11:32:26 +00:00
Julian Seward	a87df80edf	Remove non-ASCII characters from this file. No functional change. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15576	2015-08-21 11:04:48 +00:00
Ivo Raisr	8372cfdb0f	Follow-up fix for r15565. Expected output of some helgrind tests slightly differed on Solaris. n-i-bz git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15571	2015-08-20 05:50:49 +00:00
Mark Wielaard	9b322bb026	Also install sigsegv handler in safe-pthread tests wrapper. In case we do recognize the xend, but detect it is invalid (used outside a transaction) we generate a sigsegv instead of a sigill. Handle that in the same way in the test case. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15570	2015-08-19 13:26:28 +00:00
Rhys Kidd	b9389efb87	Follow-up fix for r15565: sa_restorer should not be used. n-i-bz. It is obsolete and not specified by POSIX. See man sigaction on Linux. No regressions reported. The following error may be seen on platforms that don't implement this extension: depbase=`echo tc12_rwl_trivial.o \| sed 's\|[^/]*$\|.deps/&\|;s\|\.o$\|\|'`;\ gcc -DHAVE_CONFIG_H -I. -I../.. -I../.. -I../../include -I../../coregrind -I../../include -I../../VEX/pub -I../../VEX/pub -DVGA_amd64=1 -DVGO_darwin=1 -DVGP_amd64_darwin=1 -DVGPV_amd64_darwin_vanilla=1 -DVGA_SEC_x86=1 -DVGP_SEC_amd64_darwin=1 -Winline -Wall -Wshadow -Wno-long-long -g -fno-stack-protector -Wno-format-extra-args -Wno-literal-range -Wno-tautological-constant-out-of-range-compare -Wno-self-assign -Wno-string-plus-int -Wno-uninitialized -Wno-unused-value -arch x86_64 -MT tc12_rwl_trivial.o -MD -MP -MF $depbase.Tpo -c -o tc12_rwl_trivial.o tc12_rwl_trivial.c &&\ mv -f $depbase.Tpo $depbase.Po In file included from tc12_rwl_trivial.c:8: ./safe-pthread.h:37:7: error: no member named 'sa_restorer' in 'struct sigaction' sa.sa_restorer = NULL; ~~ ^ 1 error generated. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15569	2015-08-19 12:18:31 +00:00
Tom Hughes	7678f91cd1	Install the SIGILL handler everywhere so we get consistent stacks and don't have to worry about __GLIBC_PREREQ not being defined on all platforms. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15567	2015-08-19 08:27:06 +00:00
Tom Hughes	b22d60778f	Attempt to work around issues with xend being executed unconditionally when a pthread_rwlock is used in an invalid way. Recent glibcs use transactional memory instructions to do lock elision but will sometimes, when locks are used in an invalid way, make calls to xend on systems which don't support it, on the grounds that the program is invalid anyway. So we try and catch and ignore the resulting SIGILL in our tests that deliberately work with invalid locks. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15565	2015-08-18 10:29:20 +00:00
Philippe Waroquiers	b7380c7e2d	Remove duplicate definition of VALGRIND_HG_ENABLE_CHECKING, wrongly introduced in rev 15207 git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15513	2015-08-09 14:43:33 +00:00
Philippe Waroquiers	7844f70805	(try to) avoid tc09_bad_unlock random failure tc09_bad_unlock fails randomly for the following reason: Thread 1 is creating a lock in a stack variable and locks it. It then clones a Thread 2 that will unlock this lock. The test fails if the Thread 2 unlocks the lock while the main thread is still just after the clone syscall: There is no unwind info in this area, and so doing a stacktrace implies a nasty hack (see hg_main.c evh__pre_thread_ll_create). There is no such hack when describing the address of the lock (as there is no logic in the 'normal' stack trace to detect we are in the clone syscall code). In such a case, the unwind fails, and the lock address description lacks the frame nr derived from the captured stack trace. Adding --fair-sched=yes seems to make a more reproducible test. Note that the proper solution to all these 'racy helgrind regtests' would be to add some synchronisation operations between threads that helgrind does not observe (e.g. using a technique similar to the pipe big lock) and have correct (but invisible to helgrind) synchronisation between the threads actions needed for a reproducible regtest. Not very cheap to develop, --fair-sched=try is cheap and easy so use that till someone courageous implements non visible synchronisation git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15497	2015-08-05 17:43:11 +00:00
Florian Krohm	7bd7811604	The number of elements in a hash table cannot be negative. Let the return type of VG_(HT_count_nodes) reflect that. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15490	2015-08-05 11:26:10 +00:00
Florian Krohm	ad32052369	Fix printf format inconsistencies as pointed out by gcc -Wformat-signedness. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15480	2015-08-03 21:21:42 +00:00
Bart Van Assche	d138ed7448	regression tests: Remove superfluous backticks Backticks are not needed around a shell statement that does not produce any output. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15439	2015-07-23 02:47:42 +00:00
Julian Seward	ac60633d65	Bug 345248 - add support for Solaris OS in valgrind Authors of this port: Petr Pavlu setup@dagobah.cz Ivo Raisr ivosh@ivosh.net Theo Schlossnagle theo@omniti.com git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15426	2015-07-21 14:44:28 +00:00
Rhys Kidd	449347505f	Block the running of a known hanging regression test on OS X. Partial fix for bz#344416, and related to BZ#216837. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15417	2015-07-19 07:19:54 +00:00
Florian Krohm	7a474c9455	Fix typos in source code. Patch by Dmitriy (olshevskiy87@bk.ru). Fixes BZ #349874 git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15394	2015-07-05 21:53:33 +00:00
Philippe Waroquiers	cc8f7a352f	Make some numbers in helgrind stats use , separators, as the numbers can be big git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15294	2015-05-26 21:27:36 +00:00
Philippe Waroquiers	f4dde903ab	This patch decreases significantly the memory needed for OldRef and slightly increases the performance. It also moderately improves the nr of cases where helgrind can provide the stack trace of the old access (when using the same amount of memory for the OldRef entries). The patch also provides a new helgrind monitor command to show the recorded accesses for an address+len, and adds an optional argument lock_address to the monitor command 'info locks', to show the info about just this lock. Currently, oldref are maintained in a sparse WA, that points to N entries, as specified by --conflict-cache-size=N. For each entry (associated to an address), we have the last 5 accesses. Old entries are recycled in an exact LRU order. But inside an entry, we could have a recent access, and 4 very old accesses that are kept 'alive' by a single thread accessing repetitively the address shared with the 4 other old entries. The attached patch replaces the sparse WA that maintains the OldRef by a hash table. Each OldRef now also only maintains one single access for an address. As an OldRef now maintains only one access, all the entries are now strictly in LRU mode. Memory used for OldRef ----------------------- For the trunk, an OldRef has a size of 72 bytes (on 32 bits archs) maintaining up to 5 accesses to the same address. On 64 bits arch, an OldRef is 104 bytes. With the patch, an OldRef has a size of 32 bytes (on 32 bits archs) or 56 bytes (on 64 bits archs). So, for one single access, the new code needs (on 32 bits) 32 bytes, while the trunk needs only 14.4 bytes. However, that is the worst case, assuming that the 5 entries in the accs array are all used. Looking on 2 big apps (one of them being firefox), we see that we have very few OldRef entries that have the 5 entries occupied. On a firefox startup, of the 5x1,000,000 accesses, we only have 1,406,939 accesses that are used. 
So, in average, the trunk uses in reality around 52 bytes per access. The default value for --conflict-cache-size has been doubled to 2000000. This ensures that the memory used for the OldRef is more or less the same as the trunk (104Mb for OldRef entries). Memory used for sparseWA versus hashtable ----------------------------------------- Looking on 2 big apps (one of them being firefox), we see that there are big variations on the size of the WA : it can go in a few seconds from 10MB to 250MB, or can decrease back to 10 MB. This all depends where the last N accesses were done: if well localised, the WA will be small. If the last N accesses were distributed over a big address space, then the WA will be big: the last level of WA (the biggest memory consumer) uses slightly more than 1KB (2KB on 64 bits) for each '256 bytes' memory zone where there is an oldref. So, in the worst case, on 32 bits, we need > 1_000_000_000 sparseWA memory to keep 1_000_000 OldRef. The hash table has between 1 to 2 Word overhead per OldRef (as the chain array is +- doubled each time the hash table is full). So, unless the OldRef are extremely localised, the overhead of the hash table will be significantly less. With the patch, the core arena total alloc is: 5299535/1201448632 totalloc-blocks/bytes The trunk is 6693111/3959050280 totalloc-blocks/bytes (so, around 1.20Gb versus 3.95Gb). This big difference is due to the fact that the sparseWA repetitively allocates then frees Level0 or LevelN when OldRef in the region covered by the Level0/N have all been recycled. In terms of CPU --------------- With the patch, on amd64, a firefox startup seems slightly faster (around 1%). The peak memory mmaped/used decreases by 200Mb. For a libreoffice test, the memory decreases by 230Mb. CPU also decreases slightly (1%). 
In terms of correctness: ----------------------- The trunk could potentially show not the most recent access to the memory of a race : the first OldRef entry matching the raced upon address was used, while we could have a more recent access in a following OldRef entry. In other words, the trunk only guaranteed to find the most recent access in an OldRef, but not between the several OldRef that could cover the raced upon address. So, assuming it is important to show the most recent access, this patch ensures we really show the most recent access, even in presence of overlapping accesses. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15289	2015-05-25 17:24:27 +00:00
Philippe Waroquiers	50f5deb159	helgrind stats: show the total nr of thr_n_rcec git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15285	2015-05-23 15:47:35 +00:00
Philippe Waroquiers	2550bcbf3e	helgrind stats: give the memory occupied by the OldRef git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15284	2015-05-23 15:35:29 +00:00
Philippe Waroquiers	5ae8b759d4	Add stats in helgrind for oldref history found versus not found git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15283	2015-05-23 12:25:22 +00:00
Philippe Waroquiers	d28004b4f3	Follow up to r15253: Having a one elt free lineF cache avoids many PA calls. This seems to slightly improve (a few %) a firefox startup. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15254	2015-05-17 21:36:05 +00:00
Philippe Waroquiers	435d51c1c9	This patch reduces the memory needed for the linesF. Currently, each SecMap has an array of linesF, referenced by the linesZ of the secmap that needs a lineF, via an index stored in dict[1]. When the array is full, its size is doubled. The linesF array of a secmap is freed when the SecMap is GC-ed. The above strategy has the following consequences: A. in average, 25% of the LinesF are unused. B. if a SecMap has 'temporarily' a need for linesF, but afterwards, these linesF are converted to normal lineZ representation, the linesF will not be recuperated unless the SecMap is GC-ed (i.e. fully marked no access). The patch replaces the linesF array private per SecMap by a pool allocator of LinesF shared between all SecMap. A lineZ that needs a lineF will directly point to its lineF (using a pointer stored in dict[1]), instead of having in dict[1] the index in the SecMap linesF array. When a lineZ needs a lineF, it is allocated from the pool allocator. When a lineZ does not need anymore a lineF, it is returned back to the pool allocator. On a firefox startup, the above strategy reduces the memory for linesF by about 42Mb. It seems that the more firefox is used (e.g. to visit a few websites), the bigger the memory gain. 
After opening the home page of valgrind, wikipedia and google, the memory gain is about 94Mb: trunk: linesF: 392,181 allocd ( 203,934,120 bytes occupied) ( 173,279 used) patch: linesF: 212,966 allocd ( 109,038,592 bytes occupied) ( 170,252 used) There are also fewer alloc/free operations in core arena with the patch: trunk: core : 810,680,320/ 802,291,712 max/curr mmap'd, 17/19 unsplit/split sb unmmap'd, 759,441,224/ 703,191,896 max/curr, 40631760/16376828248 totalloc-blocks/bytes, 188015696 searches 8 rzB patch: core : 701,628,416/ 690,753,536 max/curr mmap'd, 12/29 unsplit/split sb unmmap'd, 643,041,944/ 577,793,712 max/curr, 32050040/14056017712 totalloc-blocks/bytes, 174097728 searches 8 rzB In terms of performance, no CPU impact detected on Firefox startup. Note we have no representative reproducible (and preferably small) perf test that uses extensively linesF. Firefox is a good heavy lineF user but is far from reproducible, and very far from small. Theoretically, in terms of CPU performance, the patch might have some small benefits here and there for read operations, as the lineF pointer is directly retrieved from the lineZ, rather than retrieved via an indirection in the linesF array. For write operations, the patch might need a little bit more CPU, as we replace an assignment to lineF inUse boolean to False (and then probably back to True when the cacheline is written back) by a call to pool allocator VG_(freeEltPA) (and then probably a call to VG_(allocEltPA) when the cacheline is written back). These PA functions are small, so cost should be ok. We might however still maintain in clear_LineF_of_Z the last cleared lineF and re-use it in alloc_LineF_for_Z. Not sure how many calls to the PA functions would be avoided by this '1 elt cache' (and the needed 'if elt == NULL' check in both clear_LineF_of_Z and alloc_LineF_for_Z). This possible optimisation will be looked at later. 
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15253	2015-05-17 19:32:42 +00:00
Philippe Waroquiers	00ef870633	When process dies due to a signal, show the signal and the stacktrace at default verbosity git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15251	2015-05-17 18:31:55 +00:00
Carl Love	d97d1f6cf3	Patch 5 in a revised series of cleanup patches from Will Schmidt Add a .exp for the pth_cond_destroy_busy for PPC64 big endian. This is specifically to cover the last line of output as seen on ppc64BE, which is "ERROR SUMMARY: X errors from 3 contexts", where X is 6, versus 3 as seen on other architectures. The additional errors show up on BE during the "Thread #1: pthread_cond_destroy: destruction of condition variable being waited upon." Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> This patch fixes Valgrind bugzilla 347686 git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15239	2015-05-15 20:09:05 +00:00
Philippe Waroquiers	758450e623	Add statistics about the nr of used linesF git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15237	2015-05-15 13:17:17 +00:00
Philippe Waroquiers	6d0f2f35c0	This patch (re-)gains performance in helgrind, following revision 15207, that reduced memory use doing SecMap GC, but was slowing down some workloads (typically, workloads doing a lot of malloc/free). A significant part of the slowdown came from the clear of the filter, that was not optimised for big ranges : the filter was working byte per byte till an 8 alignment. Then working per 8 bytes at a time. With the patch, the filter clear is done the following way: * all the bytes till 8 alignment are done together * then 8 bytes at a time till filter_line alignment (32 bytes) * then 32 bytes at a time. Moreover, as the filter cache is small (1024 lines of 32 bytes), clearing filter for ranges bigger than 32Kb was uselessly checking several times the same entry. This is now avoided by using a range check rather than a tag equality check. As the new filter clear is significantly more complex than the previous simple algorithm, the old algorithm is kept and used to check the new algorithm when CHECK_ZSM is defined as 1. The patch also contains a few micro optimisations and disables // VG_(track_die_mem_stack) ( evh__die_mem ); as this had no effect and was somewhat costly. With this patch, we have almost reached for all perf tests the same performance as we had before revision 15207. Some tests are still slightly slower than before the SecMap GC (max 2% difference). Some tests are now significantly faster (e.g. sarp). For almost all tests, we are now faster than valgrind 3.10.1. Details below. Regtested on x86/amd64/ppc64 (and regtested with all compile time checks set). I have also regtested with libreoffice and firefox. (with firefox, also with CHECK_ZSM set to 1). 
Details about performance: hgtrace = this patch trunk_untouched = trunk base_secmap = trunk before secmap GC valgrind 3.10.1 included for comparison Measured on core i5 2.53GHz -- Running tests in perf ---------------------------------------------- -- bigcode1 -- bigcode1 hgtrace :0.14s he: 2.6s (18.4x, -----) bigcode1 trunk_untouched:0.14s he: 2.6s (18.4x, -0.4%) bigcode1 base_secmap:0.14s he: 2.6s (18.6x, -1.2%) bigcode1 valgrind-3.10.1:0.14s he: 2.8s (19.8x, -7.8%) -- bigcode2 -- bigcode2 hgtrace :0.14s he: 6.3s (44.7x, -----) bigcode2 trunk_untouched:0.14s he: 6.2s (44.6x, 0.2%) bigcode2 base_secmap:0.14s he: 6.3s (45.0x, -0.6%) bigcode2 valgrind-3.10.1:0.14s he: 6.6s (47.1x, -5.4%) -- bz2 -- bz2 hgtrace :0.64s he:11.3s (17.7x, -----) bz2 trunk_untouched:0.64s he:11.7s (18.2x, -3.2%) bz2 base_secmap:0.64s he:11.1s (17.3x, 1.9%) bz2 valgrind-3.10.1:0.64s he:12.6s (19.7x,-11.3%) -- fbench -- fbench hgtrace :0.29s he: 3.4s (11.8x, -----) fbench trunk_untouched:0.29s he: 3.4s (11.7x, 0.6%) fbench base_secmap:0.29s he: 3.6s (12.4x, -5.0%) fbench valgrind-3.10.1:0.29s he: 3.5s (12.2x, -3.5%) -- ffbench -- ffbench hgtrace :0.26s he: 9.8s (37.7x, -----) ffbench trunk_untouched:0.26s he:10.0s (38.4x, -1.9%) ffbench base_secmap:0.26s he: 9.8s (37.8x, -0.2%) ffbench valgrind-3.10.1:0.26s he:10.0s (38.4x, -1.9%) -- heap -- heap hgtrace :0.11s he: 9.2s (84.0x, -----) heap trunk_untouched:0.11s he: 9.6s (87.1x, -3.7%) heap base_secmap:0.11s he: 9.0s (81.9x, 2.5%) heap valgrind-3.10.1:0.11s he: 9.1s (82.9x, 1.3%) -- heap_pdb4 -- heap_pdb4 hgtrace :0.13s he:10.7s (82.3x, -----) heap_pdb4 trunk_untouched:0.13s he:11.0s (84.8x, -3.0%) heap_pdb4 base_secmap:0.13s he:10.5s (80.8x, 1.8%) heap_pdb4 valgrind-3.10.1:0.13s he:10.6s (81.8x, 0.7%) -- many-loss-records -- many-loss-records hgtrace :0.01s he: 1.5s (152.0x, -----) many-loss-records trunk_untouched:0.01s he: 1.6s (157.0x, -3.3%) many-loss-records base_secmap:0.01s he: 1.6s (158.0x, -3.9%) many-loss-records 
valgrind-3.10.1:0.01s he: 1.7s (167.0x, -9.9%) -- many-xpts -- many-xpts hgtrace :0.03s he: 2.8s (91.7x, -----) many-xpts trunk_untouched:0.03s he: 2.8s (94.7x, -3.3%) many-xpts base_secmap:0.03s he: 2.8s (94.0x, -2.5%) many-xpts valgrind-3.10.1:0.03s he: 2.9s (97.7x, -6.5%) -- memrw -- memrw hgtrace :0.06s he: 7.3s (121.2x, -----) memrw trunk_untouched:0.06s he: 7.2s (120.3x, 0.7%) memrw base_secmap:0.06s he: 7.1s (117.7x, 2.9%) memrw valgrind-3.10.1:0.06s he: 8.1s (135.2x,-11.6%) -- sarp -- sarp hgtrace :0.02s he: 7.6s (378.5x, -----) sarp trunk_untouched:0.02s he: 8.4s (422.0x,-11.5%) sarp base_secmap:0.02s he: 8.6s (431.0x,-13.9%) sarp valgrind-3.10.1:0.02s he: 8.8s (442.0x,-16.8%) -- tinycc -- tinycc hgtrace :0.20s he:12.4s (62.0x, -----) tinycc trunk_untouched:0.20s he:12.6s (63.2x, -1.9%) tinycc base_secmap:0.20s he:12.6s (63.0x, -1.6%) tinycc valgrind-3.10.1:0.20s he:12.7s (63.5x, -2.3%) -- Finished tests in perf ---------------------------------------------- == 12 programs, 48 timings ================= git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15236	2015-05-15 11:41:54 +00:00
Philippe Waroquiers	ba3235fb91	micro-opt: add an UNLIKELY git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15235	2015-05-15 09:38:54 +00:00
Philippe Waroquiers	1d5575986a	VTS stats * add the missing increment to the nr of gc done * add vts pruning stat git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15214	2015-05-11 20:56:49 +00:00
Philippe Waroquiers	36dd38f551	Simplify shmem__invalidate_scache_range : it only has to handle cacheline aligned ranges. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15213	2015-05-11 20:18:10 +00:00
Philippe Waroquiers	a182f73861	Small optimisations in libhb_core.c * avoid indirection via function pointers to call SVal__rcinc and SVal__rcdec * declare these functions inlined * transform 2 asserts on hot path in conditionally compiled checks on CHECK_ZSM This slightly optimises some perf tests with helgrind git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15212	2015-05-11 19:45:08 +00:00
Philippe Waroquiers	4328dff7d1	This patch decreases the memory used by the helgrind SecMap, by implementing a Garbage Collection for the SecMap. The basic change is that freed memory is marked as noaccess (while before, it kept the previous marking, on the basis that non buggy applications are not accessing freed memory in any case). Keeping the previous marking avoids the CPU/memory changes needed to mark noaccess. However, marking freed memory noaccess and GC the secmap reduces the memory on big apps. For example, a firefox test needs 220Mb less (on about 2.06 Gb). Similar reduction for libreoffice batch (260 MB less on 1.09 Gb). On such applications, the performance with the patch is similar to the trunk. There is a performance decrease for applications that are doing a lot of malloc/free repetitively: e.g. on some perf tests, an increase in cpu of up to 15% has been observed. Several performance optimisations can be done afterwards to not lose too much performance. The decrease of memory is expected to produce in any case significant benefit in memory constrained environments (e.g. android phones). So, after discussion with Julian, it was decided to commit as-is and (re-)gain (part of) performance in follow-up commits. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15207	2015-05-10 22:19:31 +00:00
Philippe Waroquiers	9655203ffb	Reduce nr of lines produced by laog gc --stats=yes git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15176	2015-05-03 10:56:16 +00:00
Philippe Waroquiers	22988c423a	This patch reduces the memory needed for a VtsTE by 25% (one word) on 32 bits platforms. No memory reduction on 64 bits platforms, due to alignment. The patch also shows the vts stats when showing the helgrind stats. The perf/memrw.c perf test gets also some new additional features allowing e.g. to control the size of the read or written blocks. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15174	2015-05-01 17:12:00 +00:00
Philippe Waroquiers	0543df0e12	Give statistics about RCEC helgrind hash table chains. Improve statistic in coregrind hash table git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15141	2015-04-25 14:00:24 +00:00
Philippe Waroquiers	31e81facbe	Add some stats to helgrind stats: * nr of client malloc-ed blocks * how many OldRef helgrind has, and the distribution of these OldRef according to the nr of accs they have git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15128	2015-04-21 21:58:14 +00:00
Philippe Waroquiers	fddca72337	Do RCEC_GC when approaching the max nr of RCEC, not when reaching it. Otherwise, long running applications still see the max nr of RCEC slowly growing, which increases the memory usage and makes the (fixed) contextTab hash table slower to search. Without this margin, the max could increase as the GC code is not called at exactly the moment we reach the previous max, but rather when a thread has run a bunch of basic blocks. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15126	2015-04-21 20:55:40 +00:00
Philippe Waroquiers	1c8026546e	This patch changes the policy that does the GC of OldRef and RCEC conflict cache size. The current policy is: A 'more or less' LRU policy is implemented by giving to each OldRef a generation nr in which it was last touched. A new generation is created every 50000 new accesses. GC is done when the nr of OldRef reaches --conflict-cache-size. The GC consists in removing enough generations to free half of the entries. After GC of OldRef, the RCEC (Ref Counted Exe Contexts) not referenced anymore are GC-ed. The new policy is: An exact LRU policy is implemented using a doubly linked list of OldRef. When reaching --conflict-cache-size, the LRU entry is re-used. The not referenced RCEC are GC-ed when less than 75% of the RCEC are referenced, and the nr of RCEC is 'big' (at least half the size of the contextTab, and at least the max nr of RCEC reached previously). (note: tried to directly recover a unref'ed RCEC when recovering the LRU oldref, but that gives a lot of re-creation of RCEC). The new policy has the following advantages/disadvantages: 1. It is faster (at least for big applications) On a firefox startup/exit, we gain about 1m30 second on 11m. Similar 5..10% speed up encountered on other big applications or on the new perf/memrw test. The speed increase depends on the amount of memory touched by the application. For applications with a working set fitting in conflict-cache-size, the new policy might be marginally slower than previous policy on platforms having a small cache : the current policy only sets a generation nr when an address is re-accessed, while the new policy has to unchain and rechain the OldRef access in the LRU doubly linked list. 2. 
It uses less memory (at least for big applications) Firefox startup/exit "core" arena max use decreases from 1175MB mmap-ed/1060MB alloc-ed to 994MB mmap-ed/913MB alloc-ed The decrease in memory is the result of having a lot less RCEC: The current policy let the nr of RCEC grow till the conflict cache size is GC-ed. The new policy limits the nr of RCEC to 133% of the RCEC really referenced. So, we end up with a max nr of RCEC a lot smaller with the new policy : max RCEC 191000 versus 1317000, for a total nr of discard RCEC operations almost the same: 33M versus 32M. Also, the current policy allocates a big temporary array to do the GC of OldRef. With the new policy, size of an OldRef increases because we need 2 pointers for the LRU doubly linked list, and we need the accessed address. In total, the OldRef increase is limited to one Word, as we do not need anymore the gen, and the 'magic' for sanity check was removed (the check somewhat becomes less needed, because an OldRef is never freed anymore. Also, we do a new cross-check between the ga in the OldRef and the sparseWA key). For applications using small memory and having a small nr of different stack traces accessing memory, the new policy causes an increase in memory (one Word per OldRef). 3. Functionally, the new policy gives better past information: once the steady state is reached (i.e. the conflict cache is full), the new policy has always --conflict-cache-size entries of past information. The current policy has a nr of past information varying between --conflict-cache-size/2 and --conflict-cache-size (so in average, 75% of conflict-cache-size). 4. The new code is a little bit smaller/simpler: The generation based GC is replaced by a simpler LRU policy. So, in summary, this patch should allow big applications to use less cpu/memory, while having very little or no impact on memory/cpu of small applications. Note that the OldRef data structure LRU policy is not really explicitly tested by a regtest. 
Not easy at first sight to make such a test portable between platforms/OS/compilers/.... git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15119	2015-04-20 21:33:16 +00:00
Philippe Waroquiers	3041d1036a	Fix statistics about ctxt_rcec : * the nr of discards was always 0 * the cur nr of values was shown as max git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15105	2015-04-17 21:19:43 +00:00
Philippe Waroquiers	da3a839f90	Remove useless arguments in sparsewa, that were inherited from WordFM These arguments are not needed for sparsewa, as they can only return the key given in input. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15083	2015-04-11 11:42:22 +00:00
Philippe Waroquiers	e6c0ac4195	Have the event map GC use the same approach as the other GC done from libhb_maybe_GC, i.e. check the condition in libhb_maybe_GC, and call the (non inlined) GC only if a GC is needed. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15082	2015-04-10 19:34:14 +00:00
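
The exact LRU policy described in r15119 (a doubly linked list of OldRef, with the least recently used entry re-used once --conflict-cache-size is reached) can be illustrated with a small Python sketch. This is not Helgrind's C code: `ConflictCache`, `record` and `lookup` are hypothetical names, and `collections.OrderedDict` stands in for the hand-written doubly linked list. It keeps one access per address, as r15289 later does.

```python
from collections import OrderedDict


class ConflictCache:
    """Toy model of an exact-LRU access history (hypothetical, not Helgrind code)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Keys are addresses; insertion order doubles as LRU order
        # (oldest first), like a doubly linked list of entries.
        self._entries = OrderedDict()

    def record(self, address, access):
        """Remember the latest access to address; return the evicted address, if any."""
        evicted = None
        if address in self._entries:
            # Re-access: unchain the entry and rechain it at the MRU end.
            self._entries.move_to_end(address)
        elif len(self._entries) >= self.capacity:
            # Cache full: recycle the least recently used entry.
            evicted, _ = self._entries.popitem(last=False)
        self._entries[address] = access
        return evicted

    def lookup(self, address):
        """Return the recorded access for address, or None if it was recycled."""
        return self._entries.get(address)
```

Once the cache is full it always holds exactly `capacity` entries of past information, which is the functional advantage the commit message claims over the generation-based GC.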

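
The three-stage filter clear described in r15236 (bytes up to 8 alignment together, then 8 bytes at a time up to the 32-byte filter line alignment, then 32 bytes at a time) can be sketched as a range-splitting function. This is an illustration only: `split_clear_range` is a hypothetical name, and the handling of the unaligned tail is an assumption, since the commit message does not describe it.

```python
def split_clear_range(start, length, word=8, line=32):
    """Split [start, start+length) into (address, size) chunks.

    Hypothetical sketch of the staged clear; not Helgrind's actual code.
    """
    end = start + length
    chunks = []
    addr = start

    # Stage 1: all bytes up to the next 8-byte boundary, done together.
    head_end = min(end, (addr + word - 1) // word * word)
    if head_end > addr:
        chunks.append((addr, head_end - addr))
        addr = head_end

    # Stage 2: 8 bytes at a time until 32-byte (filter line) alignment.
    while addr % line != 0 and addr + word <= end:
        chunks.append((addr, word))
        addr += word

    # Stage 3: 32 bytes (one filter line) at a time.
    while addr + line <= end:
        chunks.append((addr, line))
        addr += line

    # Tail (assumption): remaining whole words, then leftover bytes together.
    while addr + word <= end:
        chunks.append((addr, word))
        addr += word
    if addr < end:
        chunks.append((addr, end - addr))

    return chunks
```

For example, `split_clear_range(5, 100)` yields a 3-byte head, three 8-byte steps, two 32-byte lines, one 8-byte step and a 1-byte tail, covering all 100 bytes.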

672 Commits