For calls (structure jCC), Callgrind maintains two things for the
source: the BBCC (the counter array for the source context of the call,
which includes the BB of the source call position), and a jump number
in the source BB, used to reconstruct the guest instruction address
of the call. In setup_bbcc, this jump number is stored in <passed> and
used when creating a new jCC on a call.
The value of <passed> got out of sync when simulating a real jump
between different functions as a return/call pair: the call source was
reset for the popped jCC, but <passed> was not.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11579
to previous behaviour, in which it was constructed but any resulting
errors were not shown, hence wasting CPU and memory.) Partial fix
for #255353. (Philippe Waroquiers, philippe.waroquiers@skynet.be)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11574
VG_(newSizedXA)) since r11571 removes the only use of the
functionality that r11568 introduces.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11573
ScalarTSs, have the ScalarTS array as a trailing array directly on the
VTS structure. This reduces the number of malloc'd blocks per VTS
from 3 to 1, since an XArray always requires 2 malloc'd blocks. At
least for tc19_shadowmem this reduces the total amount of heap
turnover in Arena 'tool' by a factor of 3, and modestly improves
performance whilst modestly reducing overall memory use.
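
As a rough illustration of the trailing-array layout described above,
here is a minimal sketch in standard C. All names (SketchVTS,
SketchScalarTS, sketch_vts_new, sketch_vts_at) are hypothetical and are
not the Helgrind definitions; it uses plain calloc rather than the
Valgrind arena allocator. It only shows how the header and the array
can share a single heap block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for one scalar timestamp; the real layout is not shown here. */
typedef struct { uint64_t bits; } SketchScalarTS;

/* Header plus trailing array, allocated as one block. */
typedef struct {
    size_t         n_ts;  /* number of entries in ts[] */
    SketchScalarTS ts[];  /* C99 flexible array member */
} SketchVTS;

/* Allocate a zeroed VTS with room for n_ts entries; NULL on failure. */
static SketchVTS *sketch_vts_new(size_t n_ts)
{
    /* Guard against overflow in the size computation. */
    if (n_ts > (SIZE_MAX - sizeof(SketchVTS)) / sizeof(SketchScalarTS))
        return NULL;
    SketchVTS *v = calloc(1, sizeof(SketchVTS)
                             + n_ts * sizeof(SketchScalarTS));
    if (v != NULL)
        v->n_ts = n_ts;
    return v;
}

/* Bounds-checked access to entry i. */
static SketchScalarTS *sketch_vts_at(SketchVTS *v, size_t i)
{
    assert(v != NULL && i < v->n_ts);
    return &v->ts[i];
}
```

A single free(v) releases both the header and the array, which is where
the reduction in malloc'd blocks per VTS comes from.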
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11571
timestamps) from 16 to 8 bytes. This halves the size of vector
timestamps and reduces the amount of memory needed to run programs
that have many threads and/or many synchronisation events.
The tradeoff is that Helgrind must abort the run if the program
creates more than 2^20 (1.0e+6) threads or performs more than 2^44
(1.76e+13) synchronisation events. Neither of these seems like a
significant limitation in practice. It's easy to argue that reaching
the limit of 2^44 synch events would take, at a minimum, several CPU
months on a very fast machine.
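
To make the size/limit tradeoff concrete, here is a minimal sketch of
packing a thread id and an event count into one 8-byte word. The 20/44
bit split is inferred from the limits quoted above; the names
(PackedTS, packed_ts_make, ...) and the use of shifts and masks are
assumptions for illustration, not the actual Helgrind representation.

```c
#include <assert.h>
#include <stdint.h>

#define THR_BITS  20u                          /* thread id: < 2^20   */
#define TYM_BITS  44u                          /* event count: < 2^44 */
#define THR_LIMIT (UINT64_C(1) << THR_BITS)
#define TYM_LIMIT (UINT64_C(1) << TYM_BITS)

typedef uint64_t PackedTS;                     /* 8 bytes in total */

/* Pack the two fields; the asserts model "abort the run" on overflow. */
static PackedTS packed_ts_make(uint64_t thrid, uint64_t tym)
{
    assert(thrid < THR_LIMIT);
    assert(tym < TYM_LIMIT);
    return (thrid << TYM_BITS) | tym;
}

/* Thread id lives in the top 20 bits. */
static uint64_t packed_ts_thrid(PackedTS ts)
{
    return ts >> TYM_BITS;
}

/* Event count lives in the low 44 bits. */
static uint64_t packed_ts_tym(PackedTS ts)
{
    return ts & (TYM_LIMIT - 1);
}
```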
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11570
creating new vector timestamps (VTSs) via tick and join operations,
preallocate the underlying XArray of ScalarTSs (scalar timestamps) at
the likely final size, using new function VG_(newSizedXA) introduced
in r11568. This reduces overall heap turnover (in VG_AR_TOOL) by a
factor of several. Together with revs 11567 and 11568, it mitigates
the worst-case performance falloff in long runs that involve lots of
threads and lots of synchronisation events (a.k.a Vector timestamps).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11569
identical to VG_(newXA) but allows passing in a size hint. In the
case where the likely final size of the XArray is known at creation
time, this allows avoiding the repeated (implicit) resizing and
copying of the array as elements are added, which can save a vast
amount of dynamic memory allocation turnover.
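
The effect of a size hint can be sketched with a small growable array.
This is not the XArray implementation or its API: IntArray,
int_array_new_sized, int_array_add and the doubling growth policy are
all assumptions made for illustration. The n_resizes counter is only
there to make the saved reallocations visible.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int    *data;
    size_t  used;       /* elements currently stored        */
    size_t  cap;        /* elements that fit without resize */
    size_t  n_resizes;  /* how many times data was regrown  */
} IntArray;

/* Create an array with room for 'hint' elements (0 = no hint). */
static IntArray *int_array_new_sized(size_t hint)
{
    IntArray *a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    a->data = hint ? malloc(hint * sizeof(int)) : NULL;
    if (hint && a->data == NULL) { free(a); return NULL; }
    a->used = 0;
    a->cap = hint;
    a->n_resizes = 0;
    return a;
}

/* Append v, doubling the capacity when full. Returns 0 on success. */
static int int_array_add(IntArray *a, int v)
{
    assert(a != NULL);
    if (a->used == a->cap) {
        size_t ncap = a->cap ? 2 * a->cap : 1;
        int *p = realloc(a->data, ncap * sizeof(int));
        if (p == NULL)
            return -1;
        a->data = p;
        a->cap = ncap;
        a->n_resizes++;
    }
    a->data[a->used++] = v;
    return 0;
}

static void int_array_free(IntArray *a)
{
    if (a != NULL) { free(a->data); free(a); }
}
```

With this policy, adding 100 elements to an unhinted array regrows the
storage 8 times, whereas an array created with a hint of 100 never
regrows.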
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11568
workload: when scanning a freelist of a given size for a big-enough
block (to allocate), don't scan all the way around the list. Instead
give up after 100 blocks and try the freelist above. The pathological
case (as observed) is that the freelist contains tens of thousands of
blocks, but all are too small for the current request, hence they are
all visited pointlessly. If the new heuristic is used, the freelist
start point is moved along by one block, so that future searches
eventually inspect the entire freelist, just very slowly.
Also, make some improvements to stats gathering, and rename some
existing stats fields in struct Arena.
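
The bounded freelist scan described above can be sketched roughly as
follows. This is a simplified model, not the allocator code: Block,
bounded_scan and MAX_SCAN are hypothetical names, and the freelist is
assumed to be a circular singly linked list. A NULL result means "try
the freelist above".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SCAN 100  /* give up after inspecting this many blocks */

typedef struct Block {
    size_t        size;
    struct Block *next;  /* circular: the last block points to the first */
} Block;

/* Look for a block of at least 'want' bytes, starting at *head.
   Returns NULL if none was found within MAX_SCAN blocks or the whole
   list was inspected. On an early give-up, *head moves along by one
   block so that later searches eventually cover the entire list. */
static Block *bounded_scan(Block **head, size_t want)
{
    assert(head != NULL);
    Block *b = *head;
    if (b == NULL)
        return NULL;
    for (int n = 0; n < MAX_SCAN; n++) {
        if (b->size >= want)
            return b;
        b = b->next;
        if (b == *head)
            return NULL;        /* wrapped around: nothing big enough */
    }
    *head = (*head)->next;      /* gave up early: rotate start point */
    return NULL;
}
```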
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11567
Also, improve the failure message a bit, so as to tell people what package
they need to install, in at least some cases.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11538