In many places, we have:
VG_(fun(a,b,c))
instead of
VG_(fun)(a,b,c)
So, fix these cases, found using:
grep -n -i -e 'VG_([a-z][a-z0-9_]*[^a-z0-9_)]' *.c */*.c */*/*.c
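A minimal sketch of the two spellings, assuming VG_ is a token-pasting name-mangling macro. The macro definitions below are stand-ins written for illustration and add3 is a hypothetical function; the real definitions live in the Valgrind headers.

```c
/* Stand-in for a name-mangling macro; assumed shape, illustration only. */
#define VGAPPEND(str1, str2) str1##str2
#define VG_(str) VGAPPEND(vgPlain_, str)

/* Hypothetical function; after expansion its name is vgPlain_add3. */
static int VG_(add3)(int a, int b, int c)
{
   return a + b + c;
}

/* Intended spelling: only the function name goes inside the macro. */
static int call_intended(void)
{
   return VG_(add3)(1, 2, 3);
}

/* Spelling being fixed: the argument list sits inside the macro.
   With the stand-in macro above, ## pastes only the first token, so
   this still expands to vgPlain_add3(1, 2, 3). */
static int call_misplaced(void)
{
   return VG_(add3(1, 2, 3));
}
```

Under these assumptions both forms compile to the same call, which suggests the change is a consistency cleanup rather than a behaviour change.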
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15776
when the size is known in advance.
3 places identified where this function can be used trivially.
The result is a reduction of 'realloc' operations in core
arena, and a small reduction in ttaux arena
(it is the nr of operations that decreases; the memory usage itself
stays the same, ignoring some 'rounding' effects).
E.g. for perf/bigcode 0, we change from
core 1085742/ 216745904 totalloc-blocks/bytes, 1085733 searches
ttaux 5348/ 6732560 totalloc-blocks/bytes, 5326 searches
to
core 712666/ 190998592 totalloc-blocks/bytes, 712657 searches
ttaux 5319/ 6731808 totalloc-blocks/bytes, 5296 searches
For bz2, we switch from
core 50285/ 32383664 totalloc-blocks/bytes, 50256 searches
ttaux 670/ 245160 totalloc-blocks/bytes, 669 searches
to
core 32564/ 29971984 totalloc-blocks/bytes, 32535 searches
ttaux 605/ 243280 totalloc-blocks/bytes, 604 searches
Performance-wise, on amd64, this improves memcheck performance
on perf tests by 0.0, 0.1 or 0.2 seconds depending on the test.
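The effect of pre-sizing on the number of 'realloc' operations can be sketched as follows. IntVec and its functions are hypothetical and are not the Valgrind function this commit refers to; the sketch only shows why knowing the size in advance reduces the operation count.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints that counts its realloc calls. */
typedef struct {
   int   *data;
   size_t used;
   size_t cap;
   size_t n_reallocs;
} IntVec;

/* Ensure capacity for at least 'want' elements. Returns 0 on success. */
static int intvec_reserve(IntVec *v, size_t want)
{
   int *p;
   if (want <= v->cap) return 0;
   p = realloc(v->data, want * sizeof *p);
   if (p == NULL) return -1;
   v->data = p;
   v->cap = want;
   v->n_reallocs++;
   return 0;
}

/* Append one element, doubling the capacity when the array is full. */
static int intvec_push(IntVec *v, int x)
{
   if (v->used == v->cap
       && intvec_reserve(v, v->cap ? 2 * v->cap : 4) != 0)
      return -1;
   v->data[v->used++] = x;
   return 0;
}

/* Fill a vector with n elements, optionally pre-sizing it first.
   Returns the number of realloc calls, or (size_t)-1 on failure. */
static size_t fill_and_count(size_t n, int presize)
{
   IntVec v = { NULL, 0, 0, 0 };
   size_t i, count;
   if (presize && intvec_reserve(&v, n) != 0) return (size_t)-1;
   for (i = 0; i < n; i++) {
      if (intvec_push(&v, (int)i) != 0) { free(v.data); return (size_t)-1; }
   }
   count = v.n_reallocs;
   free(v.data);
   return count;
}
```

With 1000 elements, the pre-sized run performs a single realloc while the doubling run performs several; the final memory in use is comparable, which matches the observation that only the nr of operations decreases.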
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15173
For default memcheck configuration, 32 bits) this patch
decreases by 13.6 MB, i.e. from 89945856 to 76317696 bytes.
Note that the type EClassNo is introduced only for readability
purposes (and to avoid some casts). That does not change the size
of the TTEntry.
The TTEntry size is reduced by using unions and/or Bool on 1 bit.
No performance impact detected (outer callgrind/inner memcheck bz2
on x86 shows a small improvement).
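The size reduction technique (unions and 1-bit Bools) can be sketched as below. The field names are hypothetical and are not the real TTEntry fields; the exact sizes depend on the ABI.

```c
#include <stdint.h>

/* Before: every field has its own storage and each flag is a full int. */
typedef struct {
   uint32_t next_free;   /* only meaningful while the entry is unused */
   uint32_t use_count;   /* only meaningful while the entry is in use */
   int      in_use;
   int      is_chained;
} LooseEntry;

/* After: mutually exclusive fields share a union, and each flag
   occupies a single bit. */
typedef struct {
   union {
      uint32_t next_free;
      uint32_t use_count;
   } u;
   unsigned int in_use     : 1;
   unsigned int is_chained : 1;
} CompactEntry;
```

On common ABIs the second layout is about half the size of the first, and the saving is multiplied by the number of entries in the table.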
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15054
* give the avg nr of IPs per execontext
* use the newly introduced %f in m_transtab.c ratio
and in the avg nr of execontext per list
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15041
on 32 bits memcheck default nr of sectors).
Memory is reduced by using UShort typedef-s for Sector no and TTE no.
Note that for TTE no, we had a mixture of UShort, UInt and Int used
depending on the place (a TTE no was in any case constrained to be a UShort).
The bss memory/startup space is also reduced by allocating the htt on demand
(like tt and tc), using mmap the first time a sector is initialised.
Changes:
* pub_core_transtab.h :
* 2 typedefs to identify a sector and a tt entry (these 2 types are UShort)
* add 2 #define 'invalid values' for these types
* change the interface to use these types rather than UInt
* m_transtab.c
* use wherever relevant these 2 new types rather than UInt or UShort
* replace the use of -1 by INV_SNO or INV_TTE
* remove now useless typecast from Int/UInt to UShort for tte
* schedule.c: use the new types
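A minimal sketch of the typedef and 'invalid value' pattern described above. INV_SNO and INV_TTE are named in this message; the type names SECno and TTEno, the 0xFFFF encoding and find_free_tte are assumptions made for illustration.

```c
#include <stdint.h>

typedef uint16_t SECno;   /* hypothetical name: sector number */
typedef uint16_t TTEno;   /* hypothetical name: tt entry number */

#define INV_SNO ((SECno)0xFFFF)   /* assumed encoding of "no sector" */
#define INV_TTE ((TTEno)0xFFFF)   /* assumed encoding of "no entry" */

#define N_TTES 8   /* tiny table, for illustration only */

/* Return the first free slot, or INV_TTE (rather than -1) if none. */
static TTEno find_free_tte(const unsigned char in_use[N_TTES])
{
   TTEno i;
   for (i = 0; i < N_TTES; i++)
      if (!in_use[i]) return i;
   return INV_TTE;
}
```

Using a dedicated invalid value keeps the index type unsigned and 16 bits wide, so no cast between Int/UInt and UShort is needed at the call sites.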
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15036
This patch changes the way the transtab entries hash table is done.
Currently, the hash table is an open hash table considered full at 65%.
This means that on average, about 1 entry in 3 is unused.
(all the hash table memory will be 'active' for big applications,
as the active entries are normally reasonably distributed over the hash table).
The size of a transtab entry is significant (about 150 Bytes).
To avoid having 35% of the entries unused, the translation table
is split in 2:
A hash table, which will contain an index pointing at the transtab entries.
With this technique, we are adding a small hash table,
but we spare 35% of the translation table.
Performance measurements have shown no degradation,
and some platforms have better performance (not too clear why;
probably this helps platforms with small caches).
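The split can be sketched as a dense entry table plus a small, sparse hash table of indexes. All names, sizes and the linear probing scheme below are assumptions made for illustration, not the m_transtab.c code.

```c
#include <stdint.h>

#define N_TTE     6                    /* capacity of the dense entry table */
#define N_HTT     11                   /* hash slots; kept larger than N_TTE */
#define HTT_EMPTY ((uint16_t)0xFFFF)

typedef struct {                       /* stand-in for a big transtab entry */
   uint64_t guest_addr;
   char     payload[140];
} Entry;

typedef struct {
   Entry    tt[N_TTE];                 /* dense: slots 0..n_used-1 all in use */
   uint16_t htt[N_HTT];                /* sparse: index into tt, or HTT_EMPTY */
   uint16_t n_used;
} Table;

static void table_init(Table *t)
{
   int i;
   t->n_used = 0;
   for (i = 0; i < N_HTT; i++) t->htt[i] = HTT_EMPTY;
}

/* Add an address not yet present. Returns its tt index, or -1 if full. */
static int table_add(Table *t, uint64_t addr)
{
   unsigned k;
   if (t->n_used == N_TTE) return -1;
   k = (unsigned)(addr % N_HTT);
   while (t->htt[k] != HTT_EMPTY) k = (k + 1) % N_HTT;   /* linear probing */
   t->tt[t->n_used].guest_addr = addr;
   t->htt[k] = t->n_used;
   return t->n_used++;
}

/* Look up an address. Returns its tt index, or -1 if absent. */
static int table_find(const Table *t, uint64_t addr)
{
   unsigned k = (unsigned)(addr % N_HTT);
   while (t->htt[k] != HTT_EMPTY) {
      if (t->tt[t->htt[k]].guest_addr == addr) return t->htt[k];
      k = (k + 1) % N_HTT;
   }
   return -1;
}
```

The unused slots now cost 2 bytes each in htt instead of a full entry in tt, which is where the memory saving comes from.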
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15023
the size of the translation table sectors, either to gain memory
or to avoid too many retranslations.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15016
* common up the identical debug and clo_stat traces
* add in the stats the nr of sectors recycled
* add the avg translation size in each sector recycled
and in the final statistics
(no functional change)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15005
only allowed to be called in certain contexts which is
enforced at runtime.
Change callgrind accordingly.
New header file pub_tool_transtab.h added.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14867
endianness in VEX).
In short: in m_machine.c, VG_(machine_get_hwcaps), get the endianness
of the host, and pass it through to all places (in VEX) where it is
required.
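The idea of detecting the host endianness once and passing it explicitly to the code that needs it can be sketched as below. The Endness type and both functions are hypothetical and are not the VEX or m_machine.c interfaces.

```c
#include <stdint.h>
#include <string.h>

typedef enum { Endness_LE, Endness_BE } Endness;   /* hypothetical names */

/* Detect the byte order of the machine running this code. */
static Endness detect_host_endness(void)
{
   const uint32_t probe = 1;
   unsigned char first;
   memcpy(&first, &probe, 1);
   return first == 1 ? Endness_LE : Endness_BE;
}

/* Read a 32-bit value from 4 bytes, using the byte order passed in
   by the caller instead of a compile-time assumption. */
static uint32_t read_u32(const unsigned char *p, Endness e)
{
   if (e == Endness_LE)
      return (uint32_t)p[0]       | (uint32_t)p[1] << 8
           | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
   return (uint32_t)p[3]       | (uint32_t)p[2] << 8
        | (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24;
}
```

Passing the value as a parameter keeps the consumer free of host-specific #ifdefs: the same read_u32 serves both byte orders.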
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14184
A previous commit had decreased to 6 (on android) and increased to 16
(other platforms) the nr of sectors in the translation cache.
This patch adds a command line option to let the user specify
the nr of sectors as e.g. 16 sectors might be a lot and cause
an out of memory for some workloads or might be too small for
huge executables or executables using a lot of shared libs.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13652
non-phone/tablet targets. The previous apparently-huge sizing is
evidently not huge enough for recent apps, eg, recent Firefox requires
circa 350k translations to get started and almost fills an 8-sector
cache merely starting up and then idling.
On Android targets, fall back to 6 sectors; space is critical.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13568
Assertion
valgrind: m_transtab.c:674 (find_TTEntry_from_hcode):
Assertion '(UChar*)sec->tt[tteNo].tcptr <= (UChar*)hcode' failed.
failure (encountered on some platforms while running gdbsrv tests).
The problem is related to invalidated entries and the host_extents
mapping between hostcode and the translation table entry.
The problem: when an entry is invalidated, the translation table
entry is changed to status Deleted. However, the host extent array
element is not cleaned up.
If a search for a host code address (find_TTEntry_from_hcode)
finds this entry, the translation table entry in Deleted status
is considered as a 'not found', which ensures that the invalidated
entry is not used (e.g. for chaining).
This is all ok.
However, it might be that this Deleted entry is re-used
(see function VG_(add_to_transtab), which searches for an Empty
or Deleted entry).
If the Deleted entry is re-used, then a search for the
dead host code can give a result pointing to the re-used
entry. That is clearly wrong.
Note that it is unclear if this bug can only be triggered
while using gdbsrv or if this bug can be triggered with
just the "normal" invalidation logic of translation.
gdbsrv being a heavy "user" of invalidation, it might
be it helps to trigger the code. Alternatively, as gdbsrv
invalidation is special (e.g. invalidation of some entries
is done during translation of other entries), it might be
the bug is specific to gdbsrv.
In any case, to avoid the bug:
searching for a host code address must not only
ignore Deleted entries, but must also ignore an entry
found via a host_extent element which is for a Deleted
entry that was re-used afterwards (pointed to by a
newer host_extent element).
Multiple solutions are possible for fixing the bug:
* Sol 1: clean up the host_extents array when an entry is deleted.
The cleanup is however deemed costly:
Each invalidate operation must do a search in the host_extents.
The host_extents array must then be "compacted" to remove
the "dead" host extent element from the array.
The compact operation can be avoided if instead of removing
the element, one marks instead the element as "dead"
e.g. by using one bit of UInt len for that:
UInt len : 31;
Bool dead : 1;
This avoids the compact, but still incurs the cost of
searching and modifying the host_extent for each entry invalidated.
Invalidating entries seems to be a critical operation
(e.g. specific ECLASS related data structures have been
done to allow fast deletion).
=> it is deemed that a solution not incurring cost during
invalidation is preferable.
* Sol 2: detect in find_TTEntry_from_hcode
that the host_extent element is re-used, and handle it similarly
to a host_extent which points at a Deleted entry.
This detection is possible as if an entry is re-used after
having been deleted, this implies that its host code will be
after the end of the host code of the deleted entry
(as host code of a sector is not re-used).
The attached patch implements this solution.
* Sol 3: avoid re-using an entry : the entry would then stay
in Deleted state. This is deemed not ok as it would
imply that invalidation of entries will cause a sector to
become full faster.
The patch:
* adds a new function
Bool HostExtent__is_dead (const HostExtent* hx, const Sector* sec)
telling if the host extent hx from sector sec is a dead entry.
* this function is used in find_TTEntry_from_hcode so that
dead host extents do not result in host code being found.
* adds a regression test which caused the assert failure before
(bug was found/reported/isolated in a small test case by Dejan Jevtic).
* To check the logic of HostExtent__is_dead, m_transtab.c sanity check is
completed to verify that the nr of entries in use in a sector is equal
to the nr of non dead entries in the host extent array.
* adds/improves traces in m_transtab.c (enabled at compile
time using #define DEBUG_TRANSTAB).
Some already existing 'if (0)' conditions are replaced
by if (DEBUG_TRANSTAB)
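The dead-extent check of Sol 2 can be sketched as below. The function name comes from this message, but the types are simplified stand-ins, bool replaces Bool, and the comparison on the start address is an assumption about how "re-used" is detected.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { Empty, Deleted, InUse } TTStatus;

typedef struct {
   TTStatus       status;
   const uint8_t *tcptr;    /* start of this entry's host code */
} TTEntry;

typedef struct {
   const uint8_t *start;    /* start of the recorded host code range */
   uint32_t       len;
   uint32_t       tteNo;    /* entry that owned the range when recorded */
} HostExtent;

typedef struct {
   TTEntry *tt;
} Sector;

/* A host extent is dead if its entry is Deleted, or if the entry was
   deleted and later re-used: the re-used entry then holds newer host
   code, which starts elsewhere because host code in a sector is not
   re-used. */
static bool HostExtent__is_dead(const HostExtent *hx, const Sector *sec)
{
   const TTEntry *tte = &sec->tt[hx->tteNo];
   if (tte->status == Deleted) return true;
   return tte->tcptr != hx->start;
}
```

No work is done at invalidation time; the cost is one extra comparison during the host code lookup.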
Regression tested on
f12/x86
debian6/amd64 (also with export EXTRA_REGTEST_OPTS=--sanity-level=4)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13290
--profile-flags=00000000 now prints summary statistics, one line per
profiled block, but with no translation details. Previously it had
no effect.
--profile-interval=<number> is a new flag that causes the profile data
to be dumped and zeroed every <number> event checks. This makes it
possible to get profile data without waiting for runs to end, and to
get profile data which depends on the current workload etc. If
--profile-interval=0 or is unset, the profile is printed only once, at
the end of the run, as before.
--profile-flags=XXXXXXXX (for at least one nonzero X) prints the
summary lines both at the start and end of the profile, so you don't
have to scroll back up to the top to see the summary.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13213
in each IRSB, rather than considering each IRSB to have a weight of 1.
This probably gives more representative profiles, especially post
t-chain merge, which made inter-SB transitions more or less free
compared to what they were before.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12542
This is the followup to rev 12488.
With this revision, translation chaining is not done
if the translation of the 'from' address no longer exists
(discarded or erased).
The assumption documented in 12488 comment has been checked by:
* first reproduce a crash in Firefox when always setting
caused discard to False
* then upgrade to rev 12488
* with this upgrade, no crash anymore.
=> this verifies that the caused discard logic is properly
replaced by revision 12488.
So, the caused discard logic can be removed.
git-svn-id: svn://svn.valgrind.org/valgrind/branches/TCHAIN@12492
The fix consists of checking whether the translation
of the 'from' address still exists.
Patch also contains a big comment explaining why it is
safe to discard/erase the current translation being
executed.
In a follow-up patch, the Bool in VG_(translate) will
be removed :
Bool VG_(translate) ( /*OUT*/Bool* caused_discardP,
(if experiment confirms the hypothesis that it is
safe to discard current translation).
git-svn-id: svn://svn.valgrind.org/valgrind/branches/TCHAIN@12488
Massif: specify avg translation size at all, so as to avoid excessive
retranslations caused by the fact that the default value is far below
reality for Massif.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11494
found in the fast-cache.
* reduce max loading of the per-sector TT hash tables from 80% to 65%.
This reduces the number of required probes by a factor of 3.
* when searching for a translation, don't visit the sectors in a fixed
order. Instead, use an MTF array in which the most popular sectors
(in terms of most likely to hold the translation we're looking for)
are visited first. This reduces the number of required probes by
another factor of 2.
These improvements have no effect on small programs, but improve
scalability on big apps. For an application comprising 300k
translations, runtime on Memcheck is reduced by 3% and on None by
about 20%. The average number of probes per fast-cache miss is
reduced from around 22 to less than 5.
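The sector search order can be sketched as a small move-to-front array. The names are hypothetical, and whether the real code moves a hit all the way to the front or only part of the way is not stated here; this sketch moves it to the front.

```c
#define N_SECTORS 4

/* Search order: order[0] is the sector tried first. */
typedef struct {
   int order[N_SECTORS];
} SearchOrder;

static void order_init(SearchOrder *so)
{
   int i;
   for (i = 0; i < N_SECTORS; i++) so->order[i] = i;
}

/* Record that a translation was found in 'sector': move that sector
   to the front and shift the ones before it back by one.
   Returns the position the sector had in the search order (i.e. how
   many other sectors were tried first), or -1 for an unknown sector. */
static int order_hit(SearchOrder *so, int sector)
{
   int pos, i;
   for (pos = 0; pos < N_SECTORS; pos++)
      if (so->order[pos] == sector) break;
   if (pos == N_SECTORS) return -1;
   for (i = pos; i > 0; i--) so->order[i] = so->order[i - 1];
   so->order[0] = sector;
   return pos;
}
```

After a hit, repeated lookups that land in the same sector find it at position 0, so popular sectors are probed first.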
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11091
the changes to do with reading and using ELF and DWARF3 info.
This breaks all targets except amd64-linux and x86-linux.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10982
This commit tidies up and rationalises what could be called the
"messaging" system -- that part of V to do with presenting output to
the user. In particular it brings significant improvements to XML
output.
Changes are:
* XML and normal text output now have separate file descriptors,
which solves longstanding problems for XML consumers caused by
the XML output getting polluted by unexpected non-XML output.
* This also means that we no longer have to hardwire all manner
of output settings (verbosity, etc) when XML is requested.
* The XML output format has been revised, cleaned up, and made
more suitable for use by error detecting tools in general
(various Memcheck-specific features have been removed). XML
output is enabled for Ptrcheck and Helgrind, and Memcheck is
updated to the new format.
* One side effect is that the behaviour of VG_(message) has been
made to be consistent with printf: it no longer automatically
adds a newline at the end of the output. This means multiple
calls to it can be used to build up a single line message; or a
single call can write a multi-line message. The ==pid==
preamble is automatically inserted at each newline.
* VG_(message)(Vg_UserMsg, ..args..) now has the abbreviated form
VG_(UMSG)(..args..); ditto VG_(DMSG) for Vg_DebugMsg and
VG_(EMSG) for Vg_DebugExtraMsg. A couple of other useful
printf derivatives have been added to pub_tool_libcprint.h,
most particularly VG_(vcbprintf).
* There's a small change in the core-tool interface to do with
error handling: VG_(needs_tool_errors) has a new method
void (*before_pp_Error)(Error* err) which, if non-NULL, is
called just before void (*pp_Error)(Error* err). This is to
give tools the chance to look at errors before any part of them
is printed, so they can print any XML preamble they like.
* coregrind/m_errormgr.c has been overhauled and cleaned up, and
is a bit simpler and more commented. In particular pp_Error
and VG_(maybe_record_error) are significantly changed.
The diff is huge, but mostly very boring. Most of the changes
are of the form
- VG_(message)(Vg_UserMsg, "this is a message %d", n);
+ VG_(message)(Vg_UserMsg, "this is a message %d\n", n);
Unfortunately as a result of this, it touches a large number
of source files.
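The new VG_(message) behaviour (no automatic newline, ==pid== preamble at the start of each line) can be sketched as below. Out, out_init and out_message are hypothetical helpers that write into a buffer; they are not the m_libcprint.c implementation.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
   char   buf[256];
   size_t len;
   int    at_line_start;   /* nonzero if the next char begins a line */
} Out;

static void out_init(Out *o)
{
   o->len = 0;
   o->buf[0] = '\0';
   o->at_line_start = 1;
}

/* Append one character, silently dropping it if the buffer is full. */
static void out_putc(Out *o, char c)
{
   if (o->len + 1 < sizeof o->buf) {
      o->buf[o->len++] = c;
      o->buf[o->len] = '\0';
   }
}

/* Append msg. The "==pid== " preamble is emitted before the first
   character of each line; no newline is added automatically. */
static void out_message(Out *o, int pid, const char *msg)
{
   char pre[32];
   const char *p;
   size_t i;
   snprintf(pre, sizeof pre, "==%d== ", pid);
   for (p = msg; *p != '\0'; p++) {
      if (o->at_line_start) {
         for (i = 0; pre[i] != '\0'; i++) out_putc(o, pre[i]);
         o->at_line_start = 0;
      }
      out_putc(o, *p);
      if (*p == '\n') o->at_line_start = 1;
   }
}
```

Two calls such as out_message(&o, 1234, "count is ") followed by out_message(&o, 1234, "3\n") build a single prefixed line, and one call containing several newlines produces several prefixed lines.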
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10465
DARWIN branch. A big ugly DARWIN/trunk sync commit, mostly to do with
changing the representation of SysRes and vki_sigset_t. Functionality of
the trunk shouldn't be changed by it.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@9876
relatively minor extensions to m_debuginfo, a major overhaul of
m_debuginfo/readdwarf3.c to get its space usage under control, and
changes throughout the system to enable heap-use profiling.
The majority of the merged changes were committed into
branches/PTRCHECK as the following revs: 8591 8595 8598 8599 8601 and
8161.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@8621