fix 315545 (find_TTEntry_from_hcode): Assertion '(UChar*)sec->tt[tteNo].tcptr <= (UChar*)hcode' failed

Assertion 
  valgrind: m_transtab.c:674 (find_TTEntry_from_hcode):
  Assertion '(UChar*)sec->tt[tteNo].tcptr <= (UChar*)hcode' failed.
failure (encountered on some platforms while running gdbsrv tests).

The problem is related to invalidated entries and the host_extents
mapping between hostcode and the translation table entry.

The problem: when an entry is invalidated, the translation table
entry is changed to status Deleted. However, the host extent array
element is not cleaned up.
If a search for a host code address (find_TTEntry_from_hcode)
finds this entry, the translation table entry in Deleted status
is considered as a 'not found', which ensures that the invalidated
entry is not used (e.g. for chaining).
This is all ok.

However, it might be that this Deleted entry is re-used
(see function VG_(add_to_transtab), searching for a Empty
or Deleted entry.
If the Deleted entry is re-used, then a search for the
dead host code can give a result pointing to the re-used
entry. That is clearly wrong.
Note that it is unclear if this bug can only be triggered
while using gdbsrv or if this bug can be triggered with
just the "normal" invalidation logic of translation.
gdbsrv being a heavy "user" of invalidation, it might
be it helps to trigger the code. Alternatively, as gdbsrv
invalidation is special (e.g. invalidation of some entries
is done during translation of other entries), it might be
the bug is specific to gdbsrv.

In any case, to avoid the bug:
searching for an host code address must not only
ignore Deleted entries, but must also ignore an entry
found via a host_extent element which is for a Deleted
entry that was re-used afterwards (pointed to by a
newer host_extent element).


Multiple solutions are possible for fixing the bug:
Sol1: cleanup the host_extents array when an entry is deleted.
  The cleanup is however deemed costly:
  Each invalidate operation must do a search in the host_extents.
  The host_extents array must then be "compacted" to remove
  the "dead" host extent element from the array.
  The compact operation can be avoided if instead of removing
  the element, one marks instead the element as "dead"
  e.g. by using one bit of UInt len for that:
     UInt len : 31;
     Bool dead : 1;
  This avoids the compact, but still incurrs the cost of
  search and modify the host_extent for each entry invalidated.
  Invalidating entries seems to be a critical operation
  (e.g. specific ECLASS related  data structures have been
   done to allow fast deletion).
  => it is deemed that a solution not incurring cost during
  invaliation is preferrable.

* Sol 2: detect in find_TTEntry_from_hcode
  that the host_extent element is re-used, and handle it similarly
  to an host_extents which points at a Deleted entry.
  This detection is possible as if an entry is re-used after
  having been deleted, this implies that its host code will be
  after the end of the host code of the deleted entry
  (as host code of a sector is not re-used).
  The attached patch implements this solution.

* Sol 3: avoid re-using an entry : the entry would then stay
  in Deleted state. This is deemed not ok as it would
  imply that invalidation of entries will cause a sector to
  become full faster.

The patch:
* adds a new function
  Bool HostExtent__is_dead (const HostExtent* hx, const Sector* sec)
  telling if the host extent hx from sector sec is a dead entry.
* this function is used in find_TTEntry_from_hcode so that
  dead host extents are not resulting in host code to be found.
* adds a regression test which caused the assert failure before
  (bug was found/reported/isolated in a small test case by Dejan Jevtic).
* To check the logic of HostExtent__is_dead, m_transtab.c sanity check is
  completed to verify that the nr of entries in use in a sector is equal
  to the nr of non dead entries in the host extent array.
* adds/improves traces in m_transtab.c (enabled at compile
  time using #define DEBUG_TRANSTAB).
  Some already existing 'if (0)' conditions are replaced
  by if (DEBUG_TRANSTAB)

Regression tested on 
   f12/x86
   debian6/amd64 (also with export EXTRA_REGTEST_OPTS=--sanity-level=4)




git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13290
This commit is contained in:
Philippe Waroquiers 2013-02-24 23:16:58 +00:00
parent 0d1fe31512
commit ec83134fa2
8 changed files with 152 additions and 13 deletions

1
NEWS
View File

@ -86,6 +86,7 @@ m = merged into 3_8_BRANCH
312913 [390] Dangling pointers error should also report the alloc stack trace
312980 [390] Building on Mountain Lion generates some compiler warnings
313811 [390] Buffer overflow in assert_fail
315545 [390] (find_TTEntry_from_hcode): Assertion '(UChar*)sec->tt[tteNo].tcptr <= (UChar*)hcode' failed
n-i-bz [390] report error for vgdb snapshot requested before execution
n-i-bz [390] Some wrong command line options could be ignored
n-i-bz [390] same as 303624 (fixed in 3.8.0), but for x86 android

View File

@ -46,7 +46,7 @@
#include "pub_core_dispatch.h" // For VG_(disp_cp*) addresses
/* #define DEBUG_TRANSTAB */
#define DEBUG_TRANSTAB 0
/*-------------------------------------------------------------*/
@ -632,6 +632,40 @@ Int HostExtent__cmpOrd ( const void* v1, const void* v2 )
return 0; /* partial overlap */
}
/* True if hx is a dead host extent, i.e. corresponds to host code
of an entry that was invalidated. */
static
Bool HostExtent__is_dead (const HostExtent* hx, const Sector* sec)
{
const UInt tteNo = hx->tteNo;
#define LDEBUG(m) if (DEBUG_TRANSTAB) \
VG_(printf) (m \
" start 0x%p len %u sector %d ttslot %u" \
" tt.entry 0x%llu tt.tcptr 0x%p\n", \
hx->start, hx->len, (int)(sec - sectors), \
hx->tteNo, \
sec->tt[tteNo].entry, sec->tt[tteNo].tcptr)
/* Entry might have been invalidated and not re-used yet.*/
if (sec->tt[tteNo].status == Deleted) {
LDEBUG("found deleted entry");
return True;
}
/* Maybe we found this entry via a host_extents which was
inserted for an entry which was changed to Deleted then
re-used after. If this entry was re-used, then its tcptr
is >= to host_extents start (i.e. the previous tcptr) + len.
This is the case as there is no re-use of host code: a new
entry or re-used entry always gets "higher value" host code. */
if ((UChar*) sec->tt[tteNo].tcptr >= hx->start + hx->len) {
LDEBUG("found re-used entry");
return True;
}
return False;
#undef LDEBUG
}
static __attribute__((noinline))
Bool find_TTEntry_from_hcode( /*OUT*/UInt* from_sNo,
/*OUT*/UInt* from_tteNo,
@ -663,10 +697,12 @@ Bool find_TTEntry_from_hcode( /*OUT*/UInt* from_sNo,
UInt tteNo = hx->tteNo;
/* Do some additional sanity checks. */
vg_assert(tteNo <= N_TTES_PER_SECTOR);
/* Entry might have been invalidated. Consider this
as not found. */
if (sec->tt[tteNo].status == Deleted)
/* if this hx entry corresponds to dead host code, we must
tell this code has not been found, as it cannot be patched. */
if (HostExtent__is_dead (hx, sec))
return False;
vg_assert(sec->tt[tteNo].status == InUse);
/* Can only half check that the found TTEntry contains hcode,
due to not having a length value for the hcode in the
@ -834,11 +870,14 @@ static
void unchain_in_preparation_for_deletion ( VexArch vex_arch,
UInt here_sNo, UInt here_tteNo )
{
if (0)
VG_(printf)("QQQ unchain_in_prep %u.%u\n", here_sNo, here_tteNo);
if (DEBUG_TRANSTAB)
VG_(printf)("QQQ unchain_in_prep %u.%u...\n", here_sNo, here_tteNo);
UWord i, j, n, m;
Int evCheckSzB = LibVEX_evCheckSzB(vex_arch);
TTEntry* here_tte = index_tte(here_sNo, here_tteNo);
if (DEBUG_TRANSTAB)
VG_(printf)("... QQQ tt.entry 0x%llu tt.tcptr 0x%p\n",
here_tte->entry, here_tte->tcptr);
vg_assert(here_tte->status == InUse);
/* Visit all InEdges owned by here_tte. */
@ -984,7 +1023,7 @@ UInt addEClassNo ( /*MOD*/Sector* sec, Int ec, UShort tteno )
vg_assert(ec >= 0 && ec < ECLASS_N);
vg_assert(tteno < N_TTES_PER_SECTOR);
if (0) VG_(printf)("ec %d gets %d\n", ec, (Int)tteno);
if (DEBUG_TRANSTAB) VG_(printf)("ec %d gets %d\n", ec, (Int)tteno);
if (sec->ec2tte_used[ec] >= sec->ec2tte_size[ec]) {
@ -1002,7 +1041,7 @@ UInt addEClassNo ( /*MOD*/Sector* sec, Int ec, UShort tteno )
sec->ec2tte_size[ec] = new_sz;
sec->ec2tte[ec] = new_ar;
if (0) VG_(printf)("expand ec %d to %d\n", ec, new_sz);
if (DEBUG_TRANSTAB) VG_(printf)("expand ec %d to %d\n", ec, new_sz);
}
/* Common case */
@ -1215,13 +1254,29 @@ static Bool sanity_check_all_sectors ( void )
Bool sane;
Sector* sec;
for (sno = 0; sno < N_SECTORS; sno++) {
Int i;
Int nr_not_dead_hx = 0;
Int szhxa;
sec = &sectors[sno];
if (sec->tc == NULL)
continue;
sane = sanity_check_eclasses_in_sector( sec );
if (!sane)
return False;
szhxa = VG_(sizeXA)(sec->host_extents);
for (i = 0; i < szhxa; i++) {
const HostExtent* hx = VG_(indexXA)(sec->host_extents, i);
if (!HostExtent__is_dead (hx, sec))
nr_not_dead_hx++;
}
if (nr_not_dead_hx != sec->tt_n_inuse) {
VG_(debugLog)(0, "transtab",
"nr_not_dead_hx %d sanity fail (expected == in use %d)\n",
nr_not_dead_hx, sec->tt_n_inuse);
return False;
}
}
if ( !sanity_check_redir_tt_tc() )
return False;
if ( !sanity_check_sector_search_order() )
@ -1375,7 +1430,8 @@ static void initialiseSector ( Int sno )
VG_(machine_get_VexArchInfo)( &vex_arch, NULL );
/* Visit each just-about-to-be-abandoned translation. */
if (0) VG_(printf)("QQQ unlink-entire-sector: %d START\n", sno);
if (DEBUG_TRANSTAB) VG_(printf)("QQQ unlink-entire-sector: %d START\n",
sno);
for (i = 0; i < N_TTES_PER_SECTOR; i++) {
if (sec->tt[i].status == InUse) {
vg_assert(sec->tt[i].n_tte2ec >= 1);
@ -1394,7 +1450,8 @@ static void initialiseSector ( Int sno )
sec->tt[i].status = Empty;
sec->tt[i].n_tte2ec = 0;
}
if (0) VG_(printf)("QQQ unlink-entire-sector: %d END\n", sno);
if (DEBUG_TRANSTAB) VG_(printf)("QQQ unlink-entire-sector: %d END\n",
sno);
/* Free up the eclass structures. */
for (i = 0; i < ECLASS_N; i++) {
@ -1467,8 +1524,8 @@ void VG_(add_to_transtab)( VexGuestExtents* vge,
/* Generally stay sane */
vg_assert(n_guest_instrs < 200); /* it can be zero, tho */
if (0)
VG_(printf)("add_to_transtab(entry = 0x%llx, len = %d)\n",
if (DEBUG_TRANSTAB)
VG_(printf)("add_to_transtab(entry = 0x%llx, len = %d) ...\n",
entry, code_len);
n_in_count++;
@ -1593,6 +1650,9 @@ void VG_(add_to_transtab)( VexGuestExtents* vge,
vg_assert(hx_prev->start + hx_prev->len <= hx.start);
}
VG_(addToXA)(hx_array, &hx);
if (DEBUG_TRANSTAB)
VG_(printf)("... hx.start 0x%p hx.len %u sector %d ttslot %d\n",
hx.start, hx.len, y, i);
}
/* Update the fast-cache. */
@ -2228,7 +2288,7 @@ void VG_(print_tt_tc_stats) ( void )
" transtab: discarded %'llu (%'llu -> ?" "?)\n",
n_disc_count, n_disc_osize );
if (0) {
if (DEBUG_TRANSTAB) {
Int i;
VG_(printf)("\n");
for (i = 0; i < ECLASS_N; i++) {

View File

@ -112,6 +112,10 @@ EXTRA_DIST = \
nlpasssigalrm.stderr.exp \
nlpasssigalrm.stdinB.gdb \
nlpasssigalrm.stdoutB.exp \
nlself_invalidate.stderrB.exp \
nlself_invalidate.stderr.exp \
nlself_invalidate.stdinB.gdb \
nlself_invalidate.vgtest \
nlsigvgdb.vgtest \
nlsigvgdb.stderr.exp \
nlsigvgdb.stderrB.exp \
@ -123,6 +127,7 @@ check_PROGRAMS = \
gone \
main_pic \
passsigalrm \
self_invalidate \
sleepers \
t \
watchpoints

View File

@ -0,0 +1,8 @@
Nulgrind, the minimal Valgrind tool
(action at startup) vgdb me ...

View File

@ -0,0 +1 @@
relaying data between gdb and process ....

View File

@ -0,0 +1,14 @@
# connect gdb to Valgrind gdbserver:
target remote | ./vgdb --wait=60 --vgdb-prefix=./vgdb-prefix-nlself_invalidate
echo vgdb launched process attached\n
# place a breakpoint that will cause an invalidation of the currently executed translation
break *top
# Now, continue till the end. This should not crash
continue
continue
continue
continue
continue
continue
# and the process should exit very quickly now
quit

View File

@ -0,0 +1,12 @@
# reproduces a bug triggered by translation chaining and gdbserver
# invalidation due to breakpoint, when there is a jump from a tranlation
# block into itself.
prog: self_invalidate
vgopts: --tool=none --vgdb=yes --vgdb-error=0 --vgdb-prefix=./vgdb-prefix-nlself_invalidate
stderr_filter: filter_stderr
prereq: test -e gdb && ( ../tests/arch_test amd64 || ../tests/arch_test mips32 || ../tests/arch_test mips64 )
progB: gdb
argsB: --quiet -l 60 --nx ./self_invalidate
stdinB: nlself_invalidate.stdinB.gdb
stdoutB_filter: filter_make_empty
stderrB_filter: filter_gdb

View File

@ -0,0 +1,38 @@
/* Based on reproducer done by Dejan Jevtic */
int main (void)
{
#if defined(VGA_amd64)
/* Note that small changes in the code below
caused the bug not to be triggered anymore.
E.g. removing the dec %%eax avoids the assert
while this removal causes one more loop to be
executed. */
__asm__ __volatile__
("mov $30, %%eax\n\t"
"top:\n\t"
"mov $-4, %%ebx\n\t"
"add %%ebx, %%eax\n\t"
"dec %%eax\n\t"
"cmp $0x0,%%eax\n\t"
"jne top\n\t"
"mov $60, %%eax\n\t"
"mov $0, %%rdi\n\t"
"syscall\n\t"
: : : "eax", "ebx", "rdi"
);
#elif defined(VGA_mips32) || defined(VGA_mips64)
__asm__ __volatile__
("li $t0, 42\n\t"
"top:\n\t"
"li $t1, -4\n\t"
"addu $t0, $t0, $t1\n\t"
"li $t2, -2\n\t"
"addu $t0, $t0, $t2\n\t"
"addiu $t0, $t0, -1\n\t"
"bnez $t0, top\n\t"
"nop\n\t"
: : : "t0", "t1"
);
#endif
return 0;
}