DRD changes (Bart Van Assche)

* Add docs: exp-drd/docs/README.txt
* Added one drd suppression pattern and cleaned up the suppression file.
* All regression tests now pass on x86_64 and i386, including sigalrm.
* Updated TODO.txt file.
* pth_create_chain test is now started with 100 threads instead of 10
  -- 10 was not enough.
* DRD no longer exits on PPC32 and PPC64 but just prints a warning
  message before it starts.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@7275
This commit is contained in:
Julian Seward 2007-12-04 21:27:18 +00:00
parent aac5718e0a
commit de4d2156cc
9 changed files with 182 additions and 123 deletions

NEWS

@@ -1,7 +1,6 @@
Release 3.3.0 (7 December 2007)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3.3.0 is a feature release with many significant improvements and the
usual collection of bug fixes. This release supports X86/Linux,
AMD64/Linux, PPC32/Linux and PPC64/Linux. Support for recent distros
@@ -50,7 +49,7 @@ Omega and DRD. There are many other smaller improvements. In detail:
exp-omega/docs/omega_introduction.txt.
* exp-DRD: a data race detector based on the happens-before
relation. See exp-drd/TODO.txt.
relation. See exp-drd/docs/README.txt.
- Scalability improvements for very large programs, particularly those
which have a million or more malloc'd blocks in use at once. These


@@ -4,13 +4,16 @@ Last updated February 22, 2006
Data-race detection algorithm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Implement glibc version detection in drd_main.c.
- Rename drd_preloaded.c into drd_intercepts.c.
- Propose on the Valgrind developers mailing list to add scripts
"run-in-place" and "debug-in-place".
- Implement segment merging, such that the number of segments per thread
remains limited even when there is no synchronization between threads.
- Find out why a race is reported on std::string::string(std::string const&)
(stc test case 16).
- Make sure that drd supports more than 256 mutexes.
- Performance testing and tuning.
- pthread semaphore support.
- pthread rwlock state tracking and support.
- pthread barrier state tracking and support.
- mutexes: support for pthread_mutex_timedlock() (recently added to the POSIX
@@ -42,15 +45,12 @@ Documentation
Known bugs
~~~~~~~~~~
- Gets killed by the OOM handler for some applications, e.g. knode and
OpenOffice.
- [AMD64] Reports "Allocation context: unknown" for BSS symbols on AMD64
(works fine on X86). This is a bug in Valgrind's debug info reader
-- VG_(find_seginfo)() returns NULL for BSS symbols on AMD64. Not yet in
- Gets killed by the OOM handler for realistically sized applications,
e.g. knode and OpenOffice.
- [x86_64] Reports "Allocation context: unknown" for BSS symbols on AMD64
(works fine on i386). This is a bug in Valgrind's debug info reader
-- VG_(find_seginfo)() returns NULL for BSS symbols on x86_64. Not yet in
the KDE bug tracking system.
- False positives are reported when a signal is sent via pthread_kill() from
one thread to another (bug 152728).
- Crashes (cause not known): VALGRIND_LIB=$PWD/.in_place coregrind/valgrind --tool=exp-drd --trace-mem=yes /bin/ls
Known performance issues:
- According to cachegrind, VG_(OSet_Next)() is taking up most CPU cycles.


@@ -1 +1 @@
#EXTRA_DIST = drd-manual.xml
EXTRA_DIST = README.txt

exp-drd/docs/README.txt Normal file

@@ -0,0 +1,83 @@
DRD: a Data Race Detector
=========================
Last update: December 3, 2007 by Bart Van Assche.
The Difficulty of Multithreading Programming
--------------------------------------------
Multithreading is a way to model multiple concurrent activities within a
single process. Since the invention of multithreading there has been an
ongoing debate about which way to model concurrent activities is better --
multithreading or message passing. This debate exists because multithreaded
programming is error prone: multithreaded programs can exhibit data races
and/or deadlocks. Despite these risks, multithreaded programming is popular:
for many applications multithreading is a more natural programming style,
and multithreaded code often runs faster than the same application
implemented via message passing.
In the context of DRD, a data race is defined as two concurrent memory
accesses, where at least one of these two memory accesses is a store operation,
and these accesses are not protected by proper locking constructs. Data
races are harmful because they may lead to unpredictable results in
multithreaded programs. There is a general consensus that data races
should be avoided in multithreaded programs.
About DRD
---------
The current version of DRD is able to perform data race detection on small
programs -- DRD quickly runs out of memory for realistically sized programs.
The current version runs well under Linux on x86 CPUs for multithreaded
programs that use the POSIX threading library. Regular POSIX threads, detached
threads, mutexes, condition variables and spinlocks are supported. POSIX
semaphores, barriers and reader-writer locks are not yet supported.
Extensive scientific research has been carried out in the area of data-race
detection. The two most important algorithms are known as the Eraser algorithm
and the algorithm based on the happens-before relationship, first documented by
Netzer. The Eraser algorithm can result in false positives, while the Netzer
algorithm guarantees not to report false positives. The Netzer algorithm
ignores a certain class of data races however. Both algorithms have been
implemented in Valgrind. The helgrind tool implements the Eraser algorithm,
and the DRD tool implements the Netzer algorithm. Although [Savage 1997]
claims that the Netzer algorithm is harder to implement efficiently, as of
version 3.3.0 DRD runs significantly faster on several regression tests than
helgrind.
How to use DRD
--------------
To use this tool, specify --tool=drd on the Valgrind command line.
Future DRD Versions
-------------------
The following may be expected in future versions of DRD:
* More extensive documentation.
* Drastically reduced memory consumption, such that realistic applications can
be analyzed with DRD.
* Faster operation.
* Support for semaphores, barriers and reader-writer locks.
* Support for PowerPC CPU's.
* A lock dependency analyzer, as a help in deadlock prevention.
* Support for more than 256 mutexes per process.
See also
--------
* Robert H. B. Netzer and Barton P. Miller. What are race
conditions? Some issues and formalizations. ACM Letters
on Programming Languages and Systems, 1(1):74-88, March 1992.
* John Ousterhout, Why Threads Are A Bad Idea (for most
purposes), Invited Talk at the 1996 USENIX Technical Conference (January
25, 1996). http://home.pacbell.net/ouster/threads.pdf
* Stefan Savage, Michael Burrows, Greg Nelson, Patrick
Sobalvarro and Thomas Anderson, Eraser: A Dynamic Data Race Detector for
Multithreaded Programs, ACM Transactions on Computer Systems,
15(4):391-411, November 1997.


@@ -74,23 +74,6 @@ static Bool drd_print_stats = False;
static Bool drd_trace_mem = False;
static Bool drd_trace_fork_join = False;
static Addr drd_trace_address = 0;
#if 0
// Note: using the information below for suppressing data races is only
// possible when the client and the shared libraries it uses contain
// debug information. Not every Linux distribution includes debug information
// in shared libraries.
static const SuppressedSymbol s_suppressed_symbols[] =
{
{ "ld-linux.so.2", "_rtld_local" },
{ "libpthread.so.0", "__nptl_nthreads" },
{ "libpthread.so.0", "stack_cache" },
{ "libpthread.so.0", "stack_cache_actsize" },
{ "libpthread.so.0", "stack_cache_lock" },
{ "libpthread.so.0", "stack_used" },
{ "libpthread.so.0", "libgcc_s_forcedunwind" },
{ "libpthread.so.0", "libgcc_s_getcfa" },
};
#endif
//
@@ -155,6 +138,8 @@ VG_REGPARM(2) void drd_trace_load(Addr addr, SizeT size)
{
Segment* sg;
thread_set_vg_running_tid(VG_(get_running_tid)());
if (! thread_is_recording(thread_get_running_tid()))
return;
@@ -195,6 +180,8 @@ VG_REGPARM(2) void drd_trace_store(Addr addr, SizeT size)
{
Segment* sg;
thread_set_vg_running_tid(VG_(get_running_tid)());
if (! thread_is_recording(thread_get_running_tid()))
return;
@@ -237,29 +224,7 @@ static void drd_pre_mem_read(const CorePart part,
const Addr a,
const SizeT size)
{
const ThreadId running_tid = VG_(get_running_tid)();
if (size == 0)
return;
if (tid != running_tid)
{
if (VgThreadIdToDrdThreadId(tid) != DRD_INVALID_THREADID)
{
drd_set_running_tid(tid);
drd_trace_load(a, size);
drd_set_running_tid(running_tid);
}
else
{
VG_(message)(Vg_DebugMsg,
"drd_pre_mem_read() was called before"
" drd_post_thread_create() for thread ID %d",
tid);
tl_assert(0);
}
}
else
if (size > 0)
{
drd_trace_load(a, size);
}
@@ -270,31 +235,7 @@ static void drd_post_mem_write(const CorePart part,
const Addr a,
const SizeT size)
{
const ThreadId running_tid = VG_(get_running_tid)();
if (size == 0)
return;
if (tid != running_tid)
{
if (VgThreadIdToDrdThreadId(tid) != DRD_INVALID_THREADID)
{
drd_set_running_tid(tid);
drd_trace_store(a, size);
drd_set_running_tid(running_tid);
}
else
{
#if 1
VG_(message)(Vg_DebugMsg,
"drd_pre_mem_write() was called before"
" drd_post_thread_create() for thread ID %d",
tid);
tl_assert(0);
#endif
}
}
else
if (size > 0)
{
drd_trace_store(a, size);
}
@@ -302,6 +243,8 @@ static void drd_post_mem_write(const CorePart part,
static void drd_start_using_mem(const Addr a1, const Addr a2)
{
thread_set_vg_running_tid(VG_(get_running_tid)());
if (a1 <= drd_trace_address && drd_trace_address < a2
&& thread_is_recording(thread_get_running_tid()))
{
@@ -354,9 +297,7 @@ static void drd_start_using_mem_stack(const Addr a, const SizeT len)
/* Assumption: stacks grow downward. */
static void drd_stop_using_mem_stack(const Addr a, const SizeT len)
{
#if 0
VG_(message)(Vg_DebugMsg, "stop_using_mem_stack(0x%lx, %ld)", a, len);
#endif
thread_set_vg_running_tid(VG_(get_running_tid)());
thread_set_stack_min(thread_get_running_tid(),
a + len - VG_STACK_REDZONE_SZB);
drd_stop_using_mem(a - VG_STACK_REDZONE_SZB,
@@ -384,11 +325,7 @@ void drd_pre_thread_create(const ThreadId creator, const ThreadId created)
// Hack: compensation for code missing in coregrind/m_main.c.
if (created == 1)
{
extern ThreadId VG_(running_tid);
tl_assert(VG_(running_tid) == VG_INVALID_THREADID);
VG_(running_tid) = 1;
drd_start_client_code(VG_(running_tid), 0);
VG_(running_tid) = VG_INVALID_THREADID;
thread_set_running_tid(1, 1);
}
#endif
if (IsValidDrdThreadId(drd_creator))
@@ -559,10 +496,7 @@ void drd_post_clo_init(void)
# if defined(VGP_x86_linux) || defined(VGP_amd64_linux)
/* fine */
# else
VG_(printf)("\nDRD currently only works on x86-linux and amd64-linux.\n");
VG_(printf)("At the very least you need to set PTHREAD_{MUTEX,COND}_SIZE\n");
VG_(printf)("in pthread_object_size.h to correct values. Sorry.\n\n");
VG_(exit)(0);
VG_(printf)("\nWARNING: DRD has only been tested on x86-linux and amd64-linux.\n\n");
# endif
}
@@ -696,21 +630,21 @@ IRSB* drd_instrument(VgCallbackClosure* const closure,
return bb;
}
static void drd_set_running_tid(const ThreadId tid)
static void drd_set_running_tid(const ThreadId vg_tid)
{
static ThreadId s_last_tid = VG_INVALID_THREADID;
if (tid != s_last_tid)
static ThreadId s_last_vg_tid = VG_INVALID_THREADID;
if (vg_tid != s_last_vg_tid)
{
const DrdThreadId drd_tid = VgThreadIdToDrdThreadId(tid);
const DrdThreadId drd_tid = VgThreadIdToDrdThreadId(vg_tid);
tl_assert(drd_tid != DRD_INVALID_THREADID);
s_last_tid = tid;
s_last_vg_tid = vg_tid;
if (drd_trace_fork_join)
{
VG_(message)(Vg_DebugMsg,
"drd_track_thread_run tid = %d / drd tid %d",
tid, drd_tid);
vg_tid, drd_tid);
}
thread_set_running_tid(drd_tid);
thread_set_running_tid(vg_tid, drd_tid);
}
}
@@ -766,7 +700,7 @@ static void drd_load_suppression_file(void)
static const Char drd_supp[] = "glibc-2.X-drd.supp";
const Int len = VG_(strlen)(VG_(libdir)) + 1 + sizeof(drd_supp);
Char* const buf = VG_(arena_malloc)(VG_AR_CORE, len);
VG_(sprintf)(buf, "%s/%s", VG_(libdir), drd_supp);
VG_(snprintf)(buf, len, "%s/%s", VG_(libdir), drd_supp);
VG_(clo_suppressions)[VG_(clo_n_suppressions)] = buf;
VG_(clo_n_suppressions)++;
}


@@ -84,7 +84,8 @@ static ULong s_report_races_count;
static ULong s_update_danger_set_count;
static ULong s_danger_set_bitmap_creation_count;
static ULong s_danger_set_bitmap2_creation_count;
static DrdThreadId s_running_tid = DRD_INVALID_THREADID;
static ThreadId s_vg_running_tid = VG_INVALID_THREADID;
static DrdThreadId s_drd_running_tid = DRD_INVALID_THREADID;
static ThreadInfo s_threadinfo[DRD_N_THREADS];
static struct bitmap* s_danger_set;
@@ -420,17 +421,49 @@ void thread_set_name_fmt(const DrdThreadId tid, const char* const fmt,
fmt, arg);
s_threadinfo[tid].name[sizeof(s_threadinfo[tid].name) - 1] = 0;
}
DrdThreadId thread_get_running_tid(void)
{
tl_assert(s_running_tid != DRD_INVALID_THREADID);
return s_running_tid;
// HACK. To do: remove the if-statement and keep the assert.
if (VG_(get_running_tid)() != VG_INVALID_THREADID)
tl_assert(VG_(get_running_tid)() == s_vg_running_tid);
tl_assert(s_drd_running_tid != DRD_INVALID_THREADID);
return s_drd_running_tid;
}
void thread_set_running_tid(const DrdThreadId tid)
void thread_set_vg_running_tid(const ThreadId vg_tid)
{
s_running_tid = tid;
thread_update_danger_set(tid);
s_context_switch_count++;
// HACK. To do: uncomment the line below.
// tl_assert(vg_tid != VG_INVALID_THREADID);
if (vg_tid != s_vg_running_tid)
{
thread_set_running_tid(vg_tid, VgThreadIdToDrdThreadId(vg_tid));
}
tl_assert(s_vg_running_tid != VG_INVALID_THREADID);
tl_assert(s_drd_running_tid != DRD_INVALID_THREADID);
}
void thread_set_running_tid(const ThreadId vg_tid, const DrdThreadId drd_tid)
{
// HACK. To do: remove the next two lines.
if (vg_tid == VG_INVALID_THREADID)
return;
tl_assert(vg_tid != VG_INVALID_THREADID);
tl_assert(drd_tid != DRD_INVALID_THREADID);
if (vg_tid != s_vg_running_tid)
{
s_vg_running_tid = vg_tid;
s_drd_running_tid = drd_tid;
thread_update_danger_set(drd_tid);
s_context_switch_count++;
}
tl_assert(s_vg_running_tid != VG_INVALID_THREADID);
tl_assert(s_drd_running_tid != DRD_INVALID_THREADID);
}
/**
@@ -640,7 +673,7 @@ void thread_combine_vc(DrdThreadId joiner, DrdThreadId joinee)
vc_combine(&s_threadinfo[joiner].last->vc, &s_threadinfo[joinee].last->vc);
thread_discard_ordered_segments();
if (joiner == s_running_tid)
if (joiner == s_drd_running_tid)
{
thread_update_danger_set(joiner);
}
@@ -668,7 +701,7 @@ void thread_stop_using_mem(const Addr a1, const Addr a2)
for (p = s_threadinfo[i].first; p; p = p->next)
{
if (other_user == DRD_INVALID_THREADID
&& i != s_running_tid
&& i != s_drd_running_tid
&& bm_has_any_access(p->bm, a1, a2))
{
other_user = i;
@@ -939,7 +972,7 @@ static void thread_update_danger_set(const DrdThreadId tid)
tl_assert(0 <= tid && tid < DRD_N_THREADS
&& tid != DRD_INVALID_THREADID);
tl_assert(tid == s_running_tid);
tl_assert(tid == s_drd_running_tid);
s_update_danger_set_count++;
s_danger_set_bitmap_creation_count -= bm_get_bitmap_creation_count();


@@ -70,7 +70,9 @@ void thread_set_name(const DrdThreadId tid, const char* const name);
void thread_set_name_fmt(const DrdThreadId tid, const char* const name,
const UWord arg);
DrdThreadId thread_get_running_tid(void);
void thread_set_running_tid(const DrdThreadId tid);
void thread_set_vg_running_tid(const ThreadId vg_tid);
void thread_set_running_tid(const ThreadId vg_tid,
const DrdThreadId drd_tid);
Segment* thread_get_segment(const DrdThreadId tid);
void thread_new_segment(const DrdThreadId tid);
VectorClock* thread_get_vc(const DrdThreadId tid);


@@ -1 +1 @@
prog: pth_create_chain 10
prog: pth_create_chain 100


@@ -20,29 +20,29 @@
fun:exit
}
{
dl
dl-2.6.*
exp-drd:ConflictingAccess
obj:/lib64/ld-2.6.1.so
obj:/lib*/ld-*.so
fun:exit
}
{
dl-2.6.1
dl-2.6.*
exp-drd:ConflictingAccess
obj:/lib64/ld-2.6.1.so
obj:/lib64/ld-2.6.1.so
obj:/lib64/ld-2.6.1.so
obj:/lib*/ld-*.so
obj:/lib*/ld-*.so
obj:/lib*/ld-*.so
}
{
libc
exp-drd:ConflictingAccess
fun:__libc_enable_asynccancel
obj:/lib/libc-*
obj:/lib*/libc-*
}
{
libc
exp-drd:ConflictingAccess
fun:__libc_disable_asynccancel
obj:/lib/libc-*
obj:/lib*/libc-*
}
{
librt
@@ -78,14 +78,14 @@
{
pthread
exp-drd:ConflictingAccess
obj:/lib64/libpthread-2.6.1.so
obj:/lib*/libpthread-*.so
fun:start_thread
fun:clone
}
{
pthread
exp-drd:ConflictingAccess
obj:/lib64/libc-2.6.1.so
obj:/lib*/libc-*.so
fun:__libc_thread_freeres
fun:start_thread
fun:clone
@@ -93,8 +93,8 @@
{
pthread
exp-drd:ConflictingAccess
obj:/lib64/libc-2.6.1.so
obj:/lib64/libc-2.6.1.so
obj:/lib*/libc-*.so
obj:/lib*/libc-*.so
fun:__libc_thread_freeres
fun:start_thread
fun:clone
@@ -154,6 +154,14 @@
fun:pthread_join
fun:pthread_join
}
{
pthread
exp-drd:ConflictingAccess
fun:free_stacks
fun:__deallocate_stack
fun:pthread_join
fun:pthread_join
}
{
pthread
exp-drd:ConflictingAccess
@@ -193,7 +201,7 @@
pthread
exp-drd:ConflictingAccess
fun:sigcancel_handler
obj:/lib/libpthread-*
obj:/lib*/libpthread-*
}
{
pthread-unwind
@@ -201,7 +209,7 @@
fun:_Unwind_ForcedUnwind
fun:__pthread_unwind
fun:sigcancel_handler
obj:/lib/libpthread-*
obj:/lib*/libpthread-*
}
{
pthread-unwind