This fixes a helgrind crash detected on android.
Android bionic pthread lib unmaps the stack for detached threads
before exiting.
Helgrind tries to unwind the stack to record a 'read' after
the stack unmap, just before the exit syscall.
The unwind then causes a SEGV.
The solution consists in tightening the calculation of
the stack limits, so as to stop unwinding when no valid stack
can be found.
Regression test reproduces the same problem by simulating the
bionic behaviour on linux, using asm similar to bionic lib.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14976
Some arches (s390x and ppc64) return ENOSYS instead of EINVAL for
undefined futex operations. Adjust the helgrind filter_stderr to
handle that case.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14375
Activating this hint using --sim-hints=no-nptl-pthread-stackcache
means the glibc nptl stack cache will be disabled.
Disabling this stack/tls cache avoids helgrind false positive race conditions
errors when using __thread variables.
Note: disabling the stack cache is done by a kludge, dependent on
internal knowledge of glibc code, and using libpthread debug info.
So, this kludge might be broken with newer glibc version.
This has been tested on various platforms and various
glibc versions 2.11, 2.16 and 2.18
To check if the disabling works, you can do:
valgrind --tool=helgrind --sim-hints=no-nptl-pthread-stackcache -d -v ./helgrind/tests/tls_threads |& grep kludge
If you see the below 2 lines, then hopefully the stack cache has been disabled.
--12624-- deactivate nptl pthread stackcache via kludge: found symbol stack_cache_actsize at addr 0x3AF178
--12624:1:sched pthread stack cache size disabling done via kludge
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14313
to add PPC64 LE support. The other two patches can be found in Bugzillas
334384 and 334834. Note, there are no VEX changes in this patch.
PP64 Little Endian test case fixes.
This patch adds new LE and BE expect files where needed. In other
cases, the test was fixed to run correctly on LE and BE using based on
testing to see which platform is being used.
Where practical, the test cases have been changed so that the output
produced for BE and LE will be identical. The test cases that require
a major rewrite to make the output identical for BE and LE simply
had an additional expect file added.
Signed-off-by: Carl Love <carll@us.ibm.com>
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14240
to add PPC64 LE support. The other two patches can be found in Bugzillas
334834 and 334836. The commit does not have a VEX commit associated with it.
POWER PC, add initial Little Endian support
The IBM POWER processor now supports both Big Endian and Little Endian.
This patch renames the #defines with the name ppc64 to ppc64be for the BE
specific code. This patch adds the Little Endian #define ppc64le to the
Additionally, a few functions are renamed to remove BE from the name if the
function is used by BE and LE. Functions that are BE specific have BE put
in the name.
The goals of this patch is to make sure #defines, function names and
variables consistently use PPC64/ppc64 if it refers to BE and LE,
PPC64BE/ppc64be if it is specific to BE, PPC64LE/ppc64le if it is LE
specific. The patch does not break the code for PPC64 Big Endian.
The test files memcheck/tests/atomic_incs.c, tests/power_insn_available.c
and tests/power_insn_available.c are also updated to the new #define
definition for PPC64 BE.
Signed-off-by: Carl Love <carll@us.ibm.com>
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14238
* Add lock announcements in various helgrind errors that were not
announcing the locks
* ensure locks are also announced in xml (note that this is compatible
with xml protocol version 4, so no impact on GUI which properly
implement the protocol)
Changes done:
* Like other HG record_error functions, HG_(record_error_LockOrder) is
now passing Lock* rather than lock guest addresses.
* update exp files for tests that were showing locks without announcing them
* change tc14_laog_dinphils.c and tc15_laog_lockdel.c so as to
have same sizes on 32 and 64 bits systems for allocated or symbol sizes.
* factorise all code that was announcing first lock observation
* enable xml lock announcement
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14204
(note that some error messages are not announcing the lock,
which is not that nice).
At least the lock order violation message do not announce locks.
That should be improved/fixed
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14188
and stack address description.
* A race condition on an allocated block shows the stacktrace, but
does not show the thread # that allocated the block.
This patch adds the output of the thread # that allocated the block.
* The patch also fixes the confusion that might appear between
the core threadid and the helgrind thread nr in Stack address description:
A printed stack addrinfo was containing a thread id, while all other helgrind
messages are using (supposed to use) an 'helgrind thread #' which
is used in the thread announcement.
Basically, the idea is to let a tool set a "tool specific thread nr'
in an addrinfo.
The pretty printing of the addrinfo is then by preference showing this
thread nr (if it was set, i.e. different of 0).
Currently, only helgrind uses this addrinfo tnr.
Note: in xml mode, the output is matching the protocol description.
I.e., GUI should not be impacted by this change, if they properly implement
the xml protocol.
* Also, make the output produced by m_addrinfo consistent:
The message 'block was alloc'd at' is changed to be like all other
output : one character indent, and starting with an uppercase
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14175
showing where a thread was created.
This makes many tests fail => use sed to delete pthread_create_WRK
from the stacktrace to let tests succeed on ppc64.
With this change, on ppc64 gcc110 (fedora 18), helgrind failures
goes from 28 tests failing to 4, with following reasons:
helgrind/tests/pth_cond_destroy_busy (stderr)
(6 errors instead of 3 in the summary line ???)
helgrind/tests/tc06_two_races_xml (stderr)
similar change needed in filter_xml to del pthread_create_WRK
helgrind/tests/tc18_semabuse (stderr)
- with error code 22 (EINVAL: Invalid argument)
+ with error code 38 (ENOSYS: Function not implemented)
helgrind/tests/tc20_verifywrap (stderr)
- with error code 22 (EINVAL: Invalid argument)
+ with error code 38 (ENOSYS: Function not implemented)
More details about the stacktrace not containing pthread_create_WRK:
--------------------------------------------------------------------
Here is the stacktrace obtained by GDB+vgdb:
(gdb) bt
#0 0x0000008074f7ac5c in .__clone () from /lib64/libc.so.6
#1 0x000000807517b1ec in do_clone (pd=0x4c6f200, attr=0x8075189c90 <default_attr>, stackaddr=<optimized out>, stopped=<optimized out>,
fct=@0x80751a01e0: 0x807517c500 <start_thread>, clone_flags=4001536) at ../nptl/sysdeps/pthread/createthread.c:74
#2 0x000000000403ed0c in pthread_create_WRK (thread=<error reading variable: value has been optimized out>,
attr=<error reading variable: value has been optimized out>, start=<error reading variable: value has been optimized out>,
arg=0xfff00ee18) at hg_intercepts.c:269
#3 0x000000000403ef1c in _vgw00000ZZ_libpthreadZdsoZd0_pthreadZucreateZAZa (thread=<optimized out>, attr=<optimized out>,
start=<optimized out>, arg=<optimized out>) at hg_intercepts.c:300
#4 0x000000003806f1d8 in ?? ()
#5 0x0000008074e9fb94 in generic_start_main (main=@0x100200d8: 0x100013a0 <main>, argc=<optimized out>, ubp_av=0xfff00f2d8,
auxvec=0xfff00f408, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>)
at ../csu/libc-start.c:225
#6 0x0000008074e9fd90 in __libc_start_main (argc=<optimized out>, ubp_av=<optimized out>, ubp_ev=<optimized out>,
auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>)
at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:91
#7 0x0000000000000000 in ?? ()
(gdb)
and here is the stacktrace produced by Valgrind unwinder:
Thread 1: status = VgTs_Runnable
==41687== at 0x8074F7AC5C: clone (in /usr/lib64/libc-2.16.so)
==41687== by 0x807517B1EB: do_clone.constprop.3 (createthread.c:74)
==41687== by 0x403EF1B: pthread_create@* (hg_intercepts.c:300)
==41687== by 0x10001597: main (tc19_shadowmem.c:172)
valgrind stack top usage: 15328 of 1048576
When the 2nd clone break is encountered (in the child thread), here is
the GDB stacktraces:
Thread 2 (Thread 6028):
#0 0x0000008074f75fb0 in .madvise () from /lib64/libc.so.6
#1 0x000000807517c700 in start_thread (arg=0x4c6f200) at pthread_create.c:402
#2 0x0000008074f7acf0 in .__clone () from /lib64/libc.so.6
Thread 1 (Thread 41687):
#0 pthread_create_WRK (thread=0xfff00ee10, attr=0x0, start=@0x100200e8: 0x10001dd0 <steer>, arg=0xfff00ee18) at hg_intercepts.c:248
#1 0x000000000403ef1c in _vgw00000ZZ_libpthreadZdsoZd0_pthreadZucreateZAZa (thread=<optimized out>, attr=<optimized out>,
start=<optimized out>, arg=<optimized out>) at hg_intercepts.c:300
#2 0x000000003806f1d8 in ?? ()
#3 0x0000008074e9fb94 in generic_start_main (main=@0x100200d8: 0x100013a0 <main>, argc=<optimized out>, ubp_av=0xfff00f2d8,
auxvec=0xfff00f408, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>)
at ../csu/libc-start.c:225
#4 0x0000008074e9fd90 in __libc_start_main (argc=<optimized out>, ubp_av=<optimized out>, ubp_ev=<optimized out>,
auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>)
at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:91
#5 0x0000000000000000 in ?? ()
(gdb)
Here are the valgrind stacktraces:
Thread 1: status = VgTs_Runnable
==41687== at 0x403EBE0: pthread_create_WRK (hg_intercepts.c:248)
==41687== by 0x403EF1B: pthread_create@* (hg_intercepts.c:300)
==41687== by 0x8074E9FB93: generic_start_main.isra.0 (libc-start.c:225)
==41687== by 0x8074E9FD8F: (below main) (libc-start.c:91)
valgrind stack top usage: 15328 of 1048576
Thread 2: status = VgTs_WaitSys
==41687== at 0x8074F75FB0: madvise (in /usr/lib64/libc-2.16.so)
==41687== by 0x807517C6FF: start_thread (pthread_create.c:402)
valgrind stack top usage: 10320 of 1048576
And then after a few more next/breaks:
Thread 1: status = VgTs_Runnable
==41687== at 0x8074F7AC5C: clone (in /usr/lib64/libc-2.16.so)
==41687== by 0x807517B1EB: do_clone.constprop.3 (createthread.c:74)
==41687== by 0x403EF1B: pthread_create@* (hg_intercepts.c:300)
==41687== by 0x100015BB: main (tc19_shadowmem.c:173)
valgrind stack top usage: 15328 of 1048576
Thread 2: status = VgTs_WaitSys
==41687== at 0x8074F75FB0: madvise (in /usr/lib64/libc-2.16.so)
==41687== by 0x807517C6FF: start_thread (pthread_create.c:402)
valgrind stack top usage: 10320 of 1048576
So, pthread_create_WRK is not in the stacktrace anymore.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13983
of memcheck and helgrind in a common module:
pub_tool_addrinfo.h pub_core_addrinfo.h m_addrinfo.c
At the same time, the factorised code is made usable by other
tools also (and is used by the gdbserver command 'v.info location'
which replaces the helgrind 'describe addr' introduced 1 week ago
and which is now callable by all tools).
The new address description code can describe more addresses
(e.g. for memcheck, if the block is not on the free list anymore,
but is in an arena free list, this will also be described).
Similarly, helgrind address description can now describe more addresses
when --read-var-info=no is given (e.g. global symbols are
described, or addresses on the stack are described as
being on the stack, freed blocks in the arena free list are
described, ...).
See e.g. the change in helgrind/tests/annotate_rwlock.stderr.exp
or locked_vs_unlocked2.stderr.exp
The patch touches many files, but is basically a lot of improvements
in helgrind output files.
The code changes are mostly refactorisation of existing code.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13965
On MIPS architecture helgrind is showing race condition error in printf if
the printf is first time called from the child thread. If we call printf
from the main for the first time we will suppress this error on mips.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13749
(When calling pthread_cond_destroy or pthread_mutex_destroy
with initializers as argument Helgrind (incorrectly)
reports errors.)
This introduces a new race report (but no new race) in
some conditions. I think this is OK because the race only
occurs in the case where the program is buggy (racey) anyway.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13643
Fix some compiler warnings when compiling Valgrind for mips32/mips64.
Clean up exp files for mips32 BE and LE.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13496
Add mips64-le implementation of:
- atomic_add_8bit
- atomic_add_16bit
- atomic_add_32bit
- atomic_add_64bit
- do_acasW
Minor fixes for mips32 implementations are included as well.
These functions are needed to execute atomic_incs and annotate_hbefore
tests on mips64le.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13357
(problem reported in bug 307082, comment 8).
Solution applied is similar to what is in 307082 patch
(i.e. do not destroy the internal helgrind var if nWaiters > 0).
But also do not remove it from the FM.
+ add a test case (re-using the drd test case)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13329
In case a lock order violation is detected in a multi lock cycle,
then the current code cannot produce the set of locks and the
stack traces involved in the cycle.
However, it is still possible to produce the stack trace of
the new lock and the other lock between which a cycle was discovered.
Also, add a comment in the code clarifying why the set of locks
establishing the required order cannot (currently) be produced.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13281
This patch changes the way static variables are
recorded by readdwarf3.c (when giving --read-var-info=yes),
improving the way such variables are described.
Currently:
A static variable does not have the DW_AT_external tag.
So, readdwarf3.c does not consider it a global variable.
It is rather considered a "local" variable.
When it is recorded, it is associated to a range of program counters
(the functions in the file where it is visible).
However, even if the static variable is only visible
in the source file where it is declared, it can in reality
be used by any range of program counters, typically
by having the address of the local variable passed
to other functions.
Such local variable can then only be described
when the program counter is in the range of program
counters for which it has been recorded.
However, this (local) description is obtained
by a kludge in debuginfo.c (around line 3285).
This kludge then produces a strange description,
telling that the variable has been declared in
frame 0 of a thread (see second example below).
The kludge is not always able to describe
the address (if the IP of the tid is in another file than
where the variable has been declared).
I suspect the kludge can sometimes describe the var as being
declared in an unrelated thread
(e.g. if an error is triggered by tid 5, but tid1 is by
luck in an IP corresponding to the recorded range).
The patch changes the way a static variable is recorded:
if DW_AT_external tag is found, a variable is marked as global.
If a variable is not external, but is seen when level is 1,
then we record the variable as a global variable (i.e.
with a full IP range).
This improves the way such static variable are described:
* they are described even if being accessed by other files.
* their description is not in an artificial "thread frame".
First example:
**************
a variable cannot be described because it is
accessed by a function in another file:
with the trunk:
==20410== ----------------------------------------------------------------
==20410==
==20410== Possible data race during read of size 4 at 0x600F54 by thread #1
==20410== Locks held: none
==20410== at 0x4007E4: a (abc.c:42)
==20410== by 0x4006BC: main (mabc.c:24)
==20410==
==20410== This conflicts with a previous write of size 4 by thread #2
==20410== Locks held: none
==20410== at 0x4007ED: a (abc.c:42)
==20410== by 0x400651: brussels_fn (mabc.c:9)
==20410== by 0x4C2B54E: mythread_wrapper (hg_intercepts.c:219)
==20410== by 0x4E348C9: start_thread (pthread_create.c:300)
==20410==
==20410== ----------------------------------------------------------------
with the patch:
==4515== ----------------------------------------------------------------
==4515==
==4515== Possible data race during read of size 4 at 0x600F54 by thread #1
==4515== Locks held: none
==4515== at 0x4007E4: a (abc.c:42)
==4515== by 0x4006BC: main (mabc.c:24)
==4515==
==4515== This conflicts with a previous write of size 4 by thread #2
==4515== Locks held: none
==4515== at 0x4007ED: a (abc.c:42)
==4515== by 0x400651: brussels_fn (mabc.c:9)
==4515== by 0x4C2B54E: mythread_wrapper (hg_intercepts.c:219)
==4515== by 0x4E348C9: start_thread (pthread_create.c:300)
==4515==
==4515== Location 0x600f54 is 0 bytes inside global var "static_global"
==4515== declared at mabc.c:4
==4515==
==4515== ----------------------------------------------------------------
Second example:
***************
When the kludge can describe the variable, it is strangely described
as being declared in a frame of a thread, while for sure the declaration
has nothing to do with a thread
With the trunk:
==20410== Location 0x600f68 is 0 bytes inside local var "static_global_a"
==20410== declared at abc.c:3, in frame #0 of thread 1
With the patch:
==4515== Location 0x600f68 is 0 bytes inside global var "static_global_a"
==4515== declared at abc.c:3
#include <stdio.h>
static int static_global_a = 0; //// <<<< this is abc.c:3
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13153
struct pthread_mutex_t is different on MIPS32 and x86_64, and thus passing a
bogus mutex pthread_cond_wait (line 72) will corrupt memory in a different way
on two platforms. This causes the subsequent call to pthread_cond_wait to fail
on MIPS and i386 but not on x86_64.
This change fixes helgrind/tests/tc23_bogus_condwait on MIPS and i386.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13001
Writing to a double is done via two store-word instructions on MIPS platforms.
Thus, Helgrind will report "Possible data race during write of size 4" twice on
subsequent locations on MIPS instead of a single "Possible data race during
write of size 8". New exp file is added to cover this case.
This fixes helgrind/tests/tc19_shadowmem on MIPS.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13000
Different error numbers on MIPS require us to add an extra exp file for this
test. EDEADLK is 45 on MIPS (and not 35), and EOPNOTSUPP is 122 (and not 95).
Furthermore, sem_post will pass due to different implementation on MIPS (in
comparison to x86_64), and thus one error less has to be reported in the log.
This fixes helgrind/tests/tc20_verifywrap.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12998
The existing tc18_semabuse.stderr.exp expects that sem_post on a bogus semaphore
will fail, yet it does not on platforms such as MIPS or ARM. This is specific to
implementation of sem_post for i386/x86_64 that has some assumptions such as that
'private' field is not clobbered. This will eventually result in different
parameter passed to syscall and thus different output is encountered.
This change fixes helgrind/tests/tc18_semabuse for MIPS.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12976
* when cond var is destroyed, in the PRE, report an error if nwaiters > 0.
* when cond_wait succeeds, get the cond var but do not create one in helgrind
(it must exist if cond_wait was done).
Report an error if cond not found (assuming this is caused by a destroy
done while the thread was cond_wait-ing).
* added a test
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12721
Even without fair scheduling, this ensures the progress
of each thread.
This avoids the test looping forever in an outer/inner
setup.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12452
of this is to not have to link against -lrt because that causes
a different back-trace on certain x86 and s390x environments.
See the thread with subject
"helgrind/tests/cond_timedwait_invalid failing on x86"
on valgrind-developers for more details.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12246
quadratic growth on some apparently simple test cases. Fixes#267925.
(Philippe Waroquiers, philippe.waroquiers@skynet.be)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12201
the tarball generated by "make dist".
With this change running regtest from the tarball produces the same
results as a regtest on a checked out repository (on x86 that is).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12172