PTRACE_GET_THREAD_AREA is not handled by amd64 linux syswrap, which leads
to false positive errors in 64 bits program ptrace-ing 32 bits processes.
For example, the below error was wrongly reported on GDB:
==25377== Conditional jump or move depends on uninitialised value(s)
==25377== at 0x8A1D7EC: td_thr_get_info (td_thr_get_info.c:35)
==25377== by 0x526819: thread_from_lwp(thread_info*, ptid_t) (linux-thread-db.c:417)
==25377== by 0x5281D4: thread_db_notice_clone(ptid_t, ptid_t) (linux-thread-db.c:442)
==25377== by 0x51773B: linux_handle_extended_wait(lwp_info*, int) (linux-nat.c:2027)
....
==25377== Uninitialised value was created by a stack allocation
==25377== at 0x69A360: x86_linux_get_thread_area(int, void*, unsigned int*) (x86-linux-nat.c:278)
Fix this by implementing PTRACE_GET|SET_THREAD_AREA on amd64.
Fixes: 388786 - Support bpf syscall in amd64 Linux
Add support for bpf() Linux-specific system call on amd64 platform. The
bpf() syscall is used to handle eBPF objects (programs and maps), and
can be used for a number of operations. It takes three arguments:
- "cmd" is an integer encoding a subcommand to run. Available subcommand
include loading a new program, creating a map or updating its entries,
retrieving information about an eBPF object, and may others.
- "attr" is a pointer to an object of type union bpf_attr. This object
converts to a struct related to selected subcommand, and embeds the
various parameters used with this subcommand. Some of those parameters
are read by the kernel (example for an eBPF map lookup: the key of the
entry to lookup), others are written into (the value retrieved from
the map lookup).
- "attr_size" is the size of the object pointed by "attr".
Since the action performed by the kernel, and the way "attr" attributes
are processed depends on the subcommand in use, the PRE() and POST()
wrappers need to make the distinction as well. For each subcommand, mark
the attributes that are read or written.
For some map operations, the only way to infer the size of the memory
areas used for read or write operations seems to involve reading
from /proc/<pid>/fdinfo/<fd> in order to retrieve the size of keys
and values for this map.
The definitions of union bpf_attr and of other eBPF-related elements
required for adequately performing the checks were added to the Linux
header file.
Processing related to file descriptors is added in a follow-up patch.
Currently arch_prctl calls VG_(core_panic) when it sees an unknown
arch_prctl option which kills the process. glibc uses arch_prctl with
an (as yet) unknown option to see if the kernel supports CET. This
breaks any application running under valgrind on x86_64 with:
valgrind: the 'impossible' happened:
Unsupported arch_prctl option
Thread 1: status = VgTs_Runnable (lwpid 19934)
==19934== at 0x121A15: get_cet_status (cpu-features.c:28)
==19934== by 0x121A15: init_cpu_features (cpu-features.c:474)
==19934== by 0x121A15: dl_platform_init (dl-machine.h:228)
==19934== by 0x121A15: _dl_sysdep_start (dl-sysdep.c:231)
==19934== by 0x10A1D7: _dl_start_final (rtld.c:413)
==19934== by 0x10A1D7: _dl_start (rtld.c:520)
We already handle all known options. It would be better to do as the
kernel does and just return failure with EINVAL instead.
Fix 373192 Calling posix_spawn in glibc 2.24 completely broken
Functionally, this patch just does the following 2 changes to the
fork clone handling:
* It does not mask anymore CLONE_VFORK :
The only effect of this flag is to suspend the parent, waiting for
the child to either exit or execve.
If some applications depends on this synchronisation, better keep it,
as it will not harm to suspend the parent valgrind waiting for the
child valgrind to exit or execve.
* In case the guest calls the clone syscall providing a non zero client stack,
set the child guest SP after the syscall, before executing guest instructions.
Not setting the guest stack ptr was the source of the problem reported
in the bugs.
This also adds a test case none/tests/linux/clonev.
Before this patch, test gives a SEGV, which is fixed by the patch.
The patch is however a lot bigger : this fix was touching some (mostly
identical/duplicated) code in all the linux platforms.
So, the clone/fork code has been factorised as much as possible.
This removes about 1700 lines of code.
This has been tested on:
* amd64
* x86
* ppc64 be and le
* ppc32
* arm64
This has been compiled on but *not really tested* on:
* mips64 (not too clear how to properly build and run valgrind on gcc22)
It has *not* been compiled and *not* tested on:
* arm
* mips32
* tilegx
* darwin (normally, no impact)
* solaris (normally, no impact)
The changes are relatively mechanical, so it is not impossible that
it will compile and work out of the box on these platforms.
Otherwise, questions welcome.
A few points of interest:
* Some platforms did have a typedef void vki_modify_ldt_t,
and some platforms had no definition for this type at all.
To make it easier to factorise, for such platforms, the following has
been used:
typedef char vki_modify_ldt_t;
When the sizeof vki_modify_ldt_t is > 1, then the arg syscall is checked.
This is somewhat a hack, but was simplifying the factorisation.
* for mips32/mips64 and tilegx, there is a strange unconditional assignment
of 0 to a register (guest_r2 on mips, guest_r0 on tilegx).
Unclear what this is, in particular because this is assigned whatever
the result of the syscall (success or not).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16186
(valgrind side).
In summary: we were counting somewhat on the luck for FS,
we now similarly count on luch for GS
See VEX commit log r3043 for more details.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14815
At various places, there were either some assumption that the 'end'
boundary (highest address) was either not included, included,
or was the highest addressable word, or the highest addressable byte.
This e.g. was very visible when doing:
./vg-in-place -d -d ./helgrind/tests/tc01_simple_race|&grep regi
giving
--24040:2:stacks register 0xBEDB4000-0xBEDB4FFF as stack 0
--24040:2:stacks register 0x402C000-0x4A2C000 as stack 1
showing that the main stack end was (on x86) not the highest word
but the highest byte, while for the thread 1, the registered end
was a byte not part of the stack.
The attached patch ensures that stack bounds semantic are documented and
consistent. Also, some of the stack handling code is factorised.
The convention that the patch ensures and documents is:
start is the lowest addressable byte, end is the highest addressable byte.
(the words 'min' and 'max' have been kept when already used, as this wording is
consistent with the new semantic of start/end).
In various debug log, used brackets [ and ] to make clear that
both bounds are included.
The code to guess and register the client stack was duplicated
in all the platform specific syswrap-<plat>-<os>.c files.
Code has been factorised in syswrap-generic.c
The patch has been regression tested on
x86, amd64, ppc32/64, s390x.
It has been compiled and one test run on arm64.
Not compiled/not tested on darwin, android, mips32/64, arm
More in details, the patch does the following:
coregrind/pub_core_aspacemgr.h
include/valgrind.h
include/pub_tool_machine.h
coregrind/pub_core_scheduler.h
coregrind/pub_core_stacks.h
- document start/end semantic in various functions
also in pub_tool_machine.h:
- replaces unclear 'bottommost address' by 'lowest address'
(unclear as stack bottom is or at least can be interpreted as
the 'functional' bottom of the stack, which is the highest
address for 'stack growing downwards').
coregrind/pub_core_initimg.h
replace unclear clstack_top by clstack_end
coregrind/m_main.c
updated to clstack_end
coregrind/pub_core_threadstate.h
renamed client_stack_highest_word to client_stack_highest_byte
coregrind/m_scheduler/scheduler.c
computes client_stack_highest_byte as the highest addressable byte
Update comments in call to VG_(show_sched_status)
coregrind/m_machine.c
coregrind/m_stacktrace.c
updated to client_stack_highest_byte, and switched
stack_lowest/highest_word to stack_lowest/highest_byte accordingly
coregrind/m_stacks.c
clarify semantic of start/end,
added a comment to indicate why we invert start/end in register call
(note that the code find_stack_by_addr was already assuming that
end was included as the checks were doing e.g.
sp >= i->start && sp <= i->end
coregrind/pub_core_clientstate.h
coregrind/m_clientstate.c
renames Addr VG_(clstk_base) to Addr VG_(clstk_start_base)
(start to indicate it is the lowest address, base suffix kept
to indicate it is the initial lowest address).
coregrind/m_initimg/initimg-darwin.c
updated to VG_(clstk_start_base)
replace unclear iicii.clstack_top by iicii.clstack_end
updated clstack_max_size computation according to both bounds included.
coregrind/m_initimg/initimg-linux.c
updated to VG_(clstk_start_base)
updated VG_(clstk_end) computation according to both bounds included.
replace unclear iicii.clstack_top by iicii.clstack_end
coregrind/pub_core_aspacemgr.h
extern Addr VG_(am_startup) : clarify semantic of the returned value
coregrind/m_aspacemgr/aspacemgr-linux.c
removed a copy of a comment that was already in pub_core_aspacemgr.h
(avoid double maintenance)
renamed unclear suggested_clstack_top to suggested_clstack_end
(note that here, it looks like suggested_clstack_top was already
the last addressable byte)
* factorisation of the stack guessing and registration causes
mechanical changes in the following files:
coregrind/m_syswrap/syswrap-ppc64-linux.c
coregrind/m_syswrap/syswrap-x86-darwin.c
coregrind/m_syswrap/syswrap-amd64-linux.c
coregrind/m_syswrap/syswrap-arm-linux.c
coregrind/m_syswrap/syswrap-generic.c
coregrind/m_syswrap/syswrap-mips64-linux.c
coregrind/m_syswrap/syswrap-ppc32-linux.c
coregrind/m_syswrap/syswrap-amd64-darwin.c
coregrind/m_syswrap/syswrap-mips32-linux.c
coregrind/m_syswrap/priv_syswrap-generic.h
coregrind/m_syswrap/syswrap-x86-linux.c
coregrind/m_syswrap/syswrap-s390x-linux.c
coregrind/m_syswrap/syswrap-darwin.c
coregrind/m_syswrap/syswrap-arm64-linux.c
Some files to look at more in details:
syswrap-darwin.c : the handling of sysctl(kern.usrstack) looked
buggy to me, and has probably be made correct by the fact that
VG_(clstk_end) is now the last addressable byte. However,unsure
about this, as I could not find any documentation about
sysctl(kern.usrstack). I only find several occurences on the web,
showing that the result of this is page aligned, which I guess
means it must be 1+ the last addressable byte.
syswrap-x86-darwin.c and syswrap-amd64-darwin.c
I suspect the code that was computing client_stack_highest_word
was wrong, and the patch makes it correct.
syswrap-mips64-linux.c
not sure what to do for this code. This is the only code
that was guessing the stack differently from others.
Kept (almost) untouched. To be discussed with mips maintainers.
coregrind/pub_core_libcassert.h
coregrind/m_libcassert.c
* void VG_(show_sched_status):
renamed Bool valgrind_stack_usage to Bool stack_usage
if stack_usage, shows both the valgrind stack usage and
the client stack boundaries
coregrind/m_scheduler/scheduler.c
coregrind/m_gdbserver/server.c
coregrind/m_gdbserver/remote-utils.c
Updated comments in callers to VG_(show_sched_status)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14392
of clo which are (or should be) 'enum set'.
* pub_tool_options.h : add new macrox VG_USET_CLO and VG_USETX_CLO to
parse an 'enum set' command line option (with or without "all" keyword).
* use VG_USET_CLO for existing enum set clo options:
memcheck --errors-for-leak-kinds, --show-leak-kinds, --leak-check-heuristics
coregrind --vgdb-stop-at
* change --sim-hints and --kernel-variants to enum set
(this allows to detect user typos: currently, a typo in a sim-hint
or kernel variant is silently ignored. Now, an error will be given
to the user)
* The 2 new sets (--sim-hints and --kernel-variants) should not make
use of the 'all' keyword => VG_(parse_enum_set) has a new argument
to enable/disable the use of the "all" keyword.
* The macros defining an 'all enum' set definition was duplicating
all enum values (so addition of a new enum value could easily
give a bug). Removing these macros as they are unused
(to the exception of the leak-kind set).
For this set, the 'all macro' has been replaced by an 'all function',
coded using parse_enum_set parsing the "all" keyword.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14301
Almost mechanical transformation, removes > 1000 SLOC.
Compiled and regtested on amd64/x86/mips32
Compiled and (somewhat) tested on mips64
Compiled on arm
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13302
original patch from
Andreas Arnez <arnez AT linux DOT vnet DOT ibm DOT com>
Seems that ppc and mips dont have ptrace support....
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13113
rev 10493 fixed bug 117564 in syswrap-x86-linux.c.
This commit fixes the same problem in syswrap-amd64-linux.c.
The problem makes memcheck/tests/linux/stack_switch fails (at least on gcc20)
with unexpected
==802== Syscall param clone(child_tidptr) contains uninitialised byte(s)
The problem originates from always checking 3 optional args PRE_read,
while these should be checked only if the corresponding flags are set.
syswrap-{arm,ppc32,ppc64}-linux.c seems to have the same problem
(but no visible effect) : VKI_CLONE_PARENT_SETTID,VKI_CLONE_CHILD_SETTID
and VKI_CLONE_SETTLS not properly handled in the PRE part.
syswrap-s390x-linux.c seems to have the VKI_CLONE_SETTLS part wrong,
but VKI_CLONE_PARENT_SETTID and VKI_CLONE_CHILD_SETTID correct.
Commiting a fix just for amd64 for now.
We probably better make some common code in syswrap-generic.c
to regroup all similar platforms.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12586
When doing experiment with gcc 4.7.0 and link time optimisation,
encountered link failures on amd64 which were solved by adding
.globl and used attribute.
=> added .globl in similar places for arm/x86/ppc32/s390.
Did not touch darwin (which asm seems somewhat different).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12506
If the pre_thread_ll_create tracking function would be invoked without the
big lock being held, that would trigger a race condition in the tools that
implement this tracking function.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12458
perf_event_open some time after we added it, so correct the name
wherever it appears to match the current kernel source.
Also fixup the PRE handler to do the check correctly, using the
size field of the structure to work out how much data there is.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11804
__builtin_setjmp and __builtin_longjmp so that they can be selectively
replaced, on a platform by platform basis. Does not change any
functionality. Related to #259977.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11687