Fixes: 388786 - Support bpf syscall in amd64 Linux
Add support for bpf() Linux-specific system call on amd64 platform. The
bpf() syscall is used to handle eBPF objects (programs and maps), and
can be used for a number of operations. It takes three arguments:
- "cmd" is an integer encoding a subcommand to run. Available subcommands
include loading a new program, creating a map or updating its entries,
retrieving information about an eBPF object, and many others.
- "attr" is a pointer to an object of type union bpf_attr. This object
is interpreted as the struct related to the selected subcommand, and embeds
the various parameters used with this subcommand. Some of those parameters
are read by the kernel (example for an eBPF map lookup: the key of the
entry to look up), others are written to (the value retrieved from
the map lookup).
- "attr_size" is the size of the object pointed to by "attr".
Since the action performed by the kernel and the way the "attr" attributes
are processed both depend on the subcommand in use, the PRE() and POST()
wrappers need to make the distinction as well. For each subcommand, mark
the attributes that are read or written.
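As a rough illustration of the per-subcommand distinction, here is a minimal
sketch. All demo_* names are hypothetical stand-ins, not the identifiers used
in the patch or in the kernel headers. It only records, for two map
subcommands, whether the kernel reads or writes the key and value buffers.

```c
#include <assert.h>

/* Hypothetical subcommands standing in for the real map lookup/update ones. */
enum demo_cmd { DEMO_MAP_LOOKUP_ELEM, DEMO_MAP_UPDATE_ELEM };

/* How the kernel accesses a user buffer for a given subcommand. */
enum demo_access { DEMO_NONE = 0, DEMO_READ = 1, DEMO_WRITE = 2 };

struct demo_plan {
    enum demo_access key;    /* access to the buffer holding the key */
    enum demo_access value;  /* access to the buffer holding the value */
};

/* Decide what a PRE()/POST() pair would have to mark for each subcommand. */
struct demo_plan demo_plan_for(enum demo_cmd cmd)
{
    struct demo_plan plan = { DEMO_NONE, DEMO_NONE };

    switch (cmd) {
    case DEMO_MAP_LOOKUP_ELEM:
        plan.key = DEMO_READ;     /* kernel reads the key to look up */
        plan.value = DEMO_WRITE;  /* kernel writes the retrieved value */
        break;
    case DEMO_MAP_UPDATE_ELEM:
        plan.key = DEMO_READ;     /* kernel reads the key */
        plan.value = DEMO_READ;   /* kernel reads the new value */
        break;
    default:
        assert(!"unknown subcommand in this sketch");
        break;
    }
    return plan;
}
```

In this sketch a PRE() wrapper would check every buffer tagged DEMO_READ
before the call, and a POST() wrapper would mark every buffer tagged
DEMO_WRITE as defined after a successful call.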
For some map operations, the only way to infer the size of the memory
areas used for read or write operations seems to involve reading
from /proc/<pid>/fdinfo/<fd> in order to retrieve the size of keys
and values for this map.
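A minimal sketch of the parsing step follows. It assumes the fdinfo text for
a map contains lines of the form "key_size:<tab>N" and "value_size:<tab>N".
parse_map_sizes is a hypothetical helper, not the function used in the patch,
and it works on a buffer that the caller has already read from
/proc/<pid>/fdinfo/<fd>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Scan fdinfo-style text line by line and extract key_size and value_size.
   Returns 1 if both fields were found, 0 otherwise. */
int parse_map_sizes(const char *text, unsigned *key_size, unsigned *value_size)
{
    int have_key = 0, have_value = 0;
    const char *line = text;

    assert(text != NULL && key_size != NULL && value_size != NULL);

    while (line != NULL && *line != '\0') {
        unsigned v;
        const char *nl;

        /* The blank in the format matches any whitespace, including a tab. */
        if (sscanf(line, "key_size: %u", &v) == 1) {
            *key_size = v;
            have_key = 1;
        } else if (sscanf(line, "value_size: %u", &v) == 1) {
            *value_size = v;
            have_value = 1;
        }

        /* Advance to the start of the next line, if any. */
        nl = strchr(line, '\n');
        line = (nl != NULL) ? nl + 1 : NULL;
    }
    return have_key && have_value;
}
```

If either field is missing, the function returns 0 and a caller would have to
skip the size-dependent checks rather than guess the sizes.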
The definitions of union bpf_attr and of other eBPF-related elements
required for adequately performing the checks were added to the Linux
header file.
Processing related to file descriptors is added in a follow-up patch.
Fix 373192 Calling posix_spawn in glibc 2.24 completely broken
Functionally, this patch makes just the following 2 changes to the
fork/clone handling:
* It no longer masks CLONE_VFORK:
The only effect of this flag is to suspend the parent, waiting for
the child to either exit or execve.
If some applications depend on this synchronisation, it is better to keep it,
as it does no harm to suspend the parent valgrind while waiting for the
child valgrind to exit or execve.
* In case the guest calls the clone syscall providing a non-zero client stack,
set the child guest SP after the syscall, before executing guest instructions.
Not setting the guest stack ptr was the source of the problem reported
in the bugs.
This also adds a test case none/tests/linux/clonev.
Before this patch, the test gives a SEGV, which is fixed by the patch.
The patch is however a lot bigger: this fix touched some (mostly
identical/duplicated) code in all the linux platforms.
So, the clone/fork code has been factorised as much as possible.
This removes about 1700 lines of code.
This has been tested on:
* amd64
* x86
* ppc64 be and le
* ppc32
* arm64
This has been compiled on but *not really tested* on:
* mips64 (not too clear how to properly build and run valgrind on gcc22)
It has *not* been compiled and *not* tested on:
* arm
* mips32
* tilegx
* darwin (normally, no impact)
* solaris (normally, no impact)
The changes are relatively mechanical, so it is not impossible that
it will compile and work out of the box on these platforms.
Otherwise, questions welcome.
A few points of interest:
* Some platforms did have a typedef void vki_modify_ldt_t,
and some platforms had no definition for this type at all.
To make it easier to factorise, for such platforms, the following has
been used:
typedef char vki_modify_ldt_t;
When sizeof(vki_modify_ldt_t) is > 1, the syscall arg is checked.
This is somewhat of a hack, but it simplified the factorisation.
* For mips32/mips64 and tilegx, there is a strange unconditional assignment
of 0 to a register (guest_r2 on mips, guest_r0 on tilegx).
It is unclear what this is for, in particular because it is assigned
whatever the result of the syscall (success or not).
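The vki_modify_ldt_t point above can be sketched as follows. This is only an
illustration of the sizeof trick. tls_arg_bytes_to_check is a hypothetical
helper name, and the real code performs a memory check rather than returning
a byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder used on platforms that have no real definition of the type. */
typedef char vki_modify_ldt_t;

/* Hypothetical helper: how many bytes of the argument shared code would
   check. With the 1-byte placeholder the result is 0, so the check is
   skipped. With a real, larger struct the whole object would be checked. */
size_t tls_arg_bytes_to_check(const void *tls_arg)
{
    assert(tls_arg != NULL);
    if (sizeof(vki_modify_ldt_t) > 1)
        return sizeof(vki_modify_ldt_t);
    return 0;
}
```

The condition is a compile-time constant, so a compiler can drop the unused
branch on each platform.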
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16186
Some architectures, e.g. s390, don't have dedicated recvmmsg and sendmmsg
system calls, but use the socketcall multiplexing system call with
SYS_RECVMMSG or SYS_SENDMMSG (just like the accept4 system call can also
be called through socketcall). Create separate helpers for recvmmsg and
sendmmsg that can be used by either the direct syscall or the
socketcall.
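A minimal sketch of the shared-helper arrangement, with hypothetical demo_*
names and illustrative sub-call numbers (assumed here, not taken from the
patch). The direct wrapper and the socketcall wrapper both end up in the same
helper, which here only records its arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative socketcall sub-call numbers (an assumption in this sketch). */
enum { DEMO_SYS_RECVMMSG = 19, DEMO_SYS_SENDMMSG = 20 };

struct demo_mmsg_args { long sockfd, vec, vlen, flags, timeout; };

static struct demo_mmsg_args last_recvmmsg;  /* what the helper last saw */
static int recvmmsg_checks;                  /* how often the helper ran */

/* Shared helper: the real one would do the PRE memory checks. */
void demo_pre_recvmmsg(long sockfd, long vec, long vlen, long flags, long timeout)
{
    struct demo_mmsg_args a = { sockfd, vec, vlen, flags, timeout };
    last_recvmmsg = a;
    recvmmsg_checks++;
}

/* Direct syscall path: the arguments arrive in syscall registers. */
void demo_pre_sys_recvmmsg(long a1, long a2, long a3, long a4, long a5)
{
    demo_pre_recvmmsg(a1, a2, a3, a4, a5);
}

/* socketcall path: the arguments arrive packed in a user-space array.
   Returns 0 if the sub-call is handled by this sketch, -1 otherwise. */
int demo_pre_sys_socketcall(int call, const long *args)
{
    assert(args != NULL);
    switch (call) {
    case DEMO_SYS_RECVMMSG:
        demo_pre_recvmmsg(args[0], args[1], args[2], args[3], args[4]);
        return 0;
    default:
        return -1;  /* sendmmsg and other sub-calls omitted for brevity */
    }
}
```

A sendmmsg helper would follow the same pattern, with its own case in the
socketcall switch.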
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14964
Almost mechanical transformation, removes > 1000 SLOC.
Compiled and regtested on amd64/x86/mips32
Compiled and (somewhat) tested on mips64
Compiled on arm
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13302
sys_socketcall was duplicated in syswrap-{ppc64|ppc32|arm|mips32|s390x}-linux.c
=>
* Similarly to what was done for sys_ipc, factorise the code in syswrap-linux.c
* Re-enabled PRE_MEM_READ for VKI_SYS_SENDMSG and VKI_SYS_RECVMSG
(PRE_MEM_READ calls were commented out around 2003, for what
was supposed to be a glibc bug.
The PRE_MEM_READ calls were already re-enabled in s390x)
* s390x also had some more checking to verify the addressability of
the args and fail the syscall with EFAULT if not addressable
=> same checks are now done for all platforms.
(tested on x86/amd64/mips32/s390x/ppc32/ppc64,
compiled for arm-android-emulator)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13104
and POST(sys_sigaction) in syswrap-x86-linux.c and
syswrap-ppc32-linux.c, and replace them with a single version in
syswrap-linux.c instead. Derived from patch in bug 266035 comment 10
(Jeff Brown, jeffbrown@google.com).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11870
perf_event_open some time after we added it, so correct the name
wherever it appears to match the current kernel source.
Also fix up the PRE handler to do the check correctly, using the
size field of the structure to work out how much data there is.
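The size-driven check can be sketched as below. struct demo_event_attr is a
hypothetical, cut-down stand-in for the real attribute structure, and
demo_attr_range is an illustrative helper rather than the handler's actual
code. The range to check is taken from the size field the caller filled in,
not from sizeof of the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the attribute structure passed by the caller. */
struct demo_event_attr {
    uint32_t type;
    uint32_t size;    /* size of the structure as declared by the caller */
    uint64_t config;
};

struct demo_range { const void *start; size_t len; };

/* Compute the byte range a PRE handler would mark as read by the kernel.
   A real handler would first check that the size field itself is readable. */
struct demo_range demo_attr_range(const struct demo_event_attr *attr)
{
    struct demo_range r;

    assert(attr != NULL);
    r.start = attr;
    r.len = attr->size;  /* driven by the caller's size, not sizeof(*attr) */
    return r;
}
```

Taking the length from the size field keeps the check in step with callers
built against an older or newer version of the structure.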
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11804
is no support for amd64 because there is no getcpu system call on that
platform - it is always done as a vsyscall in user space.
Based on patch from Aleksander Salwa. Closes #223758.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11054