This implements rseq for amd64, arm, arm64, ppc32, ppc64,
s390x and x86 linux as ENOSYS (without warning).
glibc will start using rseq to accelerate sched_getcpu, if
available. This would cause a warning from valgrind every
time a new thread is started.
Real rseq (restartable sequences) support is pretty hard, so
for now just explicitly return ENOSYS (just like we do for clone3).
https://sourceware.org/pipermail/libc-alpha/2021-December/133656.html
This is a system call introduced in Linux 5.9.
It's typically used to bulk-close file descriptors that a process inherited
without having desired so and doesn't want to pass them to its offspring
for security reasons. For this reason the sensible upper limit value tends
to be unknown and the users prefer to stay on the safe side by setting it
high.
This is a bit peculiar because, if unfiltered, the syscall could end up
closing descriptors Valgrind uses for its purposes, ending in no end of
mayhem and suffering.
This patch adjusts the upper bounds to a safe value and then skips over
the descriptor Valgrind uses by potentially calling the real system call
with sub-ranges that are safe to close.
The call can fail on negative ranges and bad flags -- we're dealing with
the first condition ourselves while letting the real call fail on bad
flags.
https://bugs.kde.org/show_bug.cgi?id=439090
glibc 2.34 will try to use clone3 first before falling back to
the clone syscall. So implement clone3 as sys_ni_syscall which
simply return ENOSYS without producing a warning.
https://bugs.kde.org/show_bug.cgi?id=439590
io_uring syscalls only work on x86/amd64, but they can be enabled on
all arches. Based on a patch by Nathan Ringo <nathan@remexre.xyz>.
https://bugs.kde.org/show_bug.cgi?id=423361
faccessat2 is a new syscall in linux 5.8 and will be used by glibc 2.33.
faccessat2 is simply faccessat with a new flag argument. It has
a common number across all linux arches.
https://bugs.kde.org/427787
The only "special" thing about these syscalls is that the given
struct sched_attr determines its own size for future expansion.
Original fix by "ISHIKAWA,chiaki" <ishikawa@yk.rim.or.jp>
https://bugs.kde.org/show_bug.cgi?id=369029
This patch adds sycall wrappers for the following syscalls which
use a 64bit time_t on 32bit arches: gettime64, settime64,
clock_getres_time64, clock_nanosleep_time64, timer_gettime64,
timer_settime64, timerfd_gettime64, timerfd_settime64,
utimensat_time64, pselect6_time64, ppoll_time64, recvmmsg_time64,
mq_timedsend_time64, mq_timedreceive_time64, semtimedop_time64,
rt_sigtimedwait_time64, futex_time64 and sched_rr_get_interval_time64.
Still missing are clock_adjtime64 and io_pgetevents_time64.
For the more complicated syscalls futex[_time64], pselect6[_time64]
and ppoll[_time64] there are shared pre and/or post helper functions.
Other functions just have their own PRE and POST handler.
Note that the vki_timespec64 struct really is the struct as used by
by glibc (it internally translates a 32bit timespec struct to a 64bit
timespec64 struct before passing it to any of the time64 syscalls).
The kernel uses a 64-bit signed int, but is ignoring the upper 32 bits
of the tv_nsec field. It does always write the full struct though.
So avoid checking the padding is only needed for PRE_MEM_READ.
There are two helper pre_read_timespec64 and pre_read_itimerspec64
to check the new structs.
https://bugs.kde.org/show_bug.cgi?id=416753
Sync VEX/LICENSE.GPL with top-level COPYING file. We used 3 different
addresses for writing to the FSF to receive a copy of the GPL. Replace
all different variants with an URL <http://www.gnu.org/licenses/>.
The following files might still have some slightly different (L)GPL
copyright notice because they were derived from other programs:
- files under coregrind/m_demangle which come from libiberty:
cplus-dem.c, d-demangle.c, demangle.h, rust-demangle.c,
safe-ctype.c and safe-ctype.h
- coregrind/m_demangle/dyn-string.[hc] derived from GCC.
- coregrind/m_demangle/ansidecl.h derived from glibc.
- VEX files for FMA detived from glibc:
host_generic_maddf.h and host_generic_maddf.c
- files under coregrin/m_debuginfo derived from LZO:
lzoconf.h, lzodefs.h, minilzo-inl.c and minilzo.h
- files under coregrind/m_gdbserver detived from GDB:
gdb/signals.h, inferiors.c, regcache.c, regcache.h,
regdef.h, remote-utils.c, server.c, server.h, signals.c,
target.c, target.h and utils.c
Plus the following test files:
- none/tests/ppc32/testVMX.c derived from testVMX.
- ppc tests derived from QEMU: jm-insns.c, ppc64_helpers.h
and test_isa_3_0.c
- tests derived from bzip2 (with embedded GPL text in code):
hackedbz2.c, origin5-bz2.c, varinfo6.c
- tests detived from glibc: str_tester.c, pth_atfork1.c
- test detived from GCC libgomp: tc17_sembar.c
- performance tests derived from bzip2 or tinycc (with embedded GPL
text in code): bz2.c, test_input_for_tinycc.c and tinycc.c
Fix 373192 Calling posix_spawn in glibc 2.24 completely broken
Functionally, this patch just does the following 2 changes to the
fork clone handling:
* It does not mask anymore CLONE_VFORK :
The only effect of this flag is to suspend the parent, waiting for
the child to either exit or execve.
If some applications depends on this synchronisation, better keep it,
as it will not harm to suspend the parent valgrind waiting for the
child valgrind to exit or execve.
* In case the guest calls the clone syscall providing a non zero client stack,
set the child guest SP after the syscall, before executing guest instructions.
Not setting the guest stack ptr was the source of the problem reported
in the bugs.
This also adds a test case none/tests/linux/clonev.
Before this patch, test gives a SEGV, which is fixed by the patch.
The patch is however a lot bigger : this fix was touching some (mostly
identical/duplicated) code in all the linux platforms.
So, the clone/fork code has been factorised as much as possible.
This removes about 1700 lines of code.
This has been tested on:
* amd64
* x86
* ppc64 be and le
* ppc32
* arm64
This has been compiled on but *not really tested* on:
* mips64 (not too clear how to properly build and run valgrind on gcc22)
It has *not* been compiled and *not* tested on:
* arm
* mips32
* tilegx
* darwin (normally, no impact)
* solaris (normally, no impact)
The changes are relatively mechanical, so it is not impossible that
it will compile and work out of the box on these platforms.
Otherwise, questions welcome.
A few points of interest:
* Some platforms did have a typedef void vki_modify_ldt_t,
and some platforms had no definition for this type at all.
To make it easier to factorise, for such platforms, the following has
been used:
typedef char vki_modify_ldt_t;
When the sizeof vki_modify_ldt_t is > 1, then the arg syscall is checked.
This is somewhat a hack, but was simplifying the factorisation.
* for mips32/mips64 and tilegx, there is a strange unconditional assignment
of 0 to a register (guest_r2 on mips, guest_r0 on tilegx).
Unclear what this is, in particular because this is assigned whatever
the result of the syscall (success or not).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16186
At various places, there were either some assumption that the 'end'
boundary (highest address) was either not included, included,
or was the highest addressable word, or the highest addressable byte.
This e.g. was very visible when doing:
./vg-in-place -d -d ./helgrind/tests/tc01_simple_race|&grep regi
giving
--24040:2:stacks register 0xBEDB4000-0xBEDB4FFF as stack 0
--24040:2:stacks register 0x402C000-0x4A2C000 as stack 1
showing that the main stack end was (on x86) not the highest word
but the highest byte, while for the thread 1, the registered end
was a byte not part of the stack.
The attached patch ensures that stack bounds semantic are documented and
consistent. Also, some of the stack handling code is factorised.
The convention that the patch ensures and documents is:
start is the lowest addressable byte, end is the highest addressable byte.
(the words 'min' and 'max' have been kept when already used, as this wording is
consistent with the new semantic of start/end).
In various debug log, used brackets [ and ] to make clear that
both bounds are included.
The code to guess and register the client stack was duplicated
in all the platform specific syswrap-<plat>-<os>.c files.
Code has been factorised in syswrap-generic.c
The patch has been regression tested on
x86, amd64, ppc32/64, s390x.
It has been compiled and one test run on arm64.
Not compiled/not tested on darwin, android, mips32/64, arm
More in details, the patch does the following:
coregrind/pub_core_aspacemgr.h
include/valgrind.h
include/pub_tool_machine.h
coregrind/pub_core_scheduler.h
coregrind/pub_core_stacks.h
- document start/end semantic in various functions
also in pub_tool_machine.h:
- replaces unclear 'bottommost address' by 'lowest address'
(unclear as stack bottom is or at least can be interpreted as
the 'functional' bottom of the stack, which is the highest
address for 'stack growing downwards').
coregrind/pub_core_initimg.h
replace unclear clstack_top by clstack_end
coregrind/m_main.c
updated to clstack_end
coregrind/pub_core_threadstate.h
renamed client_stack_highest_word to client_stack_highest_byte
coregrind/m_scheduler/scheduler.c
computes client_stack_highest_byte as the highest addressable byte
Update comments in call to VG_(show_sched_status)
coregrind/m_machine.c
coregrind/m_stacktrace.c
updated to client_stack_highest_byte, and switched
stack_lowest/highest_word to stack_lowest/highest_byte accordingly
coregrind/m_stacks.c
clarify semantic of start/end,
added a comment to indicate why we invert start/end in register call
(note that the code find_stack_by_addr was already assuming that
end was included as the checks were doing e.g.
sp >= i->start && sp <= i->end
coregrind/pub_core_clientstate.h
coregrind/m_clientstate.c
renames Addr VG_(clstk_base) to Addr VG_(clstk_start_base)
(start to indicate it is the lowest address, base suffix kept
to indicate it is the initial lowest address).
coregrind/m_initimg/initimg-darwin.c
updated to VG_(clstk_start_base)
replace unclear iicii.clstack_top by iicii.clstack_end
updated clstack_max_size computation according to both bounds included.
coregrind/m_initimg/initimg-linux.c
updated to VG_(clstk_start_base)
updated VG_(clstk_end) computation according to both bounds included.
replace unclear iicii.clstack_top by iicii.clstack_end
coregrind/pub_core_aspacemgr.h
extern Addr VG_(am_startup) : clarify semantic of the returned value
coregrind/m_aspacemgr/aspacemgr-linux.c
removed a copy of a comment that was already in pub_core_aspacemgr.h
(avoid double maintenance)
renamed unclear suggested_clstack_top to suggested_clstack_end
(note that here, it looks like suggested_clstack_top was already
the last addressable byte)
* factorisation of the stack guessing and registration causes
mechanical changes in the following files:
coregrind/m_syswrap/syswrap-ppc64-linux.c
coregrind/m_syswrap/syswrap-x86-darwin.c
coregrind/m_syswrap/syswrap-amd64-linux.c
coregrind/m_syswrap/syswrap-arm-linux.c
coregrind/m_syswrap/syswrap-generic.c
coregrind/m_syswrap/syswrap-mips64-linux.c
coregrind/m_syswrap/syswrap-ppc32-linux.c
coregrind/m_syswrap/syswrap-amd64-darwin.c
coregrind/m_syswrap/syswrap-mips32-linux.c
coregrind/m_syswrap/priv_syswrap-generic.h
coregrind/m_syswrap/syswrap-x86-linux.c
coregrind/m_syswrap/syswrap-s390x-linux.c
coregrind/m_syswrap/syswrap-darwin.c
coregrind/m_syswrap/syswrap-arm64-linux.c
Some files to look at more in details:
syswrap-darwin.c : the handling of sysctl(kern.usrstack) looked
buggy to me, and has probably be made correct by the fact that
VG_(clstk_end) is now the last addressable byte. However,unsure
about this, as I could not find any documentation about
sysctl(kern.usrstack). I only find several occurences on the web,
showing that the result of this is page aligned, which I guess
means it must be 1+ the last addressable byte.
syswrap-x86-darwin.c and syswrap-amd64-darwin.c
I suspect the code that was computing client_stack_highest_word
was wrong, and the patch makes it correct.
syswrap-mips64-linux.c
not sure what to do for this code. This is the only code
that was guessing the stack differently from others.
Kept (almost) untouched. To be discussed with mips maintainers.
coregrind/pub_core_libcassert.h
coregrind/m_libcassert.c
* void VG_(show_sched_status):
renamed Bool valgrind_stack_usage to Bool stack_usage
if stack_usage, shows both the valgrind stack usage and
the client stack boundaries
coregrind/m_scheduler/scheduler.c
coregrind/m_gdbserver/server.c
coregrind/m_gdbserver/remote-utils.c
Updated comments in callers to VG_(show_sched_status)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14392
sys_get_mempolicy.
This patch add support for the PPC64 sytem calls:
259 - sys_mbind
260 - sys_get_mempolicy
261 - sys_set_mempolicy
This patch also adds the Add syscall 259, sys_mbind, support for the PPC32
platform.
The patch fixes bugzilla 318932.
Signed-off-by: Carl Love <cel@us.ibm.com>
---
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13383
sys_socketcall was duplicated in syswrap-{ppc64|ppc32|arm|mips32|s390x}-linux.c
=>
* Similarly for what was done for sys_ipc, factorise the code in syswrap-linux.c
* re-enabled PRE_MEM_READ for VKI_SYS_SENDMSG and VKI_SYS_RECVMSG
(PRE_MEM_READ calls were commented out around 2003, for what
was supposed a glibc bug.
The PRE_MEM_READ calls were already re-enabled in s390x)
* s390x also had some more checking to verify the addressibility of
the args and fail the syscall with EFAULT if not addressable
=> same checks are now done for all platforms.
(tested on x86/amd64/mips32/s390x/ppc32/ppc64,
compiled for arm-android-emulator)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13104
When doing experiment with gcc 4.7.0 and link time optimisation,
encountered link failures on amd64 which were solved by adding
.globl and used attribute.
=> added .globl in similar places for arm/x86/ppc32/s390.
Did not touch darwin (which asm seems somewhat different).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12506
If the pre_thread_ll_create tracking function would be invoked without the
big lock being held, that would trigger a race condition in the tools that
implement this tracking function.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12458
changes for x86-linux and ppc32-linux. Derived from patch in bug
266035 comment 10 (Jeff Brown, jeffbrown@google.com).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11871
and POST(sys_sigaction) in syswrap-x86-linux.c and
syswrap-ppc32-linux.c, and replace them with a single version in
syswrap-linux.c instead. Derived from patch in bug 266035 comment 10
(Jeff Brown, jeffbrown@google.com).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11870
perf_event_open some time after we added it, so correct the name
wherever it appears to match the current kernel source.
Also fixup the PRE handler to do the check correctly, using the
size field of the structure to work out how much data there is.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11804
__builtin_setjmp and __builtin_longjmp so that they can be selectively
replaced, on a platform by platform basis. Does not change any
functionality. Related to #259977.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11687