This is a system call introduced in Linux 5.9.
It's typically used to bulk-close file descriptors that a process inherited
without having desired so and doesn't want to pass them to its offspring
for security reasons. For this reason the sensible upper limit value tends
to be unknown and the users prefer to stay on the safe side by setting it
high.
This is a bit peculiar because, if unfiltered, the syscall could end up
closing descriptors Valgrind uses for its purposes, ending in no end of
mayhem and suffering.
This patch adjusts the upper bounds to a safe value and then skips over
the descriptor Valgrind uses by potentially calling the real system call
with sub-ranges that are safe to close.
The call can fail on negative ranges and bad flags -- we're dealing with
the first condition ourselves while letting the real call fail on bad
flags.
https://bugs.kde.org/show_bug.cgi?id=439090
glibc 2.34 will try to use clone3 first before falling back to
the clone syscall. So implement clone3 as sys_ni_syscall which
simply return ENOSYS without producing a warning.
https://bugs.kde.org/show_bug.cgi?id=439590
io_uring syscalls only work on x86/amd64, but they can be enabled on
all arches. Based on a patch by Nathan Ringo <nathan@remexre.xyz>.
https://bugs.kde.org/show_bug.cgi?id=423361
faccessat2 is a new syscall in linux 5.8 and will be used by glibc 2.33.
faccessat2 is simply faccessat with a new flag argument. It has
a common number across all linux arches.
https://bugs.kde.org/427787
The only "special" thing about these syscalls is that the given
struct sched_attr determines its own size for future expansion.
Original fix by "ISHIKAWA,chiaki" <ishikawa@yk.rim.or.jp>
https://bugs.kde.org/show_bug.cgi?id=369029
I've tested this on amd64 and arm but I'm enabling it on all
arches since the syscall should work identically on all of them.
This was requested by users for a long time (almost 5 years) and
in fact, some programs (like libvirt) use namespaces and fork off
to enter other namespaces. Lack of implementation means valgrind
can't be used with these programs (or their configuration must be
changed to not use namespaces, which defeats the purpose).
Without knowing it, I've converged to same patch as mentioned in
bugs below.
https://bugs.kde.org/show_bug.cgi?id=343099https://bugs.kde.org/show_bug.cgi?id=368923https://bugs.kde.org/show_bug.cgi?id=369031
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Hook up sys_tee for mips32 and mips64 correctly.
For mips64, it is just a simplification to use generic linux implementation.
This fixes tee01 test in the LTP test suite for mips32.
Use the correct generic linux sys wrapper.
Follow-up for
commit b0861063a8d2a55bb7423e90d26806bab0f78a12
Author: Alexandra Hajkova <ahajkova@redhat.com>
Date: Tue Jun 4 13:47:14 2019 +0200
Add support for preadv2 and pwritev2 syscalls
This should fix
memcheck/tests/linux/sys-preadv2_pwritev2 (stderr)
memcheck/tests/linux/sys-preadv_pwritev (stderr)
on mips32/mips64.
Support for amd64, x86 - 64 and 32 bit, arm64, ppc64, ppc64le,
s390x, mips64. This should work identically on all
arches, tested on x86 32bit and 64bit one, but enabled on all.
Refactor the code to be reusable between old/new syscalls. Resolve TODO
items in the code. Add the testcase for the preadv2/pwritev2 and also
add the (similar) testcase for the older preadv/pwritev syscalls.
Trying to test handling an uninitialized flag argument for the v2 syscalls
does not work because the flag always comes out as defined zero.
Turns out glibc does this deliberately on 64bit architectures because
the kernel does actually have a low_offset and high_offset argument, but
ignores the high_offset/assumes it is zero.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=601cc11d054ae4b5e9b5babec3d8e4667a2cb9b5https://bugs.kde.org/408414
Sync VEX/LICENSE.GPL with top-level COPYING file. We used 3 different
addresses for writing to the FSF to receive a copy of the GPL. Replace
all different variants with an URL <http://www.gnu.org/licenses/>.
The following files might still have some slightly different (L)GPL
copyright notice because they were derived from other programs:
- files under coregrind/m_demangle which come from libiberty:
cplus-dem.c, d-demangle.c, demangle.h, rust-demangle.c,
safe-ctype.c and safe-ctype.h
- coregrind/m_demangle/dyn-string.[hc] derived from GCC.
- coregrind/m_demangle/ansidecl.h derived from glibc.
- VEX files for FMA detived from glibc:
host_generic_maddf.h and host_generic_maddf.c
- files under coregrin/m_debuginfo derived from LZO:
lzoconf.h, lzodefs.h, minilzo-inl.c and minilzo.h
- files under coregrind/m_gdbserver detived from GDB:
gdb/signals.h, inferiors.c, regcache.c, regcache.h,
regdef.h, remote-utils.c, server.c, server.h, signals.c,
target.c, target.h and utils.c
Plus the following test files:
- none/tests/ppc32/testVMX.c derived from testVMX.
- ppc tests derived from QEMU: jm-insns.c, ppc64_helpers.h
and test_isa_3_0.c
- tests derived from bzip2 (with embedded GPL text in code):
hackedbz2.c, origin5-bz2.c, varinfo6.c
- tests detived from glibc: str_tester.c, pth_atfork1.c
- test detived from GCC libgomp: tc17_sembar.c
- performance tests derived from bzip2 or tinycc (with embedded GPL
text in code): bz2.c, test_input_for_tinycc.c and tinycc.c
Adding MIPS N32 ABI support.
BZ issue - #345763.
Contributed and maintained by mulitple people over the years:
Crestez Dan Leonard, Maran Pakkirisamy, Dimitrije Nikolic,
Aleksandar Rikalo, Tamara Vlahovic.
On majority of architectures size of long matches register width.
On mips n32 size of long is 32 bits and register width is 64 bits.
Valgrind is written with assumption that long size matches register
width. This is the reason why both UWord for Valgrind and HWord for VEX
match size of long. Long size differs from register size on mips n32 ABI.
Introducing RegWord type that will match size of registers.
Part of the changes required for BZ issue - #345763.
Contributed by:
Tamara Vlahovic and Dimitrije Nikolic.
Fix 373192 Calling posix_spawn in glibc 2.24 completely broken
Functionally, this patch just does the following 2 changes to the
fork clone handling:
* It does not mask anymore CLONE_VFORK :
The only effect of this flag is to suspend the parent, waiting for
the child to either exit or execve.
If some applications depends on this synchronisation, better keep it,
as it will not harm to suspend the parent valgrind waiting for the
child valgrind to exit or execve.
* In case the guest calls the clone syscall providing a non zero client stack,
set the child guest SP after the syscall, before executing guest instructions.
Not setting the guest stack ptr was the source of the problem reported
in the bugs.
This also adds a test case none/tests/linux/clonev.
Before this patch, test gives a SEGV, which is fixed by the patch.
The patch is however a lot bigger : this fix was touching some (mostly
identical/duplicated) code in all the linux platforms.
So, the clone/fork code has been factorised as much as possible.
This removes about 1700 lines of code.
This has been tested on:
* amd64
* x86
* ppc64 be and le
* ppc32
* arm64
This has been compiled on but *not really tested* on:
* mips64 (not too clear how to properly build and run valgrind on gcc22)
It has *not* been compiled and *not* tested on:
* arm
* mips32
* tilegx
* darwin (normally, no impact)
* solaris (normally, no impact)
The changes are relatively mechanical, so it is not impossible that
it will compile and work out of the box on these platforms.
Otherwise, questions welcome.
A few points of interest:
* Some platforms did have a typedef void vki_modify_ldt_t,
and some platforms had no definition for this type at all.
To make it easier to factorise, for such platforms, the following has
been used:
typedef char vki_modify_ldt_t;
When the sizeof vki_modify_ldt_t is > 1, then the arg syscall is checked.
This is somewhat a hack, but was simplifying the factorisation.
* for mips32/mips64 and tilegx, there is a strange unconditional assignment
of 0 to a register (guest_r2 on mips, guest_r0 on tilegx).
Unclear what this is, in particular because this is assigned whatever
the result of the syscall (success or not).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16186
MIPS32 implementation missed to set up a correct (zero) return address.
This led to incorrect execution of get_StackTrace_wrk as it was not
able to unwind stack correctly.
This change fixes memcheck/tests/leak-autofreepool-5.
MIPS64 implementation missed clearing all integer registers before
entering the function.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16166
Use platform specific pre-wrapper for fadvise64 system call and respect
size of parameters, instead of using generic wrapper written for 32bit
architectures.
Issue reported by Marcin Juszkiewicz.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16163
Add missing POST wrapper for sys_prctl.
This fixes regressions from r15934 (on MIPS32 platforms) and r16003
(on MIPS64 platforms).
Related test: memcheck/tests/threadname
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16110
Replace use of daddi/addi with daddiu/addiu.
This is more R6-friendly and we actually want to use the instructions
that do not cause integer overflow exception.
Patch by Vicente Olivert Riera.
Related issue - BZ#356112.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16018
Add MIPS specific wrapper for prctl(GET/SET_FP_MODE) syscalls to
support FP32/FP64 mode switch.
Patch by Aleksandar Rikalo.
Related VEX change r3253.
Related bug - BZ #366079.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16003
Follow up to Philippe's change in r14392 which does a cleanup and makes
all architectures use the same code to guess and register stack.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14490
At various places, there were either some assumption that the 'end'
boundary (highest address) was either not included, included,
or was the highest addressable word, or the highest addressable byte.
This e.g. was very visible when doing:
./vg-in-place -d -d ./helgrind/tests/tc01_simple_race|&grep regi
giving
--24040:2:stacks register 0xBEDB4000-0xBEDB4FFF as stack 0
--24040:2:stacks register 0x402C000-0x4A2C000 as stack 1
showing that the main stack end was (on x86) not the highest word
but the highest byte, while for the thread 1, the registered end
was a byte not part of the stack.
The attached patch ensures that stack bounds semantic are documented and
consistent. Also, some of the stack handling code is factorised.
The convention that the patch ensures and documents is:
start is the lowest addressable byte, end is the highest addressable byte.
(the words 'min' and 'max' have been kept when already used, as this wording is
consistent with the new semantic of start/end).
In various debug log, used brackets [ and ] to make clear that
both bounds are included.
The code to guess and register the client stack was duplicated
in all the platform specific syswrap-<plat>-<os>.c files.
Code has been factorised in syswrap-generic.c
The patch has been regression tested on
x86, amd64, ppc32/64, s390x.
It has been compiled and one test run on arm64.
Not compiled/not tested on darwin, android, mips32/64, arm
More in details, the patch does the following:
coregrind/pub_core_aspacemgr.h
include/valgrind.h
include/pub_tool_machine.h
coregrind/pub_core_scheduler.h
coregrind/pub_core_stacks.h
- document start/end semantic in various functions
also in pub_tool_machine.h:
- replaces unclear 'bottommost address' by 'lowest address'
(unclear as stack bottom is or at least can be interpreted as
the 'functional' bottom of the stack, which is the highest
address for 'stack growing downwards').
coregrind/pub_core_initimg.h
replace unclear clstack_top by clstack_end
coregrind/m_main.c
updated to clstack_end
coregrind/pub_core_threadstate.h
renamed client_stack_highest_word to client_stack_highest_byte
coregrind/m_scheduler/scheduler.c
computes client_stack_highest_byte as the highest addressable byte
Update comments in call to VG_(show_sched_status)
coregrind/m_machine.c
coregrind/m_stacktrace.c
updated to client_stack_highest_byte, and switched
stack_lowest/highest_word to stack_lowest/highest_byte accordingly
coregrind/m_stacks.c
clarify semantic of start/end,
added a comment to indicate why we invert start/end in register call
(note that the code find_stack_by_addr was already assuming that
end was included as the checks were doing e.g.
sp >= i->start && sp <= i->end
coregrind/pub_core_clientstate.h
coregrind/m_clientstate.c
renames Addr VG_(clstk_base) to Addr VG_(clstk_start_base)
(start to indicate it is the lowest address, base suffix kept
to indicate it is the initial lowest address).
coregrind/m_initimg/initimg-darwin.c
updated to VG_(clstk_start_base)
replace unclear iicii.clstack_top by iicii.clstack_end
updated clstack_max_size computation according to both bounds included.
coregrind/m_initimg/initimg-linux.c
updated to VG_(clstk_start_base)
updated VG_(clstk_end) computation according to both bounds included.
replace unclear iicii.clstack_top by iicii.clstack_end
coregrind/pub_core_aspacemgr.h
extern Addr VG_(am_startup) : clarify semantic of the returned value
coregrind/m_aspacemgr/aspacemgr-linux.c
removed a copy of a comment that was already in pub_core_aspacemgr.h
(avoid double maintenance)
renamed unclear suggested_clstack_top to suggested_clstack_end
(note that here, it looks like suggested_clstack_top was already
the last addressable byte)
* factorisation of the stack guessing and registration causes
mechanical changes in the following files:
coregrind/m_syswrap/syswrap-ppc64-linux.c
coregrind/m_syswrap/syswrap-x86-darwin.c
coregrind/m_syswrap/syswrap-amd64-linux.c
coregrind/m_syswrap/syswrap-arm-linux.c
coregrind/m_syswrap/syswrap-generic.c
coregrind/m_syswrap/syswrap-mips64-linux.c
coregrind/m_syswrap/syswrap-ppc32-linux.c
coregrind/m_syswrap/syswrap-amd64-darwin.c
coregrind/m_syswrap/syswrap-mips32-linux.c
coregrind/m_syswrap/priv_syswrap-generic.h
coregrind/m_syswrap/syswrap-x86-linux.c
coregrind/m_syswrap/syswrap-s390x-linux.c
coregrind/m_syswrap/syswrap-darwin.c
coregrind/m_syswrap/syswrap-arm64-linux.c
Some files to look at more in details:
syswrap-darwin.c : the handling of sysctl(kern.usrstack) looked
buggy to me, and has probably be made correct by the fact that
VG_(clstk_end) is now the last addressable byte. However,unsure
about this, as I could not find any documentation about
sysctl(kern.usrstack). I only find several occurences on the web,
showing that the result of this is page aligned, which I guess
means it must be 1+ the last addressable byte.
syswrap-x86-darwin.c and syswrap-amd64-darwin.c
I suspect the code that was computing client_stack_highest_word
was wrong, and the patch makes it correct.
syswrap-mips64-linux.c
not sure what to do for this code. This is the only code
that was guessing the stack differently from others.
Kept (almost) untouched. To be discussed with mips maintainers.
coregrind/pub_core_libcassert.h
coregrind/m_libcassert.c
* void VG_(show_sched_status):
renamed Bool valgrind_stack_usage to Bool stack_usage
if stack_usage, shows both the valgrind stack usage and
the client stack boundaries
coregrind/m_scheduler/scheduler.c
coregrind/m_gdbserver/server.c
coregrind/m_gdbserver/remote-utils.c
Updated comments in callers to VG_(show_sched_status)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14392
On mips platforms the second cacheflush parameter is the number
of bytes in cache that needs to be flushed. When we are discarding
translation we need to use this number instead of:
((ULong) ARG2) - ((ULong) ARG1) + 1ULL
This patch also include syscall wrapper for __NR_sigaction on mips32.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13707
Add cases for PTRACE_GETREGSET in PRE(sys_ptrace) and POST(sys_ptrace).
This fixes memcheck/tests/linux/getregset on MIPS64 platforms with kernel
that supports ptrace call with PTRACE_GETREGSET.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13418
Enable wrappers for syscalls prlimit64, process_vm_readv, process_vm_writev,
needed by the following tests:
- none/tests/rlimit64_nofile and
- none/tests/process_vm_readv_writev.
The change also adds definitions for several system calls for MIPS64.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13400
Enable wrappers on mips64 for the following calls:
- dup3
- accept4
- epoll_create1
- timerfd_settime
- newfstatat
Also, allow additional flock64 values in sys_fcntl for mips64.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13358
Almost mechanical transformation, removes > 1000 SLOC.
Compiled and regtested on amd64/x86/mips32
Compiled and (somewhat) tested on mips64
Compiled on arm
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13302
Necessary changes to Valgrind to support MIPS64LE on Linux.
Minor cleanup/style changes embedded in the patch as well.
The change corresponds to r2687 in VEX.
Patch written by Dejan Jevtic and Petar Jovanovic.
More information about this issue:
https://bugs.kde.org/show_bug.cgi?id=313267
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13292