After working on an issue that turns out to seem to be with the
FreeBSD kernel sched_uler I played a lot with the Valgrind
syscall and scheduler code. I've kept the comments and the
reformatting.
These concern auxv, swapoff and fcntl F_KINFO
I wanted to use the new fcntl K_INFO to replace the existing
horrible implementation of resolve_filename, but it seems to
have change the behaviour for redirected files. Several
fdleak regtests fail because stdout resolves to an empty
string.
memfd_secret is a new syscall in linux 5.14. memfd_secret() is
disabled by default and a command-line option needs to be added to
enable it at boot time.
$ cat /proc/cmdline
[...] secretmem.enable=y
https://bugs.kde.org/451878https://lwn.net/Articles/865256/
Valgrind fork+execs debuginfod-find in order to perform debuginfod
queries. Any SIGCHLD debuginfod-find sends upon termination can
mistakenly be delivered to the client running under valgrind.
To prevent this, record in a hash table the PID of each process
valgrind forks for internal use. Do not send SIGCHLD to the client
if it is from a PID in this hash table.
https://bugs.kde.org/show_bug.cgi?id=445011
This implements rseq for amd64, arm, arm64, ppc32, ppc64,
s390x and x86 linux as ENOSYS (without warning).
glibc will start using rseq to accelerate sched_getcpu, if
available. This would cause a warning from valgrind every
time a new thread is started.
Real rseq (restartable sequences) support is pretty hard, so
for now just explicitly return ENOSYS (just like we do for clone3).
https://sourceware.org/pipermail/libc-alpha/2021-December/133656.html
Adds syscall wrappers for __specialfd and __realpathat.
Also remove kernel dependency on COMPAT_FREEBSD10.
This change also reorganizes somewhat the scalar test
and adds configure time checks for the FreeBSD version,
allowing regression tests to be compiled depending on the
FreeBSD release.
From now on, scalar.c will contain syscalls for FreeBSD 11 and 12
and subsequent releases will get their own scalar, starting with
scalar_13_plus.c.
I tried to test drd/tests/pth_mutex_signal on Solaris
(you never know) but encountered a missing syscall
wrapper. So this adds a very basic wrapper for lwp_mutex_unlock.
Also update a Solaris expected that I missed amongst the FreeBSD changes.
Implement BPF_MAP_LOOKUP_AND_DELETE_ELEM (command 21) and
BPF_MAP_FREEZE (command 22) and produce a WARNING instead of a fatal
error for unrecognized BPF commands.
https://bugs.kde.org/show_bug.cgi?id=426148
This is a system call introduced in Linux 5.9.
It's typically used to bulk-close file descriptors that a process inherited
without having desired so and doesn't want to pass them to its offspring
for security reasons. For this reason the sensible upper limit value tends
to be unknown and the users prefer to stay on the safe side by setting it
high.
This is a bit peculiar because, if unfiltered, the syscall could end up
closing descriptors Valgrind uses for its purposes, ending in no end of
mayhem and suffering.
This patch adjusts the upper bounds to a safe value and then skips over
the descriptor Valgrind uses by potentially calling the real system call
with sub-ranges that are safe to close.
The call can fail on negative ranges and bad flags -- we're dealing with
the first condition ourselves while letting the real call fail on bad
flags.
https://bugs.kde.org/show_bug.cgi?id=439090
Make the STFLE instruction report the miscellaneous-instruction-extensions
facility 3 and the vector-enhancements facility 2 as supported. Indicate
support for the latter in the HWCAP vector as well.
glibc 2.34 will try to use clone3 first before falling back to
the clone syscall. So implement clone3 as sys_ni_syscall which
simply return ENOSYS without producing a warning.
https://bugs.kde.org/show_bug.cgi?id=439590
Implement the new instructions/features that were added to z/Architecture
with the vector-enhancements facility 1. Also cover the instructions from
the vector-packed-decimal facility that are defined outside the chapter
"Vector Decimal Instructions", but not the ones from that chapter itself.
For a detailed list of newly supported instructions see the updates to
`docs/internals/s390-opcodes.csv'.
Since the miscellaneous instruction extensions facility 2 was already
addressed by Bug 404406, this completes the support necessary to run
general programs built with `--march=z14' under Valgrind. The
vector-packed-decimal facility is currently not exploited by the standard
toolchain and libraries.
faccessat2 is a new syscall in linux 5.8 and will be used by glibc 2.33.
faccessat2 is simply faccessat with a new flag argument. It has
a common number across all linux arches.
https://bugs.kde.org/427787
I've tested this on amd64 and arm but I'm enabling it on all
arches since the syscall should work identically on all of them.
This was requested by users for a long time (almost 5 years) and
in fact, some programs (like libvirt) use namespaces and fork off
to enter other namespaces. Lack of implementation means valgrind
can't be used with these programs (or their configuration must be
changed to not use namespaces, which defeats the purpose).
Without knowing it, I've converged to same patch as mentioned in
bugs below.
https://bugs.kde.org/show_bug.cgi?id=343099https://bugs.kde.org/show_bug.cgi?id=368923https://bugs.kde.org/show_bug.cgi?id=369031
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
[get|set]rlimit system calls are becoming deprecated.
Coregrind should use prlimit64 as the first candidate in order to
achieve "rlimit" functionality.
There are also systems that do not even support older "rlimits".
Modify the previously added support VG_(getrlimit) and VG_(setrlimit)
using __NR_prlimit64 by making it similar to the glibc implementation.
It fixes none/tests/stackgrowth and none/tests/sigstackgrowth
tests on nanoMIPS.
Patch by: Stefan Maksimovic and Aleksandar Rikalo
This patch resolves KDE #416285.
This patch adds sycall wrappers for the following syscalls which
use a 64bit time_t on 32bit arches: gettime64, settime64,
clock_getres_time64, clock_nanosleep_time64, timer_gettime64,
timer_settime64, timerfd_gettime64, timerfd_settime64,
utimensat_time64, pselect6_time64, ppoll_time64, recvmmsg_time64,
mq_timedsend_time64, mq_timedreceive_time64, semtimedop_time64,
rt_sigtimedwait_time64, futex_time64 and sched_rr_get_interval_time64.
Still missing are clock_adjtime64 and io_pgetevents_time64.
For the more complicated syscalls futex[_time64], pselect6[_time64]
and ppoll[_time64] there are shared pre and/or post helper functions.
Other functions just have their own PRE and POST handler.
Note that the vki_timespec64 struct really is the struct as used by
by glibc (it internally translates a 32bit timespec struct to a 64bit
timespec64 struct before passing it to any of the time64 syscalls).
The kernel uses a 64-bit signed int, but is ignoring the upper 32 bits
of the tv_nsec field. It does always write the full struct though.
So avoid checking the padding is only needed for PRE_MEM_READ.
There are two helper pre_read_timespec64 and pre_read_itimerspec64
to check the new structs.
https://bugs.kde.org/show_bug.cgi?id=416753
Necessary changes to support nanoMIPS on Linux.
Part 4/4 - Other changes (mainly include/*)
Patch by Aleksandar Rikalo, Dimitrije Nikolic, Tamara Vlahovic,
Nikola Milutinovic and Aleksandra Karadzic.
Related KDE issue: #400872.
As it turned out, the size of vki_siginfo_t is incorrect on these 64-bit
architectures:
(gdb) p sizeof(vki_siginfo_t)
$1 = 136
(gdb) ptype struct vki_siginfo
type = struct vki_siginfo {
int si_signo;
int si_errno;
int si_code;
union {
int _pad[29];
struct {...} _kill;
struct {...} _timer;
struct {...} _rt;
struct {...} _sigchld;
struct {...} _sigfault;
struct {...} _sigpoll;
} _sifields;
}
It looks like that for this architecture, __VKI_ARCH_SI_PREAMBLE_SIZE
hasn't been defined properly, which resulted in incorrect
VKI_SI_PAD_SIZE calculation (29 instead of 28).
<6a9e4> DW_AT_name : (indirect string, offset: 0xcf59): _sifields
<6a9ef> DW_AT_data_member_location: 16
This issue has been discovered with strace's "make check-valgrind-memcheck",
which produced false out-of-bounds writes on ptrace(PTRACE_GETSIGINFO) calls:
SYSCALL[24264,1](101) sys_ptrace ( 16898, 24283, 0x0, 0x606bd40 )
==24264== Syscall param ptrace(getsiginfo) points to unaddressable byte(s)
==24264== at 0x575C06E: ptrace (ptrace.c:45)
==24264== by 0x443244: next_event (strace.c:2431)
==24264== by 0x443D30: main (strace.c:2845)
==24264== Address 0x606bdc0 is 0 bytes after a block of size 144 alloc'd
(Note that the address passed is 0x606bd40 and the address reported is
0x606bdc0).
After the patch, no such errors observed.
* include/vki/vki-amd64-linux.h [__x86_64__ && __ILP32__]
(__vki_kernel_si_clock_t): New typedef.
[__x86_64__ && __ILP32__] (__VKI_ARCH_SI_CLOCK_T,
__VKI_ARCH_SI_ATTRIBUTES): New macros.
[__x86_64__ && !__ILP32__] (__VKI_ARCH_SI_PREAMBLE_SIZE): New macro,
define to 4 ints.
* include/vki/vki-arm64-linux.h (__VKI_ARCH_SI_PREAMBLE_SIZE): Likewise.
* include/vki/vki-ppc64-linux.h [__powerpc64__] (__VKI_ARCH_SI_PREAMBLE_SIZE):
Likewise.
* include/vki/vki-linux.h [!__VKI_ARCH_SI_CLOCK_T]
(__VKI_ARCH_SI_CLOCK_T): New macro, define to vki_clock_t.
[!__VKI_ARCH_SI_ATTRIBUTES] (__VKI_ARCH_SI_ATTRIBUTES): New macro,
define to nil.
(struct vki_siginfo): Use __VKI_ARCH_SI_CLOCK_T type for _utime and
_stime fields. Add __VKI_ARCH_SI_ATTRIBUTES.
Resolves: https://bugs.kde.org/show_bug.cgi?id=405201
Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com>
*STAT* system calls other than statx are becoming deprecated.
Coregrind should use statx as the first candidate in order to achieve
"stat" functionality.
There are also systems that do not even support older "stats".
This fixes KDE #400593.
Patch by Aleksandar Rikalo.
Introduce new header files for the system call numbers that are shared
across all Linux architectures and also for the system call numbers that
are shared across all 32-bit architectures.
Sync VEX/LICENSE.GPL with top-level COPYING file. We used 3 different
addresses for writing to the FSF to receive a copy of the GPL. Replace
all different variants with an URL <http://www.gnu.org/licenses/>.
The following files might still have some slightly different (L)GPL
copyright notice because they were derived from other programs:
- files under coregrind/m_demangle which come from libiberty:
cplus-dem.c, d-demangle.c, demangle.h, rust-demangle.c,
safe-ctype.c and safe-ctype.h
- coregrind/m_demangle/dyn-string.[hc] derived from GCC.
- coregrind/m_demangle/ansidecl.h derived from glibc.
- VEX files for FMA detived from glibc:
host_generic_maddf.h and host_generic_maddf.c
- files under coregrin/m_debuginfo derived from LZO:
lzoconf.h, lzodefs.h, minilzo-inl.c and minilzo.h
- files under coregrind/m_gdbserver detived from GDB:
gdb/signals.h, inferiors.c, regcache.c, regcache.h,
regdef.h, remote-utils.c, server.c, server.h, signals.c,
target.c, target.h and utils.c
Plus the following test files:
- none/tests/ppc32/testVMX.c derived from testVMX.
- ppc tests derived from QEMU: jm-insns.c, ppc64_helpers.h
and test_isa_3_0.c
- tests derived from bzip2 (with embedded GPL text in code):
hackedbz2.c, origin5-bz2.c, varinfo6.c
- tests detived from glibc: str_tester.c, pth_atfork1.c
- test detived from GCC libgomp: tc17_sembar.c
- performance tests derived from bzip2 or tinycc (with embedded GPL
text in code): bz2.c, test_input_for_tinycc.c and tinycc.c
PTRACE_GET_THREAD_AREA is not handled by amd64 linux syswrap, which leads
to false positive errors in 64 bits program ptrace-ing 32 bits processes.
For example, the below error was wrongly reported on GDB:
==25377== Conditional jump or move depends on uninitialised value(s)
==25377== at 0x8A1D7EC: td_thr_get_info (td_thr_get_info.c:35)
==25377== by 0x526819: thread_from_lwp(thread_info*, ptid_t) (linux-thread-db.c:417)
==25377== by 0x5281D4: thread_db_notice_clone(ptid_t, ptid_t) (linux-thread-db.c:442)
==25377== by 0x51773B: linux_handle_extended_wait(lwp_info*, int) (linux-nat.c:2027)
....
==25377== Uninitialised value was created by a stack allocation
==25377== at 0x69A360: x86_linux_get_thread_area(int, void*, unsigned int*) (x86-linux-nat.c:278)
Fix this by implementing PTRACE_GET|SET_THREAD_AREA on amd64.
When code uses utimensat with UTIME_NOW or UTIME_OMIT valgrind memcheck
would generate a warning. But as the utimensat manpage says:
If the tv_nsec field of one of the timespec structures has the special
value UTIME_NOW, then the corresponding file timestamp is set to the
current time. If the tv_nsec field of one of the timespec structures
has the special value UTIME_OMIT, then the corresponding file timestamp
is left unchanged. In both of these cases, the value of the corre‐
sponding tv_sec field is ignored.
So ignore the timespec tv_sec when tv_nsec is set to UTIME_NOW or
UTIME_OMIT.