After working on an issue that appears to lie in the FreeBSD kernel
scheduler, I played a lot with the Valgrind syscall and scheduler
code. I've kept the comments and the reformatting.
Currently debuginfod is enabled in Valgrind when the $DEBUGINFOD_URLS
environment variable is set and disabled when it isn't set.
This patch adds an --enable-debuginfod=<yes|no> command line option
to provide another level of control over whether Valgrind attempts
to download debuginfo. "yes" is the default value.
$DEBUGINFOD_URLS must still contain debuginfod server URLs in order
for this feature to work when --enable-debuginfod=yes.
https://bugs.kde.org/show_bug.cgi?id=453602
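The resulting decision can be sketched as below. This is only an
illustration of the rule described above, not the code in the patch:
debuginfod_active and its parameters are hypothetical names, and the
value of $DEBUGINFOD_URLS is passed in as a string instead of being
read from the environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper. opt_enabled mirrors --enable-debuginfod=<yes|no>
   (yes by default); urls is the value of $DEBUGINFOD_URLS, or NULL when
   the variable is not set. */
bool debuginfod_active(bool opt_enabled, const char *urls)
{
    if (!opt_enabled)
        return false;                        /* "no" always disables */
    return urls != NULL && urls[0] != '\0';  /* server URLs still required */
}

/* Small self-check covering the combinations described in the text.
   The URL is a placeholder. */
void debuginfod_active_examples(void)
{
    assert(debuginfod_active(true, "https://debuginfod.example/"));
    assert(!debuginfod_active(false, "https://debuginfod.example/"));
    assert(!debuginfod_active(true, NULL));
}
```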
These concern auxv, swapoff and fcntl F_KINFO
I wanted to use the new fcntl F_KINFO to replace the existing
horrible implementation of resolve_filename, but it seems to
have changed the behaviour for redirected files. Several
fdleak regtests fail because stdout resolves to an empty
string.
I've made these changes only for FreeBSD and Solaris for the moment.
I don't know what should be done on Linux for aligned_alloc/memalign.
The current Valgrind code reflects the glibc implementation, but not
what the documentation says.
memfd_secret is a new syscall in Linux 5.14. memfd_secret() is
disabled by default and a kernel command-line option needs to be
added to enable it at boot time.
$ cat /proc/cmdline
[...] secretmem.enable=y
https://bugs.kde.org/451878
https://lwn.net/Articles/865256/
It fixes a known issue whose details are described at [1] and more
generally it guarantees that Valgrind is properly compiled for ulibc.
[1] https://www.mail-archive.com/valgrind-users@lists.sourceforge.net/msg05295.html
Suggested-by: Michael Trimarchi <michael@amarulasolutions.com>
Co-developed-by: Michael Trimarchi <michael@amarulasolutions.com>
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
FreeBSD (and Darwin) use the carry flag for syscall status.
That means that in the assembler for do_syscall_for_client_WRK
they have a call to LibVEX_GuestAMD64_put_rflag_c (amd64) or
LibVEX_GuestX86_put_eflag_c (x86). These also call WRK functions.
The problem is that do_syscall_for_client_WRK has carefully crafted
labels corresponding to IP addresses. If a signal interrupts
proceedings, IP can be compared to these addresses so that
VG_(fixup_guest_state_after_syscall_interrupted) can work
out how to resume the syscall. But if IP is in the save
carry flag functions, the address is not recognized and
VG_(fixup_guest_state_after_syscall_interrupted) fails.
The crash in the title happens because the interrupted
syscall does not reset its status, and on the next syscall
it is expected that the status be idle.
To fix this I added global variables that get set to 1
just before calling the save carry flag functions, and cleared
just after. VG_(fixup_guest_state_after_syscall_interrupted)
can then check this and work out which section we are in
and resume the syscall correctly.
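A rough model of that check, in plain C. Everything here is a sketch
under assumptions: the Labels fields, the Section names and
classify_ip are hypothetical, and treating the flag as meaning "the
syscall has completed but its result is not yet committed" is my
reading of the description above, not a copy of the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the labelled addresses in
   do_syscall_for_client_WRK. */
typedef struct {
    unsigned long setup, restart, complete, committed, finished;
} Labels;

typedef enum {
    SEC_OUTSIDE,                /* IP not recognised at all */
    SEC_SETUP_TO_RESTART,       /* syscall not yet entered */
    SEC_AT_RESTART,             /* sitting on the syscall instruction */
    SEC_COMPLETE_TO_COMMITTED,  /* syscall done, result not yet saved */
    SEC_COMMITTED_TO_FINISHED   /* result saved */
} Section;

/* saving_cflag models the new global: non-zero only while the
   save-carry-flag helper is running, when IP lies outside the labels. */
Section classify_ip(const Labels *l, unsigned long ip, int saving_cflag)
{
    assert(l != NULL);
    if (saving_cflag)
        return SEC_COMPLETE_TO_COMMITTED;
    if (ip >= l->setup && ip < l->restart)
        return SEC_SETUP_TO_RESTART;
    if (ip == l->restart)
        return SEC_AT_RESTART;
    if (ip >= l->complete && ip < l->committed)
        return SEC_COMPLETE_TO_COMMITTED;
    if (ip >= l->committed && ip < l->finished)
        return SEC_COMMITTED_TO_FINISHED;
    return SEC_OUTSIDE;
}
```

Without the flag, an IP inside the helper falls through to
SEC_OUTSIDE, which corresponds to the failure described above.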
Also:
Start a new NEWS section for 3.20
Add a regtest for this and also a similar one for Bug 445032
(x86-freebsd only, new subdir).
I saw that this problem also probably exists with macOS, so I made
the same changes there (not yet tested)
Found this by testing the Solaris execx (the bits that are
Linux-compatible) test. That was giving
--28286-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--28286-- si_code=2; Faulting address: 0x4A0095A; sp: 0x1002ca9c88
valgrind: the 'impossible' happened:
Killed by fatal signal
host stacktrace:
==28286== at 0x5803DE54: vgPlain_strcpy (m_libcbase.c:309)
==28286== by 0x5810A9B3: vgSysWrap_linux_sys_execveat_before (syswrap-linux.c:13310)
==28286== by 0x580953C9: vgPlain_client_syscall (syswrap-main.c:2234)
It's a mistake to copy the path obtained with VG_(resolve_filename) to
the client ARG2, since it's unlikely to have space for the path.
Instead just copy the pointer.
ht_sigchld_ignore and ht_ignore_node were defined in pub_core_signals.h
which cannot include any other tool header.
...checking header files and include directives
*** File coregrind/pub_core_signals.h must not include pub_tool_hashtable.h
So move the definition and type to pub_tool_signals.h
Valgrind fork+execs debuginfod-find in order to perform debuginfod
queries. Any SIGCHLD debuginfod-find sends upon termination can
mistakenly be delivered to the client running under valgrind.
To prevent this, record in a hash table the PID of each process
valgrind forks for internal use. Do not send SIGCHLD to the client
if it is from a PID in this hash table.
https://bugs.kde.org/show_bug.cgi?id=445011
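The idea can be sketched with a tiny fixed-size PID set. This is a
simplified stand-in, not the Valgrind hash table: the function names
and the 64-slot table are hypothetical and chosen only to keep the
example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

#define N_SLOTS 64                     /* illustrative capacity */

static pid_t internal_pids[N_SLOTS];   /* 0 marks an empty slot */

/* Remember a child forked for internal use. Returns false if full. */
bool record_internal_child(pid_t pid)
{
    assert(pid > 0);
    for (unsigned i = 0; i < N_SLOTS; i++) {
        unsigned slot = ((unsigned)pid + i) % N_SLOTS;  /* linear probing */
        if (internal_pids[slot] == 0 || internal_pids[slot] == pid) {
            internal_pids[slot] = pid;
            return true;
        }
    }
    return false;
}

bool is_internal_child(pid_t pid)
{
    if (pid <= 0)
        return false;
    for (unsigned i = 0; i < N_SLOTS; i++) {
        unsigned slot = ((unsigned)pid + i) % N_SLOTS;
        if (internal_pids[slot] == pid)
            return true;
        if (internal_pids[slot] == 0)
            return false;              /* hit an empty slot: not recorded */
    }
    return false;
}

/* A SIGCHLD reaches the client only if its sender is not one of ours. */
bool should_deliver_sigchld(pid_t sender_pid)
{
    return !is_internal_child(sender_pid);
}
```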
For execve valgrind would silently fail when argv was NULL or
unaddressable. Make sure that this produces a warning under memcheck.
The Linux kernel accepts argv[0] being NULL, but most other kernels
don't, since POSIX says it should be non-NULL and it causes argc to
be zero, which is unexpected and might cause security issues.
This adjusts some testcases so they don't rely on execve succeeding
when argv is NULL and expect warnings about argv or argv[0] being
NULL or unaddressable.
https://bugs.kde.org/show_bug.cgi?id=450437
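The two NULL cases can be told apart as in the sketch below. The names
are hypothetical, and addressability cannot be checked in portable C,
so only the NULL checks are modelled here.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ARGV_OK, ARGV_IS_NULL, ARGV0_IS_NULL } ArgvStatus;

/* Hypothetical classifier for the argv argument of execve. */
ArgvStatus check_execve_argv(char *const argv[])
{
    if (argv == NULL)
        return ARGV_IS_NULL;    /* no vector at all */
    if (argv[0] == NULL)
        return ARGV0_IS_NULL;   /* empty vector: argc would be zero */
    return ARGV_OK;
}

/* Small self-check of the three cases. */
void check_execve_argv_examples(void)
{
    char *good[]  = { "prog", NULL };
    char *empty[] = { NULL };
    assert(check_execve_argv(good) == ARGV_OK);
    assert(check_execve_argv(empty) == ARGV0_IS_NULL);
    assert(check_execve_argv(NULL) == ARGV_IS_NULL);
}
```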
For BPF_RAW_TRACEPOINT_OPEN attr->raw_tracepoint.name may be NULL.
Otherwise it should point to a valid (max 128 char) string. Only
raw_tracepoint.prog_fd needs to be set.
https://bugs.kde.org/show_bug.cgi?id=451626
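A sketch of that rule for the name field. The helper name is
hypothetical, and I am assuming the limit means the terminating NUL
must appear within the first 128 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TP_NAME 128   /* limit quoted in the text */

/* Hypothetical check: NULL is acceptable; otherwise the string must
   be NUL-terminated within MAX_TP_NAME bytes (assumption). */
bool raw_tracepoint_name_ok(const char *name)
{
    if (name == NULL)
        return true;
    for (size_t i = 0; i < MAX_TP_NAME; i++)
        if (name[i] == '\0')
            return true;
    return false;
}

/* Small self-check. */
void raw_tracepoint_name_examples(void)
{
    assert(raw_tracepoint_name_ok(NULL));
    assert(raw_tracepoint_name_ok("sched_switch"));
}
```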
The check for the scv instruction in coregrind/m_machine.c issues an scv
instruction and uses SIGILL to determine if the instruction is supported.
Issuing scv on systems that don't support scv, i.e. scv support is not in
HWCAPS2, generates a message in dmesg "Facility 'SCV' unavailable (12),
exception".
This patch removes the SIGILL based scv instruction test from
coregrind/m_machine.c. The scv support is now determined by reading the
HWCAPS2 in setup_client_stack(). VG_(machine_ppc64_set_scv_support) is
called to set the flag ppc_scv_supported in struct VexArchInfo.
The allow_scv flag is added in disInstr_PPC_WRK. The allow_scv flag is
used to ensure the host has support for scv before generating the iops for
the scv instruction.
On s390x Linux platforms the sys_ipc semtimedop call has four instead of
five parameters, where the timeout is passed in the third instead of the
fifth.
Reflect this difference in the handling of VKI_SEMTIMEDOP.
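The difference in argument position can be sketched like this. IpcArgs
and the helper are hypothetical; the field names follow the usual
naming of the sys_ipc parameters after the call number, which is an
assumption about how they are referred to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The parameters that follow the call number in sys_ipc. */
typedef struct {
    unsigned long first, second, third, ptr, fifth;
} IpcArgs;

/* Pick the parameter that carries the semtimedop timeout. */
unsigned long semtimedop_timeout_arg(const IpcArgs *a, bool is_s390x)
{
    assert(a != NULL);
    return is_s390x ? a->third : a->fifth;
}
```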
Update the libiberty demangler using the auxprogs/update-demangler
script to gcc git commit d3b2ead595467166c849950ecd3710501a5094d9.
This update includes:
- libiberty rust-demangle, ignore .suffix
- libiberty: Fix infinite recursion in rust demangler
- Update copyright years
- libiberty: support digits in cpp mangled clone names
- d-demangle: properly skip anonymous symbols
- d-demangle: remove parenthesis where it is not needed
In POST(sys_io_uring_setup) we tried to use record_fd_open_with_given_name
with ARG1 as name. But ARG1 isn't a char pointer. So this might crash with
--track-fds=yes. Since no (file) name is associated with the fd returned by
io_uring_setup use record_fd_open_nameless instead.
https://bugs.kde.org/show_bug.cgi?id=449838
Patch contributed by Will Schmidt <will_schmidt@vnet.ibm.com>
This problem was initially reported by Tulio, who assisted me in
identifying the underlying issue here.
This was discovered on a Power10, and occurs since the ISA 3.1 support
check uses the brh instruction via a hardcoded ".long 0x7f1401b6" asm stanza.
That encoding writes to r20, and since the stanza does not contain a clobber
the compiler did not know to save or restore that register upon entry or exit.
The junk value remaining in r20 subsequently caused a segfault.
This patch adds clobber masks to the instruction stanzas, as well as
updates the associated comments to clarify which registers are being
used.
As part of this change I've also
- updated the .long for the cnttzw instruction to write to r20, and
zeroed the reserved bits from that instruction so it is properly
decoded by the disassembler.
- updated the .long for the dadd instruction to write to f0.
I've inspected the current codegen with these changes in place, and
confirm that r20 is now saved and restored on entry and exit from the
machine_get_hwcaps() function.
bugzilla 447995 Valgrind segfault on power10 due to hwcap checking code
This implements rseq for amd64, arm, arm64, ppc32, ppc64,
s390x and x86 linux as ENOSYS (without warning).
glibc will start using rseq to accelerate sched_getcpu, if
available. This would cause a warning from valgrind every
time a new thread is started.
Real rseq (restartable sequences) support is pretty hard, so
for now just explicitly return ENOSYS (just like we do for clone3).
https://sourceware.org/pipermail/libc-alpha/2021-December/133656.html
Adds syscall wrappers for __specialfd and __realpathat.
Also remove kernel dependency on COMPAT_FREEBSD10.
This change also reorganizes somewhat the scalar test
and adds configure time checks for the FreeBSD version,
allowing regression tests to be compiled depending on the
FreeBSD release.
From now on, scalar.c will contain syscalls for FreeBSD 11 and 12
and subsequent releases will get their own scalar, starting with
scalar_13_plus.c.
Also make drd/tests/shared_timed_mutex more robust
Using time delays is already not great, but the test seems
to fail intermittently due to spurious wakeups. So instead
of failing straight away, make it "three strikes and you're out".
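One reading of "three strikes" is sketched below in C: the check is
retried and only counts as failed after three failed attempts. The
helper names are hypothetical and the real test may count strikes
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STRIKES 3

/* Run attempt() up to MAX_STRIKES times; pass as soon as one succeeds. */
bool passes_within_three_strikes(bool (*attempt)(void *), void *ctx)
{
    assert(attempt != NULL);
    for (int strike = 0; strike < MAX_STRIKES; strike++)
        if (attempt(ctx))
            return true;
    return false;   /* three failures: out */
}

/* Example attempt: fails until *remaining_failures reaches zero,
   standing in for a timing check upset by a spurious wakeup. */
bool flaky_attempt(void *ctx)
{
    int *remaining_failures = ctx;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```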
Newer Linux kernels on s390x may use the vDSO as a "trampoline" for
syscall restart. This means that the vDSO is no longer optional, and
unmapping it may lead to a segmentation fault when a system call restart
is performed.
So far Valgrind has been unmapping the vDSO on s390x. Just don't do this
anymore.
This patch rewrites the Level 2 origin-tracking cache (ocacheL2) so that
set-address-range-permissions (SARP) operations on it, for large ranges, are
at least 2.5x faster. This is primarily targeted at SARPs in the
range of hundreds to thousands of megabytes. The Level 1 origin-tracking
cache covers 64MB address space, so SARPs that fit within it are mostly
unaffected. There are extensive comments in-line. Changes are:
* Change the Level 2 cache from a single AVL tree (OSet) into 4096 such trees,
selected by middle bits of the tag, hence "taking out" 12 significant bits
of search in any given tree.
* For the OCacheLine type, use a union so as to overlay the w32 and descr
arrays with an array of 64-bit values. This is used to speed up cases where
those fields are to be set to zero, or checked against zero.
* Due to the various fast-paths added by this patch, OC_BITS_PER_LINE has
pretty much been frozen at the current value, 5.
* ocache_sarp_Set_Origins, ocache_sarp_Clear_Origins: deal with large ranges
in 32-byte steps instead of 4-byte steps.
* MC_(helperc_b_store32), MC_(helperc_b_store16): rewrite these to be (much)
more efficient.
* fast-return cases for VG_(OSetGen_Lookup) and VG_(OSetGen_Remove) when the
tree is empty
* a few extra inline hints
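The first two changes can be sketched as follows. This is an
illustration under assumptions, not the code in the patch: the field
layout (8 w32 words plus 8 descr bytes per 32-byte line), the exact
tag bits used to pick a tree, and the helper names are all mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OC_BITS_PER_LINE 5                              /* from the text */
#define OC_W32S_PER_LINE (1 << (OC_BITS_PER_LINE - 2))  /* 8: assumption */
#define N_L2_TREES       4096                           /* from the text */
#define N_W64S           5    /* 8*4 + 8*1 = 40 bytes = 5 x 64 bits */

typedef struct {
    uintptr_t tag;
    union {
        struct {
            uint32_t w32[OC_W32S_PER_LINE];
            uint8_t  descr[OC_W32S_PER_LINE];
        } fields;
        uint64_t w64[N_W64S];   /* overlay used for bulk zero/compare */
    } u;
} OCacheLine;

/* Select one of the 4096 trees. Which 12 "middle" bits are used is an
   assumption: here, the ones just above the line offset. */
unsigned l2_tree_index(uintptr_t tag)
{
    return (unsigned)((tag >> OC_BITS_PER_LINE) & (N_L2_TREES - 1));
}

/* Zero w32 and descr with 5 stores instead of 16. */
void line_clear(OCacheLine *line)
{
    for (int i = 0; i < N_W64S; i++)
        line->u.w64[i] = 0;
}

/* Check w32 and descr against zero with 5 loads instead of 16. */
bool line_is_all_zero(const OCacheLine *line)
{
    assert(sizeof line->u.fields == sizeof line->u.w64);
    uint64_t acc = 0;
    for (int i = 0; i < N_W64S; i++)
        acc |= line->u.w64[i];
    return acc == 0;
}
```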
gdb considers FreeBSD SIGTHR to be the equivalent of SIGLWP,
not a signal in its own right. Remove the extra enum entry
(which fixes errors in converting signals from number to
string) and map TARGET_SIGNAL_LWP to SIGTHR.
Leaving it in place for 11 (which is now EOL) and 12 - not
worth the complexity for them. Improve comment for suppression.
Also add a pointer to the illumos source web page for lwp_unlock_mutex
in case the syswrap ever needs improving.
I tried to test drd/tests/pth_mutex_signal on Solaris
(you never know) but encountered a missing syscall
wrapper. So this adds a very basic wrapper for lwp_mutex_unlock.
Also update a Solaris expected that I missed amongst the FreeBSD changes.
This was broken by commit 75e3ef0f3 "readdwarf3: Skip units without
addresses when looking for inlined functions". Specifically by this
part: "Also use skip_DIE instead of read_DIE when not parsing
(skipping) children"
rustc puts concrete function instances in namespaces (which is
allowed in DWARF since there is no strict separation between type
declarations and program scope entries in a DIE tree). The inline
parser didn't expect this and so skipped any DIE under a namespace
entry. This wasn't an issue before because "skipping" a DIE tree was
done by reading it, so it wasn't actually skipped. But now that we
really skip the DIE (sub)tree (which is faster than actually parsing
it) some entries were missed in the rustc case.
https://bugs.kde.org/show_bug.cgi?id=445668
It's currently broken due to a silly test that prevents the v0
demangling code from even running.
The commit also adds a test, to avoid such problems in the future.
The problem was that 'struct sigframe' has both a uContext struct
member and a puContext pointer to that struct. And puContext wasn't
being initialized to point to uContext.
It seems that the pthread sigreturn code uses puContext on i386.
amd64, with register arguments, didn't have this problem.
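In miniature, the missing step looks like this. The struct below is a
stripped-down stand-in for 'struct sigframe' with a placeholder
context type; only the two member names come from the description
above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { unsigned long regs[4]; } UContext;  /* placeholder type */

struct sigframe_sketch {
    UContext  uContext;    /* context stored in the frame itself */
    UContext *puContext;   /* pointer read by the sigreturn path */
};

/* Make puContext refer to the frame's own uContext. */
void init_sigframe(struct sigframe_sketch *frame)
{
    assert(frame != NULL);
    frame->puContext = &frame->uContext;   /* the previously missing step */
}
```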
Depending on architecture glibc has various functions that set things
up to call "main". glibc 2.34 added __libc_start_call_main (at least
on ppc64le and s390x). Other variants recognized are __libc_start_main,
generic_start_main and variants of those names.
This fixes the massif/tests/deep-D and massif/tests/mmapunmap on ppc64le.
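A sketch of such name matching. The three base names come from the
text above; using a prefix match to cover "variants of those names" is
an assumption, and is_main_caller is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if fn looks like one of the glibc functions that call main. */
bool is_main_caller(const char *fn)
{
    static const char *const bases[] = {
        "__libc_start_call_main",
        "__libc_start_main",
        "generic_start_main",
    };
    assert(fn != NULL);
    for (size_t i = 0; i < sizeof bases / sizeof bases[0]; i++)
        if (strncmp(fn, bases[i], strlen(bases[i])) == 0)
            return true;   /* exact name or a suffixed variant */
    return false;
}
```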