Currently debuginfod is enabled in Valgrind when the $DEBUGINFOD_URLS
environment variable is set and disabled when it isn't set.
This patch adds an --enable-debuginfod=<yes|no> command line option
to provide another level of control over whether Valgrind attempts
to download debuginfo. "yes" is the default value.
$DEBUGINFOD_URLS must still contain debuginfod server URLs in order
for this feature to work when --enable-debuginfod=yes.
https://bugs.kde.org/show_bug.cgi?id=453602
This was broken by commit 75e3ef0f3 "readdwarf3: Skip units without
addresses when looking for inlined functions". Specifically by this
part: "Also use skip_DIE instead of read_DIE when not parsing
(skipping) children"
rustc puts concrete function instances in namespaces (which is
allowed in DWARF since there is no strict separation between type
declarations and program scope entries in a DIE tree), the inline
parser didn't expect this and so skipped any DIE under a namespace
entry. This wasn't an issue before because "skipping" a DIE tree was
done by reading it, so it wasn't actually skipped. But now that we
really skip the DIE (sub)tree (which is faster than actually parsing
it) some entries were missed in the rustc case.
https://bugs.kde.org/show_bug.cgi?id=445668
Depending on architecture glibc has various functions that set things
up to call "main". glibc 2.34 added __libc_start_call_main (at least
on ppc64le and s390x). Other variants recognized are __libc_start_main,
generic_start_main and variants of those names.
This fixes the massif/tests/deep-D and massif/tests/mmapunmap on ppc64le.
With the inline parser often a lot of DIEs are skipped, so reading
all abbrevs up front wastes time and memory. A lot of time and memory
can be saved by reading the abbrevs on demand. Do this by introducing
an abbv_state that is used to keep track of the abbrevs already read.
This does technically make the CUConst struct not const.
Instead of destroying the ht_abbrvs after processing a CU save it
and the offset so it can be reused for the next CU if that happens
to have the same abbrev offset. dwz compressed DWARF often reuse
the same abbrev for multiple CUs.
Both the var parser and the inl parser kept a fndn_ix_Table.
Initialize only one per debuginfo read pass and reuse if the stmt offset
is the same as last time (CUs can share the same line table and alt
files do share one for all units).
When a unit doesn't cover any addresses skip it because no actual code
will be inside. Also use skip_DIE instead of read_DIE when not parsing
(skipping) children.
With glibc 2.34 we might see the _start symbol as the frame that
called main instead of directly after __libc_start_main or
generic_start_main.
Fixes memcheck/tests/badjump[2], memcheck/tests/origin4-many,
helgrind/tests/tc04_free_lock, helgrind/tests/tc09_bad_unlock
and helgrind/tests/tc20_verifywrap.
move_CEnt_to_top is on the hot path when reading large amounts of debug info,
especially Dwarf inlined-function info. It shows up in 'perf' profiles. This
commit removes assertions which are asserted elsewhere, and tries to avoid a
couple of conditional branches.
When copy_convert_CfiExpr_tree sees a DwReg on arm64 we simply call
I_die_here; This causes an issue in the case we really do have to handle
that case (see https://bugzilla.redhat.com/show_bug.cgi?id=1923493).
Handle the stack pointer (sp), link register (x30) and frame pointer (x29),
which we already keep in D3UnwindRegs, like we do for other architectures
in evalCfiExpr and copy_convert_CfiExpr_tree.
https://bugs.kde.org/show_bug.cgi?id=433898
The VG_(get_fnname_kind) function detects some special "below main"
function names. Specifically __libc_start_main and generic_start_main
both of which are used to call the actual main () function from the
application. We already recognized one variant, generic_start_main.isra.0,
but only for powerpc. Recognize all possibly specialed optimized variants
gcc can produce by simply checking for the function name with dot as
prefix. This fixes the memcheck/tests/supp_unknown.vgtest and
massif/tests/deep-D.vgtest with gcc 11.
We can now also get rid of the special cases in
massif/tests/deep-D.post.exp-ppc64 and memcheck/tests/supp_unknown.supp.
https://bugs.kde.org/show_bug.cgi?id=430158
Skip some stuff when seeing an unknown language, be less chatty about
parser issues.
All the issues seem to come from the multi-file, that is the shared
(supplementary or alt) file containing debuginfo shared by all the
gcc/runtime libraries.
There are a couple of issues that this patch works around:
- The multifile contains entries for the 'D' language, which has some
constructs we don't expect.
- We don't read partial units correctly, which means we often don't know
the language we are looking at.
- The parser is very chatty about issues it didn't expect (even if they
are ignored, it will still output something)
It only shows up with --read-var-info=yes which some tests enable, but
which is disabled by default.
Also increate the timeout of drd/tests/pth_cleanup_handler.c because
DWARF reading is so slow.
https://bugs.kde.org/show_bug.cgi?id=433500
debuginfod is an HTTP server for distributing ELF/DWARF debugging
information. When a debuginfo file cannot be found locally, Valgrind
is able to query debuginfod servers for the file using its build-id.
readelf.c: Add debuginfod_find_debug_file(). Spawns a child process to
exec `debuginfod-find` in order to query servers for the debuginfo
file. Also add helper debuginfod_find_path().
pub_core_pathscan.h: Moved from priv_initimg_pathscan.h in order to use
VG_(find_executable)() in readelf.c.
docs: Add information regarding debuginfod to valgrind.1
memcheck/tests/linux: Add new test debuginfod-check.
tests/vg_regtest.in: Clear $DEBUGINFOD_URLS before running any tests.
https://bugs.kde.org/show_bug.cgi?id=432215
This typo meant the directory entry was most often zero, which
happened to be sometimes correct anyway (since zero is the compdir).
So for simple testcases it looked correct. But it would be wrong for
compilation units not in the current compdir. Like files compiled with
a relative of absolute path (and then combined into the same compilation
unit with LTO).
The same typo was in both readdwarf.c (read_dwarf2_lineblock) and
readdwarf3.c (read_filename_table). read_dwarf2_lineblock also had
an extra "dwarf" string in the --debug-dump=line output.
https://bugzilla.redhat.com/show_bug.cgi?id=1927153
Implement DWARF5 in readdwarf.c and readdwarf3.c
Since gcc11 will default to DWARF5 by default it is time for
valgrind to support it. The patch handles everything gcc11 produces
(except for the new DWARF expressions).
There is some duplication in the patch since we actually have two DWARF
readers which use slightly different abstractions (Slices vs Cursors).
It would be nice if we could merge these somehow. The reader in
readdwarf3.c is only used when --read-var-info=yes is used (which
drd uses to provide the allocation context).
The handling of DW_FORM_implicit_const is tricky with the current design.
An abbrev which contains an attribute encoded with DW_FORM_implicit_const
has its value also in the abbrev. The code in readdwarf3.c assumed it
always could simply get the data from the .debug_info/current Cursor.
For now I added a value field to the name_form field that holds the
associated value. This is slightly wasteful since the extra field is
not necessary for other forms.
Tested against GCC10 (defaulting to DWARF4) and GCC11 (defaulting to
DWARF5) on x86_64. No regressions in the regtests.
https://bugs.kde.org/show_bug.cgi?id=432102
GCC warns:
readpdb.c:1631:16: warning: this 'if' clause does not guard...
[-Wmisleading-indentation]
1631 | if (debug)
| ^~
In file included from ./pub_core_basics.h:38,
from m_debuginfo/readpdb.c:38:
../include/pub_tool_basics.h:69:30: note: ...this statement, but the latter
is misleadingly indented as if it were guarded by the 'if'
69 | #define ML_(str) VGAPPEND(vgModuleLocal_, str)
| ^~~~~~~~~~~~~~
../include/pub_tool_basics.h:66:29: note: in definition of macro 'VGAPPEND'
66 | #define VGAPPEND(str1,str2) str1##str2
| ^~~~
m_debuginfo/readpdb.c:1636:19: note: in expansion of macro 'ML_'
1636 | ML_(addLineInfo)(
| ^~~
The warning message is slightly hard to read because of the macro expansion.
But GCC is right that the indentation is misleading. Fixed by reindenting.
With gcc 9 and --enable-lto, we now have spurious warnings telling
that the line information in the debug info has huge line numbers,
greater than the (valgrind) maximum of 2^20.
These spurious warnings make that all tests are failing.
This change modifies the tracing/debugging of the line info to:
* disable by default the warning for line info greater than 2^20.
When using -d, such warnings are however still shown (once).
* allow to see all such warnings, when using at least -d -d -d -d
On ubuntu 19.10, valgrind fails telling that it cannot find
the mandatory redirection for strlen in ld-linux-x86-64.so.2.
This is due to /bin being a symlink to usr/bin: ld is found
in /usr/lib/x86_64-linux-gnu/ld-2.30.so
but its debug info is
in /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.30.so
Without this patch, valgrind searches the debug info (a.o.)
in /usr/lib/debug/usr/lib/x86_64-linux-gnu/ld-2.30.so
so using the concatenation of /usr/lib/debug
and /usr/lib/x86_64-linux-gnu/ld-2.30.so,
but the debug info is located at the concatenation of
/usr/lib/debug and /lib/x86_64-linux-gnu/ld-2.30.so
(so without the leading /usr).
Modify the debug info search so as to try with and without the /usr.
Patch derived from the patch done by Mathieu Trudel-Lapierre
to solve https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1808508
Necessary changes to support nanoMIPS on Linux.
Part 2/4 - Coregrind changes
Patch by Aleksandar Rikalo, Dimitrije Nikolic, Tamara Vlahovic and
Aleksandra Karadzic.
Related KDE issue: #400872.
Sync VEX/LICENSE.GPL with top-level COPYING file. We used 3 different
addresses for writing to the FSF to receive a copy of the GPL. Replace
all different variants with an URL <http://www.gnu.org/licenses/>.
The following files might still have some slightly different (L)GPL
copyright notice because they were derived from other programs:
- files under coregrind/m_demangle which come from libiberty:
cplus-dem.c, d-demangle.c, demangle.h, rust-demangle.c,
safe-ctype.c and safe-ctype.h
- coregrind/m_demangle/dyn-string.[hc] derived from GCC.
- coregrind/m_demangle/ansidecl.h derived from glibc.
- VEX files for FMA detived from glibc:
host_generic_maddf.h and host_generic_maddf.c
- files under coregrin/m_debuginfo derived from LZO:
lzoconf.h, lzodefs.h, minilzo-inl.c and minilzo.h
- files under coregrind/m_gdbserver detived from GDB:
gdb/signals.h, inferiors.c, regcache.c, regcache.h,
regdef.h, remote-utils.c, server.c, server.h, signals.c,
target.c, target.h and utils.c
Plus the following test files:
- none/tests/ppc32/testVMX.c derived from testVMX.
- ppc tests derived from QEMU: jm-insns.c, ppc64_helpers.h
and test_isa_3_0.c
- tests derived from bzip2 (with embedded GPL text in code):
hackedbz2.c, origin5-bz2.c, varinfo6.c
- tests detived from glibc: str_tester.c, pth_atfork1.c
- test detived from GCC libgomp: tc17_sembar.c
- performance tests derived from bzip2 or tinycc (with embedded GPL
text in code): bz2.c, test_input_for_tinycc.c and tinycc.c
On s390x-linux, adds CFI based unwinding for %f0..%f7, since these are sometimes
used by gcc >= 8.0 to spill integer register values in leaf functions. Hence the
lack of unwinding them was causing unwind failures on this platform.
At least with libopenblas, we can have several rx mappings
with some holes between mappings.
Change the invariant (2) checking so that such holes are ok,
as long as no cfsi refers to such an hole.
Some applications are mapping an object ro, and then unmaps it directly.
In such a case, we have a di that contains obsolete fsm.maps (not matching
OS mappings). The di for this unmapped object is not active,
and has no dinfo (have_dinfo == False).
(more generally, fsm.maps can contain a whole bunch of obsolete mappings).
Later on, some other libs can be mapped with a mapping overlapping
this obsolete mapping.
A di that never had its debug info loaded can really be discarded,
even if CG_(clo_keep_debuginfo).
In such a case, it is normal to have to discard a not active di.
(it might be better to keep fsm.maps in sync with the real OS
mapping, but that is a much bigger change/fix).
The FSM debug tracing was static, it is now dynamic according
to debug loglevel >= 3.
The below is an extract of the trace showing what happens.
SYSCALL[4384,1](257) sys_openat ( 4294967196, 0x4244398(/usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqeglfs.so), 524288 ) --> [async] ...
SYSCALL[4384,1](257) ... [async] --> Success(0x3)
SYSCALL[4384,1](72) sys_fcntl[ARG3=='arg'] ( 3, 2, 1 )[sync] --> Success(0x0)
SYSCALL[4384,1](5) sys_newfstat ( 3, 0x1ffefff8b0 )[sync] --> Success(0x0)
SYSCALL[4384,1](5) sys_newfstat ( 3, 0x1ffefff9c0 )[sync] --> Success(0x0)
SYSCALL[4384,1](9) sys_mmap ( 0x0, 10520, 1, 1, 3, 0 )--4384-- di_notify_mmap-0:
--4384-- di_notify_mmap-1: 0x4027000-0x4029fff r--
--4384-- di_notify_mmap-2: /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqeglfs.so
--4384-- di_notify_mmap-3: is_rx_map 0, is_rw_map 0, is_ro_map 1
--4384-- di_notify_mmap-4: noting details in DebugInfo* at 0x10024CEA10
--4384-- di_notify_mmap-6: no dinfo loaded /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqeglfs.so (no rx or no rw mapping)
--> [pre-success] Success(0x4027000)
SYSCALL[4384,1](3) sys_close ( 3 )[sync] --> Success(0x0)
SYSCALL[4384,1](11) sys_munmap ( 0x4027000, 10520 )[sync] --> Success(0x0)
^^^^ the above munmap has not cleaned up or removed anything in DebugInfo* at 0x10024CEA10
Later on, /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqxcb.so is mapped
overlapping the memory where libqeglfs.so was mapped ro.
Now, this cleans up the (useless) di that never had have_dinfo true, e.g.
------ start ELF OBJECT -------------------------------------------------------
------ name = /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqxcb.so
...
--4384-- Discarding syms at 0x0-0x0 in /usr/lib/x86_64-linux-gnu/qt5/plugins/platforms/libqeglfs.so (have_dinfo 0)
(the 0x0-0x0 in the trace is because there was never any text mapping for libqeglfs.so).
In check_CFSI_related_invariants, this commit improves the check for invariant
(2), which, as noted in an existing comment, "might need to be improved".
Instead of assuming that the CFSI range fits entirely into one "rx" mapping,
check that it is covered by the union of all the "rx" mappings we have. This
is the correct check. The previous check was observed to have failed as below
for at least some Clang generated objects (possibly in conjunction with lld as
the linker.)
valgrind: m_debuginfo/debuginfo.c:717 (check_CFSI_related_invariants): Assertion 'cfsi_fits' failed.
* discard_or_archive_marked_DebugInfos: clear the mark bit for a Debuginfo
that will be archived
* discard_DebugInfos_which_overlap_with: when selecting DebugInfos to be
discarded or archived, fix a mistake in which some mark bits wouldn't be
changed at all, meaning their "old" value was used to influence the current
operation.
These may (or may not) fix#393146; at the very least, they are somehow
related.
Once we've read debuginfo for an object, ignore all further mappings. If we
don't do that, applications that mmap in their own objects to inspect them for
whatever reason, will cause "irrelevant" mappings to be recorded in the
object's fsm.maps table. This can lead to serious problems later on.
This has become necessary because 64aa729bfa of
Thu Jul 12 2018 (the fix for bug 395682) started recording readonly segments
in the fsm.maps table, where before they were ignored.
The new binutils ld -z separate-code option creates multiple read-only
PT_LOAD segments and might place .rodata in a non-executable segment.
Allow and keep track of separate read-only segments and allow a readonly
page with .rodata section.
Based on patches from Tom Hughes <tom@compton.nu> and
H.J. Lu <hjl.tools@gmail.com>.
https://bugs.kde.org/show_bug.cgi?id=395682
Adding MIPS N32 ABI support.
BZ issue - #345763.
Contributed and maintained by mulitple people over the years:
Crestez Dan Leonard, Maran Pakkirisamy, Dimitrije Nikolic,
Aleksandar Rikalo, Tamara Vlahovic.
Follow up to "Introduce RegWord type" change.
Part of the changes required for BZ issue - #345763.
Contributed by:
Tamara Vlahovic and Dimitrije Nikolic.
Bug #393062 - Reading build-id ELF note through phdrs triggers
"debuginfo reader: ensure_valid failed"
Skip the phdrs when we have to search the shdrs. In separate
.debug files the phdrs might not be valid (they are a copy of
the main ELF file) and might trigger assertions when getting
image notes based on them.
Normally, an inlined call has a dwarf entry that points at the abstract origin, i.e. the
function that was inlined.
However, in some cases, the abstract origin tag is not present (observed with gcc 6.3.0, when
compiling with link time optimisation).
Such missing abstract origin was then causing an error message when reading the dwarf debug info.
This patch ensures we handle this case more gracefully, by using UnknownInlinedFun as inlined
function name for such a missing abstract origin;
As reported by Matthias Schwarzott <zzam@gentoo.org>. Testcase patch from him. The fix is
for check_CFSI_related_invariants() to avoid checking for overlaps against DebugInfos that are
in 'archived' status, since -- if a previously dlopened-and-then-dlclosed object is later
re-dlopened -- this may cause an overlap between the active and archived DebugInfos, which
is of no consequence. If the kernel maps the object to the same VMA the second time around
then there will *certainly* be an overlap.
https://bugs.kde.org/show_bug.cgi?id=387773
The path to the alt file is relative to the actual debug file.
Make sure that we got the real file, not a (build-id) symlink.
Also handle the case where a debug or alt file is an absolute path.
This patch implements the flag --delta-stacktrace=yes/no.
Yes indicates to calculate the full history stack traces by
changing just the last frame if no call/return instruction was
executed.
This can speed up helgrind by up to 25%.
This flags is currently set to yes only on linux x86 and amd64, as some
platform dependent validation of the used heuristics is needed before
setting the default to yes on a platform. See function check_cached_rcec_ok
in libhb_core.c for more details about how to validate/check the behaviour
on a new platform.