Executing vgdb --multi makes vgdb talk the gdb extended-remote
protocol. This means that the gdb run command is supported and
vgdb will start up the program under valgrind. Which means you
don't need to run gdb and valgrind from different terminals.
Also vgdb keeps being connected to gdb after valgrind exits. So
you can easily rerun the program with the same breakpoints in
place.
vgdb now implements a minimal gdbserver that just recognizes
a few extended-remote protocol packets. Once it starts up valgrind
it sets up noack and qsupported then it will forward packets
between gdb and valgrind gdbserver. After valgrind shutsdown it
resumes handling gdb packets itself.
https://bugs.kde.org/show_bug.cgi?id=434057
Co-authored-by: Mark Wielaard <mark@klomp.org>
Most notable, the "Function summary" section, which printed one CC for each
`file:function` combination, has been replaced by two sections, "File:function
summary" and "Function:file summary".
These new sections both feature "deep CCs", which have an "outer CC" for the
file (or function), and one or more "inner CCs" for the paired functions (or
files).
Here is a file:function example, which helps show which files have a lot of
events, even if those events are spread across a lot of functions.
```
> 12,427,830 (5.4%, 26.3%) /home/njn/moz/gecko-dev/js/src/ds/LifoAlloc.h:
6,107,862 (2.7%) js::frontend::ParseNodeVerifier::visit(js::frontend::ParseNode*)
3,685,203 (1.6%) js::detail::BumpChunk::setBump(unsigned char*)
1,640,591 (0.7%) js::LifoAlloc::alloc(unsigned long)
711,008 (0.3%) js::detail::BumpChunk::assertInvariants()
```
And here is a function:file example, which shows how heavy inlining can result
in a machine code function being derived from source code from multiple files:
```
> 1,343,736 (0.6%, 35.6%) js::gc::TenuredCell::isMarkedGray() const:
651,108 (0.3%) /home/njn/moz/gecko-dev/js/src/d64/dist/include/js/HeapAPI.h
292,672 (0.1%) /home/njn/moz/gecko-dev/js/src/gc/Cell.h
254,854 (0.1%) /home/njn/moz/gecko-dev/js/src/gc/Heap.h
```
Previously these patterns were very hard to find, and it was easy to overlook a
hot piece of code because its counts were spread across multiple non-adjacent
entries. I have already found these changes very useful for profiling Rust
code.
Also, cumulative percentages on the outer CCs (e.g. the 26.3% and 35.6% in the
example) tell you what fraction of all events are covered by the entries so
far, something I've wanted for a long time.
Some other, related changes:
- Column event headers are now padded with `_`, e.g. `Ir__________`. This makes
the column/event mapping clearer.
- The "Cachegrind profile" section is now called "Metadata", which is
shorter and clearer.
- A few minor test tweaks, beyond those required for the output changes.
- I converted some doc comments to normal comments. Not standard Python, but
nicer to read, and there are no public APIs here.
- Roughly 2x speedups to `cg_annotate` and smaller improvements for `cg_diff`
and `cg_merge`, due to the following.
- Change the `Cc` class to a type alias for `list[int]`, to avoid the class
overhead (sigh).
- Process event count lines in a single split, instead of a regex
match + split.
- Add the `add_cc_to_ccs` function, which does multiple CC additions in a
single function call.
- Better handling of dicts while reading input, minimizing lookups.
- Pre-computing the missing CC string for each CcPrinter, instead of
regenerating it each time.
All for clang and mostly Apple clang
There are still numerous deprecated warnings on macOS 10.13
(sem* functions, syscall, sbrk, i386, PIEi, OSSpinLocki, swapcontext, getcontext)
- Move it to `auxprogs/`, alongside `pybuild.sh`.
- Disable the annoying design lints, instead of just modifying the
values (which often requires modifying them again later).
Provide the user with a hint of what caused an out of memory error.
And explain that some memory policies, like selinux deny_execmem
might cause Permission denied errors.
Add an err argument to out_of_memory_NORETURN. And change
am_shadow_alloc to return a SysRes (all three callers were already
checking for errors and calling out_of_memory_NORETURN).
Users shouldn't ever see this, but it's useful to distinguish this
malformed data file case from the missing symbol case (which is still
shown as `???`).
It's currently written in C, but `cg_annotate` and `cg_diff` are written in
Python. It's better to have them all in the same language.
The good news is that the Python code is 4.5x shorter than the C code.
The bad news is that the Python code is roughly 3x slower than the C
code. But `cg_merge` isn't used that often, so I think it's a reasonable
trade-off.
For all the same reasons I rewrote `cg_annotate` in Python.
The commit also moves the Python "build" steps into
`auxprogs/pybuild.sh`, for easy sharing.
Finally, it very slightly tweaks the whitespace in the output of
`cg_annotate`.
- Every section now has a heading with the long `----` lines above and
below.
- Event names are always shown below that heading, rather than within
it.
- Each Unreadable file now gets its own section, much like files that
lack any data.
These tests generate a varying number of errors per argument
depending on the platform and compiler.
The filter just prints the first unique error stanza which
allows 8 expecteds to be removed.
Currently their width is mostly hard-wired in a quick and dirty fashion.
This commit does them properly, so:
- all columns are always the right width, even ones with really large
percentages
- things like `( 1.00%)` are now `(1.00%)`
- any percentages that would involve a division by zero now show as
`(n/a)` rather than `( 0.00%)`