Autoconf says:
"This macro is obsolescent, as current systems have conforming
header files. New programs need not use this macro".
This macro was previously required to ensure the system has C header files
conforming to ANSI C89 (ISO C90). Specifically, it checks for stdlib.h,
stdarg.h, string.h, and float.h.
This autoconf option was used to provide conditional fallback support
via defined STDC_HEADERS.
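For reference, the conventional fallback pattern guarded by this define looks
roughly like the following (an illustrative sketch of the classic autoconf
idiom, not code from the valgrind tree):

    #ifdef STDC_HEADERS
    # include <stdlib.h>
    # include <string.h>
    #else
    /* Pre-ANSI fallback: no stdlib.h/string.h, declare what we need. */
    # ifdef HAVE_STRINGS_H
    #  include <strings.h>
    # endif
    extern char *malloc();
    #endif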
valgrind does not utilize conditional fallback support, so this macro
is both obsolete and unused. Let's drop it.
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
The change in the glibc version (2.27 -> 2.28) results in one additional
function call being present in the backtrace for mips64, which pushes the
line to be checked out of bounds.
Changed the post line in mmapunmap.vgtest to work around this.
This fixes massif/tests/mmapunmap failure on mips64.
Patch by Stefan Maksimovic.
GCC might use subfe x, x, x to initialize x to 0 or -1, based on
whether the carry flag is set. This happens in some cases when g++
compiles resetting a unique_ptr. The "trick" used by the compiler is
that it can AND a pointer with the register x (now 0x0 or 0xffffffff)
to set something to NULL or to the given pointer.
subfe is implemented as rD = ~rA + rB + XER[CA].
If we instead implement it as rD = rB - rA - (XER[CA] ^ 1),
then memcheck can see that rA and rB cancel each other out if they
are the same.
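A quick standalone check (illustrative only, using 32-bit values) that the
two formulations compute the same result:

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
       uint32_t pairs[][2] = { {0x12345678u, 0x12345678u},
                               {0xFFFFFFFFu, 0x00000000u},
                               {0x80000000u, 0x7FFFFFFFu} };
       for (int i = 0; i < 3; i++) {
          uint32_t rA = pairs[i][0], rB = pairs[i][1];
          for (uint32_t ca = 0; ca <= 1; ca++) {
             uint32_t r_old = ~rA + rB + ca;       /* ~rA + rB + XER[CA]      */
             uint32_t r_new = rB - rA - (ca ^ 1);  /* rB - rA - (XER[CA] ^ 1) */
             assert(r_old == r_new);   /* identical in two's complement */
          }
       }
       return 0;
    }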
https://bugs.kde.org/show_bug.cgi?id=404054
The previous commit 6b16f0e2a0 dated
Sat Jan 26 17:38:01 2019 by Julian Seward <jseward@acm.org> renamed some of
the int<->fp conversion Iops to add a trailing _DEP. That patch missed
renaming two of the Iops; this patch renames them.
This commit thoroughly overhauls DHAT, moving it out of the
"experimental" ghetto. It makes moderate changes to DHAT itself,
including dumping profiling data to a JSON format output file. It also
implements a new data viewer (as a web app, in dhat/dh_view.html).
The main benefits over the old DHAT are as follows.
- The separation of data collection and presentation means you can run a
program once under DHAT and then sort the data in various ways. Also,
full data is in the output file, and the viewer chooses what to omit.
- The data can be sorted in more ways than previously. Some of these
sorts involve useful filters such as "short-lived" and "zero reads or
zero writes".
- The tree structure view avoids the need to choose stack trace depth.
This avoids both the problem of not enough depth (when records that
should be distinct are combined, and may not contain enough
information to be actionable) and the problem of too much depth (when
records that should be combined are separated, making them seem less
important than they really are).
- Byte and block measures are shown with a percentage relative to the
global count, which helps gauge relative significance of different
parts of the profile.
- Byte and block measures are also shown with an allocation rate
(bytes and blocks per million instructions), which enables comparisons
across multiple profiles, even if those profiles represent different
workloads.
- Both global and per-node measurements are taken at the global heap
peak ("At t-gmax"), which gives Massif-like insight into the point of
peak memory use.
- The final/lifetimes stats are a bit more useful than the old deaths
stats. (E.g. the old deaths stats didn't take into account lifetimes
of unfreed blocks.)
- The handling of realloc() has changed. The sequence `p = malloc(100);
realloc(p, 200);` now increases the total block count by 2 and the
total byte count by 300. Previously it increased them by 1 and 200.
The new handling is a more operational view that better reflects the
effect of allocations on performance. It makes a significant
difference in the results, giving paths involving reallocation (e.g.
repeated pushing to a growing vector) more prominence.
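A sketch of the new accounting rule (not DHAT's actual code):

    #include <stddef.h>

    static size_t total_blocks = 0, total_bytes = 0;

    static void on_alloc(size_t sz) { total_blocks += 1; total_bytes += sz; }

    /* realloc is treated operationally, as a fresh allocation of the
       new size: */
    static void on_realloc(size_t old_sz, size_t new_sz)
    {
       (void)old_sz;   /* the old handling added only new_sz - old_sz */
       on_alloc(new_sz);
    }

    /* p = malloc(100); realloc(p, 200);
       -> on_alloc(100); on_realloc(100, 200);
       -> total_blocks == 2, total_bytes == 300, as described above. */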
Other things of note:
- There is now testing, both regression tests that run within the
standard test suite, and viewer-specific tests that cannot run within
the standard test suite. The latter are run by loading
dh_view.html?test=1 in a web browser.
- The commit puts all tool lists in Makefiles (and similar files) in the
following consistent order: memcheck, cachegrind, callgrind, helgrind,
drd, massif, dhat, lackey, none; exp-sgcheck, exp-bbv.
- A lot of fields in dh_main.c have been given more descriptive names.
Those names now match those used in dh_view.js.
In s390_isel_vec_expr_wrk() there were some assignments of enum-typed
values to variables of different enum types. This fixes them. It also adds a
few initialisations to variables of type HReg for safety against the
possibility of them being used uninitialised. No functional change. Tested
by Andreas Arnez.
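The general shape of the bug class being fixed (illustrative only, not the
actual s390 code; the enum names here are made up):

    enum Fruit { APPLE, PEAR };
    enum Metal { IRON, COPPER };

    void f(void)
    {
       /* Compiles in C, but assigns a value of one enum type to a
          variable of another; warned about by -Wenum-conversion. */
       enum Fruit x = IRON;
       (void)x;
    }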
Secondary architectures on macOS are generally x86, which requires additional
LDFLAGS to be set to avoid linker errors.
apple clang (clang-800.0.42.1) error:
ld: illegal text-relocation to '___stderrp' in /usr/lib/libSystem.dylib from '_main'
in vbit_test_sec-main.o for architecture i386
Fixes: 49ca185 ("Also test memcheck/tests/vbit-test on any secondary arch.")
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
When a callgrind dump file contains no events (at all, I think),
callgrind_annotate can produce the error messages below:
Ir sysCount sysTime file:function
--------------------------------------------------------------------------------
Use of uninitialized value in numeric gt (>) at ../trunk_untouched/Inst/bin/callgrind_annotate line 957.
Use of uninitialized value in numeric gt (>) at ../trunk_untouched/Inst/bin/callgrind_annotate line 957.
Use of uninitialized value in numeric gt (>) at ../trunk_untouched/Inst/bin/callgrind_annotate line 957.
. . . /build/glibc-yWQXbR/glibc-2.24/csu/../csu/libc-start.c:(below main) [/lib/x86_64-linux-gnu/libc-2.24.so]
Use of uninitialized value in numeric gt (>) at ../trunk_untouched/Inst/bin/callgrind_annotate line 957.
Use of uninitialized value in numeric gt (>) at ../trunk_untouched/Inst/bin/callgrind_annotate line 957.
Use of uninitialized value in numeric gt (>) at ../trunk_untouched/Inst/bin/callgrind_annotate line 957.
. . . /build/glibc-yWQXbR/glibc-2.24/elf/../sysdeps/x86_64/dl-trampoline.h:_dl_runtime_resolve_xsave [/lib/x86_64-linux-gnu/ld-2.24.so]
Use of uninitialized value in numeric gt (>) at ../trunk_untouched/Inst/bin/callgrind_annotate line 957.
.....
The above can be produced by:
- run sleep 100 under callgrind
- take some callgrind dumps after the startup
- ./Inst/bin/callgrind_annotate --threshold=1 callgrind.out.31377.2
Check that the value is defined before doing the comparison.
Note: callgrind_annotate shows functions which have undefined costs
for all events (and I guess it would also show functions that have zero
costs for all events).
Maybe it would be better to not show such functions at all, rather than
showing them with all '.'.
2018-Dec-27: some of the int<->fp conversion operations have been renamed so
as to have a trailing _DEP, meaning "deprecated". This is because they don't
specify a rounding mode to be used for the conversion and so are
underspecified. Their use should be replaced with equivalents that do specify
a rounding mode, either as a first argument or via a suffix on the name that
indicates the rounding mode to use.
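To see why an unspecified rounding mode is a real problem, here is a small
standalone C program (plain C, not VEX code) showing that an fp->int
conversion's result depends on the active mode:

    #include <fenv.h>
    #include <math.h>
    #include <stdio.h>

    /* Compile with -lm (and strictly, -frounding-math on gcc). */
    int main(void)
    {
       fesetround(FE_TONEAREST);
       printf("%ld\n", lrint(2.5));   /* prints 2: round-to-nearest-even */
       fesetround(FE_UPWARD);
       printf("%ld\n", lrint(2.5));   /* prints 3: round towards +inf */
       return 0;
    }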
It is very commonly the case that a call to VG_(discard_translations) results
in the discarding of exactly one superblock. In such cases, it's much cheaper
to find and invalidate the VG_(tt_fast) cache entry associated with the block,
than it is to invalidate the entire cache, because
(1) invalidating the fast cache is expensive, and
(2) repopulating the fast cache after invalidation is even more expensive.
For QEMU, which intensively invalidates individual translations (presumably
due to patching them), this reduces the fast-cache miss rate from circa one in
33 lookups to around one in 130 lookups.
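A minimal sketch of the idea (the real VG_(tt_fast) structures and the
"empty" marker differ in detail):

    #include <stdint.h>

    typedef uintptr_t Addr;            /* stand-in for Valgrind's Addr */

    typedef struct { Addr guest; Addr host; } FastCacheEntry;

    #define N_FAST  (1 << 15)
    static FastCacheEntry fast_cache[N_FAST];

    /* Discarding exactly one superblock: invalidate at most the single
       entry that could be caching it, instead of flushing (and then
       expensively repopulating) all 2^15 entries. */
    static void invalidate_fast_cache_entry(Addr guest_addr)
    {
       FastCacheEntry* e = &fast_cache[guest_addr & (N_FAST - 1)];
       if (e->guest == guest_addr)
          e->guest = (Addr)1;   /* an impossible guest address: "empty" */
    }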
Implementation for x86-solaris and amd64-solaris. This completes the
implementations for all targets. Note these two are untested because I don't
have any way to test them.
[This commit contains an implementation for all targets except amd64-solaris
and x86-solaris, which will be completed shortly.]
In the baseline simulator, jumps to guest code addresses that are not known at
JIT time have to be looked up in a guest->host mapping table. That means:
indirect branches, indirect calls and most commonly, returns. Since there are
huge numbers of these (often 10+ million/second) the mapping mechanism needs
to be extremely cheap.
Currently, this is implemented using a direct-mapped cache, VG_(tt_fast), with
2^15 (guest_addr, host_addr) pairs. This is queried in handwritten assembly
in VG_(disp_cp_xindir) in dispatch-<arch>-<os>.S. If there is a miss in the
cache then we fall back out to C land, and do a slow lookup using
VG_(search_transtab).
Given that the size of the translation table(s) in recent years has expanded
significantly in order to keep pace with increasing application sizes, two bad
things have happened: (1) the cost of a miss in the fast cache has risen
significantly, and (2) the miss rate on the fast cache has also increased
significantly. This means that large (~ one-million-basic-blocks-JITted)
applications that run for a long time end up spending a lot of time in
VG_(search_transtab).
The proposed fix is to increase the associativity of the fast cache, from 1
(direct mapped) to 4. Simulations of various cache configurations using
indirect-branch traces from a large application show that this is the best
of the configurations tried. In an extreme case with 5.7 billion indirect
branches:
* The increase of associativity from 1 way to 4 way, whilst keeping the
overall cache size the same (32k guest/host pairs), reduces the miss rate by
around a factor of 3, from 4.02% to 1.30%.
* The use of a slightly better hash function, rather than merely slicing off
the bottom 15 bits of the address, reduces the miss rate further, from 1.30%
to 0.53%.
Overall the VG_(tt_fast) miss rate is almost unchanged on small workloads, but
reduced by a factor of up to almost 8 on large workloads.
By implementing each (4-entry) cache set using a move-to-front scheme in the
case of hits in ways 1, 2 or 3, the vast majority of hits can be made to
happen in way 0. Hence the cost of having this extra associativity is almost
zero in the case of a hit. The improved hash function costs an extra 2 ALU
operations (a shift and an xor), but overall this seems performance-neutral
to a win.
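A sketch of the scheme (assumed shapes and an assumed hash; the real lookup
is in handwritten assembly and differs in detail):

    #include <stdint.h>

    typedef uintptr_t Addr;

    #define N_SETS  (1 << 13)       /* 8192 sets x 4 ways = 32k pairs */

    typedef struct {
       Addr guest[4];               /* way 0 holds the most recent hit */
       Addr host[4];
    } FastCacheSet;

    static FastCacheSet fast_cache[N_SETS];

    /* Improved hash: a shift and an xor, rather than just slicing off
       the bottom bits of the address. */
    static inline uint32_t set_no(Addr ga)
    {
       uint32_t e = (uint32_t)ga;
       return ((e >> 2) ^ (e >> 13)) & (N_SETS - 1);
    }

    static Addr lookup(Addr ga)
    {
       FastCacheSet* s = &fast_cache[set_no(ga)];
       for (int w = 0; w < 4; w++) {
          if (s->guest[w] == ga) {
             Addr h = s->host[w];
             /* Move-to-front: hits migrate to way 0, so almost all
                hits are subsequently found at way-0 cost. */
             for (int i = w; i > 0; i--) {
                s->guest[i] = s->guest[i-1];
                s->host[i]  = s->host[i-1];
             }
             s->guest[0] = ga;
             s->host[0]  = h;
             return h;
          }
       }
       return 0;   /* miss: fall back to the slow VG_(search_transtab) */
    }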
The wrong bit number was used when checking for the vector facility. This
can result in a fatal emulation error: "Encountered an instruction that
requires the vector facility. That facility is not available on this
host."
In many cases the wrong facility bit happened to be set as well, hence
nothing bad happened. But when running Valgrind within a Qemu/KVM guest,
the wrong bit was not (always?) set and the emulation error occurred.
This fix simply corrects the vector facility bit number, changing it from
128 to 129.
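For context, s390 facility bits are numbered MSB-first within the doublewords
returned by STFLE, so a test of bit n looks roughly like this (a sketch;
'facility_list' is an assumed name, not the actual code):

    #include <stdint.h>

    /* Facility list as stored by the STFLE instruction. */
    extern uint64_t facility_list[];

    static int facility_installed(unsigned n)
    {
       /* Bit 0 is the most significant bit of doubleword 0. */
       return (facility_list[n / 64] >> (63 - (n % 64))) & 1;
    }

    /* The fix: test bit 129 (vector facility) rather than bit 128. */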
PTRACE_GET_THREAD_AREA is not handled by the amd64 linux syswrap, which leads
to false positive errors in 64-bit programs ptrace-ing 32-bit processes.
For example, the below error was wrongly reported on GDB:
==25377== Conditional jump or move depends on uninitialised value(s)
==25377== at 0x8A1D7EC: td_thr_get_info (td_thr_get_info.c:35)
==25377== by 0x526819: thread_from_lwp(thread_info*, ptid_t) (linux-thread-db.c:417)
==25377== by 0x5281D4: thread_db_notice_clone(ptid_t, ptid_t) (linux-thread-db.c:442)
==25377== by 0x51773B: linux_handle_extended_wait(lwp_info*, int) (linux-nat.c:2027)
....
==25377== Uninitialised value was created by a stack allocation
==25377== at 0x69A360: x86_linux_get_thread_area(int, void*, unsigned int*) (x86-linux-nat.c:278)
Fix this by implementing PTRACE_GET|SET_THREAD_AREA on amd64.
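In outline, the handling mirrors what the x86 syswrap already does (a sketch
using the existing PRE/POST helpers; the exact struct name should be taken
from the real headers):

    /* In PRE(sys_ptrace): the kernel will write the tracer's buffer. */
    case VKI_PTRACE_GET_THREAD_AREA:
       PRE_MEM_WRITE("ptrace(get_thread_area)", ARG4,
                     sizeof(struct vki_user_desc));
       break;
    case VKI_PTRACE_SET_THREAD_AREA:
       PRE_MEM_READ("ptrace(set_thread_area)", ARG4,
                    sizeof(struct vki_user_desc));
       break;

    /* In POST(sys_ptrace): mark the buffer defined once the call
       succeeds, so later reads of it are not flagged. */
    case VKI_PTRACE_GET_THREAD_AREA:
       POST_MEM_WRITE(ARG4, sizeof(struct vki_user_desc));
       break;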
addex uses OV as carry in and carry out. For all other instructions
OV is the signed overflow flag. And instructions like adde use CA
as carry.
Replace set_XER_OV_OV32 with set_XER_OV_OV32_ADDEX, which calls
calculate_XER_CA_64 and calculate_XER_CA_32, but with OV
as input, and sets OV and OV32.
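For reference, the intended semantics as a standalone sketch (not the VEX
helper itself):

    #include <stdint.h>

    static uint64_t OV, OV32;      /* stand-ins for the XER bits */

    /* addex: like adde, but with OV as both carry-in and carry-out. */
    static uint64_t addex(uint64_t rA, uint64_t rB)
    {
       uint64_t ca_in = OV;
       uint64_t rD    = rA + rB + ca_in;

       /* carry out of bit 63 goes to OV (adde would set CA instead) */
       OV = (rD < rA) || (rD == rA && ca_in);

       /* carry out of bit 31 goes to OV32 */
       uint64_t lo = (rA & 0xFFFFFFFFu) + (rB & 0xFFFFFFFFu) + ca_in;
       OV32 = lo >> 32;

       return rD;
    }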
Enable test_addex in none/tests/ppc64/test_isa_3_0.c and update
the expected output. test_addex would fail to match the expected
output before this patch.
A few .exp files (not tested on amd64) have to be changed to
have the messages in the new order:
Use --track-origins=yes to see where uninitialised values come from
For lists of detected and suppressed errors, rerun with: -s
This option allows listing the detected errors and showing the used
suppressions without increasing the verbosity.
Increasing the verbosity also activates a lot of messages that
are often not very useful for the user.
So, this option allows seeing the list of errors and used suppressions
independently of the verbosity.
Note if a high verbosity is selected, the behaviour is unchanged.
In other words, when specifying -v, the list of detected errors
and the used suppressions are still shown, even if
--show-error-list=yes and -s are not used.
Each tool producing errors had identical code to produce this message.
Factorize the production of the message in m_main.c.
This prepares the work to have a specific option to show the list
of detected errors and the count of suppressed errors.
This has a (small) visible effect on the output of memcheck:
Instead of producing
For counts of detected and suppressed errors, rerun with: -v
Use --track-origins=yes to see where uninitialised values come from
memcheck now produces:
Use --track-origins=yes to see where uninitialised values come from
For counts of detected and suppressed errors, rerun with: -v
i.e. the track-origins message and the counts-of-errors message are swapped.