ftmemsim-valgrind

mirror of https://github.com/Zenithsiz/ftmemsim-valgrind.git synced 2026-02-10 21:47:06 +00:00

Author	SHA1	Message	Date
Carl Love	991db2a39b	PPC64, fix issues with denormal values in the vector fp instructions. The results of the floating point instructions vmaddfp, vnmsubfp, vaddfp, vsubfp, vmaxfp, vminfp, vrefp, vrsqrtefp, vcmpeqfp, vcmpgefp, vcmpgtfp are controlled by the setting of the NJ bit in the VSCR register. If VSCR[NJ] = 0, then denormalized values are handled as specified by Java and the IEEE standard. If the bit is a 1, then the denormalized element in the vector is replaced with a zero. Valgrind was not properly handling the denormalized case for these instructions. This patch fixes the issue. https://bugs.kde.org/show_bug.cgi?id=406256	2019-05-28 13:49:33 -05:00
Mark Wielaard	461cc5c003	Cleanup GPL header address notices by using http://www.gnu.org/licenses/ Sync VEX/LICENSE.GPL with top-level COPYING file. We used 3 different addresses for writing to the FSF to receive a copy of the GPL. Replace all different variants with a URL <http://www.gnu.org/licenses/>. The following files might still have some slightly different (L)GPL copyright notice because they were derived from other programs: - files under coregrind/m_demangle which come from libiberty: cplus-dem.c, d-demangle.c, demangle.h, rust-demangle.c, safe-ctype.c and safe-ctype.h - coregrind/m_demangle/dyn-string.[hc] derived from GCC. - coregrind/m_demangle/ansidecl.h derived from glibc. - VEX files for FMA derived from glibc: host_generic_maddf.h and host_generic_maddf.c - files under coregrind/m_debuginfo derived from LZO: lzoconf.h, lzodefs.h, minilzo-inl.c and minilzo.h - files under coregrind/m_gdbserver derived from GDB: gdb/signals.h, inferiors.c, regcache.c, regcache.h, regdef.h, remote-utils.c, server.c, server.h, signals.c, target.c, target.h and utils.c Plus the following test files: - none/tests/ppc32/testVMX.c derived from testVMX. - ppc tests derived from QEMU: jm-insns.c, ppc64_helpers.h and test_isa_3_0.c - tests derived from bzip2 (with embedded GPL text in code): hackedbz2.c, origin5-bz2.c, varinfo6.c - tests derived from glibc: str_tester.c, pth_atfork1.c - test derived from GCC libgomp: tc17_sembar.c - performance tests derived from bzip2 or tinycc (with embedded GPL text in code): bz2.c, test_input_for_tinycc.c and tinycc.c	2019-05-26 20:07:51 +02:00
Julian Seward	50bb127b1d	Bug 402781 - Redo the cache used to process indirect branch targets. [This commit contains an implementation for all targets except amd64-solaris and x86-solaris, which will be completed shortly.] In the baseline simulator, jumps to guest code addresses that are not known at JIT time have to be looked up in a guest->host mapping table. That means: indirect branches, indirect calls and most commonly, returns. Since there are huge numbers of these (often 10+ million/second) the mapping mechanism needs to be extremely cheap. Currently, this is implemented using a direct-mapped cache, VG_(tt_fast), with 2^15 (guest_addr, host_addr) pairs. This is queried in handwritten assembly in VG_(disp_cp_xindir) in dispatch-<arch>-<os>.S. If there is a miss in the cache then we fall back out to C land, and do a slow lookup using VG_(search_transtab). Given that the size of the translation table(s) in recent years has expanded significantly in order to keep pace with increasing application sizes, two bad things have happened: (1) the cost of a miss in the fast cache has risen significantly, and (2) the miss rate on the fast cache has also increased significantly. This means that large (~ one-million-basic-blocks-JITted) applications that run for a long time end up spending a lot of time in VG_(search_transtab). The proposed fix is to increase associativity of the fast cache, from 1 (direct mapped) to 4. Simulations of various cache configurations using indirect-branch traces from a large application show that this is the best of various configurations. In an extreme case with 5.7 billion indirect branches: * The increase of associativity from 1 way to 4 way, whilst keeping the overall cache size the same (32k guest/host pairs), reduces the miss rate by around a factor of 3, from 4.02% to 1.30%. * The use of a slightly better hash function than merely slicing off the bottom 15 bits of the address, reduces the miss rate further, from 1.30% to 0.53%.
Overall the VG_(tt_fast) miss rate is almost unchanged on small workloads, but reduced by a factor of up to almost 8 on large workloads. By implementing each (4-entry) cache set using a move-to-front scheme in the case of hits in ways 1, 2 or 3, the vast majority of hits can be made to happen in way 0. Hence the cost of having this extra associativity is almost zero in the case of a hit. The improved hash function costs an extra 2 ALU shots (a shift and an xor) but overall this seems performance neutral to a win.	2019-01-25 09:14:56 +01:00
Ivo Raisr	38edd50c0e	Update copyright end year to 2017 in preparation for 3.13 release. n-i-bz git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16333	2017-05-04 15:09:39 +00:00
Florian Krohm	193f88fad4	Make sure no executable stack gets created. Explanation by Matthias Schwarzott: The linker will request an executable stack as soon as at least one object file that is linked in wants an executable stack. And the absence of the .section .note.GNU-stack,"",@progbits is enough to tell the linker that an executable stack is needed. So even an empty asm-file must at least contain this statement to not force executable stacks on the whole executable. * Define a helper macro MARK_STACK_NO_EXEC that disables the executable stack. * Instantiate this macro unconditionally at the end of each asm file. Patch by Matthias Schwarzott <zzam@gentoo.org>. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15692	2015-09-30 20:30:48 +00:00
Julian Seward	adc2dafee9	Update copyright dates, to include 2015. No functional change. git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15577	2015-08-21 11:32:26 +00:00
Carl Love	98908947c7	This commit is for Bugzilla 334834. The Bugzilla contains patch 2 of 3 to add PPC64 LE support. The other two patches can be found in Bugzillas 334384 and 334836. POWER PC, add the functional Little Endian support, patch 2. The IBM POWER processor now supports both Big Endian and Little Endian. The ABI for Little Endian also changes. Specifically, the function descriptor is not used, the stack size changed, accessing the TOC changed. Functions now have a local and a global entry point. Register r2 contains the TOC for local calls and register r12 contains the TOC for global calls. This patch makes the functional changes to the Valgrind tool. The patch makes the changes needed for the none/tests/ppc32 and none/tests/ppc64 Makefile.am. A number of the ppc specific tests have Endian dependencies that are not fixed in this patch. They are fixed in the next patch. Per Julian's comments, renamed coregrind/m_dispatch/dispatch-ppc64-linux.S to coregrind/m_dispatch/dispatch-ppc64be-linux.S. Created new file for LE coregrind/m_dispatch/dispatch-ppc64le-linux.S. The same was done for coregrind/m_syswrap/syscall-ppc-linux.S. Signed-off-by: Carl Love <carll@us.ibm.com> git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14239	2014-08-07 23:35:54 +00:00
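
The cache redesign described in commit 50bb127b1d (a 4-way set-associative guest->host cache with move-to-front on hits) can be illustrated with a small C sketch. This is not the Valgrind implementation, which the commit says is queried from handwritten assembly. All names here (Pair, CacheSet, set_index, cache_lookup, cache_insert, EMPTY_GUEST) are hypothetical, and the hash is only one plausible "shift and xor" function, assumed for illustration. The sizes follow the commit message: 32k pairs in total, split into 2^13 sets of 4 ways.

```c
#include <stdint.h>

/* Sizes taken from the commit message: 32k pairs = 2^13 sets x 4 ways. */
#define N_SETS_BITS 13
#define N_SETS      (1u << N_SETS_BITS)
#define N_WAYS      4

/* Hypothetical sentinel marking an unused entry (an assumption). */
#define EMPTY_GUEST UINTPTR_MAX

typedef struct { uintptr_t guest; uintptr_t host; } Pair;
typedef struct { Pair way[N_WAYS]; } CacheSet;

static CacheSet cache[N_SETS];

/* Mark every entry as unused. */
static void cache_init(void) {
    for (unsigned s = 0; s < N_SETS; s++)
        for (int k = 0; k < N_WAYS; k++) {
            cache[s].way[k].guest = EMPTY_GUEST;
            cache[s].way[k].host  = 0;
        }
}

/* Assumed hash: one shift and one xor, then mask to the set count.
   The real hash used by Valgrind may differ. */
static unsigned set_index(uintptr_t guest) {
    return (unsigned)((guest ^ (guest >> N_SETS_BITS)) & (N_SETS - 1));
}

/* Look up a guest address. On a hit in way k > 0, move that entry to
   way 0 and shift ways 0..k-1 down by one, so hot entries tend to be
   found in way 0. Returns 1 on a hit, 0 on a miss. */
static int cache_lookup(uintptr_t guest, uintptr_t *host_out) {
    CacheSet *s = &cache[set_index(guest)];
    for (int k = 0; k < N_WAYS; k++) {
        if (s->way[k].guest == guest) {
            Pair hit = s->way[k];
            for (int j = k; j > 0; j--)
                s->way[j] = s->way[j - 1];
            s->way[0] = hit;
            *host_out = hit.host;
            return 1;
        }
    }
    return 0; /* caller would fall back to the slow lookup */
}

/* Insert after a miss: new entry goes to way 0, the entry in the
   last way is evicted. */
static void cache_insert(uintptr_t guest, uintptr_t host) {
    CacheSet *s = &cache[set_index(guest)];
    for (int j = N_WAYS - 1; j > 0; j--)
        s->way[j] = s->way[j - 1];
    s->way[0].guest = guest;
    s->way[0].host  = host;
}
```

A caller would run cache_init() once, try cache_lookup() on each indirect branch target, and on a miss perform the slow lookup and then call cache_insert(). Because hits are moved to the front, most repeat lookups compare only against way 0, which matches the commit's claim that the extra associativity costs almost nothing on a hit.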

7 Commits