6 Commits

Author SHA1 Message Date
Mark Wielaard
461cc5c003 Cleanup GPL header address notices by using http://www.gnu.org/licenses/
Sync VEX/LICENSE.GPL with the top-level COPYING file. We used 3 different
addresses for writing to the FSF to receive a copy of the GPL. Replace
all the different variants with the URL <http://www.gnu.org/licenses/>.

The following files might still have a slightly different (L)GPL
copyright notice because they were derived from other programs:

- files under coregrind/m_demangle which come from libiberty:
  cplus-dem.c, d-demangle.c, demangle.h, rust-demangle.c,
  safe-ctype.c and safe-ctype.h
- coregrind/m_demangle/dyn-string.[hc] derived from GCC.
- coregrind/m_demangle/ansidecl.h derived from glibc.
- VEX files for FMA derived from glibc:
  host_generic_maddf.h and host_generic_maddf.c
- files under coregrind/m_debuginfo derived from LZO:
  lzoconf.h, lzodefs.h, minilzo-inl.c and minilzo.h
- files under coregrind/m_gdbserver derived from GDB:
  gdb/signals.h, inferiors.c, regcache.c, regcache.h,
  regdef.h, remote-utils.c, server.c, server.h, signals.c,
  target.c, target.h and utils.c

Plus the following test files:

- none/tests/ppc32/testVMX.c derived from testVMX.
- ppc tests derived from QEMU: jm-insns.c, ppc64_helpers.h
  and test_isa_3_0.c
- tests derived from bzip2 (with embedded GPL text in code):
  hackedbz2.c, origin5-bz2.c, varinfo6.c
- tests derived from glibc: str_tester.c, pth_atfork1.c
- test derived from GCC libgomp: tc17_sembar.c
- performance tests derived from bzip2 or tinycc (with embedded GPL
  text in code): bz2.c, test_input_for_tinycc.c and tinycc.c
2019-05-26 20:07:51 +02:00
Julian Seward
50bb127b1d Bug 402781 - Redo the cache used to process indirect branch targets.
[This commit contains an implementation for all targets except amd64-solaris
and x86-solaris, which will be completed shortly.]

In the baseline simulator, jumps to guest code addresses that are not known at
JIT time have to be looked up in a guest->host mapping table.  That means:
indirect branches, indirect calls and, most commonly, returns.  Since there are
huge numbers of these (often 10+ million/second) the mapping mechanism needs
to be extremely cheap.

Currently, this is implemented using a direct-mapped cache, VG_(tt_fast), with
2^15 (guest_addr, host_addr) pairs.  This is queried in handwritten assembly
in VG_(disp_cp_xindir) in dispatch-<arch>-<os>.S.  If there is a miss in the
cache then we fall back out to C land, and do a slow lookup using
VG_(search_transtab).
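
For illustration, a minimal C sketch of how such a direct-mapped lookup
behaves (names, sizes and struct layout are simplified assumptions, not
Valgrind's exact definitions; the real fast path is handwritten assembly):

  /* Hypothetical, simplified model of the direct-mapped fast cache. */
  #include <stdint.h>

  #define FAST_BITS 15
  #define FAST_SIZE (1u << FAST_BITS)             /* 2^15 entries */
  #define FAST_MASK (FAST_SIZE - 1)

  typedef struct { uint64_t guest; void *host; } FastEnt;
  static FastEnt fast_cache[FAST_SIZE];

  void *slow_lookup(uint64_t guest_addr);  /* stands in for VG_(search_transtab) */

  void *lookup_host_code(uint64_t guest_addr)
  {
     FastEnt *e = &fast_cache[guest_addr & FAST_MASK];  /* slice off low bits */
     if (e->guest == guest_addr)
        return e->host;                              /* fast path: hit */
     /* miss: fall back to the slow C-land search and refill this entry */
     void *host = slow_lookup(guest_addr);
     e->guest = guest_addr;
     e->host  = host;
     return host;
  }

On a miss the real dispatcher has to leave assembly and re-enter C, which is
why the cost and frequency of misses matter so much for large applications.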

Given that the size of the translation table(s) in recent years has expanded
significantly in order to keep pace with increasing application sizes, two bad
things have happened: (1) the cost of a miss in the fast cache has risen
significantly, and (2) the miss rate on the fast cache has also increased
significantly.  This means that large (~ one-million-basic-blocks-JITted)
applications that run for a long time end up spending a lot of time in
VG_(search_transtab).

The proposed fix is to increase associativity of the fast cache, from 1
(direct mapped) to 4.  Simulations of various cache configurations using
indirect-branch traces from a large application show that this is the best of
the configurations tried.  In an extreme case with 5.7 billion indirect
branches:

* The increase of associativity from 1 way to 4 way, whilst keeping the
  overall cache size the same (32k guest/host pairs), reduces the miss rate by
  around a factor of 3, from 4.02% to 1.30%.

* The use of a slightly better hash function than merely slicing off the
  bottom 15 bits of the address reduces the miss rate further, from 1.30% to
  0.53% (a sketch follows below).
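
A rough illustration of the kind of hash meant here, folding higher address
bits into the index with a shift and an xor rather than simply truncating
(the exact mixing used in the real code may differ):

  /* Illustrative only: mix high bits into the index before masking. */
  #include <stdint.h>

  #define FAST_BITS 15
  #define FAST_MASK ((1u << FAST_BITS) - 1)

  static inline uint32_t fast_hash(uint64_t guest_addr)
  {
     return (uint32_t)((guest_addr ^ (guest_addr >> FAST_BITS)) & FAST_MASK);
  }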

Overall the VG_(tt_fast) miss rate is almost unchanged on small workloads, but
reduced by a factor of up to almost 8 on large workloads.

By implementing each (4-entry) cache set using a move-to-front scheme in the
case of hits in ways 1, 2 or 3, the vast majority of hits can be made to
happen in way 0.  Hence the cost of having this extra associativity is almost
zero in the case of a hit.  The improved hash function costs an extra two ALU
operations (a shift and an xor), but overall this seems to be performance
neutral or a small win.
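
A hedged C sketch of one way such a 4-way, move-to-front set could look
(entry layout, set count and eviction policy are illustrative assumptions,
not the actual implementation, whose hit path is handwritten assembly):

  #include <stdint.h>

  #define N_WAYS   4
  #define N_SETS   (1u << 13)       /* 8192 sets x 4 ways = 32k guest/host pairs */
  #define SET_MASK (N_SETS - 1)

  typedef struct {
     uint64_t guest[N_WAYS];        /* way 0 is always the most recently used */
     void    *host[N_WAYS];
  } FastSet;

  static FastSet fast_sets[N_SETS];

  void *slow_lookup(uint64_t guest_addr);  /* stands in for VG_(search_transtab) */

  static inline uint32_t set_index(uint64_t ga)
  {
     return (uint32_t)((ga ^ (ga >> 13)) & SET_MASK);   /* a shift and an xor */
  }

  /* Shift ways 0..from-1 down one slot and install (guest, host) in way 0. */
  static void install_way0(FastSet *s, uint64_t guest, void *host, int from)
  {
     for (int j = from; j > 0; j--) {
        s->guest[j] = s->guest[j-1];
        s->host[j]  = s->host[j-1];
     }
     s->guest[0] = guest;
     s->host[0]  = host;
  }

  void *lookup_host_code_4way(uint64_t guest_addr)
  {
     FastSet *s = &fast_sets[set_index(guest_addr)];
     for (int i = 0; i < N_WAYS; i++) {
        if (s->guest[i] == guest_addr) {
           void *host = s->host[i];
           if (i != 0)
              install_way0(s, guest_addr, host, i);  /* move-to-front on hit */
           return host;
        }
     }
     /* miss: slow lookup, then install in way 0, evicting the entry in way 3 */
     void *host = slow_lookup(guest_addr);
     install_way0(s, guest_addr, host, N_WAYS - 1);
     return host;
  }

Because a hit in way 0 costs the same single comparison as the old
direct-mapped scheme, the extra associativity is essentially free on the
common path.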
2019-01-25 09:14:56 +01:00
Ivo Raisr
38edd50c0e Update copyright end year to 2017 in preparation for 3.13 release.
n-i-bz



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16333
2017-05-04 15:09:39 +00:00
Florian Krohm
193f88fad4 Make sure no executable stack gets created.
Explanation by Matthias Schwarzott:

The linker will request an executable stack as soon as at least one
object file that is linked in wants an executable stack.
And the absence of the
      .section .note.GNU-stack,"",@progbits
is enough to tell the linker that an executable stack is needed.
So even an empty asm file must at least contain this statement so as not to
force an executable stack on the whole executable.

* Define a helper macro MARK_STACK_NO_EXEC that disables the
  executable stack.
* Instantiate this macro unconditionally at the end of each asm file.

Patch by Matthias Schwarzott <zzam@gentoo.org>.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15692
2015-09-30 20:30:48 +00:00
Julian Seward
adc2dafee9 Update copyright dates, to include 2015. No functional change.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15577
2015-08-21 11:32:26 +00:00
Julian Seward
3f6d211236 Add support for ARMv8 AArch64 (the 64 bit ARM instruction set).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13770
2014-01-12 12:54:00 +00:00