26 Commits

Author SHA1 Message Date
Mark Wielaard
461cc5c003 Cleanup GPL header address notices by using http://www.gnu.org/licenses/
Sync VEX/LICENSE.GPL with top-level COPYING file. We used 3 different
addresses for writing to the FSF to receive a copy of the GPL. Replace
all different variants with an URL <http://www.gnu.org/licenses/>.

The following files might still have some slightly different (L)GPL
copyright notice because they were derived from other programs:

- files under coregrind/m_demangle which come from libiberty:
  cplus-dem.c, d-demangle.c, demangle.h, rust-demangle.c,
  safe-ctype.c and safe-ctype.h
- coregrind/m_demangle/dyn-string.[hc] derived from GCC.
- coregrind/m_demangle/ansidecl.h derived from glibc.
- VEX files for FMA detived from glibc:
  host_generic_maddf.h and host_generic_maddf.c
- files under coregrin/m_debuginfo derived from LZO:
  lzoconf.h, lzodefs.h, minilzo-inl.c and minilzo.h
- files under coregrind/m_gdbserver detived from GDB:
  gdb/signals.h, inferiors.c, regcache.c, regcache.h,
  regdef.h, remote-utils.c, server.c, server.h, signals.c,
  target.c, target.h and utils.c

Plus the following test files:

- none/tests/ppc32/testVMX.c derived from testVMX.
- ppc tests derived from QEMU: jm-insns.c, ppc64_helpers.h
  and test_isa_3_0.c
- tests derived from bzip2 (with embedded GPL text in code):
  hackedbz2.c, origin5-bz2.c, varinfo6.c
- tests detived from glibc: str_tester.c, pth_atfork1.c
- test detived from GCC libgomp: tc17_sembar.c
- performance tests derived from bzip2 or tinycc (with embedded GPL
  text in code): bz2.c, test_input_for_tinycc.c and tinycc.c
2019-05-26 20:07:51 +02:00
Julian Seward
50bb127b1d Bug 402781 - Redo the cache used to process indirect branch targets.
[This commit contains an implementation for all targets except amd64-solaris
and x86-solaris, which will be completed shortly.]

In the baseline simulator, jumps to guest code addresses that are not known at
JIT time have to be looked up in a guest->host mapping table.  That means:
indirect branches, indirect calls and most commonly, returns.  Since there are
huge numbers of these (often 10+ million/second) the mapping mechanism needs
to be extremely cheap.

Currently, this is implemented using a direct-mapped cache, VG_(tt_fast), with
2^15 (guest_addr, host_addr) pairs.  This is queried in handwritten assembly
in VG_(disp_cp_xindir) in dispatch-<arch>-<os>.S.  If there is a miss in the
cache then we fall back out to C land, and do a slow lookup using
VG_(search_transtab).

Given that the size of the translation table(s) in recent years has expanded
significantly in order to keep pace with increasing application sizes, two bad
things have happened: (1) the cost of a miss in the fast cache has risen
significantly, and (2) the miss rate on the fast cache has also increased
significantly.  This means that large (~ one-million-basic-blocks-JITted)
applications that run for a long time end up spending a lot of time in
VG_(search_transtab).

The proposed fix is to increase associativity of the fast cache, from 1
(direct mapped) to 4.  Simulations of various cache configurations using
indirect-branch traces from a large application show that is the best of
various configurations.  In an extreme case with 5.7 billion indirect
branches:

* The increase of associativity from 1 way to 4 way, whilst keeping the
  overall cache size the same (32k guest/host pairs), reduces the miss rate by
  around a factor of 3, from 4.02% to 1.30%.

* The use of a slightly better hash function than merely slicing off the
  bottom 15 bits of the address, reduces the miss rate further, from 1.30% to
  0.53%.

Overall the VG_(tt_fast) miss rate is almost unchanged on small workloads, but
reduced by a factor of up to almost 8 on large workloads.

By implementing each (4-entry) cache set using a move-to-front scheme in the
case of hits in ways 1, 2 or 3, the vast majority of hits can be made to
happen in way 0.  Hence the cost of having this extra associativity is almost
zero in the case of a hit.  The improved hash function costs an extra 2 ALU
shots (a shift and an xor) but overall this seems performance neutral to a
win.
2019-01-25 09:14:56 +01:00
Ivo Raisr
38edd50c0e Update copyright end year to 2017 in preparation for 3.13 release.
n-i-bz



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16333
2017-05-04 15:09:39 +00:00
Florian Krohm
193f88fad4 Make sure no executable stack gets created.
Explanation by Matthias Schwarzott:

The linker will request an executable stack as soon as at least one
object file, that is linked in, wants an executable stack.
And the absence of the 
      .section .note.GNU-stack."",@progbits
is enough to tell the linker that an executable stack is needed.
So even an empty asm-file must at least contain this statement to not
force executable stacks on the whole executable.

* Define a helper macro MARK_STACK_NO_EXEC that disables the
  executable stack.
* Instantiate this macro unconditionally at the end of each asm file.

Patch by Matthias Schwarzott <zzam@gentoo.org>.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15692
2015-09-30 20:30:48 +00:00
Julian Seward
adc2dafee9 Update copyright dates, to include 2015. No functional change.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15577
2015-08-21 11:32:26 +00:00
Julian Seward
dbf9b63605 Update copyright dates (20XY-2012 ==> 20XY-2013)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13658
2013-10-18 14:27:36 +00:00
Julian Seward
4a3633e266 Update copyright dates to include 2012.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12843
2012-08-05 15:46:46 +00:00
Julian Seward
09bcae8ec8 Last optimisation for the day: change VG_(stats__n_xindirs) in such a
way that the fast-path through VG_(disp_cp_xindir) only has to
increment a 32 bit counter, saving memory bandwidth on 32 bit
platforms compared to a 64-bit inc.  The overall numbers of XIndirs
can still be 64 bit though.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12527
2012-04-21 23:05:57 +00:00
Julian Seward
712ee2547b Make the return type of VG_(disp_run_translations) be void, rather
than the HWord it was claimed to be.  Inconsistency spotted by
Philippe Waroquiers.



git-svn-id: svn://svn.valgrind.org/valgrind/branches/TCHAIN@12486
2012-04-04 12:23:23 +00:00
Julian Seward
8b6f93641c Add translation chaining support for amd64, x86 and ARM
(Valgrind side).  See #296422.



git-svn-id: svn://svn.valgrind.org/valgrind/branches/TCHAIN@12484
2012-04-02 21:56:03 +00:00
Julian Seward
c96096ab24 Update all copyright dates, from 20xy-2010 to 20xy-2011.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12206
2011-10-23 07:32:08 +00:00
Julian Seward
ffc3968ff2 Give the amd64-linux and x86-linux dispatchers two entry points, not one,
so as to avoid a GSP-changed check in the common case.  See vex r2155.
(amd64-darwin and x86-darwin are now temporarily unbuildable.)



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11786
2011-05-29 09:34:30 +00:00
Julian Seward
3a131dc867 dispatch-x86-linux.S:
use test-based detection of GSP pointer changes.
   Saves one load per SB.

dispatch-amd64-linux.S:
   ditto

dispatch-amd64-linux.S:
   use movabsq to get &VG_(tt_fast) into a register,
   instead of an rsp-relative load from a constant pool.
   Saves a second load per SB.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11785
2011-05-28 17:07:53 +00:00
Julian Seward
9b0574dff8 Update copyright dates to 2010.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11121
2010-05-03 21:37:12 +00:00
Nicholas Nethercote
b05a2a18d7 This commit merges the BUILD_TWEAKS branch onto the trunk. It has the
following improvements:

- Arch/OS/platform-specific files are now included/excluded via the
  preprocessor, rather than via the build system.  This is more consistent
  (we use the pre-processor for small arch/OS/platform-specific chunks
  within files) and makes the build system much simpler, as the sources for
  all programs are the same on all platforms.

- Vast amounts of cut+paste Makefile.am code has been factored out.  If a
  new platform is implemented, you need to add 11 extra Makefile.am lines.
  Previously it was over 100 lines.

- Vex has been autotoolised.  Dependency checking now works in Vex (no more
  incomplete builds).  Parallel builds now also work.  --with-vex no longer
  works;  it's little use and a pain to support.  VEX/Makefile is still in
  the Vex repository and gets overwritten at configure-time;  it should
  probably be renamed Makefile-gcc to avoid possible problems, such as
  accidentally committing a generated Makefile.  There's a bunch of hacky
  copying to deal with the fact that autotools don't handle same-named files
  in different directories.  Julian plans to rename the files to avoid this
  problem.

- Various small Makefile.am things have been made more standard automake
  style, eg. the use of pkginclude/pkglib prefixes instead of rolling our
  own.

- The existing five top-level Makefile.am include files have been
  consolidated into three.

- Most Makefile.am files now are structured more clearly, with comment
  headers separating sections, declarations relating to the same things next
  to each other, better spacing and layout, etc.

- Removed the unused exp-ptrcheck/tests/x86 directory.

- Renamed some XML files.

- Factored out some duplicated dSYM handling code.

- Split auxprogs/ into auxprogs/ and mpi/, which allowed the resulting
  Makefile.am files to be much more standard.

- Cleaned up m_coredump by merging a bunch of files that had been
  overzealously separated.

The net result is 630 fewer lines of Makefile.am code, or 897 if you exclude
the added Makefile.vex.am, or 997 once the hacky file copying for Vex is
removed.  And the build system is much simpler.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10364
2009-06-24 00:37:09 +00:00
Nicholas Nethercote
2001629c3f Updated copyright years.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@9344
2009-03-10 22:02:09 +00:00
Julian Seward
5679a22410 Update copyright dates ("200X-2007" --> "200X-2008").
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@7398
2008-02-11 11:34:59 +00:00
Julian Seward
79f854bc29 Redo the dispatcher's fast-cache mechanism (VG_(tt_fast) et al) to be
more cache friendly.  This changes the mechanism from being a table of
pointers to (guest address, translated code pairs) to being a table of
pairs (guest address, pointer to translated code).  The effect ranges
from zero up to about 20% performance improvement on memcheck, the
biggest effects being seen for programs which jump around a large
number of blocks of code and whose data set does not fit in L2.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@6582
2007-02-11 05:08:06 +00:00
Julian Seward
172505c978 Update copyright dates.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@6488
2007-01-08 06:01:59 +00:00
Julian Seward
0850d32443 Merge r6217 (also comment cosmetics):
Use 'ctr' rather than 'lr' for indirect jumps, so as not to trash the
branch predictor(s) for returns from generated code.  Makes a big
difference on ppc970 (and POWER4).


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@6291
2006-10-17 02:08:26 +00:00
Julian Seward
ad67fd79fe Update copyright dates.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5954
2006-06-05 23:21:15 +00:00
Tom Hughes
f5f7215ba4 Add .type and .size directives for VG_(run_innerloop) and
VG_(run_a_noredir_translation) on all platforms where they are
missing.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5873
2006-05-01 09:28:39 +00:00
Julian Seward
8576ec23a6 Make function wrapping work on amd64-linux.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5521
2006-01-12 13:34:20 +00:00
Julian Seward
bcc3feca7e Rewrite amd64 dispatch loop to add performance enhancements as per x86
reorganisation of r5345.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5346
2005-12-15 15:46:43 +00:00
Julian Seward
6c9242d63c Hacks to enable self-hosting on amd64, so as to facilitate
cachegrinding it.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5119
2005-11-13 18:51:31 +00:00
Nicholas Nethercote
4ef4aabbd0 Make the dispatch files platform-specific, not just arch-specific,
as requested by Greg Parker.  (The ppc32/Darwin dispatch loop is
different to the ppc32/Linux one, for example.)



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@4843
2005-10-02 14:48:09 +00:00