blocks, the <= relation is the correct one. In effect, asserting <
constitutes an off-by-one error.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5366
accessed blocks closer to the front. This speeds up malloc/free
intensive programs because evidently those searches cause a lot of
cache misses (so cachegrind tells us). For perf/heap.c on P4
Northwood, this halves the run-time (!) from 85.8 to 42.9 seconds.
For "real" code (start/exit ktuberling) there is a small but
worthwhile performance gain, of about 2 seconds out of 95.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5365
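A rough illustration of the idea in r5365, not the committed code: the start of the message is cut off, so the data structure and the exact promotion policy are not known here. The sketch assumes a plain array searched linearly, and a hit that moves the entry one slot toward the front. `Block` and `find_and_promote` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical record for a tracked heap block. */
typedef struct {
    unsigned long addr;
    size_t size;
} Block;

/* Linear search for addr in blocks[0..n). On a hit at index i > 0 the
 * entry is swapped with its predecessor, so blocks that are looked up
 * often drift toward the front and later searches touch less memory.
 * Returns the entry's index after the swap, or -1 if addr is not found. */
long find_and_promote(Block *blocks, size_t n, unsigned long addr)
{
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].addr == addr) {
            if (i > 0) {
                Block tmp = blocks[i - 1];
                blocks[i - 1] = blocks[i];
                blocks[i] = tmp;
                return (long)(i - 1);
            }
            return 0;
        }
    }
    return -1;
}
```

Moving one slot per hit is only one possible policy; moving the entry straight to index 0 would also fit the wording "closer to the front".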
- show percentage speedup over the first Valgrind when comparing multiple
Valgrinds
- don't accept --reps < 0
- avoid div-by-zero if the runtime is measured as zero
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5348
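The three changes in r5348 can be sketched roughly as below. This is C for illustration only, not the perf script's own code. `speedup_percent` and `reps_is_valid` are hypothetical names, and treating the first Valgrind's time as the divisor is an assumption, since the message does not say which runtime can be measured as zero.

```c
/* Hypothetical helper: percentage speedup of a run relative to the
 * baseline (first) Valgrind. A positive result means the other run is
 * faster. Returns 0.0 when the baseline time is measured as zero, so
 * the division is never attempted with a zero divisor. */
double speedup_percent(double baseline_secs, double other_secs)
{
    if (baseline_secs <= 0.0) {
        return 0.0;
    }
    return 100.0 * (baseline_secs - other_secs) / baseline_secs;
}

/* Hypothetical check for a --reps value: negative counts are rejected. */
int reps_is_valid(long reps)
{
    return reps >= 0;
}
```

With a baseline of 100 seconds and a second run of 75 seconds, `speedup_percent` reports a 25% speedup.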
to translations and back to dispatcher, and also different arg
passing conventions to LibVEX_Translate).
- Rewrite x86 dispatcher to not increment the profiling counters
unless requested by the user. This dramatically reduces the
D1 miss rate and gives considerable performance improvement
on x86. Also, restructure and add comments to dispatch-x86-linux.S
to make it much easier to follow (imo).
amd64/ppc32/ppc64 fixes to follow.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5345
- needed some hackery to get around VEX's loss of accuracy.
------------------------------
Added test for fsqrt (fp square root)
Enabled stfs(u)(x) (fp single-precision stores)
- VEX implementation not great: ends up rounding twice, losing
accuracy, but is good enough for this test's small fp argument array.
Changed fp arg setup
- no denormals (for VEX inaccuracy)
All fp tests
- don't print CR, XER flags, as VEX doesn't set them.
3 arg fp arith tests (fp 'multiply and add' etc)
- no 'special' fp vals (for VEX inaccuracy)
- zap lo byte (for VEX inaccuracy)
fctiw, fctiwz (fp convert to int)
- zap high 32 bits of result (they are undefined)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5344
(use "make perf" to run) that executes test programs and times their
slowdowns under various tools. It works a lot like the vg_regtest script.
It's a bit rough around the edges -- e.g. you can't currently directly
compare two different versions of Valgrind, which would be useful -- but it
is a good start.
There are currently two test programs in perf/. More will be added as time
goes on. This stuff will be built on so that performance changes can be
tracked over time.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5323
counts when a function name was used in more than one module. This showed
up for "???" functions when profiling Valgrind itself.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5319
- fixed launcher.c to recognise ppc32/64-linux platforms properly
- lots of assembly fixes to handle function descriptors, TOC references, 64-bit regs.
- fixed var types in vki-ppc64-linux
Now gets as far as VG_(translate), but dies from a case of invalid orig_addr.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5299
uses memory after freeing. Check the redzones for all non-client
frees, and fill all non-client freed areas with garbage. Unroll
VG_(memset) as a precautionary measure against performance lossage.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5283
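A minimal sketch of the redzone check and garbage fill described in r5283, built on the C library's malloc rather than Valgrind's own allocator. The layout, the constants and the `rz_*` names are all assumptions made for illustration, and payload alignment is not handled.

```c
#include <stdlib.h>
#include <string.h>

#define RZ_BYTES   8     /* hypothetical redzone size */
#define RZ_PATTERN 0xA5  /* hypothetical byte written into redzones */
#define FREED_FILL 0xDD  /* hypothetical garbage byte for freed payloads */

/* Assumed layout: [size_t n][low redzone][payload of n bytes][high redzone] */
void *rz_malloc(size_t n)
{
    unsigned char *base = malloc(sizeof(size_t) + 2 * RZ_BYTES + n);
    if (base == NULL) {
        return NULL;
    }
    memcpy(base, &n, sizeof n);
    memset(base + sizeof(size_t), RZ_PATTERN, RZ_BYTES);
    memset(base + sizeof(size_t) + RZ_BYTES + n, RZ_PATTERN, RZ_BYTES);
    return base + sizeof(size_t) + RZ_BYTES;
}

/* Returns 1 if both redzones still hold the pattern, 0 otherwise. */
int rz_check(const void *p)
{
    const unsigned char *payload = p;
    const unsigned char *lo = payload - RZ_BYTES;
    size_t n;
    memcpy(&n, lo - sizeof(size_t), sizeof n);
    const unsigned char *hi = payload + n;
    for (size_t i = 0; i < RZ_BYTES; i++) {
        if (lo[i] != RZ_PATTERN || hi[i] != RZ_PATTERN) {
            return 0;
        }
    }
    return 1;
}

/* Checks the redzones, overwrites the payload with garbage so that any
 * use after free reads obviously bad data, then releases the block.
 * Returns the result of the redzone check. */
int rz_free(void *p)
{
    if (p == NULL) {
        return 1;
    }
    unsigned char *payload = p;
    unsigned char *base = payload - RZ_BYTES - sizeof(size_t);
    size_t n;
    memcpy(&n, base, sizeof n);
    int ok = rz_check(p);
    memset(payload, FREED_FILL, n);
    free(base);
    return ok;
}
```

Writing one byte past the end of a block lands in the high redzone, so the next `rz_check` or `rz_free` on that block returns 0.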
has to have the same status as the HTML/PDF/PS docs, that is, not
built by default because it depends on the ultra-fragile XML
toolchain. So make it use the same hacks, that is, build only at
'make dist' time.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5279