15957 Commits

Author SHA1 Message Date
Mark Wielaard
400ad0e36e Fix memcheck/tests/linux/dlclose_leak.c build under -std=gnu90.
Older gcc (4.8) defaults to GNU C90, causing:

dlclose_leak.c:14:5: error: ‘for’ loop initial declarations are only
                     allowed in C99 mode

Fix by declaring int i before the loop.
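
A minimal sketch of the kind of change described, not the actual test source; the function sum_first is a hypothetical stand-in. Declaring the counter before the loop compiles under -std=gnu90 as well as under later standards.

```c
/* Hypothetical example; not taken from dlclose_leak.c. */
int sum_first(const int *vals, int n)
{
    int i;          /* declared before the loop: valid in C90 and later */
    int total = 0;

    /* Writing "for (int i = 0; ...)" here is rejected in C90 mode. */
    for (i = 0; i < n; i++)
        total += vals[i];
    return total;
}
```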
2018-01-16 11:08:59 +01:00
Julian Seward
f8ae2f95d6 Bug 79362 - Debug info is lost for .so files when they are dlclose'd. Followup fix to avoid assertion failure when dlopening an object that has previously been dlclosed.
As reported by Matthias Schwarzott <zzam@gentoo.org>.  Testcase patch from him.  The fix is
for check_CFSI_related_invariants() to avoid checking for overlaps against DebugInfos that are
in 'archived' status, since -- if a previously dlopened-and-then-dlclosed object is later
re-dlopened -- this may cause an overlap between the active and archived DebugInfos, which
is of no consequence.  If the kernel maps the object to the same VMA the second time around
then there will *certainly* be an overlap.
2018-01-15 11:25:12 +01:00
Mark Wielaard
7d04030322 Additional fix for gnu debug alt file resolving.
Also handle the case where the symlink itself contains a relative path.
Then we need to add the symlink dir before it.

https://bugs.kde.org/show_bug.cgi?id=387773
2018-01-13 14:33:50 +01:00
Petar Jovanovic
91cb209442 mips: fix build failure introduced with "Bug 79362 - Debug info ..."
Previous commit (cceed053ce876560b9a7512125dd93c7fa059778) broke the build
for MIPS architecture.
Update the code in VG_(get_StackTrace_wrk) to reflect the changes made in
the previous commit.
2018-01-12 18:20:36 +01:00
Julian Seward
cceed053ce Bug 79362 - Debug info is lost for .so files when they are dlclose'd. Majorly reworked by Philippe Waroquiers. 2018-01-11 19:40:12 +01:00
Julian Seward
f1a49eeb42 Bug 385408 - s390x: z13 vector "support" instructions not implemented. Patch from Vadim Barkov (vbrkov@gmail.com).
(from bug 385408 comment 0):
Valgrind currently lacks support for the z/Architecture vector "support"
instructions introduced with z13.  These are documented in the
z/Architecture Principles of Operation, Eleventh Edition (March, 2015),
chapter 21: "Vector Overview and Support Instructions".
2018-01-11 18:20:27 +01:00
Julian Seward
0f18cfc986 Fix memcheck/tests/vbit-test (the vbit test program) to track changes in bug 387664.
Bug 387664 changes the default settings for accurate definedness checking
for {Add,Sub}{32,64} and {CmpEQ,CmpNE}{8,16,32,64}.  This fix updates the
vbit tester (memcheck/tests/vbit-test) to test the accurate versions of
these, and thereby fixes a regression caused by
e847cb5429927317023d8410c3c56952aa47fb08 as committed for bug 387664.
2018-01-03 11:55:44 +01:00
Julian Seward
f16ba15391 expensiveAddSub(): Fix incorrect comment. No functional change. 2018-01-03 11:38:14 +01:00
Ivo Raisr
3a5c5cecbd Remove compiler warning about possibly uninitialized variable.
This happened only with quite an old gcc version.
Anyway, this commit simplifies the situation a bit.
2017-12-13 17:01:08 +01:00
Mark Wielaard
be82bb5f9d Fix gnu debug alt file resolving.
https://bugs.kde.org/show_bug.cgi?id=387773

The path to the alt file is relative to the actual debug file.
Make sure that we got the real file, not a (build-id) symlink.
Also handle the case where a debug or alt file is an absolute path.
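
The path rule above can be sketched as plain string handling. This is an illustration only, assuming the caller has already resolved any symlink to the real debug file; resolve_alt_path is a hypothetical name and is not a Valgrind function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the alt file path.
   real_debug_path is assumed to be the real debug file (symlinks
   already resolved).  Returns 0 on success, -1 if out is too small. */
int resolve_alt_path(const char *real_debug_path, const char *alt,
                     char *out, size_t outsz)
{
    const char *slash;
    int len;

    assert(outsz > 0);
    if (alt[0] == '/') {
        /* An absolute alt path is used as-is. */
        len = snprintf(out, outsz, "%s", alt);
    } else {
        /* A relative alt path is relative to the debug file's directory. */
        slash = strrchr(real_debug_path, '/');
        if (slash == NULL)
            len = snprintf(out, outsz, "%s", alt);
        else
            len = snprintf(out, outsz, "%.*s/%s",
                           (int)(slash - real_debug_path),
                           real_debug_path, alt);
    }
    return (len >= 0 && (size_t)len < outsz) ? 0 : -1;
}
```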
2017-12-13 00:22:53 +01:00
Julian Seward
d6a810760e Fix false positive with s390x cgijnl instruction testing against sign bit.
https://bugs.kde.org/show_bug.cgi?id=387712

When the cgij "compare immediate and branch relative" instruction
compares 0 <=signed dep1, that means dep1 >=signed 0, so it is a test
against the most significant bit of dep1. So only that bit needs
to be defined.
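
As an illustration of the reasoning (not the code in the patch), using the Memcheck convention that a 1 bit in the V bits means "undefined": the signed test dep1 >= 0 is undefined only when the sign bit of dep1 is undefined. The function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the outcome of the signed test (dep1 >= 0) is undefined,
   given the V bits of dep1 (1 = undefined).  Only bit 63 matters. */
int sign_test_is_undefined(uint64_t dep1_vbits)
{
    return (int)(dep1_vbits >> 63);
}
```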
2017-12-12 22:31:54 +01:00
Mark Wielaard
c5218ff4c1 Remove old Haskell and orig diff files.
These files haven't been used for the last 20 years.
2017-12-12 19:14:59 +01:00
Julian Seward
e847cb5429 Bug 387664 - Memcheck: make expensive-definedness-checks be the default
Memcheck tries to accurately track definedness at the bit level, at least
for scalar integer operations.  For many operations it is good enough to use
approximations which may overstate the undefinedness of the result of an
operation, provided that fully defined inputs still produce a fully defined
output.  For example, the standard analysis for an integer add is

   Add#(x#, y#) = Left(UifU(x#, y#))

which (as explained in the USENIX 05 paper
http://valgrind.org/docs/memcheck2005.pdf) means: for an add, worst-case
carry propagation is assumed.  So all bits to the left of, and including,
the rightmost undefined bit in either operand, are assumed to be undefined.
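
A small sketch of this cheap rule on 32-bit values, assuming the usual Memcheck convention that a 1 bit in the V bits means "undefined". The function names are illustrative and are not the ones used in mc_translate.c.

```c
#include <stdint.h>

/* UifU ("undefined if either is undefined"): OR of the V bits. */
uint32_t uifu32(uint32_t xv, uint32_t yv)
{
    return xv | yv;
}

/* Left: smear the rightmost set bit leftwards, computed as v | -v. */
uint32_t left32(uint32_t v)
{
    return v | (0u - v);
}

/* Cheap V bits for an add: Left(UifU(x#, y#)). */
uint32_t cheap_add_vbits32(uint32_t xv, uint32_t yv)
{
    return left32(uifu32(xv, yv));
}
```

For example, if only bit 4 of one operand is undefined, every result bit from bit 4 upwards is reported as undefined.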

As compilers have become increasingly aggressive, some of these
approximations are no longer good enough.  For example, LLVM for some years
has used Add operations with partially undefined inputs, when it knows that
the carry propagation will not pollute important parts of the result.
Similarly, both GCC and LLVM will generate integer equality comparisons with
partially undefined inputs in situations where it knows the result of the
comparison will be defined.  In both cases, Memcheck's default strategies
give rise to false uninitialised-value errors, and the problem is getting
worse as time goes by.

Memcheck already has expensive (non-default) instrumentation for integer
adds, subtracts, and equality comparisons.  Currently these are only used if
you specify --expensive-definedness-checks=yes, and in some rare cases to do
with inlined string operations, as determined by analysing the block to be
instrumented, and by default on MacOS.  The performance hit from them can be
quite high, up to 30% lossage.
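
For illustration, one way an accurate add can be modelled is by bounding each operand between its minimum and maximum possible values and marking as undefined every result bit that differs between the two extreme sums. This is a sketch under that assumption, with 1 meaning "undefined" in the V bits; the function name is hypothetical and the code is not copied from Memcheck.

```c
#include <stdint.h>

/* a, b: operand values; av, bv: their V bits (1 = undefined). */
uint32_t expensive_add_vbits32(uint32_t a, uint32_t av,
                               uint32_t b, uint32_t bv)
{
    uint32_t a_min = a & ~av;   /* undefined bits forced to 0 */
    uint32_t a_max = a | av;    /* undefined bits forced to 1 */
    uint32_t b_min = b & ~bv;
    uint32_t b_max = b | bv;

    /* Undefined input bits stay undefined; in addition, any bit that
       differs between the smallest and largest sum is undefined. */
    return (av | bv) | ((a_min + b_min) ^ (a_max + b_max));
}
```

With bit 4 of one operand undefined and the other operand equal to 1, only bit 4 of the result is reported undefined, since no carry can reach the higher bits.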

This patch makes the following changes:

* During instrumentation, there is much finer control over which IROps get
  expensive instrumentation.  The following groups can now be selected
  independently for expensive or cheap instrumentation:

     Iop_Add32
     Iop_Add64
     Iop_Sub32
     Iop_Sub64
     Iop_CmpEQ32 and Iop_CmpNE32
     Iop_CmpEQ64 and Iop_CmpNE64

  This makes it possible to enable, on a given platform, only the minimal
  necessary set of expensive cases.

* The default set of expensive cases can be set on a per-platform basis.
  This is set up in the first part of MC_(instrument).

* There is a new pre-instrumentation analysis pass.  It identifies Iop_Add32
  and Iop_Add64 uses for which the expensive handling will give the same
  results as the cheap handling.  This includes all adds that are used only
  to create memory addresses.  Given that the expensive handling of adds is,
  well, expensive, and that most adds merely create memory addresses, this
  more than halves the extra costs of expensive Add handling.

* The pre-existing "bogus literal" detection (0x80808080, etc) pass
  has been rolled into the new pre-instrumentation analysis.

* The --expensive-definedness-checks= flag has been changed.  Before, it
  had two settings, "no" and "yes", with "no" being the default.  Now, it
  has three settings:

   no -- always use the cheapest handling

   auto -- use the minimum set of expensive handling needed to get
           reasonable results on this platform, and perform
           pre-instrumentation analysis so as to minimise the costs thereof

   yes -- always use the most expensive handling

  The default setting is now "auto".  The user-visible effect of the new
  default is that there should (hopefully) be a drop in false positive rates
  but (unfortunately) also some drop in performance.
2017-12-12 10:22:51 +01:00
Julian Seward
0e7c46401b Fix this test to work properly with accurate CmpEQ/NE definedness tracking
Memcheck reports an error on "if (n == 42)" in this test.  Unless, that is,
accurate CmpEQ/NE definedness tracking is enabled.  If you stare at this
long enough it is possible to see that the test "n == 42" isn't actually
undefined, because |n| is only ever zero or one, and only its least
significant bit is undefined.  So the equality comparison against 42 is
defined because there are corresponding bits in the two operands that are
different and are both defined.
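
The rule in the paragraph above can be sketched as follows, assuming the Memcheck convention that a 1 bit in the V bits means "undefined". The function name is hypothetical and this is not the instrumentation code itself.

```c
#include <stdint.h>

/* Returns 1 if the result of (x == y) is undefined, 0 if it is defined.
   xv, yv are the V bits of x and y (1 = undefined). */
int cmpeq_is_undefined32(uint32_t x, uint32_t xv, uint32_t y, uint32_t yv)
{
    uint32_t undef = xv | yv;
    uint32_t differ_and_defined = (x ^ y) & ~undef;

    /* Some pair of defined bits differs: the operands are certainly
       unequal, so the result is defined whatever the other bits are. */
    if (differ_and_defined != 0)
        return 0;

    /* Otherwise the result depends on any undefined bits present. */
    return undef != 0;
}
```

With only the least significant bit of n undefined, a comparison against 42 is defined, while a comparison against 1 is not.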

This commit fixes that by comparing with 1, which forces the result to
really depend on the only undefined bit in |n|.

I also added robustification:

* returning arbitrary values from gcc_cant_inline_me(), so as to avoid gcc
  simply copying the input to the output or otherwise deleting the
  conditional branch.

* marking gcc_cant_inline_me() as un-inlineable.

* putting compiler barriers in the second conditional in main(), so gcc
  can't simply ignore the result of the call to gcc_cant_inline_me() and
  then delete the call entirely.
2017-12-07 13:31:38 +01:00
Julian Seward
8a2acb304d amd64: add a spec rule for SHRL/SARL then CondS. gcc-8 has been seen to generate such things. 2017-12-07 12:24:57 +01:00
Julian Seward
40f0364e1e amd64: Add a new spec rule for SUBL then Cond{B,NB} in the case where the RHS is a constant power of two.
LLVM 5.0 appears to have started generating such constructions in order to
find out whether the top N bits of a value are all zero.  This currently
generates Iop_CmpLE32U on partially uninitialised data, causing false
positives in Memcheck.  It seems simplest and most efficient to remove such
constructions at this point.
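
The equivalence behind such a rule can be illustrated in plain C: for an unsigned 32-bit x and 0 <= n < 32, the test x < 2^n holds exactly when the bits of x from bit n upwards are all zero. The function names are hypothetical, and expressing the test as a shift is an assumption about how the construction can be rewritten, not a quote of the spec rule.

```c
#include <assert.h>
#include <stdint.h>

/* The form the compiler emits: subtract/compare against a power of two. */
int below_pow2_via_compare(uint32_t x, unsigned n)
{
    assert(n < 32);
    return x < (UINT32_C(1) << n);
}

/* Equivalent form that only asks whether bits n..31 are all zero. */
int below_pow2_via_shift(uint32_t x, unsigned n)
{
    assert(n < 32);
    return (x >> n) == 0;
}
```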
2017-12-05 12:35:09 +01:00
Julian Seward
ad92845f6b Rearrange sections in mc_translate.c. No functional change.
Rearrange big sections in mc_translate.c, so that the "main" instrumentation
function is at the end of the file rather than in the middle.  The previous
layout never made much sense.  The new layout is, roughly:

* stuff for baseline (level 2, non-origin tracking) instrumentation
* stuff for origin tracking (level 3) instrumentation
* the "final tidying" pass
* the main instrumentation function (and soon, a new pre-instrumentation
  analysis pass)
2017-12-05 12:04:17 +01:00
Philippe Waroquiers
0a5ff8c309 When the user asks for enough verbosity, also give the full version in the preamble
so that e.g.
   valgrind -v date
produces
   ==7639== Using Valgrind-3.14.0.GIT-c470e0c23c-20171120X and LibVEX; rerun with -h for copyright info
to give the verbose version.
2017-11-21 22:17:47 +01:00
Julian Seward
c470e0c23c arm(32)-linux: add support for the TPIDRURW system register. Fixes #386425. 2017-11-20 11:43:55 +01:00
Philippe Waroquiers
53faacfda4 Bypass gcc code generation bug triggered by -finline-functions
Commit 7dd9a7f8b3118c25014b0a77aff899e517c46bcd added the flag -finline-functions.

This triggers a code generation bug in gcc 6.3.0
(at least with gcc version 6.3.0 20170516 (Debian 6.3.0-18)).
(this bug can be reproduced e.g. on gcc67, which is a debian 9.2 system)

The bad code causes the debug trace to be indented by more than 500 characters,
giving e.g. for the first debug line produced by stage 2:
--12305:1:launcher launching /home/philippe/valgrind/git/smallthing/./.in_place/memcheck-amd64-linux
--12305:1:debuglog                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  DebugLog system started by Stage 2 (main), level 1 logging requested

This commit bypasses the code generation bug, by moving the indent calculation
just before its usage.

Note: on amd64/x86, the code size of memcheck tool increases by about 12%
with -finline-functions.
In terms of perf impact (using perf/vg_perf) this gives mixed results :
   memcheck is usually slightly faster, but some tests are slower (e.g. heap_pdb4)
   callgrind is usually slower, but some tests are faster
   helgrind : some tests are slowed down, some tests are faster (some significantly faster such as sarp and ffbench).

See below 2 runs of comparing trunk (with -finline-functions) with fixes
(which does not have -finline-functions).

-- Running  tests in perf ----------------------------------------------
-- bigcode1 --
bigcode1 trunk_untouched:0.07s  me: 2.2s (32.0x, -----)  he: 1.7s (23.9x, -----)  ca: 9.0s (129.0x, -----)
bigcode1 fixes     :0.07s  me: 2.3s (32.3x, -0.9%)  he: 1.7s (23.9x,  0.0%)  ca: 8.8s (125.4x,  2.8%)
-- bigcode2 --
bigcode2 trunk_untouched:0.07s  me: 5.0s (72.1x, -----)  he: 3.2s (46.0x, -----)  ca:18.6s (266.4x, -----)
bigcode2 fixes     :0.07s  me: 5.1s (73.0x, -1.2%)  he: 3.2s (46.1x, -0.3%)  ca:18.4s (262.9x,  1.3%)
-- bz2 --
bz2      trunk_untouched:0.43s  me: 4.5s (10.4x, -----)  he: 6.7s (15.5x, -----)  ca:10.4s (24.2x, -----)
bz2      fixes     :0.43s  me: 4.5s (10.5x, -0.4%)  he: 6.7s (15.5x,  0.0%)  ca:10.1s (23.4x,  3.4%)
-- fbench --
fbench   trunk_untouched:0.14s  me: 2.7s (19.6x, -----)  he: 1.9s (13.4x, -----)  ca: 4.0s (28.3x, -----)
fbench   fixes     :0.14s  me: 2.8s (19.9x, -1.8%)  he: 2.0s (14.6x, -8.5%)  ca: 3.9s (28.1x,  0.8%)
-- ffbench --
ffbench  trunk_untouched:0.15s  me: 2.6s (17.1x, -----)  he: 3.4s (22.4x, -----)  ca: 1.5s (10.1x, -----)
ffbench  fixes     :0.15s  me: 2.6s (17.3x, -0.8%)  he: 3.1s (20.9x,  6.8%)  ca: 1.5s (10.0x,  1.3%)
-- heap --
heap     trunk_untouched:0.05s  me: 3.6s (72.8x, -----)  he: 5.0s (100.0x, -----)  ca: 4.9s (98.2x, -----)
heap     fixes     :0.05s  me: 3.7s (73.6x, -1.1%)  he: 5.1s (102.4x, -2.4%)  ca: 4.8s (95.6x,  2.6%)
-- heap_pdb4 --
heap_pdb4 trunk_untouched:0.06s  me: 5.9s (97.7x, -----)  he: 5.6s (93.7x, -----)  ca: 5.2s (86.8x, -----)
heap_pdb4 fixes     :0.06s  me: 5.8s (96.0x,  1.7%)  he: 5.7s (95.3x, -1.8%)  ca: 5.3s (87.7x, -1.0%)
-- many-loss-records --
many-loss-records trunk_untouched:0.01s  me: 1.0s (101.0x, -----)  he: 0.8s (85.0x, -----)  ca: 0.8s (78.0x, -----)
many-loss-records fixes     :0.01s  me: 1.0s (100.0x,  1.0%)  he: 0.9s (86.0x, -1.2%)  ca: 0.8s (78.0x,  0.0%)
-- many-xpts --
many-xpts trunk_untouched:0.03s  me: 1.1s (38.3x, -----)  he: 1.4s (46.0x, -----)  ca: 1.9s (62.7x, -----)
many-xpts fixes     :0.03s  me: 1.1s (37.0x,  3.5%)  he: 1.4s (47.0x, -2.2%)  ca: 1.8s (61.3x,  2.1%)
-- memrw --
memrw    trunk_untouched:0.04s  me: 0.9s (21.5x, -----)  he: 2.3s (58.0x, -----)  ca: 1.9s (46.8x, -----)
memrw    fixes     :0.04s  me: 0.9s (22.0x, -2.3%)  he: 2.3s (58.0x,  0.0%)  ca: 1.9s (47.2x, -1.1%)
-- sarp --
sarp     trunk_untouched:0.02s  me: 1.5s (77.0x, -----)  he: 3.4s (168.5x, -----)  ca: 1.3s (63.0x, -----)
sarp     fixes     :0.02s  me: 1.6s (80.0x, -3.9%)  he: 4.0s (200.5x,-19.0%)  ca: 1.3s (65.5x, -4.0%)
-- tinycc --
tinycc   trunk_untouched:0.10s  me: 6.7s (66.7x, -----)  he: 6.6s (65.9x, -----)  ca: 7.2s (72.4x, -----)
tinycc   fixes     :0.10s  me: 6.6s (66.0x,  1.0%)  he: 6.8s (68.0x, -3.2%)  ca: 7.2s (72.1x,  0.4%)
-- Finished tests in perf ----------------------------------------------

== 12 programs, 72 timings =================

-- Running  tests in perf ----------------------------------------------
-- bigcode1 --
bigcode1 trunk_untouched:0.07s  me: 2.2s (32.0x, -----)  he: 1.7s (23.7x, -----)  ca: 9.0s (129.1x, -----)
bigcode1 fixes     :0.07s  me: 2.3s (32.3x, -0.9%)  he: 1.7s (23.9x, -0.6%)  ca: 8.8s (125.3x,  3.0%)
-- bigcode2 --
bigcode2 trunk_untouched:0.07s  me: 5.0s (72.1x, -----)  he: 3.2s (46.0x, -----)  ca:18.7s (266.6x, -----)
bigcode2 fixes     :0.07s  me: 5.1s (72.9x, -1.0%)  he: 3.2s (46.0x,  0.0%)  ca:18.5s (263.7x,  1.1%)
-- bz2 --
bz2      trunk_untouched:0.43s  me: 4.5s (10.5x, -----)  he: 6.7s (15.5x, -----)  ca:10.4s (24.2x, -----)
bz2      fixes     :0.43s  me: 4.5s (10.5x, -0.2%)  he: 6.7s (15.5x, -0.2%)  ca:10.1s (23.4x,  3.3%)
-- fbench --
fbench   trunk_untouched:0.14s  me: 2.8s (19.6x, -----)  he: 1.9s (13.4x, -----)  ca: 4.0s (28.2x, -----)
fbench   fixes     :0.14s  me: 2.8s (19.9x, -1.1%)  he: 2.0s (14.6x, -8.5%)  ca: 3.9s (28.1x,  0.3%)
-- ffbench --
ffbench  trunk_untouched:0.15s  me: 2.6s (17.1x, -----)  he: 3.4s (22.5x, -----)  ca: 1.5s (10.1x, -----)
ffbench  fixes     :0.15s  me: 2.6s (17.3x, -0.8%)  he: 3.1s (20.8x,  7.4%)  ca: 1.5s ( 9.9x,  2.0%)
-- heap --
heap     trunk_untouched:0.05s  me: 3.6s (72.6x, -----)  he: 5.0s (99.4x, -----)  ca: 4.9s (98.4x, -----)
heap     fixes     :0.05s  me: 3.7s (73.6x, -1.4%)  he: 5.1s (102.4x, -3.0%)  ca: 4.8s (95.2x,  3.3%)
-- heap_pdb4 --
heap_pdb4 trunk_untouched:0.06s  me: 5.9s (98.0x, -----)  he: 5.6s (94.0x, -----)  ca: 5.2s (86.8x, -----)
heap_pdb4 fixes     :0.06s  me: 5.8s (96.0x,  2.0%)  he: 5.7s (94.8x, -0.9%)  ca: 5.2s (87.3x, -0.6%)
-- many-loss-records --
many-loss-records trunk_untouched:0.01s  me: 1.0s (101.0x, -----)  he: 0.8s (85.0x, -----)  ca: 0.8s (76.0x, -----)
many-loss-records fixes     :0.01s  me: 1.0s (100.0x,  1.0%)  he: 0.9s (87.0x, -2.4%)  ca: 0.8s (77.0x, -1.3%)
-- many-xpts --
many-xpts trunk_untouched:0.03s  me: 1.2s (38.7x, -----)  he: 1.4s (45.3x, -----)  ca: 1.9s (62.7x, -----)
many-xpts fixes     :0.03s  me: 1.1s (37.0x,  4.3%)  he: 1.4s (47.0x, -3.7%)  ca: 1.8s (61.3x,  2.1%)
-- memrw --
memrw    trunk_untouched:0.04s  me: 0.9s (22.0x, -----)  he: 2.3s (58.2x, -----)  ca: 1.9s (46.5x, -----)
memrw    fixes     :0.04s  me: 0.9s (21.8x,  1.1%)  he: 2.3s (58.2x,  0.0%)  ca: 1.9s (47.2x, -1.6%)
-- sarp --
sarp     trunk_untouched:0.02s  me: 1.5s (76.5x, -----)  he: 3.4s (167.5x, -----)  ca: 1.3s (63.0x, -----)
sarp     fixes     :0.02s  me: 1.6s (79.5x, -3.9%)  he: 4.0s (200.5x,-19.7%)  ca: 1.3s (65.5x, -4.0%)
-- tinycc --
tinycc   trunk_untouched:0.10s  me: 6.6s (66.3x, -----)  he: 6.6s (66.2x, -----)  ca: 7.2s (72.4x, -----)
tinycc   fixes     :0.10s  me: 6.6s (66.1x,  0.3%)  he: 6.8s (68.1x, -2.9%)  ca: 7.2s (72.2x,  0.3%)
-- Finished tests in perf ----------------------------------------------

== 12 programs, 72 timings =================
2017-11-16 22:53:46 +01:00
Ivo Raisr
6fbb3ddfe5 Add .stderr.exp file for memcheck/tests/linux/capget when running inside Docker container. 2017-11-15 23:37:17 +01:00
Ivo Raisr
ae7d3ea729 Update .gitignore for Solaris. 2017-11-14 13:12:16 +00:00
Ivo Raisr
4a8ea8908f Update NEWS after fixing BZ#208052. 2017-11-14 10:51:49 +01:00
Tom Hughes
a5af4146e3 Avoid underflow in strlcpy and strlcat wrappers when count is zero
We can't decrement n because it's unsigned and might be zero which
means it would wrap and we'd wind up reading far too much.
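
A minimal sketch of a strlcpy-style copy that stays safe when the count is zero. It is not the wrapper from the patch; sketch_strlcpy is a hypothetical name. The point is that the loop bound is written so that n is never decremented.

```c
#include <stddef.h>

/* Copies at most n-1 bytes of src into dst, always terminating dst
   when n > 0, and returns strlen(src). */
size_t sketch_strlcpy(char *dst, const char *src, size_t n)
{
    size_t m = 0;

    /* "m + 1 < n" cannot wrap; "m < n - 1" would wrap when n == 0. */
    while (m + 1 < n && src[m] != '\0') {
        dst[m] = src[m];
        m++;
    }
    if (n > 0)
        dst[m] = '\0';

    /* Finish measuring src without writing anything further. */
    while (src[m] != '\0')
        m++;
    return m;
}
```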

Fixes BZ#208052
2017-11-14 09:16:26 +00:00
Petar Jovanovic
286d05eea0 synchronize access to vgdb_interrupted_tid
Delay writing to the global vgdb_interrupted_tid until all the threads are
in an interruptible state. This ensures that valgrind_wait() will see the
correct value.

This solves occasional failures of gdbserver_tests/hgtls test.
2017-11-13 13:13:28 +01:00
Philippe Waroquiers
cc89760481 Improve efficiency of SP tracking in helgrind (and incidentally in exp-sgcheck)
Helgrind (and incidentally exp-sgcheck) do not need to track both
new mem stack and die mem stack:
Helgrind only tracks new mem stack. exp-sgcheck only tracks die mem stack.

Currently, m_translate.c vg_SP_update_pass inserts helper calls
for new and die mem stack, even if the tool only needs new mem stack (helgrind)
or die mem stack (exp-sgcheck).

The optimisation consists in not inserting helper calls when the tool
does not need to see new (or die) mem stack.
Also, for helgrind, implement specialised new_mem_stack for known SP updates
with small values (like memcheck).

This reduces the size of the generated code for helgrind and exp-sgcheck.
(see below the diffs on perf/memrw). This does not impact the code generation
for tools that track both new and die mem stack (such as memcheck).

trunk:
exp-sgcheck: --28481--  transtab: new        2,256 (44,529 -> 581,402; ratio 13.1) [0 scs] avg tce size 257
helgrind:    --28496--  transtab: new        2,299 (46,667 -> 416,575; ratio 8.9) [0 scs] avg tce size 181
memcheck:    --28501--  transtab: new        2,220 (50,038 -> 777,139; ratio 15.5) [0 scs] avg tce size 350

with this patch:
exp-sgcheck: --28516--  transtab: new        2,254 (44,479 -> 567,196; ratio 12.8) [0 scs] avg tce size 251
helgrind:    --28512--  transtab: new        2,297 (46,620 -> 399,799; ratio 8.6) [0 scs] avg tce size 174
memcheck:    --28507--  transtab: new        2,219 (49,991 -> 776,028; ratio 15.5) [0 scs] avg tce size 349

In more detail, the changes consist of:

pub_core_tooliface.h:
  * add 2 booleans any_new_mem_stack and any_die_mem_stack to the tdict struct
  * rename VG_(sanity_check_needs) to VG_(finish_needs_init), as it
    now does more than sanity checks: it derives the 2 above booleans.
m_tooliface.c:
  * change VG_(sanity_check_needs) to VG_(finish_needs_init)
m_main.c:
  * update call to VG_(sanity_check_needs)
hg_main.c:
  * add a few inlines for functions just calling another function
  * define the functions evh__new_mem_stack_[4|8|12|16|32|112|128|144|160]
    (using the macro DCL_evh__new_mem_stack).
  * call the VG_(track_new_mem_stack_[4|8|12|16|32|112|128|144|160])
m_translate.c
  * n_SP_updates_* stats are now maintained separately for the new and die
    fast and known cases.
  * need_to_handle_SP_assignment can now check only the 2 booleans
    any_new_mem_stack and any_die_mem_stack
  * DO_NEW macro: no longer inserts a helper call if the tool does
    not track 'new' mem_stack.
    In case there is no new tracking, it however still does update the
    SP aliases (and the n_SP_updates_new_fast).
  * similar changes for DO_DIE macro.
  * a bunch of whitespace changes
 Note: it is easier to look at the changes in this file using
   git diff -w
 to ignore the whitespace changes (e.g. due to DO_NEW/DO_DIE indentation
 changes).

regtested on debian/amd64 and on centos/ppc64
2017-11-07 21:18:31 +01:00
Philippe Waroquiers
4d621f6510 Move or conditionalise on CHECK_CEM some expensive asserts
* Some RCEC related asserts checking there was no corruption are on hot paths
   => make these checks only when CHECK_CEM is set.
* Move an expensive assert where the event is inserted, as it is useless
  to check this when searching for an already existing event :
  it is enough to ensure that an invalid szB cannot be inserted,
  and so will not be found, and so assert will trigger in the insertion logic.
2017-11-07 21:13:55 +01:00
Julian Seward
d813fb74af s390_irgen_EX_SS: add initialisations so as to remove (false positive) warnings from gcc-7.x.
When compiling guest_s390_toIR.c for a 32-bit target (a configuration in which
it will never be used, but never mind), gcc-7.x notices that sizeof(ss.dec) is
larger than sizeof(ss.bytes), so the initialisation of ss.bytes leaves ss.dec.b2
and ss.dec.d2 uninitialised.  This patch causes both variants to be initialised.
When built for a 64 bit target, the existing initialisation of ss.bytes covers
ss.dec completely, so there is no error.
2017-11-07 15:01:51 +01:00
Julian Seward
7dd9a7f8b3 Add -finline-functions to standard build flags, so gcc will consider all functions as candidates for inlining. 2017-11-07 14:18:16 +01:00
Ivo Raisr
c46053cc38 Optionally exit on the first error with --exit-on-first-error=<yes|no>.
Fixes BZ#385939.
Slightly modified patch by: Fauchet Gauthier <gauthier.fauchet@free.fr>
2017-11-04 14:31:22 +01:00
Philippe Waroquiers
1eb5ea2afe Small optimisation in helgrind address description
Checking whether an address is in a malloc-ed client block is expensive (linear search).
So, before scanning the list of malloc blocks, check that the address is
in a client heap segment: this is a fast operation (it has a small
cache and, on a cache miss, does a dichotomic search) and avoids
scanning an often big list (for big applications).
2017-11-04 08:32:03 +01:00
Petar Jovanovic
95038d380d mips: finetune tests that print FCSR
Bits 18 (NAN2008) and 19 (ABS2008) in FCSR are preset by hardware and can
differ between platforms. Hence, we should clear these bits before printing
FCSR value in order to have the same output on different platforms.
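
A short sketch of the masking step, assuming the bit positions given above (18 for NAN2008, 19 for ABS2008). The macro and function names are hypothetical and not taken from the tests.

```c
#include <stdint.h>

/* Bit positions as stated in the commit message; names are illustrative. */
#define FCSR_NAN2008 (UINT32_C(1) << 18)
#define FCSR_ABS2008 (UINT32_C(1) << 19)

/* Clear the hardware-preset bits so the printed value is the same
   on platforms that differ only in these two bits. */
uint32_t fcsr_for_printing(uint32_t fcsr)
{
    return fcsr & ~(FCSR_NAN2008 | FCSR_ABS2008);
}
```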

This fixes several failures (tests modified by this change) that occur on
MIPS P5600 board. The P5600 is a core that implements MIPS32 Release 5 arch.
2017-11-03 19:11:36 +01:00
Philippe Waroquiers
b8fa6c086f Improve the NEWS entry for --delta-stacktrace flag. 2017-11-02 21:50:48 +01:00
Philippe Waroquiers
619fb35df7 Fix 376257 - helgrind history full speed up using a cached stack
This patch implements the flag --delta-stacktrace=yes/no.
With yes, the full history stack traces are computed by
changing just the last frame if no call/return instruction was
executed.
This can speed up helgrind by up to 25%.

This flag is currently set to yes only on linux x86 and amd64, as some
platform dependent validation of the used heuristics is needed before
setting the default to yes on a platform. See function check_cached_rcec_ok
in libhb_core.c for more details about how to validate/check the behaviour
on a new platform.
2017-11-02 21:33:35 +01:00
Carl Love
6a55b1e82c Fix access to time base register to return 64-bits. 2017-10-31 13:45:28 -05:00
Petar Jovanovic
0eea388934 android: compute possible size of a symbol of unknown size
Under specific circumstances, using 2048 as the size of a symbol of unknown
size makes the symbol cross an unmapped region. This in turn triggers an
assertion in Valgrind.

Compute possible size by computing maximal size the symbol can have within
its section.
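
One way to read this is that the size is capped at the distance from the symbol to the end of its section. The sketch below shows that arithmetic only; the function and parameter names are hypothetical and the real patch may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Largest size a symbol starting at sym_addr can have without
   extending past the end of its section [sec_addr, sec_addr + sec_size). */
uint64_t max_sym_size_in_section(uint64_t sym_addr,
                                 uint64_t sec_addr, uint64_t sec_size)
{
    /* The symbol is assumed to lie inside the section. */
    assert(sym_addr >= sec_addr);
    assert(sym_addr - sec_addr <= sec_size);
    return sec_size - (sym_addr - sec_addr);
}
```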

Patch by Tamara Vlahovic.
2017-10-31 18:00:38 +01:00
Philippe Waroquiers
2f9cceafa3 introduce a test for n-i-bz fix bug in strspn replacement
c1eace647ca4f670ef9bec0d0fe72cdd25a96394 fixed a bug in strspn replacement.
Add a test to cover this fix.
2017-10-28 15:02:11 +02:00
Petar Jovanovic
bf87528d10 mips: update NEWS about MIPS MSA support
Spread the word about MIPS MSA support.

Related BZ issue - #382563.
2017-10-28 00:39:16 +02:00
Petar Jovanovic
0e1fa562e9 mips: MSA tests
This set of tests covers the whole MSA instruction set:

  none/tests/mips32/msa_arithmetic
  none/tests/mips32/msa_comparison
  none/tests/mips32/msa_data_transfer
  none/tests/mips32/msa_fpu
  none/tests/mips32/msa_logical_and_shift
  none/tests/mips32/msa_shuffle

  none/tests/mips64/msa_arithmetic         (symlink to mips32)
  none/tests/mips64/msa_comparison         (symlink to mips32)
  none/tests/mips64/msa_data_transfer
  none/tests/mips64/msa_fpu                (symlink to mips32)
  none/tests/mips64/msa_logical_and_shift  (symlink to mips32)
  none/tests/mips64/msa_shuffle            (symlink to mips32)

Contributed by:
  Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.

Related BZ issue - #382563.
2017-10-27 16:27:25 +02:00
Petar Jovanovic
4686886774 mips: add support for MSA regs in Memcheck
Add support for MSA registers in Memcheck.

Contributed by:
  Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.

Related BZ issue - #382563.
2017-10-27 16:27:24 +02:00
Petar Jovanovic
13577bb699 mips: detect presence of MSA
Detect presence of MSA capabilities.

Contributed by:
  Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.

Minor code-style rewrites by myself.

Related BZ issue - #382563.
2017-10-27 16:27:24 +02:00
Petar Jovanovic
4ef3d807e1 mips: MSA support for mips32/mips64.
Full support of MIPS SIMD Architecture Module (MSA) instruction set.

Following IOPs have been implemented using generation of MSA instructions:

  Iop_GetElem8x16, Iop_GetElem16x8, Iop_GetElem32x4, Iop_GetElem64x2,
  Iop_V128to32, Iop_V128HIto64, Iop_V128to64, Iop_F32toF16x4, Iop_Abs64x2,
  Iop_Abs32x4, Iop_Abs16x8, Iop_Abs8x16, Iop_Cnt8x16, Iop_NotV128,
  Iop_Reverse8sIn16_x8, Iop_Reverse8sIn32_x4, Iop_Reverse8sIn64_x2,
  Iop_Cls8x16, Iop_Cls16x8, Iop_Cls32x4, Iop_Clz8x16, Iop_Clz16x8,
  Iop_Clz32x4, Iop_Clz64x2, Iop_Abs32Fx4, Iop_Abs64Fx2, Iop_RecipEst32Fx4,
  Iop_RecipEst64Fx2, Iop_RSqrtEst32Fx4, Iop_RSqrtEst64Fx2, Iop_F16toF32x4,
  Iop_I32UtoFx4, Iop_FtoI32Sx4_RZ, Iop_FtoI32Ux4_RZ, Iop_Add8x16,
  Iop_Add16x8, Iop_Add32x4, Iop_Add64x2, Iop_Sub8x16, Iop_Sub16x8,
  Iop_Sub32x4, Iop_Sub64x2, Iop_QAdd8Sx16, Iop_QAdd16Sx8, Iop_QAdd32Sx4,
  Iop_QAdd64Sx2, Iop_QAdd8Ux16, Iop_QAdd16Ux8, Iop_QAdd32Ux4,
  Iop_QAdd64Ux2, Iop_QSub8Sx16, Iop_QSub16Sx8, Iop_QSub32Sx4,
  Iop_QSub64Sx2, Iop_QSub8Ux16, Iop_QSub16Ux8, Iop_QSub32Ux4,
  Iop_QSub64Ux2, Iop_QDMulHi32Sx4, Iop_QDMulHi16Sx8, Iop_QRDMulHi32Sx4,
  Iop_QRDMulHi16Sx8, Iop_Max8Sx16, Iop_Max16Sx8, Iop_Max32Sx4, Iop_Max64Sx2,
  Iop_Max8Ux16, Iop_Max16Ux8, Iop_Max32Ux4, Iop_Max64Ux2, Iop_Min8Sx16,
  Iop_Min16Sx8, Iop_Min32Sx4, Iop_Min64Sx2, Iop_Min8Ux16, Iop_Min16Ux8,
  Iop_Min32Ux4, Iop_Min64Ux2, Iop_Shl8x16, Iop_Shl16x8, Iop_Shl32x4,
  Iop_Shl64x2, Iop_Shr8x16, Iop_Shr16x8, Iop_Shr32x4, Iop_Shr64x2,
  Iop_Sar8x16, Iop_Sar16x8, Iop_Sar32x4, Iop_Sar64x2, Iop_InterleaveHI8x16,
  Iop_InterleaveHI16x8, Iop_InterleaveHI32x4, Iop_InterleaveHI64x2,
  Iop_InterleaveLO8x16, Iop_InterleaveLO16x8, Iop_InterleaveLO32x4,
  Iop_InterleaveLO64x2, Iop_InterleaveEvenLanes8x16,
  Iop_InterleaveEvenLanes16x8, Iop_InterleaveEvenLanes32x4,
  Iop_InterleaveOddLanes8x16, Iop_InterleaveOddLanes16x8,
  Iop_InterleaveOddLanes32x4, Iop_CmpEQ8x16, Iop_CmpEQ16x8, Iop_CmpEQ32x4,
  Iop_CmpEQ64x2, Iop_CmpGT8Sx16, Iop_CmpGT16Sx8, Iop_CmpGT32Sx4,
  Iop_CmpGT64Sx2, Iop_CmpGT8Ux16, Iop_CmpGT16Ux8, Iop_CmpGT32Ux4,
  Iop_CmpGT64Ux2, Iop_Avg8Sx16, Iop_Avg16Sx8, Iop_Avg32Sx4, Iop_Avg8Ux16,
  Iop_Avg16Ux8, Iop_Avg32Ux4, Iop_Mul8x16, Iop_Mul16x8, Iop_Mul32x4,
  Iop_AndV128, Iop_OrV128, Iop_XorV128, Iop_ShrV128, Iop_ShlV128,
  Iop_ShlN8x16, Iop_ShlN16x8, Iop_ShlN32x4, Iop_ShlN64x2, Iop_SarN8x16,
  Iop_SarN16x8, Iop_SarN32x4, Iop_SarN64x2, Iop_ShrN8x16, Iop_ShrN16x8,
  Iop_ShrN32x4, Iop_ShrN64x2, Iop_QandQSarNnarrow64Sto32Sx2,
  Iop_QandQSarNnarrow32Sto16Sx4, Iop_QandQRSarNnarrow64Sto32Sx2,
  Iop_QandQRSarNnarrow32Sto16Sx4, Iop_CmpEQ32Fx4, Iop_CmpEQ64Fx2,
  Iop_CmpLT32Fx4, Iop_CmpLT64Fx2, Iop_CmpLE32Fx4, Iop_CmpLE64Fx2,
  Iop_CmpUN32Fx4, Iop_CmpUN64Fx2, Iop_64HLtoV128, Iop_Min32Fx4,
  Iop_Min64Fx2, Iop_Max32Fx4, Iop_Max64Fx2, Iop_Sqrt32Fx4,
  Iop_Sqrt64Fx2, Iop_Add32Fx4, Iop_Add64Fx2, Iop_Sub32Fx4,
  Iop_Sub64Fx2, Iop_Mul32Fx4, Iop_Mul64Fx2, Iop_Div32Fx4,
  Iop_Div64Fx2, Iop_F32x4_2toQ16x8, Iop_F64x2_2toQ32x4,
  Iop_ScaleF64, Iop_Scale2_64Fx2, Iop_Scale2_32Fx4, Iop_Log2_32Fx4, Iop_Log2_64Fx2,
  Iop_PackOddLanes8x16, Iop_PackEvenLanes8x16, Iop_PackOddLanes16x8,
  Iop_PackEvenLanes16x8, Iop_PackOddLanes32x4, Iop_PackEvenLanes32x4.

Following IOPs have been implemented without generating MSA instructions:

  Iop_CmpEQ8, Iop_MullU8, Iop_MullS8, Iop_MullU16, Iop_MullS16, Iop_DivS32,
  Iop_DivU32, Iop_DivS64, Iop_DivU64, Iop_F32toI32U, Iop_F64toI64U,
  Iop_I64UtoF64

Implementation of the following IOPs has been changed in order to use MSA
when possible:

  Iop_MAddF64, Iop_MSubF32, Iop_MSubF64.

Contributed by:
  Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.

Related BZ issue - #382563.
2017-10-27 16:27:24 +02:00
Petar Jovanovic
91373819a3 mips: new Iops added to support MSA
New Iops are defined:
  Iop_Scale2_32Fx4, Iop_Scale2_64Fx2,
  Iop_Log2_32Fx4, Iop_Log2_64Fx2,
  Iop_F32x4_2toQ16x8, Iop_F64x2_2toQ32x4,
  Iop_PackOddLanes8x16, Iop_PackEvenLanes8x16,
  Iop_PackOddLanes16x8, Iop_PackEvenLanes16x8,
  Iop_PackOddLanes32x4, Iop_PackEvenLanes32x4.

Contributed by:
  Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.

Related BZ issue - #382563.
2017-10-27 16:27:24 +02:00
Philippe Waroquiers
c1eace647c Fix n-i-bz fix bug in strspn replacement
Mix-up between UChar and HChar in strspn.
Also grouped together the n-i-bz announced fixes in NEWS
2017-10-26 20:53:15 +02:00
Mark Wielaard
476b52d62d Bug #385912. Remove explicit NULL check from none/tests/rlimit_nofile.
glibc doesn't guarantee anything about setrlimit with a NULL limit argument.
It could just crash (if it needs to adjust the limit) or might silently
succeed (as newer glibc do). Just remove the extra check.

See also the "setrlimit change to prlimit change in behavior" thread:
https://sourceware.org/ml/libc-alpha/2017-10/threads.html#00830
2017-10-20 14:55:06 +02:00
Mark Wielaard
f844689f85 Suppress _dl_runtime_resolve_avx_slow for memcheck conditional.
glibc ld.so has an optimization when resolving a symbol that checks
whether or not the upper 128 bits of the ymm registers are zero. If
so it uses "cheaper" instructions to save/restore them using the xmm
registers. If those upper 128 bits contain undefined values memcheck
will issue a Conditional jump or move depends on uninitialised value(s)
warning whenever trying to resolve a symbol.

This triggers in our sh-mem-vecxxx test cases. Suppress the warning
by default.

https://bugs.kde.org/show_bug.cgi?id=385868
2017-10-20 14:17:04 +02:00
Petar Jovanovic
cd1d7eb00c mips: simplify handling of Iop_Max32U
Use MIPSRH_Reg to get MIPSRH for Iop_Max32U. Without it, under specific
circumstances, the code may explode and exceed Valgrind instruction buffer
due to multiple calls to iselWordExpr_R through iselWordExpr_RH.

Issue discovered while testing Valgrind on Android.

Patch by Tamara Vlahovic.
2017-10-17 15:40:47 +02:00
Petar Jovanovic
2cf115e657 mips: fix handling of Iex_ITE
While handling Iex_ITE, do not use the same virtual register for the
input and output.

Issue discovered while testing Valgrind on Android.

Patch by Tamara Vlahovic.
2017-10-17 15:31:06 +02:00
Ivo Raisr
074de238d4 VEX register allocator: allocate caller-save registers for short lived vregs.
Allocate caller-saved registers for short lived vregs and callee-save registers
for vregs which span across helper calls.
Fixes BZ#384987.
2017-10-11 20:56:49 +02:00
Ivo Raisr
83cabd3249 Refactor tracking of MOV coalescing.
Reg<->Reg MOV coalescing status is now a part of the HRegUsage.
This allows register allocation to query it two times without incurring
a performance penalty. This in turn allows better tracking of
vreg<->vreg MOV coalescing so that all vregs in the coalesce chain
get the effective |dead_before| of the last vreg.

A small performance improvement has been observed because this allows
coalescing even spilled vregs (previously only assigned ones).
2017-10-11 20:56:48 +02:00