Older gcc (4.8) defaults to GNU C90, causing:
dlclose_leak.c:14:5: error: ‘for’ loop initial declarations are only
allowed in C99 mode
Fix by declaring int i before the loop.
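For illustration, the C90-compatible shape is (a minimal sketch, not the
exact testcase code):

   #include <stdio.h>

   int main(void)
   {
      int i;                     /* declared up front: valid C90        */
      for (i = 0; i < 3; i++)    /* "for (int i = ..." requires C99     */
         printf("%d\n", i);
      return 0;
   }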
Reported by Matthias Schwarzott <zzam@gentoo.org>, who also provided the
testcase patch. The fix is
for check_CFSI_related_invariants() to avoid checking for overlaps against DebugInfos that are
in 'archived' status, since -- if a previously dlopened-and-then-dlclosed object is later
re-dlopened -- this may cause an overlap between the active and archived DebugInfos, which
is of no consequence. If the kernel maps the object to the same VMA the second time around
then there will *certainly* be an overlap.
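A minimal runnable sketch of the shape of the fix (struct, field and
function names here are illustrative, not Valgrind's actual ones):

   #include <stdbool.h>
   #include <stdio.h>

   typedef struct { const char* name; unsigned lo, hi; bool archived; } DI;

   /* overlap check that ignores archived entries -- the essence of the fix */
   static bool overlaps_active(DI* all, int n, DI* di)
   {
      for (int i = 0; i < n; i++) {
         DI* d2 = &all[i];
         if (d2 == di || d2->archived) continue;   /* skip archived DIs */
         if (di->lo <= d2->hi && d2->lo <= di->hi) return true;
      }
      return false;
   }

   int main(void)
   {
      DI dis[] = { { "libfoo.so (archived)",    0x1000, 0x1fff, true  },
                   { "libfoo.so (re-dlopened)", 0x1000, 0x1fff, false } };
      printf("invariant violated: %s\n",
             overlaps_active(dis, 2, &dis[1]) ? "yes" : "no");   /* no */
      return 0;
   }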
The previous commit (cceed053ce876560b9a7512125dd93c7fa059778) broke the build
for MIPS architecture.
Update the code in VG_(get_StackTrace_wrk) to reflect the changes made in
the previous commit.
(from bug 385408 comment 0):
Valgrind currently lacks support for the z/Architecture vector "support"
instructions introduced with z13. These are documented in the
z/Architecture Principles of Operation, Eleventh Edition (March, 2015),
chapter 21: "Vector Overview and Support Instructions".
Bug 387664 changes the default settings for accurate definedness checking
for {Add,Sub}{32,64} and {CmpEQ,CmpNE}{8,16,32,64}. This fix updates the
vbit tester (memcheck/tests/vbit-test) to test the accurate versions of
these, and thereby fixes a regression caused by
e847cb5429927317023d8410c3c56952aa47fb08 as committed for bug 387664.
https://bugs.kde.org/show_bug.cgi?id=387773
The path to the alt file is relative to the actual debug file.
Make sure that we got the real file, not a (build-id) symlink.
Also handle the case where a debug or alt file is an absolute path.
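A sketch of the intended resolution logic (simplified, not the actual
debuginfo-reading code; the symlink canonicalisation is only hinted at):

   #include <libgen.h>
   #include <limits.h>
   #include <stdio.h>
   #include <string.h>

   static void resolve_alt(const char* debug_path, const char* alt,
                           char* out, size_t outsz)
   {
      /* debug_path should already be the real file, e.g. canonicalised
         via realpath(), not a (build-id) symlink */
      if (alt[0] == '/') {                 /* absolute path: use as-is  */
         snprintf(out, outsz, "%s", alt);
      } else {                             /* relative to the debug file */
         char tmp[PATH_MAX];
         snprintf(tmp, sizeof tmp, "%s", debug_path);
         snprintf(out, outsz, "%s/%s", dirname(tmp), alt);
      }
   }

   int main(void)
   {
      char buf[PATH_MAX];
      resolve_alt("/usr/lib/debug/bin/ls.debug", "../alt/debugfile",
                  buf, sizeof buf);
      printf("%s\n", buf);   /* /usr/lib/debug/bin/../alt/debugfile */
      return 0;
   }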
https://bugs.kde.org/show_bug.cgi?id=387712
When the cgij "compare immediate and branch relative" instruction
compares 0 <=signed dep1, that is the same as dep1 >=signed 0, which is
a test of the most significant bit of dep1. So only that bit needs
to be defined.
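As a sanity check of that reasoning: x >=signed 0 is exactly "sign bit
clear", so the result depends only on bit 63 of x:

   #include <stdint.h>
   #include <stdio.h>

   int main(void)
   {
      int64_t x = -5;
      int ge0_a = (x >= 0);                      /* the compare          */
      int ge0_b = ((uint64_t)x >> 63) == 0;      /* just the sign bit    */
      printf("%d %d\n", ge0_a, ge0_b);           /* always equal         */
      return 0;
   }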
Memcheck tries to accurately track definedness at the bit level, at least
for scalar integer operations. For many operations it is good enough to use
approximations which may overstate the undefinedness of the result of an
operation, provided that fully defined inputs still produce a fully defined
output. For example, the standard analysis for an integer add is
Add#(x#, y#) = Left(UifU(x#, y#))
which (as explained in the USENIX 05 paper
http://valgrind.org/docs/memcheck2005.pdf) means: for an add, worst-case
carry propagation is assumed. So all bits to the left of, and including,
the rightmost undefined bit in either operand, are assumed to be undefined.
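Concretely, with shadow bits meaning "undefined", UifU is a bitwise OR and
Left smears the lowest set bit leftwards (following the paper's definition,
Left(v) = v | -v); a small worked example:

   #include <stdint.h>
   #include <stdio.h>

   static uint32_t UifU(uint32_t x, uint32_t y) { return x | y; }
   static uint32_t Left(uint32_t v) { return v | (0u - v); }

   int main(void)
   {
      uint32_t xs = 0x00000004;   /* shadow of x: only bit 2 undefined  */
      uint32_t ys = 0x00000000;   /* shadow of y: fully defined         */
      /* worst-case carry: bit 2 and everything left of it undefined */
      printf("Add# = 0x%08x\n", Left(UifU(xs, ys)));   /* 0xfffffffc */
      return 0;
   }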
As compilers have become increasingly aggressive, some of these
approximations are no longer good enough. For example, LLVM for some years
has used Add operations with partially undefined inputs, when it knows that
the carry propagation will not pollute important parts of the result.
Similarly, both GCC and LLVM will generate integer equality comparisons with
partially undefined inputs in situations where it knows the result of the
comparison will be defined. In both cases, Memcheck's default strategies
give rise to false uninitialised-value errors, and the problem is getting
worse as time goes by.
Memcheck already has expensive (non-default) instrumentation for integer
adds, subtracts, and equality comparisons. Currently these are only used if
you specify --expensive-definedness-checks=yes, and in some rare cases to do
with inlined string operations, as determined by analysing the block to be
instrumented, and by default on MacOS. The performance hit from them can be
quite high, up to 30% lossage.
This patch makes the following changes:
* During instrumentation, there is much finer control over which IROps get
expensive instrumentation. The following groups can now be selected
independently for expensive or cheap instrumentation:
Iop_Add32
Iop_Add64
Iop_Sub32
Iop_Sub64
Iop_CmpEQ32 and Iop_CmpNE32
Iop_CmpEQ64 and Iop_CmpNE64
This makes it possible to enable, on a given platform, only the minimal
necessary set of expensive cases.
* The default set of expensive cases can be set on a per-platform basis.
This is set up in the first part of MC_(instrument).
* There is a new pre-instrumentation analysis pass. It identifies Iop_Add32
and Iop_Add64 uses for which the expensive handling will give the same
results as the cheap handling. This includes all adds that are used only
to create memory addresses. Given that the expensive handling of adds is,
well, expensive, and that most adds merely create memory addresses, this
more than halves the extra costs of expensive Add handling. (A toy sketch
of this analysis appears below, after this list.)
* The pre-existing "bogus literal" detection (0x80808080, etc) pass
has been rolled into the new pre-instrumentation analysis.
* The --expensive-definedness-checks= flag has been changed. Before, it
had two settings, "no" and "yes", with "no" being the default. Now, it
has three settings:
no -- always use the cheapest handling
auto -- use the minimum set of expensive handling needed to get
reasonable results on this platform, and perform
pre-instrumentation analysis so as to minimise the costs thereof
yes -- always use the most expensive handling
The default setting is now "auto". The user-visible effect of the new
default is that there should (hopefully) be a drop in false positive rates
but (unfortunately) also some drop in performance.
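Here is the toy sketch promised above: a toy model of the address-only-Adds
analysis (this is not Valgrind's IR nor the actual pass, just the principle):

   #include <stdbool.h>
   #include <stdio.h>

   typedef enum { ADD, LOAD, STORE, USE } Op;
   /* ADD:   def = use1 + use2     LOAD: def = mem[use1]
      STORE: mem[use1] = use2      USE:  consume use1 (non-address)     */
   typedef struct { Op op; int def, use1, use2; } Stmt;

   int main(void)
   {
      /* t1 = add(t0,t0); t2 = load(t1); t3 = add(t2,t2); use(t3) */
      Stmt blk[] = { {ADD,1,0,0}, {LOAD,2,1,-1}, {ADD,3,2,2}, {USE,-1,3,-1} };
      enum { NSTMT = 4, NTMP = 4 };
      bool isAdd[NTMP] = {false}, addrOnly[NTMP];
      int i, t;
      for (t = 0; t < NTMP; t++) addrOnly[t] = true;
      for (i = 0; i < NSTMT; i++) {
         Stmt s = blk[i];
         if (s.op == ADD) isAdd[s.def] = true;
         /* use1 is an address only for LOAD/STORE; use2 never is here */
         if (s.use1 >= 0 && s.op != LOAD && s.op != STORE)
            addrOnly[s.use1] = false;
         if (s.use2 >= 0) addrOnly[s.use2] = false;
      }
      for (t = 0; t < NTMP; t++)
         if (isAdd[t] && addrOnly[t])   /* only t1 qualifies */
            printf("t%d: cheap Add instrumentation gives the same result\n", t);
      return 0;
   }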
Memcheck reports an error on "if (n == 42)" in this test. Unless, that is,
accurate CmpEQ/NE definedness tracking is enabled. If you stare at this
long enough it is possible to see that the test "n == 42" isn't actually
undefined, because |n| is only ever zero or one, and only its least
significant bit is undefined. So the equality comparison against 42 is
defined because there are corresponding bits in the two operands that are
different and are both defined.
This commit fixes that by comparing with 1, which forces the result to
really depend on the only undefined bit in |n|.
I also added some robustification:
* returning arbitrary values from gcc_cant_inline_me(), so as to stop gcc
simply copying the input to the output or otherwise deleting the
conditional branch
* marking gcc_cant_inline_me() as un-inlineable
* putting compiler barriers in the second conditional in main(), so gcc
can't simply ignore the result of the call to gcc_cant_inline_me() and
then delete the call entirely.
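Putting the pieces together, the test has roughly this shape (an
illustrative reconstruction, not the committed testcase):

   #include <stdio.h>
   #include <stdlib.h>

   __attribute__((noinline))            /* robustification: not inlineable */
   static int gcc_cant_inline_me(int x)
   {
      return x;                         /* 0 or 1; LSB undefined at runtime */
   }

   int main(void)
   {
      int* p = malloc(sizeof *p);               /* *p is uninitialised   */
      int n = gcc_cant_inline_me(*p & 1);       /* only n's LSB undefined */
      if (n == 42)      /* defined: the defined bits 1..31 already differ */
         printf("forty-two\n");
      if (n == 1)       /* depends on the undefined bit: must be reported */
         printf("one\n");
      free(p);
      return 0;
   }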
LLVM 5.0 appears to have started generating such constructions in order to
find out whether the top N bits of a value are all zero. This currently
generates Iop_CmpLE32U on partially uninitialised data, causing false
positives in Memcheck. It seems simplest and most efficient to remove such
constructions at this point.
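For example, "the top 24 bits of x are all zero" is the same test as an
unsigned comparison, which is how such an Iop_CmpLE32U arises (illustrative
C):

   #include <stdint.h>
   #include <stdio.h>

   int main(void)
   {
      uint32_t x = 0x7f;
      /* "top 24 bits all zero" ...                                     */
      int a = ((x & 0xffffff00u) == 0);
      /* ... is the same as an unsigned <=, i.e. Iop_CmpLE32U           */
      int b = (x <= 0xffu);
      printf("%d %d\n", a, b);   /* always equal */
      return 0;
   }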
Rearrange big sections in mc_translate.c, so that the "main" instrumentation
function is at the end of the file rather than in the middle. The previous
layout never made much sense. The new layout is, roughly:
* stuff for baseline (level 2, non-origin tracking) instrumentation
* stuff for origin tracking (level 3) instrumentation
* the "final tidying" pass
* the main instrumentation function (and soon, a new pre-instrumentation
analysis pass)
so that e.g.
valgrind -v date
produces
==7639== Using Valgrind-3.14.0.GIT-c470e0c23c-20171120X and LibVEX; rerun with -h for copyright info
as the verbose version output.
Delay writing to the global vgdb_interrupted_tid until all the threads are
in an interruptible state. This ensures that valgrind_wait() will see the
correct value.
This solves occasional failures of gdbserver_tests/hgtls test.
Helgrind (and incidentally exp-sgcheck) does not need to track both
new mem stack and die mem stack events:
Helgrind only tracks new mem stack; exp-sgcheck only tracks die mem stack.
Currently, vg_SP_update_pass in m_translate.c inserts helper calls
for both new and die mem stack, even if the tool only needs new mem stack
(helgrind) or die mem stack (exp-sgcheck).
The optimisation consists in not inserting helper calls when the tool
does not need to see new (or die) mem stack.
Also, for helgrind, implement specialised new_mem_stack for known SP updates
with small values (like memcheck).
This reduces the size of the generated code for helgrind and exp-sgcheck.
(see the perf/memrw numbers below). This does not impact code generation
for tools that track both new and die mem stack (such as memcheck).
trunk:
exp-sgcheck: --28481-- transtab: new 2,256 (44,529 -> 581,402; ratio 13.1) [0 scs] avg tce size 257
helgrind: --28496-- transtab: new 2,299 (46,667 -> 416,575; ratio 8.9) [0 scs] avg tce size 181
memcheck: --28501-- transtab: new 2,220 (50,038 -> 777,139; ratio 15.5) [0 scs] avg tce size 350
with this patch:
exp-sgcheck: --28516-- transtab: new 2,254 (44,479 -> 567,196; ratio 12.8) [0 scs] avg tce size 251
helgrind: --28512-- transtab: new 2,297 (46,620 -> 399,799; ratio 8.6) [0 scs] avg tce size 174
memcheck: --28507-- transtab: new 2,219 (49,991 -> 776,028; ratio 15.5) [0 scs] avg tce size 349
In more detail, the changes consist of:
pub_core_tooliface.h:
* add 2 booleans any_new_mem_stack and any_die_mem_stack to the tdict struct
* rename VG_(sanity_check_needs) to VG_(finish_needs_init), as it
now does more than sanity checks: it derives the two booleans above.
m_tooliface.c:
* change VG_(sanity_check_needs) to VG_(finish_needs_init)
m_main.c:
* update call to VG_(sanity_check_needs)
hg_main.c:
* add a few inlines for functions just calling another function
* define the functions evh__new_mem_stack_[4|8|12|16|32|112|128|144|160]
(using the macro DCL_evh__new_mem_stack).
* register them by calling VG_(track_new_mem_stack_[4|8|12|16|32|112|128|144|160])
m_translate.c
* n_SP_updates_* stats are now maintained separately for the new and die
fast and known cases.
* need_to_handle_SP_assignment can now check only the 2 booleans
any_new_mem_stack and any_die_mem_stack
* DO_NEW macro: no longer inserts a helper call if the tool does not
track 'new' mem stack.
When there is no 'new' tracking, it still updates the SP aliases
(and n_SP_updates_new_fast).
* similar changes for the DO_DIE macro.
* a bunch of whitespace changes
Note: it is easier to look at the changes in this file using
git diff -w
to ignore the whitespace changes (e.g. due to DO_NEW/DO_DIE indentation
changes).
regtested on debian/amd64 and on centos/ppc64
* Some RCEC-related asserts checking there was no corruption are on hot
paths, so perform these checks only when CHECK_CEM is set.
* Move an expensive assert to where the event is inserted, as it is useless
to check this when searching for an already existing event:
it is enough to ensure that an invalid szB cannot be inserted,
and so will never be found, and so the assert will trigger in the insertion
logic.
When compiling guest_s390_toIR.c for a 32-bit target (a configuration in which
it will never be used, but never mind), gcc-7.x notices that sizeof(ss.dec) is
larger than sizeof(ss.bytes), so the initialisation of ss.bytes leaves ss.dec.b2
and ss.dec.d2 uninitialised. This patch causes both variants to be initialised.
When built for a 64 bit target, the existing initialisation of ss.bytes covers
ss.dec completely, so there is no error.
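The issue in miniature (illustrative stand-in types, not the actual 'ss'
definition):

   #include <stdio.h>
   #include <string.h>

   union ss_like {
      unsigned char bytes[4];        /* the smaller member              */
      struct { long b2, d2; } dec;   /* larger, so it has trailing bytes
                                        not covered by 'bytes'          */
   };

   int main(void)
   {
      union ss_like ss;
      /* initialising only .bytes leaves dec.b2/dec.d2 partly
         uninitialised when sizeof(ss.dec) > sizeof(ss.bytes): */
      memset(ss.bytes, 0, sizeof ss.bytes);
      /* the fix, in spirit: initialise the whole object */
      memset(&ss, 0, sizeof ss);
      printf("%ld %ld\n", ss.dec.b2, ss.dec.d2);
      return 0;
   }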
Searching whether an addr is in a malloc-ed client block is expensive (a
linear search). So, before scanning the list of malloc-ed blocks, check
that the address is in a client heap segment: this is a fast operation
(it has a small cache, and on a cache miss it does a binary search) and
avoids scanning an often big list (for big applications).
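The shape of the optimisation, as a self-contained sketch (Valgrind's real
segment cache and block list differ):

   #include <stdbool.h>
   #include <stdio.h>

   typedef struct { unsigned long lo, hi; } Seg;

   /* fast check: binary search in sorted client heap segments */
   static bool in_client_heap(Seg* segs, int n, unsigned long a)
   {
      int lo = 0, hi = n - 1;
      while (lo <= hi) {
         int mid = (lo + hi) / 2;
         if      (a < segs[mid].lo) hi = mid - 1;
         else if (a > segs[mid].hi) lo = mid + 1;
         else return true;
      }
      return false;
   }

   int main(void)
   {
      Seg heap[] = { {0x1000, 0x1fff}, {0x8000, 0x9fff} };
      unsigned long addr = 0x4242;
      if (!in_client_heap(heap, 2, addr)) {
         puts("not in any heap segment: skip the linear block scan");
         return 0;
      }
      /* ...only now do the expensive linear scan of malloc-ed blocks... */
      return 0;
   }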
Bits 18 (NAN2008) and 19 (ABS2008) in FCSR are preset by hardware and can
differ between platforms. Hence, we should clear these bits before printing
FCSR value in order to have the same output on different platforms.
This fixes several failures (tests modified by this change) that occur on
MIPS P5600 board. The P5600 is a core that implements the MIPS32 Release 5
architecture.
This patch implements the flag --delta-stacktrace=yes/no.
'yes' means the full history stack traces are calculated by changing just
the last frame when no call/return instruction was executed.
This can speed up helgrind by up to 25%.
This flag currently defaults to yes only on Linux x86 and amd64, as some
platform-dependent validation of the heuristics used is needed before
making yes the default on a platform. See function check_cached_rcec_ok
in libhb_core.c for more details about how to validate/check the behaviour
on a new platform.
Under specific circumstances, setting 2048 as the size of a symbol of
unknown size causes that symbol to cross an unmapped region. This then
triggers an assertion failure in Valgrind.
Instead, compute the maximal size the symbol can have within its section.
Patch by Tamara Vlahovic.
This set of tests covers the whole MSA instruction set:
none/tests/mips32/msa_arithmetic
none/tests/mips32/msa_comparison
none/tests/mips32/msa_data_transfer
none/tests/mips32/msa_fpu
none/tests/mips32/msa_logical_and_shift
none/tests/mips32/msa_shuffle
none/tests/mips64/msa_arithmetic (symlink to mips32)
none/tests/mips64/msa_comparison (symlink to mips32)
none/tests/mips64/msa_data_transfer
none/tests/mips64/msa_fpu (symlink to mips32)
none/tests/mips64/msa_logical_and_shift (symlink to mips32)
none/tests/mips64/msa_shuffle (symlink to mips32)
Contributed by:
Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.
Related BZ issue - #382563.
Detect presence of MSA capabilities.
Contributed by:
Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.
Minor code-style rewrites by myself.
Related BZ issue - #382563.
New Iops are defined:
Iop_Scale2_32Fx4, Iop_Scale2_64Fx2,
Iop_Log2_32Fx4, Iop_Log2_64Fx2,
Iop_F32x4_2toQ16x8, Iop_F64x2_2toQ32x4,
Iop_PackOddLanes8x16, Iop_PackEvenLanes8x16,
Iop_PackOddLanes16x8, Iop_PackEvenLanes16x8,
Iop_PackOddLanes32x4, Iop_PackEvenLanes32x4.
Contributed by:
Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.
Related BZ issue - #382563.
glibc doesn't guarantee anything about setrlimit with a NULL limit argument.
It could just crash (if it needs to adjust the limit) or might silently
succeed (as newer glibc do). Just remove the extra check.
See also the "setrlimit change to prlimit change in behavior" thread:
https://sourceware.org/ml/libc-alpha/2017-10/threads.html#00830
glibc ld.so has an optimization when resolving a symbol that checks
whether or not the upper 128 bits of the ymm registers are zero. If
so it uses "cheaper" instructions to save/restore them using the xmm
registers. If those upper 128 bits contain undefined values, memcheck
will issue a 'Conditional jump or move depends on uninitialised value(s)'
warning whenever trying to resolve a symbol.
This triggers in our sh-mem-vecxxx test cases. Suppress the warning
by default.
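The suppression has roughly this shape (the name and frame pattern here are
illustrative; the committed suppression's exact frames may differ):

   {
      dl-resolve-ymm-save-restore
      Memcheck:Cond
      obj:*/ld-2.*.so
   }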
https://bugs.kde.org/show_bug.cgi?id=385868
Use MIPSRH_Reg to get a MIPSRH for Iop_Max32U. Without it, under specific
circumstances, the generated code may explode and exceed Valgrind's
instruction buffer due to multiple calls to iselWordExpr_R through
iselWordExpr_RH.
Issue discovered while testing Valgrind on Android.
Patch by Tamara Vlahovic.
While handling Iex_ITE, do not use the same virtual register for the
input and output.
Issue discovered while testing Valgrind on Android.
Patch by Tamara Vlahovic.
Reg<->Reg MOV coalescing status is now a part of the HRegUsage.
This allows register allocation to query it two times without incurring
a performance penalty. This in turn allows better tracking of vreg<->vreg
MOV coalescing, so that all vregs in a coalesce chain get the effective
|dead_before| of the last vreg.
A small performance improvement has been observed, because this now allows
coalescing even spilled vregs (previously only assigned ones).