This branch contains code which avoids Memcheck false positives resulting from
gcc and clang creating branches on uninitialised data. For example:
bool isClosed;
if (src.isRect(..., &isClosed, ...) && isClosed) {
clang9 -O2 compiles this as:
callq 7e7cdc0 <_ZNK6SkPath6isRectEP6SkRectPbPNS_9DirectionE>
cmpb $0x0,-0x60(%rbp) // "if (isClosed) { .."
je 7ed9e08 // "je after"
test %al,%al // "if (return value of call is nonzero) { .."
je 7ed9e08 // "je after"
..
after:
That is, the && has been evaluated right-to-left. This is a correct
transformation if the compiler can prove that the call to |isRect| returns
|false| along any path on which it does not write its out-parameter
|&isClosed|.
In general, for the lazy-semantics (L->R) C-source-level && operator, we have
|A && B| == |B && A| if you can prove that |B| is |false| whenever A is
undefined. I assume that clang has some kind of interprocedural analysis that
tells it that. The compiler is further obliged to show that |B| won't trap,
since it is now being evaluated speculatively, but that's no big deal to
prove.
A similar result holds, per de Morgan, for transformations involving the C
language ||.
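To make that proof obligation concrete, here is a hypothetical callee in the
same shape as |isRect| (illustrative only; not from Skia or from this patch):

  #include <stdbool.h>

  /* Returns false on every path where it does not write *out.  Given
     that, |get(x, &closed) && closed| may be evaluated right-to-left:
     whenever |closed| is read uninitialised, the overall result is
     false anyway. */
  static bool get(int x, bool *out)
  {
     if (x > 0) { *out = (x & 1) != 0; return true; }
     return false;   /* *out deliberately left unwritten */
  }

  bool f(int x)
  {
     bool closed;    /* uninitialised, like |isClosed| above */
     return get(x, &closed) && closed;
  }

Note that at the C level the && short-circuits, so |closed| is never read
uninitialised; the uninitialised read only appears once the compiler
evaluates right-to-left.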
Memcheck correctly handles bitwise &&/|| in the presence of undefined inputs,
and has done so since the beginning. However, it assumes that every
conditional branch in the program is important -- that any branch on
uninitialised data is an error. This idiom demonstrates otherwise. It defeats
Memcheck's existing &&/|| handling because the &&/|| is spread across two
basic blocks, rather than being bitwise.
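For reference, here is a minimal sketch of the kind of bitwise definedness
rule meant here (an illustration, not Memcheck's actual instrumentation; it
assumes the usual shadow-value convention where a 1 V-bit means "undefined"):

  #include <stdint.h>
  #include <stdio.h>

  /* For c = a & b, a result bit is defined when both inputs are
     defined, or when either input is a *defined zero* (which forces
     that result bit to 0 regardless of the other operand). */
  static uint8_t vbits_of_and(uint8_t a, uint8_t va, uint8_t b, uint8_t vb)
  {
     return (va | vb) & (a | va) & (b | vb);
  }

  int main(void)
  {
     uint8_t a = 0x00, va = 0xFF;  /* a: all 8 bits undefined       */
     uint8_t b = 0x01, vb = 0x00;  /* b: fully defined, value 0x01  */
     /* Only bit 0 of the result is undefined: wherever b is a
        defined 0, the result is a defined 0 too. */
     printf("%02x\n", vbits_of_and(a, va, b, vb));  /* prints 01 */
     return 0;
  }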
This commit contains a complete initial implementation that fixes the problem
described above.
The basic idea is to detect the && condition spread across two blocks, and
transform it into a single block using bitwise &&. Then Memcheck's existing
accurate instrumentation of bitwise && will correctly handle it. The
transformation is
<contents of basic block A>
C1 = ...
if (!C1) goto after
.. falls through to ..
<contents of basic block B>
C2 = ...
if (!C2) goto after
.. falls through to ..
after:
===>
<contents of basic block A>
C1 = ...
<contents of basic block B, conditional on C1>
C2 = ...
if (!C1 && !C2) goto after
.. falls through to ..
after:
This assumes that <contents of basic block B> can be conditionalised, at the
IR level, so that the guest state is not modified if C1 is |false|. That's
not possible for all IRStmt kinds, but it is possible for a large enough
subset to make this transformation feasible.
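To illustrate what "conditionalised" means, here is a sketch in terms of the
VEX IR constructors from libvex_ir.h (not the patch's actual code): a
guest-state write becomes a guarded read-modify-write, and a memory store can
use the existing guarded-store statement.

  /* Rewrite "PUT(offs) = data" so the guest state is unmodified when
     the guard temp c1 (of type Ity_I1) evaluates to 0. */
  static void putConditionalised(IRSB* bb, Int offs, IRType ty,
                                 IRExpr* data, IRTemp c1)
  {
     IRExpr* old   = IRExpr_Get(offs, ty);     /* value already there */
     IRExpr* guard = IRExpr_RdTmp(c1);
     /* if (c1) write data, else write back the old value */
     addStmtToIRSB(bb, IRStmt_Put(offs, IRExpr_ITE(guard, data, old)));
  }

  /* A store "STle(addr) = data" can become a guarded store:
       addStmtToIRSB(bb, IRStmt_StoreG(Iend_LE, addr, data, guard));   */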
There is no corresponding transformation that recovers an || condition,
because, per de Morgan, that merely corresponds to swapping the side exits vs
the fallthroughs and inverting the sense of the tests, and the
pattern-recogniser as implemented checks all possible combinations already.
The analysis and block-building are performed on the IR returned by the
architecture-specific front ends, so the front ends themselves are almost
unmodified. In fact they are simplified, because all the logic related to
chasing through unconditional and conditional branches has been removed from
them, redone at the IR level, and centralised.
The only file with big changes is the IRSB constructor logic,
guest_generic_bb_to_IR.c (a.k.a. the "trace builder"). This is a complete
rewrite.
There is some additional work for the IR optimiser (ir_opt.c), since that
needs to do a quick initial simplification pass of the basic blocks, in order
to reduce the number of different IR variants that the trace-builder has to
pattern match on. An important followup task is to further reduce this cost.
There are two new IROps to support this: And1 and Or1, which both operate on
Ity_I1. They are regarded as evaluating both arguments, consistent with AndXX
and OrXX for all other sizes. It would be possible to synthesise them at the
IR level by widening the values to Ity_I8 or above, doing a bitwise And/Or,
and re-narrowing the result, but this gives inefficient code, so I chose to
represent them directly.
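For comparison, the widening synthesis just mentioned would look roughly like
this, using existing IROps:

  /* And1 without a native op: widen both Ity_I1 values to Ity_I32,
     AND them, and narrow the result back to Ity_I1. */
  static IRExpr* and1_via_widening(IRExpr* a, IRExpr* b)
  {
     IRExpr* aw = IRExpr_Unop(Iop_1Uto32, a);
     IRExpr* bw = IRExpr_Unop(Iop_1Uto32, b);
     return IRExpr_Unop(Iop_32to1, IRExpr_Binop(Iop_And32, aw, bw));
  }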
The transformation appears to work for amd64-linux. In principle -- because
it operates entirely at the IR level -- it should work for all targets,
providing the initial pre-simplification pass can normalise the block ends
into the required form. That will no doubt require some tuning. And1 and Or1
will have to be implemented in all instruction selectors, but that's easy
enough.
Remaining FIXMEs in the code:
* Rename `expr_is_speculatable` et al to `expr_is_conditionalisable`. These
functions merely conditionalise code; the speculation has already been done
by gcc/clang.
* `expr_is_speculatable`: properly check that Iex_Unop/Binop don't contain
  operations that might trap (Div, Rem, etc).
* `analyse_block_end`: recognise all block ends, and abort on ones that can't
be recognised. Needed to ensure we don't miss any cases.
* maybe: guest_amd64_toIR.c: generate better code for And1/Or1
* ir_opt.c, do_iropt_BB: remove the initial flattening pass since presimp
will already have done it
* ir_opt.c, do_minimal_initial_iropt_BB (a.k.a. presimp). Make this as
cheap as possible. In particular, calling `cprop_BB_wrk` is total overkill
since we only need copy propagation.
* ir_opt.c: once the above is done, remove boolean parameter for `cprop_BB_wrk`.
* ir_opt.c: concatenate_irsbs: maybe de-dup w.r.t. maybe_unroll_loop_BB.
* remove option `guest_chase_cond` from VexControl (?). It was never used.
* convert option `guest_chase_thresh` from VexControl (?) into a Bool, since
the revised code here only cares about the 0-vs-nonzero distinction now.
vbits.h (102 lines, 3.5 KiB, C):
/* -*- mode: C; c-basic-offset: 3; -*- */

/*
   This file is part of MemCheck, a heavyweight Valgrind tool for
   detecting memory errors.

   Copyright (C) 2012-2017  Florian Krohm

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License as
   published by the Free Software Foundation; either version 2 of the
   License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, see <http://www.gnu.org/licenses/>.

   The GNU General Public License is contained in the file COPYING.
*/

#ifndef VBITS_H
#define VBITS_H

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

typedef uint64_t uint128_t[2];
typedef uint64_t uint256_t[4];

/* A type to represent V-bits */
typedef struct {
   unsigned num_bits;
   union {
      uint8_t   u1;
      uint8_t   u8;
      uint16_t  u16;
      uint32_t  u32;
      uint64_t  u64;
      uint128_t u128;
      uint256_t u256;
   } bits;
} vbits_t;


/* A type large enough to hold any IRType'd value. At this point
   we do not expect to test with specific floating point values.
   So we don't need to represent them. */
typedef union {
   uint8_t   u1;
   uint8_t   u8;
   uint16_t  u16;
   uint32_t  u32;
   uint64_t  u64;
   uint128_t u128;
   uint256_t u256;
} value_t;


void print_vbits(FILE *, vbits_t);
vbits_t undefined_vbits(unsigned num_bits);
vbits_t undefined_vbits_BxE(unsigned int bits, unsigned int elements,
                            vbits_t v);
vbits_t undefined_vbits_BxE_rotate(unsigned int bits, unsigned int elements,
                                   vbits_t vbits, value_t value);
vbits_t undefined_vbits_128_even_element(unsigned int bits,
                                         unsigned int elements, vbits_t v);
vbits_t undefined_vbits_64x2_transpose(vbits_t v);
vbits_t undefined_vbits_Narrow256_AtoB(unsigned int src_num_bits,
                                       unsigned int result_num_bits,
                                       vbits_t src1_v, value_t src1_value,
                                       vbits_t src2_v, value_t src2_value,
                                       bool sataurate);
vbits_t defined_vbits(unsigned num_bits);
int equal_vbits(vbits_t, vbits_t);
vbits_t truncate_vbits(vbits_t, unsigned num_bits);
vbits_t left_vbits(vbits_t, unsigned num_bits);
vbits_t or_vbits(vbits_t, vbits_t);
vbits_t and_vbits(vbits_t, vbits_t);
vbits_t concat_vbits(vbits_t, vbits_t);
vbits_t upper_vbits(vbits_t);
vbits_t sextend_vbits(vbits_t, unsigned num_bits);
vbits_t zextend_vbits(vbits_t, unsigned num_bits);
vbits_t onehot_vbits(unsigned bitno, unsigned num_bits);
vbits_t shl_vbits(vbits_t, unsigned amount);
vbits_t shr_vbits(vbits_t, unsigned amount);
vbits_t sar_vbits(vbits_t, unsigned amount);
int completely_defined_vbits(vbits_t);
vbits_t cmpord_vbits(unsigned v1_num_bits, unsigned v2_num_bits);
vbits_t cmp_eq_ne_vbits(vbits_t vbits1, vbits_t vbits2,
                        value_t val1, value_t val2);
vbits_t int_add_or_sub_vbits(int isAdd,
                             vbits_t vbits1, vbits_t vbits2,
                             value_t val1, value_t val2);

#endif // VBITS_H
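For orientation, here is a hypothetical driver (not part of the file above)
showing the shape of these calls; the semantics are inferred from the names,
and the real callers live in the surrounding test code:

  #include "vbits.h"

  int main(void)
  {
     /* An all-undefined and an all-defined 8-bit V-bit vector. */
     vbits_t v1 = undefined_vbits(8);
     vbits_t v2 = defined_vbits(8);

     /* Combine them and show the result. */
     vbits_t v = and_vbits(v1, v2);
     print_vbits(stdout, v);

     /* Report whether any undefined bits remain. */
     printf("\nfully defined: %d\n", completely_defined_vbits(v));
     return 0;
  }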