The shmctl syscall on amd64, arm64 and riscv (though we don't have a port
for that last one) always uses IPC_64. Explicitly pass it to the generic
PRE/POST handlers so they select the correct (64-bit) data structures on
those architectures.
https://bugzilla.redhat.com/show_bug.cgi?id=1909548
On Linux, there are two variants of the direct shmctl syscall:
- sys_shmctl: always uses shmid64_ds, does not accept IPC_64
- sys_old_shmctl: uses shmid_ds or shmid64_ds depending on IPC_64
The following Linux ABIs have the sys_old_shmctl variant:
alpha, arm, microblaze, mips n32/n64, xtensa
Other ABIs (and future ABIs) have the sys_shmctl variant, including ABIs
that only got sys_shmctl in Linux 5.1 (such as x86, mips o32, ppc,
s390x).
We incorrectly assume the sys_old_shmctl variant on nanomips and x86,
causing shmat() calls under valgrind to fail with EINVAL.
On x86, the issue was previously masked by the non-existence of
__NR_shmctl until a9fc7bceeb0b0 ("Update Linux x86 system call number
definitions") in 2019.
On mips o32, ppc, and s390x this issue is not visible as our headers do
not have __NR_shmctl for those ABIs (396 since Linux 5.1).
Fix the issue by correcting the preprocessor check in get_shm_size() to
only assume the old Linux sys_old_shmctl behavior on the specific
affected platforms.
Also, exclude the use of direct shmctl entirely on Linux x86, ppc,
mips o32, s390x in order to keep compatibility with pre-5.1 kernel
versions that did not yet have direct shmctl for those ABIs.
This currently only has an actual effect on x86, as it is the only one of
these ABIs with __NR_shmctl in our headers.
Fixes tests mremap4, mremap5, mremap6.
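Below is a minimal, self-contained sketch (not Valgrind's code; the buffer
handling is simplified) of the behavioural difference the fix accounts for:
on an ABI whose direct shmctl is the modern sys_shmctl variant, a cmd with
IPC_64 ORed in is rejected with EINVAL, while the sys_old_shmctl ABIs need
IPC_64 to select the 64-bit layout.

  #include <errno.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/ipc.h>
  #include <sys/shm.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  #ifndef IPC_64
  #define IPC_64 0x0100   /* value from the kernel's <linux/ipc.h> */
  #endif

  int main(void)
  {
  #ifdef __NR_shmctl
     int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
     unsigned char buf[512];               /* room for either layout */
     memset(buf, 0, sizeof buf);

     /* On sys_shmctl ABIs (e.g. amd64, x86 since Linux 5.1) this fails
        with EINVAL, which is exactly what shmat() under valgrind hit. */
     long r = syscall(__NR_shmctl, id, IPC_STAT | IPC_64, buf);
     printf("IPC_STAT|IPC_64: %ld (%s)\n", r, r < 0 ? strerror(errno) : "ok");

     /* Without IPC_64 the modern variant succeeds and fills in a
        shmid64_ds-shaped buffer. */
     r = syscall(__NR_shmctl, id, IPC_STAT, buf);
     printf("IPC_STAT       : %ld (%s)\n", r, r < 0 ? strerror(errno) : "ok");

     shmctl(id, IPC_RMID, NULL);
  #endif
     return 0;
  }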
https://bugs.kde.org/show_bug.cgi?id=410743
Add Iop_NegF16, Iop_AbsF16 and Iop_SqrtF16 to VEX/priv/ir_defs.c
primopMightTrap. Also rewrite the case statement slightly so GCC will warn
if an enumeration value is missed.
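As an illustration of the technique (not the actual ir_defs.c code): with no
default label, GCC's -Wswitch, which -Wall enables, reports any enumerator a
switch fails to handle, so a newly added op cannot be forgotten silently.

  typedef enum { ExOp_NegF16, ExOp_AbsF16, ExOp_SqrtF16 } ExampleOp;

  /* Deliberately no "default:" -- if a new ExampleOp value is added and
     not listed here, gcc -Wall emits "enumeration value ... not handled
     in switch". */
  static int example_might_trap(ExampleOp op)
  {
     switch (op) {
        case ExOp_NegF16:
        case ExOp_AbsF16:
        case ExOp_SqrtF16:
           return 0;   /* none of these can trap */
     }
     return 0;   /* unreachable for in-range values */
  }

  int main(void) { return example_might_trap(ExOp_SqrtF16); }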
Implement DWARF5 in readdwarf.c and readdwarf3.c
Since gcc11 will default to DWARF5, it is time for
valgrind to support it. The patch handles everything gcc11 produces
(except for the new DWARF expressions).
There is some duplication in the patch since we actually have two DWARF
readers which use slightly different abstractions (Slices vs Cursors).
It would be nice if we could merge these somehow. The reader in
readdwarf3.c is only used when --read-var-info=yes is used (which
drd uses to provide the allocation context).
The handling of DW_FORM_implicit_const is tricky with the current design.
An abbrev which contains an attribute encoded with DW_FORM_implicit_const
has its value also in the abbrev. The code in readdwarf3.c assumed it
could always simply get the data from the .debug_info/current Cursor.
For now I added a value field to the name_form struct that holds the
associated value. This is slightly wasteful since the extra field is
not necessary for other forms.
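A hypothetical sketch of the idea (the names below are illustrative, not the
actual readdwarf3.c types): the per-attribute record carries the value along
with the name/form pair, and only DW_FORM_implicit_const ever consults it.

  #include <stdint.h>

  #define DW_FORM_implicit_const 0x21   /* DWARF5 form code */

  /* One entry per attribute of an abbrev.  For DW_FORM_implicit_const the
     value lives in the abbrev itself rather than in .debug_info, so the
     record carries it along; the field is unused (and thus wasteful) for
     every other form. */
  typedef struct {
     uint64_t at_name;    /* DW_AT_* code   */
     uint64_t at_form;    /* DW_FORM_* code */
     int64_t  at_value;   /* only meaningful for DW_FORM_implicit_const */
  } NameFormValue;

  /* Reading one attribute: implicit_const takes its value from the abbrev
     entry; everything else is decoded from the .debug_info cursor. */
  static int64_t get_attr_value(const NameFormValue *nf,
                                int64_t (*read_from_info)(uint64_t form))
  {
     return nf->at_form == DW_FORM_implicit_const
               ? nf->at_value : read_from_info(nf->at_form);
  }

  static int64_t dummy_reader(uint64_t form) { (void)form; return 0; }

  int main(void)
  {
     NameFormValue nf = { 0x03 /* DW_AT_name */, DW_FORM_implicit_const, 42 };
     return get_attr_value(&nf, dummy_reader) == 42 ? 0 : 1;
  }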
Tested against GCC10 (defaulting to DWARF4) and GCC11 (defaulting to
DWARF5) on x86_64. No regressions in the regtests.
https://bugs.kde.org/show_bug.cgi?id=432102
GCC notices that AT is passed around as char, specifically as a %u argument
to DIP. But ifieldAT returns a UChar and vsx_matrix_ger takes AT as a UChar.
This causes lots of format string warnings when building with GCC11.
Simply declare AT as UChar instead of char.
GCC warns:
readpdb.c:1631:16: warning: this 'if' clause does not guard...
[-Wmisleading-indentation]
1631 | if (debug)
| ^~
In file included from ./pub_core_basics.h:38,
from m_debuginfo/readpdb.c:38:
../include/pub_tool_basics.h:69:30: note: ...this statement, but the latter
is misleadingly indented as if it were guarded by the 'if'
69 | #define ML_(str) VGAPPEND(vgModuleLocal_, str)
| ^~~~~~~~~~~~~~
../include/pub_tool_basics.h:66:29: note: in definition of macro 'VGAPPEND'
66 | #define VGAPPEND(str1,str2) str1##str2
| ^~~~
m_debuginfo/readpdb.c:1636:19: note: in expansion of macro 'ML_'
1636 | ML_(addLineInfo)(
| ^~~
The warning message is slightly hard to read because of the macro expansion.
But GCC is right that the indentation is misleading. Fixed by reindenting.
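A stripped-down illustration of the warning (not the readpdb.c code itself):

  #include <stdio.h>

  /* Before: the second call is indented as if guarded by the `if', but it
     is not, which is what -Wmisleading-indentation complains about. */
  static void before(int debug)
  {
     if (debug)
        printf("source file found\n");
        printf("adding line info\n");   /* always runs */
  }

  /* After reindenting, layout and control flow agree. */
  static void after(int debug)
  {
     if (debug)
        printf("source file found\n");
     printf("adding line info\n");
  }

  int main(void) { before(0); after(0); return 0; }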
This patch fixes numerous errors in the ISA support.
The word and prefix versions of the instructions do not use the same mask
to extract the immediate values. The prefix instructions should all use
the DFOM_IMMASK.
The parsing of prefix instructions has been fixed to ensure the ISA 3.1
instructions all have the ISA_3_1_PREFIX_CHECK check.
Improved the comments for the instruction parsing.
Fixed the parsing of the plxv instruction.
General code cleanup.
The effective address (EA) calculation for the prefixed instructions
concatenates an 18-bit immediate value from the prefix word and a 16-bit
immediate value from the instruction word. This results in a 34-bit value.
The concatenated value must be stored into a long long int, not a 32-bit
integer.
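A small sketch of the concatenation, assuming the 18-bit field comes from the
prefix word and the 16-bit field from the instruction word (the helper name
is hypothetical):

  #include <stdint.h>
  #include <stdio.h>

  /* d0 is the 18-bit immediate from the prefix word, d1 the 16-bit
     immediate from the instruction word.  The concatenated, sign-extended
     displacement needs 34 bits, so it must be kept in a 64-bit integer. */
  static int64_t form_d34(uint32_t d0, uint32_t d1)
  {
     uint64_t d = ((uint64_t)(d0 & 0x3FFFFu) << 16) | (d1 & 0xFFFFu);
     if (d & (1ULL << 33))              /* sign bit of the 34-bit value */
        d |= ~((1ULL << 34) - 1);
     return (int64_t)d;
  }

  int main(void)
  {
     /* Top bit of d0 set: a negative displacement, which a 32-bit
        integer would silently truncate. */
     printf("%lld\n", (long long)form_d34(0x20000u, 0x0u));
     return 0;
  }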
The ISA 3.1 support has both word instructions of length 4 bytes and prefixed
instructions of length 8 bytes. The following fix is needed when Valgrind
is compiled using an ISA 3.1 compiler.
This is the exact analog of cadd90993504678607a4f95dfe5d1df5207c1eb0, to the
point of almost being a copy-n-paste. That commit split (amd64) iselCondCode
into two functions, iselCondCode_C (existing) and iselCondCode_R (new). The
latter computes an I1-typed expression into a register rather than a condition
code. The two functions cooperate so as to minimise conversions between
a condition-code value and a value in a register.
Thus far the arm64 isel can't generate instructions of the form
{and,or,xor,add,sub} reg,reg,reg-shifted-by-imm
and hence sometimes winds up generating pairs like
lsl x2, x1, #13 ; orr x4, x3, x2
when instead it could just have generated
orr x4, x3, x1, lsl #13
This commit fixes that, although only for the 64-bit case, not the 32-bit
case. Specifically, it can transform the IR forms
{Add,Sub,And,Or,Xor}(E1, {Shl,Shr,Sar}(E2, immediate)) and
{Add,And,Or,Xor}({Shl,Shr,Sar}(E1, immediate), E2)
into a single arm64 instruction. Note that `Sub` is not included in the
second line, because shifting the first operand requires inverting the arg
order in the arm64 instruction, which isn't allowable with `Sub`, since it's
not commutative and arm64 doesn't offer us a reverse-subtract instruction to
use instead.
This gives a 1.1% reduction in generated code size when running
/usr/bin/date on Memcheck.
When running Memcheck, most blocks will do one and often two of `PUT(..) =
0x0:I64`, as a result of the way the front end models arm64 condition codes.
The arm64 isel would generate `mov xN, #0 ; str xN, [xBaseblock, #imm]`,
which is pretty stupid. This patch changes it to a single insn:
`str xzr, [xBaseblock, #imm]`.
This is a special case for `PUT(..) = 0x0:I64`. General-case integer stores
of 0x0:I64 are unchanged.
This gives a 1.9% reduction in generated code size when running
/usr/bin/date on Memcheck.
On Fedora 33 with gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)
it looks like fun:__static_initialization_and_destruction_0 is
now inlined, which causes the existing suppression for the
same reachable block to no longer match.
This is a follow-on to 41504d33dec8773c591d45192d1dda6e9c670031.
For the cases of sbfm that are actually just sign-extensions to a wider width,
emit that directly and do disassembly-printing accordingly. No functional
change.
The URL to the original C++ front-end for GCC internals document
disappeared. Replace it with a URL that still has a description of
the original magic cookie added by operator new [] by that frontend.
The ubfm and sbfm instructions implement some kind of semi-magical rotate,
mask and sign/zero-extend functionality. Boring old left and right shifts are
special cases of it. The existing translation into IR is correct, but has the
disadvantage that the IR optimiser isn't clever enough to simplify the
resulting IR back into a single shift in the case where the instruction is
used simply to encode a shift. This induces inefficiency and it also makes
the resulting disassembly pretty difficult to read, if you're into that kind
of thing.
This commit does the obvious thing: detects cases where the required behaviour
is just a single shift, and emits IR and disassembly-printing accordingly.
All other cases fall through to the existing general-case handling and so are
unchanged.
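For reference, a sketch (illustrative only, not the front end's real code) of
how the plain-shift special cases can be recognised from the ubfm/sbfm
immediates, following the ARMv8 alias rules:

  #include <stdbool.h>
  #include <stdio.h>

  /* The UBFM/SBFM operand patterns that are plain 64-bit shifts:
        LSR #sh  ==  UBFM immr=sh,    imms=63
        ASR #sh  ==  SBFM immr=sh,    imms=63
        LSL #sh  ==  UBFM immr=64-sh, imms=63-sh   (sh != 0)
     Recognising these lets the front end emit a single Iop_Shr64/Sar64/
     Shl64 instead of the general rotate-and-mask IR. */
  static bool is_plain_shift64(bool is_signed, unsigned immr, unsigned imms,
                               unsigned *shift, const char **kind)
  {
     if (imms == 63) {                      /* LSR or ASR by immr */
        *shift = immr;
        *kind  = is_signed ? "asr" : "lsr";
        return true;
     }
     if (!is_signed && imms + 1 == immr) {  /* LSL by 64 - immr */
        *shift = 64 - immr;
        *kind  = "lsl";
        return true;
     }
     return false;
  }

  int main(void)
  {
     unsigned sh; const char *k;
     if (is_plain_shift64(false, 61, 60, &sh, &k))  /* ubfm x0,x1,#61,#60 */
        printf("%s #%u\n", k, sh);                  /* prints "lsl #3"    */
     return 0;
  }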
The arm64 frontend used to translate the scalar fmadd, fmsub, fnmadd
and fnmsub instructions into separate addition/subtraction and
multiplication instructions, which caused rounding issues.
This patch turns them into Iop_M{Add,Sub}F{32,64} instructions
(with some arguments negated). And the backend now emits fmadd or fmsub
instructions.
Alexandra Hajkova <ahajkova@redhat.com> added tests and fixed up the
implementation to make sure rounding (and sign) are correct now.
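A small worked example (plain C, not Valgrind code) of why splitting the
fused operation changes results: the fused form rounds once, the split form
rounds twice.

  #include <math.h>
  #include <stdio.h>

  /* Compile with -lm.  Here a*b is exactly 1 - 2^-54, which rounds to 1.0
     when computed on its own, so the unfused version loses the residual. */
  int main(void)
  {
     double a = 1.0 + 0x1p-27;
     double b = 1.0 - 0x1p-27;
     double c = -1.0;

     double fused   = fma(a, b, c);   /* exact: -2^-54          */
     double unfused = a * b + c;      /* rounds twice: gives 0.0 */

     printf("fused   = %a\nunfused = %a\n", fused, unfused);
     return 0;
  }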
https://bugs.kde.org/show_bug.cgi?id=426014
stxsibx (Store VSX Scalar as Integer Byte Indexed X-form) is implemented
by first reading a whole word, merging in the new byte, and then writing
out the whole word, causing memcheck to warn when the destination might
have room for fewer than 8 bytes.
The stxsihx (Store VSX Scalar as Integer Halfword Indexed X-form)
instruction does something similar, reading and then writing a full
word instead of a half word.
The code can be simplified (and made more correct) by storing the byte
(or half-word) directly; IRStmt_Store seems fine for storing byte- or
half-word-sized data, and so does the ppc backend.
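An illustration in plain C of why the old read-modify-write translation upset
memcheck (the helpers are hypothetical and assume little-endian byte placement
and an 8-byte surrounding word for the sketch):

  #include <stdlib.h>
  #include <string.h>

  /* The old translation of stxsibx behaved like store_byte_rmw(): it read
     and rewrote the whole word around the target byte, which memcheck
     flags when the destination block is smaller than that word. */
  static void store_byte_rmw(unsigned char *dst, unsigned char val)
  {
     unsigned long word;
     memcpy(&word, dst, sizeof word);     /* reads past the 1-byte block */
     word = (word & ~0xffUL) | val;
     memcpy(dst, &word, sizeof word);     /* writes past it as well      */
  }

  /* The fixed translation stores the byte directly, as a byte-sized
     IRStmt_Store allows. */
  static void store_byte_direct(unsigned char *dst, unsigned char val)
  {
     *dst = val;
  }

  int main(int argc, char **argv)
  {
     unsigned char *p = malloc(1);        /* room for exactly one byte */
     store_byte_direct(p, 0x2a);          /* clean under memcheck      */
     if (argc > 1)
        store_byte_rmw(p, 0x2a);          /* invalid read/write of 8   */
     free(p);
     return 0;
  }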
https://bugs.kde.org/show_bug.cgi?id=430354
Implement the new instructions/features that were added to z/Architecture
with the vector-enhancements facility 1. Also cover the instructions from
the vector-packed-decimal facility that are defined outside the chapter
"Vector Decimal Instructions", but not the ones from that chapter itself.
For a detailed list of newly supported instructions see the updates to
`docs/internals/s390-opcodes.csv'.
Since the miscellaneous instruction extensions facility 2 was already
addressed by Bug 404406, this completes the support necessary to run
general programs built with `--march=z14' under Valgrind. The
vector-packed-decimal facility is currently not exploited by the standard
toolchain and libraries.
Compare-and-swap instructions can cause memcheck false positives when
operating on partially uninitialized data. An example is where a 1-byte
lock is allocated on the stack and then manipulated using CS on the
surrounding word. This is correct, and the uninitialized data has no
influence on the result, but memcheck still complains.
This is caused by logic in the s390 backend, where the expected and actual
memory values are compared using Iop_Sub32. Fix this by using
Iop_CasCmpNE32 instead.
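The kind of code that used to provoke the false positive, sketched in plain C
(illustrative only; the actual reports came from the s390x CS instruction
operating on a stack word that holds a 1-byte lock):

  #include <stdbool.h>
  #include <stdint.h>
  #include <string.h>

  /* Only the first byte of the stack word (the 1-byte lock) is
     initialised; the compare-and-swap operates on the whole word, so the
     other three bytes are uninitialised but cannot change the outcome:
     they are compared against a copy of themselves. */
  int main(void)
  {
     uint32_t word;                            /* 3 bytes left undefined */
     unsigned char *lock = (unsigned char *)&word;
     *lock = 0;                                /* lock is free           */

     uint32_t expected, desired;
     memcpy(&expected, &word, sizeof word);    /* current word contents  */
     desired = expected;
     ((unsigned char *)&desired)[0] = 1;       /* acquire the lock byte  */

     bool ok = __atomic_compare_exchange_n(&word, &expected, desired,
                                           false, __ATOMIC_ACQUIRE,
                                           __ATOMIC_RELAXED);
     return ok ? 0 : 1;
  }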
Newer binutils produce an error when the assembly contains lmw, stmw,
lswi, lswx, stswi, or stswx instructions in little-endian mode.
Only build and run the lsw and ldst_multiple testcases on ppc64[be].
https://bugs.kde.org/show_bug.cgi?id=427870
Currently, if there are multiple equal global peaks, `intro_Block` and
`resize_Block` record the first one while `check_for_peak` records the
last one. This could lead to inconsistent output, though it's unlikely
in practice.
This commit fixes things so that all functions record the last peak.
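A toy illustration of the first-versus-last distinction (the comparison
operators are a hypothetical mechanism; the point is only that all the
recorders now agree on the last of several equal peaks):

  #include <stdio.h>

  /* With ">" the first of several equal peaks is recorded; with ">=" the
     last one is.  The fix makes intro_Block, resize_Block and
     check_for_peak all behave like the latter. */
  int main(void)
  {
     long samples[] = { 10, 40, 25, 40, 5 };
     long peak = -1;
     int  first = -1, last = -1;

     for (int i = 0; i < 5; i++) {
        if (samples[i] >  peak) first = i;        /* old behaviour */
        if (samples[i] >= peak) { peak = samples[i]; last = i; }
     }
     printf("peak %ld: first at %d, last at %d\n", peak, first, last);
     return 0;
  }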
The previous commit:
commit eb82a294573d15c1be663673d55b559a82ca29d3
Author: Julian Seward <jseward@acm.org>
Date: Tue Nov 10 21:10:48 2020 +0100
Add a missing ifdef, whose absence caused build breakage on non-POWER targets.
fixed the compile issue in conv_f16_to_double() where non-Power platforms
do not support the power xscvhpdp assembly instruction. The instruction
is supported by ISA 3.0 platforms. Older Power platforms still fail to
compile with the assembly instruction. This patch fixes the ifdef for
Power systems that do not support ISA 3.0.