pshufb mm/xmm/ymm rearranges byte lanes in vector registers. It's fairly
widely used, but we generated terrible code for it. With this patch, we just
generate, at the back end, pshufb plus a bit of masking, which is a great
improvement.
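
For readers unfamiliar with the instruction, the per-byte behaviour of the
128-bit (xmm) form can be modelled in plain C. This is only a reference model
of the architectural semantics, not the code the back end emits; the function
name pshufb128_ref is made up for illustration.

```c
#include <stdint.h>

/* Reference model of 128-bit pshufb (hypothetical helper, not Valgrind code).
   For each destination byte, the matching control byte either zeroes the
   lane (bit 7 set) or selects a source byte using its low 4 bits. */
static void pshufb128_ref(uint8_t dst[16], const uint8_t src[16],
                          const uint8_t ctl[16])
{
    uint8_t tmp[16];   /* scratch copy so dst may alias src */
    for (int i = 0; i < 16; i++) {
        tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0F];
    }
    for (int i = 0; i < 16; i++) {
        dst[i] = tmp[i];
    }
}
```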
memcheck/mc_translate.c:
Add mkRight{32,64} as right-travelling analogues to mkLeft{32,64}.
doCmpORD: for the cases of a signed comparison against zero, compute
definedness of the 3 result bits (lt,gt,eq) separately, and, for the lt and eq
bits, do it exactly accurately.
expensiveCountTrailingZeroes: no functional change. Re-analyse/verify and add
comments.
expensiveCountLeadingZeroes: add. Very similar to
expensiveCountTrailingZeroes.
Add some comments to mark unary ops which are self-shadowing.
Route Iop_Ctz{,Nat}{32,64} through expensiveCountTrailingZeroes.
Route Iop_Clz{,Nat}{32,64} through expensiveCountLeadingZeroes.
Add instrumentation for Iop_PopCount{32,64} and Iop_Reverse8sIn32_x1.
memcheck/tests/vbit-test/irops.c:
Add dummy new entries for all new IROps, just enough to make it compile and
run.
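
As a rough illustration of the mkLeft/mkRight pairing, the two "smearing"
operations can be written on plain 32-bit values. This is a sketch, assuming
that "right-travelling" means the mirror image of Left (x | -x); the names
left32 and right32 are illustrative, and the real functions build IR
expressions rather than computing values directly.

```c
#include <stdint.h>

/* Left-travelling smear: sets every bit at or above the lowest set bit.
   Equivalent to x | -x in two's complement. */
static uint32_t left32(uint32_t x)
{
    return x | (0u - x);
}

/* Right-travelling smear (assumed mirror image): sets every bit at or
   below the highest set bit by OR-ing in progressively larger shifts. */
static uint32_t right32(uint32_t x)
{
    for (unsigned i = 1; i <= 16; i *= 2) {
        x |= x >> i;
    }
    return x;
}
```

For example, an input with only bit 8 set becomes 0xFFFFFF00 under left32
and 0x000001FF under right32.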
This adds z/Architecture vector integer and string instruction support.
The main author of this patch is Vadim Barkov <vbrkov@gmail.com>. Some
fixes were provided by Andreas Arnez <arnez@linux.ibm.com>.
(from bug 385408 comment 0):
Valgrind currently lacks support for the z/Architecture vector "support"
instructions introduced with z13. These are documented in the
z/Architecture Principles of Operation, Eleventh Edition (March, 2015),
chapter 21: "Vector Overview and Support Instructions".
Bug 387664 changes the default settings for accurate definedness checking
for {Add,Sub}{32,64} and {CmpEQ,CmpNE}{8,16,32,64}. This fix updates the
vbit tester (memcheck/tests/vbit-test) to test the accurate versions of
these, and thereby fixes a regression caused by
e847cb5429 as committed for bug 387664.
New Iops are defined:
Iop_Scale2_32Fx4, Iop_Scale2_64Fx2,
Iop_Log2_32Fx4, Iop_Log2_64Fx2,
Iop_F32x4_2toQ16x8, Iop_F64x2_2toQ32x4,
Iop_PackOddLanes8x16, Iop_PackEvenLanes8x16,
Iop_PackOddLanes16x8, Iop_PackEvenLanes16x8,
Iop_PackOddLanes32x4, Iop_PackEvenLanes32x4.
Contributed by:
Tamara Vlahovic, Aleksandar Rikalo and Aleksandra Karadzic.
Related BZ issue - #382563.
Previous implementation misused some opcodes, and a side effect was
dead code emission.
To reimplement handling of these instructions, three new Iops have been
introduced:
Iop_DivModU64to64, // :: I64,I64 -> I128
// of which lo half is div and hi half is mod
Iop_DivModS32to32, // :: I32,I32 -> I64
// of which lo half is div and hi half is mod
Iop_DivModU32to32, // :: I32,I32 -> I64
// of which lo half is div and hi half is mod
Patch by Aleksandra Karadzic and Tamara Vlahovic.
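
The result layout described in the comments above (quotient in the low half,
remainder in the high half) can be sketched in C for the unsigned 32-bit case.
This only models the value produced; the function name is hypothetical and
the division-by-zero precondition is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Value-level model of Iop_DivModU32to32 :: I32,I32 -> I64.
   Low 32 bits hold the quotient, high 32 bits hold the remainder. */
static uint64_t divmod_u32_to_32(uint32_t n, uint32_t d)
{
    assert(d != 0);   /* assumption: the caller rules out division by zero */
    uint32_t quot = n / d;
    uint32_t rem  = n % d;
    return ((uint64_t)rem << 32) | (uint64_t)quot;
}
```

For 17 divided by 5 this yields 0x0000000200000003: remainder 2 in the high
half, quotient 3 in the low half.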
A couple of things were missed in the previous "HW cap stuff needs updating"
patch, which causes the vbit tester to fail. The fixes are based on the patch
submitted by Mark Wielaard.
The changes were missed in Valgrind commit 16034
bugzilla 370265
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@16037
This commit adds the test suite support for the Power PC ISA 3.0 instructions
that were added in VEX commit 3244.
bugzilla 364948
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15938
The original code was using the bcdadd / bcdsub instruction on the operand
shadow bits to calculate the shadow bits for the result. This introduced
non-zero shadow bits in the result. The shadow bits for these
instructions should be set to all valid or all invalid. If any of the
argument shadow bits is one, then all of the shadow bits of the result should
be one. Otherwise the result shadow bits should be zero.
This patch fixes the above bug in memcheck/mc_translate.c.
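
The all-or-nothing rule can be sketched on plain values as follows. This is
an illustration only: the V128 struct and the function name are hypothetical,
and the real fix builds the equivalent IR in mc_translate.c rather than
computing values like this.

```c
#include <stdint.h>

/* Hypothetical 128-bit shadow value; a set bit means "undefined". */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} V128;

/* If any shadow bit of either operand is one, every result shadow bit is
   one; otherwise every result shadow bit is zero. */
static V128 bcd_result_vbits(V128 va, V128 vb)
{
    uint64_t any = va.hi | va.lo | vb.hi | vb.lo;
    uint64_t all = any ? ~(uint64_t)0 : 0;
    V128 result = { all, all };
    return result;
}
```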
Fixing the above bug broke the v-bit test. The issue is that the v-bit tester
assumes the shadow bits for the operands of a given Iop can be set to one
for testing purposes. The implementation of bcdadd and bcdsub was passing
a constant value for the variable ps. The ps value is an argument to the
instruction that specifies how to set the sign code of the result. The
implementation of the instructions was changed to issue the instruction with
ps=0. The result of the instruction is then updated in the VEX code if ps=1.
This change also results in cleaning up the vbit test code.
This patch also fixes the issues with the v-bit test program.
Valgrind commit 3218
Bugzilla 360035
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15871
Add "memory" to the clobber arguments of VALGRIND_DO_CLIENT_REQUEST_EXPR.
This fixes memcheck/tests/vbit-test/vbit-test.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15740
Power 8 instructions.
The patch for bug 354797 moved the declaration for rc outside of the
architecture #ifdef. This results in a message about rc being unused
on architectures other than s390 and powerpc. This commit eliminates
the issue by:
powerpc: move rc declaration into #ifdef for powerpc.
Remove tab, put in missing break.
s390: remove rc declaration from inside case statement. Put rc declaration
before the switch statement but within the #ifdef for s390 so it will
be declared for use in both case clauses.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15738
The ISA 2.07 support adds new Iops as well as support for some existing
Iops. None of these Iops have been enabled in the vbit tester. This commit
adds the needed support to the files in memcheck/tests/vbit-test.
These changes add support for additional immediate operands and additional
undefined bit checking functions.
There are additional changes to files VEX/priv/ir_inject.c and VEX/pub/libvex.h
that are in VEX commit 3202
Bugzilla 354797 was created for this issue.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15720
The support for the Valgrind Iops is dependent on the Power processor
support for various instructions. The instructions supported by a
given Power processor are based on the version of the ISA. The patch
adds a check to the vbit-test to ensure it does not try to test an Iop
that generates an instruction on the host that is not supported.
This patch fixes bugzilla 352765.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15653
s390: Add testcase for fixbr.
Patch by Andreas Arnez <arnez@linux.vnet.ibm.com>.
Part of fixing BZ #350290.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15629
Valgrind aspects, to match vex r3124.
See bug 339778 - Linux/TileGx platform support to Valgrind
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15080
- add configure option --enable-ubsan
- add __ubsan helpers (by Julian)
This requires gcc 4.9.2 or later. Not all platforms are supported, though.
With this change and VEX r3099 regression tests pass on amd64
with a valgrind compiled with -fsanitize=undefined.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14995
This was not as straightforward as expected. Specifically, adding the
new flag to CFLAGS in configure.ac did not work and was causing
compiler warnings. For instance, compiling memcheck/tests/execve2.c will
generate a -Wnonnull warning even though the testcase is explicitly
compiled with -Wno-nonnull. The reason is that (a) -Wnonnull is implied by
-Wformat and (b) the list of compiler flags gets assembled in the wrong
order. The culprit appears to be that we modify CFLAGS in configure.ac and
that really is not the right place. Conceptually, configure should determine
tool-chain capabilities and not assemble compiler flags. That should be done
in Makefiles. This patch disentangles all this.
So, whatever was added to CFLAGS in configure.ac has now been moved to
Makefile.all.am and Makefile.tool-tests.am. Those are:
-Wno-long-long
-Wwrite-strings
-Wcast-qual
-fno-stack-protector
Note that this change allows us to simplify Makefile.tool-tests.am, which
in the past was disabling some of those flags (e.g. by adding -Wno-cast-qual
again).
In the case of the clang compiler, extra command line options are needed. I've
moved those into a separate 'if COMPILER_IS_CLANG' section and have not merged
them into the baseline flags.
Related to BZ 334727.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14798