3127 Commits

Author SHA1 Message Date
Petar Jovanovic
9dd746e898 mips: fix file permission of guest_mips_toIR.c
Revert accidentally modified file permissions back to 0644.
2019-04-22 22:49:48 +00:00
Petar Jovanovic
50dd9600ab mips: fix mips32r6 and mips64r6 compilation issue
Add missing variable declarations.
Modify local_sys_write_stderr to use movn if available, and use
seleqz/selnez instructions otherwise.
2019-04-19 21:25:38 +00:00
Julian Seward
270037da8b Bug 406465 - arm64 instruction selector fails on "t0 = <expr>" where <expr> has type Ity_F16. 2019-04-13 12:34:06 +02:00
Carl Love
82e94fff80 PPC64, patch to test case issues reported in bugzilla 401827 and 401828.
This corrects a valgrind instruction emulation issue revealed by
a GCC change.
The xscvdpsp,xscvdpspn,xscvdpuxws instructions each convert
double precision values to single precision values, and write
the results into bits 0-32 of the 128 bit target register.
To get the value into the normal position for a scalar register
the result needed to be right-shifted 32 bits, so gcc always
did that.
It was determined that hardware also always did that, so the (redundant)
gcc shift was removed.
This exposed an issue because valgrind was only writing the result to
bits 0-31 of the target register.

This patch updates the emulation to write the result to both of the involved
32-bit fields.

VEX/priv/guest_ppc_toIR.c:
  - rearrange ops in dis_vx_conv to update more portions of the target
    register with copies of the result.   xscvdpsp,xscvdpspn,xscvdpuxws

none/tests/ppc64/test_isa_2_06_part1.c
  - update res32 checking to explicitly include fcfids and fcfidus in the
    32-bit result grouping.

none/tests/ppc64/test_isa_2_07_part2.c
  - correct NULL initializer for logic_tests definition

[*1] - GCC change referenced:
    2017-09-26  Michael Meissner  <meissner@linux.vnet.ibm.com>
            * config/rs6000/rs6000.md (movsi_from_sf): Adjust code to
              eliminate doing a 32-bit shift right or vector extract after
              doing  XSCVDPSPN.

patch submitted by:   Will Schmidt <will_schmidt@vnet.ibm.com>
reviewed, committed by:  Carl Love <cel@us.ibm.com>
2019-04-04 12:31:05 -05:00
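As a rough illustration of the fix described in 82e94fff80 above, the sketch below copies a 32-bit single-precision result into both 32-bit fields of a 64-bit doubleword. The helper name is hypothetical and this is not the VEX IR that dis_vx_conv actually emits.

```c
#include <stdint.h>

/* Hypothetical helper, not VEX code: place a 32-bit result in both the
   upper and the lower 32-bit word of a 64-bit doubleword. */
uint64_t dup_result_in_both_words(uint32_t sp_bits)
{
   return ((uint64_t)sp_bits << 32) | (uint64_t)sp_bits;
}
```

A consumer that reads either word then sees the same converted value.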
Petar Jovanovic
dc950d964b mips: get rid of format and implicit-fallthrough warnings
Indicate when the fall through from the previous case label is intentional.
Fix format warnings related to arguments in printf calls.
2019-03-28 18:35:17 +01:00
Petar Jovanovic
92ecddd13e mips: code refactoring (NFC)
Code in VEX/priv/guest_mips_toIR.c is notably refactored.
DSP ASE disassembly has been put in a separate file: guest_mipsdsp_toIR.c.

Patch by Aleksandar Rikalo.
2019-03-27 18:42:05 +00:00
Mark Wielaard
8ed9b61432 Use ULong instead of unsigned long in s390_irgen_EX_SS.
ovl was defined as an unsigned long. This would cause warnings from gcc:

  guest_s390_toIR.c:195:30: warning: right shift count >= width of type
  [-Wshift-count-overflow]

when building on 32bit arches, or building a 32bit secondary arch.

Fix this by defining ovl as ULong which is always guaranteed 64bit.
2019-03-27 15:51:34 +01:00
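The warning in 8ed9b61432 above arises because unsigned long can be 32 bits wide, which makes shift counts of 32 or more invalid. A minimal sketch of the same idea with a fixed-width type follows; extract_field is a hypothetical name, and uint64_t stands in for VEX's ULong.

```c
#include <stdint.h>

/* Extract 'width' bits starting 'shift' bits above the LSB of a 64-bit
   value.  With uint64_t, shift counts of 32..63 stay well-defined even
   on targets where unsigned long is only 32 bits wide.
   Assumes shift <= 63 and 1 <= width <= 63. */
uint64_t extract_field(uint64_t ovl, unsigned shift, unsigned width)
{
   return (ovl >> shift) & ((UINT64_C(1) << width) - 1);
}
```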
Mark Wielaard
f04ae9f359 Use gcc -Wimplicit-fallthrough=2 by default if available
GCC 7 introduced -Wimplicit-fallthrough
https://developers.redhat.com/blog/2017/03/10/wimplicit-fallthrough-in-gcc-7/

It caught a couple of bugs, but it does need a bit of extra comments to
explain when a switch case statement fall-through is deliberate. Luckily
with -Wimplicit-fallthrough=2 various existing comments already do that.
I have fixed the bugs by adding explicit break statements where
necessary, and added comments where the fall-through was correct.

https://bugs.kde.org/show_bug.cgi?id=405430
2019-03-27 15:34:45 +01:00
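For readers unfamiliar with the convention in f04ae9f359 above, here is a small sketch of a switch whose fall-throughs are deliberate and marked with comments of the kind that -Wimplicit-fallthrough=2 accepts. The function and its flag values are invented for illustration and do not come from the Valgrind tree.

```c
/* Hypothetical example: each case adds its own flag and then
   deliberately continues into the case below it. */
int flags_for_level(int level)
{
   int flags = 0;
   switch (level) {
      case 2:
         flags |= 4;
         /* fallthrough */
      case 1:
         flags |= 2;
         /* fallthrough */
      case 0:
         flags |= 1;
         break;
      default:
         flags = -1;
         break;
   }
   return flags;
}
```

Removing one of the comments without adding a break would make GCC warn at that case label.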
Petar Jovanovic
b93d378296 mips: add a comment about decoding LX on Cavium
Interpret LX as a Cavium instruction, otherwise try decoding it as a DSP
instruction.
The fallthrough is deliberate.

Related to KDE #405430.
2019-03-25 16:47:09 +00:00
Carl Love
ed80ebfa17 PPC64, fix for vrlwnm, vrlwmi, vrldrm, vrldmi instructions.
Fixes the case where the specified end bit is less than the start bit.

Valgrind bug 405734
2019-03-22 12:50:52 -05:00
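The case fixed in ed80ebfa17 above is the wrap-around mask. A minimal sketch of a PowerPC-style 32-bit mask generator is shown below, assuming IBM bit numbering (bit 0 is the most significant bit) and start/end values in 0..31. mask32 is a hypothetical name; the real code builds the mask as VEX IR.

```c
#include <stdint.h>

/* Build a mask with ones in bits mb..me (IBM numbering, bit 0 = MSB).
   When mb > me the mask wraps: bits mb..31 and bits 0..me are set.
   Assumes 0 <= mb, me <= 31. */
uint32_t mask32(unsigned mb, unsigned me)
{
   uint32_t from_mb = 0xFFFFFFFFu >> mb;          /* bits mb..31 */
   uint32_t upto_me = 0xFFFFFFFFu << (31 - me);   /* bits 0..me  */
   return (mb <= me) ? (from_mb & upto_me) : (from_mb | upto_me);
}
```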
Carl Love
30a24515f0 PPC64, fix output for xvcvdpsp instruction.
The instruction should write the output to the upper and lower 32-bit
halves of the results.

Valgrind bugzilla 405733.
2019-03-22 12:42:27 -05:00
Carl Love
e998650095 PPC64, The function _get_maxmin_fp_NaN does not handle the case of QNaN, SNaN correctly.
This patch fixes Valgrind to handle the case of QNaN, SNaN input the same
as the HW handles it.

Valgrind bug 405365.
2019-03-22 12:32:29 -05:00
Carl Love
d4686f635e PPC64, instructions xvcvdpsxws, xvcvdpuxws do not handle over/underflow, NaN correctly
The instructions are not checking for overflow, underflow, NaN and setting
the output correctly.

Valgrind bugzilla 405363
2019-03-22 12:26:00 -05:00
Carl Love
2da60f569f PPC64, fix for vmsummbm instruction.
The instruction needs to have the 32-bit "lane" values chopped to 32-bits.
The current lane implementation is not doing the chopping.  Need to
explicitly do the chop and add.

Valgrind bug 405362
2019-03-22 12:06:31 -05:00
Carl Love
886b0a1cf4 PPC64, fix implementation of xvcvsxdsp and xvcvuxddp instructions.
Instructions need to write result to upper and lower 32-bit half of the
64-bit result.

This is a fix for Valgrind bug 405356.
2019-03-22 11:56:38 -05:00
Petar Jovanovic
029f1196fc mips: correct order of function arguments for mkFormVEC
Vectors wt and ws were incorrectly received in mkFormVEC().
Issue spotted by Mark Wielaard and reported as KDE #405458.
2019-03-18 16:48:45 +01:00
Julian Seward
472b067e39 amd64: Implement RDRAND, VCVTPH2PS and VCVTPS2PH.
Bug 398870 - Please add support for instruction vcvtps2ph
Bug 353370 - RDRAND amd64->IR: unhandled instruction bytes: 0x48 0xF 0xC7 0xF0

This commit implements:

* amd64 RDRAND instruction, on hosts that have it.

* amd64 VCVTPH2PS and VCVTPS2PH, on hosts that have them.

  The presence/absence of these on the host is now reflected in the CPUID
  results returned to the guest.  So code that tests for these features in
  CPUID and acts accordingly should "just work".

* New test cases, none/tests/amd64/rdrand and none/tests/amd64/f16c.  These
  are built if the host's assembler can handle them, in the usual way.
2019-03-17 21:43:26 +01:00
Ilya Leoshkevich
7e9113cb7a Bug 405403 - s390x: Allow using disInstr_S390 on little-endian hosts
Certain projects, e.g. https://angr.io, use VEX as an intermediate
representation for the binary code analysis. In order to make it
possible to use them to analyze S/390 code on Intel, this patch
resolves the following issues in the disassembler:

- Bit fields, which are used to describe instruction formats, map to
  different bits on different hosts. This patch replaces them with
  macros, e.g. SS.l bit field becomes SS_l macro. Most bit field usages
  are replaced using the following perl script:

    perl -p -i \
         -e 's/\(&ovl\.value\)/&ovl/g;' \
         -e 's/ovl\.value/ovl/g;' \
         -e 's/ovl\.fmt\.([a-zA-Z\d_]+)\.([a-z\d]+)/$1_$2(ovl)/g' \
         priv/guest_s390_toIR.c

  Since after that there are no more structs, #pragma pack is also
  removed.

- Instructions are loaded from memory as words, which behaves
  differently depending on host endianness. Such loads are replaced by
  assembly of words from separately loaded bytes. This affects regular
  disassembly functions, and also s390_irgen_EXRL(), which loads
  last_execute_target this way.

- disInstr_S390() explicitly prohibits little-endian hosts with an
  assert, which is removed in this patch.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2019-03-15 15:00:30 +01:00
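The two techniques named in 7e9113cb7a above, assembling words from separately loaded bytes and replacing bit fields with shift-and-mask macros, can be sketched as follows. The macro names and the field layout here are purely illustrative assumptions; they are not the definitions used in guest_s390_toIR.c.

```c
#include <stdint.h>

/* Assemble a 32-bit big-endian word from individually loaded bytes, so
   the result does not depend on the endianness of the host. */
uint32_t load_be32(const uint8_t *p)
{
   return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
        | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical field accessors in the style of the SS_l macro.  The
   layout is assumed for the demo: opcode in the top 8 bits, a length
   field in the next 8 bits. */
#define DEMO_op(insn) (((insn) >> 24) & 0xFFu)
#define DEMO_l(insn)  (((insn) >> 16) & 0xFFu)
```

Unlike C bit fields, whose allocation order is implementation-defined, the shifts above select the same bits on every host.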
Tom Hughes
09566120e7 Suppress FSGSBASE flag from cpuid results
We don't support {rd,wr}{fs,gs}base so we shouldn't say we do.
2019-03-14 15:17:10 +00:00
Julian Seward
ecc4e97093 Bug 399287 - amd64 front end: Illegal Instruction vcmptrueps. Fix, but no test cases. 2019-03-13 14:25:41 +01:00
Julian Seward
4816357b5c VEX/auxprogs/genoffsets.c: Add cast to my_offsetof. n-i-bz.
Clang/LLVM trips over my_offsetof in VEX/auxprogs/genoffsets.c.  See LLVM
PR 40890 for details (https://bugs.llvm.org/show_bug.cgi?id=40890).

Now, it's a Clang bug that Clang exits on an assertion failure rather than
emits a diagnostic, but the previous my_offsetof expression is a pointer,
not an integer.  Add a cast as done in other definitions of offsetof in
the tree.  Patch from Ed Maste <emaste@freebsd.org>.
2019-03-12 18:37:15 +01:00
Julian Seward
4ee1dd2778 bb_to_IR(): increase assertion limits on the maximum size of self-checking translations. n-i-bz. 2019-03-09 17:58:11 +01:00
Mark Wielaard
256cf43c5e memcheck powerpc subfe x, x, x initializes x to 0 or -1 based on CA
GCC might use subfe x, x, x to initialize x to 0 or -1, based on
whether the carry flag is set. This happens in some cases when g++
compiles resetting a unique_ptr. The "trick" used by the compiler is
that it can AND a pointer with the register x (now 0x0 or 0xffffffff)
to set something to NULL or to the given pointer.

subfe is implemented as rD = (log not)rA + rB + XER[CA]
if we instead implement it as rD = rB - rA - (XER[CA] ^ 1)
then memcheck can see that rB and rA cancel each other out if they
are the same.

https://bugs.kde.org/show_bug.cgi?id=404054
2019-02-21 17:21:53 +01:00
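The two formulations of subfe in 256cf43c5e above are arithmetically equivalent modulo 2^64, because ~rA equals -rA - 1. A small sketch that checks this is given below; the function names are hypothetical and 64-bit registers are assumed.

```c
#include <stdint.h>

/* Original formulation: rD = ~rA + rB + CA. */
uint64_t subfe_add_form(uint64_t rA, uint64_t rB, unsigned ca)
{
   return ~rA + rB + (uint64_t)(ca & 1);
}

/* Rewritten formulation: rD = rB - rA - (CA ^ 1).  When rA and rB are
   the same register, rB - rA is visibly zero whatever the operand is. */
uint64_t subfe_sub_form(uint64_t rA, uint64_t rB, unsigned ca)
{
   return rB - rA - (uint64_t)((ca & 1) ^ 1);
}
```

With rA equal to rB the result is 0 when CA is 1 and all ones when CA is 0, independent of the register contents.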
Julian Seward
dee1c5ac84 Fix format string warnings from gcc9. No functional change (I think!) 2019-02-02 14:06:51 +01:00
Julian Seward
b19f6882cf s390 back end: s390_isel_vec_expr_wrk: fix some enum type confusion. n-i-bz.
In s390_isel_vec_expr_wrk() there have been some assignments of enum-typed
values to variables of different enum types.  This fixes it.  It also adds a
few initialisations to variables of type HReg for safety against the
possibility of them being used uninitialised.  No functional change.  Tested
by Andreas Arnez.
2019-01-31 07:56:26 +01:00
Julian Seward
130ac30533 s390 front end: remove unused function 'put_gpr_int'. n-i-bz. 2019-01-26 18:18:28 +01:00
Julian Seward
2656009e6f amd64 pipeline: generate a much better translation for PMADDUBSW.
This seems pretty common in some codecs, and the existing translation
was somewhat longwinded.
2019-01-26 18:00:41 +01:00
Julian Seward
6b16f0e2a0 Rename some int<->fp conversion IROps for consistency. No functional change. n-i-bz.
2018-Dec-27: some of int<->fp conversion operations have been renamed so as to
have a trailing _DEP, meaning "deprecated".  This is because they don't
specify a rounding mode to be used for the conversion and so are
underspecified.  Their use should be replaced with equivalents that do specify
a rounding mode, either as a first argument or using a suffix on the name,
that indicates the rounding mode to use.
2019-01-26 17:38:01 +01:00
Andreas Arnez
467c7c4c96 Bug 403552 s390x: Fix vector facility bit number
The wrong bit number was used when checking for the vector facility.  This
can result in a fatal emulation error: "Encountered an instruction that
requires the vector facility.  That facility is not available on this
host."

In many cases the wrong facility bit was usually set as well, hence
nothing bad happened.  But when running Valgrind within a Qemu/KVM guest,
the wrong bit was not (always?) set and the emulation error occurred.

This fix simply corrects the vector facility bit number, changing it from
128 to 129.
2019-01-24 11:11:51 +01:00
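To make the off-by-one in 467c7c4c96 above concrete, the sketch below tests a bit in a facility list laid out as an array of 64-bit doublewords, with bits numbered from the most significant bit of the first doubleword. The helper name is hypothetical, and how Valgrind obtains the list is not shown.

```c
#include <stdint.h>

/* Test facility bit 'n' in a facility list.  Bit 0 is the most
   significant bit of list[0]; bit 129 is therefore the second bit
   from the left in list[2]. */
int facility_bit_set(const uint64_t *list, unsigned n)
{
   return (int)((list[n / 64] >> (63 - (n % 64))) & 1);
}
```

Bits 128 and 129 sit next to each other in the same doubleword, which is consistent with the bug going unnoticed on hosts where both happened to be set.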
Mark Wielaard
2c1f016e63 Bug 402519 - POWER 3.0 addex instruction incorrectly implemented
addex uses OV as carry in and carry out. For all other instructions
OV is the signed overflow flag. And instructions like adde use CA
as carry.

Replace set_XER_OV_OV32 with set_XER_OV_OV32_ADDEX, which will
call calculate_XER_CA_64 and calculate_XER_CA_32, but with OV
as input, and sets OV and OV32.

Enable test_addex in none/tests/ppc64/test_isa_3_0.c and update
the expected output. test_addex would fail to match the expected
output before this patch.
2018-12-31 22:26:31 +01:00
Julian Seward
d43c20b391 Bug 402481 - vbit-test fails on x86 for Iop_CmpEQ64 iselInt64Expr Sar64(Sub64(t14,Shr64(t14,0x1:I8)),0x3F:I8).
Fixes the failure by implementing Iop_Sar64 in the x86 back end.
2018-12-23 22:02:03 +01:00
Julian Seward
3b2f8bf69e amd64 back end: generate improved SIMD64 code.
For most SIMD operations that happen on 64-bit values (as would arise from MMX
instructions, for example, such as Add16x4, CmpEQ32x2, etc), generate code
that performs the operation using SSE/SSE2 instructions on values in the low
halves of XMM registers.  This is much more efficient than the previous scheme
of calling out to helper functions written in C.  There are still a few SIMD64
operations done via helpers, though.
2018-12-22 19:01:50 +01:00
Julian Seward
b17d5ffdb8 amd64 back end: generate better code for 2x64<-->V128 and 4x64<-->V256 transfers ..
.. by adding support for MOVQ xmm/ireg and using that to implement 64HLtoV128,
4x64toV256 and their inverses.  This reduces the number of instructions,
removes the use of memory as an intermediary, and avoids store-forwarding
stalls.
2018-12-22 18:04:42 +01:00
Julian Seward
dda0d80f3d amd64 pipeline: improve performance of cvtdq2ps and cvtps2dq (128 and 256 bit versions) ..
.. by giving them their own vector IROps rather than doing each lane individually.
2018-12-22 16:11:39 +01:00
Julian Seward
901f3d3813 amd64 back end: generate better code for 128/256 bit vector shifts by immediate. n-i-bz. 2018-12-22 13:34:11 +01:00
Julian Seward
b078fabb56 amd64 pipeline: generate much better code for pshufb mm/xmm/ymm. n-i-bz.
pshufb mm/xmm/ymm rearranges byte lanes in vector registers.  It's fairly
widely used, but we generated terrible code for it.  With this patch, we just
generate, at the back end, pshufb plus a bit of masking, which is a great
improvement.
2018-12-22 07:23:00 +01:00
Julian Seward
6cb6bdbd0a amd64 hosts: detect SSSE3 (not SSE3) capabilities on the host. As-yet unused. n-i-bz. 2018-12-22 06:06:19 +01:00
Julian Seward
01f1936b12 Adjust ppc set_AV_CR6 computation to help Memcheck instrumentation.
* changes set_AV_CR6 so that it does scalar comparisons against zero,
  rather than sometimes against an all-ones word.  This is something
  that Memcheck can instrument exactly.

* in Memcheck, requests expensive instrumentation of Iop_Cmp{EQ,NE}64
  by default on ppc64le.

https://bugs.kde.org/show_bug.cgi?id=386945#c62
2018-12-20 22:46:59 +01:00
Mark Wielaard
3ef4b2c780 Implement ppc64 lxvb16x as 128-bit vector load with reversed double words.
This makes it possible for memcheck to know which part of the 128bit
vector is defined, even if the load is partly beyond an addressable block.

Partially resolves bug 386945.
2018-12-20 22:46:59 +01:00
Mark Wielaard
98a73de1c0 Implement ppc64 lxvd2x as 128-bit load with double word swap for ppc64le.
This makes it possible for memcheck to know which part of the 128bit
vector is defined, even if the load is partly beyond an addressable block.

Partially resolves bug 386945.
2018-12-20 22:46:59 +01:00
Mark Wielaard
0ed17bc9f6 Implement ppc64 ldbrx as 64-bit load and Iop_Reverse8sIn64_x1.
This makes it possible for memcheck to analyse the new gcc strcmp
inlined code correctly even if the ldbrx load is partly beyond an
addressable block.

Partially resolves bug 386945.
2018-12-20 22:46:59 +01:00
Vadim Barkov
600a0099a1 Bug 385411 s390x: Add z13 vector floating point support
This adds support for the z/Architecture vector FP instructions that were
introduced with z13.

The patch was contributed by Vadim Barkov, with some clean-up and minor
adjustments by Andreas Arnez.
2018-11-30 14:29:39 +01:00
Julian Seward
f2c03ce3ba Bug 401112 - LLVM 5.0 generates comparison against partially initialized data.
This generalises the existing spec rules for W of 32 bits:

             W  <u   0---(N-1)---0 1 0---0  or

(that is, B/NB after SUBL, where dep2 has the above form), to also cover

             W  <=u  0---(N-1)---0 0 1---1

(that is, BE/NBE after SUBL, where dep2 has the specified form).

Patch from Nicolas B. Pierron (nicolas.b.pierron@nbp.name).
2018-11-28 14:15:06 +01:00
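One way to see why the two constant shapes in f2c03ce3ba above are special: an unsigned comparison against a single set bit, or against a run of low ones, depends only on the bits of W above that position. The sketch below checks that equivalence for a bit position k in 0..31. The names are hypothetical, and the IR that the spec rule actually produces is not shown in the commit message.

```c
#include <stdint.h>

/* W <u 0...010...0, with the single 1 at bit position k. */
int below_pow2(uint32_t w, unsigned k)
{
   return w < (UINT32_C(1) << k);
}

/* W <=u 0...001...1, with ones in bit positions 0..k-1. */
int at_most_low_ones(uint32_t w, unsigned k)
{
   return w <= (UINT32_C(1) << k) - 1;
}

/* Both conditions hold exactly when every bit of W at position k or
   above is zero, so the low k bits of W never matter. */
int high_bits_clear(uint32_t w, unsigned k)
{
   return (w >> k) == 0;
}
```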
Andreas Arnez
ddfc274b24 s390x: More fixes for z13 support
This patch addresses the following:

* Fix the implementation of LOCGHI.  Previously Valgrind performed 32-bit
  sign extension instead of 64-bit sign extension on the immediate value.

* Advertise VXRS in HWCAP.  If no VXRS are advertised, but the program
  uses vector registers, this could cause problems with a glibc built with
  "-march=z13".
2018-11-22 13:45:56 +01:00
Julian Seward
27fe22378d Add support for Iop_{Sar,Shr}8 on ppc. --expensive-definedness-checks=yes needs them. 2018-11-20 12:09:03 +01:00
Julian Seward
cb5d7e0475 VEX/priv/ir_opt.c
fold_Expr: transform PopCount64(And64(Add64(x,-1),Not64(x))) into CtzNat64(x).

This is part of the fix for bug 386945.
2018-11-20 11:46:55 +01:00
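The fold in cb5d7e0475 above relies on the identity that (x - 1) & ~x has ones exactly in the positions of the trailing zeroes of x, and is all ones when x is zero. A portable sketch that checks it is given below; the helper names are hypothetical plain-C stand-ins for the IROps.

```c
#include <stdint.h>

/* Population count: clear the lowest set bit until none remain. */
unsigned popcount64(uint64_t x)
{
   unsigned n = 0;
   while (x != 0) { x &= x - 1; n++; }
   return n;
}

/* Count trailing zeroes with "natural" semantics: 64 for a zero input. */
unsigned ctz_nat64(uint64_t x)
{
   unsigned n = 0;
   if (x == 0) return 64;
   while ((x & 1) == 0) { x >>= 1; n++; }
   return n;
}

/* The expression that fold_Expr recognises, written in plain C. */
unsigned ctz_via_popcount64(uint64_t x)
{
   return popcount64((x - 1) & ~x);
}
```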
Julian Seward
81d9832226 ppc front end: use new IROps added in 42719898.
This pertains to bug 386945.

VEX/priv/guest_ppc_toIR.c:

gen_POPCOUNT: use Iop_PopCount{32,64} where possible.

gen_vpopcntd_mode32: use Iop_PopCount32.

for cntlz{w,d}, use Iop_CtzNat{32,64}.

gen_byterev32: use Iop_Reverse8sIn32_x1 instead of lengthy sequence.

verbose_Clz32: remove (was unused anyway).
2018-11-20 11:36:53 +01:00
Julian Seward
97d336b79e Add ppc host-side isel and instruction support for IROps added in previous commit.
VEX/priv/host_ppc_defs.c, VEX/priv/host_ppc_defs.h:

Don't emit cnttz{w,d}.  We may need them on a target which doesn't support
them.  Instead we can generate a fairly reasonable alternative sequence with
cntlz{w,d} instead.

Add support for emitting popcnt{w,d}.

VEX/priv/host_ppc_isel.c

Add support for: Iop_ClzNat32 Iop_ClzNat64

Redo support for: Iop_Ctz{32,64} and their Nat equivalents, so as to not use
cnttz{w,d}, as mentioned above.

Add support for: Iop_PopCount64 Iop_PopCount32 Iop_Reverse8sIn32_x1
2018-11-20 11:09:30 +01:00
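Commit 97d336b79e above mentions computing count-trailing-zeroes from count-leading-zeroes. One well-known way to do this is sketched below for 32 bits. It is an assumption that the sequence the ppc back end emits follows this shape; the helper names are hypothetical.

```c
#include <stdint.h>

/* Count leading zeroes with "natural" semantics: 32 for a zero input. */
unsigned clz_nat32(uint32_t x)
{
   unsigned n = 0;
   if (x == 0) return 32;
   while ((x & 0x80000000u) == 0) { x <<= 1; n++; }
   return n;
}

/* ~x & (x - 1) keeps ones only where x has trailing zeroes, so its
   leading-zero count is 32 minus the trailing-zero count of x. */
unsigned ctz_via_clz32(uint32_t x)
{
   return 32 - clz_nat32(~x & (x - 1));
}
```

For a zero input the intermediate value is all ones, so the result is 32, which matches the Nat variants.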
Julian Seward
4271989815 Add some new IROps to support improved Memcheck analysis of strlen etc.
This is part of the fix for bug 386945.  It adds the following IROps, plus
their supporting type- and printing- fragments:

Iop_Reverse8sIn32_x1: 32-bit byteswap.  A fancy name, but it is consistent
with naming for the other swapping IROps that already exist.

Iop_PopCount64, Iop_PopCount32: population count

Iop_ClzNat64, Iop_ClzNat32, Iop_CtzNat64, Iop_CtzNat32: counting leading and
trailing zeroes, with "natural" (Nat) semantics for a zero input, meaning, in
the case of zero input, return the number of bits in the word.  These
functionally overlap with the existing Iop_Clz64, Iop_Clz32, Iop_Ctz64,
Iop_Ctz32.  The existing operations are undefined in case of a zero input.
Adding these new variants avoids the complexity of having to change the
declared semantics of the existing operations.  Instead they are deprecated
but still available for use.
2018-11-20 10:52:33 +01:00
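The semantics described in 4271989815 above can be modelled in plain C as follows. These are reference sketches of what the operations compute according to the commit message, with hypothetical function names; they are not the VEX implementations.

```c
#include <stdint.h>

/* Model of Iop_Reverse8sIn32_x1: reverse the four bytes of a word. */
uint32_t reverse8s_in32(uint32_t x)
{
   return (x >> 24) | ((x >> 8) & 0x0000FF00u)
        | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Model of Iop_ClzNat64: count leading zeroes, returning the word
   size (64) for a zero input instead of an undefined value. */
unsigned clz_nat64(uint64_t x)
{
   unsigned n = 0;
   if (x == 0) return 64;
   while ((x & UINT64_C(0x8000000000000000)) == 0) { x <<= 1; n++; }
   return n;
}
```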
Andreas Arnez
9545e9f96b Bug 400491 s390x: Sign-extend immediate operand of LOCHI and friends
The VEX implementation of each of the z/Architecture instructions LOCHI,
LOCHHI, and LOCGHI treats the immediate 16-bit operand as an unsigned
integer instead of a signed integer.  This is fixed.
2018-11-14 16:22:24 +01:00
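The difference fixed in 9545e9f96b above is the one between zero-extending and sign-extending a 16-bit immediate. A minimal sketch with hypothetical helper names follows; the real fix is expressed in VEX IR.

```c
#include <stdint.h>

/* Sign-extend a 16-bit immediate to 64 bits.  imm is promoted to int;
   flipping the sign bit and subtracting its weight avoids relying on
   implementation-defined narrowing conversions. */
int64_t sext16_to_64(uint16_t imm)
{
   return (int64_t)((int32_t)(imm ^ 0x8000) - 0x8000);
}

/* Zero-extension, i.e. treating the immediate as unsigned. */
uint64_t zext16_to_64(uint16_t imm)
{
   return (uint64_t)imm;
}
```

For the immediate 0xFFFF the first helper yields -1, while the second yields 65535.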