put it all into static state within a single function. Also, now the callers
of get_cpu_features() don't have to worry about whether it's been called
before.
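
A minimal sketch of the pattern described above, not the actual Valgrind
code. The type cpu_features_t, the helper probe_cpu_features() and the
probe_calls counter are hypothetical stand-ins; only the name
get_cpu_features() comes from this log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature record; the real structure is not shown in this log. */
typedef struct {
    bool has_sse;
    bool has_sse2;
    bool has_fxsr;
} cpu_features_t;

/* Counts how often the stand-in probe runs (illustration only). */
static int probe_calls = 0;

/* Stand-in for the real CPU query; returns fixed values here. */
static cpu_features_t probe_cpu_features(void)
{
    cpu_features_t f = { true, true, true };
    probe_calls++;
    return f;
}

/* All cached state lives inside this one function, so callers never
   need to know whether the probe has already happened. */
const cpu_features_t *get_cpu_features(void)
{
    static bool initialised = false;
    static cpu_features_t cached;

    if (!initialised) {
        cached = probe_cpu_features();
        initialised = true;
    }
    assert(probe_calls == 1);   /* the probe runs exactly once */
    return &cached;
}
```

Callers simply use get_cpu_features()->has_sse and so on; the first call
pays for the probe and later calls return the cached result.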
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2572
exists on Athlons that have MMXEXT support and those don't have SSE state
so won't decode it where it was.
CCMAIL: 85947-done@bugs.kde.org
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2524
(almost useless) instruction "xadd %reg,%reg" gave the wrong answer
due to a subtlety of the order in which the destination registers are
PUTted to.
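
To illustrate why the write order matters, here is a toy model of xadd,
assuming the operation order given in the Intel manual (temp = src + dst;
src = dst; dst = temp). The register file and the name xadd32() are
hypothetical and unrelated to Valgrind's UCode.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file; the indices are hypothetical. */
enum { R_EAX, R_EBX, N_REGS };

/* Model of "xadd %src,%dst". The destination is written last, which is
   what decides the outcome when src and dst name the same register. */
void xadd32(uint32_t regs[N_REGS], int src, int dst)
{
    assert(src >= 0 && src < N_REGS && dst >= 0 && dst < N_REGS);
    uint32_t old_dst = regs[dst];
    uint32_t sum = regs[src] + old_dst;
    regs[src] = old_dst;   /* source receives the old destination value */
    regs[dst] = sum;       /* destination receives the sum, written last */
}
```

With distinct registers the two writes do not interact. With the same
register the second write wins, so the register ends up holding the sum;
swapping the two writes would leave the old value instead.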
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2513
into 0 or -1 in reg. This has no actual dependency on reg, but
memcheck can't see that, and so will yelp if reg contains garbage. A
simple fix is to put zero into reg before we start, zapping any
undefinedness it might otherwise contain.
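
A small model of why the result has no real dependency on reg, assuming
the idiom in question is "sbb %reg,%reg" (the start of this message is
cut off, so that is a guess). The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* sbb computes dst - src - CF. When src and dst are the same register
   the register value cancels, leaving 0 or -1 depending only on CF. */
uint32_t sbb_same_reg(uint32_t reg, unsigned carry)
{
    assert(carry <= 1);
    return reg - reg - carry;
}
```

Whatever reg holds on entry, the result is 0 when the carry is clear and
0xFFFFFFFF when it is set, which is why zeroing reg first is harmless.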
Hopefully fixes #84978 (unconfirmed)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2512
to query the CPU characteristics as the use of four implicit registers
causes havoc when GCC tries to inline and optimise the assembler.
Fix to bug #79696.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2421
Patch to ignore REP prefixes on single byte RET instructions.
(REP RET is apparently faster than RET on AMD K7/K8)
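
A sketch of the decode rule, assuming the usual encodings (0xF3 for the
REP prefix, 0xC3 for the single byte RET). decode_ret() is a hypothetical
helper, not the function used in the disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the number of bytes consumed if code[] starts with RET or
   REP RET, and 0 otherwise. */
size_t decode_ret(const uint8_t *code, size_t len)
{
    assert(code != NULL);
    if (len >= 1 && code[0] == 0xC3)
        return 1;                          /* plain RET */
    if (len >= 2 && code[0] == 0xF3 && code[1] == 0xC3)
        return 2;                          /* REP RET: prefix is ignored */
    return 0;                              /* something else */
}
```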
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2257
This patch adds translation tests for most of the basic x86 instructions and
fixes a few missing/broken instructions to work properly.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2242
the assembler was getting fiddly. It now masks out only the undefined
or unimplemented parts of the feature set bits, so it now passes through
all the non-ISA-related feature bits to clients.
It also leaves the vendor ID string unmolested, so that clients can
extract vendor-specific information like extended brand strings and
cache/TLB configuration info.
It does, however, implement some Valgrind-specific requests at 0xd8000000,
though at present the only functionality is the ValgrindVCPU signature.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2236
- introduce DIS() and DIP() macros to shorten debug printing
- introduce jmp_lit(), jcc_lit(), jmp_treg() for common UCode sequences
- replace many unnecessary dis?dis_buf:NULL tests with dis_buf, by
changing the tests in disAMode()
Overall, reduced code size by about 230 lines.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2234
we now completely virtualize it. The feature flags returned are the
intersection of the set the CPU supports, and the set of flags Valgrind
supports. This turns out to be a small number of features, like FPU,
TSC, MMX, SSE, SSE2, FXSR. All mention of things which are only useful
to kernel-mode code is also suppressed. This CPUID doesn't support
any extended feature flags, or extended CPUID operations. It returns a
vendor string of "ValgrindVCPU".
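
A rough sketch of the intersection and the vendor string, assuming the
standard CPUID leaf 1 EDX bit positions and the standard EBX, EDX, ECX
packing of the leaf 0 vendor string. The mask below covers only the six
features named above (the real set may differ), and the function names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* CPUID leaf 1, EDX feature bits. */
#define CPUID_FPU  (1u << 0)
#define CPUID_TSC  (1u << 4)
#define CPUID_MMX  (1u << 23)
#define CPUID_FXSR (1u << 24)
#define CPUID_SSE  (1u << 25)
#define CPUID_SSE2 (1u << 26)

/* Assumed set of features the virtual CPU is willing to report. */
#define VCPU_SUPPORTED \
    (CPUID_FPU | CPUID_TSC | CPUID_MMX | CPUID_FXSR | CPUID_SSE | CPUID_SSE2)

/* Report only what both the host and the virtual CPU support. */
uint32_t virtual_features(uint32_t host_edx)
{
    return host_edx & VCPU_SUPPORTED;
}

/* Pack the 12-character vendor string the way CPUID leaf 0 returns it. */
void virtual_vendor(uint32_t *ebx, uint32_t *edx, uint32_t *ecx)
{
    static const char vendor[] = "ValgrindVCPU";
    assert(sizeof vendor - 1 == 12);
    memcpy(ebx, vendor + 0, 4);
    memcpy(edx, vendor + 4, 4);
    memcpy(ecx, vendor + 8, 4);
}
```

A host bit outside the mask, such as bit 5 (MSR, useful only to kernel
code), is dropped by virtual_features().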
If the host CPU doesn't support CPUID, then we make sure we treat it as
an illegal instruction (I'm not sure if we handle the eflags bit toggle
test right). This is because the CPUID helper doesn't actually use the
cpuid instruction in all cases, so it may succeed where the host CPU
wouldn't (other instructions which depend on feature flags will end up
generating those instructions, so they'll end up generating a SIGILL if
client code uses them on a CPU which doesn't support them).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2225
capabilities, and uses it to see if it has SSE/SSE2/fxsave support before
trying to use fxsave at startup.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2221
Patch to improve SSE/SSE2 support
This patch should implement most of the missing SSE/SSE2 opcodes. About
the only ones it doesn't do are the MASKMOVxxx ones as they are quite
horrible and involve an implicit reference to EDI so I need to think
about them a bit more.
The patch also includes a set of tests for the MMX/SSE/SSE2 opcodes to
validate that they have the same effect under valgrind as they do when
run normally. In one or two cases this wasn't actually the case even
for some of the implemented opcodes, so I fixed those as well ;-)
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2202
I added a test case and cleaned up vg_dispatch.S while I was about it.
CCMAIL: 69529-done@bugs.kde.org
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2129
Valgrind's dependency on the dynamic linker for getting started, and
instead takes things into its own hands.
This checkin doesn't add much in the way of new functionality, but it
is the basis for all future work on Valgrind. It allows us much more
flexibility in implementation, as well as increasing the reliability
of Valgrind by protecting it more from its clients.
This patch requires some changes to tools to update them to the changes
in the tool API, but they are straightforward. See the posting "Heads
up: Full Virtualization" on valgrind-developers for a more complete
description of this change and its effects on you.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2118
This patch extends the SFENCE support that is already present to include
support for LFENCE and MFENCE as well. It also stops CLFLUSH being mistaken
for SFENCE by checking the top two bits of the MODRM byte.
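
A sketch of the distinction, assuming the standard encodings: all four
instructions start with 0F AE, the fences use a register-form ModRM
(top two bits 11) with /5, /6 and /7, and CLFLUSH uses /7 with a memory
operand. The enum and classify_0f_ae() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    OP_UNKNOWN, OP_LFENCE, OP_MFENCE, OP_SFENCE, OP_CLFLUSH
} fence_op_t;

/* insn points at the two opcode bytes 0F AE followed by the ModRM byte. */
fence_op_t classify_0f_ae(const uint8_t *insn)
{
    assert(insn[0] == 0x0F && insn[1] == 0xAE);
    unsigned mod = insn[2] >> 6;          /* top two bits of ModRM */
    unsigned reg = (insn[2] >> 3) & 7;    /* opcode extension field */

    if (mod == 3) {                       /* register form: the fences */
        switch (reg) {
        case 5:  return OP_LFENCE;
        case 6:  return OP_MFENCE;
        case 7:  return OP_SFENCE;
        default: return OP_UNKNOWN;
        }
    }
    if (reg == 7)                         /* memory form of /7 */
        return OP_CLFLUSH;
    return OP_UNKNOWN;                    /* other 0F AE forms: not covered */
}
```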
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2087
so much so that the file is now 280 lines shorter. This despite me also adding
support for LOOP{E,NE} (thanks to Abhijit Menon-Sen). Also added support for
CMPS[lw], which was missing. Adding more REP-prefix instructions in the future
will now be much easier.
As part of this, I moved the D-flag fetch outside of the REP loops. This might
make programs that use REP prefixes a lot go faster.
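
A toy interpreter loop showing the idea of reading the D flag once
before the loop rather than on every iteration. This is only an analogy
for the generated code; toy_cpu_t and rep_movsb() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine state; the field names are hypothetical. */
typedef struct {
    uint8_t  mem[64];
    uint32_t esi, edi, ecx;
    int      dflag;            /* 0 = forward, 1 = backward */
} toy_cpu_t;

/* Model of REP MOVSB: copy ECX bytes from [ESI] to [EDI]. */
void rep_movsb(toy_cpu_t *cpu)
{
    /* Fetch the direction once; nothing in the loop body changes it. */
    int32_t step = cpu->dflag ? -1 : 1;

    while (cpu->ecx != 0) {
        assert(cpu->esi < sizeof cpu->mem && cpu->edi < sizeof cpu->mem);
        cpu->mem[cpu->edi] = cpu->mem[cpu->esi];
        cpu->esi += step;
        cpu->edi += step;
        cpu->ecx--;
    }
}
```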
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@2068
Scientific Library (gsl-1.4) compiled with Intel Icc 7.1 20030307Z '-g
-O -xW'. I think this gives pretty good coverage of SSE/SSE2 floating
point instructions, or at least the subset emitted by Icc. So far
tested on memcheck and nulgrind; addrcheck and cachesim still testing.
MERGE TO STABLE
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@1955
can treat it like add and generate partially-defined results of multiply
with partially defined arguments. It may also speed things up a bit,
if they use lots of multiplies.
This change only deals with signed "new style" multiplies. The x86
has two quite different kinds of multiply instructions: the "old-style"
signed and unsigned multiply which uses fixed registers (eax:edx) and
generates a result twice the size of the arguments, and the newer signed
multiply which takes general addressing modes. It seems that gcc always
(almost always?) generates the new signed multiply instructions, except
for byte-sized multiplies.
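
To make the difference between the two kinds concrete, here is a plain C
model of the 32-bit signed forms. The function names are hypothetical
and this says nothing about how the UCode represents either one.

```c
#include <assert.h>
#include <stdint.h>

/* "Old-style" one-operand signed multiply: the 64-bit product of EAX and
   the operand is split across two fixed registers (low half in EAX,
   high half in EDX). */
void imul_wide(int32_t eax_in, int32_t operand,
               uint32_t *eax_out, uint32_t *edx_out)
{
    assert(eax_out != 0 && edx_out != 0);
    uint64_t product = (uint64_t)((int64_t)eax_in * (int64_t)operand);
    *eax_out = (uint32_t)(product & 0xFFFFFFFFu);
    *edx_out = (uint32_t)(product >> 32);
}

/* "New-style" signed multiply: the result is the same size as the
   operands, so the high half of the product is discarded. Multiplying
   as unsigned gives well-defined wraparound; the final conversion back
   to int32_t relies on the usual two's-complement behaviour. */
int32_t imul_trunc(int32_t dst, int32_t src)
{
    return (int32_t)((uint32_t)dst * (uint32_t)src);
}
```

Because the new-style result is the same width as its inputs, it can be
handled much like an add, which is the property the change relies on.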
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@1925