- add wrappers with default calling convention for powerpc_cpu member functions used through nv_mem_fun ptr()
** explicit wrappers for member functions that were used explicitly
** dynamic wrapper generator in nv_mem_fun1_t for member functions used dynamically via the instruction table
- add missing direct addressing (non-zero constant offset to Mac memory) support in lvx and stvx implementations
- fix mismatched parameter lists between powerpc_jit member functions and the calls they get through the jit_info table to fix problems at -O2
Makes SheepShaver compatible with Ubuntu Intrepid and
other distros that bundle the gcc-4.3 compiler.
The patch changes two things:
1. Renames the block_cache where its name collides with its class
definition.
2. Fixes the "explicit template specialization cannot have a storage
class" error in the ppc-dyngen-ops.cpp file.
set to 0 until generated code is optimized enough (current slow down factor
is 3x vs. previous core, expectations are about 50% slower FP code).
The main benefit is exception bits are accurate. All glibc test-fenv,
test-arith{,f}, test-double, test-float pass on ppc, and mostly on x86_64
with gcc 4.0.1. Yes, this is also compiler dependent.
FIXME: find a real Mac application that depends on precise FPSCR bits... I
think I don't want to care optimizing yet until someone shows me a real world
application.
GPR macro definition.
Make the JIT engine somewhat reentrant. This brings a massive performance
boost for applications that cause many Execute68k(). e.g. audio in PlayerPRO.
The merge probably got wrong as there are some problems probably due to the
experiment begining with CR deferred evaluation. With nbench/ppc, performance
improvement was around 2x. With nbench on x86, performance improvement was
around 4x on average.
Incompatible change: instr_info_t has a new field in the middle. But since
insertion of PPC_I(XXX) identifiers is auto-generated, there is no problem.
oldish EXEC_RETURN handling with a throw/catch mechanism since we
do have a dependency on extra conditions (invalidated cache) that
prevents fast execution loops.
predecode cache. This implies to run in interpreted mode only while
processing EmulOps or other native (nested) runs.
Note that the FLIGHT_RECORDER with a predecode cache gets slower than
without caching at all.