Note: all timings are relative to virtual time which is 16 ns per instruction. 8 ns per instruction is too fast for the PDM 'mach' gestalt calculation in firmware.
- For MPC601, bit 7 of RTC and DEC changes at 7.8336 MHz so bit 0 would change at 1.0027008 GHz. Therefore multiply time base frequency by 128. For MPC601, DEC now changes at 1.0027008 GHz instead of 7.8336 MHz.
- Add tbr_freq_shift for cases where the time base frequency exceeds 1 GHz.
- Change calc_rtcl_value to use the time base frequency. For MPC601, RTC now changes at 1.0027008 GHz instead of 1 GHz.
- For MPC601, the 7 least significant bits of DEC are not implemented, so make them neither gettable nor settable.
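
The MPC601 arithmetic above can be checked with a minimal sketch. The
constant and function names here are hypothetical and are not taken from
the source tree; only the numbers come from the notes above.

```cpp
#include <cstdint>

// Hypothetical names, for illustration only.
// Bit 7 of RTC/DEC changes at 7.8336 MHz on the MPC601.
constexpr uint64_t MPC601_BIT7_HZ = 7833600ULL;
// Bit 0 would change 2^7 = 128 times faster.
constexpr uint64_t MPC601_BIT0_HZ = MPC601_BIT7_HZ << 7;
// The 7 least significant bits of DEC are not implemented.
constexpr uint32_t MPC601_DEC_MASK = ~0x7FU;

// Value of DEC as visible to software: low 7 bits always read as zero.
constexpr uint32_t mpc601_dec_visible(uint32_t raw) {
    return raw & MPC601_DEC_MASK;
}

static_assert(MPC601_BIT0_HZ == 1002700800ULL, "7.8336 MHz * 128 = 1.0027008 GHz");
```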
In #135 we switched from a static OpcodeGrabber table to a
curOpcodeGrabber pointer in ppc_main_opcode. This results in an extra
indirection (the generated assembly has an additional load), which
reduces execution speed.
Switch to making the opcode grabber into a parameter to
ppc_main_opcode, and make ppc_exec_inner keep it up to date (via an
EXEF_OPCODE exception flag).
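
A toy sketch of the idea, not the actual implementation: the table
pointer is passed to the dispatcher as a parameter, and the execution
loop only reloads it when a flag says the table changed. All names
(EXEF_OPCODE_DEMO, table_a, run, and so on) are hypothetical, and the
two-entry tables stand in for the real ones.

```cpp
#include <cstddef>
#include <cstdint>

using Handler = void (*)(uint32_t opcode);

constexpr uint32_t EXEF_OPCODE_DEMO = 1u << 0; // hypothetical flag bit

static uint32_t exec_flags = 0;
static int hits_a = 0, hits_b = 0;

static void count_a(uint32_t) { ++hits_a; }
static void count_b(uint32_t) { ++hits_b; }
static void nop(uint32_t) {}
static void switch_to_b(uint32_t);

// Two toy tables indexed by the low bit of the primary opcode.
static const Handler table_a[2] = { count_a, switch_to_b };
static const Handler table_b[2] = { count_b, nop };
static const Handler* current_table = table_a;

// A handler that changes the active table and flags the change.
static void switch_to_b(uint32_t) {
    current_table = table_b;
    exec_flags |= EXEF_OPCODE_DEMO;
}

// The table is a parameter, so the caller can keep it in a register.
static inline void dispatch(const Handler* table, uint32_t opcode) {
    table[(opcode >> 26) & 1](opcode);
}

void run(const uint32_t* code, size_t count) {
    const Handler* table = current_table; // loaded once, outside the loop
    for (size_t i = 0; i < count; ++i) {
        dispatch(table, code[i]);
        if (exec_flags & EXEF_OPCODE_DEMO) { // refresh only when flagged
            table = current_table;
            exec_flags &= ~EXEF_OPCODE_DEMO;
        }
    }
}
```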
Also fixes FPU instructions in ppctests - we now need to set the FP
MSR bit when initializing the CPU.
When the FP bit of the MSR is clear, floating point instructions should
trigger a "no FPU" exception rather than run normally. This appears to
be required to allow correct graphical rendering under Mac OS X - the
FP bit is cleared via mtmsr and rfi instructions, and something else
appears to rely on the exception being thrown.
Implemented by maintaining a parallel version of the OpcodeGrabber
table (OpcodeGrabberNoFPU) which contains alternate implementations
for all the floating point instructions. We switch the table whenever
the MSR value changes. This should minimize the overhead of doing
these checks.
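
A minimal sketch of the table switch, assuming one-entry stand-ins for
the two tables. MSR_FP (0x2000) is the architected FP-available bit;
every other name here is hypothetical, and the "no FPU" handler only
counts calls instead of raising a real exception.

```cpp
#include <cstdint>

constexpr uint32_t MSR_FP = 0x2000; // FP available bit of the PowerPC MSR

using Handler = void (*)();

static int fp_executed = 0, fp_unavailable = 0;

static void fp_op() { ++fp_executed; }          // normal implementation
static void fp_op_nofpu() { ++fp_unavailable; } // stand-in for the exception

// One-entry stand-ins for OpcodeGrabber and OpcodeGrabberNoFPU.
static const Handler grabber[1] = { fp_op };
static const Handler grabber_nofpu[1] = { fp_op_nofpu };

static uint32_t msr = 0;
static const Handler* active_table = grabber_nofpu;

// The check happens once per MSR write rather than once per instruction.
void write_msr(uint32_t value) {
    msr = value;
    active_table = (msr & MSR_FP) ? grabber : grabber_nofpu;
}
```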
Some of these accept a 4th register "C" so they exist in the opcode table 32 times.
Some of these don't accept a 4th register "C" so they exist in the opcode table once.
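
To illustrate where the factor of 32 could come from, here is a sketch
that assumes a sub-table indexed by a 10-bit extended opcode, with the
5-bit "C" register field sitting directly above a 5-bit opcode (as in
PowerPC A-form floating point instructions). The helper names are
hypothetical.

```cpp
#include <cstdint>
#include <vector>

// 10-bit slot for an A-form instruction: C field above the 5-bit opcode.
constexpr uint32_t a_form_slot(uint32_t xo5, uint32_t c) {
    return (c << 5) | xo5;
}

// Slots an instruction has to be registered in.
std::vector<uint32_t> slots_for(uint32_t xo5, bool uses_c) {
    std::vector<uint32_t> slots;
    const uint32_t c_count = uses_c ? 32 : 1; // every C value, or only C = 0
    for (uint32_t c = 0; c < c_count; ++c)
        slots.push_back(a_form_slot(xo5, c));
    return slots;
}
```

Under this assumption, fmul (opcode 25, uses C) occupies 32 slots, while
fadd (opcode 21, no C) occupies one.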
Instead of a primary opcode lookup table with 64 entries and a few
smaller tables with 4-2048 entries, use a single 64 * 2048 (128K)
entry table to dispatch opcodes.
Helps with performance, since we avoid the function call overhead for
some frequently-used instructions (e.g. branch, integer, floating point).
Saves ~2 seconds from the time to "Welcome to Macintosh" (same
measurement methodology as #125).
Secondarily also makes opcode registration/decoding a bit more uniform
and scannable, since it's now all in initialize_ppc_opcode_table.
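
One way to index such a flat table, as a sketch: it assumes the 2048
secondary entries correspond to the low 11 bits of the instruction word.
That layout is an assumption, and the names are hypothetical.

```cpp
#include <cstdint>

constexpr uint32_t PRIMARY_COUNT = 64;     // 6-bit primary opcode
constexpr uint32_t SECONDARY_COUNT = 2048; // assumed: low 11 bits of the word
constexpr uint32_t TABLE_SIZE = PRIMARY_COUNT * SECONDARY_COUNT;

// Combine the primary opcode (top 6 bits) with the low 11 bits.
constexpr uint32_t flat_index(uint32_t opcode) {
    return ((opcode >> 26) << 11) | (opcode & 0x7FF);
}

static_assert(TABLE_SIZE == 131072, "64 * 2048 = 128K entries");
```

With this layout a single indexed load finds the handler, at the cost of
registering instructions with fewer significant bits in many slots.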
Replace it with an explicit opcode parameter that is passed around. That
is both slightly easier to reason about (to trace where it comes from)
and slightly faster, since it can be read from a register.
On my machine, this takes the time to reach "Welcome to Macintosh" in
a verbose boot of Mac OS X 10.2.8 from 31.8s to 30.6s (average of 5
runs, measured using deterministic mode and looking at when execution
reaches PC 0x90004a88).
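
A toy handler showing the shape of the change, assuming a hypothetical
register file and handler name: the instruction word arrives as an
argument and the register fields are decoded from it directly.

```cpp
#include <cstdint>

static uint32_t gpr[32] = {}; // hypothetical register file

// Toy "add rD,rA,rB": the opcode is a parameter, not a global.
static void toy_add(uint32_t opcode) {
    const uint32_t rd = (opcode >> 21) & 0x1F;
    const uint32_t ra = (opcode >> 16) & 0x1F;
    const uint32_t rb = (opcode >> 11) & 0x1F;
    gpr[rd] = gpr[ra] + gpr[rb];
}
```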
ppc_opcode16 and other functions are only needed in the implementation in
ppcexec.cpp; they don't need to be in the header.
fp_return_double and fp_return_uint64 have no uses (as of 2141a72b87)
and can thus be removed altogether.
Similarly ppc_fpu_off has no uses (as of bb3f4e596e)
and can be removed.