Instead of a primary opcode lookup table with 64 entries and a few
smaller tables with 4-2048 entries, use a single 64 * 2048 (128K)
entry table to dispatch opcodes.
Helps with performance, since we avoid the function call overhead for
some frequently-used instructions (e.g. branch, integer, floating point).
Saves ~2 seconds from the time to Welcome to Macintosh (same measurement
methodology as #125)
Secondarily also makes opcode registration/decoding a bit more uniform,
and scannable, since it's now all in initialize_ppc_opcode_table.
Before f65f9b9845 this would happen
automatically (since the stream member was always valid). It's now a
unique_ptr that's only initialized when the file is opened, so we need
to add guards for all the operations.
Adds support for a --deterministic command-line option that makes
repeated runs the same:
- Keyboard and mouse input is ignored
- The sound server does a periodic pull from the DMA channel (so that
it gets drained), but only does so via a periodic timer (instead of
being driven by a cubeb callback, which could arrive at different
times)
- Disk image writes are disabled (reads of a modified area still
work via an in-memory copy)
- NVRAM writes are disabled
- The current time that ViaCuda initializes the guest OS is always the
same.
This makes execution exactly the same each time, which should
make debugging of more subtle issues easier.
To validate that the deterministic mode is working, I've added a
periodic log of the current "time" (measured in cycle count), PC
and opcode. When comparing two runs with --log-no-uptime, the generated
log files are identical.
size_t and off_t are 32-bit values in Emscripten, which causes issues
with disk images larger than 4GB. Use the explicit type (which is more
consistent with the rest of the codebase anyway).
Keeps track of instructions (including operands) that are executed,
to see if there are any hotspots that could be optimized or fastpaths
that should be added.
Also adds a mode where CPU profiler data is periodically output, to
make it easier to get at these instruction counts during startup.
Allows different implementations for different platforms (the JS
build relies on browser APIs to stream disk images over the network).
Setting aside the JS build, this also reduces some code duplication.
Result of running IWYU (https://include-what-you-use.org/) and
applying most of the suggestions about unncessary includes and
forward declarations.
Was motivated by observing that <thread> was being included in
ppcopcodes.cpp even though it was unused (found while researching
the use of threads), but seems generally good to help with build
times and correctness.
While Emscripten has an SDL compabtility layer, it assumes that the
code is executing in the main browser process (and thus has access to
them DOM). The Infinite Mac project runs emulators in a worker thread
(for better performance) and has a custom API for the display, sound,
input, etc. Similarly, it does not need the cross-platform sound support
from cubeb, there there is a sound API as well.
This commit makes SDL (*_sdl.cpp) and cubeb-based (*_cubeb.cpp) code be
skipped when targeting Emscripten, and instead *_js.cpp files are used
instead (this is the cross-platform convention used by Chromium[^1], and
could be extended for other targets).
For hostevents.cpp and soundserver.cpp the entire file was replaced,
whereas for videoctrl.cpp there was enough shared logic that it was
kept, and the platform-specific bits were moved behind a Display class
that can have per-platform implementations. For cases where we need
additional private fields in the platform-specific classes, we use
a PIMPL pattern.
The *_js.cpp files with implementations are not included in this
commit, since they are closely tied to the Infinite Mac project, and
will live in its fork of DingusPPC.
[^1]: https://www.chromium.org/developers/design-documents/conventions-and-patterns-for-multi-platform-development/