From f5b89d87b71816ec368aa35e24498330b26babca Mon Sep 17 00:00:00 2001
From: Thomas Harte
Date: Tue, 22 Dec 2020 22:34:42 -0400
Subject: [PATCH] First set of thoughts.

---
 Roadmap-for-Flexibly-Timed-Platforms.md | 52 +++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
 create mode 100644 Roadmap-for-Flexibly-Timed-Platforms.md

diff --git a/Roadmap-for-Flexibly-Timed-Platforms.md b/Roadmap-for-Flexibly-Timed-Platforms.md
new file mode 100644
index 0000000..ace84a1
--- /dev/null
+++ b/Roadmap-for-Flexibly-Timed-Platforms.md
@@ -0,0 +1,52 @@
+# Definition
+
+A flexibly-timed platform is any platform on which exact bus timings, CLK's current stock in trade, are either not well-defined or not of the essence.
+
+Unambiguous examples include a 486 PC or a PowerPC Mac; on neither of these platforms does software depend on exact bus timings. Pragmatically speaking, it would therefore be detrimental to the end user for CLK to implement exact timings there, given the greater development cost and substantially greater runtime cost.
+
+## Straddling Platforms
+
+Some platforms begin with a single, well-defined set of bus timings and then grow beyond it. On a macro level this is true of machines like the IBM PC and the Acorn Archimedes; in both cases some software depends on the exact timings of the original versions of the machine, despite the platform subsequently growing in scope such that any strong connection with guaranteed timing is lost.
+
+On a micro level, it is true of some processor families: CLK will generally need to provide bus-accurate implementations of early versions of processors, such as the 68000, while being able to provide only flexibly-timed versions of later family members, such as the 68040.
+
+# Intended Implementation
+
+1. an instruction decoder;
+2. which feeds a JIT threaded-code generator;
+3. the output of which consists of calls to suitably-templated functions, chosen pragmatically to minimise repeat decoding.
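As a rough illustration of how the three steps above might fit together, here is a minimal sketch built around a made-up two-instruction ISA. Everything in it (`Operation`, `Instruction`, `State`, `Step`, `decode`, `perform`, `translate`, `run`) is a hypothetical name invented for this example rather than anything that exists in CLK, and it assumes C++17 for `if constexpr`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical toy ISA, invented for illustration only: each 16-bit word
// is laid out as [operation:8][operand:8]. None of these names are CLK's.
enum class Operation : uint8_t { LoadImmediate = 0, AddImmediate = 1 };

struct Instruction {
    Operation operation;
    uint8_t operand;
};

struct State {
    uint32_t accumulator = 0;
};

// Step 1: a stateless decoder, mapping one word to one decoded instruction.
constexpr Instruction decode(uint16_t word) {
    return Instruction{static_cast<Operation>(word >> 8), static_cast<uint8_t>(word & 0xff)};
}

// Step 3: a templated performer. The operation is fixed at compile time,
// so it is not inspected again when the threaded code runs.
template <Operation operation>
void perform(State &state, const Instruction &instruction) {
    if constexpr (operation == Operation::LoadImmediate) {
        state.accumulator = instruction.operand;
    } else {
        state.accumulator += instruction.operand;
    }
}

// One entry of threaded code: a function pointer plus its decoded instruction.
struct Step {
    void (*function)(State &, const Instruction &);
    Instruction instruction;
};

// Step 2: the generator decodes each word once and picks a performer for it.
// Words with an unrecognised operation are silently skipped in this sketch.
std::vector<Step> translate(const std::vector<uint16_t> &words) {
    std::vector<Step> steps;
    for (const uint16_t word : words) {
        const Instruction instruction = decode(word);
        switch (instruction.operation) {
            case Operation::LoadImmediate:
                steps.push_back(Step{&perform<Operation::LoadImmediate>, instruction});
                break;
            case Operation::AddImmediate:
                steps.push_back(Step{&perform<Operation::AddImmediate>, instruction});
                break;
        }
    }
    return steps;
}

// Linear execution: call each step in turn.
void run(const std::vector<Step> &steps, State &state) {
    for (const Step &step : steps) step.function(state, step.instruction);
}
```

Translating the words `0x0005, 0x0103, 0x0102` and running the result loads 5 and then adds 3 and 2. The switch is paid once, at translation time; at run time each step costs only an indirect call.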
+
+For processor families like the 68k, the same instruction decoder can be used both by the threaded JIT generator and by bus-centric implementations of as many of the early members of the family as is justified.
+
+## Instruction Decoders
+
+These will necessarily be unique per CPU family. Expectations off the top of my head:
+
+* the 68000 decoder can be stateless, a simple mapping from a 16-bit word to a decoded instruction plus its length;
+* the PowerPC and ARM decoders (ignoring Thumb) can similarly be stateless, mapping 32-bit words to decoded instructions;
+* x86 decoders will need to be stateful, with an interface that accepts a buffer of bytes and its length and returns either: (i) a fully decoded instruction and its length; or (ii) an indication that more bytes are required, and an indication of how many if that is currently knowable.
+
+Fully-decoded instructions will be fixed-size descriptive structs or bit fields, presumably usually larger than the encoded input.
+
+## JIT Threaded-Code Generator
+
+This will likely be a simple list of function pointers; each function will be provided with processor state and the fully-decoded instruction. Executing linearly will likely involve just calling each in turn. Something like a std::map from program counter to entry point can be kept per translated page in order to retain those entry points that are actually in use. Some architectures may require multiple JIT streams per page, depending on instruction alignment rules and the possibility of positioning the program counter partway through what is otherwise a single instruction.
+
+It's likely that a suitably generic implementation of the page cache could be used for all relevant architectures.
+
+## Templated Instructions
+
+For the purpose of call cleanliness, it's likely that some degree of instruction composition via templating will be appropriate.
+
+E.g. for the 8086 that might mean providing the decoded ModRM byte as a template parameter as well as an argument, permitting the addressing mode to be selected by constexpr.
+
+What to include in the template signature will be a question of degree; e.g. it would make little sense to include all three registers nominated by some PowerPC instructions in the corresponding function template, as overloading the emulator with generated code would not be a path to good performance.
+
+# Likely Development Path
+
+1. factor out the proto-decoder that currently resides within the bus-centric 68000;
+2. extend it also to decode the 68020, 68030 and 68040 instruction sets;
+3. build the JIT threaded-code generator and deploy it coupled to the 68k decoder to provide flexibly-timed 68020, 68030 and 68040s;
+4. use those to build out the higher-function pre-PowerPC Macintoshes.
+
+With a workable pattern thus established, apply the same to other architectures.
\ No newline at end of file