moa/docs/posts/2021-11-making_an_emulator.md
2021-11-28 09:48:31 -08:00

901 lines
40 KiB
Markdown

Making a 68000 Emulator in Rust
===============================
###### *Written November 2021 by transistor_fet*
A few months ago, I was looking for a project to do while recovering from a bout of illness. I
needed something that was straight-forward enough to work on without getting stuck on tough
decisions or the need to learn a lot before diving in, which was the case with all the other
projects on my plate at the time. My girlfriend suggested writing an
[emulator](https://jabberwocky.ca/projects/moa/), and my first thought was to try emulating the
computer I made last year, [computie](https://jabberwocky.ca/projects/Computie/), since I already
had a fair amount of code for it that I knew well, and the 68000 architecture and assembly language
was still fresh in my mind. Naturally I chose to write it in Rust.
I've worked on different projects that have some vague similarities to emulators, such as
programming language interpreters and artificial life simulations but I haven't actually tried to
make an emulator before. I'm not aiming to make a fast emulator, and since this is meant to be fun,
I'm not *as* concerned about accuracy (specifically instruction execution time accuracy). I would,
however, like something that's flexible enough to experiment with different hardware designs.
Perhaps I could use this with some of my future hardware designs to test out hardware configurations
and to develop and test the software that runs on them. It would also be nice to emulate some
vintage computers that use the 68000, which each have their own I/O devices that would need
simulating. With that in mind, we should work towards making independent components for each
device, which interact in regular ways and can be combined in different configurations without the
need to modify the components themselves (ie. no tight coupling of components)
I chose Rust because it's currently my favorite systems language. I've used C and C++ quite a bit
as well, but Rust's advanced type system is much more pleasant to work with, and the compile-time
checks means I can focus less on simple bugs and more on the problem at hand, getting more done in
the same amount of time. I'm assuming here some familiarity with Rust, as well as the basic
principles of how a computer works, but not necessarily that much about the 68000 or emulators. We
will start with some code to simulate the computer memory, creating an abstract way of accessing
data. We'll then implement the NOP (no operation) instruction, the simplest possible instruction,
for the 68000, and expand the implementation from there. Once we've created a way of handling the
passage of time for the CPUs and I/O devices, we'll implement a simple serial port controller which
will act as our primary I/O device. From there we'll make all of our simulated devices, represented
as `struct` objects in Rust, look the same so that we can treat them the same, regardless of what
they represent internally. We'll then be able to package them up into single working system and set
them in motion to run the Computie software from binaries images.
* [The Computer](#the-computer)
* [The 68000](#the-68000)
* [The 68681 Serial Port Controller](#the-68681-serial-port-controller)
* [The Memory](#the-memory)
* [Simulating The CPU](#simulating-the-cpu)
* [Adding Instructions](#adding-instructions)
* [Abstracting Time](#abstracting-time)
* [Some I/O](#some-i-o)
* [Box It Up](#box-it-up)
* [An Addressable of Addressables (The Data Bus)](#an-addressable-of-addressables--the-data-bus-)
* [A Happy Little System](#a-happy-little-system)
* [Tying It All Together](#tying-it-all-together)
* [Now What](#now-what)
The Computer
------------
Computie is a single board computer with a Motorola 68010 CPU connected to 1MB of RAM, some flash,
and an MC68681 dual serial port controller, which handles most of the I/O. (The
[68010](https://en.wikipedia.org/wiki/Motorola_68010) is almost identical to the 68000 but with some
minor fixes which won't affect us here. I'll mostly be referring to the common aspects of both
processors). One of the serial connections is used as a TTY to interact with either the unix-like
Computie OS, or the monitor software that's normally stored in the flash chip. It also supports a
CompactFlash card and SLIP connection over the other serial port for internet access, but we wont
cover those here. In order to get a working emulator, we'll focus on just the CPU, memory, and
MC68681 controller.
The 68000
---------
The 68000 is 32-bit processor with a 16-bit data bus and 16-bit arithmetic logic unit. It was used
on many early computers including the early Macintosh series, the early Sun Microsystems
workstations, the Amiga, the Atari ST, and the Sega Genesis/Mega Drive, just to name a few. It was
almost chosen for the IBM PC as well, but IBM wanted to use an inexpensive 8-bit data bus in the PC,
and the 68008 (with an 8-bit data bus) wasn't available at the time. The 8088, with it's 8-bit bus,
was available however, and we have been stuck with that decision ever since.
The 68000 has 8 32-bit general purpose data registers, and 7 32-bit general purpose address
registers plus two stack registers which can be accessed as the 8th address register depending on
whether the CPU is in Supervisor mode or User mode. Internally the address registers use a separate
bus and adder unit from the main arithmetic logic unit, which only operates on the data registers.
This affects which instructions can be used with which registers, and operations on address
registers don't always affect the condition flags (which has caused me many troubles both when
writing the Computie software, and with the implementation of instructions here).
A 16-bit status register is used for condition codes and for privileged flags like the Supervisor
mode enable flag and the interrupt priority level. Only the lower 8 bits can be modified from User
mode. The conditions flags are set by many of the instructions based on their results, and can be
checked by the conditional jump instructions to branch based on the results of comparisons.
The program counter register keeps track of the next instruction to be executed. Instructions are
always a multiple of 2-byte words, and the CPU uses [big
endian](https://en.wikipedia.org/wiki/Endianness) byte order, so the most significant byte will
always be in the lower byte address in memory. As an example, the NOP instruction uses the opcode
0x4E71, where 0x4E would be the first byte in memory followed by 0x71. A longer instruction like
ADDI can have instruction words after the opcode (in this case the immediate data to add). For
example, `addil #0x12345678, %d0` which adds the hex number 0x12345678 to data register 0 would
be encoded as the sequence of words [0x0680, 0x1234, 0x5678]. The opcode word, 0x0680, has encoded
in it that it's an ADDI instruction, that the size of the operation is a long word (32-bit), and
that it should use data register 0 as both the number to add to, and the destination where the
result will be stored. (Note: 68000 assembly language instructions move data from the left hand
operand to the right hand operand, unlike Intel x86 assembly language which uses the reverse).
A vector table is expected at address 0 which contains an array of up to 256 32-bit addresses. The
first address contains the value to be loaded into the Supervisor stack register upon reset, and the
remaining addresses are the value loaded into the program counter when a given exception occurs.
This table cannot be relocated on the 68000, but it can be changed after reset on the 68010.
Computie uses the 68010 for this exact feature, so that the OS can put the vector table in RAM, but
the monitor software doesn't use interrupts. We wont cover interrupts here so this feature isn't
needed for now and we can focus only on emulating the 68000. As for the vector table, we just need
to simulate how the processor starts up on power on or after a reset, in which case the vector table
is always at address 0, and the first two long words are the stack pointer and initial program
counter respectively.
The 68681 Serial Port Controller
--------------------------------
The MC68681 is a peripheral controller designed specifically for use with the 68000 series. It has
two serial channels, A and B, as well as an onboard 16-bit timer/counter, 8 general purpose output
pins, and 6 general purpose input pins. Internally it has 16 registers which can be read or written
to from the CPU. The behaviour of the 16 registers is sometimes different between reading and
writing, such as the status registers (SRA & SRB) which indicate the current status of the serial
ports when read, and the clock select registers (CSRA & CSRB) which configure the serial port clocks
when writing to the same address as the status registers. It can also generate an interrupt for
one of 8 different internal conditions, such as data ready to read on a given channel, or the timer
reaching zero. It uses it's own clock signal to generate the serial clocks and to count down with,
so it will need to run some code along side the CPU to simulate its internal state.
The Memory
----------
Now that we have a bit of background on the devices we'll be emulating, lets start making the
emulator. The first thing we'll need in order to emulate our computer is a way of accessing and
addressing memory where instruction data can be read from. We need to eventually have a common way
of reading and writing to either simulated ram, or simulated I/O devices. Since we want to keep
things generic and interchangeable, using Rust enums for the different devices in the system would
be too tightly-coupled. Traits it is, so we'll start with an `Addressable` trait.
```rust
type Address = u64; // I (wishfully) chose u64 here in case
// I ever emulate a 64-bit system
pub trait Addressable {
fn len(&self) -> usize;
fn read(&mut self, addr: Address, data: &mut [u8]) -> Result<(), Error>;
fn write(&mut self, addr: Address, data: &[u8]) -> Result<(), Error>;
...
}
```
I went through a few different iterations of this, especially for the `.read()` method. At first I
returned an iterator over the data starting at the address given, which works well for simulated
memory, but reading from an I/O device could return data that is unique to when it was read. For
example, when reading the next byte of data from a serial port, the data will be removed from the
device's internal FIFO and returned, and since at that point it wont be stored anywhere, we can't
have a reference to it. We need the data (of variable length) to be owned by the caller when the
method returns, and passing in a reference to a mutable array to hold that data is a simple way to
do that.
We can also add some default methods to the trait which will make it easier to access multi-byte
values. The example here only shows methods to read and write 16 bit numbers in big endian byte
order but there are also similar methods for 32-bit numbers and for little endian numbers.
```rust
pub trait Addressable {
...
fn read_beu16(&mut self, addr: Address) -> Result<u16, Error> {
let mut data = [0; 2];
self.read(addr, &mut data)?;
Ok(read_beu16(&data))
}
fn write_beu16(&mut self, addr: Address, value: u16) -> Result<(), Error> {
let mut data = [0; 2];
write_beu16(&mut data, value);
self.write(addr, &data)
}
...
}
#[inline(always)]
pub fn read_beu16(data: &[u8]) -> u16 {
(data[0] as u16) << 8 |
(data[1] as u16)
}
#[inline(always)]
pub fn write_beu16(data: &mut [u8], value: u16) -> &mut [u8] {
data[0] = (value >> 8) as u8;
data[1] = value as u8;
data
}
```
Now for some simulated ram that implements our trait. (I'm leaving out the `.new()` methods for
most of the code snippets because they are pretty straight-forward, but you can assume they exist,
and take the their field values as arguments, or set their fields to 0).
```rust
pub struct MemoryBlock {
pub content: Vec<u8>,
}
impl Addressable for MemoryBlock {
fn len(&self) -> usize {
self.contents.len()
}
fn read(&mut self, addr: Address, data: &mut [u8]) -> Result<(), Error> {
for i in 0..data.len() {
data[i] = self.contents[(addr as usize) + i];
}
Ok(())
}
fn write(&mut self, addr: Address, data: &[u8]) -> Result<(), Error> {
for i in 0..data.len() {
self.contents[(addr as usize) + i] = data[i];
}
Ok(())
}
}
```
Simulating The CPU
------------------
With just the Addressable trait, we can start simulating the CPU. Each cycle of the CPU involves
reading in instruction data, decoding it, executing the instruction, modifying the stored state of
the CPU, and finally checking for interrupts or breakpoints before looping again. We don't need all
of this to start though, so first lets make some CPU state so that we at least have a PC (program
counter) to keep track of the instruction data we've read.
```rust
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum M68kStatus {
Init,
Running,
Stopped,
}
#[derive(Clone, Debug, PartialEq)]
pub struct M68kState {
pub status: M68kStatus,
pub pc: u32, // Program Counter
pub sr: u16, // Status Register
pub d_reg: [u32; 8], // Data Registers
pub a_reg: [u32; 7], // Address Registers
pub ssp: u32, // Supervisor Stack Pointer
pub usp: u32, // User Stack Pointer
}
pub struct M68k {
pub state: M68kState,
...
}
```
(I've separated the state into its own struct, separate from the `M68k` struct, in part to make it
cleaner, but mostly to make it easier to test. We can get a complete known state using just
M68kState::new(), and we can clone, modify, and compare states because we've derived `PartialEq`, so
after running a test on a given instruction, we only need one `assert_eq!(cpu.state,
expected_state)` to check if the resulting state is what we expected).
Next we need to initialize the CPU. When the 68000 is reset, it first reads in the stack pointer
and initial value of the PC register from the beginning of memory where the vector table is located.
```rust
impl M68k {
...
pub fn init(&mut self, memory: &mut dyn Addressable) -> Result<(), Error> {
self.state.ssp = memory.read_beu32(0)?;
self.state.pc = memory.read_beu32(4)?;
self.state.status = M68kStatus::Running;
Ok(())
}
...
}
```
Since the decoding of instructions can be a bit convoluted, and since speed is not our utmost goal, it's
cleaner to decode the instructions fully into some kind of internal representation, and then execute
them based on that representation. For now, lets just add a NOP instruction.
```rust
#[derive(Clone, Debug)]
pub enum Instruction {
NOP,
}
impl M68k {
pub fn read_instruction_word(&mut self, memory: &mut dyn Addressable) -> Result<u16, Error> {
let ins = memory.read_beu16(self.state.pc as Address)?;
self.state.pc += 2;
Ok(ins)
}
pub fn decode(&mut self, memory: &mut dyn Addressable) -> Result<Instruction, Error> {
let ins = self.read_instruction_word(memory)?;
match ins {
0x4E71 => Ok(Instruction::NOP),
_ => panic!("instruction not yet supported: {:#04X}", ins),
}
}
}
```
We can then make a function to execute the decoded instruction
```rust
impl M68k {
pub fn execute(&mut self, memory: &mut dyn Addressable, instruction: Instruction) -> Result<(), Error> {
match instruction {
Instruction::NOP => {
// Do Nothing
Ok(())
},
_ => panic!("Instruction not implemented: {:?}", instruction),
}
}
}
```
And finally a function to wrap these stages together into a single step. For debugging purposes
we'll print out what instruction we're about to execute after decoding but before executing.
```rust
impl M68k {
pub fn step(&mut self, memory: &mut dyn Addressable) -> Result<(), Error> {
match self.state.status {
M68kStatus::Init => self.init(memory),
M68kStatus::Stopped => Err(Error::new("cpu has stopped")),
M68kStatus::Running => {
let addr = self.state.pc;
let instruction = self.decode(memory)?;
println!("{:08x}: {:?}", addr, instruction);
self.execute(memory, instruction)?;
Ok(())
},
}
}
}
```
At this point we have enough pieces to loop over a series of NOP instructions. Our main function
looks like the following
```rust
const ROM: &[u16] = &[
0x0010, 0x0000, // Initial stack address is at 0x00100000
0x0000, 0x0008, // Initial PC address is at 0x8, which is the word
// that follows this
0x4e71, // 4 NOP instructions
0x4e71,
0x4e71,
0x4e71,
0x4e72, 0x2700 // The STOP #0x2700 instruction, which would normally
// stop the CPU but it's unsupported, so it will
// cause a panic!()
];
let mut cpu = M68k::new();
let mut memory = MemoryBlock::from_u16(ROM);
loop {
cpu.step(&mut memory).unwrap();
}
```
Our output should look something like this:
```
00000008: NOP
0000000a: NOP
0000000c: NOP
0000000e: NOP
thread 'main' panicked at 'instruction not yet supported: 0x4E72', src/main.rs:184:18
```
Adding Instructions
-------------------
Since the 68000 has a reasonably orthogonal instruction set, we can break down the opcode word into
sub-components, and build up instructions by separately interpreting those sub-components, rather
than having a match arm for each of the 65536 combinations. There is a really helpful [chart by
GoldenCrystal](http://goldencrystal.free.fr/M68kOpcodes-v2.3.pdf) which shows the full breakdown of
opcodes for the 68000. We can look at the first 4 bits of the instruction word to separate it into
16 broad categories of instruction, and then further break it down from there. The full code can be
seen [here](https://github.com/transistorfet/moa/blob/main/src/cpus/m68k/decode.rs)
We can extend our `Instruction` type to contain more instructions, including the addressing modes
that the 68000 supports. A `MOVE` instruction for example can move data to or from a data or
address register, or to memory indirectly using an address in an address register (optionally
pre-decrementing or post-incrementing the address), or indirectly using an offset added to an
address register, as well as a few others. Since these different addressing modes are available for
most instructions, we can wrap them up into a `Target` type, and use that as the arguments of
instructions in the `Instruction` type. For example, the instruction `addil #0x12345678, %d0` would
be represented as `Instruction::ADD(Target::Immediate(0x12345678), Target::DirectDReg(0),
Size::Long)`. We will maintain the same operand order as the 68k assembly language so data will
move from the left hand operand (`0x12345678`) to the right hand operand (`%d0`).
```rust
pub type Register = u8;
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum Size {
Byte,
Word,
Long,
}
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum Condition {
NotEqual,
Equal,
...
}
#[derive(Copy, Clone, Debug, PartialEq)]
pub enum Target {
Immediate(u32), // A u32 literal
DirectDReg(Register), // Contents of a data register
DirectAReg(Register), // Contents of an address register
IndirectAReg(Register), // Contents of a memory location given by an address register
IndirectARegInc(Register), // Same as IndirectAReg but increment the address register
// by the size of the operation *after* reading memory
IndirectARegDec(Register), // Same as IndirectAReg but decrement the address register
// by the size of the operation *before* reading memory
IndirectARegOffset(Register, i32), // Contents of memory given by an address register plus the
// signed offset (address register will *not* be modified)
IndirectPCOffset(i32), // Same as IndirectARegOffset but using the PC register
IndirectMemory(u32), // Contents of memory location given by literal u32 value
}
#[derive(Clone, Debug, PartialEq)]
pub enum Instruction {
ADD(Target, Target, Size), // Addition
ADDA(Target, Register, Size), // Adding to an address register
// doesn't affect flags
Bcc(Condition, i32), // Branch Conditionally
BRA(i32), // Branch to PC + offset
BSR(i32), // Branch to Subroutine
// (Push PC; PC = PC + offset)
JMP(Target), // Set the PC to the given value
JSR(Target), // Push PC to the stack and then
// set PC to the given value
LEA(Target, Register), // Load effective address into
// address register
MOVE(Target, Target, Size),
MOVEA(Target, Register, Size),
NOP,
RTS, // Return from subroutine (Pop PC)
STOP(u16), // Load word into SR register and
// stop until an interrupt
SUB(Target, Target, Size), // Subtraction
SUBA(Target, Register, Size),
...
}
```
This is just an example of the instruction type definitions. The full code can be found
[here](https://github.com/transistorfet/moa/blob/main/src/cpus/m68k/instructions.rs). Note: it's
possible to express more combinations with these instruction types than there are legal instruction
combinations. Some combinations are illegal on the 68000 but allowed on the 68020 and up, which the
emulator supports using an enum to represent the different 68000 model numbers. Most of the error
checking and version difference checking is done during the decode phase (and an illegal instruction
exception raised if necessary). Some instructions have special behaviours, so they've been given
their own variant in the enum, like ADDA to add a value to an address register which doesn't affect
the condition codes the way the ADD instruction does, or CMPA to compare an address register which
sign extends byte or word operands to long words before comparing them.
Abstracting Time
----------------
In the case of a real CPU, the clock signal is what drives the CPU forward, but it's not the only
device in the system that is driven by a clock. I/O devices, such as the MC68681 serial port
controller also take an input clock which affects their state over time. In the case of the serial
port controller, it has a timer which needs to count down periodically. We're going to need some
way of running code for different devices based on the clock, so lets use another trait for this.
```rust
pub type Clock = u64;
pub type ClockElapsed = u64;
trait Steppable {
fn step(&mut self, memory: &mut dyn Addressable, clock: Clock) -> Result<ClockElapsed, Error>;
}
```
Now we can call the `step()` method each cycle and pass it the current clock value. It will
return the number of clock ticks that should elapse before the next call to that device's `step()`
method.
In the case of Computie, the CPU runs at 10 MHz, but the serial port controller runs on a separate
clock at 3.6864 MHz. In order to handle these different clock speeds, we can arbitrarily decide
that our `Clock` value will be the number of nanoseconds from the start of the simulation, and
`ClockElapsed` will be a difference in nanoseconds from the start of the start of the step. We use
a `u64` here so that we can keep track of simulation time in nanoseconds for approximately 584 which
*should* be enough. Keeping track of the time will allow us to later limit how much time passes
(either speeding up or slowing down the execution relative to real-time). With this, we can get a
somewhat accurate count when simulating the timer in the serial controller chip. That said, we wont
worry about simulating CPU execution times, which varies quite a bit based on each instruction and
it's operands, so can add a lot of complexity.
Some I/O
--------
It's time to implement the MC68681 serial port controller. Since it's both an Addressable device
and a Steppable device, we'll need to implement both traits. The registers we'll need to support
first are the serial port channel A registers. `REG_TBA_WR` is the transmit write buffer, which
will output the character written to it over serial channel A. We can just print the character
written to this register to the screen for now. We also need to implement the `REG_SRA_RD`
register, which is a status register. Bits 3 and 2 of the status register indicate if the transmit
buffer for channel A is ready and empty. The software for Computie checks the status register
before writing data because the real MC68681 can't transmit fast enough to avoid a buffer overrun,
so as long as we return a value with those bits set, the software will write characters to the
`REG_TBA_WR` register. This example also shows the `REG_RBA_RD` register for reading serial data
in, but setting the `port_a_input` field is not shown for simplicity. The full code has be viewed
[here](https://github.com/transistorfet/moa/blob/main/src/peripherals/mc68681.rs)
```rust
// Register Addresses (relative to mapped address)
const REG_SRA_RD: Address = 0x03; // Ch A Status Register
const REG_TBA_WR: Address = 0x07; // Ch A Byte to Transmit
const REG_RBA_RD: Address = 0x07; // Ch A Received Byte
const REG_ISR_RD: Address = 0x0B; // Interrupt Status Register
// Status Register Bits (SRA/SRB)
const SR_TX_EMPTY: u8 = 0x08;
const SR_TX_READY: u8 = 0x04;
const SR_RX_FULL: u8 = 0x02;
const SR_RX_READY: u8 = 0x01;
// Interrupt Status/Mask Bits (ISR/IVR)
const ISR_TIMER_CHANGE: u8 = 0x08;
struct MC68681 {
pub port_a_status: u8,
pub port_a_input: u8,
pub is_timing: bool,
pub timer: u16,
pub int_status: u8,
}
impl Addressable for MC68681 {
fn len(&self) -> usize {
0x10
}
fn read(&mut self, addr: Address, data: &mut [u8]) -> Result<(), Error> {
match addr {
REG_SRA_RD => {
data[0] = SR_TX_READY | SR_TX_EMPTY;
},
REG_RBA_RD => {
data[0] = self.port_a_input;
self.port_a_status &= !SR_RX_READY;
},
REG_ISR_RD => {
data[0] = self.int_status;
},
_ => { },
}
debug!("{}: read from {:0x} of {:0x}", DEV_NAME, addr, data[0]);
Ok(())
}
fn write(&mut self, addr: Address, data: &[u8]) -> Result<(), Error> {
debug!("{}: writing {:0x} to {:0x}", DEV_NAME, data[0], addr);
match addr {
REG_TBA_WR => {
// Print the character
println!("{}", data[0] as char);
},
_ => { },
}
Ok(())
}
}
impl Steppable for MC68681 {
fn step(&mut self, memory: &mut dyn Addressable, clock: Clock) -> Result<ClockElapsed, Error> {
if self.is_timing {
// Count down, wrapping around from 0 to 0xFFFF
self.timer = self.timer.wrapping_sub(1);
if self.timer == 0 {
// Set the interrupt flag
self.int_status |= ISR_TIMER_CHANGE;
}
}
// Delay for the number of nanoseconds of our 3.6864 MHz clock
Ok(1_000_000_000 / 3_686_400)
}
}
```
This implementation is enough to print the `Welcome to the 68k Monitor!` message that the monitor
software prints at boot, but we can't accept input yet. Since Computie is meant to be connected via
TTY, we could open a new pseudoterminal on the host computer (Linux in this case), and then connect to
that pseudoterminal using `miniterm` like we would with the real Computie. This will also free the
console window for debugging and log messages.
I originally implemented this using non-blocking input to check for a new character received on the
machine-end of the pseudoterm, and then storing it in a single byte in the MC68681 object, but I
later changed this to use a separate thread for polling, and `mpsc` channels to communicate with the
simulation thread. It's a bit too much code to include here but you can see the full tty
implementation [here](https://github.com/transistorfet/moa/blob/main/src/host/tty.rs)
Box It Up
---------
We have 3 different devices objects at this point: the CPU (`M68k`), the memory (`MemoryBlock`), and
the serial port controller (`MC68681`). The CPU implements the `Steppable` trait, the memory
implements the `Addressable` trait, and the controller implements both. That last one is a problem
because we can't have two mutable references to both the `Addressable` and `Steppable` trait objects
of the controller at the same time, and rust doesn't yet support trait objects implementing multiple
traits under these conditions. Generics wont work either because we need to store the
`Addressable`s in some kind of list, and items in a list must have the same type, so we need some
other way of getting a reference to one of the trait objects only when we need to use it. If we
weren't so adamant about making this flexible and configurable, we could possibly keep this simpler,
but alas, we are that adamant. So lets introduce another trait, and some reference counted boxes!
```rust
pub trait Transmutable {
fn as_steppable(&mut self) -> Option<&mut dyn Steppable> {
None
}
fn as_addressable(&mut self) -> Option<&mut dyn Addressable> {
None
}
}
```
The `Transmutable` trait can be implemented for each device object, and it has a method for each of
our traits, which returns the trait object reference we need. The default implementations return
`None` to indicate that the trait is not implemented for that device object. Device objects that
*do* implement a given trait can redefine the `Transmutable` function to return `Some(self)`
instead. That means that any `Transmutable` trait object can be turned into either an `Addressable`
or a `Steppable`, if supported by the underlying device object, and we have a way of checking that
support at runtime.
We top it all off with `TransmutableBox` which will allow us to have multiple references to any
`Transmutable` objects, so we can pass them around to build our machine configuration.
`wrap_transmutable` is a helper function to box up the device object during startup. The device
objects will be set up once at start up and then not changed until the program terminates, so we're
not too concerned about reference cycles.
```rust
pub type TransmutableBox = Rc<RefCell<Box<dyn Transmutable>>>;
pub fn wrap_transmutable<T: Transmutable + 'static>(value: T) -> TransmutableBox {
Rc::new(RefCell::new(Box::new(value)))
}
```
Using this method does introduce some performance penalties, but they are marginal relative to the
decode and execution times of each instruction and we can still achieve pretty decent instruction
cycle speeds despite the overhead. If speed was a concern, we might try to avoid the `Transmutable`
trait by also eliminating our `Steppable` trait and using a static machine configuration where the
device object types are known at compile time and the top level function for each machine
configuration would drive the simulation forward. This latter style architecture is used by many
other emulators, such as the [ClockSignal (CLK)](https://github.com/TomHarte/CLK) emulator (which
has a very well organized and easy to read codebase if you're looking for another emulator project
to learn from). The downside is that each machine requires more machine-specific top level code to
tie the pieces together.
If anyone knows of a better way to organize this so that we can get the best of both worlds, speed
and the ability to abstract away all the devices without tight coupling, I'd love to heard about it.
I'm always looking for new ways to design things in Rust.
An Addressable of Addressables (The Data Bus)
---------------------------------------------
With this new trait, all of our devices look the same regardless of the combinations of traits they
implement. We can store them all in the same list, as well as pass them to other objects to be
turned into their respective trait references when needed. We can now implement a `Bus` to hold our
`Addressable`s so that we can access them all through the same address space
```rust
pub struct Block {
pub base: Address,
pub length: usize,
pub dev: TransmutableBox,
}
pub struct Bus {
pub blocks: Vec<Block>,
}
impl Bus {
/// Insert the devices in the correct address order
pub fn insert(&mut self, base: Address, length: usize, dev: TransmutableBox) {
let block = Block { base, length, dev };
let i = self.blocks.iter().position(|cur| cur.base > block.base).unwrap_or(self.blocks.len());
self.blocks.insert(i, block);
}
/// Find the device that's mapped to the given address range, and return
/// the device, as well as the given address minus the device's base address
pub fn get_device_at(&self, addr: Address, count: usize) -> Result<(TransmutableBox, Address), Error> {
for block in &self.blocks {
if addr >= block.base && addr < (block.base + block.length as Address) {
let relative_addr = addr - block.base;
if relative_addr as usize + count <= block.length {
return Ok((block.dev.clone(), relative_addr));
} else {
return Err(Error::new(&format!("Error reading address {:#010x}", addr)));
}
}
}
return Err(Error::new(&format!("No segment found at {:#010x}", addr)));
}
}
impl Addressable for Bus {
fn len(&self) -> usize {
let block = &self.blocks[self.blocks.len() - 1];
(block.base as usize) + block.length
}
fn read(&mut self, addr: Address, data: &mut [u8]) -> Result<(), Error> {
let (dev, relative_addr) = self.get_device_at(addr, data.len())?;
let result = dev.borrow_mut().as_addressable().unwrap().read(relative_addr, data);
result
}
fn write(&mut self, addr: Address, data: &[u8]) -> Result<(), Error> {
let (dev, relative_addr) = self.get_device_at(addr, data.len())?;
let result = dev.borrow_mut().as_addressable().unwrap().write(relative_addr, data);
result
}
}
```
Our `Bus` can act like an `Addressable`, so anything that's mapped to an address on the bus can be
accessed through the same `&mut Addressable` allowing us to pass it to the CPU's `.step()` method.
A Happy Little System
---------------------
The last piece we need is a way to call all of our `.step()` methods, but only the top level loop
needs to be able to call them, so we don't need to package them up the same as `Addressable`. We
do, however, need to keep track of the passage of time. We can make a `System` to hold our pieces,
which can also hold our event tracking code, and our `Bus`.
```rust
pub struct NextStep {
pub next_clock: Clock,
pub device: TransmutableBox,
}
struct System {
pub clock: Clock,
pub bus: Bus,
pub event_queue: Vec<NextStep>,
}
impl System {
/// Insert into the queue such that the last device is the next step to execute
fn queue_device(&mut self, device_step: NextStep) {
for i in (0..self.event_queue.len()).rev() {
if self.event_queue[i].next_clock > device_step.next_clock {
self.event_queue.insert(i + 1, device_step);
return;
}
}
self.event_queue.insert(0, device_step);
}
/// Add a device to the system, and if it's `Steppable` then queue it
pub fn add_device(&mut self, device: TransmutableBox) -> Result<(), Error> {
if device.borrow_mut().as_steppable().is_some() {
self.queue_device(NextStep::new(device));
}
Ok(())
}
/// Add a device to the system while also mapping it to the given address on the bus
pub fn add_addressable_device(&mut self, addr: Address, device: TransmutableBox) -> Result<(), Error> {
let length = device.borrow_mut().as_addressable().unwrap().len();
self.bus.insert(addr, length, device.clone());
self.add_device(device)
}
/// Execute one step function from the queue
pub fn step(&mut self) -> Result<(), Error> {
// Remove the last item in the queue which is the next step to execute
let mut next_device = self.event_queue.pop().unwrap();
self.clock = next_device.next_clock;
let memory = self.bus.as_addressable().unwrap();
// Execute the step function
let diff = next_device.device.borrow_mut().as_steppable().unwrap().step(memory, self.clock)?;
// Adjust the device's next scheduled step
next_device.next_clock = self.clock + diff;
// Re-insert into the queue in order of next event
self.queue_device(next_device);
Ok(())
}
/// Run step functions until the system clock has changed by the given time in nanoseconds
pub fn run_for(&mut self, elapsed: ClockElapsed) -> Result<(), Error> {
let target = self.clock + elapsed;
while self.clock < target {
self.step()?;
}
Ok(())
}
}
```
In later iterations, I pass the whole immutable reference to `System` into the step functions, and
put `Bus` inside a `RefCell` which can then be accessed if needed by the `.step()` function for the
device object. I've also added an interrupt controller to `System` which can be used for hardware
interrupts from devices (`MC68681` in the case of Computie).
Tying It All Together
---------------------
Finally we can tie all our pieces together and set them in motion.
```rust
fn main() -> Result<(), Error> {
let mut system = System::new();
// Create our Flash memory, pre-loaded with our Computie monitor firmware
let rom = MemoryBlock::load("binaries/computie/monitor.bin")?;
system.add_addressable_device(0x00000000, wrap_transmutable(rom))?;
// Create a RAM segment at the 1MB address
let mut ram = MemoryBlock::new(vec![0; 0x00100000]);
system.add_addressable_device(0x00100000, wrap_transmutable(ram))?;
let mut serial = MC68681::new();
// Open a new PTY and launch a terminal emulator that in turn
// launches miniterm and connects to the other end of the PTY
launch_terminal_emulator(serial.port_a.connect(Box::new(SimplePty::open()?))?);
system.add_addressable_device(0x00700000, wrap_transmutable(serial))?;
let mut cpu = M68k::new(M68kType::MC68010);
system.add_device(wrap_transmutable(cpu))?;
// Run forever
system.run_for(u64::MAX)?;
Ok(())
}
pub fn launch_terminal_emulator(name: String) {
use std::thread;
use std::time::Duration;
use std::process::Command;
Command::new("x-terminal-emulator").arg("-e").arg(&format!("pyserial-miniterm {}", name)).spawn().unwrap();
thread::sleep(Duration::from_secs(1));
}
```
We can now boot our monitor firmware and once interrupts are added, we can even boot the OS!
![alt text](images/2021-11-moa-computie-os.png "The Computie OS booting in Moa")
Now What
--------
After only a couple weeks I was able to get all of the Computie software running, including the OS,
albeit with a buggy and incomplete 68000 implementation. That didn't seem too hard. What else can
I get running inside this emulator. How about I try emulating the Sega Genesis! How hard can it
be? Well... after a few months it's damn near broken me, but I'm determined to complete it
eventually. I've also taken detours into emulating the original Macintosh (simpler than the Genesis
so easier to debug the 68k implementation with) and the TRS-80 (in order to test the Z80
implementation I made to eventually use in the Genesis). I'm hoping to write more about those other
efforts, once I get them somewhat working. Until then, happy emulating!