mirror of https://github.com/TomHarte/CLK.git synced 2024-11-21 21:33:54 +00:00

Wrote a restatement of what I think is true about Amstrad CPC timing.

Thomas Harte 2017-08-02 10:02:03 -04:00
parent bf07e6293a
commit e6113b0cc8

Amstrad-CPC-Timing.md Normal file

@ -0,0 +1,59 @@
The Amstrad CPC is a machine containing, amongst other things:
* a 4MHz Z80;
* RAM with a 200ns access time (i.e. permitting access at up to 5MHz);
* a 1MHz CRTC for video address generation and sync timing; and
* a custom gate array for video serialisation.
Video and the CPU share the same RAM. The gate array fetches two bytes of video for every address generated by the CRTC, so it needs a 2MHz channel to RAM. The system automatically arbitrates video and CPU accesses so that each is invisible to the other.
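The bandwidth figures above can be checked with a little arithmetic. This is only a sketch: the constant names are invented here, and it assumes each video byte costs one RAM access.

```python
RAM_ACCESS_NS = 200          # RAM access time, from the list above
CRTC_MHZ = 1.0               # CRTC address rate
BYTES_PER_CRTC_ADDRESS = 2   # bytes the gate array fetches per CRTC address

# 1000ns per microsecond, so this is accesses per microsecond, i.e. MHz.
max_ram_accesses_mhz = 1000 / RAM_ACCESS_NS
# Assumption: one RAM access per video byte fetched.
video_accesses_mhz = CRTC_MHZ * BYTES_PER_CRTC_ADDRESS
left_for_cpu_mhz = max_ram_accesses_mhz - video_accesses_mhz

print(max_ram_accesses_mhz, video_accesses_mhz, left_for_cpu_mhz)
```

Under that assumption the RAM has room for video's 2MHz plus some CPU traffic, which is what makes the arbitration described below possible.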
## Z80 Wait Timing
The Z80 provides a 'wait' line. During any machine cycle that accesses memory or IO, the Z80 will sample the wait line. While it is active, the processor will pause.
It always samples on the falling edge of its clock, halfway through a cycle, but which cycle it samples in and what happens afterwards depend on the type of machine cycle.
Opcode fetches cost four bus cycles and contain two phases of operation: two cycles to perform the memory access for the opcode, followed by two refresh cycles. The wait line is not sampled during the refresh cycles. The opcode fetch first samples the wait line after 1.5 cycles. If wait is active, the Z80 will wait an additional cycle and check again. Once wait is inactive, the memory access concludes with a final half cycle.
Therefore opcode fetches will access memory _during the first cycle in which wait is not active_.
Other memory accesses ordinarily cost three bus cycles. Wait is again first sampled after 1.5 cycles and, if active, is resampled one cycle later. As soon as wait is not active, the final 1.5 cycles take place. The actual access occurs during the final cycle.
Therefore ordinary reads and writes will access memory _during the next cycle after the first in which wait is not active_.
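The two rules above can be sketched in a few lines of Python. This is a minimal model rather than a description of Z80 internals: `run_machine_cycle` and its parameters are names invented here, cycles are numbered from 0, and `wait_active(cycle)` is assumed to report the state of the wait line at the falling edge within that cycle.

```python
from typing import Callable, Tuple


def run_machine_cycle(
    start: int,
    is_opcode_fetch: bool,
    wait_active: Callable[[int], bool],
) -> Tuple[int, int]:
    """Return (cycle in which memory is accessed, total length in cycles).

    Hypothetical helper: it models only the wait sampling described above.
    """
    # Wait is first sampled 1.5 cycles in, i.e. during the second cycle.
    sample = start + 1
    # Each time wait is found active, one more cycle passes before resampling.
    while wait_active(sample):
        sample += 1

    if is_opcode_fetch:
        # The access is in the cycle where wait was found inactive,
        # followed by two refresh cycles that do not sample wait.
        return sample, (sample - start) + 3

    # Ordinary read or write: the access is in the cycle after that sample.
    return sample + 1, (sample - start) + 2


def never_wait(cycle: int) -> bool:
    return False


print(run_machine_cycle(0, True, never_wait))   # (1, 4)
print(run_machine_cycle(0, False, never_wait))  # (2, 3)
```

With wait never active, the model gives the nominal four-cycle opcode fetch and three-cycle read or write.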
## Four-Phase Clock
The CPC arbitrates between video and the CPU with a four-phase clock. It signals the wait line for three of those phases:
1. wait is inactive
2. wait is active
3. wait is active
4. wait is active
As the Z80 can access RAM either in the first cycle for which wait is inactive or in the cycle afterwards, the result of that wait timing is that the CPU accesses RAM only during phase 1 or phase 2:
1. wait is inactive; CPU accesses RAM if it's performing an opcode fetch
2. wait is active; CPU accesses RAM if it's performing a non-opcode read or write
3. wait is active
4. wait is active
That provides a natural window for video fetching:
1. wait is inactive; CPU accesses RAM if it's performing an opcode fetch
2. wait is active; CPU accesses RAM if it's performing a non-opcode read or write
3. wait is active; video fetch occurs
4. wait is active; video fetch occurs
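The schedule above can be written as a small lookup table. This is a sketch only: the names are invented here, and numbering clock cycles so that cycle 0 falls in phase 1 is an arbitrary assumption.

```python
CLOCK_MHZ = 4.0

# Phase number -> (is wait active?, who may use RAM), as in the list above.
PHASES = {
    1: (False, "CPU opcode fetch"),
    2: (True, "CPU non-opcode read or write"),
    3: (True, "video fetch"),
    4: (True, "video fetch"),
}


def phase_of(cycle: int) -> int:
    """Map a clock cycle to phase 1-4, assuming cycle 0 is in phase 1."""
    return cycle % 4 + 1


def wait_active(cycle: int) -> bool:
    return PHASES[phase_of(cycle)][0]


def ram_user(cycle: int) -> str:
    return PHASES[phase_of(cycle)][1]


# Two of every four cycles belong to video.
video_slots = sum(1 for _, user in PHASES.values() if user == "video fetch")
video_rate_mhz = CLOCK_MHZ * video_slots / len(PHASES)

for cycle in range(8):
    print(cycle, phase_of(cycle), wait_active(cycle), ram_user(cycle))
print(video_rate_mhz)  # 2.0
```

Two video slots in every four cycles of a 4MHz clock gives the 2MHz channel that the gate array needs.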
## Effect on Z80 Instruction Timing
The common statement is that Z80 machine cycles are rounded up to a multiple of four clock cycles. This is a simplification, but it gives accurate totals because:
* an opcode fetch will access during phase 1 and, after refresh, the next machine cycle will begin in phase 4;
* an ordinary read or write will access during phase 2 and the next machine cycle will begin in phase 3;
* any machine cycle that begins in phase 3 or 4 will have reached the point where it samples wait by the next phase 1.
So once the Z80 is in phase:
* a machine cycle that follows a three-cycle machine cycle will incur one wait state;
* a machine cycle that follows a four-cycle machine cycle will not incur any wait states.
Therefore a machine cycle that is nominally three cycles long still accrues a total cost of four cycles, by adding one cycle to the machine cycle that follows it.
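A short simulation illustrates where the extra cycle lands. It is a sketch under stated assumptions: `cpc_timings` is a name invented here, only nominal four-cycle opcode fetches and three-cycle reads or writes are modelled, and wait is taken to be inactive only in cycles divisible by four.

```python
from typing import List


def cpc_timings(nominal_lengths: List[int], start: int = 0) -> List[int]:
    """Return the actual length of each machine cycle under CPC wait timing.

    Hypothetical helper. nominal_lengths holds 4 for an opcode fetch and
    3 for an ordinary read or write. Assumption: wait is inactive only in
    clock cycles where cycle % 4 == 0 (phase 1).
    """
    cycle = start
    actual = []
    for nominal in nominal_lengths:
        # Wait is first sampled during the second cycle of the machine cycle.
        sample = cycle + 1
        # Insert wait states until the sample lands in phase 1.
        while sample % 4 != 0:
            sample += 1
        # After the successful sample, nominal - 2 cycles remain.
        length = (sample - cycle) + (nominal - 2) + 1
        actual.append(length)
        cycle += length
    return actual


# start=3 puts the first machine cycle in phase: its wait sample is in phase 1.
nominal = [4, 3, 3, 4, 4]
actual = cpc_timings(nominal, start=3)
print(actual)       # [4, 3, 4, 5, 4]
print(sum(actual))  # 20
```

Each three-cycle machine cycle keeps its own length but pushes one wait state onto its successor, so five machine cycles cost 20 cycles in total.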
### Rule of thumb
The CPC is often said to have a processing speed that is approximately equivalent to an unconstrained Z80 running at 3.3MHz. Ignoring machine cycles other than opcode fetches and memory accesses, that would be true if 30% of the byte stream were opcodes: 30% of the time the Z80 would run at a full 4MHz and 70% of the time it would run at only 3/4 of its normal speed (each three-cycle machine cycle having been expanded to four cycles), i.e. at the equivalent of 3MHz. `0.3*4 + 0.7*3 = 1.2 + 2.1 = 3.3`. So the 3.3MHz figure is approximately as accurate as the claim that "most machine cycles are memory accesses, and 30% of memory accesses are to fetch opcodes".
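The same weighted average can be written as a function of the opcode share, to see how sensitive the figure is to the 30% assumption. The function name and parameters are invented here, and the model is only the simplified one from the paragraph above.

```python
def effective_mhz(opcode_fraction: float, clock_mhz: float = 4.0) -> float:
    """Estimate equivalent unconstrained Z80 speed under the rule of thumb.

    Hypothetical helper: opcode fetches are assumed to run at full speed
    and everything else at 3/4 speed, as in the paragraph above.
    """
    full_speed = opcode_fraction * clock_mhz
    slowed = (1.0 - opcode_fraction) * clock_mhz * 3 / 4
    return full_speed + slowed


for fraction in (0.0, 0.3, 0.5, 1.0):
    print(fraction, round(effective_mhz(fraction), 2))
```

The result can only range from 3MHz (no opcode fetches) to 4MHz (nothing but opcode fetches), so any estimate made this way falls between those bounds.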