This PR fixes all discrepancies of sim65 instruction timings, for both the 6502 and the 65C02 processors.
The timings as implemented in this PR have been verified against actual hardware (Atari 800 XL for 6502; and WDC 65C02 for 65C02).
These timings can also be verified against the 65x02 test suite. However, in this case, a single discrepancy arises; the 65x02 testsuite suggests that the 65C02 opcode 0x5c should take 4 clocks. However, tests on a hardware 65C02 have conclusively shown that this instruction takes 8 clock cycles. The 8 clock cycles duration for the 65C02 0xfc opcode is also confirmed by other sources, e.g. Section 9 of http://www.6502.org/tutorials/65c02opcodes.html.
This test makes sim65 correct both in terms of functionality (all opcodes now do what they do on hardware) and in terms of timing (all instructions take as long as they would on real hardware).
The one discrepancy that remains, is that on a real 6502/65C02, some instructions issue R or W cycles on the bus while the instruction processing is being done. Those spurious bus cycles are not replicated in sim65. Sim65 is thus an instruction-level simulator, rather than a bus-cycle level simulator. In other words, while the clock cycle counts for each instruction are now correct, not all clock cycles are individually simulated.
This PR fixes the implementation of 5 illegal opcodes
in the 6502, which the 6502X supports:
* $93 SHA (zp),y
* $9B TAS abs,y
* $9C SHY abs,x
* $9E SHX abs,x
* $9F SHA abs,y
The common denominator of the previous implementation was that it didn't correctly handle the case when the Y or X indexing induced a page crossing. In those cases, the effective address calculation of the instructions becomes truly messed up (with the high byte of the address equal to the value being written).
The correctness of the implementations in this PR was verified using the 65x02 test suite, and corresponds to a (detailed) reading of the "No More Secrets" document.
Stylistically, there is room for improvement in these implementations, specifically in factoring out common behavior into macros. However, for now the "explicit" coding style will suffice. It is clear enough, and we want to reach a situation soon where the sim65
code is able to pass the full '65x02' testsuite. Once we get to that point, we can refactor this code with a lot more confidence, since we will have the benefit of a working exhaustive test to make sure we don't break stuff.
This PR implements support for 32 65C02-specific instructions
to sim65: BBRx, BBSx, RMBx, SMBx, with x = 0..7.
These instructions are implemented using two macros:
* The "ZP_BITOP" macro implements the RMBx and SMBx isntructions.
* The "ZP_BIT_BRANCH" macro implements the BBRx abd BBSx instructions.
The implementation of these instructions has been verified usingthe 65x02 test suite.
After a lot of preparatory work, we are now in position to finally tighten
the types of the 6502 registers defined in the CPURegs struct of sim65.
All registers were previously defined as bare 'unsigned', leading to subtle
bugs where the bits beyond the 8 or 16 "true" bits in the register could
become non-zero. Tightening the types of the registers to uint8_t and
uint16_t as appropriate gets rid of these subtle bugs once and for all,
assisted by the semantics of C when assigning an unsigned value to an
unsigned type with less bits: the high-order bits are simply discarded,
which is precisely what we'd want to happen.
This change cleans up a lot of spurious failures of sim65 against the
65x02 test-set. For the 6502 and 65C02, we're now *functionally*
compliant. For timing (i.e., clock cycle counts for each instruction),
some work remains.
ANE (0x8b) is an unstable illegal opcode that depends on a "constant" value that isn't
really constant. It varies between machines, with temperature, and so on. Original sim65
behavior was to use the constant value 0xEF. To get the behavior in line with the 65x02
testsuite, we now use the value 0xEE instead, which is also a reasonable choice that can
be observed in practice.
The obvious way to implement JSR for the 6502 is to (a) read the target address,
and then (b) push the return address minus one. Or do (b) first, then (a).
However, there is a non-obvious case where this conflicts with the actual order
of operations that the 6502 does, which is:
(a) Load the LSB of the target address.
(b) Push the MSB of the return address, minus one.
(c) Push the LSB of the return address, minus one.
(d) Load the MSB of the target address.
This can make a difference in a pretty esoteric case, if the JSR target is located,
wholly or in part, inside the stack page (!). This won't happen in normal code
but it can happen in specifically constructed examples.
To deal with this, we load the LSB and MSB of the target address separately, with
the pushing of the return address sandwiched in between, to mimic the order of the
bus operations on a real 6502.