This PR fixes all discrepancies of sim65 instruction timings, for both the 6502 and the 65C02 processors.
The timings as implemented in this PR have been verified against actual hardware (Atari 800 XL for 6502; and WDC 65C02 for 65C02).
These timings can also be verified against the 65x02 test suite. In that case, a single discrepancy arises: the 65x02 test suite suggests that the 65C02 opcode 0x5c should take 4 clocks. However, tests on a hardware 65C02 have conclusively shown that this instruction takes 8 clock cycles. The 8-clock-cycle duration of the 65C02 0x5c opcode is also confirmed by other sources, e.g. Section 9 of http://www.6502.org/tutorials/65c02opcodes.html.
This PR makes sim65 correct both in terms of functionality (all opcodes now do what they do on hardware) and in terms of timing (all instructions take as long as they would on real hardware).
The one discrepancy that remains is that on a real 6502/65C02, some instructions issue R or W cycles on the bus while the instruction is being processed. Those spurious bus cycles are not replicated in sim65. Sim65 is thus an instruction-level simulator rather than a bus-cycle-level simulator. In other words, while the clock cycle counts for each instruction are now correct, not all clock cycles are individually simulated.
This PR fixes the implementation of 5 illegal opcodes
in the 6502, which the 6502X supports:
* $93 SHA (zp),y
* $9B TAS abs,y
* $9C SHY abs,x
* $9E SHX abs,x
* $9F SHA abs,y
The common denominator of the previous implementation was that it didn't correctly handle the case when the Y or X indexing induced a page crossing. In those cases, the effective address calculation of the instructions becomes truly messed up (with the high byte of the address equal to the value being written).
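The page-crossing behavior described above can be sketched for one of the five opcodes, SHY abs,x. This is a minimal illustration, not the sim65 code: the `ShyResult` type and `shy_abs_x` function are hypothetical names, and the stored value `Y & (H + 1)` (with H the high byte of the base address) is an assumption taken from the commonly documented behavior of this opcode.

```c
#include <stdint.h>

/* Hypothetical helper, not sim65 code: where SHY abs,x writes, and what. */
typedef struct {
    uint16_t addr;   /* address that is actually written */
    uint8_t  value;  /* byte that is written there */
} ShyResult;

static ShyResult shy_abs_x(uint16_t base, uint8_t x, uint8_t y)
{
    ShyResult r;
    uint8_t  base_hi = (uint8_t)(base >> 8);
    uint16_t indexed = (uint16_t)(base + x);

    /* Assumption: the stored value is Y AND (high byte of base + 1). */
    r.value = (uint8_t)(y & (uint8_t)(base_hi + 1));

    if ((indexed >> 8) != base_hi) {
        /* Page crossing: the high byte of the effective address is
         * replaced by the value being written. */
        r.addr = (uint16_t)(((uint16_t)r.value << 8) | (indexed & 0xFF));
    } else {
        r.addr = indexed;
    }
    return r;
}
```

For example, with base $12F0, X = $20 and Y = $0F, the index crosses a page, so under these assumptions the value $03 is written to $0310 rather than to $1310.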
The correctness of the implementations in this PR was verified using the 65x02 test suite, and corresponds to a (detailed) reading of the "No More Secrets" document.
Stylistically, there is room for improvement in these implementations, specifically in factoring out common behavior into macros. However, for now the "explicit" coding style will suffice. It is clear enough, and we want to reach a situation soon where the sim65
code is able to pass the full '65x02' testsuite. Once we get to that point, we can refactor this code with a lot more confidence, since we will have the benefit of a working exhaustive test to make sure we don't break stuff.
This PR adds support for 32 65C02-specific instructions
to sim65: BBRx, BBSx, RMBx, SMBx, with x = 0..7.
These instructions are implemented using two macros:
* The "ZP_BITOP" macro implements the RMBx and SMBx instructions.
* The "ZP_BIT_BRANCH" macro implements the BBRx and BBSx instructions.
The implementation of these instructions has been verified using the 65x02 test suite.
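As a rough illustration of what these 32 instructions do, the sketch below expresses their semantics as plain C functions. It is not the macro code from this PR; all function names are hypothetical, and memory access and cycle counting are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* RMBx clears bit x of a zero-page byte, SMBx sets it. */
static uint8_t rmb(uint8_t value, unsigned bit)
{
    return (uint8_t)(value & ~(1u << bit));
}

static uint8_t smb(uint8_t value, unsigned bit)
{
    return (uint8_t)(value | (1u << bit));
}

/* BBRx branches if bit x of the zero-page byte is clear, BBSx if it is set. */
static bool bbr_taken(uint8_t value, unsigned bit)
{
    return (value & (1u << bit)) == 0;
}

static bool bbs_taken(uint8_t value, unsigned bit)
{
    return (value & (1u << bit)) != 0;
}

/* A taken branch adds the signed 8-bit offset to the address of the
 * instruction that follows the branch. */
static uint16_t branch_target(uint16_t next_pc, uint8_t offset)
{
    return (uint16_t)(next_pc + (int8_t)offset);
}
```

Writing the operations as functions of the bit number mirrors the idea of covering x = 0..7 with a single parameterized definition per instruction family.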
After a lot of preparatory work, we are now in position to finally tighten
the types of the 6502 registers defined in the CPURegs struct of sim65.
All registers were previously defined as bare 'unsigned', leading to subtle
bugs where the bits beyond the 8 or 16 "true" bits in the register could
become non-zero. Tightening the types of the registers to uint8_t and
uint16_t as appropriate gets rid of these subtle bugs once and for all,
assisted by the semantics of C when assigning an unsigned value to an
unsigned type with fewer bits: the high-order bits are simply discarded,
which is precisely what we'd want to happen.
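The difference between the two register representations can be shown with a small sketch. The function names are hypothetical and do not appear in sim65; they only contrast a bare 'unsigned' with the tightened types.

```c
#include <stdint.h>

/* With a bare 'unsigned' register, a carry out of bit 7 stays in the value. */
static unsigned add_wide(unsigned reg, unsigned operand)
{
    return reg + operand;
}

/* With uint8_t, the assignment discards everything above bit 7. */
static uint8_t add_narrow(uint8_t reg, uint8_t operand)
{
    uint8_t result = reg + operand;   /* implicit conversion wraps modulo 256 */
    return result;
}

/* The same holds for a 16-bit program counter stored in a uint16_t. */
static uint16_t advance_pc(uint16_t pc, uint16_t count)
{
    uint16_t result = pc + count;     /* wraps modulo 65536 */
    return result;
}
```

Here `add_wide(0xFF, 1)` yields 0x100, a value no 8-bit register can hold, while `add_narrow(0xFF, 1)` yields 0x00.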
This change cleans up a lot of spurious failures of sim65 against the
65x02 test-set. For the 6502 and 65C02, we're now *functionally*
compliant. For timing (i.e., clock cycle counts for each instruction),
some work remains.
ANE (0x8b) is an unstable illegal opcode that depends on a "constant" value that isn't
really constant. It varies between machines, with temperature, and so on. Original sim65
behavior was to use the constant value 0xEF. To get the behavior in line with the 65x02
testsuite, we now use the value 0xEE instead, which is also a reasonable choice that can
be observed in practice.
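For reference, a minimal sketch of how the constant enters the result. The formula `A = (A | CONST) & X & imm` is the commonly documented model of ANE and is an assumption here, as are the names `ANE_CONSTANT` and `ane`; flag updates are omitted.

```c
#include <stdint.h>

/* The "magic" constant now used; the previous sim65 behavior used 0xEF. */
#define ANE_CONSTANT 0xEE

/* Assumed model of ANE #imm: A = (A | CONST) & X & imm. */
static uint8_t ane(uint8_t a, uint8_t x, uint8_t imm)
{
    return (uint8_t)((a | ANE_CONSTANT) & x & imm);
}
```

Under this model, the choice between 0xEE and 0xEF only matters for inputs where bit 0 of A is clear while bit 0 of both X and the immediate operand is set.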
The obvious way to implement JSR for the 6502 is to (a) read the target address,
and then (b) push the return address minus one. Or do (b) first, then (a).
However, there is a non-obvious case where this conflicts with the actual order
of operations that the 6502 does, which is:
(a) Load the LSB of the target address.
(b) Push the MSB of the return address, minus one.
(c) Push the LSB of the return address, minus one.
(d) Load the MSB of the target address.
This can make a difference in a pretty esoteric case, if the JSR target is located,
wholly or in part, inside the stack page (!). This won't happen in normal code
but it can happen in specifically constructed examples.
To deal with this, we load the LSB and MSB of the target address separately, with
the pushing of the return address sandwiched in between, to mimic the order of the
bus operations on a real 6502.
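The order of operations (a) to (d) can be sketched as follows. The `JsrCpu` struct and the function names are hypothetical and much simpler than the sim65 internals; the sketch assumes `pc` points at the first operand byte when `jsr` is called.

```c
#include <stdint.h>

/* Hypothetical minimal CPU state for illustration only. */
typedef struct {
    uint8_t  mem[0x10000];
    uint16_t pc;   /* on entry: address of the first operand byte */
    uint8_t  sp;
} JsrCpu;

static void push(JsrCpu *cpu, uint8_t value)
{
    cpu->mem[0x0100 + cpu->sp] = value;
    cpu->sp--;     /* uint8_t wraps within the stack page */
}

static void jsr(JsrCpu *cpu)
{
    /* (a) Load the LSB of the target address. */
    uint8_t lo = cpu->mem[cpu->pc];
    cpu->pc++;     /* pc now equals the return address minus one */

    /* (b), (c) Push the MSB, then the LSB, of the return address minus one. */
    push(cpu, (uint8_t)(cpu->pc >> 8));
    push(cpu, (uint8_t)(cpu->pc & 0xFF));

    /* (d) Only now load the MSB of the target address; if the operand lives
     * in the stack page, the pushes may already have overwritten it. */
    uint8_t hi = cpu->mem[cpu->pc];
    cpu->pc = (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

As an example of the esoteric case: with the operand bytes $34, $12 stored at $01FC and $01FD and SP = $FD, the pushes overwrite the operand before step (d), so this sketch jumps to $0134 instead of $1234.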
This patch provides a temporary fix for the issue where the fgets()
function did not use the target-specific newline character to
decide if it has reached the end of the line. It defaulted to the
value $0a, which is the newline character on only some targets.
The Atari, for example, has newline character $9b instead.
This patch is ugly, because the ca65 assembler that is used for
fgets doesn't currently accept C-type character escape sequences
as values. Ideally we'd be able to write:
cmp #'\n'
And this would end up being translated to a compare-immediate
to the target-specific newline character.
Since that is impossible, this patch substitutes the equivalent,
but ugly, code:
.byte $c9, "\n"
This works because $c9 is the opcode for cmp #imm, and the "\n"
string /is/ translated to the platform-specific newline character,
at least when the 'string_escapes' feature is enabled.
It provides access to a handful of 64-bit counters that count different things:
- clock cycles
- instructions
- number of IRQs processed
- number of NMIs processed
- nanoseconds since 1-1-1970.
This is not ready yet to be pushed as a merge request into the upstream CC65
repository. What's lacking:
- documentation
- tests
And to be discussed:
- do we agree on this implementation direction and interface in principle?
- can I include inttypes.h for printing a 64-bit unsigned value?
- will clock_gettime() work on a Windows build?
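To make the last two questions concrete, here is a minimal sketch of reading a nanosecond timestamp and formatting a 64-bit counter. The function names are hypothetical and this is not the code in the patch. `clock_gettime()` is a POSIX function, so whether it is available in a Windows build depends on the toolchain, which is exactly the open question above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* request clock_gettime() from <time.h> */
#endif

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Nanoseconds since 1970-01-01, or 0 if the clock cannot be read. */
static uint64_t wallclock_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_REALTIME, &ts) != 0) {
        return 0;
    }
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Format a named 64-bit counter portably using the PRIu64 macro. */
static int format_counter(char *buf, size_t size, const char *name, uint64_t value)
{
    return snprintf(buf, size, "%s: %" PRIu64, name, value);
}
```

The `PRIu64` macro from inttypes.h expands to the correct printf conversion for `uint64_t` on the host, which avoids guessing between `%lu` and `%llu`.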
The linkage of the 'Regs' variable in 6502.c was changed from static
to extern. This makes the Regs variable visible (and even alterable) from
the outside.
This change helps tools to inspect the CPU state. In particular, it
was implemented to facilitate a tool that verifies opcode
functionality using the '65x02' testsuite. But the change is also
potentially useful for e.g. an online debugger that wants to inspect
the CPU state while the 6502 is being simulated.
The current (before-this-patch) version of sim65.c does not correctly implement
the ADC and SBC instructions in 65C02 mode. This PR fixes that.
The 6502 and 65C02 behave identically in binary mode; in decimal mode,
however, they diverge, both in the handling of inputs that are not BCD values,
and in the handling of processor flags.
This fix restructures the original "ADC" and "SBC" macros into versions that
are specific to the 6502 and the 65C02, and updates the opcode tables to
ensure that they point to the correct implementations.
Considering the ADC instruction for a moment, the original "ADC" macro was
changed to two macros ADC_6502 and ADC_65C02. These check the D (decimal
mode) bit, and defer their implementation to any of three macros ADC_BINARY_MODE,
ADC_DECIMAL_MODE_6502, and ADC_DECIMAL_MODE_65C02. This is a bit verbose but it
makes it very clear what's going on.
(For the SBC changes, the analogous changes were made.)
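As a point of reference, the shared binary-mode path can be sketched as below. This is not the ADC_BINARY_MODE macro itself: the `AdcState` struct and `adc_binary` function are hypothetical, and the decimal-mode variants, where the two processors differ, are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state: accumulator plus the flags that ADC updates. */
typedef struct {
    uint8_t a;
    bool c, z, v, n;
} AdcState;

/* Binary-mode ADC: A = A + operand + C, updating C, V, Z and N. */
static void adc_binary(AdcState *s, uint8_t operand)
{
    unsigned sum    = (unsigned)s->a + operand + (s->c ? 1u : 0u);
    uint8_t  result = (uint8_t)sum;

    s->c = sum > 0xFF;
    /* Signed overflow: both inputs have the same sign and the result differs. */
    s->v = ((s->a ^ result) & (operand ^ result) & 0x80) != 0;
    s->z = result == 0;
    s->n = (result & 0x80) != 0;
    s->a = result;
}
```

Note that the sum is computed in a wider local variable and then explicitly narrowed, so nothing here depends on the accumulator having more than 8 bits.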
The correctness of the changes made is ensured as follows:
First, an in-depth study was made of how ADC and SBC work, both in the original
6502 and the later 65C02 processor. The actual behavior of both processors
was captured on hardware (an Atari 800 XL with a 6502 and a Neo6502 equipped
with a WDC 65C02 processor), and was analyzed. The results were cross-referenced
with internet sources, leading to a C implementation that reproduces the exact
result of the hardware processors. See:
https://github.com/sidneycadot/6502-test/blob/main/functional_test/adc_sbc/c_and_python_implementations/6502_adc_sbc.c
Next, these C implementations of ADC and SBC were fitted into sim65's macro-
based implementation scheme, replacing the existing 6502-only implementation.
Finally, the new sim65 implementation was tested against the 65x02 testsuite,
showing that (1) the 6502 implementation was still correct; and (2) that
the 65C02 implementation is now also correct.
As an added bonus, this new implementation of ADC/SBC no longer relies on a
dirty implementation detail in sim65: the previous implementation relied on
the fact that currently, the A register in the simulator is implemented as
an "unsigned", with more bits than the actual A register (8 bits). In the
future we want to change the register width to 8 bits, and this updated
ADC/SBC is a necessary precursor to that change.