mirror of
https://github.com/MoleskiCoder/EightBit.git
synced 2024-12-23 00:29:47 +00:00
f5582df402
Signed-off-by: Adrian Conlon <Adrian.conlon@gmail.com>
1535 lines
65 KiB
Plaintext
#
# $Id: 64doc,v 1.8 1994/06/03 19:50:04 jopi Exp $
#
# This file is part of Commodore 64 emulator
# and Program Development System.
#
# See README for copyright notice
#
# This file contains documentation for 6502/6510/8500/8502 instruction set.
#
#
# Written by
#   John West       (john@ucc.gu.uwa.edu.au)
#   Marko Mäkelä    (msmakela@kruuna.helsinki.fi)
#
#
# $Log: 64doc,v $
# Revision 1.8  1994/06/03 19:50:04  jopi
# Patchlevel 2
#
# Revision 1.7  1994/04/15 13:07:04  jopi
# 65xx Register descriptions added
#
# Revision 1.6  1994/02/18 16:09:36  jopi
#
# Revision 1.5  1994/01/26 16:08:37  jopi
# X64 version 0.2 PL 1
#
# Revision 1.4  1993/11/10 01:55:34  jopi
#
# Revision 1.3  93/06/21 13:37:18  jopi
# X64 version 0.2 PL 0
#
# Revision 1.2  93/06/21 13:07:15  jopi
# *** empty log message ***
#
#
|
|
Note: To extract the uuencoded ML programs in this article most
|
|
easily you may use e.g. "uud" by Edwin Kremer ,
|
|
which extracts them all at once.
|
|
|
|
|
|
Documentation for the NMOS 65xx/85xx Instruction Set
|
|
|
|
6510 Instructions by Addressing Modes
|
|
6502 Registers
|
|
6510/8502 Undocumented Commands
|
|
Register selection for load and store
|
|
Decimal mode in NMOS 6500 series
|
|
6510 features
|
|
Different CPU types
|
|
6510 Instruction Timing
|
|
How Real Programmers Acknowledge Interrupts
|
|
Memory Management
|
|
Autostart Code
|
|
Notes
|
|
References
|
|
|
|
|
|
6510 Instructions by Addressing Modes
|
|
|
|
 off- ++++++++++ Positive ++++++++++  ---------- Negative ----------
 set  00      20      40      60      80      a0      c0      e0      mode

 +00  BRK     JSR     RTI     RTS     NOP*    LDY     CPY     CPX     Impl/immed
 +01  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     (indir,x)
 +02   t       t       t       t      NOP*t   LDX     NOP*t   NOP*t     ? /immed
 +03  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    (indir,x)
 +04  NOP*    BIT     NOP*    NOP*    STY     LDY     CPY     CPX     Zeropage
 +05  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Zeropage
 +06  ASL     ROL     LSR     ROR     STX     LDX     DEC     INC     Zeropage
 +07  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    Zeropage

 +08  PHP     PLP     PHA     PLA     DEY     TAY     INY     INX     Implied
 +09  ORA     AND     EOR     ADC     NOP*    LDA     CMP     SBC     Immediate
 +0a  ASL     ROL     LSR     ROR     TXA     TAX     DEX     NOP     Accu/impl
 +0b  ANC**   ANC**   ASR**   ARR**   ANE**   LXA**   SBX**   SBC*    Immediate
 +0c  NOP*    BIT     JMP     JMP ()  STY     LDY     CPY     CPX     Absolute
 +0d  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute
 +0e  ASL     ROL     LSR     ROR     STX     LDX     DEC     INC     Absolute
 +0f  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    Absolute

 +10  BPL     BMI     BVC     BVS     BCC     BCS     BNE     BEQ     Relative
 +11  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     (indir),y
 +12   t       t       t       t       t       t       t       t        ?
 +13  SLO*    RLA*    SRE*    RRA*    SHA**   LAX*    DCP*    ISB*    (indir),y
 +14  NOP*    NOP*    NOP*    NOP*    STY     LDY     NOP*    NOP*    Zeropage,x
 +15  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Zeropage,x
 +16  ASL     ROL     LSR     ROR     STX  y) LDX  y) DEC     INC     Zeropage,x
 +17  SLO*    RLA*    SRE*    RRA*    SAX* y) LAX* y) DCP*    ISB*    Zeropage,x

 +18  CLC     SEC     CLI     SEI     TYA     CLV     CLD     SED     Implied
 +19  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute,y
 +1a  NOP*    NOP*    NOP*    NOP*    TXS     TSX     NOP*    NOP*    Implied
 +1b  SLO*    RLA*    SRE*    RRA*    SHS**   LAS**   DCP*    ISB*    Absolute,y
 +1c  NOP*    NOP*    NOP*    NOP*    SHY**   LDY     NOP*    NOP*    Absolute,x
 +1d  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute,x
 +1e  ASL     ROL     LSR     ROR     SHX**y) LDX  y) DEC     INC     Absolute,x
 +1f  SLO*    RLA*    SRE*    RRA*    SHA**y) LAX* y) DCP*    ISB*    Absolute,x
|
|
|
|
ROR instruction is available on MC650x microprocessors after
|
|
June, 1976.
|
|
|
|
Legend:
|
|
|
|
t Jams the machine
|
|
*t Jams very rarely
|
|
* Undocumented command
|
|
** Unusual operation
|
|
y) indexed using Y instead of X
|
|
() indirect instead of absolute
|
|
|
|
Note that the NOP instructions do have other addressing modes than the
|
|
implied addressing. The NOP instruction is just like any other load
|
|
instruction, except that it neither stores the result anywhere nor affects the
|
|
flags.
|
|
|
|
6502 Registers
|
|
|
|
The NMOS 65xx processors are not ruined with too many registers. In addition
|
|
to that, the registers are mostly 8-bit. Here is a brief description of each
|
|
register:
|
|
|
|
PC Program Counter
|
|
This register points to the address from which the next instruction
|
|
byte (opcode or parameter) will be fetched. Unlike other
|
|
registers, this one is 16 bits in length. The low and high 8-bit
|
|
halves of the register are called PCL and PCH, respectively. The
|
|
Program Counter may be read by pushing its value on the stack.
|
|
This can be done either by jumping to a subroutine or by causing
|
|
an interrupt.
|
|
S Stack pointer
|
|
The NMOS 65xx processors have 256 bytes of stack memory, ranging
|
|
from $0100 to $01FF. The S register is an 8-bit offset to the stack
|
|
page. In other words, whenever anything is being pushed on the
|
|
stack, it will be stored to the address $0100+S.
|
|
|
|
The Stack pointer can be read and written by transferring its value
|
|
to or from the index register X (see below) with the TSX and TXS
|
|
instructions.
|
|
P Processor status
|
|
This 8-bit register stores the state of the processor. The bits in
|
|
this register are called flags. Most of the flags have something
|
|
to do with arithmetic operations.
|
|
|
|
The P register can be read by pushing it on the stack (with PHP or
|
|
by causing an interrupt). If you only need to read one flag, you
|
|
can use the branch instructions. Setting the flags is possible by
|
|
pulling the P register from stack or by using the flag set or
|
|
clear instructions.
|
|
|
|
Following is a list of the flags, starting from the 8th bit of the
|
|
P register (bit 7, value $80):
|
|
N Negative flag
|
|
This flag will be set after any arithmetic operations
|
|
(when any of the registers A, X or Y is being loaded
|
|
with a value). Generally, the N flag will be copied from
|
|
the topmost bit of the register being loaded.
|
|
|
|
Note that TXS (Transfer X to S) is not an arithmetic
|
|
operation. Also note that the BIT instruction affects
|
|
the Negative flag just like arithmetic operations.
|
|
Finally, the Negative flag behaves differently in
|
|
Decimal operations (see description below).
|
|
V oVerflow flag
|
|
Like the Negative flag, this flag is intended to be used
|
|
with 8-bit signed integer numbers. The flag will be
|
|
affected by addition and subtraction, the instructions
|
|
PLP, CLV and BIT, and the hardware signal -SO. Note that
|
|
there is no SEV instruction, even though the MOS
|
|
engineers loved to use East European abbreviations, like
|
|
DDR (Deutsche Demokratische Republik vs. Data Direction
|
|
Register). (The Russian abbreviation for their former
|
|
trade association COMECON is SEV.) The -SO (Set
|
|
Overflow) signal is available on some processors, at
|
|
least the 6502, to set the V flag. This enables response
|
|
to an I/O activity in three or fewer clock
|
|
cycles when using a BVC instruction branching to itself
|
|
($50 $FE).
|
|
|
|
The CLV instruction clears the V flag, and the PLP and
|
|
BIT instructions copy the flag value from the bit 6 of
|
|
the topmost stack entry or from memory.
|
|
|
|
After a binary addition or subtraction, the V flag will
|
|
be set on a sign overflow, cleared otherwise. What is a
|
|
sign overflow? For instance, if you are trying to add
|
|
123 and 45 together, the result (168) does not fit in an
|
|
8-bit signed integer (upper limit 127 and lower limit
|
|
-128). Similarly, adding -123 to -45 causes the
|
|
overflow, just like subtracting -45 from 123 or 123 from
|
|
-45 would do.
|
|
|
|
Like the N flag, the V flag will not be set as expected
|
|
in the Decimal mode. Later in this document is a precise
|
|
operation description.
|
|
|
|
A common misbelief is that the V flag could only be set
|
|
by arithmetic operations, not cleared.
|
|
1 unused flag
|
|
As far as is currently known, this flag is always 1.
|
|
B Break flag
|
|
This flag is used to distinguish software (BRK)
|
|
interrupts from hardware interrupts (IRQ or NMI). The B
|
|
flag is always set except when the P register is being
|
|
pushed on stack when jumping to an interrupt routine to
|
|
process only a hardware interrupt.
|
|
|
|
The official NMOS 65xx documentation claims that the BRK
|
|
instruction could only cause a jump to the IRQ vector
|
|
($FFFE). However, if an NMI interrupt occurs while
|
|
executing a BRK instruction, the processor will jump to
|
|
the NMI vector ($FFFA), and the P register will be
|
|
pushed on the stack with the B flag set.
|
|
D Decimal mode flag
|
|
This flag is used to select the (Binary Coded) Decimal
|
|
mode for addition and subtraction. In most applications,
|
|
the flag is zero.
|
|
|
|
The Decimal mode has many oddities, and it operates
|
|
differently on CMOS processors. See the description of
|
|
the ADC, SBC and ARR instructions below.
|
|
I Interrupt disable flag
|
|
This flag can be used to prevent the processor from
|
|
jumping to the IRQ handler vector ($FFFE) whenever the
|
|
hardware line -IRQ is active. The flag will be
|
|
automatically set after taking an interrupt, so that the
|
|
processor would not keep jumping to the interrupt
|
|
routine if the -IRQ signal remains low for several clock
|
|
cycles.
|
|
Z Zero flag
|
|
The Zero flag will be affected in the same cases as
|
|
the Negative flag. Generally, it will be set if an
|
|
arithmetic register is being loaded with the value zero,
|
|
and cleared otherwise. The flag will behave differently
|
|
in Decimal operations.
|
|
C Carry flag
|
|
This flag is used in additions, subtractions,
|
|
comparisons and bit rotations. In additions and
|
|
subtractions, it acts as a 9th bit and lets you chain
|
|
operations to calculate with bigger than 8-bit numbers.
|
|
When subtracting, the Carry flag is the negative of
|
|
Borrow: if an overflow occurs, the flag will be clear,
|
|
otherwise set. Comparisons are a special case of
|
|
subtraction: they assume Carry flag set and Decimal flag
|
|
clear, and do not store the result of the subtraction
|
|
anywhere.
|
|
|
|
There are four kinds of bit rotations. All of them store
|
|
the bit that is being rotated off to the Carry flag. The
|
|
left shifting instructions are ROL and ASL. ROL copies
|
|
the initial Carry flag to the lowmost bit of the byte;
|
|
ASL always clears it. Similarly, the ROR and LSR
|
|
instructions shift to the right.
|
|
A Accumulator
|
|
The accumulator is the main register for arithmetic and logic
|
|
operations. Unlike the index registers X and Y, it has a direct
|
|
connection to the Arithmetic and Logic Unit (ALU). This is why
|
|
many operations are only available for the accumulator, not the
|
|
index registers.
|
|
X Index register X
|
|
This is the main register for addressing data with indices. It has
|
|
a special addressing mode, indexed indirect, which lets you
|
|
have a vector table on the zero page.
|
|
Y Index register Y
|
|
The Y register has the least operations available. On the other
|
|
hand, only it has the indirect indexed addressing mode that
|
|
enables access to any memory place without having to use
|
|
self-modifying code.
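
The flag descriptions above can be condensed into a short C sketch of a
binary-mode addition. It is an illustration only: the names Flags and
adc_binary are assumptions made for this example and do not come from the
emulator this document belongs to.

```c
#include <stdint.h>

/* Hypothetical container for the four arithmetic flags (names assumed). */
typedef struct {
    unsigned n, v, z, c;
} Flags;

/* Binary-mode ADC: returns the new accumulator and fills in the flags. */
uint8_t adc_binary(uint8_t a, uint8_t s, unsigned carry_in, Flags *f)
{
    unsigned sum = (unsigned)a + s + (carry_in & 1u);
    uint8_t  r   = (uint8_t)sum;

    f->c = sum > 0xFF;                /* the "9th bit" of the sum          */
    f->z = (r == 0);                  /* result register holds zero        */
    f->n = (r & 0x80) != 0;           /* copy of the topmost result bit    */
    /* Sign overflow: both operands share a sign that the result lacks.   */
    f->v = ((~(a ^ s)) & (a ^ r) & 0x80) != 0;
    return r;
}
```

With A = 123 and a parameter of 45 the sum is 168 ($A8), so V and N are set
and C stays clear. Adding -123 ($85) and -45 ($D3) yields $58 with both V
and C set.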
|
|
|
|
6510/8502 Undocumented Commands
|
|
|
|
-- A brief explanation about what may happen while using don't care states.
|
|
|
|
ANE $8B A = (A | #$EE) & X & #byte
|
|
same as
|
|
A = ((A & #$11 & X) | ( #$EE & X)) & #byte
|
|
|
|
In real 6510/8502 the internal parameter #$11
|
|
may occasionally be #$10, #$01 or even #$00.
|
|
This occurs when the video chip starts DMA
|
|
between the opcode fetch and the parameter fetch
|
|
of the instruction. The value probably depends
|
|
on the data that was left on the bus by the VIC-II.
|
|
|
|
LXA $AB C=Lehti: A = X = ANE
|
|
Alternate: A = X = (A & #byte)
|
|
|
|
TXA and TAX have to be responsible for these.
|
|
|
|
SHA $93,$9F Store (A & X & (ADDR_HI + 1))
|
|
SHX $9E Store (X & (ADDR_HI + 1))
|
|
SHY $9C Store (Y & (ADDR_HI + 1))
|
|
SHS $9B SHA and TXS, where X is replaced by (A & X).
|
|
|
|
Note: The value to be stored is copied also
|
|
to ADDR_HI if page boundary is crossed.
|
|
|
|
SBX $CB Carry and Decimal flags are ignored but the
|
|
Carry flag will be set in subtraction. This
|
|
is due to the CMP command, which is executed
|
|
instead of the real SBC.
|
|
|
|
ARR $6B This instruction first performs an AND
|
|
between the accumulator and the immediate
|
|
parameter, then it shifts the accumulator to
|
|
the right. However, this is not the whole
|
|
truth. See the description below.
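
The two ANE formulas listed above are the same expression written two ways:
(A | #$EE) & X equals (A & X) | (#$EE & X), and the only bits of A that #$EE
does not already supply are those in #$11. A brute-force check in C, as a
minimal sketch. The function names are made up for this example, and the
code models only the idealised constant #$EE, not the chip-dependent
variations described elsewhere in this document.

```c
#include <stdint.h>

/* First form from the text: A = (A | #$EE) & X & #byte */
uint8_t ane_form1(uint8_t a, uint8_t x, uint8_t byte)
{
    return (uint8_t)((a | 0xEE) & x & byte);
}

/* Second form: A = ((A & #$11 & X) | (#$EE & X)) & #byte */
uint8_t ane_form2(uint8_t a, uint8_t x, uint8_t byte)
{
    return (uint8_t)(((a & 0x11 & x) | (0xEE & x)) & byte);
}

/* Returns 1 if both forms agree for every A, X and operand byte. */
int ane_forms_agree(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned x = 0; x < 256; x++)
            for (unsigned b = 0; b < 256; b++)
                if (ane_form1((uint8_t)a, (uint8_t)x, (uint8_t)b) !=
                    ane_form2((uint8_t)a, (uint8_t)x, (uint8_t)b))
                    return 0;
    return 1;
}
```

Replacing 0x11 in ane_form2 by 0x10, 0x01 or 0x00 gives the degraded
variants that the ANE entry attributes to VIC-II DMA.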
|
|
|
|
Many undocumented commands do not use AND between registers; the CPU
just throws the bytes to a bus simultaneously and lets the
open-collector drivers perform the AND. For example, the command called
'SAX', which is in the STORE section (opcodes $80...$9F), stores the
result of (A & X) this way.
|
|
|
|
More fortunate is its opposite, 'LAX' which just loads a byte
|
|
simultaneously into both A and X.
|
|
|
|
$6B ARR
|
|
|
|
This instruction seems to be a harmless combination of AND and ROR at
|
|
first sight, but it turns out that it affects the V flag and also has
|
|
a special kind of decimal mode. This is because the instruction has
|
|
inherited some properties of the ADC instruction ($69) in addition to
|
|
the ROR ($6A).
|
|
|
|
In Binary mode (D flag clear), the instruction effectively does an AND
|
|
between the accumulator and the immediate parameter, and then shifts
|
|
the accumulator to the right, copying the C flag to the 8th bit. It
|
|
sets the Negative and Zero flags just like the ROR would. The ADC code
|
|
shows up in the Carry and oVerflow flags. The C flag will be copied
|
|
from the bit 6 of the result (which doesn't seem too logical), and the
|
|
V flag is the result of an Exclusive OR operation between the bit 6
|
|
and the bit 5 of the result. This makes sense, since the V flag will
|
|
be normally set by an Exclusive OR, too.
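
The Binary mode behaviour described above can be written down as a short C
sketch. The struct ArrState and the function arr_binary are hypothetical
names introduced for this illustration; the logic follows the paragraph
above and nothing more.

```c
#include <stdint.h>

/* Hypothetical register file for this sketch (names assumed). */
typedef struct {
    uint8_t  a;
    unsigned n, v, z, c;
} ArrState;

/* ARR #imm with the D flag clear, following the prose description. */
void arr_binary(ArrState *st, uint8_t imm)
{
    uint8_t t = st->a & imm;                               /* the AND */
    uint8_t r = (uint8_t)((t >> 1) | ((st->c & 1u) << 7)); /* the ROR */

    st->a = r;
    st->n = (r & 0x80) != 0;            /* N and Z as after a plain ROR  */
    st->z = (r == 0);
    st->c = (r >> 6) & 1;               /* C copied from bit 6 of result */
    st->v = ((r >> 6) ^ (r >> 5)) & 1;  /* V = bit 6 XOR bit 5 of result */
}
```

For example, A = $80 with C set and an immediate of $FF gives A = $C0 with
N, C and V all set, because bit 6 of the result is 1 and bit 5 is 0.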
|
|
|
|
In Decimal mode (D flag set), the ARR instruction first performs the
|
|
AND and ROR, just like in Binary mode. The N flag will be copied from
|
|
the initial C flag, and the Z flag will be set according to the ROR
|
|
result, as expected. The V flag will be set if the bit 6 of the
|
|
accumulator changed its state between the AND and the ROR, cleared
|
|
otherwise.
|
|
|
|
Now comes the funny part. If the low nybble of the AND result,
|
|
incremented by its lowmost bit, is greater than 5, the low nybble in
|
|
the ROR result will be incremented by 6. The low nybble may overflow
|
|
as a consequence of this BCD fixup, but the high nybble won't be
|
|
adjusted. The high nybble will be BCD fixed in a similar way. If the
|
|
high nybble of the AND result, incremented by its lowmost bit, is
|
|
greater than 5, the high nybble in the ROR result will be incremented
|
|
by 6, and the Carry flag will be set. Otherwise the C flag will be
|
|
cleared.
|
|
|
|
To help you understand this description, here is a C routine that
|
|
illustrates the ARR operation in Decimal mode:
|
|
|
|
        unsigned
           A,  /* Accumulator */
           AL, /* low nybble of accumulator */
           AH, /* high nybble of accumulator */

           C,  /* Carry flag */
           Z,  /* Zero flag */
           V,  /* oVerflow flag */
           N,  /* Negative flag */

           t,  /* temporary value */
           s;  /* value to be ARRed with Accumulator */

        t = A & s;                      /* Perform the AND. */

        AH = t >> 4;                    /* Separate the high */
        AL = t & 15;                    /* and low nybbles. */

        N = C;                          /* Set the N and */
        Z = !(A = (t >> 1) | (C << 7)); /* Z flags traditionally */
        V = (t ^ A) & 64;               /* and V flag in a weird way. */

        if (AL + (AL & 1) > 5)          /* BCD "fixup" for low nybble. */
          A = (A & 0xF0) | ((A + 6) & 0xF);

        if ((C = (AH + (AH & 1) > 5)) != 0) /* Set the Carry flag. */
          A = (A + 0x60) & 0xFF;        /* BCD "fixup" for high nybble. */
|
|
|
|
$CB SBX X <- (A & X) - Immediate
|
|
|
|
The 'SBX' ($CB) may seem to be a very complex operation, even though it
|
|
is a combination of the subtraction of accumulator and parameter, as
|
|
in the 'CMP' instruction, and the command 'DEX'. As a result, both A
|
|
and X are connected to ALU but only the subtraction takes place. Since
|
|
the comparison logic was used, the result of subtraction should be
|
|
normally ignored, but the 'DEX' now happily stores to X the value of
|
|
(A & X) - Immediate. That is why this instruction does not have any
|
|
decimal mode, and it does not affect the V flag. Also, the Carry flag will
|
|
be ignored in the subtraction but set according to the result.
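
As a minimal sketch of the behaviour just described, the following C
function models SBX as a CMP-style subtraction whose result is kept in X.
The names SbxState and sbx are assumptions for this example. The V and D
flags are left out of the struct on purpose, since the text above says the
instruction neither uses nor changes them.

```c
#include <stdint.h>

/* Hypothetical register file for this sketch (names assumed). */
typedef struct {
    uint8_t  a, x;
    unsigned n, z, c;
} SbxState;

/* SBX #imm: X <- (A & X) - imm, with N, Z and C set as by CMP. */
void sbx(SbxState *st, uint8_t imm)
{
    uint8_t  lhs  = st->a & st->x;
    unsigned diff = (unsigned)lhs - imm;  /* no borrow-in: old C is ignored */

    st->x = (uint8_t)diff;
    st->c = lhs >= imm;                   /* set when no borrow occurs      */
    st->z = (st->x == 0);
    st->n = (st->x & 0x80) != 0;
}
```

With A = $FF, X = $0F and an immediate of $01, X becomes $0E and C is set,
regardless of the state of C before the instruction.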
|
|
|
|
Proof:
|
|
|
|
begin 644 vsbx
|
|
M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```*D`H#V1*Z`_D2N@09$KJ0>%
|
|
M^QBE^VEZJ+$KH#F1*ZD`2"BI`*(`RP`(:-B@.5$K*4#P`E@`H#VQ*SAI`)$K
|
|
JD-Z@/[$K:0"1*Y#4J2X@TO\XH$&Q*VD`D2N0Q,;[$+188/_^]_:_OK>V
|
|
`
|
|
end
|
|
|
|
and
|
|
|
|
begin 644 sbx
|
|
M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI`*!-D2N@3Y$KH%&1*ZD#
|
|
MA?L8I?M*2)`#J1@LJ3B@29$K:$J0`ZGX+*G8R)$K&/BXJ?2B8\L)AOP(:(7]
|
|
MV#B@3;$KH$\Q*Z!1\2L(1?SP`0!H1?TIM]#XH$VQ*SAI`)$KD,N@3[$K:0"1
|
|
9*Y#!J2X@TO\XH%&Q*VD`D2N0L<;[$))88-#X
|
|
`
|
|
end
|
|
|
|
These test programs show if your machine is compatible with ours
|
|
regarding the opcode $CB. The first test, vsbx, proves that SBX does
|
|
not affect the V flag. The latter one, sbx, proves the rest of our
|
|
theory. The vsbx test tests 33554432 SBX combinations (16777216
|
|
different A, X and Immediate combinations, and two different V flag
|
|
states), and the sbx test doubles that amount (16777216*4 D and C flag
|
|
combinations). Both tests have run successfully on a C64 and a Vic20.
|
|
They ought to run on C16, +4 and the PET series as well. The tests
|
|
stop with BRK, if the opcode $CB does not work as expected. Successful
|
|
operation ends in RTS. As the tests are very slow, they print dots on
|
|
the screen while running so that you know that the machine has not
|
|
jammed. On computers running at 1 MHz, the first test prints
|
|
approximately one dot every four seconds and a total of 2048 dots,
|
|
whereas the second one prints half that amount, one dot every seven
|
|
seconds.
|
|
|
|
If the tests fail on your machine, please let us know your processor's
|
|
part number and revision. If possible, save the executable (after it
|
|
has stopped with BRK) under another name and send it to us so that we
|
|
know at which stage the program stopped.
|
|
|
|
The following program is a Commodore 64 executable that Marko Mäkelä
|
|
developed when trying to find out how the V flag is affected by SBX.
|
|
(It was believed that the SBX affects the flag in a weird way, and
|
|
this program shows how SBX sets the flag differently from SBC.) You
|
|
may find the subroutine at $C150 useful when researching other
|
|
undocumented instructions' flags. Run the program in a machine
|
|
language monitor, as it makes use of the BRK instruction. The result
|
|
tables will be written on pages $C2 and $C3.
|
|
|
|
begin 644 sbx-c100
|
|
M`,%XH`",#L&,$,&,$L&XJ8*B@LL7AOL(:(7\N#BM#L$M$,'M$L$(Q?OP`B@`
|
|
M:$7\\`,@4,'N#L'0U.X0P=#/SB#0[A+!T,<``````````````)BJ\!>M#L$M
|
|
L$,'=_\'0":T2P=W_PM`!8,K0Z:T.P2T0P9D`PID`!*T2P9D`PYD`!
|
|
|
|
Other undocumented instructions usually cause two preceding opcodes
|
|
to be executed. However, 'NOP' seems to completely disappear from 'SBC'
|
|
code $EB.
|
|
|
|
The most difficult to comprehend are the rest of the instructions
|
|
located on the '$0B' line.
|
|
|
|
All the instructions located at the positive (left) side of this line
|
|
should rotate either memory or the accumulator, but the addressing
|
|
mode turns out to be immediate! No problem. Just read the operand, let
|
|
it be ANDed with the accumulator and finally use accumulator
|
|
addressing mode for the instructions above them.
|
|
|
|
RELIGION_MODE_ON
|
|
/* This part of the document is not accurate. You can
|
|
read it as a fairy tale, but do not count on it when
|
|
performing your own measurements. */
|
|
|
|
The remaining two instructions on the same line, called 'ANE' and 'LXA'
|
|
($8B and $AB respectively) often give quite unpredictable results.
|
|
However, the most usual operation is to store ((A | #$ee) & X & #$nn)
|
|
to accumulator. Note that this does not work reliably in a real 64!
|
|
In the Commodore 128 the opcode $8B uses values 8C, CC, EE, and
|
|
occasionally 0C and 8E for the OR instead of EE,EF,FE and FF used in
|
|
the C64. With a C128 running at 2 MHz #$EE is always used. Opcode $AB
|
|
does not cause this OR to take place on the 8502, while the 6510 always performs
|
|
it. Note that this behaviour depends on processor and/or video chip
|
|
revision.
|
|
|
|
Let's take a closer look at $8B (6510).
|
|
|
|
A <- X & D & (A | VAL)
|
|
|
|
where VAL comes from this table:
|
|
|
|
   X high   D high   D low   VAL
   even     even     ---     $EE (1)
   even     odd      ---     $EE
   odd      even     ---     $EE
   odd      odd      0       $EE
   odd      odd      not 0   $FE (2)
|
|
|
|
(1) If the bottom 2 bits of A are both 1, then the LSB of the result may
|
|
be 0. The values of X and D are different every time I run the test.
|
|
This appears to be very rare.
|
|
(2) VAL is $FE most of the time. Sometimes it is $EE - it seems to be random,
|
|
not related to any of the data. This is much more common than (1).
|
|
|
|
In decimal mode, VAL is usually $FE.
|
|
|
|
Two different functions have been discovered for LXA, opcode $AB. One
|
|
is A = X = ANE (see above) and the other, encountered with 6510 and
|
|
8502, is less complicated A = X = (A & #byte). However, according to
|
|
what is reported, the version altering only the lowest bits of each
|
|
nybble seems to be more common.
|
|
|
|
What happens, is that $AB loads a value into both A and X, ANDing the
|
|
low bit of each nybble with the corresponding bit of the old
|
|
A. However, there are exceptions. Sometimes the low bit is cleared
|
|
even when A contains a '1', and sometimes other bits are cleared. The
|
|
exceptions seem random (they change every time I run the test). Oops -
|
|
that was in decimal mode. Much the same with D=0.
|
|
|
|
What causes the randomness? Probably it is marginal logic
|
|
levels - when too much wired-anding goes on, some of the signals get
|
|
very close to the threshold. Perhaps we're seeing some of them step
|
|
over it. The low bit of each nybble is special, since it has to cope
|
|
with carry differently (remember decimal mode). We never see a '0'
|
|
turn into a '1'.
|
|
|
|
Since these instructions are unpredictable, they should not be used.
|
|
|
|
There is still one very strange instruction left, the one named SHA/X/Y,
|
|
which is the only one with only indexed addressing modes. Actually,
|
|
the commands 'SHA', 'SHX' and 'SHY' are generated by the indexing
|
|
algorithm.
|
|
|
|
While using indexed addressing, effective address for page boundary
|
|
crossing is calculated as soon as possible so it does not slow down
|
|
operation. As a result, in the case of SHA/X/Y, the address and data
|
|
are processed at the same time, causing an AND between them to take place.
|
|
Thus, the value to be stored by SHA, for example, is in fact (A & X &
|
|
(ADDR_HI + 1)). On page boundary crossing the same value is copied
|
|
also to high byte of the effective address.
|
|
|
|
RELIGION_MODE_OFF
|
|
|
|
|
|
Register selection for load and store
|
|
|
|
   bit1 bit0     A  X  Y
    0    0             x
    0    1       x
    1    0          x
    1    1       x  x

So, A and X are selected by bits 0 and 1 respectively, while
~(bit1|bit0) enables Y.
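
The selection rule can be checked against the opcode table at the start of
this document: LDA $A5 has bit 0 set, LDX $A6 has bit 1 set, LDY $A4 has
neither and LAX $A7 has both. A minimal C sketch of that decoding follows.
The SEL_ constants and the function name are hypothetical, and the function
assumes it is only given opcodes from the load and store rows ($80...$BF).

```c
#include <stdint.h>

enum { SEL_A = 1, SEL_X = 2, SEL_Y = 4 };   /* hypothetical bit masks */

/* Which registers a load/store opcode addresses, per the rule above. */
unsigned selected_registers(uint8_t opcode)
{
    unsigned bit0 = opcode & 1u;
    unsigned bit1 = (opcode >> 1) & 1u;
    unsigned sel  = 0;

    if (bit0)           sel |= SEL_A;   /* bit 0 enables A          */
    if (bit1)           sel |= SEL_X;   /* bit 1 enables X          */
    if (!(bit0 | bit1)) sel |= SEL_Y;   /* ~(bit1|bit0) enables Y   */
    return sel;
}
```

The same decoding explains why the undocumented SAX ($87) stores from both
A and X at once.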
|
|
|
|
Indexing is determined by bit4, even in relative addressing mode,
|
|
which is one kind of indexing.
|
|
|
|
Lines containing opcodes xxx000x1 (01 and 03) are treated as absolute
|
|
after the effective address has been loaded into CPU.
|
|
|
|
Zeropage,y and Absolute,y (codes 10x1 x11x) are distinguished by bit5.
|
|
|
|
|
|
Decimal mode in NMOS 6500 series
|
|
|
|
Most sources claim that the NMOS 6500 series sets the N, V and Z
|
|
flags unpredictably when performing addition or subtraction in decimal
|
|
mode. Of course, this is not true. While testing how the flags are
|
|
set, I also wanted to see what happens if you use illegal BCD values.
|
|
|
|
ADC works in Decimal mode in a quite complicated way. It is amazing
|
|
how it can do that all in a single cycle. Here's a C code version of
|
|
the instruction:
|
|
|
|
        unsigned
           A,  /* Accumulator */
           AL, /* low nybble of accumulator */
           AH, /* high nybble of accumulator */

           C,  /* Carry flag */
           Z,  /* Zero flag */
           V,  /* oVerflow flag */
           N,  /* Negative flag */

           s;  /* value to be added to Accumulator */

        AL = (A & 15) + (s & 15) + C;         /* Calculate the lower nybble. */

        if (AL > 9) AL += 6;                  /* BCD fixup for lower nybble. */

        AH = (A >> 4) + (s >> 4) + (AL > 15); /* Calculate the upper nybble,
                                                 including the half carry
                                                 produced by the fixup. */

        Z = (((A + s + C) & 255) == 0);       /* Zero flag is set just
                                                 like in Binary mode. */

        /* Negative and Overflow flags are set with the same logic as in
           Binary mode, but after fixing the lower nybble. */

        N = ((AH & 8) != 0);
        V = (((AH << 4) ^ A) & 128) && !((A ^ s) & 128);

        if (AH > 9) AH += 6;                  /* BCD fixup for upper nybble. */

        /* Carry is the only flag set after fixing the result. */

        C = (AH > 15);
        A = ((AH << 4) | (AL & 15)) & 255;
|
|
|
|
The C flag is set as the quiche eaters expect, but the N and V flags
|
|
are set after fixing the lower nybble but before fixing the upper one.
|
|
They use the same logic as binary mode ADC. The Z flag is set before
|
|
any BCD fixup, so the D flag does not have any influence on it.
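
To try the flag behaviour on concrete values, the decimal ADC can be wrapped
in a self-contained C function. This is a sketch under stated assumptions:
the names BcdState and adc_decimal are invented for the example, and the
low-nybble fixup is applied before the high nybbles are summed, which is
what lets $09 + $01 carry into the upper digit.

```c
#include <stdint.h>

/* Hypothetical register file for this sketch (names assumed). */
typedef struct {
    uint8_t  a;
    unsigned n, v, z, c;
} BcdState;

/* Decimal-mode ADC on an NMOS 65xx, as a sketch. */
void adc_decimal(BcdState *st, uint8_t s)
{
    unsigned a   = st->a;
    unsigned cin = st->c & 1u;
    unsigned al  = (a & 15) + (s & 15) + cin;  /* lower nybble          */
    unsigned ah;

    if (al > 9) al += 6;                       /* low BCD fixup         */
    ah = (a >> 4) + (s >> 4) + (al > 15);      /* upper nybble + carry  */

    st->z = ((a + s + cin) & 255) == 0;        /* from the binary sum   */
    st->n = (ah & 8) != 0;                     /* before high fixup     */
    st->v = (((ah << 4) ^ a) & 128) && !((a ^ s) & 128);

    if (ah > 9) ah += 6;                       /* high BCD fixup        */
    st->c = ah > 15;                           /* carry after the fixup */
    st->a = (uint8_t)((ah << 4) | (al & 15));
}
```

Adding $01 to $99 with C clear leaves A = $00 and C set, yet Z stays clear
and N is set, because Z comes from the binary sum $9A and N is taken before
the upper nybble is fixed.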
|
|
|
|
Proof: The following test program tests all 131072 ADC combinations in
|
|
Decimal mode, and aborts with BRK if anything breaks this theory.
|
|
If everything goes well, it ends in RTS.
|
|
|
|
begin 600 dadc
|
|
M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'BI&* A/N$_$B@+)$KH(V1
|
|
M*Q@(I?PI#X7]I?LI#V7]R0J0 FD%J"D/A?VE^RGP9?PI\ C $) ":0^JL @H
|
|
ML ?)H) &""@X:5\X!?V%_0AH*3W@ ! ""8"HBD7[$ JE^T7\, 28"4"H**7[
|
|
M9?S0!)@) J@8N/BE^V7\V A%_= G:(3]1?W0(.;[T(?F_-"#:$D8\ )88*D=
|
|
0&&4KA?NI &4LA?RI.&S[ A%
|
|
|
|
end
|
|
|
|
All programs in this chapter have been successfully tested on a Vic20
|
|
and a Commodore 64 and a Commodore 128D in C64 mode. They should run on
|
|
C16, +4 and on the PET series as well. If not, please report the problem
|
|
to Marko Mäkelä. Each test in this chapter should run in less than a
|
|
minute at 1 MHz.
|
|
|
|
SBC is much easier. Just like CMP, its flags are not affected by
|
|
the D flag.
|
|
|
|
Proof:
|
|
|
|
begin 600 dsbc-cmp-flags
|
|
M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'B@ (3[A/RB XH8:66HL2N@
|
|
M09$KH$R1*XII::BQ*Z!%D2N@4)$K^#BXI?OE_-@(:(7].+BE^^7\"&A%_? !
|
|
5 .;[T./F_-#?RA"_8!@X&#CEY<7%
|
|
|
|
end
|
|
|
|
The only difference in SBC's operation in decimal mode from binary mode
|
|
is the result-fixup:
|
|
|
|
        unsigned
           A,  /* Accumulator */
           AL, /* low nybble of accumulator */
           AH, /* high nybble of accumulator */

           C,  /* Carry flag */
           Z,  /* Zero flag */
           V,  /* oVerflow flag */
           N,  /* Negative flag */

           s,  /* value to be subtracted from Accumulator */
           t;  /* binary difference, used for the flags */

        t = A - s - !C;                       /* Binary subtraction. */

        AL = (A & 15) - (s & 15) - !C;        /* Calculate the lower nybble. */

        AH = (A >> 4) - (s >> 4) - ((AL & 16) != 0);
                                              /* Calculate the upper nybble,
                                                 borrowing one if the lower
                                                 nybble underflowed. */

        if (AL & 16) AL -= 6;                 /* BCD fixup for lower nybble. */

        if (AH & 16) AH -= 6;                 /* BCD fixup for upper nybble. */

        /* The flags are set just like in Binary mode. */

        C = ((t & 256) == 0);                 /* Carry is the inverse of Borrow. */
        Z = ((t & 255) == 0);
        V = ((t ^ A) & 128) && ((A ^ s) & 128);
        N = ((t & 128) != 0);

        A = ((AH << 4) | (AL & 15)) & 255;
|
|
|
|
Again, the Z flag is set before any BCD fixup. The N and V flags are set
|
|
at any time before fixing the high nybble. The C flag may be set in any
|
|
phase.
|
|
|
|
Decimal subtraction is easier than decimal addition, as you have to
|
|
make the BCD fixup only when a nybble overflows. In decimal addition,
|
|
you had to verify if the nybble was greater than 9. The processor has
|
|
an internal "half carry" flag for the lower nybble, used to trigger
|
|
the BCD fixup. When calculating with legal BCD values, the lower nybble
|
|
cannot overflow again when fixing it.
|
|
So, the processor does not handle overflows while performing the fixup.
|
|
Similarly, the BCD fixup occurs in the high nybble only if the value
|
|
overflows, i.e. when the C flag will be cleared.
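
The fixup-on-borrow rule can be exercised with a self-contained C sketch.
The names SbcState and sbc_decimal are assumptions for this example. The
flags are taken from the plain binary difference, and each nybble is
corrected by 6 only when it borrowed.

```c
#include <stdint.h>

/* Hypothetical register file for this sketch (names assumed). */
typedef struct {
    uint8_t  a;
    unsigned n, v, z, c;
} SbcState;

/* Decimal-mode SBC on an NMOS 65xx, as a sketch. */
void sbc_decimal(SbcState *st, uint8_t s)
{
    unsigned a      = st->a;
    unsigned borrow = !(st->c & 1u);
    unsigned t      = a - s - borrow;               /* binary difference */
    unsigned al     = (a & 15) - (s & 15) - borrow; /* lower nybble      */
    unsigned ah     = (a >> 4) - (s >> 4) - ((al & 16) != 0);

    if (al & 16) al -= 6;             /* fix up only if the nybble borrowed */
    if (ah & 16) ah -= 6;

    st->c = (t & 256) == 0;           /* Carry is the inverse of Borrow     */
    st->z = (t & 255) == 0;
    st->v = ((t ^ a) & 128) && ((a ^ s) & 128);
    st->n = (t & 128) != 0;
    st->a = (uint8_t)((ah << 4) | (al & 15));
}
```

Subtracting $13 from $40 with C set borrows in the lower nybble only and
gives $27; subtracting $01 from $00 borrows in both and gives $99 with C
clear.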
|
|
|
|
Because SBC's flags are not affected by the Decimal mode flag, you
|
|
could guess that CMP uses the SBC logic, only setting the C flag
|
|
first. But the SBX instruction shows that CMP also temporarily clears
|
|
the D flag, although it is totally unnecessary.
|
|
|
|
The following program, which tests SBC's result and flags,
|
|
contains the 6502 version of the pseudo code example above.
|
|
|
|
begin 600 dsbc
|
|
M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'BI&* A/N$_$B@+)$KH':1
|
|
M*S@(I?PI#X7]I?LI#^7]L /I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL KI7RBP
|
|
M#ND/.+ )*+ &Z0^P NE?A/T%_87]*+BE^^7\"&BH.+CXI?OE_-@(1?W0FVB$
|
|
8_47]T)3F^]">YOS0FFA)&- $J3C0B%A@
|
|
|
|
end
|
|
|
|
Obviously the undocumented instructions RRA (ROR+ADC) and ISB
|
|
(INC+SBC) have inherited also the decimal operation from the official
|
|
instructions ADC and SBC. The program droradc proves this statement
|
|
for RRA, and the dincsbc test proves this for ISB. Finally,
|
|
dincsbc-deccmp proves that ISB's and DCP's (DEC+CMP) flags are not
|
|
affected by the D flag.
|
|
|
|
begin 644 droradc
|
|
M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH(V1
|
|
M*S@(I?PI#X7]I?LI#V7]R0J0`FD%J"D/A?VE^RGP9?PI\`C`$)`":0^JL`@H
|
|
ML`?)H)`&""@X:5\X!?V%_0AH*3W@`!`""8"HBD7[$`JE^T7\,`28"4"H**7[
|
|
M9?S0!)@)`J@XN/BE^R;\9_S8"$7]T"=HA/U%_=`@YOO0A>;\T(%H21CP`EA@
|
|
2J1T892N%^ZD`92R%_*DX;/L`
|
|
`
|
|
end
|
|
|
|
begin 644 dincsbc
|
|
M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH':1
|
|
M*S@(I?PI#X7]I?LI#^7]L`/I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL`KI7RBP
|
|
M#ND/.+`)*+`&Z0^P`NE?A/T%_87]*+BE^^7\"&BH.+CXI?O&_.?\V`A%_="9
|
|
::(3]1?W0DN;[T)SF_-"8:$D8T`2I.-"&6&#\
|
|
`
|
|
end
|
|
|
|
begin 644 dincsbc-deccmp
|
|
M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'B@`(3[A/RB`XH8:7>HL2N@
|
|
M3Y$KH%R1*XII>ZBQ*Z!3D2N@8)$KBFE_J+$KH%61*Z!BD2OX.+BE^^;\Q_S8
|
|
L"&B%_3BXI?OF_,?\"&A%_?`!`.;[T-_F_-#;RA"M8!@X&#CFYL;&Q\?GYP#8
|
|
`
|
|
end
|
|
|
|
|
|
6510 features
|
|
|
|
o PHP always pushes the Break (B) flag as a `1' to the stack.
|
|
Jukka Tapanimäki claimed in C=lehti issue 3/89, on page 27 that the
|
|
processor makes a logical OR between the status register's bit 4
|
|
and the bit 8 of the stack pointer register (which is always 1).
|
|
He did not give any reasons for this argument, and has refused to clarify
|
|
it afterwards. Well, this was not the only error in his article...
|
|
|
|
o Indirect addressing modes do not handle page boundary crossing at all.
|
|
When the parameter's low byte is $FF, the effective address wraps
|
|
around and the CPU fetches high byte from $xx00 instead of $xx00+$0100.
|
|
E.g. JMP ($01FF) fetches PCL from $01FF and PCH from $0100,
|
|
and LDA ($FF),Y fetches the base address from $FF and $00.
|
|
|
|
o Indexed zero page addressing modes never fix the page address on
|
|
crossing the zero page boundary.
|
|
E.g. LDX #$01 : LDA ($FF,X) loads the effective address from $00 and $01.
|
|
|
|
o The processor always fetches the byte following a relative branch
|
|
instruction. If the branch is taken, the processor then reads the
|
|
opcode from the destination address. If a page boundary is crossed, it
|
|
first reads a byte from the old page from a location that is bigger
|
|
or smaller than the correct address by one page.
|
|
|
|
o If you cross a page boundary in any other indexed mode,
|
|
the processor reads an incorrect location first, a location that is
|
|
smaller by one page.
|
|
|
|
o Read-Modify-Write instructions write unmodified data, then modified
|
|
(so INC effectively does LDX loc;STX loc;INX;STX loc)
|
|
|
|
o -RDY is ignored during writes
|
|
(This is why you must wait 3 cycles before doing any DMA --
|
|
the maximum number of consecutive writes is 3, which occurs
|
|
during interrupts except -RESET.)
|
|
|
|
o Some undefined opcodes may give really unpredictable results.
|
|
|
|
o All registers except the Program Counter remain unmodified after -RESET.
|
|
(This is why you must preset D and I flags in the RESET handler.)
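The zero page wraparound described in the list above can be sketched in Python. This is only an illustrative model, not code from any emulator; the function names are made up for this example.

```python
def indexed_indirect_pointer(zp, x):
    """Addresses read for the pointer of an (indir,x) access such as LDA ($FF,X)."""
    lo = (zp + x) & 0xFF          # the index is added without carry into the page
    hi = (lo + 1) & 0xFF          # the +1 also stays inside zero page
    return lo, hi


def indirect_indexed_pointer(zp):
    """Addresses read for the pointer of an (indir),y access such as LDA ($FF),Y."""
    return zp & 0xFF, (zp + 1) & 0xFF


print(indexed_indirect_pointer(0xFF, 0x01))  # LDX #$01 : LDA ($FF,X)
print(indirect_indexed_pointer(0xFF))        # LDA ($FF),Y
```

Both functions return the pair (address of low byte, address of high byte) of the pointer, so the two examples from the list can be checked directly.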
|
|
|
|
|
|
Different CPU types
|
|
|
|
The Rockwell data booklet 29651N52 (technical information about R65C00
|
|
microprocessors, dated October 1984), lists the following differences between
|
|
NMOS R6502 microprocessor and CMOS R65C00 family:
|
|
|
|
|
|
1. Indexed addressing across page boundary.
|
|
NMOS: Extra read of invalid address.
|
|
CMOS: Extra read of last instruction byte.
|
|
|
|
|
|
2. Execution of invalid op codes.
|
|
NMOS: Some terminate only by reset. Results are undefined.
|
|
CMOS: All are NOPs (reserved for future use).
|
|
|
|
|
|
3. Jump indirect, operand = XXFF.
|
|
NMOS: Page address does not increment.
|
|
CMOS: Page address increments and adds one additional cycle.
|
|
|
|
|
|
4. Read/modify/write instructions at effective address.
|
|
NMOS: One read and two write cycles.
|
|
CMOS: Two read and one write cycle.
|
|
|
|
|
|
5. Decimal flag.
|
|
NMOS: Indeterminate after reset.
|
|
CMOS: Initialized to binary mode (D=0) after reset and interrupts.
|
|
|
|
|
|
6. Flags after decimal operation.
|
|
NMOS: Invalid N, V and Z flags.
|
|
CMOS: Valid flags; adds one additional cycle.
|
|
|
|
|
|
7. Interrupt after fetch of BRK instruction.
|
|
NMOS: Interrupt vector is loaded, BRK vector is ignored.
|
|
CMOS: BRK is executed, then interrupt is executed.
|
|
|
|
|
|
6510 Instruction Timing
|
|
|
|
The NMOS 6500 series processors always perform at least two reads
|
|
for each instruction. In addition to the operation code (opcode), they
|
|
fetch the next byte. This is quite efficient, as most instructions are
|
|
two or three bytes long.
|
|
|
|
The processors also use a sort of pipelining. If an instruction does
|
|
not store data in memory on its last cycle, the processor can fetch
|
|
the opcode of the next instruction while executing the last cycle. For
|
|
instance, the instruction EOR #$FF truly takes three cycles. On the
|
|
first cycle, the opcode $49 will be fetched. During the second cycle
|
|
the processor decodes the opcode and fetches the parameter #$FF. On
|
|
the third cycle, the processor will perform the operation and store
|
|
the result to accumulator, but simultaneously it fetches the opcode
|
|
for the next instruction. This is why the instruction effectively
|
|
takes only two cycles.
|
|
|
|
The following tables show what happens on the bus while executing
|
|
different kinds of instructions.
|
|
|
|
Interrupts
|
|
|
|
NMI and IRQ both take 7 cycles. Their timing diagram is much like
|
|
BRK's (see below). IRQ will be executed only when the I flag is
|
|
clear. IRQ and BRK both set the I flag, whereas the NMI does not
|
|
affect its state.
|
|
|
|
The processor will usually wait for the current instruction to
|
|
complete before executing the interrupt sequence. To process the
|
|
interrupt before the next instruction, the interrupt must occur
|
|
before the last cycle of the current instruction.
|
|
|
|
There is one exception to this rule: the BRK instruction. If a
|
|
hardware interrupt (NMI or IRQ) occurs before the fourth (flags
|
|
saving) cycle of BRK, the BRK instruction will be skipped, and
|
|
the processor will jump to the hardware interrupt vector. This
|
|
sequence will always take 7 cycles.
|
|
|
|
You do not completely lose the BRK interrupt: the B flag will be
|
|
set in the pushed status register if a BRK instruction gets
|
|
interrupted. When BRK and IRQ occur at the same time, this does
|
|
not cause any problems, as your program will consider it as a
|
|
BRK, and the IRQ would occur again after the processor returned
|
|
from your BRK routine, unless you cleared the interrupt source in
|
|
your BRK handler. But the simultaneous occurrence of NMI and BRK
|
|
is far more fatal. If you do not check the B flag in the NMI
|
|
routine and subtract two from the return address when needed, the
|
|
BRK instruction will be skipped.
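The check described above can be sketched as follows. This is a hedged Python model of what an NMI handler would do with the three bytes on the stack, assuming `memory` is a 64 KB byte array and `s` is the stack pointer on entry to the handler; `fix_return_address` is a hypothetical name. It relies on the fact that BRK pushes the address of the BRK opcode plus two.

```python
B_FLAG = 0x10  # bit 4 of the pushed status byte


def fix_return_address(memory, s):
    """Rewind the pushed return address by two if the pushed status has B set.

    Returns True when the return address was adjusted, so that the
    interrupted BRK is executed again after RTI.
    """
    p_addr = 0x0100 + ((s + 1) & 0xFF)    # pushed P
    lo_addr = 0x0100 + ((s + 2) & 0xFF)   # pushed PCL
    hi_addr = 0x0100 + ((s + 3) & 0xFF)   # pushed PCH
    if not memory[p_addr] & B_FLAG:
        return False
    ret = ((memory[lo_addr] | (memory[hi_addr] << 8)) - 2) & 0xFFFF
    memory[lo_addr] = ret & 0xFF
    memory[hi_addr] = ret >> 8
    return True
```

In real 6502 code the same thing is done with PLA/TSX and a 16-bit subtraction on the stack page; the model only shows which bytes are involved.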
|
|
|
|
If the NMI and IRQ interrupts overlap each other (one interrupt
|
|
occurs before fetching the interrupt vector for the other
|
|
interrupt), the processor will most probably jump to the NMI
|
|
vector in every case, and then jump to the IRQ vector after
|
|
processing the first instruction of the NMI handler. This has not
|
|
been measured yet, but the IRQ is very similar to BRK, and many
|
|
sources state that the NMI has higher priority than IRQ. However,
|
|
it might be that the processor takes the interrupt that comes
|
|
later, i.e. you could lose an NMI interrupt if an IRQ occurred within
|
|
four cycles after it.
|
|
|
|
After finishing the interrupt sequence, the processor will start
|
|
to execute the first instruction of the interrupt routine. This
|
|
proves that the processor uses a sort of pipelining: it finishes
|
|
the current instruction (or interrupt sequence) while reading the
|
|
opcode of the next instruction.
|
|
|
|
RESET does not push the program counter on the stack, and it lasts
|
|
probably 6 cycles after deactivating the signal. Like NMI, RESET
|
|
preserves all registers except PC.
|
|
|
|
Instructions accessing the stack
|
|
|
|
BRK
|
|
|
|
        #  address R/W description
       --- ------- --- -----------------------------------------------
        1    PC     R  fetch opcode, increment PC
        2    PC     R  read next instruction byte (and throw it away),
                       increment PC
        3  $0100,S  W  push PCH on stack, decrement S
        4  $0100,S  W  push PCL on stack, decrement S
        5  $0100,S  W  push P on stack (with B flag set), decrement S
        6   $FFFE   R  fetch PCL
        7   $FFFF   R  fetch PCH
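The BRK sequence can be written down as a list of bus accesses. The following Python sketch is only an illustration; `brk_trace` is a made-up name, and `pc` is assumed to be the address of the BRK opcode. The B flag is part of the status byte P, so it appears in the byte pushed last.

```python
def brk_trace(pc, s):
    """Return the (address, 'R' or 'W') pairs for the seven cycles of BRK."""
    def stack(offset):
        # The stack lives in page 1 and S wraps around within it.
        return 0x0100 + ((s - offset) & 0xFF)

    return [
        (pc & 0xFFFF, "R"),           # 1: fetch opcode
        ((pc + 1) & 0xFFFF, "R"),     # 2: read and discard the next byte
        (stack(0), "W"),              # 3: push PCH
        (stack(1), "W"),              # 4: push PCL
        (stack(2), "W"),              # 5: push P with the B flag set
        (0xFFFE, "R"),                # 6: fetch PCL from the vector
        (0xFFFF, "R"),                # 7: fetch PCH from the vector
    ]
```

Cycles 3 to 5 are the three consecutive writes mentioned in connection with -RDY.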
|
|
|
|
RTI
|
|
|
|
# address R/W description
|
|
--- ------- --- -----------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R read next instruction byte (and throw it away)
|
|
3 $0100,S R increment S
|
|
4 $0100,S R pull P from stack, increment S
|
|
5 $0100,S R pull PCL from stack, increment S
|
|
6 $0100,S R pull PCH from stack
|
|
|
|
RTS
|
|
|
|
# address R/W description
|
|
--- ------- --- -----------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R read next instruction byte (and throw it away)
|
|
3 $0100,S R increment S
|
|
4 $0100,S R pull PCL from stack, increment S
|
|
5 $0100,S R pull PCH from stack
|
|
6 PC R increment PC
|
|
|
|
PHA, PHP
|
|
|
|
# address R/W description
|
|
--- ------- --- -----------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R read next instruction byte (and throw it away)
|
|
3 $0100,S W push register on stack, decrement S
|
|
|
|
PLA, PLP
|
|
|
|
# address R/W description
|
|
--- ------- --- -----------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R read next instruction byte (and throw it away)
|
|
3 $0100,S R increment S
|
|
4 $0100,S R pull register from stack
|
|
|
|
JSR
|
|
|
|
# address R/W description
|
|
--- ------- --- -------------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch low address byte, increment PC
|
|
3 $0100,S R internal operation (predecrement S?)
|
|
4 $0100,S W push PCH on stack, decrement S
|
|
5 $0100,S W push PCL on stack, decrement S
|
|
6 PC R copy low address byte to PCL, fetch high address
|
|
byte to PCH
|
|
|
|
Accumulator or implied addressing
|
|
|
|
# address R/W description
|
|
--- ------- --- -----------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R read next instruction byte (and throw it away)
|
|
|
|
Immediate addressing
|
|
|
|
# address R/W description
|
|
--- ------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch value, increment PC
|
|
|
|
Absolute addressing
|
|
|
|
JMP
|
|
|
|
# address R/W description
|
|
--- ------- --- -------------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch low address byte, increment PC
|
|
3 PC R copy low address byte to PCL, fetch high address
|
|
byte to PCH
|
|
|
|
Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
|
|
LAX, NOP)
|
|
|
|
# address R/W description
|
|
--- ------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch low byte of address, increment PC
|
|
3 PC R fetch high byte of address, increment PC
|
|
4 address R read from effective address
|
|
|
|
Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
|
|
SLO, SRE, RLA, RRA, ISB, DCP)
|
|
|
|
# address R/W description
|
|
--- ------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch low byte of address, increment PC
|
|
3 PC R fetch high byte of address, increment PC
|
|
4 address R read from effective address
|
|
5 address W write the value back to effective address,
|
|
and do the operation on it
|
|
6 address W write the new value to effective address
|
|
|
|
Write instructions (STA, STX, STY, SAX)
|
|
|
|
# address R/W description
|
|
--- ------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch low byte of address, increment PC
|
|
3 PC R fetch high byte of address, increment PC
|
|
4 address W write register to effective address
|
|
|
|
Zero page addressing
|
|
|
|
Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
|
|
LAX, NOP)
|
|
|
|
# address R/W description
|
|
--- ------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch address, increment PC
|
|
3 address R read from effective address
|
|
|
|
Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
|
|
SLO, SRE, RLA, RRA, ISB, DCP)
|
|
|
|
# address R/W description
|
|
--- ------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch address, increment PC
|
|
3 address R read from effective address
|
|
4 address W write the value back to effective address,
|
|
and do the operation on it
|
|
5 address W write the new value to effective address
|
|
|
|
Write instructions (STA, STX, STY, SAX)
|
|
|
|
# address R/W description
|
|
--- ------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch address, increment PC
|
|
3 address W write register to effective address
|
|
|
|
Zero page indexed addressing
|
|
|
|
Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
|
|
LAX, NOP)
|
|
|
|
# address R/W description
|
|
--- --------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch address, increment PC
|
|
3 address R read from address, add index register to it
|
|
4 address+I* R read from effective address
|
|
|
|
Notes: I denotes either index register (X or Y).
|
|
|
|
* The high byte of the effective address is always zero,
|
|
i.e. page boundary crossings are not handled.
|
|
|
|
Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
|
|
SLO, SRE, RLA, RRA, ISB, DCP)
|
|
|
|
# address R/W description
|
|
--- --------- --- ---------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch address, increment PC
|
|
3 address R read from address, add index register X to it
|
|
4 address+X* R read from effective address
|
|
5 address+X* W write the value back to effective address,
|
|
and do the operation on it
|
|
6 address+X* W write the new value to effective address
|
|
|
|
Note: * The high byte of the effective address is always zero,
|
|
i.e. page boundary crossings are not handled.
|
|
|
|
Write instructions (STA, STX, STY, SAX)
|
|
|
|
# address R/W description
|
|
--- --------- --- -------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch address, increment PC
|
|
3 address R read from address, add index register to it
|
|
4 address+I* W write to effective address
|
|
|
|
Notes: I denotes either index register (X or Y).
|
|
|
|
* The high byte of the effective address is always zero,
|
|
i.e. page boundary crossings are not handled.
|
|
|
|
Absolute indexed addressing
|
|
|
|
Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
|
|
LAX, LAE, SHS, NOP)
|
|
|
|
# address R/W description
|
|
--- --------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch low byte of address, increment PC
|
|
3 PC R fetch high byte of address,
|
|
add index register to low address byte,
|
|
increment PC
|
|
4 address+I* R read from effective address,
|
|
fix the high byte of effective address
|
|
5+ address+I R re-read from effective address
|
|
|
|
Notes: I denotes either index register (X or Y).
|
|
|
|
* The high byte of the effective address may be invalid
|
|
at this time, i.e. it may be smaller by $100.
|
|
|
|
+ This cycle will be executed only if the effective address
|
|
was invalid during cycle #4, i.e. page boundary was crossed.
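The data accesses of an absolute indexed read can be modelled like this. It is a minimal Python sketch under the rules given in the table above; the name `absolute_indexed_read` is an assumption of this example.

```python
def absolute_indexed_read(base, index):
    """Return the data addresses read by e.g. LDA base,X on an NMOS 65xx.

    The first address has only its low byte indexed; a second read at the
    corrected address follows only when a page boundary was crossed.
    """
    low = (base + index) & 0xFF
    first = (base & 0xFF00) | low            # high byte not yet fixed
    correct = (base + index) & 0xFFFF
    reads = [first]
    if first != correct:
        reads.append(correct)                # the extra re-read cycle
    return reads


print([hex(a) for a in absolute_indexed_read(0xDCFD, 0x10)])
```

With base $DCFD and index $10 the list contains two addresses, one in each CIA's register area.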
|
|
|
|
Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
|
|
SLO, SRE, RLA, RRA, ISB, DCP)
|
|
|
|
# address R/W description
|
|
--- --------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch low byte of address, increment PC
|
|
3 PC R fetch high byte of address,
|
|
add index register X to low address byte,
|
|
increment PC
|
|
4 address+X* R read from effective address,
|
|
fix the high byte of effective address
|
|
5 address+X R re-read from effective address
|
|
6 address+X W write the value back to effective address,
|
|
and do the operation on it
|
|
7 address+X W write the new value to effective address
|
|
|
|
Notes: * The high byte of the effective address may be invalid
|
|
at this time, i.e. it may be smaller by $100.
|
|
|
|
Write instructions (STA, STX, STY, SHA, SHX, SHY)
|
|
|
|
# address R/W description
|
|
--- --------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch low byte of address, increment PC
|
|
3 PC R fetch high byte of address,
|
|
add index register to low address byte,
|
|
increment PC
|
|
4 address+I* R read from effective address,
|
|
fix the high byte of effective address
|
|
5 address+I W write to effective address
|
|
|
|
Notes: I denotes either index register (X or Y).
|
|
|
|
* The high byte of the effective address may be invalid
|
|
at this time, i.e. it may be smaller by $100. Because
|
|
the processor cannot undo a write to an invalid
|
|
address, it always reads from the address first.
|
|
|
|
Relative addressing (BCC, BCS, BNE, BEQ, BPL, BMI, BVC, BVS)
|
|
|
|
        #   address  R/W description
       --- --------- --- ---------------------------------------------
        1     PC      R  fetch opcode, increment PC
        2     PC      R  fetch operand, increment PC
        3     PC      R  Fetch opcode of next instruction,
                         If branch is taken, add operand to PCL.
                         Otherwise increment PC.
        4+    PC*     R  Fetch opcode of next instruction.
                         Fix PCH. If it did not change, increment PC.
        5!    PC      R  Fetch opcode of next instruction,
                         increment PC.
|
|
|
|
Notes: The opcode fetch of the next instruction is included in
|
|
this diagram for illustration purposes. When determining
|
|
real execution times, remember to subtract the last
|
|
cycle.
|
|
|
|
* The high byte of Program Counter (PCH) may be invalid
|
|
at this time, i.e. it may be smaller or bigger by $100.
|
|
|
|
+ If branch is taken, this cycle will be executed.
|
|
|
|
! If branch occurs to different page, this cycle will be
|
|
executed.
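The branch timing above can be expressed as a short Python sketch. It is a model of the table only, with a hypothetical function name; `opcode_addr` is the address of the branch opcode and `operand` is the raw operand byte.

```python
def branch_trace(opcode_addr, operand, taken):
    """Return the addresses read while executing a relative branch.

    The final entry is the opcode fetch of the next instruction, as in the
    table above, so the branch itself takes len(result) - 1 cycles.
    """
    offset = operand - 0x100 if operand & 0x80 else operand  # sign-extend
    next_pc = (opcode_addr + 2) & 0xFFFF
    reads = [opcode_addr, (opcode_addr + 1) & 0xFFFF, next_pc]
    if not taken:
        return reads
    target = (next_pc + offset) & 0xFFFF
    unfixed = (next_pc & 0xFF00) | (target & 0xFF)   # PCL updated, PCH not yet
    reads.append(unfixed)
    if unfixed != target:
        reads.append(target)                         # extra cycle after fixing PCH
    return reads
```

A branch that is not taken gives three entries (two cycles), a taken branch within the page gives four (three cycles), and a taken branch to another page gives five (four cycles).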
|
|
|
|
Indexed indirect addressing
|
|
|
|
Read instructions (LDA, ORA, EOR, AND, ADC, CMP, SBC, LAX)
|
|
|
|
# address R/W description
|
|
--- ----------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch pointer address, increment PC
|
|
3 pointer R read from the address, add X to it
|
|
4 pointer+X R fetch effective address low
|
|
5 pointer+X+1 R fetch effective address high
|
|
6 address R read from effective address
|
|
|
|
Note: The effective address is always fetched from zero page,
|
|
i.e. the zero page boundary crossing is not handled.
|
|
|
|
Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
|
|
|
|
# address R/W description
|
|
--- ----------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch pointer address, increment PC
|
|
3 pointer R read from the address, add X to it
|
|
4 pointer+X R fetch effective address low
|
|
5 pointer+X+1 R fetch effective address high
|
|
6 address R read from effective address
|
|
7 address W write the value back to effective address,
|
|
and do the operation on it
|
|
8 address W write the new value to effective address
|
|
|
|
Note: The effective address is always fetched from zero page,
|
|
i.e. the zero page boundary crossing is not handled.
|
|
|
|
Write instructions (STA, SAX)
|
|
|
|
# address R/W description
|
|
--- ----------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch pointer address, increment PC
|
|
3 pointer R read from the address, add X to it
|
|
4 pointer+X R fetch effective address low
|
|
5 pointer+X+1 R fetch effective address high
|
|
6 address W write to effective address
|
|
|
|
Note: The effective address is always fetched from zero page,
|
|
i.e. the zero page boundary crossing is not handled.
|
|
|
|
Indirect indexed addressing
|
|
|
|
Read instructions (LDA, EOR, AND, ORA, ADC, SBC, CMP)
|
|
|
|
# address R/W description
|
|
--- ----------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch pointer address, increment PC
|
|
3 pointer R fetch effective address low
|
|
4 pointer+1 R fetch effective address high,
|
|
add Y to low byte of effective address
|
|
5 address+Y* R read from effective address,
|
|
fix high byte of effective address
|
|
6+ address+Y R read from effective address
|
|
|
|
Notes: The effective address is always fetched from zero page,
|
|
i.e. the zero page boundary crossing is not handled.
|
|
|
|
* The high byte of the effective address may be invalid
|
|
at this time, i.e. it may be smaller by $100.
|
|
|
|
+ This cycle will be executed only if the effective address
|
|
was invalid during cycle #5, i.e. page boundary was crossed.
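Both notes apply at once in this addressing mode: the pointer never leaves zero page, and the data read may first hit the wrong page. A hedged Python sketch, assuming `memory` is a 64 KB byte array and using a made-up function name:

```python
def indirect_indexed_read(memory, zp, y):
    """Return every address read by LDA (zp),Y after the operand fetch."""
    ptr_lo = zp & 0xFF
    ptr_hi = (zp + 1) & 0xFF                 # pointer never leaves zero page
    base = memory[ptr_lo] | (memory[ptr_hi] << 8)
    first = (base & 0xFF00) | ((base + y) & 0xFF)   # high byte not yet fixed
    correct = (base + y) & 0xFFFF
    reads = [ptr_lo, ptr_hi, first]
    if first != correct:
        reads.append(correct)                # cycle 6, only on page crossing
    return reads
```

The returned list corresponds to cycles 3 onwards of the table above.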
|
|
|
|
Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
|
|
|
|
# address R/W description
|
|
--- ----------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch pointer address, increment PC
|
|
3 pointer R fetch effective address low
|
|
4 pointer+1 R fetch effective address high,
|
|
add Y to low byte of effective address
|
|
5 address+Y* R read from effective address,
|
|
fix high byte of effective address
|
|
6 address+Y R read from effective address
|
|
7 address+Y W write the value back to effective address,
|
|
and do the operation on it
|
|
8 address+Y W write the new value to effective address
|
|
|
|
Notes: The effective address is always fetched from zero page,
|
|
i.e. the zero page boundary crossing is not handled.
|
|
|
|
* The high byte of the effective address may be invalid
|
|
at this time, i.e. it may be smaller by $100.
|
|
|
|
Write instructions (STA, SHA)
|
|
|
|
# address R/W description
|
|
--- ----------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch pointer address, increment PC
|
|
3 pointer R fetch effective address low
|
|
4 pointer+1 R fetch effective address high,
|
|
add Y to low byte of effective address
|
|
5 address+Y* R read from effective address,
|
|
fix high byte of effective address
|
|
6 address+Y W write to effective address
|
|
|
|
Notes: The effective address is always fetched from zero page,
|
|
i.e. the zero page boundary crossing is not handled.
|
|
|
|
* The high byte of the effective address may be invalid
|
|
at this time, i.e. it may be smaller by $100.
|
|
|
|
Absolute indirect addressing (JMP)
|
|
|
|
# address R/W description
|
|
--- --------- --- ------------------------------------------
|
|
1 PC R fetch opcode, increment PC
|
|
2 PC R fetch pointer address low, increment PC
|
|
3 PC R fetch pointer address high, increment PC
|
|
4 pointer R fetch low address to latch
|
|
5 pointer+1* R fetch PCH, copy latch to PCL
|
|
|
|
Note: * The PCH will always be fetched from the same page
|
|
as PCL, i.e. page boundary crossing is not handled.
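The pointer fetch of the indirect JMP can be sketched in Python as follows. The function name is an assumption of this example; the behaviour is the NMOS one described in the note.

```python
def jmp_indirect_fetches(pointer):
    """Return the two addresses JMP (pointer) reads on an NMOS 65xx."""
    low_addr = pointer & 0xFFFF
    # Only the low byte of the pointer is incremented; the page stays the same.
    high_addr = (low_addr & 0xFF00) | ((low_addr + 1) & 0xFF)
    return low_addr, high_addr


print([hex(a) for a in jmp_indirect_fetches(0x01FF)])  # JMP ($01FF)
```

An emulator that computes the second address as pointer+1 will behave like the CMOS parts instead.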
|
|
|
|
How Real Programmers Acknowledge Interrupts
|
|
|
|
With RMW instructions:
|
|
|
|
; beginning of combined raster/timer interrupt routine
|
|
LSR $D019 ; clear VIC interrupts, read raster interrupt flag to C
|
|
BCS raster ; jump if VIC caused an interrupt
|
|
... ; timer interrupt routine
|
|
|
|
Operational diagram of LSR $D019:
|
|
|
|
        #  data  address  R/W description
       --- ----  -------  --- ---------------------------------
        1   4E     PC      R  fetch opcode
        2   19    PC+1     R  fetch address low
        3   D0    PC+2     R  fetch address high
        4   xx    $D019    R  read memory
        5   xx    $D019    W  write the value back, shift right
        6  xx/2   $D019    W  write the new value back
|
|
|
|
The 5th cycle acknowledges the interrupt by writing the same
|
|
value back. If only raster interrupts are used, the 6th cycle
|
|
has no effect on the VIC. (It might also acknowledge some
|
|
other interrupts.)
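Why the double write acknowledges the interrupt can be shown with a toy model in Python. The class below is a simplification assumed for this example (every bit is a latch that is cleared by writing a 1 to it); it is not a model of the complete VIC-II register.

```python
class InterruptLatch:
    """Toy model of a register whose latched bits are cleared by writing 1s."""

    def __init__(self, value):
        self.value = value
        self.writes = []

    def read(self):
        return self.value

    def write(self, data):
        self.writes.append(data)
        self.value &= ~data & 0xFF   # a 1 bit in `data` clears that latch bit


def lsr_memory(register):
    """Perform the three data cycles of LSR on `register`; return the carry."""
    old = register.read()            # cycle 4: read
    register.write(old)              # cycle 5: write the unmodified value
    register.write(old >> 1)         # cycle 6: write the shifted value
    return old & 1
```

The first write carries a 1 in every bit that was set, so all pending latches are cleared; the carry tells the caller whether bit 0 (the raster interrupt) was among them.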
|
|
|
|
With indexed addressing:
|
|
|
|
; acknowledge interrupts to both CIAs
|
|
LDX #$10
|
|
LDA $DCFD,X
|
|
|
|
Operational diagram of LDA $DCFD,X:
|
|
|
|
# data address R/W description
|
|
--- ---- ------- --- ---------------------------------
|
|
1 BD PC R fetch opcode
|
|
2 FD PC+1 R fetch address low
|
|
3 DC PC+2 R fetch address high, add X to address low
|
|
4 xx $DC0D R read from address, fix high byte of address
|
|
5 yy $DD0D R read from right address
|
|
|
|
; acknowledge interrupts to CIA 2
|
|
LDX #$10
|
|
STA $DDFD,X
|
|
|
|
Operational diagram of STA $DDFD,X:
|
|
|
|
        #  data  address  R/W description
       --- ----  -------  --- ---------------------------------
        1   9D     PC      R  fetch opcode
        2   FD    PC+1     R  fetch address low
        3   DD    PC+2     R  fetch address high, add X to address low
        4   xx    $DD0D    R  read from address, fix high byte of address
        5   ac    $DE0D    W  write to right address
|
|
|
|
With branch instructions:
|
|
|
|
; acknowledge interrupts to CIA 2
|
|
LDA #$00 ; clear N flag
|
|
JMP $DD0A
|
|
DD0A BPL $DC9D ; branch
|
|
DC9D BRK ; return
|
|
|
|
You need the following preparations to initialize the CIA registers:
|
|
|
|
LDA #$91 ; argument of BPL
|
|
STA $DD0B
|
|
LDA #$10 ; BPL
|
|
STA $DD0A
|
|
STA $DD08 ; load the ToD values from the latches
|
|
LDA $DD0B ; freeze the ToD display
|
|
LDA #$7F
|
|
STA $DC0D ; ensure that $DC0D is $00
|
|
|
|
Operational diagram of BPL $DC9D:
|
|
|
|
# data address R/W description
|
|
--- ---- ------- --- ---------------------------------
|
|
1 10 $DD0A R fetch opcode
|
|
2 91 $DD0B R fetch argument
|
|
3 xx $DD0C R fetch opcode, add argument to PCL
|
|
4 yy $DD9D R fetch opcode, fix PCH
|
|
( 5 00 $DC9D R fetch opcode )
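The operand bytes used in these branch tricks follow from the usual rule that the offset is relative to the address after the branch instruction. A small Python helper, with a name invented for this example, makes them easy to check:

```python
def branch_operand(branch_addr, target):
    """Return the operand byte for a relative branch at `branch_addr`."""
    offset = target - (branch_addr + 2)      # relative to the following instruction
    if not -128 <= offset <= 127:
        raise ValueError("target out of range for a relative branch")
    return offset & 0xFF


print(hex(branch_operand(0xDD0A, 0xDC9D)))   # operand stored at $DD0B
```

The same helper can be used for the other BPL examples in this section.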
|
|
|
|
; acknowledge interrupts to CIA 1
|
|
LSR ; clear N flag
|
|
JMP $DCFA
|
|
DCFA BPL $DD0D
|
|
DD0D BRK
|
|
|
|
; Again you need to set the ToD registers of CIA 1 and the
|
|
; Interrupt Control Register of CIA 2 first.
|
|
|
|
Operational diagram of BPL $DD0D:
|
|
|
|
# data address R/W description
|
|
--- ---- ------- --- ---------------------------------
|
|
1 10 $DCFA R fetch opcode
|
|
2 11 $DCFB R fetch argument
|
|
3 xx $DCFC R fetch opcode, add argument to PCL
|
|
4 yy $DC0D R fetch opcode, fix PCH
|
|
( 5 00 $DD0D R fetch opcode )
|
|
|
|
; acknowledge interrupts to CIA 2 automagically
|
|
; preparations
|
|
LDA #$7F
|
|
STA $DD0D ; disable all interrupt sources of CIA2
|
|
LDA $DD0E
|
|
AND #$BE ; ensure that $DD0C remains constant
|
|
STA $DD0E ; and stop the timer
|
|
LDA #$FD
|
|
STA $DD0C ; parameter of BPL
|
|
LDA #$10
|
|
STA $DD0B ; BPL
|
|
LDA #$40
|
|
STA $DD0A ; RTI/parameter of LSR
|
|
LDA #$46
|
|
STA $DD09 ; LSR
|
|
STA $DD08 ; load the ToD values from the latches
|
|
LDA $DD0B ; freeze the ToD display
|
|
LDA #$09
|
|
STA $0318
|
|
LDA #$DD
|
|
STA $0319 ; change NMI vector to $DD09
|
|
LDA #$FF ; Try changing this instruction's operand
|
|
STA $DD05 ; (see comment below).
|
|
LDA #$FF
|
|
STA $DD04 ; set interrupt frequency to 1/65536 cycles
|
|
LDA $DD0E
|
|
AND #$80
|
|
ORA #$11
|
|
LDX #$81
|
|
STX $DD0D ; enable timer interrupt
|
|
STA $DD0E ; start timer
|
|
|
|
LDA #$00 ; To see that the interrupts really occur,
|
|
STA $D011 ; use something like this and see how
|
|
LOOP DEC $D020 ; changing the byte loaded to $DD05 from
|
|
BNE LOOP ; #$FF to #$0F changes the image.
|
|
|
|
When an NMI occurs, the processor jumps to Kernal code, which jumps to
|
|
($0318), which points to the following routine:
|
|
|
|
DD09 LSR $40 ; clear N flag
|
|
BPL $DD0A ; Note: $DD0A contains RTI.
|
|
|
|
Operational diagram of BPL $DD0A:
|
|
|
|
        #  data  address  R/W description
       --- ----  -------  --- ---------------------------------
        1   10    $DD0B    R  fetch opcode
        2   FD    $DD0C    R  fetch argument
        3   xx    $DD0D    R  fetch opcode, add argument to PCL
        4   40    $DD0A    R  fetch opcode, (fix PCH)
|
|
|
|
With RTI:
|
|
|
|
; the fastest possible interrupt handler in the 6500 family
|
|
; preparations
|
|
SEI
|
|
LDA $01 ; disable ROM and enable I/O
|
|
AND #$FD
|
|
ORA #$05
|
|
STA $01
|
|
LDA #$7F
|
|
STA $DD0D ; disable all interrupt sources of CIA 2
|
|
LDA $DD0E
|
|
AND #$BE ; ensure that $DD0C remains constant
|
|
STA $DD0E ; and stop the timer
|
|
LDA #$40
|
|
STA $DD0C ; store RTI to $DD0C
|
|
LDA #$0C
|
|
STA $FFFA
|
|
LDA #$DD
|
|
STA $FFFB ; change NMI vector to $DD0C
|
|
LDA #$FF ; Try changing this instruction's operand
|
|
STA $DD05 ; (see comment below).
|
|
LDA #$FF
|
|
STA $DD04 ; set interrupt frequency to 1/65536 cycles
|
|
LDA $DD0E
|
|
AND #$80
|
|
ORA #$11
|
|
LDX #$81
|
|
STX $DD0D ; enable timer interrupt
|
|
STA $DD0E ; start timer
|
|
|
|
LDA #$00 ; To see that the interrupts really occur,
|
|
STA $D011 ; use something like this and see how
|
|
LOOP DEC $D020 ; changing the byte loaded to $DD05 from
|
|
BNE LOOP ; #$FF to #$0F changes the image.
|
|
|
|
When an NMI occurs, the processor fetches the vector at $FFFA/$FFFB from
RAM (the Kernal ROM is switched off), which points to the following routine:
|
|
|
|
DD0C RTI
|
|
|
|
How on earth can this clear the interrupts? Remember, the
|
|
processor always fetches two successive bytes for each
|
|
instruction.
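The answer can be read off the bus accesses of RTI. The following Python sketch lists them according to the RTI timing table given earlier; `rti_trace` is a hypothetical name, `pc` is the address of the RTI opcode and `s` the stack pointer before the instruction.

```python
def rti_trace(pc, s):
    """Return the addresses read during the six cycles of RTI."""
    def stack(offset):
        # The stack lives in page 1 and S wraps around within it.
        return 0x0100 + ((s + offset) & 0xFF)

    return [
        pc & 0xFFFF,           # 1: fetch opcode
        (pc + 1) & 0xFFFF,     # 2: read and discard the next byte
        stack(0),              # 3: dummy stack read while S is incremented
        stack(1),              # 4: pull P
        stack(2),              # 5: pull PCL
        stack(3),              # 6: pull PCH
    ]


print(hex(rti_trace(0xDD0C, 0xFC)[1]))
```

With the RTI placed at $DD0C, the second cycle reads $DD0D, the interrupt control register of CIA 2, and reading that register is what acknowledges the interrupt.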
|
|
|
|
A little more practical version of this is redirecting the NMI
|
|
(or IRQ) to your own routine, whose last instruction is JMP
|
|
$DD0C or JMP $DC0C. To confuse things further, change the 0
|
|
in the address to a hexadecimal digit different from the one
|
|
you used when writing the RTI.
|
|
|
|
Or you can combine the latter two methods:
|
|
|
|
DD09 LSR $xx ; xx is any appropriate BCD value 00-59.
|
|
BPL $DCFC
|
|
DCFC RTI
|
|
|
|
This example acknowledges interrupts to both CIAs.
|
|
|
|
If you want to confuse the examiners of your code, you can use any
|
|
of these techniques. Although these examples use no undefined opcodes,
|
|
they do not necessarily run correctly on CMOS processors. However, the
|
|
RTI example should run on 65C02 and 65C816, and the latter branch
|
|
instruction example might work as well.
|
|
|
|
The RMW instruction method has been used in some demos; the others were
developed by Marko Mäkelä. His favourite is the automagical RTI
|
|
method, although it does not have any practical applications, except
|
|
for some time dependent data decryption routines for very complicated
|
|
copy protections.
|
|
|
|
|