diff --git a/M6502/documentation/6502_cpu.txt b/M6502/documentation/6502_cpu.txt new file mode 100644 index 0000000..126bb66 --- /dev/null +++ b/M6502/documentation/6502_cpu.txt @@ -0,0 +1,1534 @@ +# +# $Id: 64doc,v 1.8 1994/06/03 19:50:04 jopi Exp $ +# +# This file is part of Commodore 64 emulator +# and Program Development System. +# +# See README for copyright notice +# +# This file contains documentation for 6502/6510/8500/8502 instruction set. +# +# +# Written by +# John West (john@ucc.gu.uwa.edu.au) +# Marko MŠkelŠ (msmakela@kruuna.helsinki.fi) +# +# +# $Log: 64doc,v $ +# Revision 1.8 1994/06/03 19:50:04 jopi +# Patchlevel 2 +# +# Revision 1.7 1994/04/15 13:07:04 jopi +# 65xx Register descriptions added +# +# Revision 1.6 1994/02/18 16:09:36 jopi +# +# Revision 1.5 1994/01/26 16:08:37 jopi +# X64 version 0.2 PL 1 +# +# Revision 1.4 1993/11/10 01:55:34 jopi +# +# Revision 1.3 93/06/21 13:37:18 jopi +# X64 version 0.2 PL 0 +# +# Revision 1.2 93/06/21 13:07:15 jopi +# *** empty log message *** +# +# + + Note: To extract the uuencoded ML programs in this article most + easily you may use e.g. "uud" by Edwin Kremer , + which extracts them all at once. + + +Documentation for the NMOS 65xx/85xx Instruction Set + + 6510 Instructions by Addressing Modes + 6502 Registers + 6510/8502 Undocumented Commands + Register selection for load and store + Decimal mode in NMOS 6500 series + 6510 features + Different CPU types + 6510 Instruction Timing + How Real Programmers Acknowledge Interrupts + Memory Management + Autostart Code + Notes + References + + +6510 Instructions by Addressing Modes + +off- ++++++++++ Positive ++++++++++ ---------- Negative ---------- +set 00 20 40 60 80 a0 c0 e0 mode + ++00 BRK JSR RTI RTS NOP* LDY CPY CPX Impl/immed ++01 ORA AND EOR ADC STA LDA CMP SBC (indir,x) ++02 t t t t NOP*t LDX NOP*t NOP*t ? /immed ++03 SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* (indir,x) ++04 NOP* BIT NOP* NOP* STY LDY CPY CPX Zeropage ++05 ORA AND EOR ADC STA LDA CMP SBC Zeropage ++06 ASL ROL LSR ROR STX LDX DEC INC Zeropage ++07 SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* Zeropage + ++08 PHP PLP PHA PLA DEY TAY INY INX Implied ++09 ORA AND EOR ADC NOP* LDA CMP SBC Immediate ++0a ASL ROL LSR ROR TXA TAX DEX NOP Accu/impl ++0b ANC** ANC** ASR** ARR** ANE** LXA** SBX** SBC* Immediate ++0c NOP* BIT JMP JMP () STY LDY CPY CPX Absolute ++0d ORA AND EOR ADC STA LDA CMP SBC Absolute ++0e ASL ROL LSR ROR STX LDX DEC INC Absolute ++0f SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* Absolute + ++10 BPL BMI BVC BVS BCC BCS BNE BEQ Relative ++11 ORA AND EOR ADC STA LDA CMP SBC (indir),y ++12 t t t t t t t t ? ++13 SLO* RLA* SRE* RRA* SHA** LAX* DCP* ISB* (indir),y ++14 NOP* NOP* NOP* NOP* STY LDY NOP* NOP* Zeropage,x ++15 ORA AND EOR ADC STA LDA CMP SBC Zeropage,x ++16 ASL ROL LSR ROR STX y) LDX y) DEC INC Zeropage,x ++17 SLO* RLA* SRE* RRA* SAX* y) LAX* y) DCP* ISB* Zeropage,x + ++18 CLC SEC CLI SEI TYA CLV CLD SED Implied ++19 ORA AND EOR ADC STA LDA CMP SBC Absolute,y ++1a NOP* NOP* NOP* NOP* TXS TSX NOP* NOP* Implied ++1b SLO* RLA* SRE* RRA* SHS** LAS** DCP* ISB* Absolute,y ++1c NOP* NOP* NOP* NOP* SHY** LDY NOP* NOP* Absolute,x ++1d ORA AND EOR ADC STA LDA CMP SBC Absolute,x ++1e ASL ROL LSR ROR SHX**y) LDX y) DEC INC Absolute,x ++1f SLO* RLA* SRE* RRA* SHA**y) LAX* y) DCP* ISB* Absolute,x + + ROR intruction is available on MC650x microprocessors after + June, 1976. + + Legend: + + t Jams the machine + *t Jams very rarely + * Undocumented command + ** Unusual operation + y) indexed using Y instead of X + () indirect instead of absolute + +Note that the NOP instructions do have other addressing modes than the +implied addressing. The NOP instruction is just like any other load +instruction, except it does not store the result anywhere nor affects the +flags. + +6502 Registers + +The NMOS 65xx processors are not ruined with too many registers. In addition +to that, the registers are mostly 8-bit. Here is a brief description of each +register: + + PC Program Counter + This register points the address from which the next instruction + byte (opcode or parameter) will be fetched. Unlike other + registers, this one is 16 bits in length. The low and high 8-bit + halves of the register are called PCL and PCH, respectively. The + Program Counter may be read by pushing its value on the stack. + This can be done either by jumping to a subroutine or by causing + an interrupt. + S Stack pointer + The NMOS 65xx processors have 256 bytes of stack memory, ranging + from $0100 to $01FF. The S register is a 8-bit offset to the stack + page. In other words, whenever anything is being pushed on the + stack, it will be stored to the address $0100+S. + + The Stack pointer can be read and written by transfering its value + to or from the index register X (see below) with the TSX and TXS + instructions. + P Processor status + This 8-bit register stores the state of the processor. The bits in + this register are called flags. Most of the flags have something + to do with arithmetic operations. + + The P register can be read by pushing it on the stack (with PHP or + by causing an interrupt). If you only need to read one flag, you + can use the branch instructions. Setting the flags is possible by + pulling the P register from stack or by using the flag set or + clear instructions. + + Following is a list of the flags, starting from the 8th bit of the + P register (bit 7, value $80): + N Negative flag + This flag will be set after any arithmetic operations + (when any of the registers A, X or Y is being loaded + with a value). Generally, the N flag will be copied from + the topmost bit of the register being loaded. + + Note that TXS (Transfer X to S) is not an arithmetic + operation. Also note that the BIT instruction affects + the Negative flag just like arithmetic operations. + Finally, the Negative flag behaves differently in + Decimal operations (see description below). + V oVerflow flag + Like the Negative flag, this flag is intended to be used + with 8-bit signed integer numbers. The flag will be + affected by addition and subtraction, the instructions + PLP, CLV and BIT, and the hardware signal -SO. Note that + there is no SEV instruction, even though the MOS + engineers loved to use East European abbreviations, like + DDR (Deutsche Demokratische Republik vs. Data Direction + Register). (The Russian abbreviation for their former + trade association COMECON is SEV.) The -SO (Set + Overflow) signal is available on some processors, at + least the 6502, to set the V flag. This enables response + to an I/O activity in equal or less than three clock + cycles when using a BVC instruction branching to itself + ($50 $FE). + + The CLV instruction clears the V flag, and the PLP and + BIT instructions copy the flag value from the bit 6 of + the topmost stack entry or from memory. + + After a binary addition or subtraction, the V flag will + be set on a sign overflow, cleared otherwise. What is a + sign overflow? For instance, if you are trying to add + 123 and 45 together, the result (168) does not fit in a + 8-bit signed integer (upper limit 127 and lower limit + -128). Similarly, adding -123 to -45 causes the + overflow, just like subtracting -45 from 123 or 123 from + -45 would do. + + Like the N flag, the V flag will not be set as expected + in the Decimal mode. Later in this document is a precise + operation description. + + A common misbelief is that the V flag could only be set + by arithmetic operations, not cleared. + 1 unused flag + To the current knowledge, this flag is always 1. + B Break flag + This flag is used to distinguish software (BRK) + interrupts from hardware interrupts (IRQ or NMI). The B + flag is always set except when the P register is being + pushed on stack when jumping to an interrupt routine to + process only a hardware interrupt. + + The official NMOS 65xx documentation claims that the BRK + instruction could only cause a jump to the IRQ vector + ($FFFE). However, if an NMI interrupt occurs while + executing a BRK instruction, the processor will jump to + the NMI vector ($FFFA), and the P register will be + pushed on the stack with the B flag set. + D Decimal mode flag + This flag is used to select the (Binary Coded) Decimal + mode for addition and subtraction. In most applications, + the flag is zero. + + The Decimal mode has many oddities, and it operates + differently on CMOS processors. See the description of + the ADC, SBC and ARR instructions below. + I Interrupt disable flag + This flag can be used to prevent the processor from + jumping to the IRQ handler vector ($FFFE) whenever the + hardware line -IRQ is active. The flag will be + automatically set after taking an interrupt, so that the + processor would not keep jumping to the interrupt + routine if the -IRQ signal remains low for several clock + cycles. + Z Zero flag + The Zero flag will be affected in the same cases than + the Negative flag. Generally, it will be set if an + arithmetic register is being loaded with the value zero, + and cleared otherwise. The flag will behave differently + in Decimal operations. + C Carry flag + This flag is used in additions, subtractions, + comparisons and bit rotations. In additions and + subtractions, it acts as a 9th bit and lets you to chain + operations to calculate with bigger than 8-bit numbers. + When subtracting, the Carry flag is the negative of + Borrow: if an overflow occurs, the flag will be clear, + otherwise set. Comparisons are a special case of + subtraction: they assume Carry flag set and Decimal flag + clear, and do not store the result of the subtraction + anywhere. + + There are four kinds of bit rotations. All of them store + the bit that is being rotated off to the Carry flag. The + left shifting instructions are ROL and ASL. ROL copies + the initial Carry flag to the lowmost bit of the byte; + ASL always clears it. Similarly, the ROR and LSR + instructions shift to the right. + A Accumulator + The accumulator is the main register for arithmetic and logic + operations. Unlike the index registers X and Y, it has a direct + connection to the Arithmetic and Logic Unit (ALU). This is why + many operations are only available for the accumulator, not the + index registers. + X Index register X + This is the main register for addressing data with indices. It has + a special addressing mode, indexed indirect, which lets you to + have a vector table on the zero page. + Y Index register Y + The Y register has the least operations available. On the other + hand, only it has the indirect indexed addressing mode that + enables access to any memory place without having to use + self-modifying code. + +6510/8502 Undocumented Commands + +-- A brief explanation about what may happen while using don't care states. + + ANE $8B A = (A | #$EE) & X & #byte + same as + A = ((A & #$11 & X) | ( #$EE & X)) & #byte + + In real 6510/8502 the internal parameter #$11 + may occasionally be #$10, #$01 or even #$00. + This occurs when the video chip starts DMA + between the opcode fetch and the parameter fetch + of the instruction. The value probably depends + on the data that was left on the bus by the VIC-II. + + LXA $AB C=Lehti: A = X = ANE + Alternate: A = X = (A & #byte) + + TXA and TAX have to be responsible for these. + + SHA $93,$9F Store (A & X & (ADDR_HI + 1)) + SHX $9E Store (X & (ADDR_HI + 1)) + SHY $9C Store (Y & (ADDR_HI + 1)) + SHS $9B SHA and TXS, where X is replaced by (A & X). + + Note: The value to be stored is copied also + to ADDR_HI if page boundary is crossed. + + SBX $CB Carry and Decimal flags are ignored but the + Carry flag will be set in substraction. This + is due to the CMP command, which is executed + instead of the real SBC. + + ARR $6B This instruction first performs an AND + between the accumulator and the immediate + parameter, then it shifts the accumulator to + the right. However, this is not the whole + truth. See the description below. + +Many undocumented commands do not use AND between registers, the CPU +just throws the bytes to a bus simultaneously and lets the +open-collector drivers perform the AND. I.e. the command called 'SAX', +which is in the STORE section (opcodes $A0...$BF), stores the result +of (A & X) by this way. + +More fortunate is its opposite, 'LAX' which just loads a byte +simultaneously into both A and X. + + $6B ARR + +This instruction seems to be a harmless combination of AND and ROR at +first sight, but it turns out that it affects the V flag and also has +a special kind of decimal mode. This is because the instruction has +inherited some properties of the ADC instruction ($69) in addition to +the ROR ($6A). + +In Binary mode (D flag clear), the instruction effectively does an AND +between the accumulator and the immediate parameter, and then shifts +the accumulator to the right, copying the C flag to the 8th bit. It +sets the Negative and Zero flags just like the ROR would. The ADC code +shows up in the Carry and oVerflow flags. The C flag will be copied +from the bit 6 of the result (which doesn't seem too logical), and the +V flag is the result of an Exclusive OR operation between the bit 6 +and the bit 5 of the result. This makes sense, since the V flag will +be normally set by an Exclusive OR, too. + +In Decimal mode (D flag set), the ARR instruction first performs the +AND and ROR, just like in Binary mode. The N flag will be copied from +the initial C flag, and the Z flag will be set according to the ROR +result, as expected. The V flag will be set if the bit 6 of the +accumulator changed its state between the AND and the ROR, cleared +otherwise. + +Now comes the funny part. If the low nybble of the AND result, +incremented by its lowmost bit, is greater than 5, the low nybble in +the ROR result will be incremented by 6. The low nybble may overflow +as a consequence of this BCD fixup, but the high nybble won't be +adjusted. The high nybble will be BCD fixed in a similar way. If the +high nybble of the AND result, incremented by its lowmost bit, is +greater than 5, the high nybble in the ROR result will be incremented +by 6, and the Carry flag will be set. Otherwise the C flag will be +cleared. + +To help you understand this description, here is a C routine that +illustrates the ARR operation in Decimal mode: + + unsigned + A, /* Accumulator */ + AL, /* low nybble of accumulator */ + AH, /* high nybble of accumulator */ + + C, /* Carry flag */ + Z, /* Zero flag */ + V, /* oVerflow flag */ + N, /* Negative flag */ + + t, /* temporary value */ + s; /* value to be ARRed with Accumulator */ + + t = A & s; /* Perform the AND. */ + + AH = t >> 4; /* Separate the high */ + AL = t & 15; /* and low nybbles. */ + + N = C; /* Set the N and */ + Z = !(A = (t >> 1) | (C << 7)); /* Z flags traditionally */ + V = (t ^ A) & 64; /* and V flag in a weird way. */ + + if (AL + (AL & 1) > 5) /* BCD "fixup" for low nybble. */ + A = (A & 0xF0) | ((A + 6) & 0xF); + + if (C = AH + (AH & 1) > 5) /* Set the Carry flag. */ + A = (A + 0x60) & 0xFF; /* BCD "fixup" for high nybble. */ + + $CB SBX X <- (A & X) - Immediate + +The 'SBX' ($CB) may seem to be very complex operation, even though it +is a combination of the subtraction of accumulator and parameter, as +in the 'CMP' instruction, and the command 'DEX'. As a result, both A +and X are connected to ALU but only the subtraction takes place. Since +the comparison logic was used, the result of subtraction should be +normally ignored, but the 'DEX' now happily stores to X the value of +(A & X) - Immediate. That is why this instruction does not have any +decimal mode, and it does not affect the V flag. Also Carry flag will +be ignored in the subtraction but set according to the result. + + Proof: + +begin 644 vsbx +M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```*D`H#V1*Z`_D2N@09$KJ0>% +M^QBE^VEZJ+$KH#F1*ZD`2"BI`*(`RP`(:-B@.5$K*4#P`E@`H#VQ*SAI`)$K +JD-Z@/[$K:0"1*Y#4J2X@TO\XH$&Q*VD`D2N0Q,;[$+188/_^]_:_OK>V +` +end + + and + +begin 644 sbx +M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI`*!-D2N@3Y$KH%&1*ZD# +MA?L8I?M*2)`#J1@LJ3B@29$K:$J0`ZGX+*G8R)$K&/BXJ?2B8\L)AOP(:(7] +MV#B@3;$KH$\Q*Z!1\2L(1?SP`0!H1?TIM]#XH$VQ*SAI`)$KD,N@3[$K:0"1 +9*Y#!J2X@TO\XH%&Q*VD`D2N0L<;[$))88-#X +` +end + +These test programs show if your machine is compatible with ours +regarding the opcode $CB. The first test, vsbx, proves that SBX does +not affect the V flag. The latter one, sbx, proves the rest of our +theory. The vsbx test tests 33554432 SBX combinations (16777216 +different A, X and Immediate combinations, and two different V flag +states), and the sbx test doubles that amount (16777216*4 D and C flag +combinations). Both tests have run successfully on a C64 and a Vic20. +They ought to run on C16, +4 and the PET series as well. The tests +stop with BRK, if the opcode $CB does not work as expected. Successful +operation ends in RTS. As the tests are very slow, they print dots on +the screen while running so that you know that the machine has not +jammed. On computers running at 1 MHz, the first test prints +approximately one dot every four seconds and a total of 2048 dots, +whereas the second one prints half that amount, one dot every seven +seconds. + +If the tests fail on your machine, please let us know your processor's +part number and revision. If possible, save the executable (after it +has stopped with BRK) under another name and send it to us so that we +know at which stage the program stopped. + +The following program is a Commodore 64 executable that Marko M"akel"a +developed when trying to find out how the V flag is affected by SBX. +(It was believed that the SBX affects the flag in a weird way, and +this program shows how SBX sets the flag differently from SBC.) You +may find the subroutine at $C150 useful when researching other +undocumented instructions' flags. Run the program in a machine +language monitor, as it makes use of the BRK instruction. The result +tables will be written on pages $C2 and $C3. + +begin 644 sbx-c100 +M`,%XH`",#L&,$,&,$L&XJ8*B@LL7AOL(:(7\N#BM#L$M$,'M$L$(Q?OP`B@` +M:$7\\`,@4,'N#L'0U.X0P=#/SB#0[A+!T,<``````````````)BJ\!>M#L$M +L$,'=_\'0":T2P=W_PM`!8,K0Z:T.P2T0P9D`PID`!*T2P9D`PYD`! + +Other undocumented instructions usually cause two preceding opcodes +being executed. However 'NOP' seems to completely disappear from 'SBC' +code $EB. + +The most difficult to comprehend are the rest of the instructions +located on the '$0B' line. + +All the instructions located at the positive (left) side of this line +should rotate either memory or the accumulator, but the addressing +mode turns out to be immediate! No problem. Just read the operand, let +it be ANDed with the accumulator and finally use accumulator +addressing mode for the instructions above them. + +RELIGION_MODE_ON +/* This part of the document is not accurate. You can + read it as a fairy tale, but do not count on it when + performing your own measurements. */ + +The rest two instructions on the same line, called 'ANE' and 'LXA' +($8B and $AB respectively) often give quite unpredictable results. +However, the most usual operation is to store ((A | #$ee) & X & #$nn) +to accumulator. Note that this does not work reliably in a real 64! +In the Commodore 128 the opcode $8B uses values 8C, CC, EE, and +occasionally 0C and 8E for the OR instead of EE,EF,FE and FF used in +the C64. With a C128 running at 2 MHz #$EE is always used. Opcode $AB +does not cause this OR taking place on 8502 while 6510 always performs +it. Note that this behaviour depends on processor and/or video chip +revision. + +Let's take a closer look at $8B (6510). + + A <- X & D & (A | VAL) + + where VAL comes from this table: + + X high D high D low VAL + even even --- $EE (1) + even odd --- $EE + odd even --- $EE + odd odd 0 $EE + odd odd not 0 $FE (2) + +(1) If the bottom 2 bits of A are both 1, then the LSB of the result may + be 0. The values of X and D are different every time I run the test. + This appears to be very rare. +(2) VAL is $FE most of the time. Sometimes it is $EE - it seems to be random, + not related to any of the data. This is much more common than (1). + + In decimal mode, VAL is usually $FE. + +Two different functions have been discovered for LAX, opcode $AB. One +is A = X = ANE (see above) and the other, encountered with 6510 and +8502, is less complicated A = X = (A & #byte). However, according to +what is reported, the version altering only the lowest bits of each +nybble seems to be more common. + +What happens, is that $AB loads a value into both A and X, ANDing the +low bit of each nybble with the corresponding bit of the old +A. However, there are exceptions. Sometimes the low bit is cleared +even when A contains a '1', and sometimes other bits are cleared. The +exceptions seem random (they change every time I run the test). Oops - +that was in decimal mode. Much the same with D=0. + +What causes the randomness? Probably it is that it is marginal logic +levels - when too much wired-anding goes on, some of the signals get +very close to the threshold. Perhaps we're seeing some of them step +over it. The low bit of each nybble is special, since it has to cope +with carry differently (remember decimal mode). We never see a '0' +turn into a '1'. + +Since these instructions are unpredictable, they should not be used. + +There is still very strange instruction left, the one named SHA/X/Y, +which is the only one with only indexed addressing modes. Actually, +the commands 'SHA', 'SHX' and 'SHY' are generated by the indexing +algorithm. + +While using indexed addressing, effective address for page boundary +crossing is calculated as soon as possible so it does not slow down +operation. As a result, in the case of SHA/X/Y, the address and data +are processed at the same time making AND between them to take place. +Thus, the value to be stored by SAX, for example, is in fact (A & X & +(ADDR_HI + 1)). On page boundary crossing the same value is copied +also to high byte of the effective address. + +RELIGION_MODE_OFF + + +Register selection for load and store + + bit1 bit0 A X Y + 0 0 x + 0 1 x + 1 0 x + 1 1 x x + +So, A and X are selected by bits 1 and 0 respectively, while + ~(bit1|bit0) enables Y. + +Indexing is determined by bit4, even in relative addressing mode, +which is one kind of indexing. + +Lines containing opcodes xxx000x1 (01 and 03) are treated as absolute +after the effective address has been loaded into CPU. + +Zeropage,y and Absolute,y (codes 10x1 x11x) are distinquished by bit5. + + +Decimal mode in NMOS 6500 series + + Most sources claim that the NMOS 6500 series sets the N, V and Z +flags unpredictably when performing addition or subtraction in decimal +mode. Of course, this is not true. While testing how the flags are +set, I also wanted to see what happens if you use illegal BCD values. + + ADC works in Decimal mode in a quite complicated way. It is amazing +how it can do that all in a single cycle. Here's a C code version of +the instruction: + + unsigned + A, /* Accumulator */ + AL, /* low nybble of accumulator */ + AH, /* high nybble of accumulator */ + + C, /* Carry flag */ + Z, /* Zero flag */ + V, /* oVerflow flag */ + N, /* Negative flag */ + + s; /* value to be added to Accumulator */ + + AL = (A & 15) + (s & 15) + C; /* Calculate the lower nybble. */ + + AH = (A >> 4) + (s >> 4) + (AL > 15); /* Calculate the upper nybble. */ + + if (AL > 9) AL += 6; /* BCD fixup for lower nybble. */ + + Z = ((A + s + C) & 255 != 0); /* Zero flag is set just + like in Binary mode. */ + + /* Negative and Overflow flags are set with the same logic than in + Binary mode, but after fixing the lower nybble. */ + + N = (AH & 8 != 0); + V = ((AH << 4) ^ A) & 128 && !((A ^ s) & 128); + + if (AH > 9) AH += 6; /* BCD fixup for upper nybble. */ + + /* Carry is the only flag set after fixing the result. */ + + C = (AH > 15); + A = ((AH << 4) | (AL & 15)) & 255; + + The C flag is set as the quiche eaters expect, but the N and V flags +are set after fixing the lower nybble but before fixing the upper one. +They use the same logic than binary mode ADC. The Z flag is set before +any BCD fixup, so the D flag does not have any influence on it. + +Proof: The following test program tests all 131072 ADC combinations in + Decimal mode, and aborts with BRK if anything breaks this theory. + If everything goes well, it ends in RTS. + +begin 600 dadc +M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'BI&* A/N$_$B@+)$KH(V1 +M*Q@(I?PI#X7]I?LI#V7]R0J0 FD%J"D/A?VE^RGP9?PI\ C $) ":0^JL @H +ML ?)H) &""@X:5\X!?V%_0AH*3W@ ! ""8"HBD7[$ JE^T7\, 28"4"H**7[ +M9?S0!)@) J@8N/BE^V7\V A%_= G:(3]1?W0(.;[T(?F_-"#:$D8\ )88*D= +0&&4KA?NI &4LA?RI.&S[ A% + +end + + All programs in this chapter have been successfully tested on a Vic20 +and a Commodore 64 and a Commodore 128D in C64 mode. They should run on +C16, +4 and on the PET series as well. If not, please report the problem +to Marko M"akel"a. Each test in this chapter should run in less than a +minute at 1 MHz. + +SBC is much easier. Just like CMP, its flags are not affected by +the D flag. + +Proof: + +begin 600 dsbc-cmp-flags +M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'B@ (3[A/RB XH8:66HL2N@ +M09$KH$R1*XII::BQ*Z!%D2N@4)$K^#BXI?OE_-@(:(7].+BE^^7\"&A%_? ! +5 .;[T./F_-#?RA"_8!@X&#CEY<7% + +end + + The only difference in SBC's operation in decimal mode from binary mode +is the result-fixup: + + unsigned + A, /* Accumulator */ + AL, /* low nybble of accumulator */ + AH, /* high nybble of accumulator */ + + C, /* Carry flag */ + Z, /* Zero flag */ + V, /* oVerflow flag */ + N, /* Negative flag */ + + s; /* value to be added to Accumulator */ + + AL = (A & 15) - (s & 15) - !C; /* Calculate the lower nybble. */ + + if (AL & 16) AL -= 6; /* BCD fixup for lower nybble. */ + + AH = (A >> 4) - (s >> 4) - (AL & 16); /* Calculate the upper nybble. */ + + if (AH & 16) AH -= 6; /* BCD fixup for upper nybble. */ + + /* The flags are set just like in Binary mode. */ + + C = (A - s - !C) & 256 != 0; + Z = (A - s - !C) & 255 != 0; + V = ((A - s - !C) ^ s) & 128 && (A ^ s) & 128; + N = (A - s - !C) & 128 != 0; + + A = ((AH << 4) | (AL & 15)) & 255; + + Again Z flag is set before any BCD fixup. The N and V flags are set +at any time before fixing the high nybble. The C flag may be set in any +phase. + + Decimal subtraction is easier than decimal addition, as you have to +make the BCD fixup only when a nybble overflows. In decimal addition, +you had to verify if the nybble was greater than 9. The processor has +an internal "half carry" flag for the lower nybble, used to trigger +the BCD fixup. When calculating with legal BCD values, the lower nybble +cannot overflow again when fixing it. +So, the processor does not handle overflows while performing the fixup. +Similarly, the BCD fixup occurs in the high nybble only if the value +overflows, i.e. when the C flag will be cleared. + + Because SBC's flags are not affected by the Decimal mode flag, you +could guess that CMP uses the SBC logic, only setting the C flag +first. But the SBX instruction shows that CMP also temporarily clears +the D flag, although it is totally unnecessary. + + The following program, which tests SBC's result and flags, +contains the 6502 version of the pseudo code example above. + +begin 600 dsbc +M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@ 'BI&* A/N$_$B@+)$KH':1 +M*S@(I?PI#X7]I?LI#^7]L /I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL KI7RBP +M#ND/.+ )*+ &Z0^P NE?A/T%_87]*+BE^^7\"&BH.+CXI?OE_-@(1?W0FVB$ +8_47]T)3F^]">YOS0FFA)&- $J3C0B%A@ + +end + + Obviously the undocumented instructions RRA (ROR+ADC) and ISB +(INC+SBC) have inherited also the decimal operation from the official +instructions ADC and SBC. The program droradc proves this statement +for ROR, and the dincsbc test proves this for ISB. Finally, +dincsbc-deccmp proves that ISB's and DCP's (DEC+CMP) flags are not +affected by the D flag. + +begin 644 droradc +M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH(V1 +M*S@(I?PI#X7]I?LI#V7]R0J0`FD%J"D/A?VE^RGP9?PI\`C`$)`":0^JL`@H +ML`?)H)`&""@X:5\X!?V%_0AH*3W@`!`""8"HBD7[$`JE^T7\,`28"4"H**7[ +M9?S0!)@)`J@XN/BE^R;\9_S8"$7]T"=HA/U%_=`@YOO0A>;\T(%H21CP`EA@ +2J1T892N%^ZD`92R%_*DX;/L` +` +end + +begin 644 dincsbc +M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH':1 +M*S@(I?PI#X7]I?LI#^7]L`/I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL`KI7RBP +M#ND/.+`)*+`&Z0^P`NE?A/T%_87]*+BE^^7\"&BH.+CXI?O&_.?\V`A%_="9 +::(3]1?W0DN;[T)SF_-"8:$D8T`2I.-"&6&#\ +` +end + +begin 644 dincsbc-deccmp +M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'B@`(3[A/RB`XH8:7>HL2N@ +M3Y$KH%R1*XII>ZBQ*Z!3D2N@8)$KBFE_J+$KH%61*Z!BD2OX.+BE^^;\Q_S8 +L"&B%_3BXI?OF_,?\"&A%_?`!`.;[T-_F_-#;RA"M8!@X&#CFYL;&Q\?GYP#8 +` +end + + +6510 features + + o PHP always pushes the Break (B) flag as a `1' to the stack. + Jukka Tapanim"aki claimed in C=lehti issue 3/89, on page 27 that the + processor makes a logical OR between the status register's bit 4 + and the bit 8 of the stack pointer register (which is always 1). + He did not give any reasons for this argument, and has refused to clarify + it afterwards. Well, this was not the only error in his article... + + o Indirect addressing modes do not handle page boundary crossing at all. + When the parameter's low byte is $FF, the effective address wraps + around and the CPU fetches high byte from $xx00 instead of $xx00+$0100. + E.g. JMP ($01FF) fetches PCL from $01FF and PCH from $0100, + and LDA ($FF),Y fetches the base address from $FF and $00. + + o Indexed zero page addressing modes never fix the page address on + crossing the zero page boundary. + E.g. LDX #$01 : LDA ($FF,X) loads the effective address from $00 and $01. + + o The processor always fetches the byte following a relative branch + instruction. If the branch is taken, the processor reads then the + opcode from the destination address. If page boundary is crossed, it + first reads a byte from the old page from a location that is bigger + or smaller than the correct address by one page. + + o If you cross a page boundary in any other indexed mode, + the processor reads an incorrect location first, a location that is + smaller by one page. + + o Read-Modify-Write instructions write unmodified data, then modified + (so INC effectively does LDX loc;STX loc;INX;STX loc) + + o -RDY is ignored during writes + (This is why you must wait 3 cycles before doing any DMA -- + the maximum number of consecutive writes is 3, which occurs + during interrupts except -RESET.) + + o Some undefined opcodes may give really unpredictable results. + + o All registers except the Program Counter remain unmodified after -RESET. + (This is why you must preset D and I flags in the RESET handler.) + + +Different CPU types + +The Rockwell data booklet 29651N52 (technical information about R65C00 +microprocessors, dated October 1984), lists the following differences between +NMOS R6502 microprocessor and CMOS R65C00 family: + + + 1. Indexed addressing across page boundary. + NMOS: Extra read of invalid address. + CMOS: Extra read of last instruction byte. + + + 2. Execution of invalid op codes. + NMOS: Some terminate only by reset. Results are undefined. + CMOS: All are NOPs (reserved for future use). + + + 3. Jump indirect, operand = XXFF. + NMOS: Page address does not increment. + CMOS: Page address increments and adds one additional cycle. + + + 4. Read/modify/write instructions at effective address. + NMOS: One read and two write cycles. + CMOS: Two read and one write cycle. + + + 5. Decimal flag. + NMOS: Indeterminate after reset. + CMOS: Initialized to binary mode (D=0) after reset and interrupts. + + + 6. Flags after decimal operation. + NMOS: Invalid N, V and Z flags. + CMOS: Valid flag adds one additional cycle. + + + 7. Interrupt after fetch of BRK instruction. + NMOS: Interrupt vector is loaded, BRK vector is ignored. + CMOS: BRK is executed, then interrupt is executed. + + +6510 Instruction Timing + + The NMOS 6500 series processors always perform at least two reads +for each instruction. In addition to the operation code (opcode), they +fetch the next byte. This is quite efficient, as most instructions are +two or three bytes long. + + The processors also use a sort of pipelining. If an instruction does +not store data in memory on its last cycle, the processor can fetch +the opcode of the next instruction while executing the last cycle. For +instance, the instruction EOR #$FF truly takes three cycles. On the +first cycle, the opcode $49 will be fetched. During the second cycle +the processor decodes the opcode and fetches the parameter #$FF. On +the third cycle, the processor will perform the operation and store +the result to accumulator, but simultaneously it fetches the opcode +for the next instruction. This is why the instruction effectively +takes only two cycles. + + The following tables show what happens on the bus while executing +different kinds of instructions. + + Interrupts + + NMI and IRQ both take 7 cycles. Their timing diagram is much like + BRK's (see below). IRQ will be executed only when the I flag is + clear. IRQ and BRK both set the I flag, whereas the NMI does not + affect its state. + + The processor will usually wait for the current instruction to + complete before executing the interrupt sequence. To process the + interrupt before the next instruction, the interrupt must occur + before the last cycle of the current instruction. + + There is one exception to this rule: the BRK instruction. If a + hardware interrupt (NMI or IRQ) occurs before the fourth (flags + saving) cycle of BRK, the BRK instruction will be skipped, and + the processor will jump to the hardware interrupt vector. This + sequence will always take 7 cycles. + + You do not completely lose the BRK interrupt, the B flag will be + set in the pushed status register if a BRK instruction gets + interrupted. When BRK and IRQ occur at the same time, this does + not cause any problems, as your program will consider it as a + BRK, and the IRQ would occur again after the processor returned + from your BRK routine, unless you cleared the interrupt source in + your BRK handler. But the simultaneous occurrence of NMI and BRK + is far more fatal. If you do not check the B flag in the NMI + routine and subtract two from the return address when needed, the + BRK instruction will be skipped. + + If the NMI and IRQ interrupts overlap each other (one interrupt + occurs before fetching the interrupt vector for the other + interrupt), the processor will most probably jump to the NMI + vector in every case, and then jump to the IRQ vector after + processing the first instruction of the NMI handler. This has not + been measured yet, but the IRQ is very similar to BRK, and many + sources state that the NMI has higher priority than IRQ. However, + it might be that the processor takes the interrupt that comes + later, i.e. you could lose an NMI interrupt if an IRQ occurred in + four cycles after it. + + After finishing the interrupt sequence, the processor will start + to execute the first instruction of the interrupt routine. This + proves that the processor uses a sort of pipelining: it finishes + the current instruction (or interrupt sequence) while reading the + opcode of the next instruction. + + RESET does not push program counter on stack, and it lasts + probably 6 cycles after deactivating the signal. Like NMI, RESET + preserves all registers except PC. + + Instructions accessing the stack + + BRK + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away), + increment PC + 3 $0100,S W push PCH on stack (with B flag set), decrement S + 4 $0100,S W push PCL on stack, decrement S + 5 $0100,S W push P on stack, decrement S + 6 $FFFE R fetch PCL + 7 $FFFF R fetch PCH + + RTI + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + 3 $0100,S R increment S + 4 $0100,S R pull P from stack, increment S + 5 $0100,S R pull PCL from stack, increment S + 6 $0100,S R pull PCH from stack + + RTS + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + 3 $0100,S R increment S + 4 $0100,S R pull PCL from stack, increment S + 5 $0100,S R pull PCH from stack + 6 PC R increment PC + + PHA, PHP + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + 3 $0100,S W push register on stack, decrement S + + PLA, PLP + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + 3 $0100,S R increment S + 4 $0100,S R pull register from stack + + JSR + + # address R/W description + --- ------- --- ------------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch low address byte, increment PC + 3 $0100,S R internal operation (predecrement S?) + 4 $0100,S W push PCH on stack, decrement S + 5 $0100,S W push PCL on stack, decrement S + 6 PC R copy low address byte to PCL, fetch high address + byte to PCH + + Accumulator or implied addressing + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + + Immediate addressing + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch value, increment PC + + Absolute addressing + + JMP + + # address R/W description + --- ------- --- ------------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch low address byte, increment PC + 3 PC R copy low address byte to PCL, fetch high address + byte to PCH + + Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, + LAX, NOP) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, increment PC + 4 address R read from effective address + + Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC, + SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, increment PC + 4 address R read from effective address + 5 address W write the value back to effective address, + and do the operation on it + 6 address W write the new value to effective address + + Write instructions (STA, STX, STY, SAX) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, increment PC + 4 address W write register to effective address + + Zero page addressing + + Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, + LAX, NOP) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from effective address + + Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC, + SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from effective address + 4 address W write the value back to effective address, + and do the operation on it + 5 address W write the new value to effective address + + Write instructions (STA, STX, STY, SAX) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address W write register to effective address + + Zero page indexed addressing + + Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, + LAX, NOP) + + # address R/W description + --- --------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from address, add index register to it + 4 address+I* R read from effective address + + Notes: I denotes either index register (X or Y). + + * The high byte of the effective address is always zero, + i.e. page boundary crossings are not handled. + + Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC, + SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- --------- --- --------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from address, add index register X to it + 4 address+X* R read from effective address + 5 address+X* W write the value back to effective address, + and do the operation on it + 6 address+X* W write the new value to effective address + + Note: * The high byte of the effective address is always zero, + i.e. page boundary crossings are not handled. + + Write instructions (STA, STX, STY, SAX) + + # address R/W description + --- --------- --- ------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from address, add index register to it + 4 address+I* W write to effective address + + Notes: I denotes either index register (X or Y). + + * The high byte of the effective address is always zero, + i.e. page boundary crossings are not handled. + + Absolute indexed addressing + + Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, + LAX, LAE, SHS, NOP) + + # address R/W description + --- --------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, + add index register to low address byte, + increment PC + 4 address+I* R read from effective address, + fix the high byte of effective address + 5+ address+I R re-read from effective address + + Notes: I denotes either index register (X or Y). + + * The high byte of the effective address may be invalid + at this time, i.e. it may be smaller by $100. + + + This cycle will be executed only if the effective address + was invalid during cycle #4, i.e. page boundary was crossed. + + Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC, + SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- --------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, + add index register X to low address byte, + increment PC + 4 address+X* R read from effective address, + fix the high byte of effective address + 5 address+X R re-read from effective address + 6 address+X W write the value back to effective address, + and do the operation on it + 7 address+X W write the new value to effective address + + Notes: * The high byte of the effective address may be invalid + at this time, i.e. it may be smaller by $100. + + Write instructions (STA, STX, STY, SHA, SHX, SHY) + + # address R/W description + --- --------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, + add index register to low address byte, + increment PC + 4 address+I* R read from effective address, + fix the high byte of effective address + 5 address+I W write to effective address + + Notes: I denotes either index register (X or Y). + + * The high byte of the effective address may be invalid + at this time, i.e. it may be smaller by $100. Because + the processor cannot undo a write to an invalid + address, it always reads from the address first. + + Relative addressing (BCC, BCS, BNE, BEQ, BPL, BMI, BVC, BVS) + + # address R/W description + --- --------- --- --------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch operand, increment PC + 3 PC R Fetch opcode of next instruction, + If branch is taken, add operand to PCL. + Otherwise increment PC. + 4+ PC* R Fetch opcode of next instruction. + Fix PCH. If it did not change, increment PC. + 5! PC R Fetch opcode of next instruction, + increment PC. + + Notes: The opcode fetch of the next instruction is included to + this diagram for illustration purposes. When determining + real execution times, remember to subtract the last + cycle. + + * The high byte of Program Counter (PCH) may be invalid + at this time, i.e. it may be smaller or bigger by $100. + + + If branch is taken, this cycle will be executed. + + ! If branch occurs to different page, this cycle will be + executed. + + Indexed indirect addressing + + Read instructions (LDA, ORA, EOR, AND, ADC, CMP, SBC, LAX) + + # address R/W description + --- ----------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch pointer address, increment PC + 3 pointer R read from the address, add X to it + 4 pointer+X R fetch effective address low + 5 pointer+X+1 R fetch effective address high + 6 address R read from effective address + + Note: The effective address is always fetched from zero page, + i.e. the zero page boundary crossing is not handled. + + Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- ----------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch pointer address, increment PC + 3 pointer R read from the address, add X to it + 4 pointer+X R fetch effective address low + 5 pointer+X+1 R fetch effective address high + 6 address R read from effective address + 7 address W write the value back to effective address, + and do the operation on it + 8 address W write the new value to effective address + + Note: The effective address is always fetched from zero page, + i.e. the zero page boundary crossing is not handled. + + Write instructions (STA, SAX) + + # address R/W description + --- ----------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch pointer address, increment PC + 3 pointer R read from the address, add X to it + 4 pointer+X R fetch effective address low + 5 pointer+X+1 R fetch effective address high + 6 address W write to effective address + + Note: The effective address is always fetched from zero page, + i.e. the zero page boundary crossing is not handled. + + Indirect indexed addressing + + Read instructions (LDA, EOR, AND, ORA, ADC, SBC, CMP) + + # address R/W description + --- ----------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch pointer address, increment PC + 3 pointer R fetch effective address low + 4 pointer+1 R fetch effective address high, + add Y to low byte of effective address + 5 address+Y* R read from effective address, + fix high byte of effective address + 6+ address+Y R read from effective address + + Notes: The effective address is always fetched from zero page, + i.e. the zero page boundary crossing is not handled. + + * The high byte of the effective address may be invalid + at this time, i.e. it may be smaller by $100. + + + This cycle will be executed only if the effective address + was invalid during cycle #5, i.e. page boundary was crossed. + + Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- ----------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch pointer address, increment PC + 3 pointer R fetch effective address low + 4 pointer+1 R fetch effective address high, + add Y to low byte of effective address + 5 address+Y* R read from effective address, + fix high byte of effective address + 6 address+Y R read from effective address + 7 address+Y W write the value back to effective address, + and do the operation on it + 8 address+Y W write the new value to effective address + + Notes: The effective address is always fetched from zero page, + i.e. the zero page boundary crossing is not handled. + + * The high byte of the effective address may be invalid + at this time, i.e. it may be smaller by $100. + + Write instructions (STA, SHA) + + # address R/W description + --- ----------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch pointer address, increment PC + 3 pointer R fetch effective address low + 4 pointer+1 R fetch effective address high, + add Y to low byte of effective address + 5 address+Y* R read from effective address, + fix high byte of effective address + 6 address+Y W write to effective address + + Notes: The effective address is always fetched from zero page, + i.e. the zero page boundary crossing is not handled. + + * The high byte of the effective address may be invalid + at this time, i.e. it may be smaller by $100. + + Absolute indirect addressing (JMP) + + # address R/W description + --- --------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch pointer address low, increment PC + 3 PC R fetch pointer address high, increment PC + 4 pointer R fetch low address to latch + 5 pointer+1* R fetch PCH, copy latch to PCL + + Note: * The PCH will always be fetched from the same page + than PCL, i.e. page boundary crossing is not handled. + + How Real Programmers Acknowledge Interrupts + + With RMW instructions: + + ; beginning of combined raster/timer interrupt routine + LSR $D019 ; clear VIC interrupts, read raster interrupt flag to C + BCS raster ; jump if VIC caused an interrupt + ... ; timer interrupt routine + + Operational diagram of LSR $D019: + + # data address R/W + --- ---- ------- --- --------------------------------- + 1 4E PC R fetch opcode + 2 19 PC+1 R fetch address low + 3 D0 PC+2 R fetch address high + 4 xx $D019 R read memory + 5 xx $D019 W write the value back, rotate right + 6 xx/2 $D019 W write the new value back + + The 5th cycle acknowledges the interrupt by writing the same + value back. If only raster interrupts are used, the 6th cycle + has no effect on the VIC. (It might acknowledge also some + other interrupts.) + + With indexed addressing: + + ; acknowledge interrupts to both CIAs + LDX #$10 + LDA $DCFD,X + + Operational diagram of LDA $DCFD,X: + + # data address R/W description + --- ---- ------- --- --------------------------------- + 1 BD PC R fetch opcode + 2 FD PC+1 R fetch address low + 3 DC PC+2 R fetch address high, add X to address low + 4 xx $DC0D R read from address, fix high byte of address + 5 yy $DD0D R read from right address + + ; acknowledge interrupts to CIA 2 + LDX #$10 + STA $DDFD,X + + Operational diagram of STA $DDFD,X: + + # data address R/W description + --- ---- ------- --- --------------------------------- + 1 9D PC R fetch opcode + 2 FD PC+1 R fetch address low + 3 DC PC+2 R fetch address high, add X to address low + 4 xx $DD0D R read from address, fix high byte of address + 5 ac $DE0D W write to right address + + With branch instructions: + + ; acknowledge interrupts to CIA 2 + LDA #$00 ; clear N flag + JMP $DD0A + DD0A BPL $DC9D ; branch + DC9D BRK ; return + + You need the following preparations to initialize the CIA registers: + + LDA #$91 ; argument of BPL + STA $DD0B + LDA #$10 ; BPL + STA $DD0A + STA $DD08 ; load the ToD values from the latches + LDA $DD0B ; freeze the ToD display + LDA #$7F + STA $DC0D ; assure that $DC0D is $00 + + Operational diagram of BPL $DC9D: + + # data address R/W description + --- ---- ------- --- --------------------------------- + 1 10 $DD0A R fetch opcode + 2 91 $DD0B R fetch argument + 3 xx $DD0C R fetch opcode, add argument to PCL + 4 yy $DD9D R fetch opcode, fix PCH + ( 5 00 $DC9D R fetch opcode ) + + ; acknowledge interrupts to CIA 1 + LSR ; clear N flag + JMP $DCFA + DCFA BPL $DD0D + DD0D BRK + + ; Again you need to set the ToD registers of CIA 1 and the + ; Interrupt Control Register of CIA 2 first. + + Operational diagram of BPL $DD0D: + + # data address R/W description + --- ---- ------- --- --------------------------------- + 1 10 $DCFA R fetch opcode + 2 11 $DCFB R fetch argument + 3 xx $DCFC R fetch opcode, add argument to PCL + 4 yy $DC0D R fetch opcode, fix PCH + ( 5 00 $DD0D R fetch opcode ) + + ; acknowledge interrupts to CIA 2 automagically + ; preparations + LDA #$7F + STA $DD0D ; disable all interrupt sources of CIA2 + LDA $DD0E + AND #$BE ; ensure that $DD0C remains constant + STA $DD0E ; and stop the timer + LDA #$FD + STA $DD0C ; parameter of BPL + LDA #$10 + STA $DD0B ; BPL + LDA #$40 + STA $DD0A ; RTI/parameter of LSR + LDA #$46 + STA $DD09 ; LSR + STA $DD08 ; load the ToD values from the latches + LDA $DD0B ; freeze the ToD display + LDA #$09 + STA $0318 + LDA #$DD + STA $0319 ; change NMI vector to $DD09 + LDA #$FF ; Try changing this instruction's operand + STA $DD05 ; (see comment below). + LDA #$FF + STA $DD04 ; set interrupt frequency to 1/65536 cycles + LDA $DD0E + AND #$80 + ORA #$11 + LDX #$81 + STX $DD0D ; enable timer interrupt + STA $DD0E ; start timer + + LDA #$00 ; To see that the interrupts really occur, + STA $D011 ; use something like this and see how + LOOP DEC $D020 ; changing the byte loaded to $DD05 from + BNE LOOP ; #$FF to #$0F changes the image. + + When an NMI occurs, the processor jumps to Kernal code, which jumps to + ($0318), which points to the following routine: + + DD09 LSR $40 ; clear N flag + BPL $DD0A ; Note: $DD0A contains RTI. + + Operational diagram of BPL $DD0A: + + # data address R/W description + --- ---- ------- --- --------------------------------- + 1 10 $DD0B R fetch opcode + 2 11 $DD0C R fetch argument + 3 xx $DD0D R fetch opcode, add argument to PCL + 4 40 $DD0A R fetch opcode, (fix PCH) + + With RTI: + + ; the fastest possible interrupt handler in the 6500 family + ; preparations + SEI + LDA $01 ; disable ROM and enable I/O + AND #$FD + ORA #$05 + STA $01 + LDA #$7F + STA $DD0D ; disable CIA 2's all interrupt sources + LDA $DD0E + AND #$BE ; ensure that $DD0C remains constant + STA $DD0E ; and stop the timer + LDA #$40 + STA $DD0C ; store RTI to $DD0C + LDA #$0C + STA $FFFA + LDA #$DD + STA $FFFB ; change NMI vector to $DD0C + LDA #$FF ; Try changing this instruction's operand + STA $DD05 ; (see comment below). + LDA #$FF + STA $DD04 ; set interrupt frequency to 1/65536 cycles + LDA $DD0E + AND #$80 + ORA #$11 + LDX #$81 + STX $DD0D ; enable timer interrupt + STA $DD0E ; start timer + + LDA #$00 ; To see that the interrupts really occur, + STA $D011 ; use something like this and see how + LOOP DEC $D020 ; changing the byte loaded to $DD05 from + BNE LOOP ; #$FF to #$0F changes the image. + + When an NMI occurs, the processor jumps to Kernal code, which + jumps to ($0318), which points to the following routine: + + DD0C RTI + + How on earth can this clear the interrupts? Remember, the + processor always fetches two successive bytes for each + instruction. + + A little more practical version of this is redirecting the NMI + (or IRQ) to your own routine, whose last instruction is JMP + $DD0C or JMP $DC0C. If you want to confuse more, change the 0 + in the address to a hexadecimal digit different from the one + you used when writing the RTI. + + Or you can combine the latter two methods: + + DD09 LSR $xx ; xx is any appropriate BCD value 00-59. + BPL $DCFC + DCFC RTI + + This example acknowledges interrupts to both CIAs. + + If you want to confuse the examiners of your code, you can use any +of these techniques. Although these examples use no undefined opcodes, +they do not necessarily run correctly on CMOS processors. However, the +RTI example should run on 65C02 and 65C816, and the latter branch +instruction example might work as well. + + The RMW instruction method has been used in some demos, others were +developed by Marko M"akel"a. His favourite is the automagical RTI +method, although it does not have any practical applications, except +for some time dependent data decryption routines for very complicated +copy protections. + + diff --git a/M6502/documentation/mos_6500_mpu_nov_1985.pdf b/M6502/documentation/mos_6500_mpu_nov_1985.pdf new file mode 100644 index 0000000..37ab6cb Binary files /dev/null and b/M6502/documentation/mos_6500_mpu_nov_1985.pdf differ