From f5125b2a358ada070037fff781988b2929b477a8 Mon Sep 17 00:00:00 2001 From: Adrian Conlon Date: Sat, 5 Jan 2019 23:21:12 +0000 Subject: [PATCH] Add some documentation regarding instruction cycle timings. Signed-off-by: Adrian Conlon --- M6502/documentation/64doc.txt | 1604 +++++++++++++++++++++++++++++++++ 1 file changed, 1604 insertions(+) create mode 100644 M6502/documentation/64doc.txt diff --git a/M6502/documentation/64doc.txt b/M6502/documentation/64doc.txt new file mode 100644 index 0000000..5fb0226 --- /dev/null +++ b/M6502/documentation/64doc.txt @@ -0,0 +1,1604 @@ +[C= commodore 64] + +--------------------------------------------------------------------------- + +64doc + +# $Id: 64doc,v 1.8 1994/06/03 19:50:04 jopi Exp $ +# +# This file is part of Commodore 64 emulator +# and Program Development System. +# +# See README for copyright notice +# +# This file contains documentation for 6502/6510/8500/8502 instruction set. +# +# +# Written by +# John West (john@ucc.gu.uwa.edu.au) +# Marko Mäkelä (Marko.Makela@HUT.FI) +# +# +# $Log: 64doc,v $ +# Revision 1.8 1994/06/03 19:50:04 jopi +# Patchlevel 2 +# +# Revision 1.7 1994/04/15 13:07:04 jopi +# 65xx Register descriptions added +# +# Revision 1.6 1994/02/18 16:09:36 jopi +# +# Revision 1.5 1994/01/26 16:08:37 jopi +# X64 version 0.2 PL 1 +# +# Revision 1.4 1993/11/10 01:55:34 jopi +# +# Revision 1.3 93/06/21 13:37:18 jopi +# X64 version 0.2 PL 0 +# +# Revision 1.2 93/06/21 13:07:15 jopi +# *** empty log message *** +# +# + + Note: To extract the uuencoded ML programs in this article most + easily you may use e.g. "uud" by Edwin Kremer , + which extracts them all at once. 
+ + Documentation for the NMOS 65xx/85xx Instruction Set + + 6510 Instructions by Addressing Modes + 6502 Registers + 6510/8502 Undocumented Commands + Register selection for load and store + Decimal mode in NMOS 6500 series + 6510 features + Different CPU types + 6510 Instruction Timing + How Real Programmers Acknowledge Interrupts + Memory Management + Autostart Code + Notes + References + + 6510 Instructions by Addressing Modes + +off- ++++++++++ Positive ++++++++++ ---------- Negative ---------- +set 00 20 40 60 80 a0 c0 e0 mode + ++00 BRK JSR RTI RTS NOP* LDY CPY CPX Impl/immed ++01 ORA AND EOR ADC STA LDA CMP SBC (indir,x) ++02 t t t t NOP*t LDX NOP*t NOP*t ? /immed ++03 SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* (indir,x) ++04 NOP* BIT NOP* NOP* STY LDY CPY CPX Zeropage ++05 ORA AND EOR ADC STA LDA CMP SBC Zeropage ++06 ASL ROL LSR ROR STX LDX DEC INC Zeropage ++07 SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* Zeropage + ++08 PHP PLP PHA PLA DEY TAY INY INX Implied ++09 ORA AND EOR ADC NOP* LDA CMP SBC Immediate ++0a ASL ROL LSR ROR TXA TAX DEX NOP Accu/impl ++0b ANC** ANC** ASR** ARR** ANE** LXA** SBX** SBC* Immediate ++0c NOP* BIT JMP JMP () STY LDY CPY CPX Absolute ++0d ORA AND EOR ADC STA LDA CMP SBC Absolute ++0e ASL ROL LSR ROR STX LDX DEC INC Absolute ++0f SLO* RLA* SRE* RRA* SAX* LAX* DCP* ISB* Absolute + ++10 BPL BMI BVC BVS BCC BCS BNE BEQ Relative ++11 ORA AND EOR ADC STA LDA CMP SBC (indir),y ++12 t t t t t t t t ? 
++13 SLO* RLA* SRE* RRA* SHA** LAX* DCP* ISB* (indir),y ++14 NOP* NOP* NOP* NOP* STY LDY NOP* NOP* Zeropage,x ++15 ORA AND EOR ADC STA LDA CMP SBC Zeropage,x ++16 ASL ROL LSR ROR STX y) LDX y) DEC INC Zeropage,x ++17 SLO* RLA* SRE* RRA* SAX* y) LAX* y) DCP* ISB* Zeropage,x + ++18 CLC SEC CLI SEI TYA CLV CLD SED Implied ++19 ORA AND EOR ADC STA LDA CMP SBC Absolute,y ++1a NOP* NOP* NOP* NOP* TXS TSX NOP* NOP* Implied ++1b SLO* RLA* SRE* RRA* SHS** LAS** DCP* ISB* Absolute,y ++1c NOP* NOP* NOP* NOP* SHY** LDY NOP* NOP* Absolute,x ++1d ORA AND EOR ADC STA LDA CMP SBC Absolute,x ++1e ASL ROL LSR ROR SHX**y) LDX y) DEC INC Absolute,x ++1f SLO* RLA* SRE* RRA* SHA**y) LAX* y) DCP* ISB* Absolute,x + + ROR instruction is available on MC650x microprocessors after + June, 1976. + + Legend: + + t Jams the machine + *t Jams very rarely + * Undocumented command + ** Unusual operation + y) indexed using Y instead of X + () indirect instead of absolute + + Note that the NOP instructions do have other addressing modes + than the implied addressing. The NOP instruction is just like + any other load instruction, except it does not store the + result anywhere or affect the flags. + + 6502 Registers + + The NMOS 65xx processors are not ruined with too many registers. In +addition to that, the registers are mostly 8-bit. Here is a brief +description of each register: + + PC Program Counter + + This register points to the address from which the next + instruction byte (opcode or parameter) will be fetched. + Unlike other registers, this one is 16 bits in length. The + low and high 8-bit halves of the register are called PCL + and PCH, respectively. + + The Program Counter may be read by pushing its value on + the stack. This can be done either by jumping to a + subroutine or by causing an interrupt. + + S Stack pointer + + The NMOS 65xx processors have 256 bytes of stack memory, + ranging from $0100 to $01FF. The S register is an 8-bit + offset to the stack page. 
In other words, whenever + anything is being pushed on the stack, it will be stored + to the address $0100+S. + + The Stack pointer can be read and written by transferring + its value to or from the index register X (see below) with + the TSX and TXS instructions. + + P Processor status + + This 8-bit register stores the state of the processor. The + bits in this register are called flags. Most of the flags + have something to do with arithmetic operations. + + The P register can be read by pushing it on the stack + (with PHP or by causing an interrupt). If you only need to + read one flag, you can use the branch instructions. + Setting the flags is possible by pulling the P register + from stack or by using the flag set or clear instructions. + + Following is a list of the flags, starting from the 8th + bit of the P register (bit 7, value $80): + + N Negative flag + + This flag will be set after any arithmetic operation + (when any of the registers A, X or Y is being loaded + with a value). Generally, the N flag will be copied + from the topmost bit of the register being loaded. + + Note that TXS (Transfer X to S) is not an arithmetic + operation. Also note that the BIT instruction affects + the Negative flag just like arithmetic operations. + Finally, the Negative flag behaves differently in + Decimal operations (see description below). + + V oVerflow flag + + Like the Negative flag, this flag is intended to be + used with 8-bit signed integer numbers. The flag will + be affected by addition and subtraction, the + instructions PLP, CLV and BIT, and the hardware signal + -SO. Note that there is no SEV instruction, even though + the MOS engineers loved to use East European abbreviations, + like DDR (Deutsche Demokratische Republik vs. Data + Direction Register). (The Russian abbreviation for their + former trade association COMECON is SEV.) The -SO + (Set Overflow) signal is available on some processors, + at least the 6502, to set the V flag. 
This enables + response to an I/O activity in three clock cycles or + fewer when using a BVC instruction branching + to itself ($50 $FE). + + The CLV instruction clears the V flag, and the PLP and + BIT instructions copy the flag value from the bit 6 of + the topmost stack entry or from memory. + + After a binary addition or subtraction, the V flag + will be set on a sign overflow, cleared otherwise. + What is a sign overflow? For instance, if you are + trying to add 123 and 45 together, the result (168) + does not fit in an 8-bit signed integer (upper limit + 127 and lower limit -128). Similarly, adding -123 to + -45 causes the overflow, just like subtracting -45 + from 123 or 123 from -45 would do. + + Like the N flag, the V flag will not be set as + expected in the Decimal mode. Later in this document + is a precise operation description. + + A common misbelief is that the V flag could only be + set by arithmetic operations, not cleared. + + 1 unused flag + + To current knowledge, this flag is always 1. + + B Break flag + + This flag is used to distinguish software (BRK) + interrupts from hardware interrupts (IRQ or NMI). The + B flag is always set except when the P register is + being pushed on stack when jumping to an interrupt + routine to process only a hardware interrupt. + + The official NMOS 65xx documentation claims that the + BRK instruction could only cause a jump to the IRQ + vector ($FFFE). However, if an NMI interrupt occurs + while executing a BRK instruction, the processor will + jump to the NMI vector ($FFFA), and the P register + will be pushed on the stack with the B flag set. + + D Decimal mode flag + + This flag is used to select the (Binary Coded) Decimal + mode for addition and subtraction. In most + applications, the flag is zero. + + The Decimal mode has many oddities, and it operates + differently on CMOS processors. See the description + of the ADC, SBC and ARR instructions below. 
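The sign-overflow rule given for the V flag above can be written down compactly. The following is a minimal sketch for Binary mode only; the function names adc_overflow and sbc_overflow are hypothetical and do not come from the original text.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original text): returns 1 if the
   binary-mode addition a + b + carry sets the V flag, 0 otherwise. */
int adc_overflow(uint8_t a, uint8_t b, int carry)
{
    unsigned sum = (unsigned)a + b + (carry ? 1u : 0u);

    /* Sign overflow: both operands share a sign bit and the sign bit
       of the result differs from it. */
    return ((~(a ^ b)) & (a ^ sum) & 0x80u) != 0;
}

/* SBC computes a + ~b + carry, so the same test applies to the
   one's complement of the operand. */
int sbc_overflow(uint8_t a, uint8_t b, int carry)
{
    return adc_overflow(a, (uint8_t)~b, carry);
}
```

With the examples from the text, adding 123 and 45, or subtracting -45 from 123 with Carry set, reports an overflow, while adding 1 and 1 does not.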
+ + I Interrupt disable flag + + This flag can be used to prevent the processor from + jumping to the IRQ handler vector ($FFFE) whenever the + hardware line -IRQ is active. The flag will be + automatically set after taking an interrupt, so that + the processor would not keep jumping to the interrupt + routine if the -IRQ signal remains low for several + clock cycles. + + Z Zero flag + + The Zero flag will be affected in the same cases as + the Negative flag. Generally, it will be set if an + arithmetic register is being loaded with the value + zero, and cleared otherwise. The flag will behave + differently in Decimal operations. + + C Carry flag + + This flag is used in additions, subtractions, + comparisons and bit rotations. In additions and + subtractions, it acts as a 9th bit and lets you + chain operations to calculate with bigger than 8-bit + numbers. When subtracting, the Carry flag is the + negative of Borrow: if a borrow occurs, the flag + will be clear, otherwise set. Comparisons are a + special case of subtraction: they assume Carry flag + set and Decimal flag clear, and do not store the + result of the subtraction anywhere. + + There are four kinds of bit rotations. All of them + store the bit that is being rotated off to the Carry + flag. The left shifting instructions are ROL and ASL. + ROL copies the initial Carry flag to the lowmost bit + of the byte; ASL always clears it. Similarly, the ROR + and LSR instructions shift to the right. + + A Accumulator + + The accumulator is the main register for arithmetic and + logic operations. Unlike the index registers X and Y, it + has a direct connection to the Arithmetic and Logic Unit + (ALU). This is why many operations are only available for + the accumulator, not the index registers. + + X Index register X + + This is the main register for addressing data with + indices. It has a special addressing mode, indexed + indirect, which lets you have a vector table on the + zero page. 
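The role of the Carry flag as a 9th bit, described under the C flag above, can be illustrated with a small model. This is only a sketch for Binary mode; the names adc8, add16 and rol8 are hypothetical and chosen for this example.

```c
#include <stdint.h>

/* Hypothetical model of a binary-mode ADC: adds a + b + *carry, stores
   the 9th bit of the sum back into *carry, returns the low 8 bits. */
uint8_t adc8(uint8_t a, uint8_t b, int *carry)
{
    unsigned sum = (unsigned)a + b + (*carry ? 1u : 0u);

    *carry = sum > 0xFF;
    return (uint8_t)sum;
}

/* 16-bit addition chained from two 8-bit additions, in the manner of
   CLC, ADC (low bytes), ADC (high bytes). */
uint16_t add16(uint16_t x, uint16_t y)
{
    int carry = 0;                                  /* CLC */
    uint8_t lo = adc8((uint8_t)(x & 0xFF), (uint8_t)(y & 0xFF), &carry);
    uint8_t hi = adc8((uint8_t)(x >> 8), (uint8_t)(y >> 8), &carry);

    return (uint16_t)((hi << 8) | lo);
}

/* ROL: bit 7 is rotated off into Carry, the initial Carry enters bit 0. */
uint8_t rol8(uint8_t v, int *carry)
{
    int out = v >> 7;

    v = (uint8_t)((v << 1) | (*carry ? 1 : 0));
    *carry = out;
    return v;
}
```

An ASL would be the same as rol8 with a zero shifted into bit 0 instead of the initial Carry.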
+ + Y Index register Y + + The Y register has the least operations available. On the + other hand, only it has the indirect indexed addressing + mode that enables access to any memory place without + having to use self-modifying code. + + 6510/8502 Undocumented Commands + + -- A brief explanation about what may happen while + using don't care states. + + ANE $8B A = (A | #$EE) & X & #byte + same as + A = ((A & #$11 & X) | ( #$EE & X)) & #byte + + In a real 6510/8502 the internal parameter #$11 + may occasionally be #$10, #$01 or even #$00. + This occurs when the video chip starts DMA + between the opcode fetch and the parameter fetch + of the instruction. The value probably depends + on the data that was left on the bus by the VIC-II. + + LXA $AB C=Lehti: A = X = ANE + Alternate: A = X = (A & #byte) + + TXA and TAX have to be responsible for these. + + SHA $93,$9F Store (A & X & (ADDR_HI + 1)) + SHX $9E Store (X & (ADDR_HI + 1)) + SHY $9C Store (Y & (ADDR_HI + 1)) + SHS $9B SHA and TXS, where X is replaced by (A & X). + + Note: The value to be stored is copied also + to ADDR_HI if page boundary is crossed. + + SBX $CB Carry and Decimal flags are ignored but the + Carry flag will be set in subtraction. This + is due to the CMP command, which is executed + instead of the real SBC. + + ARR $6B This instruction first performs an AND + between the accumulator and the immediate + parameter, then it shifts the accumulator to + the right. However, this is not the whole + truth. See the description below. + +Many undocumented commands do not use AND between registers, the CPU +just throws the bytes to a bus simultaneously and lets the +open-collector drivers perform the AND. I.e. the command called 'SAX', +which is in the STORE section (opcodes $80...$9F), stores the result +of (A & X) this way. + +More fortunate is its opposite, 'LAX' which just loads a byte +simultaneously into both A and X. 
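The two ANE formulas listed above are stated to be the same. The sketch below checks that claim exhaustively and also shows the value that SAX stores. It assumes the internal parameter is always #$11 (the DMA-dependent variations are not modelled), and the function names are hypothetical.

```c
#include <stdint.h>

/* First form from the text: A = (A | #$EE) & X & #byte */
uint8_t ane_form1(uint8_t a, uint8_t x, uint8_t imm)
{
    return (uint8_t)((a | 0xEE) & x & imm);
}

/* Second form: A = ((A & #$11 & X) | (#$EE & X)) & #byte */
uint8_t ane_form2(uint8_t a, uint8_t x, uint8_t imm)
{
    return (uint8_t)(((a & 0x11 & x) | (0xEE & x)) & imm);
}

/* Value written to memory by SAX: the wired AND of A and X. */
uint8_t sax_value(uint8_t a, uint8_t x)
{
    return (uint8_t)(a & x);
}

/* Returns 1 if both ANE forms agree for every A, X and immediate. */
int ane_forms_agree(void)
{
    unsigned a, x, imm;

    for (a = 0; a < 256; a++)
        for (x = 0; x < 256; x++)
            for (imm = 0; imm < 256; imm++)
                if (ane_form1((uint8_t)a, (uint8_t)x, (uint8_t)imm) !=
                    ane_form2((uint8_t)a, (uint8_t)x, (uint8_t)imm))
                    return 0;
    return 1;
}
```

The agreement follows from (A | $EE) being equal to ((A & $11) | $EE), after which the AND with X distributes over the OR.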
+ + $6B ARR + +This instruction seems to be a harmless combination of AND and ROR at +first sight, but it turns out that it affects the V flag and also has +a special kind of decimal mode. This is because the instruction has +inherited some properties of the ADC instruction ($69) in addition to +the ROR ($6A). + +In Binary mode (D flag clear), the instruction effectively does an AND +between the accumulator and the immediate parameter, and then shifts +the accumulator to the right, copying the C flag to the 8th bit. It +sets the Negative and Zero flags just like the ROR would. The ADC code +shows up in the Carry and oVerflow flags. The C flag will be copied +from the bit 6 of the result (which doesn't seem too logical), and the +V flag is the result of an Exclusive OR operation between the bit 6 +and the bit 5 of the result. This makes sense, since the V flag will +be normally set by an Exclusive OR, too. + +In Decimal mode (D flag set), the ARR instruction first performs the +AND and ROR, just like in Binary mode. The N flag will be copied from +the initial C flag, and the Z flag will be set according to the ROR +result, as expected. The V flag will be set if the bit 6 of the +accumulator changed its state between the AND and the ROR, cleared +otherwise. + +Now comes the funny part. If the low nybble of the AND result, +incremented by its lowmost bit, is greater than 5, the low nybble in +the ROR result will be incremented by 6. The low nybble may overflow +as a consequence of this BCD fixup, but the high nybble won't be +adjusted. The high nybble will be BCD fixed in a similar way. If the +high nybble of the AND result, incremented by its lowmost bit, is +greater than 5, the high nybble in the ROR result will be incremented +by 6, and the Carry flag will be set. Otherwise the C flag will be +cleared. 
+ +To help you understand this description, here is a C routine that +illustrates the ARR operation in Decimal mode: + + unsigned + A, /* Accumulator */ + AL, /* low nybble of accumulator */ + AH, /* high nybble of accumulator */ + + C, /* Carry flag */ + Z, /* Zero flag */ + V, /* oVerflow flag */ + N, /* Negative flag */ + + t, /* temporary value */ + s; /* value to be ARRed with Accumulator */ + + t = A & s; /* Perform the AND. */ + + AH = t >> 4; /* Separate the high */ + AL = t & 15; /* and low nybbles. */ + + N = C; /* Set the N and */ + Z = !(A = (t >> 1) | (C << 7)); /* Z flags traditionally */ + V = (t ^ A) & 64; /* and V flag in a weird way. */ + + if (AL + (AL & 1) > 5) /* BCD "fixup" for low nybble. */ + A = (A & 0xF0) | ((A + 6) & 0xF); + + if (C = AH + (AH & 1) > 5) /* Set the Carry flag. */ + A = (A + 0x60) & 0xFF; /* BCD "fixup" for high nybble. */ + + $CB SBX X <- (A & X) - Immediate + +The 'SBX' ($CB) may seem to be very complex operation, even though it +is a combination of the subtraction of accumulator and parameter, as +in the 'CMP' instruction, and the command 'DEX'. As a result, both A +and X are connected to ALU but only the subtraction takes place. Since +the comparison logic was used, the result of subtraction should be +normally ignored, but the 'DEX' now happily stores to X the value of +(A & X) - Immediate. That is why this instruction does not have any +decimal mode, and it does not affect the V flag. Also Carry flag will +be ignored in the subtraction but set according to the result. 
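The SBX behaviour described above can be sketched in the same way as the ARR routine. The struct and function names below are hypothetical. The N and Z results are modelled as for CMP, which is an assumption in line with the comparison logic mentioned in the text, and the V and D flags are left out because the instruction does not use or change them.

```c
#include <stdint.h>

/* Hypothetical result record: new X register and the affected flags. */
struct sbx_result {
    uint8_t x;      /* new value of X */
    int c, z, n;    /* Carry, Zero and Negative flags */
};

/* X <- (A & X) - Immediate, with flags set as by a comparison.
   The incoming Carry flag is ignored, as is the Decimal flag. */
struct sbx_result sbx(uint8_t a, uint8_t x, uint8_t imm)
{
    struct sbx_result r;
    uint8_t t = (uint8_t)(a & x);

    r.x = (uint8_t)(t - imm);
    r.c = t >= imm;             /* set when no borrow occurs */
    r.z = r.x == 0;
    r.n = (r.x & 0x80) != 0;
    return r;
}
```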
+ + Proof: + +begin 644 vsbx +M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```*D`H#V1*Z`_D2N@09$KJ0>% +M^QBE^VEZJ+$KH#F1*ZD`2"BI`*(`RP`(:-B@.5$K*4#P`E@`H#VQ*SAI`)$K +JD-Z@/[$K:0"1*Y#4J2X@TO\XH$&Q*VD`D2N0Q,;[$+188/_^]_:_OK>V +` +end + + and + +begin 644 sbx +M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI`*!-D2N@3Y$KH%&1*ZD# +MA?L8I?M*2)`#J1@LJ3B@29$K:$J0`ZGX+*G8R)$K&/BXJ?2B8\L)AOP(:(7] +MV#B@3;$KH$\Q*Z!1\2L(1?SP`0!H1?TIM]#XH$VQ*SAI`)$KD,N@3[$K:0"1 +9*Y#!J2X@TO\XH%&Q*VD`D2N0L<;[$))88-#X +` +end + +These test programs show if your machine is compatible with ours +regarding the opcode $CB. The first test, vsbx, proves that SBX does +not affect the V flag. The latter one, sbx, proves the rest of our +theory. The vsbx test tests 33554432 SBX combinations (16777216 +different A, X and Immediate combinations, and two different V flag +states), and the sbx test doubles that amount (16777216*4 D and C flag +combinations). Both tests have run successfully on a C64 and a Vic20. +They ought to run on C16, +4 and the PET series as well. The tests +stop with BRK, if the opcode $CB does not work as expected. Successful +operation ends in RTS. As the tests are very slow, they print dots on +the screen while running so that you know that the machine has not +jammed. On computers running at 1 MHz, the first test prints +approximately one dot every four seconds and a total of 2048 dots, +whereas the second one prints half that amount, one dot every seven +seconds. + +If the tests fail on your machine, please let us know your processor's +part number and revision. If possible, save the executable (after it +has stopped with BRK) under another name and send it to us so that we +know at which stage the program stopped. + +The following program is a Commodore 64 executable that Marko Mäkelä +developed when trying to find out how the V flag is affected by SBX. +(It was believed that the SBX affects the flag in a weird way, and +this program shows how SBX sets the flag differently from SBC.) 
You +may find the subroutine at $C150 useful when researching other +undocumented instructions' flags. Run the program in a machine +language monitor, as it makes use of the BRK instruction. The result +tables will be written on pages $C2 and $C3. + +begin 644 sbx-c100 +M`,%XH`",#L&,$,&,$L&XJ8*B@LL7AOL(:(7\N#BM#L$M$,'M$L$(Q?OP`B@` +M:$7\\`,@4,'N#L'0U.X0P=#/SB#0[A+!T,<``````````````)BJ\!>M#L$M +L$,'=_\'0":T2P=W_PM`!8,K0Z:T.P2T0P9D`PID`!*T2P9D`PYD`!HL2N@ +M3Y$KH%R1*XII>ZBQ*Z!3D2N@8)$KBFE_J+$KH%61*Z!BD2OX.+BE^^;\Q_S8 +L"&B%_3BXI?OF_,?\"&A%_?`!`.;[T-_F_-#;RA"M8!@X&#CFYL;&Q\?GYP#8 +` +end + + 6510 features + + o PHP always pushes the Break (B) flag as a `1' to the stack. + Jukka Tapanimäki claimed in C=lehti issue 3/89, on page 27 that the + processor makes a logical OR between the status register's bit 4 + and the bit 8 of the stack pointer register (which is always 1). + He did not give any reasons for this argument, and has refused to clarify + it afterwards. Well, this was not the only error in his article... + + o Indirect addressing modes do not handle page boundary crossing at all. + When the parameter's low byte is $FF, the effective address wraps + around and the CPU fetches high byte from $xx00 instead of $xx00+$0100. + E.g. JMP ($01FF) fetches PCL from $01FF and PCH from $0100, + and LDA ($FF),Y fetches the base address from $FF and $00. + + o Indexed zero page addressing modes never fix the page address on + crossing the zero page boundary. + E.g. LDX #$01 : LDA ($FF,X) loads the effective address from $00 and $01. + + o The processor always fetches the byte following a relative branch + instruction. If the branch is taken, the processor reads then the + opcode from the destination address. If page boundary is crossed, it + first reads a byte from the old page from a location that is bigger + or smaller than the correct address by one page. 
+ + o If you cross a page boundary in any other indexed mode, + the processor reads an incorrect location first, a location that is + smaller by one page. + + o Read-Modify-Write instructions write unmodified data, then modified + (so INC effectively does LDX loc;STX loc;INX;STX loc) + + o -RDY is ignored during writes + (This is why you must wait 3 cycles before doing any DMA -- + the maximum number of consecutive writes is 3, which occurs + during interrupts except -RESET.) + + o Some undefined opcodes may give really unpredictable results. + + o All registers except the Program Counter remain unmodified after -RESET. + (This is why you must preset D and I flags in the RESET handler.) + + Different CPU types + +The Rockwell data booklet 29651N52 (technical information about R65C00 +microprocessors, dated October 1984), lists the following differences between +NMOS R6502 microprocessor and CMOS R65C00 family: + + 1. Indexed addressing across page boundary. + NMOS: Extra read of invalid address. + CMOS: Extra read of last instruction byte. + + 2. Execution of invalid op codes. + NMOS: Some terminate only by reset. Results are undefined. + CMOS: All are NOPs (reserved for future use). + + 3. Jump indirect, operand = XXFF. + NMOS: Page address does not increment. + CMOS: Page address increments and adds one additional cycle. + + 4. Read/modify/write instructions at effective address. + NMOS: One read and two write cycles. + CMOS: Two read and one write cycle. + + 5. Decimal flag. + NMOS: Indeterminate after reset. + CMOS: Initialized to binary mode (D=0) after reset and interrupts. + + 6. Flags after decimal operation. + NMOS: Invalid N, V and Z flags. + CMOS: Valid flag adds one additional cycle. + + 7. Interrupt after fetch of BRK instruction. + NMOS: Interrupt vector is loaded, BRK vector is ignored. + CMOS: BRK is executed, then interrupt is executed. 
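Item 3 of the list above (jump indirect with operand XXFF) is a common source of emulation bugs. The following minimal sketch contrasts the two behaviours. The function names are hypothetical, and mem is assumed to be a flat 64 KB memory image.

```c
#include <stdint.h>

/* NMOS: the page address does not increment, so the high byte of the
   target is fetched from the start of the same page. */
uint16_t jmp_indirect_nmos(const uint8_t *mem, uint16_t ptr)
{
    uint16_t hi_addr = (uint16_t)((ptr & 0xFF00) | ((ptr + 1) & 0x00FF));

    return (uint16_t)(mem[ptr] | (mem[hi_addr] << 8));
}

/* CMOS: the page address increments (at the cost of one extra cycle,
   which this sketch does not count). */
uint16_t jmp_indirect_cmos(const uint8_t *mem, uint16_t ptr)
{
    uint16_t hi_addr = (uint16_t)(ptr + 1);

    return (uint16_t)(mem[ptr] | (mem[hi_addr] << 8));
}
```

For JMP ($01FF), the NMOS variant takes PCH from $0100, as described in the 6510 features list, whereas the CMOS variant takes it from $0200.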
+ + 6510 Instruction Timing + + The NMOS 6500 series processors always perform at least two reads +for each instruction. In addition to the operation code (opcode), they +fetch the next byte. This is quite efficient, as most instructions are +two or three bytes long. + + The processors also use a sort of pipelining. If an instruction does +not store data in memory on its last cycle, the processor can fetch +the opcode of the next instruction while executing the last cycle. For +instance, the instruction EOR #$FF truly takes three cycles. On the +first cycle, the opcode $49 will be fetched. During the second cycle +the processor decodes the opcode and fetches the parameter #$FF. On +the third cycle, the processor will perform the operation and store +the result to accumulator, but simultaneously it fetches the opcode +for the next instruction. This is why the instruction effectively +takes only two cycles. + + The following tables show what happens on the bus while executing +different kinds of instructions. + + Interrupts + + NMI and IRQ both take 7 cycles. Their timing diagram is much like + BRK's (see below). IRQ will be executed only when the I flag is + clear. IRQ and BRK both set the I flag, whereas the NMI does not + affect its state. + + The processor will usually wait for the current instruction to + complete before executing the interrupt sequence. To process the + interrupt before the next instruction, the interrupt must occur + before the last cycle of the current instruction. + + There is one exception to this rule: the BRK instruction. If a + hardware interrupt (NMI or IRQ) occurs before the fourth (flags + saving) cycle of BRK, the BRK instruction will be skipped, and + the processor will jump to the hardware interrupt vector. This + sequence will always take 7 cycles. + + You do not completely lose the BRK interrupt, the B flag will be + set in the pushed status register if a BRK instruction gets + interrupted. 
When BRK and IRQ occur at the same time, this does + not cause any problems, as your program will consider it as a + BRK, and the IRQ would occur again after the processor returned + from your BRK routine, unless you cleared the interrupt source in + your BRK handler. But the simultaneous occurrence of NMI and BRK + is far more fatal. If you do not check the B flag in the NMI + routine and subtract two from the return address when needed, the + BRK instruction will be skipped. + + If the NMI and IRQ interrupts overlap each other (one interrupt + occurs before fetching the interrupt vector for the other + interrupt), the processor will most probably jump to the NMI + vector in every case, and then jump to the IRQ vector after + processing the first instruction of the NMI handler. This has not + been measured yet, but the IRQ is very similar to BRK, and many + sources state that the NMI has higher priority than IRQ. However, + it might be that the processor takes the interrupt that comes + later, i.e. you could lose an NMI interrupt if an IRQ occurred in + four cycles after it. + + After finishing the interrupt sequence, the processor will start + to execute the first instruction of the interrupt routine. This + proves that the processor uses a sort of pipelining: it finishes + the current instruction (or interrupt sequence) while reading the + opcode of the next instruction. + + RESET does not push program counter on stack, and it lasts + probably 6 cycles after deactivating the signal. Like NMI, RESET + preserves all registers except PC. 
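The stack writes and the vector selection described in this section can be summarised in a small model. This is a sketch under stated assumptions, not a cycle-exact implementation: the struct and function names are hypothetical, mem is a flat 64 KB memory image, the PC passed in is assumed to already hold the return address, and the update of the I flag is left out.

```c
#include <stdint.h>

enum { FLAG_B = 0x10, FLAG_UNUSED = 0x20 };

/* Hypothetical register file holding only what the sequence touches. */
struct cpu {
    uint16_t pc;
    uint8_t s;
    uint8_t p;
};

static void push(struct cpu *c, uint8_t *mem, uint8_t value)
{
    mem[0x0100 + c->s] = value;     /* the stack lives in page $01 */
    c->s--;                         /* S wraps within 8 bits */
}

/* is_brk:      the sequence was started by a BRK opcode.
   nmi_pending: an NMI arrived before the vector fetch, so the NMI
                vector is used even if the sequence began as a BRK. */
void interrupt_sequence(struct cpu *c, uint8_t *mem,
                        int is_brk, int nmi_pending)
{
    uint16_t vector = nmi_pending ? 0xFFFA : 0xFFFE;
    uint8_t p = (uint8_t)(c->p | FLAG_UNUSED);

    /* B is pushed as 1 for BRK and as 0 for a pure hardware interrupt. */
    p = is_brk ? (uint8_t)(p | FLAG_B) : (uint8_t)(p & ~FLAG_B);

    push(c, mem, (uint8_t)(c->pc >> 8));    /* PCH */
    push(c, mem, (uint8_t)(c->pc & 0xFF));  /* PCL */
    push(c, mem, p);                        /* P   */

    c->pc = (uint16_t)(mem[vector] | (mem[vector + 1] << 8));
}
```

Calling the function with both is_brk and nmi_pending set reproduces the case discussed above: the processor ends up at the NMI vector, but the pushed status byte has the B flag set, which is what an NMI routine would have to test before adjusting the return address.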
+ + Instructions accessing the stack + + BRK + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away), + increment PC + 3 $0100,S W push PCH on stack, decrement S + 4 $0100,S W push PCL on stack, decrement S + 5 $0100,S W push P on stack (with B flag set), decrement S + 6 $FFFE R fetch PCL + 7 $FFFF R fetch PCH + + RTI + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + 3 $0100,S R increment S + 4 $0100,S R pull P from stack, increment S + 5 $0100,S R pull PCL from stack, increment S + 6 $0100,S R pull PCH from stack + + RTS + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + 3 $0100,S R increment S + 4 $0100,S R pull PCL from stack, increment S + 5 $0100,S R pull PCH from stack + 6 PC R increment PC + + PHA, PHP + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + 3 $0100,S W push register on stack, decrement S + + PLA, PLP + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + 3 $0100,S R increment S + 4 $0100,S R pull register from stack + + JSR + + # address R/W description + --- ------- --- ------------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch low address byte, increment PC + 3 $0100,S R internal operation (predecrement S?) 
+ 4 $0100,S W push PCH on stack, decrement S + 5 $0100,S W push PCL on stack, decrement S + 6 PC R copy low address byte to PCL, fetch high address + byte to PCH + + Accumulator or implied addressing + + # address R/W description + --- ------- --- ----------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R read next instruction byte (and throw it away) + + Immediate addressing + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch value, increment PC + + Absolute addressing + + JMP + + # address R/W description + --- ------- --- ------------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch low address byte, increment PC + 3 PC R copy low address byte to PCL, fetch high address + byte to PCH + + Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, + LAX, NOP) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, increment PC + 4 address R read from effective address + + Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC, + SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, increment PC + 4 address R read from effective address + 5 address W write the value back to effective address, + and do the operation on it + 6 address W write the new value to effective address + + Write instructions (STA, STX, STY, SAX) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, increment PC + 4 
address W write register to effective address + + Zero page addressing + + Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, + LAX, NOP) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from effective address + + Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC, + SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from effective address + 4 address W write the value back to effective address, + and do the operation on it + 5 address W write the new value to effective address + + Write instructions (STA, STX, STY, SAX) + + # address R/W description + --- ------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address W write register to effective address + + Zero page indexed addressing + + Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, + LAX, NOP) + + # address R/W description + --- --------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from address, add index register to it + 4 address+I* R read from effective address + + Notes: I denotes either index register (X or Y). + + * The high byte of the effective address is always zero, + i.e. page boundary crossings are not handled. 
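The note above means that the sum of the operand and the index register is truncated to 8 bits. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Effective address for zero page indexed addressing. The sum wraps
   around within the zero page, so the high byte is always zero. */
uint16_t zp_indexed_address(uint8_t operand, uint8_t index)
{
    return (uint8_t)(operand + index);
}
```

For example, an operand of $FF with an index of $01 gives the effective address $0000, not $0100.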
+ + Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC, + SLO, SRE, RLA, RRA, ISB, DCP) + + # address R/W description + --- --------- --- --------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from address, add index register X to it + 4 address+X* R read from effective address + 5 address+X* W write the value back to effective address, + and do the operation on it + 6 address+X* W write the new value to effective address + + Note: * The high byte of the effective address is always zero, + i.e. page boundary crossings are not handled. + + Write instructions (STA, STX, STY, SAX) + + # address R/W description + --- --------- --- ------------------------------------------- + 1 PC R fetch opcode, increment PC + 2 PC R fetch address, increment PC + 3 address R read from address, add index register to it + 4 address+I* W write to effective address + + Notes: I denotes either index register (X or Y). + + * The high byte of the effective address is always zero, + i.e. page boundary crossings are not handled. + + Absolute indexed addressing + + Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT, + LAX, LAE, SHS, NOP) + + # address R/W description + --- --------- --- ------------------------------------------ + 1 PC R fetch opcode, increment PC + 2 PC R fetch low byte of address, increment PC + 3 PC R fetch high byte of address, + add index register to low address byte, + increment PC + 4 address+I* R read from effective address, + fix the high byte of effective address + 5+ address+I R re-read from effective address + + Notes: I denotes either index register (X or Y). + + * The high byte of the effective address may be invalid + at this time, i.e. it may be smaller by $100. + + + This cycle will be executed only if the effective address + was invalid during cycle #4, i.e. page boundary was crossed. 
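The read-instruction table above can be condensed into a small helper that reports the address seen on cycle 4, the final effective address, and the resulting cycle count. This is a sketch with hypothetical names, and it covers read instructions only.

```c
#include <stdint.h>

/* Hypothetical summary of an absolute indexed read. */
struct abs_indexed {
    uint16_t first_read;    /* address on the bus during cycle 4 */
    uint16_t effective;     /* correct effective address */
    int cycles;             /* 4, or 5 when a page boundary is crossed */
};

struct abs_indexed abs_indexed_read(uint16_t base, uint8_t index)
{
    struct abs_indexed r;

    r.effective = (uint16_t)(base + index);

    /* On cycle 4 only the low byte has been added, so the high byte
       is still that of the base address. */
    r.first_read = (uint16_t)((base & 0xFF00) | ((base + index) & 0x00FF));

    r.cycles = (r.first_read != r.effective) ? 5 : 4;
    return r;
}
```

With a base of $12F0 and an index of $20, cycle 4 reads from $1210 and the extra cycle 5 re-reads from $1310.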
+
+     Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
+                                     SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #   address  R/W description
+       --- --------- --- ------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch low byte of address, increment PC
+        3     PC      R  fetch high byte of address,
+                         add index register X to low address byte,
+                         increment PC
+        4  address+X* R  read from effective address,
+                         fix the high byte of effective address
+        5  address+X  R  re-read from effective address
+        6  address+X  W  write the value back to effective address,
+                         and do the operation on it
+        7  address+X  W  write the new value to effective address
+
+       Notes: * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+     Write instructions (STA, STX, STY, SHA, SHX, SHY)
+
+        #   address  R/W description
+       --- --------- --- ------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch low byte of address, increment PC
+        3     PC      R  fetch high byte of address,
+                         add index register to low address byte,
+                         increment PC
+        4  address+I* R  read from effective address,
+                         fix the high byte of effective address
+        5  address+I  W  write to effective address
+
+       Notes: I denotes either index register (X or Y).
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100. Because
+                the processor cannot undo a write to an invalid
+                address, it always reads from the address first.
+
+  Relative addressing (BCC, BCS, BNE, BEQ, BPL, BMI, BVC, BVS)
+
+        #   address  R/W description
+       --- --------- --- ---------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch operand, increment PC
+        3     PC      R  Fetch opcode of next instruction.
+                         If branch is taken, add operand to PCL.
+                         Otherwise increment PC.
+        4+    PC*     R  Fetch opcode of next instruction.
+                         Fix PCH. If it did not change, increment PC.
+        5!    PC      R  Fetch opcode of next instruction,
+                         increment PC.
+
+       Notes: The opcode fetch of the next instruction is included in
+              this diagram for illustration purposes. When determining
+              real execution times, remember to subtract the last
+              cycle.
+
+              * The high byte of Program Counter (PCH) may be invalid
+                at this time, i.e. it may be smaller or bigger by $100.
+
+              + If branch is taken, this cycle will be executed.
+
+              ! If branch occurs to different page, this cycle will be
+                executed.
+
+  Indexed indirect addressing
+
+     Read instructions (LDA, ORA, EOR, AND, ADC, CMP, SBC, LAX)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  read from the address, add X to it
+        4   pointer+X   R  fetch effective address low
+        5  pointer+X+1  R  fetch effective address high
+        6    address    R  read from effective address
+
+       Note: The effective address is always fetched from zero page,
+             i.e. the zero page boundary crossing is not handled.
+
+     Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  read from the address, add X to it
+        4   pointer+X   R  fetch effective address low
+        5  pointer+X+1  R  fetch effective address high
+        6    address    R  read from effective address
+        7    address    W  write the value back to effective address,
+                           and do the operation on it
+        8    address    W  write the new value to effective address
+
+       Note: The effective address is always fetched from zero page,
+             i.e. the zero page boundary crossing is not handled.
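
The zero page wrap-around of the pointer fetch in indexed indirect mode
can be sketched as follows. This is an illustrative Python model under
the assumption that zero page is a 256-byte array; the name
indexed_indirect_address is hypothetical.

```python
def indexed_indirect_address(zero_page, pointer, x):
    """Return the effective address of an (indir,x) access."""
    # Cycles 4 and 5: both pointer bytes come from zero page, so the
    # sums pointer+X and pointer+X+1 wrap around at $FF.
    low = zero_page[(pointer + x) & 0xFF]
    high = zero_page[(pointer + x + 1) & 0xFF]
    return (high << 8) | low


if __name__ == "__main__":
    zero_page = bytearray(256)
    zero_page[0xFF] = 0x34   # low byte of the pointer at $FF
    zero_page[0x00] = 0x12   # high byte is fetched from $00, not $0100
    print("$%04X" % indexed_indirect_address(zero_page, 0xF0, 0x0F))
```

With the pointer low byte stored at $FF, the high byte is taken from
$00, so the example resolves to $1234.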
+
+     Write instructions (STA, SAX)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  read from the address, add X to it
+        4   pointer+X   R  fetch effective address low
+        5  pointer+X+1  R  fetch effective address high
+        6    address    W  write to effective address
+
+       Note: The effective address is always fetched from zero page,
+             i.e. the zero page boundary crossing is not handled.
+
+  Indirect indexed addressing
+
+     Read instructions (LDA, EOR, AND, ORA, ADC, SBC, CMP)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  fetch effective address low
+        4   pointer+1   R  fetch effective address high,
+                           add Y to low byte of effective address
+        5   address+Y*  R  read from effective address,
+                           fix high byte of effective address
+        6+  address+Y   R  read from effective address
+
+       Notes: The effective address is always fetched from zero page,
+              i.e. the zero page boundary crossing is not handled.
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+              + This cycle will be executed only if the effective address
+                was invalid during cycle #5, i.e. page boundary was crossed.
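
The indirect indexed read sequence above can be modelled roughly like
this. The Python below is only a sketch; indirect_indexed_read is a
hypothetical name, and zero page is assumed to be a 256-byte array.

```python
def indirect_indexed_read(zero_page, pointer, y):
    """Return (addresses read from cycle 5 on, total cycle count)."""
    low = zero_page[pointer & 0xFF]           # cycle 3
    high = zero_page[(pointer + 1) & 0xFF]    # cycle 4, zero page wraps
    base = (high << 8) | low
    # Cycle 5: Y has been added to the low byte only.
    reads = [(high << 8) | ((low + y) & 0xFF)]
    cycles = 5
    if low + y > 0xFF:
        # Cycle 6: executed only when the page boundary was crossed.
        reads.append((base + y) & 0xFFFF)
        cycles = 6
    return reads, cycles


if __name__ == "__main__":
    zero_page = bytearray(256)
    zero_page[0x10] = 0xFD
    zero_page[0x11] = 0xDC
    print(indirect_indexed_read(zero_page, 0x10, 0x10))
```

With the pointer holding $DCFD and Y = $10, the sketch reports reads
from $DC0D and $DD0D in six cycles; with Y = 1 there is a single read
from $DCFE in five cycles.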
+
+     Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  fetch effective address low
+        4   pointer+1   R  fetch effective address high,
+                           add Y to low byte of effective address
+        5   address+Y*  R  read from effective address,
+                           fix high byte of effective address
+        6   address+Y   R  read from effective address
+        7   address+Y   W  write the value back to effective address,
+                           and do the operation on it
+        8   address+Y   W  write the new value to effective address
+
+       Notes: The effective address is always fetched from zero page,
+              i.e. the zero page boundary crossing is not handled.
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+     Write instructions (STA, SHA)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  fetch effective address low
+        4   pointer+1   R  fetch effective address high,
+                           add Y to low byte of effective address
+        5   address+Y*  R  read from effective address,
+                           fix high byte of effective address
+        6   address+Y   W  write to effective address
+
+       Notes: The effective address is always fetched from zero page,
+              i.e. the zero page boundary crossing is not handled.
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+  Absolute indirect addressing (JMP)
+
+        #   address  R/W description
+       --- --------- --- ------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch pointer address low, increment PC
+        3     PC      R  fetch pointer address high, increment PC
+        4   pointer   R  fetch low address to latch
+        5  pointer+1* R  fetch PCH, copy latch to PCL
+
+       Note: * The PCH will always be fetched from the same page
+               as PCL, i.e. page boundary crossing is not handled.
+
+                How Real Programmers Acknowledge Interrupts
+
+     With RMW instructions:
+
+        ; beginning of combined raster/timer interrupt routine
+                LSR $D019  ; clear VIC interrupts, read raster interrupt flag to C
+                BCS raster ; jump if VIC caused an interrupt
+                ...        ; timer interrupt routine
+
+        Operational diagram of LSR $D019:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   4E     PC      R   fetch opcode
+          2   19    PC+1     R   fetch address low
+          3   D0    PC+2     R   fetch address high
+          4   xx    $D019    R   read memory
+          5   xx    $D019    W   write the value back, shift right
+          6  xx/2   $D019    W   write the new value back
+
+        The 5th cycle acknowledges the interrupt by writing the same
+        value back. If only raster interrupts are used, the 6th cycle
+        has no effect on the VIC. (It might also acknowledge some
+        other interrupts.)
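
The three data cycles of LSR $D019 can be sketched as a small bus trace.
This Python model is an illustration only: rmw_lsr_trace is a
hypothetical name, and the value read in cycle 4 is simply passed in
rather than taken from a simulated VIC.

```python
def rmw_lsr_trace(address, value):
    """Return the (R/W, address, data) tuples of cycles 4 to 6 of LSR abs."""
    return [
        ("R", address, value),        # cycle 4: read the old value
        ("W", address, value),        # cycle 5: write the old value back
        ("W", address, value >> 1),   # cycle 6: write the shifted value
    ]


if __name__ == "__main__":
    # Assumed example value: bit 7 (IRQ) and bit 0 (raster) set.
    for access, address, data in rmw_lsr_trace(0xD019, 0x81):
        print("%s $%04X %02X" % (access, address, data))
```

The second tuple is the write that acknowledges the interrupt, because
the unmodified value read in cycle 4 is written straight back.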
+
+     With indexed addressing:
+
+        ; acknowledge interrupts to both CIAs
+                LDX #$10
+                LDA $DCFD,X
+
+        Operational diagram of LDA $DCFD,X:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   BD     PC      R   fetch opcode
+          2   FD    PC+1     R   fetch address low
+          3   DC    PC+2     R   fetch address high, add X to address low
+          4   xx    $DC0D    R   read from address, fix high byte of address
+          5   yy    $DD0D    R   read from right address
+
+        ; acknowledge interrupts to CIA 2
+                LDX #$10
+                STA $DDFD,X
+
+        Operational diagram of STA $DDFD,X:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   9D     PC      R   fetch opcode
+          2   FD    PC+1     R   fetch address low
+          3   DD    PC+2     R   fetch address high, add X to address low
+          4   xx    $DD0D    R   read from address, fix high byte of address
+          5   ac    $DE0D    W   write to right address
+
+     With branch instructions:
+
+        ; acknowledge interrupts to CIA 2
+                LDA #$00  ; clear N flag
+                JMP $DD0A
+        DD0A    BPL $DC9D ; branch
+        DC9D    BRK       ; return
+
+        You need the following preparations to initialize the CIA registers:
+
+                LDA #$91  ; argument of BPL
+                STA $DD0B
+                LDA #$10  ; BPL
+                STA $DD0A
+                STA $DD08 ; load the ToD values from the latches
+                LDA $DD0B ; freeze the ToD display
+                LDA #$7F
+                STA $DC0D ; assure that $DC0D is $00
+
+        Operational diagram of BPL $DC9D:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   10    $DD0A    R   fetch opcode
+          2   91    $DD0B    R   fetch argument
+          3   xx    $DD0C    R   fetch opcode, add argument to PCL
+          4   yy    $DD9D    R   fetch opcode, fix PCH
+        ( 5   00    $DC9D    R   fetch opcode )
+
+        ; acknowledge interrupts to CIA 1
+                LSR       ; clear N flag
+                JMP $DCFA
+        DCFA    BPL $DD0D
+        DD0D    BRK
+
+        ; Again you need to set the ToD registers of CIA 1 and the
+        ; Interrupt Control Register of CIA 2 first.
+
+        Operational diagram of BPL $DD0D:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   10    $DCFA    R   fetch opcode
+          2   11    $DCFB    R   fetch argument
+          3   xx    $DCFC    R   fetch opcode, add argument to PCL
+          4   yy    $DC0D    R   fetch opcode, fix PCH
+        ( 5   00    $DD0D    R   fetch opcode )
+
+        ; acknowledge interrupts to CIA 2 automagically
+        ; preparations
+                LDA #$7F
+                STA $DD0D  ; disable all interrupt sources of CIA2
+                LDA $DD0E
+                AND #$BE   ; ensure that $DD0C remains constant
+                STA $DD0E  ; and stop the timer
+                LDA #$FD
+                STA $DD0C  ; parameter of BPL
+                LDA #$10
+                STA $DD0B  ; BPL
+                LDA #$40
+                STA $DD0A  ; RTI/parameter of LSR
+                LDA #$46
+                STA $DD09  ; LSR
+                STA $DD08  ; load the ToD values from the latches
+                LDA $DD0B  ; freeze the ToD display
+                LDA #$09
+                STA $0318
+                LDA #$DD
+                STA $0319  ; change NMI vector to $DD09
+                LDA #$FF   ; Try changing this instruction's operand
+                STA $DD05  ; (see comment below).
+                LDA #$FF
+                STA $DD04  ; set interrupt frequency to 1/65536 cycles
+                LDA $DD0E
+                AND #$80
+                ORA #$11
+                LDX #$81
+                STX $DD0D  ; enable timer interrupt
+                STA $DD0E  ; start timer
+
+                LDA #$00   ; To see that the interrupts really occur,
+                STA $D011  ; use something like this and see how
+        LOOP    DEC $D020  ; changing the byte loaded to $DD05 from
+                BNE LOOP   ; #$FF to #$0F changes the image.
+
+        When an NMI occurs, the processor jumps to Kernal code, which jumps to
+        ($0318), which points to the following routine:
+
+        DD09    LSR $40    ; clear N flag
+                BPL $DD0A  ; Note: $DD0A contains RTI.
+
+        Operational diagram of BPL $DD0A:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   10    $DD0B    R   fetch opcode
+          2   FD    $DD0C    R   fetch argument
+          3   xx    $DD0D    R   fetch opcode, add argument to PCL
+          4   40    $DD0A    R   fetch opcode, (fix PCH)
+
+     With RTI:
+
+        ; the fastest possible interrupt handler in the 6500 family
+        ; preparations
+                SEI
+                LDA $01    ; disable ROM and enable I/O
+                AND #$FD
+                ORA #$05
+                STA $01
+                LDA #$7F
+                STA $DD0D  ; disable all interrupt sources of CIA 2
+                LDA $DD0E
+                AND #$BE   ; ensure that $DD0C remains constant
+                STA $DD0E  ; and stop the timer
+                LDA #$40
+                STA $DD0C  ; store RTI to $DD0C
+                LDA #$0C
+                STA $FFFA
+                LDA #$DD
+                STA $FFFB  ; change NMI vector to $DD0C
+                LDA #$FF   ; Try changing this instruction's operand
+                STA $DD05  ; (see comment below).
+                LDA #$FF
+                STA $DD04  ; set interrupt frequency to 1/65536 cycles
+                LDA $DD0E
+                AND #$80
+                ORA #$11
+                LDX #$81
+                STX $DD0D  ; enable timer interrupt
+                STA $DD0E  ; start timer
+
+                LDA #$00   ; To see that the interrupts really occur,
+                STA $D011  ; use something like this and see how
+        LOOP    DEC $D020  ; changing the byte loaded to $DD05 from
+                BNE LOOP   ; #$FF to #$0F changes the image.
+
+        When an NMI occurs, the processor jumps through the NMI vector
+        at $FFFA (the Kernal ROM is switched out, so the vector is read
+        from RAM), which points to the following routine:
+
+        DD0C    RTI
+
+        How on earth can this clear the interrupts? Remember, the
+        processor always fetches two successive bytes for each
+        instruction.
+
+        A slightly more practical version of this is redirecting the NMI
+        (or IRQ) to your own routine, whose last instruction is JMP
+        $DD0C or JMP $DC0C. If you want to confuse readers even more,
+        change the 0 in the address to a hexadecimal digit different
+        from the one you used when writing the RTI.
+
+        Or you can combine the latter two methods:
+
+        DD09    LSR $xx    ; xx is any appropriate BCD value 00-59.
+                BPL $DCFC
+        DCFC    RTI
+
+        This example acknowledges interrupts to both CIAs.
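
The addresses in the branch diagrams above follow directly from the
relative addressing timing table. As a rough check, the Python sketch
below lists the opcode fetches that follow the operand fetch of a taken
branch. The name taken_branch_fetches is hypothetical, and only the
address arithmetic is modelled, not the data returned by the CIAs.

```python
def taken_branch_fetches(opcode_address, operand):
    """Return the addresses fetched in cycles 3, 4 and (if any) 5."""
    pc = (opcode_address + 2) & 0xFFFF            # PC after the operand fetch
    offset = operand - 0x100 if operand & 0x80 else operand
    target = (pc + offset) & 0xFFFF
    # Cycle 4: only PCL has been updated, PCH may still be wrong.
    unfixed = (pc & 0xFF00) | (target & 0xFF)
    fetches = [pc, unfixed]
    if unfixed != target:
        fetches.append(target)                    # cycle 5: PCH has been fixed
    return fetches


if __name__ == "__main__":
    for start, operand in ((0xDD0A, 0x91), (0xDCFA, 0x11), (0xDD0B, 0xFD)):
        fetches = taken_branch_fetches(start, operand)
        print(" ".join("$%04X" % address for address in fetches))
```

For BPL at $DD0A with operand $91 the fetches are $DD0C, $DD9D and
$DC9D; for BPL at $DCFA with operand $11 they are $DCFC, $DC0D and
$DD0D; for BPL at $DD0B with operand $FD there is no page crossing, so
only $DD0D and $DD0A are fetched.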
+
+  If you want to confuse the examiners of your code, you can use any
+of these techniques. Although these examples use no undefined opcodes,
+they do not necessarily run correctly on CMOS processors. However, the
+RTI example should run on 65C02 and 65C816, and the latter branch
+instruction example might work as well.
+
+  The RMW instruction method has been used in some demos; the others
+were developed by Marko Mäkelä. His favourite is the automagical RTI
+method, although it does not have any practical applications, except
+for some time dependent data decryption routines for very complicated
+copy protections.
+
+                             Memory Management
+
+The processor's point of view
+
+  The Commodore 64 has access to more memory than its processor can
+directly handle. This is possible by banking the memory. There are
+five user configurable inputs that affect the banking. Three of them
+can be controlled by program, and the remaining two serve as control
+lines on the memory expansion port.
+
+  The 6510 MPU has an integrated I/O port with six I/O lines. This
+port is accessed through the memory locations 0 and 1. The location 0
+is the Data Direction Register for the Peripheral data Register, which
+is mapped to the other location. When a bit in the DDR is set, the
+corresponding PR bit controls the state of a corresponding Peripheral
+line as an output. When it is clear, the state of the Peripheral line
+is reflected by the Peripheral register. The Peripheral lines are
+numbered from 0 to 5, and they are mapped to the DDR and PR bits 0 - 5,
+respectively. The 8502 processor, which is used in the Commodore 128,
+has seven Peripheral lines in its I/O port. The pin P6 is connected to
+the ASC/CC key (Caps lock in English versions).
+
+  The I/O lines have the following functions:
+
+     Direction  Line  Function
+     ---------  ----  --------
+        out      P5   Cassette motor control. (0 = motor spins)
+        in       P4   Cassette sense. (0 = PLAY button depressed)
+        out      P3   Cassette write data.
+        out      P2   CHAREN
+        out      P1   HIRAM
+        out      P0   LORAM
+
+  The default value of the DDR register is $2F, so all lines except
+Cassette sense are outputs. The default PR value is $37 (Datassette
+motor stopped, and all three memory management lines high).
+If you turn any memory management line to input, the external pull-up
+resistors make it look as if it were outputting logical "1". This
+is actually why the computer always switches the ROMs in upon startup:
+Pulling the -RESET line low resets all Peripheral lines to inputs,
+thus setting all three processor-driven memory management lines to
+logical "1" level.
+
+  The two remaining memory management lines are -EXROM and -GAME on
+the cartridge port. Each line has a pull-up resistor, so the lines
+are "1" by default.
+
+  Even though the memory banking has been implemented with a 82S100
+Programmable _Logic_ Array, there is only one control line that seems
+to behave logically at first sight, the -CHAREN line. It is mostly
+used to choose between I/O address space and the character generator
+ROM. The following memory map introduces the oddities of -CHAREN and
+the other memory management lines. It is based on the memory maps in
+the Commodore 64 Programmer's Reference Guide, pp. 263 - 267, and some
+errors and inaccuracies have been corrected.
+
+  The leftmost column of the table contains addresses in hexadecimal
+notation. The columns beside it introduce all possible memory
+configurations. The default mode is on the left, and the absolutely
+most rarely used Ultimax game console configuration is on the right.
+(Has anybody ever seen any Ultimax games?) Each memory configuration
+column has one or more four-digit binary numbers as a title. The bits,
+from left to right, represent the state of the -LORAM, -HIRAM, -GAME
+and -EXROM lines, respectively. The bits whose state does not matter
+are marked with "x".
+For instance, when the Ultimax video game
+configuration is active (the -GAME line is shorted to ground), the
+-LORAM and -HIRAM lines have no effect.
+
+       default                     001x                        Ultimax
+       1111   101x   1000   011x   00x0   1110   0100   1100   xx01
+10000
+----------------------------------------------------------------------
+ F000
+       Kernal RAM    RAM    Kernal RAM    Kernal Kernal Kernal ROMH(*
+ E000
+----------------------------------------------------------------------
+ D000  IO/C   IO/C   IO/RAM IO/C   RAM    IO/C   IO/C   IO/C   I/O
+----------------------------------------------------------------------
+ C000  RAM    RAM    RAM    RAM    RAM    RAM    RAM    RAM     -
+----------------------------------------------------------------------
+ B000
+       BASIC  RAM    RAM    RAM    RAM    BASIC  ROMH   ROMH    -
+ A000
+----------------------------------------------------------------------
+ 9000
+       RAM    RAM    RAM    RAM    RAM    ROML   RAM    ROML   ROML(*
+ 8000
+----------------------------------------------------------------------
+ 7000
+
+ 6000
+       RAM    RAM    RAM    RAM    RAM    RAM    RAM    RAM     -
+ 5000
+
+ 4000
+----------------------------------------------------------------------
+ 3000
+
+ 2000  RAM    RAM    RAM    RAM    RAM    RAM    RAM    RAM     -
+
+ 1000
+----------------------------------------------------------------------
+ 0000  RAM    RAM    RAM    RAM    RAM    RAM    RAM    RAM    RAM
+----------------------------------------------------------------------
+
+    *) Internal memory does not respond to write accesses to these
+       areas.
+
+       Legend: Kernal   E000-FFFF       Kernal ROM.
+
+               IO/C     D000-DFFF       I/O address space or Character
+                                        generator ROM, selected by
+                                        -CHAREN. If the CHAREN bit is
+                                        clear, the character generator
+                                        ROM will be selected. If it is
+                                        set, the I/O chips are
+                                        accessible.
+
+               IO/RAM   D000-DFFF       I/O address space or RAM,
+                                        selected by -CHAREN. If the
+                                        CHAREN bit is clear, the
+                                        internal RAM will be selected.
+                                        If it is set, the I/O chips
+                                        are accessible.
+
+               I/O      D000-DFFF       I/O address space.
+                                        The -CHAREN line has no effect.
+
+               BASIC    A000-BFFF       BASIC ROM.
+
+               ROMH     A000-BFFF or    External ROM with the -ROMH line
+                        E000-FFFF       connected to its -CS line.
+
+               ROML     8000-9FFF       External ROM with the -ROML line
+                                        connected to its -CS line.
+
+               RAM      various ranges  Commodore 64's internal RAM.
+
+               -        1000-7FFF and   Open address space.
+                        A000-CFFF       The Commodore 64's memory chips
+                                        do not detect any memory accesses
+                                        to this area except the VIC-II's
+                                        DMA and memory refreshes.
+
+       NOTE: Whenever the processor tries to write to any ROM area
+             (Kernal, BASIC, CHAROM, ROML, ROMH), the data will get
+             "through the ROM" to the C64's internal RAM.
+
+             For this reason, you can easily copy data from ROM to RAM,
+             without any bank switching. But implementing external
+             memory expansions without DMA is very hard, as you have to
+             use a 256 byte window on the I/O1 or I/O2 area, like
+             GEORAM, or the Ultimax memory configuration, if you do not
+             want the data to be written both to internal and external
+             RAM.
+
+             However, this is not true for the Ultimax video game
+             configuration. In that mode, the internal RAM ignores all
+             memory accesses outside the area $0000-$0FFF, unless they
+             are performed by the VIC, and you can write to external
+             memory at $1000-$CFFF and $E000-$FFFF, if any, without
+             changing the contents of the internal RAM.
+
+A note concerning the I/O area
+
+  The I/O area of the Commodore 64 is divided as follows:
+
+     Address range  Owner
+     -------------  -----
+       D000-D3FF    MOS 6567/6569 VIC-II Video Interface Controller
+       D400-D7FF    MOS 6581 SID Sound Interface Device
+       D800-DBFF    Color RAM (only lower nybbles are connected)
+       DC00-DCFF    MOS 6526 CIA Complex Interface Adapter #1
+       DD00-DDFF    MOS 6526 CIA Complex Interface Adapter #2
+       DE00-DEFF    User expansion #1 (-I/O1 on Expansion Port)
+       DF00-DFFF    User expansion #2 (-I/O2 on Expansion Port)
+
+  As you can see, the address ranges for the chips are much larger
+than required. Because of this, you can access the chips through
+multiple memory areas.
+The VIC-II appears in its window every $40
+addresses. For instance, the addresses $D040 and $D080 are both mapped
+to the Sprite 0 X co-ordinate register. The SID has one register
+selection line fewer, so it appears every $20 bytes. The CIA chips
+have only 16 registers, so there are 16 copies of each in their memory
+area.
+
+  However, you should not use addresses other than those specified by
+Commodore. For instance, the Commodore 128 mapped its additional I/O
+chips to this same memory area, and the SID responds only to the
+addresses D400-D4FF, even in C64 mode. And the Commodore 65, or
+the C64DX, which unfortunately did not make its way to the market,
+could narrow the memory window reserved for its CSG 4567 VIC-III.
+
+The video chip
+
+  The MOS 6567/6569 VIC-II Video Interface Controller has access to
+only 16 kilobytes at a time. To enable the VIC-II to access the whole
+64 kB memory space, the main memory is divided into four banks of 16 kB
+each. The lines PA0 and PA1 of the second CIA are the inverse of the
+virtual VIC-II address lines VA14 and VA15, respectively. To select a
+VIC-II bank other than the default, you must program the CIA lines to
+output the desired bit pair. For instance, the following code selects
+the memory area $4000-$7FFF (bank 1) for the video controller:
+
+        LDA $DD02 ; Data Direction Register A
+        ORA #$03  ; Set pins PA0 and PA1 to outputs
+        STA $DD02
+        LDA $DD00
+        AND #$FC  ; Mask the lowmost bit pair off
+        ORA #$02  ; Select VIC-II bank 1 (the inverse of binary 01 is 10)
+        STA $DD00
+
+  Why should you set the pins to outputs? Hardware RESET resets all
+I/O lines to inputs, and thanks to the CIA's internal pull-up
+resistors, the inputs actually output logical high voltage level. So,
+upon -RESET, the video bank 0 is selected automatically, and older
+Kernals could leave it uninitialized.
+
+  Note that the VIC-II always fetches its information from the
+internal RAM, totally ignoring the memory configuration lines.
+There
+is only one exception to this rule: The character generator ROM.
+Unless the Ultimax mode is selected, VIC-II "sees" character generator
+ROM in the memory areas 1000-1FFF and 9000-9FFF. If the Ultimax
+configuration is active, the VIC-II fetches all data from the internal
+RAM.
+
+Accessing the memory places 0 and 1
+
+  Although the addresses 0 and 1 of the processor are hard-wired to
+its on-chip I/O port registers, you can access the memory places 0 and
+1. The video chip always reads from RAM (or character generator ROM),
+so you can also use it to read from 0 and 1. Enable the bit-map screen
+and set the start address of the graphics screen to 0. Now you can see
+these two memory locations in the upper left corner. Alternatively,
+you could set the character generator start address to 0, in which
+case you would see these locations in @ characters (code 0). Or, you
+can activate a sprite with start address 0. Whichever method you
+choose, you can read these locations with sprite collision registers.
+Define a sprite consisting of only one dot, and move it to read the 8
+bits of each byte with the sprite to sprite or sprite to background
+collision registers.
+
+  But how can you write to these locations? If you execute the command
+POKE 53265,59, you will see that the memory place 1 changes its value
+wildly. If you disable the interrupts (POKE 56333,127), it will remain
+stable. How is this possible? When the processor writes to 0 or 1, it
+will put the address on the address bus and set the R/-W line to indicate
+a write cycle, but it does not put the data on the data bus. Thus, it
+writes "random" data. Of course this data is not truly random. Actually
+it is something that the video chip left on the bus on its clock half.
+So, if you want to write a certain value on 0 or 1, you have to make the
+video chip read that value just before the store cycle.
+This requires
+very accurate timing, but it can be achieved even with a carefully
+written BASIC program. Just wait for the video chip to be in the top or
+bottom border and the beam to be in the middle of the screen (not in the
+side borders). In this area, the video chip will always read the last
+byte of the video bank (by default $3FFF). Now, if you store anything to
+the I/O port registers 0 or 1 while the video chip is refreshing this
+screen area, the contents of the memory place $3FFF will be written to
+the respective memory place (0 or 1). Note that this trick does not work
+reliably on all computers. You need good RF protection, as the data bus
+will not be driven at all when the value remains on it.
+
+  On the C128 in its 2 MHz mode, you can write to the memory places
+with an easier kludge. Just make sure that the video chip is not
+performing the memory refresh (as it would slow down to 1 MHz in that
+case), and use some instruction that reads from a proper memory location
+before writing to 0 or 1. Indexed zero-page addressing modes are good
+for it. I tested this trick with LDX#1 followed by STA $FF,X. As you
+can read from the instruction timing section of this document, the
+instruction first reads from $FF (the base address) and then writes to 0.
+The timing can be done with a simple LDA$D012:CMP$D012:BEQ *-3 loop.
+But in the C128 mode you can relocate the stack page to zero page, so
+this trick is not really useful.
+
+  You can also read the memory places 0 and 1 much faster than with
+sprite collisions. Just make the video chip read from 0 or 1, and
+then read from non-connected address space ($DE00-$DFFF on a stock C64;
+also $D700-$D7FF on C128's). Actually, you can produce a complete map
+of the video timing on your computer by making a loop that reads from
+open address space, pausing one frame and one cycle in between. And if
+you are into copy protections, you could write a program on the open
+address space.
+Just remember that there must be a byte on the bus for
+each clock cycle.
+
+  These tricks unfortunately do not work reliably on all units. So far
+I have had the opportunity to try them on three computers, two of which
+were Commodore 128 DCR's (C128's housed in metal case with a 1571 floppy
+disk drive, whose controller is integrated on the mother board). One
+C128DCR drove some of its data bits too heavily to high state. No wonder,
+since its housing consisted of some newspapers spread on the floor.
+
+                              Autostart Code
+
+  Although this document concentrates on hardware, there is one thing
+that you must know about the firmware to get complete control over
+your computer. As the Commodore 64 always switches the ROMs on upon
+-RESET, you cannot relocate the RESET vector by writing something in
+RAM. Instead, you have to use the Autostart code that will be
+recognized by the KERNAL ROM. If the memory places from $8004 through
+$8008 contain the PETSCII string 'CBM80' (C3 C2 CD 38 30), the RESET
+routine jumps to ($8000) and the default NMI handler jumps to ($8002).
+
+  Some programs that load into RAM take advantage of this and don't
+let the machine be reset. You don't have to modify the ROM to get
+rid of this annoying behaviour. Simply ground the -EXROM line for the
+beginning of the RESET sequence.
+
+                                   Notes
+
+  See the MCS 6500 Microcomputer Family Programming Manual for less
+information.
+
+References:
+  C64 Memory Maps       C64 Programmer's Reference Guide, pp. 262-267
+                        C64 Schematic Diagram
+  6510 Block Diagram    C64 Programmer's Reference Guide, p. 404
+  Instruction Set       C64 Programmer's Reference Guide, pp. 254-255, 416-417
+                        C64/128 Real Programmer's Revenge Guide
+                        C=Lehti magazine 4/87
+
+---------------------------------------------------------------------------
+Marko Mäkelä (Marko.Makela@HUT.FI)
+