diff --git a/M6502/documentation/6502_cpu.txt b/M6502/documentation/6502_cpu.txt
new file mode 100644
index 0000000..126bb66
--- /dev/null
+++ b/M6502/documentation/6502_cpu.txt
@@ -0,0 +1,1534 @@
+#
+# $Id: 64doc,v 1.8 1994/06/03 19:50:04 jopi Exp $
+#
+# This file is part of Commodore 64 emulator
+#      and Program Development System.
+#
+# See README for copyright notice
+#
+# This file contains documentation for 6502/6510/8500/8502 instruction set.
+#
+#
+# Written by
+#   John West       (john@ucc.gu.uwa.edu.au)
+#   Marko Mäkelä    (msmakela@kruuna.helsinki.fi)
+#
+#
+# $Log: 64doc,v $
+# Revision 1.8  1994/06/03  19:50:04  jopi
+# Patchlevel 2
+#
+# Revision 1.7  1994/04/15  13:07:04  jopi
+# 65xx Register descriptions added
+#
+# Revision 1.6  1994/02/18  16:09:36  jopi
+#
+# Revision 1.5  1994/01/26  16:08:37  jopi
+# X64 version 0.2 PL 1
+#
+# Revision 1.4  1993/11/10  01:55:34  jopi
+#
+# Revision 1.3  93/06/21  13:37:18  jopi
+#  X64 version 0.2 PL 0
+#
+# Revision 1.2  93/06/21  13:07:15  jopi
+# *** empty log message ***
+#
+#
+
+ Note: To extract the uuencoded ML programs in this article most
+       easily you may use e.g. "uud" by Edwin Kremer,
+       which extracts them all at once.
+
+
+Documentation for the NMOS 65xx/85xx Instruction Set
+
+        6510 Instructions by Addressing Modes
+        6502 Registers
+        6510/8502 Undocumented Commands
+        Register selection for load and store
+        Decimal mode in NMOS 6500 series
+        6510 features
+        Different CPU types
+        6510 Instruction Timing
+        How Real Programmers Acknowledge Interrupts
+        Memory Management
+        Autostart Code
+        Notes
+        References
+
+
+6510 Instructions by Addressing Modes
+
+off- ++++++++++ Positive ++++++++++  ---------- Negative ----------
+set  00      20      40      60      80      a0      c0      e0      mode
+
++00  BRK     JSR     RTI     RTS     NOP*    LDY     CPY     CPX     Impl/immed
++01  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     (indir,x)
++02   t       t       t       t      NOP*t   LDX     NOP*t   NOP*t     ? /immed
++03  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    (indir,x)
++04  NOP*    BIT     NOP*    NOP*    STY     LDY     CPY     CPX     Zeropage
++05  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Zeropage
++06  ASL     ROL     LSR     ROR     STX     LDX     DEC     INC     Zeropage
++07  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    Zeropage
+
++08  PHP     PLP     PHA     PLA     DEY     TAY     INY     INX     Implied
++09  ORA     AND     EOR     ADC     NOP*    LDA     CMP     SBC     Immediate
++0a  ASL     ROL     LSR     ROR     TXA     TAX     DEX     NOP     Accu/impl
++0b  ANC**   ANC**   ASR**   ARR**   ANE**   LXA**   SBX**   SBC*    Immediate
++0c  NOP*    BIT     JMP     JMP ()  STY     LDY     CPY     CPX     Absolute
++0d  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute
++0e  ASL     ROL     LSR     ROR     STX     LDX     DEC     INC     Absolute
++0f  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    Absolute
+
++10  BPL     BMI     BVC     BVS     BCC     BCS     BNE     BEQ     Relative
++11  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     (indir),y
++12   t       t       t       t       t       t       t       t         ?
++13  SLO*    RLA*    SRE*    RRA*    SHA**   LAX*    DCP*    ISB*    (indir),y
++14  NOP*    NOP*    NOP*    NOP*    STY     LDY     NOP*    NOP*    Zeropage,x
++15  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Zeropage,x
++16  ASL     ROL     LSR     ROR     STX  y) LDX  y) DEC     INC     Zeropage,x
++17  SLO*    RLA*    SRE*    RRA*    SAX* y) LAX* y) DCP*    ISB*    Zeropage,x
+
++18  CLC     SEC     CLI     SEI     TYA     CLV     CLD     SED     Implied
++19  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute,y
++1a  NOP*    NOP*    NOP*    NOP*    TXS     TSX     NOP*    NOP*    Implied
++1b  SLO*    RLA*    SRE*    RRA*    SHS**   LAS**   DCP*    ISB*    Absolute,y
++1c  NOP*    NOP*    NOP*    NOP*    SHY**   LDY     NOP*    NOP*    Absolute,x
++1d  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute,x
++1e  ASL     ROL     LSR     ROR     SHX**y) LDX  y) DEC     INC     Absolute,x
++1f  SLO*    RLA*    SRE*    RRA*    SHA**y) LAX* y) DCP*    ISB*    Absolute,x
+
+        The ROR instruction is available on MC650x microprocessors
+        made after June 1976.
+
+        Legend:
+
+        t       Jams the machine
+        *t      Jams very rarely
+        *       Undocumented command
+        **      Unusual operation
+        y)      indexed using Y instead of X
+        ()      indirect instead of absolute
+
+Note that the NOP instructions do have addressing modes other than
+implied addressing. Such a NOP works just like any other load
+instruction, except that it neither stores the result anywhere nor
+affects the flags.
+
+6502 Registers
+
+The NMOS 65xx processors are not ruined with too many registers. In addition
+to that, the registers are mostly 8-bit. Here is a brief description of each
+register:
+
+     PC Program Counter
+          This register points to the address from which the next instruction
+          byte (opcode or parameter) will be fetched. Unlike other
+          registers, this one is 16 bits in length. The low and high 8-bit
+          halves of the register are called PCL and PCH, respectively. The
+          Program Counter may be read by pushing its value on the stack.
+          This can be done either by jumping to a subroutine or by causing
+          an interrupt.
+     S Stack pointer
+          The NMOS 65xx processors have 256 bytes of stack memory, ranging
+          from $0100 to $01FF. The S register is an 8-bit offset to the stack
+          page. In other words, whenever anything is being pushed on the
+          stack, it will be stored to the address $0100+S.
+
+          The Stack pointer can be read and written by transferring its value
+          to or from the index register X (see below) with the TSX and TXS
+          instructions.
+     P Processor status
+          This 8-bit register stores the state of the processor. The bits in
+          this register are called flags. Most of the flags have something
+          to do with arithmetic operations.
+
+          The P register can be read by pushing it on the stack (with PHP or
+          by causing an interrupt). If you only need to read one flag, you
+          can use the branch instructions. Setting the flags is possible by
+          pulling the P register from stack or by using the flag set or
+          clear instructions.
+
+          Following is a list of the flags, starting from the 8th bit of the
+          P register (bit 7, value $80):
+               N Negative flag
+                    This flag will be updated by any arithmetic operation
+                    (whenever any of the registers A, X or Y is being loaded
+                    with a value). Generally, the N flag will be copied from
+                    the topmost bit of the register being loaded.
+
+                    Note that TXS (Transfer X to S) is not an arithmetic
+                    operation. Also note that the BIT instruction affects
+                    the Negative flag just like arithmetic operations.
+                    Finally, the Negative flag behaves differently in
+                    Decimal operations (see description below).
+               V oVerflow flag
+                    Like the Negative flag, this flag is intended to be used
+                    with 8-bit signed integer numbers. The flag will be
+                    affected by addition and subtraction, the instructions
+                    PLP, CLV and BIT, and the hardware signal -SO. Note that
+                    there is no SEV instruction, even though the MOS
+                    engineers loved to use East European abbreviations, like
+                    DDR (Deutsche Demokratische Republik vs. Data Direction
+                    Register). (The Russian abbreviation for their former
+                    trade association COMECON is SEV.) The -SO (Set
+                    Overflow) signal is available on some processors, at
+                    least the 6502, to set the V flag. This enables response
+                    to an I/O activity in three clock cycles or less when
+                    using a BVC instruction branching to itself ($50 $FE).
+
+                    The CLV instruction clears the V flag, and the PLP and
+                    BIT instructions copy the flag value from the bit 6 of
+                    the topmost stack entry or from memory.
+
+                    After a binary addition or subtraction, the V flag will
+                    be set on a sign overflow, cleared otherwise. What is a
+                    sign overflow? For instance, if you are trying to add
+                    123 and 45 together, the result (168) does not fit in an
+                    8-bit signed integer (upper limit 127 and lower limit
+                    -128). Similarly, adding -123 to -45 causes the
+                    overflow, just like subtracting -45 from 123 or 123 from
+                    -45 would do.
+
+                    Like the N flag, the V flag will not be set as expected
+                    in the Decimal mode. Later in this document is a precise
+                    operation description.
+
+                    A common misbelief is that the V flag could only be set
+                    by arithmetic operations, not cleared.
+               1 unused flag
+                    To the current knowledge, this flag is always 1.
+               B Break flag
+                    This flag is used to distinguish software (BRK)
+                    interrupts from hardware interrupts (IRQ or NMI). The B
+                    flag is always set, except in the copy of the P register
+                    that is pushed on the stack when the processor jumps to
+                    an interrupt routine to process only a hardware
+                    interrupt.
+
+                    The official NMOS 65xx documentation claims that the BRK
+                    instruction could only cause a jump to the IRQ vector
+                    ($FFFE). However, if an NMI interrupt occurs while
+                    executing a BRK instruction, the processor will jump to
+                    the NMI vector ($FFFA), and the P register will be
+                    pushed on the stack with the B flag set.
+               D Decimal mode flag
+                    This flag is used to select the (Binary Coded) Decimal
+                    mode for addition and subtraction. In most applications,
+                    the flag is zero.
+
+                    The Decimal mode has many oddities, and it operates
+                    differently on CMOS processors. See the description of
+                    the ADC, SBC and ARR instructions below.
+               I Interrupt disable flag
+                    This flag can be used to prevent the processor from
+                    jumping to the IRQ handler vector ($FFFE) whenever the
+                    hardware line -IRQ is active. The flag will be
+                    automatically set after taking an interrupt, so that the
+                    processor would not keep jumping to the interrupt
+                    routine if the -IRQ signal remains low for several clock
+                    cycles.
+               Z Zero flag
+                    The Zero flag will be affected in the same cases as
+                    the Negative flag. Generally, it will be set if an
+                    arithmetic register is being loaded with the value zero,
+                    and cleared otherwise. The flag will behave differently
+                    in Decimal operations.
+               C Carry flag
+                    This flag is used in additions, subtractions,
+                    comparisons and bit rotations. In additions and
+                    subtractions, it acts as a 9th bit and lets you chain
+                    operations to calculate with bigger than 8-bit numbers.
+                    When subtracting, the Carry flag is the negative of
+                    Borrow: if a borrow occurs, the flag will be clear,
+                    otherwise set. Comparisons are a special case of
+                    subtraction: they assume Carry flag set and Decimal flag
+                    clear, and do not store the result of the subtraction
+                    anywhere.
+
+                    There are four kinds of bit rotations. All of them store
+                    the bit that is being rotated off to the Carry flag. The
+                    left shifting instructions are ROL and ASL. ROL copies
+                    the initial Carry flag to the lowmost bit of the byte;
+                    ASL always clears it. Similarly, the ROR and LSR
+                    instructions shift to the right.
+     A Accumulator
+          The accumulator is the main register for arithmetic and logic
+          operations. Unlike the index registers X and Y, it has a direct
+          connection to the Arithmetic and Logic Unit (ALU). This is why
+          many operations are only available for the accumulator, not the
+          index registers.
+     X Index register X
+          This is the main register for addressing data with indices. It has
+          a special addressing mode, indexed indirect, which lets you
+          have a vector table on the zero page.
+     Y Index register Y
+          The Y register has the fewest operations available. On the other
+          hand, only it has the indirect indexed addressing mode that
+          enables access to any memory place without having to use
+          self-modifying code.
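+
+The flag descriptions above can be summarized for a binary-mode
+addition with a small C sketch. The function name adc_binary and the
+struct flags are made up for this illustration; they model only what
+the text says about ADC with the D flag clear.
+

```c
/* Hypothetical model of ADC in Binary mode (D flag clear).
   All values are 8-bit quantities held in unsigned ints. */
struct flags { unsigned c, z, v, n; };

unsigned adc_binary(unsigned a, unsigned s, struct flags *f)
{
    unsigned sum = (a & 0xFF) + (s & 0xFF) + (f->c & 1); /* 9-bit sum */
    unsigned r = sum & 0xFF;                             /* new accumulator */

    /* Sign overflow: both operands have a sign different from the result. */
    f->v = ((a ^ r) & (s ^ r) & 0x80) != 0;
    f->c = sum > 0xFF;            /* the "9th bit" of the result */
    f->z = (r == 0);
    f->n = (r & 0x80) != 0;       /* copied from the topmost bit */
    return r;
}
```

+
+Starting with the Carry flag clear, adc_binary(123, 45, &f) returns 168
+and leaves V and N set, which is the sign overflow used as an example
+in the oVerflow flag description.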
+
+6510/8502 Undocumented Commands
+
+-- A brief explanation about what may happen while using don't care states.
+
+        ANE $8B         A = (A | #$EE) & X & #byte
+                        same as
+                        A = ((A & #$11 & X) | ( #$EE & X)) & #byte
+
+                        In real 6510/8502 the internal parameter #$11
+                        may occasionally be #$10, #$01 or even #$00.
+                        This occurs when the video chip starts DMA
+                        between the opcode fetch and the parameter fetch
+                        of the instruction.  The value probably depends
+                        on the data that was left on the bus by the VIC-II.
+
+        LXA $AB         C=Lehti:   A = X = ANE
+                        Alternate: A = X = (A & #byte)
+
+                        TXA and TAX have to be responsible for these.
+
+        SHA $93,$9F     Store (A & X & (ADDR_HI + 1))
+        SHX $9E         Store (X & (ADDR_HI + 1))
+        SHY $9C         Store (Y & (ADDR_HI + 1))
+        SHS $9B         SHA and TXS, where X is replaced by (A & X).
+
+                        Note: The value to be stored is copied also
+                        to ADDR_HI if page boundary is crossed.
+
+        SBX $CB         Carry and Decimal flags are ignored but the
+                        Carry flag will be set in subtraction. This
+                        is due to the CMP command, which is executed
+                        instead of the real SBC.
+
+        ARR $6B         This instruction first performs an AND
+                        between the accumulator and the immediate
+                        parameter, then it shifts the accumulator to
+                        the right. However, this is not the whole
+                        truth. See the description below.
+
+Many undocumented commands do not use AND between registers; the CPU
+just throws the bytes to a bus simultaneously and lets the
+open-collector drivers perform the AND. E.g. the command called 'SAX',
+which is in the STORE section (opcodes $80...$9F), stores the result
+of (A & X) this way.
+
+More fortunate is its opposite, 'LAX' which just loads a byte
+simultaneously into both A and X.
+
+        $6B  ARR
+
+This instruction seems to be a harmless combination of AND and ROR at
+first sight, but it turns out that it affects the V flag and also has
+a special kind of decimal mode. This is because the instruction has
+inherited some properties of the ADC instruction ($69) in addition to
+the ROR ($6A).
+
+In Binary mode (D flag clear), the instruction effectively does an AND
+between the accumulator and the immediate parameter, and then shifts
+the accumulator to the right, copying the C flag to the 8th bit. It
+sets the Negative and Zero flags just like the ROR would. The ADC code
+shows up in the Carry and oVerflow flags. The C flag will be copied
+from the bit 6 of the result (which doesn't seem too logical), and the
+V flag is the result of an Exclusive OR operation between the bit 6
+and the bit 5 of the result.  This makes sense, since the V flag will
+be normally set by an Exclusive OR, too.
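+
+As a sketch only, the Binary mode behaviour described above can be
+written in C as follows. The function arr_binary and struct arr_result
+are names invented for this example, not part of any emulator.
+

```c
/* Hypothetical model of ARR ($6B) in Binary mode (D flag clear). */
struct arr_result { unsigned a, c, z, v, n; };

struct arr_result arr_binary(unsigned a, unsigned imm, unsigned c)
{
    struct arr_result r;
    unsigned t = a & imm & 0xFF;              /* AND with the immediate */

    r.a = (t >> 1) | ((c & 1) << 7);          /* rotate right through old C */
    r.n = (r.a >> 7) & 1;                     /* N and Z as for ROR */
    r.z = (r.a == 0);
    r.c = (r.a >> 6) & 1;                     /* C is bit 6 of the result */
    r.v = ((r.a >> 6) ^ (r.a >> 5)) & 1;      /* V is bit 6 XOR bit 5 */
    return r;
}
```

+
+For instance, arr_binary(0x80, 0xFF, 0) gives A = $40 with both C and V
+set, because bit 6 of the result is 1 and bit 5 is 0.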
+
+In Decimal mode (D flag set), the ARR instruction first performs the
+AND and ROR, just like in Binary mode. The N flag will be copied from
+the initial C flag, and the Z flag will be set according to the ROR
+result, as expected. The V flag will be set if the bit 6 of the
+accumulator changed its state between the AND and the ROR, cleared
+otherwise.
+
+Now comes the funny part. If the low nybble of the AND result,
+incremented by its lowmost bit, is greater than 5, the low nybble in
+the ROR result will be incremented by 6. The low nybble may overflow
+as a consequence of this BCD fixup, but the high nybble won't be
+adjusted. The high nybble will be BCD fixed in a similar way. If the
+high nybble of the AND result, incremented by its lowmost bit, is
+greater than 5, the high nybble in the ROR result will be incremented
+by 6, and the Carry flag will be set. Otherwise the C flag will be
+cleared.
+
+To help you understand this description, here is a C routine that
+illustrates the ARR operation in Decimal mode:
+
+        unsigned
+           A,  /* Accumulator */
+           AL, /* low nybble of accumulator */
+           AH, /* high nybble of accumulator */
+
+           C,  /* Carry flag */
+           Z,  /* Zero flag */
+           V,  /* oVerflow flag */
+           N,  /* Negative flag */
+
+           t,  /* temporary value */
+           s;  /* value to be ARRed with Accumulator */
+
+        t = A & s;                      /* Perform the AND. */
+
+        AH = t >> 4;                    /* Separate the high */
+        AL = t & 15;                    /* and low nybbles. */
+
+        N = C;                          /* Set the N and */
+        Z = !(A = (t >> 1) | (C << 7)); /* Z flags traditionally */
+        V = (t ^ A) & 64;               /* and V flag in a weird way. */
+
+        if (AL + (AL & 1) > 5)          /* BCD "fixup" for low nybble. */
+          A = (A & 0xF0) | ((A + 6) & 0xF);
+
+        if ((C = (AH + (AH & 1) > 5)))  /* Set the Carry flag. */
+          A = (A + 0x60) & 0xFF;        /* BCD "fixup" for high nybble. */
+
+        $CB  SBX   X <- (A & X) - Immediate
+
+The 'SBX' ($CB) may seem to be a very complex operation, even though it
+is a combination of the subtraction of accumulator and parameter, as
+in the 'CMP' instruction, and the command 'DEX'. As a result, both A
+and X are connected to ALU but only the subtraction takes place. Since
+the comparison logic was used, the result of subtraction should be
+normally ignored, but the 'DEX' now happily stores to X the value of
+(A & X) - Immediate.  That is why this instruction does not have any
+decimal mode, and it does not affect the V flag. Also Carry flag will
+be ignored in the subtraction but set according to the result.
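+
+Under the theory stated above, SBX can be sketched in C like this. The
+names sbx and struct sbx_result are assumptions made for the example;
+N and Z are set from the result as CMP would set them, and the V and D
+flags play no part.
+

```c
/* Hypothetical model of SBX ($CB): X <- (A & X) - Immediate. */
struct sbx_result { unsigned x, c, z, n; };

struct sbx_result sbx(unsigned a, unsigned x, unsigned imm)
{
    struct sbx_result r;
    unsigned t = a & x & 0xFF;        /* A and X meet on the bus */
    unsigned m = imm & 0xFF;

    r.x = (t - m) & 0xFF;             /* no borrow-in: old Carry is ignored */
    r.c = (t >= m);                   /* Carry set unless a borrow occurred */
    r.z = (r.x == 0);
    r.n = (r.x >> 7) & 1;
    return r;
}
```

+
+For example, sbx(0xF0, 0x3C, 0x30) leaves X = $00 with C and Z set,
+regardless of what the Carry flag held before the instruction.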
+
+ Proof:
+
+begin 644 vsbx
+M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```*D`H#V1*Z`_D2N@09$KJ0>%
+M^QBE^VEZJ+$KH#F1*ZD`2"BI`*(`RP`(:-B@.5$K*4#P`E@`H#VQ*SAI`)$K
+JD-Z@/[$K:0"1*Y#4J2X@TO\XH$&Q*VD`D2N0Q,;[$+188/_^]_:_OK>V
+`
+end
+
+ and
+
+begin 644 sbx
+M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI`*!-D2N@3Y$KH%&1*ZD#
+MA?L8I?M*2)`#J1@LJ3B@29$K:$J0`ZGX+*G8R)$K&/BXJ?2B8\L)AOP(:(7]
+MV#B@3;$KH$\Q*Z!1\2L(1?SP`0!H1?TIM]#XH$VQ*SAI`)$KD,N@3[$K:0"1
+9*Y#!J2X@TO\XH%&Q*VD`D2N0L<;[$))88-#X
+`
+end
+
+These test programs show if your machine is compatible with ours
+regarding the opcode $CB. The first test, vsbx, proves that SBX does
+not affect the V flag. The latter one, sbx, proves the rest of our
+theory. The vsbx test tests 33554432 SBX combinations (16777216
+different A, X and Immediate combinations, and two different V flag
+states), and the sbx test doubles that amount (16777216*4 D and C flag
+combinations). Both tests have run successfully on a C64 and a Vic20.
+They ought to run on C16, +4 and the PET series as well. The tests
+stop with BRK, if the opcode $CB does not work as expected. Successful
+operation ends in RTS. As the tests are very slow, they print dots on
+the screen while running so that you know that the machine has not
+jammed. On computers running at 1 MHz, the first test prints
+approximately one dot every four seconds and a total of 2048 dots,
+whereas the second one prints half that amount, one dot every seven
+seconds.
+
+If the tests fail on your machine, please let us know your processor's
+part number and revision. If possible, save the executable (after it
+has stopped with BRK) under another name and send it to us so that we
+know at which stage the program stopped.
+
+The following program is a Commodore 64 executable that Marko Mäkelä
+developed when trying to find out how the V flag is affected by SBX.
+(It was believed that the SBX affects the flag in a weird way, and
+this program shows how SBX sets the flag differently from SBC.)  You
+may find the subroutine at $C150 useful when researching other
+undocumented instructions' flags. Run the program in a machine
+language monitor, as it makes use of the BRK instruction. The result
+tables will be written on pages $C2 and $C3.
+
+begin 644 sbx-c100
+M`,%XH`",#L&,$,&,$L&XJ8*B@LL7AOL(:(7\N#BM#L$M$,'M$L$(Q?OP`B@`
+M:$7\\`,@4,'N#L'0U.X0P=#/SB#0[A+!T,<``````````````)BJ\!>M#L$M
+L$,'=_\'0":T2P=W_PM`!8,K0Z:T.P2T0P9D`PID`!*T2P9D`PYD`!
+
+Other undocumented instructions usually cause the two preceding opcodes
+to be executed. However, 'NOP' seems to disappear completely from the
+'SBC' code $EB.
+
+The most difficult to comprehend are the rest of the instructions
+located on the '$0B' line.
+
+All the instructions located at the positive (left) side of this line
+should rotate either memory or the accumulator, but the addressing
+mode turns out to be immediate! No problem. Just read the operand, let
+it be ANDed with the accumulator and finally use accumulator
+addressing mode for the instructions above them.
+
+RELIGION_MODE_ON
+/* This part of the document is not accurate.  You can
+   read it as a fairy tale, but do not count on it when
+   performing your own measurements. */
+
+The remaining two instructions on the same line, called 'ANE' and 'LXA'
+($8B and $AB respectively) often give quite unpredictable results.
+However, the most usual operation is to store ((A | #$ee) & X & #$nn)
+to accumulator. Note that this does not work reliably in a real 64!
+In the Commodore 128 the opcode $8B uses values 8C, CC, EE, and
+occasionally 0C and 8E for the OR instead of EE,EF,FE and FF used in
+the C64. With a C128 running at 2 MHz #$EE is always used.  Opcode $AB
+does not cause this OR to take place on the 8502, while the 6510 always
+performs it. Note that this behaviour depends on processor and/or video
+chip revision.
+
+Let's take a closer look at $8B (6510).
+
+        A <- X & D & (A | VAL)
+
+        where VAL comes from this table:
+
+       X high   D high  D low   VAL
+        even     even    ---    $EE (1)
+        even     odd     ---    $EE
+        odd      even    ---    $EE
+        odd      odd      0     $EE
+        odd      odd     not 0  $FE (2)
+
+(1) If the bottom 2 bits of A are both 1, then the LSB of the result may
+    be 0. The values of X and D are different every time I run the test.
+    This appears to be very rare.
+(2) VAL is $FE most of the time. Sometimes it is $EE - it seems to be random,
+    not related to any of the data. This is much more common than (1).
+
+  In decimal mode, VAL is usually $FE.
+
+Two different functions have been discovered for LXA, opcode $AB. One
+is A = X = ANE (see above) and the other, encountered with 6510 and
+8502, is less complicated A = X = (A & #byte). However, according to
+what is reported, the version altering only the lowest bits of each
+nybble seems to be more common.
+
+What happens is that $AB loads a value into both A and X, ANDing the
+low bit of each nybble with the corresponding bit of the old
+A. However, there are exceptions. Sometimes the low bit is cleared
+even when A contains a '1', and sometimes other bits are cleared. The
+exceptions seem random (they change every time I run the test). Oops -
+that was in decimal mode. Much the same with D=0.
+
+What causes the randomness?  Probably it is marginal logic
+levels - when too much wired-anding goes on, some of the signals get
+very close to the threshold. Perhaps we're seeing some of them step
+over it. The low bit of each nybble is special, since it has to cope
+with carry differently (remember decimal mode). We never see a '0'
+turn into a '1'.
+
+Since these instructions are unpredictable, they should not be used.
+
+There is still one very strange instruction left, the one named SHA/X/Y,
+which is the only one with only indexed addressing modes. Actually,
+the commands 'SHA', 'SHX' and 'SHY' are generated by the indexing
+algorithm.
+
+While using indexed addressing, effective address for page boundary
+crossing is calculated as soon as possible so it does not slow down
+operation. As a result, in the case of SHA/X/Y, the address and data
+are processed at the same time, causing an AND between them to take place.
+Thus, the value to be stored by SHA, for example, is in fact (A & X &
+(ADDR_HI + 1)).  On page boundary crossing the same value is copied
+also to high byte of the effective address.
+
+RELIGION_MODE_OFF
+
+
+Register selection for load and store
+
+   bit1 bit0     A  X  Y
+    0    0             x
+    0    1       x
+    1    0          x
+    1    1       x  x
+
+So, A and X are selected by bits 0 and 1 respectively, while
+ ~(bit1|bit0) enables Y.
+
+Indexing is determined by bit4, even in relative addressing mode,
+which is one kind of indexing.
+
+Lines containing opcodes xxx000x1 (01 and 03) are treated as absolute
+after the effective address has been loaded into CPU.
+
+Zeropage,y and Absolute,y (codes 10x1 x11x) are distinguished by bit5.
+
+
+Decimal mode in NMOS 6500 series
+
+  Most sources claim that the NMOS 6500 series sets the N, V and Z
+flags unpredictably when performing addition or subtraction in decimal
+mode. Of course, this is not true. While testing how the flags are
+set, I also wanted to see what happens if you use illegal BCD values.
+
+  ADC works in Decimal mode in a quite complicated way. It is amazing
+how it can do that all in a single cycle. Here's a C code version of
+the instruction:
+
+        unsigned
+           A,  /* Accumulator */
+           AL, /* low nybble of accumulator */
+           AH, /* high nybble of accumulator */
+
+           C,  /* Carry flag */
+           Z,  /* Zero flag */
+           V,  /* oVerflow flag */
+           N,  /* Negative flag */
+
+           s;  /* value to be added to Accumulator */
+
+        AL = (A & 15) + (s & 15) + C;         /* Calculate the lower nybble. */
+
+        if (AL > 9) AL += 6;                  /* BCD fixup for lower nybble. */
+
+        /* The upper nybble takes its carry from the fixed lower nybble. */
+        AH = (A >> 4) + (s >> 4) + (AL > 15); /* Calculate the upper nybble. */
+
+        Z = (((A + s + C) & 255) == 0);       /* Zero flag is set just
+                                                 like in Binary mode. */
+
+        /* Negative and Overflow flags are set with the same logic as in
+           Binary mode, but after fixing the lower nybble. */
+
+        N = ((AH & 8) != 0);
+        V = (((AH << 4) ^ A) & 128) && !((A ^ s) & 128);
+
+        if (AH > 9) AH += 6;                  /* BCD fixup for upper nybble. */
+
+        /* Carry is the only flag set after fixing the result. */
+
+        C = (AH > 15);
+        A = ((AH << 4) | (AL & 15)) & 255;
+
+  The C flag is set as the quiche eaters expect, but the N and V flags
+are set after fixing the lower nybble but before fixing the upper one.
+They use the same logic as binary mode ADC. The Z flag is set before
+any BCD fixup, so the D flag does not have any influence on it.
+
+Proof: The following test program tests all 131072 ADC combinations in
+       Decimal mode, and aborts with BRK if anything breaks this theory.
+       If everything goes well, it ends in RTS.
+
+begin 600 dadc
+M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@   'BI&*  A/N$_$B@+)$KH(V1
+M*Q@(I?PI#X7]I?LI#V7]R0J0 FD%J"D/A?VE^RGP9?PI\ C $) ":0^JL @H
+ML ?)H) &""@X:5\X!?V%_0AH*3W@ ! ""8"HBD7[$ JE^T7\, 28"4"H**7[
+M9?S0!)@) J@8N/BE^V7\V A%_= G:(3]1?W0(.;[T(?F_-"#:$D8\ )88*D=
+0&&4KA?NI &4LA?RI.&S[  A%
+
+end
+
+  All programs in this chapter have been successfully tested on a Vic20
+and a Commodore 64 and a Commodore 128D in C64 mode. They should run on
+C16, +4 and on the PET series as well. If not, please report the problem
+to Marko Mäkelä. Each test in this chapter should run in less than a
+minute at 1 MHz.
+
+SBC is much easier. Just like CMP, its flags are not affected by
+the D flag.
+
+Proof:
+
+begin 600 dsbc-cmp-flags
+M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@   'B@ (3[A/RB XH8:66HL2N@
+M09$KH$R1*XII::BQ*Z!%D2N@4)$K^#BXI?OE_-@(:(7].+BE^^7\"&A%_? !
+5 .;[T./F_-#?RA"_8!@X&#CEY<7%
+
+end
+
+  The only difference in SBC's operation in decimal mode from binary mode
+is the result-fixup:
+
+        unsigned
+           A,  /* Accumulator */
+           AL, /* low nybble of accumulator */
+           AH, /* high nybble of accumulator */
+
+           C,  /* Carry flag */
+           Z,  /* Zero flag */
+           V,  /* oVerflow flag */
+           N,  /* Negative flag */
+
+           s,  /* value to be subtracted from Accumulator */
+           t;  /* binary difference, used for the flags */
+
+        t = A - s - !C;                       /* Binary subtraction. */
+
+        AL = (A & 15) - (s & 15) - !C;        /* Calculate the lower nybble. */
+
+        /* Take the borrow into the upper nybble before fixing the lower. */
+        AH = (A >> 4) - (s >> 4) - ((AL & 16) != 0);
+
+        if (AL & 16) AL -= 6;                 /* BCD fixup for lower nybble. */
+
+        if (AH & 16) AH -= 6;                 /* BCD fixup for upper nybble. */
+
+        /* The flags are set just like in Binary mode. */
+
+        C = ((t & 256) == 0);                 /* Set unless a borrow occurred. */
+        Z = ((t & 255) == 0);
+        V = ((A ^ t) & 128) && ((A ^ s) & 128);
+        N = ((t & 128) != 0);
+
+        A = ((AH << 4) | (AL & 15)) & 255;
+
+  Again, the Z flag is set before any BCD fixup. The N and V flags are set
+at any time before fixing the high nybble. The C flag may be set in any
+phase.
+
+  Decimal subtraction is easier than decimal addition, as you have to
+make the BCD fixup only when a nybble overflows. In decimal addition,
+you had to verify if the nybble was greater than 9. The processor has
+an internal "half carry" flag for the lower nybble, used to trigger
+the BCD fixup. When calculating with legal BCD values, the lower nybble
+cannot overflow again when fixing it.
+So, the processor does not handle overflows while performing the fixup.
+Similarly, the BCD fixup occurs in the high nybble only if the value
+overflows, i.e. when the C flag will be cleared.
+
+  Because SBC's flags are not affected by the Decimal mode flag, you
+could guess that CMP uses the SBC logic, only setting the C flag
+first. But the SBX instruction shows that CMP also temporarily clears
+the D flag, although it is totally unnecessary.
+
+  The following program, which tests SBC's result and flags,
+contains the 6502 version of the pseudo code example above.
+
+begin 600 dsbc
+M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@   'BI&*  A/N$_$B@+)$KH':1
+M*S@(I?PI#X7]I?LI#^7]L /I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL KI7RBP
+M#ND/.+ )*+ &Z0^P NE?A/T%_87]*+BE^^7\"&BH.+CXI?OE_-@(1?W0FVB$
+8_47]T)3F^]">YOS0FFA)&- $J3C0B%A@
+
+end
+
+  Obviously the undocumented instructions RRA (ROR+ADC) and ISB
+(INC+SBC) have also inherited the decimal operation from the official
+instructions ADC and SBC. The program droradc proves this statement
+for RRA, and the dincsbc test proves this for ISB. Finally,
+dincsbc-deccmp proves that ISB's and DCP's (DEC+CMP) flags are not
+affected by the D flag.
+
+begin 644 droradc
+M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH(V1
+M*S@(I?PI#X7]I?LI#V7]R0J0`FD%J"D/A?VE^RGP9?PI\`C`$)`":0^JL`@H
+ML`?)H)`&""@X:5\X!?V%_0AH*3W@`!`""8"HBD7[$`JE^T7\,`28"4"H**7[
+M9?S0!)@)`J@XN/BE^R;\9_S8"$7]T"=HA/U%_=`@YOO0A>;\T(%H21CP`EA@
+2J1T892N%^ZD`92R%_*DX;/L`
+`
+end
+
+begin 644 dincsbc
+M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH':1
+M*S@(I?PI#X7]I?LI#^7]L`/I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL`KI7RBP
+M#ND/.+`)*+`&Z0^P`NE?A/T%_87]*+BE^^7\"&BH.+CXI?O&_.?\V`A%_="9
+::(3]1?W0DN;[T)SF_-"8:$D8T`2I.-"&6&#\
+`
+end
+
+begin 644 dincsbc-deccmp
+M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'B@`(3[A/RB`XH8:7>HL2N@
+M3Y$KH%R1*XII>ZBQ*Z!3D2N@8)$KBFE_J+$KH%61*Z!BD2OX.+BE^^;\Q_S8
+L"&B%_3BXI?OF_,?\"&A%_?`!`.;[T-_F_-#;RA"M8!@X&#CFYL;&Q\?GYP#8
+`
+end
+
+
+6510 features
+
+   o  PHP always pushes the Break (B) flag as a `1' to the stack.
+      Jukka Tapanimäki claimed in C=lehti issue 3/89, on page 27, that the
+      processor makes a logical OR between the status register's bit 4
+      and the bit 8 of the stack pointer register (which is always 1).
+      He did not give any reasons for this argument, and has refused to clarify
+      it afterwards. Well, this was not the only error in his article...
+
+   o  Indirect addressing modes do not handle page boundary crossing at all.
+      When the parameter's low byte is $FF, the effective address wraps
+      around and the CPU fetches high byte from $xx00 instead of $xx00+$0100.
+      E.g. JMP ($01FF) fetches PCL from $01FF and PCH from $0100,
+      and LDA ($FF),Y fetches the base address from $FF and $00.
+
+   o  Indexed zero page addressing modes never fix the page address on
+      crossing the zero page boundary.
+      E.g. LDX #$01 : LDA ($FF,X) loads the effective address from $00 and $01.
+
+   o  The processor always fetches the byte following a relative branch
+      instruction. If the branch is taken, the processor then reads the
+      opcode from the destination address. If a page boundary is crossed, it
+      first reads a byte from the old page from a location that is bigger
+      or smaller than the correct address by one page.
+
+   o  If you cross a page boundary in any other indexed mode,
+      the processor reads an incorrect location first, a location that is
+      smaller by one page.
+
+   o  Read-Modify-Write instructions write unmodified data, then modified
+      (so INC effectively does LDX loc;STX loc;INX;STX loc)
+
+   o  -RDY is ignored during writes
+      (This is why you must wait 3 cycles before doing any DMA --
+      the maximum number of consecutive writes is 3, which occurs
+      during interrupts except -RESET.)
+
+   o  Some undefined opcodes may give really unpredictable results.
+
+   o  All registers except the Program Counter remain unmodified after -RESET.
+      (This is why you must preset D and I flags in the RESET handler.)
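+
+  The zero page wrap-around in the indirect modes listed above can be
+sketched in C as follows. The function names are hypothetical, and mem
+is assumed to point to an array that covers at least the zero page.
+

```c
#include <stdint.h>

/* Read a 16-bit pointer from zero page the way the NMOS CPU does:
 * the address of the high byte wraps around within page zero. */
uint16_t read_zp_pointer(const uint8_t *mem, uint8_t zp)
{
    uint8_t lo = mem[zp];
    uint8_t hi = mem[(uint8_t)(zp + 1)];    /* $FF + 1 wraps to $00 */
    return (uint16_t)(lo | (hi << 8));
}

/* Effective address of an (indir,x) access such as LDA ($FF,X).
 * The sum zp + x also stays inside page zero. */
uint16_t ea_indexed_indirect(const uint8_t *mem, uint8_t zp, uint8_t x)
{
    return read_zp_pointer(mem, (uint8_t)(zp + x));
}

/* Effective address of an (indir),y access such as LDA ($FF),Y.
 * Only the pointer fetch wraps; adding Y may cross a page. */
uint16_t ea_indirect_indexed(const uint8_t *mem, uint8_t zp, uint8_t y)
{
    return (uint16_t)(read_zp_pointer(mem, zp) + y);
}
```

+
+  With $34 stored at $FF and $12 at $00, read_zp_pointer(mem, 0xFF)
+returns $1234, whatever $0100 contains.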
+
+
+Different CPU types
+
+The Rockwell data booklet 29651N52 (technical information about R65C00
+microprocessors, dated October 1984), lists the following differences between
+NMOS R6502 microprocessor and CMOS R65C00 family:
+
+
+     1. Indexed addressing across page boundary.
+             NMOS: Extra read of invalid address.
+             CMOS: Extra read of last instruction byte.
+
+
+     2. Execution of invalid op codes.
+             NMOS: Some terminate only by reset. Results are undefined.
+             CMOS: All are NOPs (reserved for future use).
+
+
+     3. Jump indirect, operand = XXFF.
+             NMOS: Page address does not increment.
+             CMOS: Page address increments and adds one additional cycle.
+
+
+     4. Read/modify/write instructions at effective address.
+             NMOS: One read and two write cycles.
+             CMOS: Two read and one write cycle.
+
+
+     5. Decimal flag.
+             NMOS: Indeterminate after reset.
+             CMOS: Initialized to binary mode (D=0) after reset and interrupts.
+
+
+     6. Flags after decimal operation.
+             NMOS: Invalid N, V and Z flags.
+             CMOS: Valid flag adds one additional cycle.
+
+
+     7. Interrupt after fetch of BRK instruction.
+             NMOS: Interrupt vector is loaded, BRK vector is ignored.
+             CMOS: BRK is executed, then interrupt is executed.
+
+
+6510 Instruction Timing
+
+  The NMOS 6500 series processors always perform at least two reads
+for each instruction. In addition to the operation code (opcode), they
+fetch the next byte. This is quite efficient, as most instructions are
+two or three bytes long.
+
+  The processors also use a sort of pipelining. If an instruction does
+not store data in memory on its last cycle, the processor can fetch
+the opcode of the next instruction while executing the last cycle. For
+instance, the instruction EOR #$FF truly takes three cycles. On the
+first cycle, the opcode $49 will be fetched. During the second cycle
+the processor decodes the opcode and fetches the parameter #$FF. On
+the third cycle, the processor will perform the operation and store
+the result to accumulator, but simultaneously it fetches the opcode
+for the next instruction. This is why the instruction effectively
+takes only two cycles.
+
+  The following tables show what happens on the bus while executing
+different kinds of instructions.
+
+  Interrupts
+
+     NMI and IRQ both take 7 cycles. Their timing diagram is much like
+     BRK's (see below). IRQ will be executed only when the I flag is
+     clear. IRQ and BRK both set the I flag, whereas the NMI does not
+     affect its state.
+
+     The processor will usually wait for the current instruction to
+     complete before executing the interrupt sequence. To process the
+     interrupt before the next instruction, the interrupt must occur
+     before the last cycle of the current instruction.
+
+     There is one exception to this rule: the BRK instruction. If a
+     hardware interrupt (NMI or IRQ) occurs before the fourth (flags
+     saving) cycle of BRK, the BRK instruction will be skipped, and
+     the processor will jump to the hardware interrupt vector. This
+     sequence will always take 7 cycles.
+
+     You do not completely lose the BRK interrupt: the B flag will be
+     set in the pushed status register if a BRK instruction gets
+     interrupted. When BRK and IRQ occur at the same time, this does
+     not cause any problems, as your program will consider it as a
+     BRK, and the IRQ would occur again after the processor returned
+     from your BRK routine, unless you cleared the interrupt source in
+     your BRK handler. But the simultaneous occurrence of NMI and BRK
+     is far more fatal. If you do not check the B flag in the NMI
+     routine and subtract two from the return address when needed, the
+     BRK instruction will be skipped.
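+
+     The check described above is sketched here in C, as an emulator
+     or a monitor program might model it. The names are made up for
+     this example; pushed_p and pushed_pc stand for the status byte
+     and the return address that the interrupt sequence left on the
+     stack.
+

```c
#include <stdint.h>

#define FLAG_B 0x10   /* bit 4 of the pushed status byte */

/* Address at which an NMI routine should resume the main program.
 * If the B flag is set in the pushed status, the NMI interrupted a
 * BRK, and the pushed address points two bytes past that BRK. */
uint16_t nmi_resume_address(uint8_t pushed_p, uint16_t pushed_pc)
{
    if (pushed_p & FLAG_B)
        return (uint16_t)(pushed_pc - 2);   /* go back to the BRK */
    return pushed_pc;
}
```

+
+     A real NMI routine would patch the two address bytes on the
+     stack in the same way before executing RTI.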
+
+     If the NMI and IRQ interrupts overlap each other (one interrupt
+     occurs before fetching the interrupt vector for the other
+     interrupt), the processor will most probably jump to the NMI
+     vector in every case, and then jump to the IRQ vector after
+     processing the first instruction of the NMI handler. This has not
+     been measured yet, but the IRQ is very similar to BRK, and many
+     sources state that the NMI has higher priority than IRQ. However,
+     it might be that the processor takes the interrupt that comes
+     later, i.e. you could lose an NMI interrupt if an IRQ occurred
+     within four cycles after it.
+
+     After finishing the interrupt sequence, the processor will start
+     to execute the first instruction of the interrupt routine. This
+     proves that the processor uses a sort of pipelining: it finishes
+     the current instruction (or interrupt sequence) while reading the
+     opcode of the next instruction.
+
+     RESET does not push program counter on stack, and it lasts
+     probably 6 cycles after deactivating the signal. Like NMI, RESET
+     preserves all registers except PC.
+
+  Instructions accessing the stack
+
+     BRK
+
+        #  address R/W description
+       --- ------- --- -----------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  read next instruction byte (and throw it away),
+                       increment PC
+        3  $0100,S  W  push PCH on stack, decrement S
+        4  $0100,S  W  push PCL on stack, decrement S
+        5  $0100,S  W  push P on stack (with B flag set), decrement S
+        6   $FFFE   R  fetch PCL
+        7   $FFFF   R  fetch PCH
+
+     RTI
+
+        #  address R/W description
+       --- ------- --- -----------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  read next instruction byte (and throw it away)
+        3  $0100,S  R  increment S
+        4  $0100,S  R  pull P from stack, increment S
+        5  $0100,S  R  pull PCL from stack, increment S
+        6  $0100,S  R  pull PCH from stack
+
+     RTS
+
+        #  address R/W description
+       --- ------- --- -----------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  read next instruction byte (and throw it away)
+        3  $0100,S  R  increment S
+        4  $0100,S  R  pull PCL from stack, increment S
+        5  $0100,S  R  pull PCH from stack
+        6    PC     R  increment PC
+
+     PHA, PHP
+
+        #  address R/W description
+       --- ------- --- -----------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  read next instruction byte (and throw it away)
+        3  $0100,S  W  push register on stack, decrement S
+
+     PLA, PLP
+
+        #  address R/W description
+       --- ------- --- -----------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  read next instruction byte (and throw it away)
+        3  $0100,S  R  increment S
+        4  $0100,S  R  pull register from stack
+
+     JSR
+
+        #  address R/W description
+       --- ------- --- -------------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch low address byte, increment PC
+        3  $0100,S  R  internal operation (predecrement S?)
+        4  $0100,S  W  push PCH on stack, decrement S
+        5  $0100,S  W  push PCL on stack, decrement S
+        6    PC     R  copy low address byte to PCL, fetch high address
+                       byte to PCH
+
+  Accumulator or implied addressing
+
+        #  address R/W description
+       --- ------- --- -----------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  read next instruction byte (and throw it away)
+
+  Immediate addressing
+
+        #  address R/W description
+       --- ------- --- ------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch value, increment PC
+
+  Absolute addressing
+
+     JMP
+
+        #  address R/W description
+       --- ------- --- -------------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch low address byte, increment PC
+        3    PC     R  copy low address byte to PCL, fetch high address
+                       byte to PCH
+
+     Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
+                        LAX, NOP)
+
+        #  address R/W description
+       --- ------- --- ------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch low byte of address, increment PC
+        3    PC     R  fetch high byte of address, increment PC
+        4  address  R  read from effective address
+
+     Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
+                                     SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #  address R/W description
+       --- ------- --- ------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch low byte of address, increment PC
+        3    PC     R  fetch high byte of address, increment PC
+        4  address  R  read from effective address
+        5  address  W  write the value back to effective address,
+                       and do the operation on it
+        6  address  W  write the new value to effective address
+
+     Write instructions (STA, STX, STY, SAX)
+
+        #  address R/W description
+       --- ------- --- ------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch low byte of address, increment PC
+        3    PC     R  fetch high byte of address, increment PC
+        4  address  W  write register to effective address
+
+  Zero page addressing
+
+     Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
+                        LAX, NOP)
+
+        #  address R/W description
+       --- ------- --- ------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch address, increment PC
+        3  address  R  read from effective address
+
+     Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
+                                     SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #  address R/W description
+       --- ------- --- ------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch address, increment PC
+        3  address  R  read from effective address
+        4  address  W  write the value back to effective address,
+                       and do the operation on it
+        5  address  W  write the new value to effective address
+
+     Write instructions (STA, STX, STY, SAX)
+
+        #  address R/W description
+       --- ------- --- ------------------------------------------
+        1    PC     R  fetch opcode, increment PC
+        2    PC     R  fetch address, increment PC
+        3  address  W  write register to effective address
+
+  Zero page indexed addressing
+
+     Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
+                        LAX, NOP)
+
+        #   address  R/W description
+       --- --------- --- ------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch address, increment PC
+        3   address   R  read from address, add index register to it
+        4  address+I* R  read from effective address
+
+       Notes: I denotes either index register (X or Y).
+
+              * The high byte of the effective address is always zero,
+                i.e. page boundary crossings are not handled.
+
+     Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
+                                     SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #   address  R/W description
+       --- --------- --- ---------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch address, increment PC
+        3   address   R  read from address, add index register X to it
+        4  address+X* R  read from effective address
+        5  address+X* W  write the value back to effective address,
+                         and do the operation on it
+        6  address+X* W  write the new value to effective address
+
+       Note: * The high byte of the effective address is always zero,
+               i.e. page boundary crossings are not handled.
+
+     Write instructions (STA, STX, STY, SAX)
+
+        #   address  R/W description
+       --- --------- --- -------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch address, increment PC
+        3   address   R  read from address, add index register to it
+        4  address+I* W  write to effective address
+
+       Notes: I denotes either index register (X or Y).
+
+              * The high byte of the effective address is always zero,
+                i.e. page boundary crossings are not handled.
+
+  Absolute indexed addressing
+
+     Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,
+                        LAX, LAE, SHS, NOP)
+
+        #   address  R/W description
+       --- --------- --- ------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch low byte of address, increment PC
+        3     PC      R  fetch high byte of address,
+                         add index register to low address byte,
+                         increment PC
+        4  address+I* R  read from effective address,
+                         fix the high byte of effective address
+        5+ address+I  R  re-read from effective address
+
+       Notes: I denotes either index register (X or Y).
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+              + This cycle will be executed only if the effective address
+                was invalid during cycle #4, i.e. page boundary was crossed.
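+
+       The addresses and the cycle count in the table above can be
+       computed with a few lines of C. This is a sketch only; the
+       struct and function names are hypothetical.
+

```c
#include <stdint.h>

struct indexed_read {
    uint16_t first;   /* address read on cycle 4 (high byte not fixed) */
    uint16_t fixed;   /* correct effective address */
    int cycles;       /* 4, or 5 when a page boundary is crossed */
};

/* Bus behaviour of a read instruction with absolute indexed addressing. */
struct indexed_read abs_indexed_read(uint16_t base, uint8_t index)
{
    struct indexed_read r;
    r.fixed = (uint16_t)(base + index);
    /* On cycle 4 only the low byte has been added to the index. */
    r.first = (uint16_t)((base & 0xFF00) | (r.fixed & 0x00FF));
    r.cycles = (r.first == r.fixed) ? 4 : 5;
    return r;
}
```

+
+       For instance, abs_indexed_read(0xDCFD, 0x10) reports a first
+       read from $DC0D, a second read from $DD0D and 5 cycles.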
+
+     Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,
+                                     SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #   address  R/W description
+       --- --------- --- ------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch low byte of address, increment PC
+        3     PC      R  fetch high byte of address,
+                         add index register X to low address byte,
+                         increment PC
+        4  address+X* R  read from effective address,
+                         fix the high byte of effective address
+        5  address+X  R  re-read from effective address
+        6  address+X  W  write the value back to effective address,
+                         and do the operation on it
+        7  address+X  W  write the new value to effective address
+
+       Notes: * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+     Write instructions (STA, STX, STY, SHA, SHX, SHY)
+
+        #   address  R/W description
+       --- --------- --- ------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch low byte of address, increment PC
+        3     PC      R  fetch high byte of address,
+                         add index register to low address byte,
+                         increment PC
+        4  address+I* R  read from effective address,
+                         fix the high byte of effective address
+        5  address+I  W  write to effective address
+
+       Notes: I denotes either index register (X or Y).
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100. Because
+                the processor cannot undo a write to an invalid
+                address, it always reads from the address first.
+
+  Relative addressing (BCC, BCS, BNE, BEQ, BPL, BMI, BVC, BVS)
+
+        #   address  R/W description
+       --- --------- --- ---------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch operand, increment PC
+        3     PC      R  Fetch opcode of next instruction,
+                         If branch is taken, add operand to PCL.
+                         Otherwise increment PC.
+        4+    PC*     R  Fetch opcode of next instruction.
+                         Fix PCH. If it did not change, increment PC.
+        5!    PC      R  Fetch opcode of next instruction,
+                         increment PC.
+
+       Notes: The opcode fetch of the next instruction is included in
+              this diagram for illustration purposes. When determining
+              real execution times, remember to subtract the last
+              cycle.
+
+              * The high byte of Program Counter (PCH) may be invalid
+                at this time, i.e. it may be smaller or bigger by $100.
+
+              + If branch is taken, this cycle will be executed.
+
+              ! If branch occurs to different page, this cycle will be
+                executed.
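+
+       The branch timing above can be sketched in C like this. The
+       names are made up for the example, and the cycle count leaves
+       out the opcode fetch of the following instruction, as the
+       first note advises.
+

```c
#include <stdint.h>

struct branch_timing {
    uint16_t target;   /* PC after the branch instruction */
    uint16_t unfixed;  /* address fetched before PCH has been fixed */
    int cycles;        /* 2 not taken, 3 taken, 4 taken to another page */
};

/* pc is the address of the branch opcode, operand its signed offset. */
struct branch_timing branch(uint16_t pc, uint8_t operand, int taken)
{
    struct branch_timing r;
    uint16_t next = (uint16_t)(pc + 2);
    int offset = (operand & 0x80) ? (int)operand - 256 : (int)operand;

    if (!taken) {
        r.target = next;
        r.unfixed = next;      /* not used when the branch is not taken */
        r.cycles = 2;
        return r;
    }
    r.target = (uint16_t)(next + offset);
    /* At first only PCL is updated, so PCH may be off by one page. */
    r.unfixed = (uint16_t)((next & 0xFF00) | (r.target & 0x00FF));
    r.cycles = (r.unfixed == r.target) ? 3 : 4;
    return r;
}
```

+
+       For a taken branch at $DD0A with operand $91, the sketch gives
+       an unfixed fetch from $DD9D, the target $DC9D and 4 cycles.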
+
+  Indexed indirect addressing
+
+     Read instructions (LDA, ORA, EOR, AND, ADC, CMP, SBC, LAX)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  read from the address, add X to it
+        4   pointer+X   R  fetch effective address low
+        5  pointer+X+1  R  fetch effective address high
+        6    address    R  read from effective address
+
+       Note: The effective address is always fetched from zero page,
+             i.e. the zero page boundary crossing is not handled.
+
+     Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  read from the address, add X to it
+        4   pointer+X   R  fetch effective address low
+        5  pointer+X+1  R  fetch effective address high
+        6    address    R  read from effective address
+        7    address    W  write the value back to effective address,
+                           and do the operation on it
+        8    address    W  write the new value to effective address
+
+       Note: The effective address is always fetched from zero page,
+             i.e. the zero page boundary crossing is not handled.
+
+     Write instructions (STA, SAX)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  read from the address, add X to it
+        4   pointer+X   R  fetch effective address low
+        5  pointer+X+1  R  fetch effective address high
+        6    address    W  write to effective address
+
+       Note: The effective address is always fetched from zero page,
+             i.e. the zero page boundary crossing is not handled.
+
+  Indirect indexed addressing
+
+     Read instructions (LDA, EOR, AND, ORA, ADC, SBC, CMP)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  fetch effective address low
+        4   pointer+1   R  fetch effective address high,
+                           add Y to low byte of effective address
+        5   address+Y*  R  read from effective address,
+                           fix high byte of effective address
+        6+  address+Y   R  read from effective address
+
+       Notes: The effective address is always fetched from zero page,
+              i.e. the zero page boundary crossing is not handled.
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+              + This cycle will be executed only if the effective address
+                was invalid during cycle #5, i.e. page boundary was crossed.
+
+     Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  fetch effective address low
+        4   pointer+1   R  fetch effective address high,
+                           add Y to low byte of effective address
+        5   address+Y*  R  read from effective address,
+                           fix high byte of effective address
+        6   address+Y   R  read from effective address
+        7   address+Y   W  write the value back to effective address,
+                           and do the operation on it
+        8   address+Y   W  write the new value to effective address
+
+       Notes: The effective address is always fetched from zero page,
+              i.e. the zero page boundary crossing is not handled.
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+     Write instructions (STA, SHA)
+
+        #    address   R/W description
+       --- ----------- --- ------------------------------------------
+        1      PC       R  fetch opcode, increment PC
+        2      PC       R  fetch pointer address, increment PC
+        3    pointer    R  fetch effective address low
+        4   pointer+1   R  fetch effective address high,
+                           add Y to low byte of effective address
+        5   address+Y*  R  read from effective address,
+                           fix high byte of effective address
+        6   address+Y   W  write to effective address
+
+       Notes: The effective address is always fetched from zero page,
+              i.e. the zero page boundary crossing is not handled.
+
+              * The high byte of the effective address may be invalid
+                at this time, i.e. it may be smaller by $100.
+
+  Absolute indirect addressing (JMP)
+
+        #   address  R/W description
+       --- --------- --- ------------------------------------------
+        1     PC      R  fetch opcode, increment PC
+        2     PC      R  fetch pointer address low, increment PC
+        3     PC      R  fetch pointer address high, increment PC
+        4   pointer   R  fetch low address to latch
+        5  pointer+1* R  fetch PCH, copy latch to PCL
+
+       Note: * The PCH will always be fetched from the same page
+               as PCL, i.e. page boundary crossing is not handled.
+
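+       The pointer fetch of the indirect JMP can be sketched in C as
+       follows. The function name is hypothetical, and mem is assumed
+       to point to the full 64 KB address space.
+

```c
#include <stdint.h>

/* Target of JMP (pointer) on an NMOS CPU. Only the low byte of the
 * pointer is incremented, so the high byte of the target is fetched
 * from the same page as the low byte. */
uint16_t jmp_indirect_target(const uint8_t *mem, uint16_t pointer)
{
    uint16_t hi_addr =
        (uint16_t)((pointer & 0xFF00) | ((pointer + 1) & 0x00FF));
    return (uint16_t)(mem[pointer] | (mem[hi_addr] << 8));
}
```

+
+       With $34 at $01FF and $12 at $0100, jmp_indirect_target(mem,
+       0x01FF) returns $1234; the byte at $0200 is never read.
+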
+
+How Real Programmers Acknowledge Interrupts
+
+  With RMW instructions:
+
+        ; beginning of combined raster/timer interrupt routine
+        LSR $D019       ; clear VIC interrupts, read raster interrupt flag to C
+        BCS raster      ; jump if VIC caused an interrupt
+        ...             ; timer interrupt routine
+
+        Operational diagram of LSR $D019:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   4E     PC      R   fetch opcode
+          2   19    PC+1     R   fetch address low
+          3   D0    PC+2     R   fetch address high
+          4   xx    $D019    R   read memory
+          5   xx    $D019    W   write the value back, shift right
+          6  xx/2   $D019    W   write the new value back
+
+        The 5th cycle acknowledges the interrupt by writing the same
+        value back. If only raster interrupts are used, the 6th cycle
+        has no effect on the VIC. (It might acknowledge also some
+        other interrupts.)
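+
+        The effect of the two write cycles can be tried out with a toy
+        model in C. It assumes, as the text above implies, that writing
+        a 1 bit to the register clears the corresponding interrupt
+        flag. The struct and function names are made up for this
+        sketch, which ignores all other properties of the VIC.
+

```c
#include <stdint.h>

/* Toy interrupt latch: a set bit means a pending interrupt. */
struct latch {
    uint8_t flags;
};

/* Assumed register behaviour: writing a 1 bit clears that flag. */
static void latch_write(struct latch *l, uint8_t value)
{
    l->flags &= (uint8_t)~value;
}

/* LSR on the latch register. Returns the new carry (old bit 0). */
int lsr_latch(struct latch *l)
{
    uint8_t old = l->flags;                /* cycle 4: read */
    latch_write(l, old);                   /* cycle 5: unmodified value */
    latch_write(l, (uint8_t)(old >> 1));   /* cycle 6: shifted value */
    return old & 1;
}
```

+
+        In this model a pending flag in bit 0 ends up in the carry, and
+        every flag that was set during the read is cleared by cycle 5.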
+
+  With indexed addressing:
+
+        ; acknowledge interrupts to both CIAs
+        LDX #$10
+        LDA $DCFD,X
+
+        Operational diagram of LDA $DCFD,X:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   BD     PC      R   fetch opcode
+          2   FD    PC+1     R   fetch address low
+          3   DC    PC+2     R   fetch address high, add X to address low
+          4   xx    $DC0D    R   read from address, fix high byte of address
+          5   yy    $DD0D    R   read from right address
+
+        ; acknowledge interrupts to CIA 2
+        LDX #$10
+        STA $DDFD,X
+
+        Operational diagram of STA $DDFD,X:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   9D     PC      R   fetch opcode
+          2   FD    PC+1     R   fetch address low
+          3   DD    PC+2     R   fetch address high, add X to address low
+          4   xx    $DD0D    R   read from address, fix high byte of address
+          5   ac    $DE0D    W   write to right address
+
+  With branch instructions:
+
+        ; acknowledge interrupts to CIA 2
+                LDA #$00  ; clear N flag
+                JMP $DD0A
+        DD0A    BPL $DC9D ; branch
+        DC9D    BRK       ; return
+
+        You need the following preparations to initialize the CIA registers:
+
+                LDA #$91  ; argument of BPL
+                STA $DD0B
+                LDA #$10  ; BPL
+                STA $DD0A
+                STA $DD08 ; load the ToD values from the latches
+                LDA $DD0B ; freeze the ToD display
+                LDA #$7F
+                STA $DC0D ; assure that $DC0D is $00
+
+        Operational diagram of BPL $DC9D:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   10    $DD0A    R   fetch opcode
+          2   91    $DD0B    R   fetch argument
+          3   xx    $DD0C    R   fetch opcode, add argument to PCL
+          4   yy    $DD9D    R   fetch opcode, fix PCH
+        ( 5   00    $DC9D    R   fetch opcode )
+
+        ; acknowledge interrupts to CIA 1
+                LSR       ; clear N flag
+                JMP $DCFA
+        DCFA    BPL $DD0D
+        DD0D    BRK
+
+        ; Again you need to set the ToD registers of CIA 1 and the
+        ; Interrupt Control Register of CIA 2 first.
+
+        Operational diagram of BPL $DD0D:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   10    $DCFA    R   fetch opcode
+          2   11    $DCFB    R   fetch argument
+          3   xx    $DCFC    R   fetch opcode, add argument to PCL
+          4   yy    $DC0D    R   fetch opcode, fix PCH
+        ( 5   00    $DD0D    R   fetch opcode )
+
+        ; acknowledge interrupts to CIA 2 automagically
+                ; preparations
+                LDA #$7F
+                STA $DD0D       ; disable all interrupt sources of CIA2
+                LDA $DD0E
+                AND #$BE        ; ensure that $DD0C remains constant
+                STA $DD0E       ; and stop the timer
+                LDA #$FD
+                STA $DD0C       ; parameter of BPL
+                LDA #$10
+                STA $DD0B       ; BPL
+                LDA #$40
+                STA $DD0A       ; RTI/parameter of LSR
+                LDA #$46
+                STA $DD09       ; LSR
+                STA $DD08       ; load the ToD values from the latches
+                LDA $DD0B       ; freeze the ToD display
+                LDA #$09
+                STA $0318
+                LDA #$DD
+                STA $0319       ; change NMI vector to $DD09
+                LDA #$FF        ; Try changing this instruction's operand
+                STA $DD05       ; (see comment below).
+                LDA #$FF
+                STA $DD04       ; one interrupt every 65536 cycles
+                LDA $DD0E
+                AND #$80
+                ORA #$11
+                LDX #$81
+                STX $DD0D       ; enable timer interrupt
+                STA $DD0E       ; start timer
+
+                LDA #$00        ; To see that the interrupts really occur,
+                STA $D011       ; use something like this and see how
+        LOOP    DEC $D020       ; changing the byte loaded to $DD05 from
+                BNE LOOP        ; #$FF to #$0F changes the image.
+
+        When an NMI occurs, the processor jumps to Kernal code, which jumps to
+        ($0318), which points to the following routine:
+
+        DD09    LSR $40         ; clear N flag
+                BPL $DD0A       ; Note: $DD0A contains RTI.
+
+        Operational diagram of BPL $DD0A:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   10    $DD0B    R   fetch opcode
+          2   FD    $DD0C    R   fetch argument
+          3   xx    $DD0D    R   fetch opcode, add argument to PCL
+          4   40    $DD0A    R   fetch opcode, (fix PCH)
+
+  With RTI:
+
+        ; the fastest possible interrupt handler in the 6500 family
+                ; preparations
+                SEI
+                LDA $01         ; disable ROM and enable I/O
+                AND #$FD
+                ORA #$05
+                STA $01
+                LDA #$7F
+                STA $DD0D       ; disable all interrupt sources of CIA 2
+                LDA $DD0E
+                AND #$BE        ; ensure that $DD0C remains constant
+                STA $DD0E       ; and stop the timer
+                LDA #$40
+                STA $DD0C       ; store RTI to $DD0C
+                LDA #$0C
+                STA $FFFA
+                LDA #$DD
+                STA $FFFB       ; change NMI vector to $DD0C
+                LDA #$FF        ; Try changing this instruction's operand
+                STA $DD05       ; (see comment below).
+                LDA #$FF
+                STA $DD04       ; one interrupt every 65536 cycles
+                LDA $DD0E
+                AND #$80
+                ORA #$11
+                LDX #$81
+                STX $DD0D       ; enable timer interrupt
+                STA $DD0E       ; start timer
+
+                LDA #$00        ; To see that the interrupts really occur,
+                STA $D011       ; use something like this and see how
+        LOOP    DEC $D020       ; changing the byte loaded to $DD05 from
+                BNE LOOP        ; #$FF to #$0F changes the image.
+
+        When an NMI occurs, the processor jumps through the vector at
+        $FFFA (read from RAM, since the Kernal ROM is switched out),
+        which points directly to the following routine:
+
+        DD0C    RTI
+
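+        How on earth can this clear the interrupts? Remember, the
+        processor always fetches two successive bytes for each
+        instruction. The byte after the RTI at $DD0C is $DD0D, so
+        the fetch itself reads the Interrupt Control Register.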
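+
+        As a sketch of why this works, the RTI cycles from the
+        instruction timing section can be written out for an RTI at
+        $DD0C. The stack contents are whatever the NMI pushed:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   40    $DD0C    R   fetch opcode
+          2   xx    $DD0D    R   read next byte (and throw it away)
+          3   yy   $0100,S   R   increment S
+          4   P    $0100,S   R   pull P from stack, increment S
+          5   PCL  $0100,S   R   pull PCL from stack, increment S
+          6   PCH  $0100,S   R   pull PCH from stack
+
+        The read in cycle 2 is the one that acknowledges the interrupt.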
+
+        A slightly more practical version redirects the NMI (or IRQ)
+        to your own routine, whose last instruction is JMP $DD0C or
+        JMP $DC0C.  To confuse readers further, replace the 0 in the
+        address with a hexadecimal digit different from the one you
+        used when writing the RTI; the CIA registers repeat every 16
+        bytes, so both addresses select the same register.
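+
+        A minimal sketch of such a routine, assuming that the RTI has
+        already been stored to $DD0C as in the preparations above and
+        that the NMI vector points to the routine. The label MYNMI and
+        the border colour change are only illustrative:
+
+        MYNMI   INC $D020       ; do some work (changes N and Z only)
+                JMP $DD0C       ; RTI; its second cycle reads $DD0D
+
+        INC leaves A, X and Y alone, and the flags it changes are
+        restored by the RTI, so this particular routine needs no
+        register saving of its own.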
+
+        Or you can combine the latter two methods:
+
+        DD09    LSR $xx  ; xx is any appropriate BCD value 00-59.
+                BPL $DCFC
+        DCFC    RTI
+
+        This example acknowledges interrupts to both CIAs.
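+
+        Worked out cycle by cycle in the style of the diagrams above.
+        This is a sketch: the argument $EF is derived here from the
+        addresses ($DCFC - $DD0D = -$11), and $DCFC and $DCFD are
+        taken to be mirrors of $DC0C and $DC0D.
+
+        Operational diagram of BPL $DCFC and the RTI that follows:
+
+          #  data  address  R/W  description
+         --- ----  -------  ---  ---------------------------------
+          1   10    $DD0B    R   fetch opcode
+          2   EF    $DD0C    R   fetch argument
+          3   xx    $DD0D    R   fetch opcode, add argument to PCL
+          4   yy    $DDFC    R   fetch opcode, fix PCH
+          5   40    $DCFC    R   fetch opcode (RTI)
+          6   zz    $DCFD    R   read next byte (and throw it away)
+
+        Cycle 3 acknowledges CIA 2 and cycle 6 acknowledges CIA 1.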
+
+  If you want to confuse the examiners of your code, you can use any
+of these techniques. Although these examples use no undefined opcodes,
+they do not necessarily run correctly on CMOS processors. However, the
+RTI example should run on 65C02 and 65C816, and the latter branch
+instruction example might work as well.
+
+  The RMW instruction method has been used in some demos; the others
+were developed by Marko Mäkelä. His favourite is the automagical RTI
+method, although it has no practical applications apart from some
+time-dependent data decryption routines for very complicated copy
+protections.
+
+
diff --git a/M6502/documentation/mos_6500_mpu_nov_1985.pdf b/M6502/documentation/mos_6500_mpu_nov_1985.pdf
new file mode 100644
index 0000000..37ab6cb
Binary files /dev/null and b/M6502/documentation/mos_6500_mpu_nov_1985.pdf differ