The Second Step

This essay discusses how to do 16-or-more bit addition and subtraction on the 6502, and how to do unsigned comparisons properly, thus making 16-bit arithmetic less necessary.
The problem

The ADC, SBC, INX, and INY instructions (along with the decrements DEX and DEY, and INC and DEC on memory) are the only real arithmetic instructions the 6502 chip has. In and of themselves, they aren't too useful for general applications: the accumulator can only hold 8 bits, and thus can't store any value over 255. Matters get even worse when we're branching based on values; BMI and BPL hinge on bit 7 (the sign bit) of the result, so for the purposes of those branches we can't represent any value above 127.
The solution

We have two solutions available to us. First, we can use the unsigned discipline, which involves checking different flags, but lets us deal with values between 0 and 255 instead of -128 to 127. Second, we can trade speed and register persistence for multiple precision arithmetic, using 16-bit integers (-32768 to 32767, or 0 to 65535), 24-bit, or more.

Multiplication, division, and floating point arithmetic are beyond the scope of this essay. The best way to deal with those is to find a math library on the web (I recommend ) and use the routines there.
Unsigned arithmetic

When writing control code that hinges on numbers, we should always strive to have our comparison be with zero; that way, no explicit compare is necessary, and we can branch simply with BEQ/BNE, which test the zero flag. Otherwise, we use CMP. The CMP command subtracts its argument from the accumulator (without borrow), updates the flags, but throws away the result. If the two values are equal, the result is zero. (CMP followed by BEQ branches if the argument is equal to the accumulator; this is probably why it's called BEQ and not something like BZS.)

Intuitively, then, to check if the accumulator is less than some value, we CMP against that value and BMI. The BMI command branches based on the Negative Flag, which is equal to bit 7 of the result of CMP's subtraction. That's exactly what we need, for signed arithmetic. However, this produces problems if you're writing a boundary detector on your screen or something and find that 192 < 4. 192 is outside of a signed byte's range, and is interpreted as if it were -64. This will not do for most graphics applications, where your values will be ranging from 0-319 or 0-199 or 0-255.

Instead, we take advantage of the implied subtraction that CMP does. When subtracting, the result's carry bit starts at 1, and gets borrowed from if necessary. Let us consider some four-bit subtractions.

    C|3210          C|3210
    ------          ------
    1|1001     9    1|1001     9
     |0100   - 4     |1100   -12
    ------   ---    ------   ---
    1|0101     5    0|1101    -3

The CMP command properly modifies the carry bit to reflect this. When computing A-B, the carry bit is set if A >= B, and it's clear if A < B. Consider the following two code sequences.

    (1)             (2)
    CMP #$C0        CMP #$C0
    BMI label       BCC label

The code in the first column treats the value in the accumulator as a signed value, and branches if the value is less than -64. (Because of overflow issues, it will actually branch for accumulator values between $40 and $BF, even though it *should* only be doing it for values between $80 and $BF. To see why, compare $40 to $C0 and look at the result: $40 - $C0 wraps around to $80, which has bit 7 set.)

The second column code treats the accumulator as holding an unsigned value, and branches if the value is less than 192. It will branch for accumulator values $00-$BF.
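To make the unsigned discipline concrete, here is a minimal sketch of the kind of boundary detector mentioned earlier. The names xpos, too_low, too_high, and in_range, and the bounds 4 and 192, are made up for illustration; they do not come from any particular program.

```asm
        LDA xpos        ; xpos is a hypothetical one-byte variable
        CMP #$04        ; computes A - 4; carry is clear if A < 4
        BCC too_low     ; unsigned: taken only for $00-$03
        CMP #$C0        ; computes A - 192; carry is set if A >= 192
        BCS too_high    ; unsigned: taken only for $C0-$FF
in_range:               ; reached only when 4 <= A <= 191
```

CMP leaves the accumulator untouched, so the second compare can reuse the loaded value. BCC and BCS cover "less than" and "greater than or equal" directly; for a strict "greater than" against a constant n below 255, comparing against n+1 and branching with BCS does the job without a second branch.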
16-bit addition and subtraction

Time to use the carry bit for what it was meant to do. Adding two 8-bit numbers can produce a 9-bit result. That 9th bit is stored in the carry flag. The ADC command adds the carry value to its result, as well. Thus, carries work just as we'd expect them to. Suppose we're storing two 16-bit values, low byte first, in $C100-1 and $C102-3. To add them together and store the sum in $C104-5, this is very easy:

    CLC
    LDA $C100
    ADC $C102
    STA $C104
    LDA $C101
    ADC $C103
    STA $C105

Subtraction is identical, but you set the carry bit first with SEC (because borrow is the complement of carry; think about how the unsigned compare works if this puzzles you) and, of course, use the SBC instruction instead of ADC.

The carry/borrow bit is set appropriately to let you continue, too. As long as you just keep working your way up to bytes of ever-higher significance, this generalizes to 24-bit (do it three times instead of two), 32-bit (four times), or wider integers.
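The subtraction described above might look like the following sketch, which computes the value in $C100-1 minus the value in $C102-3 and stores the difference in $C104-5. It reuses the addresses from the addition example and assumes the processor is not in decimal mode, which this essay does not otherwise cover.

```asm
        SEC             ; carry set means "no borrow" going in
        LDA $C100       ; low byte of the first value
        SBC $C102       ; subtract low byte of the second value
        STA $C104       ; store low byte of the difference
        LDA $C101       ; high byte of the first value
        SBC $C103       ; subtract high byte, plus any borrow from the low bytes
        STA $C105       ; store high byte of the difference
```

After the final SBC, the carry flag is set if no borrow was needed (the first value was at least as large as the second, treating both as unsigned) and clear otherwise. Extending this to 24 bits would mean one more LDA/SBC/STA triple on the next-higher bytes, with no SEC in between.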
16-bit comparisons

Doing comparisons on extended precision values is about the same as doing them on 8-bit values, but you have to have the value you test in memory, since it won't fit in the accumulator all at once. You don't have to store the results back anywhere, either, since all you care about is the final state of the flags. For example, here's a signed comparison, branching to label if the value in $C100-1 is less than 1000 ($03E8):

    SEC
    LDA $C100
    SBC #$E8
    LDA $C101   ; We only need the carry bit from that subtract
    SBC #$03
    BMI label

All the commentary on signed and unsigned compares holds for 16-bit (or higher) integers just as it does for the 8-bit ones. In particular, the BMI here is subject to the same overflow caveat as the 8-bit signed compare, and the unsigned test branches on the carry flag instead.
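For contrast, here is a sketch of the unsigned version of the same test, branching to label if the value in $C100-1 is below 1000 when read as a number from 0 to 65535. Using CMP for the low byte in place of SEC and SBC is a substitution made for this sketch, not part of the routine above; it works because CMP sets the carry exactly as SEC followed by SBC would.

```asm
        LDA $C100       ; low byte of the value
        CMP #$E8        ; low byte of 1000; carry clear means a borrow is pending
        LDA $C101       ; high byte; LDA does not touch the carry flag
        SBC #$03        ; high byte of 1000, minus any borrow from the low bytes
        BCC label       ; carry clear: value < 1000 as an unsigned number
```

Swapping BCC for BCS branches when the value is 1000 or more instead.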