A static BG1 is stable with BG0 offset values. A seam in BG1
needs to be closed up by taking into account the BG1XOrigin value
when setting the :shift_value.
Also, several routines were hard-coded for the scanline case. These
changes need to be reverted and properly parameterized.
Find small optimizations to improve the average performance of the
blitter, especially in the odd-aligned case.
- Odd-aligned PEA exit is 2 cycles faster per line
- Odd-aligned JMP exit is 2 cycles faster per line
- Odd-aligned LDA exit is 6 cycles faster (eliminated long store)
- Merged setting the entry opcode and offset, converting two 8-bit
stores into a single 16-bit store (saves 6 cycles per line)
- Load and save the full word for the high bytes. This costs 2 cycles
but enables the 6-cycle saving in the LDA case.
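
The merged opcode/offset store can be illustrated with a small C model.
This is only a sketch, not the blitter's actual code. It assumes the entry
opcode byte is immediately followed by its 8-bit offset in the code field.
The function names and sample values are hypothetical.

```c
#include <stdint.h>

/* Assumed layout (hypothetical): code[0] holds the entry opcode and
   code[1] holds its 8-bit offset operand. */

/* Before: the opcode and the offset are written with two 8-bit stores. */
static void patch_entry_bytes(uint8_t *code, uint8_t opcode, uint8_t offset)
{
    code[0] = opcode;
    code[1] = offset;
}

/* After: both bytes are combined into one word ahead of time. The 65816
   is little-endian, so the opcode goes in the low byte and the offset
   in the high byte. */
static uint16_t pack_entry_word(uint8_t opcode, uint8_t offset)
{
    return (uint16_t)(opcode | ((uint16_t)offset << 8));
}

/* Models a single 16-bit store: the low byte lands at the lower address,
   the high byte at the next address. */
static void patch_entry_word(uint8_t *code, uint16_t word)
{
    code[0] = (uint8_t)(word & 0xFF);
    code[1] = (uint8_t)(word >> 8);
}
```

Both paths leave the same two bytes in memory. The saving comes from
issuing one store instruction instead of two.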