**************************************************************
**************************************************************
*                                                            *
*  ATARI 2600 ADVANCED PROGRAMMING GUIDE                     *
*                                                            *
*  Updated 05-21-04                                          *
*  Compiled and edited by Paul Slocum                        *
*  Written by the Atari 2600 programming community           *
*                                                            *
*  version 1.0 / 05-19-04 / first release                    *
*  version 1.1 / 05-21-04 / added multi-sprite trick         *
*                                                            *
**************************************************************
**************************************************************
This guide is intended to be a supplement to the standard Stella Programmer's
Guide. Some of the sections assume a working knowledge of 6502 assembly and
Atari 2600 registers.
If you would like to write new sections or update existing sections of this
document, contact me: paul at treewave dot com
**************************************************************
**************************************************************
*                                                            *
*  TABLE OF CONTENTS                                         *
*                                                            *
**************************************************************
**************************************************************
- Bankswitching
- BRK Subroutine Trick
- Checking the Number of Scanlines
- Constant Cycle Count to Avoid WSYNC
- Counting Down when Looping
- Illegal Opcodes
- Insurance Against Too Many VBlank/Overscan Cycles
- Multi-Sprite Trick
- Paddles
- Showing Missiles/Ball using PHP
- Skipdraw
- Sound and Music
- Using BRK with RESXX
- Wasting Cycles
- HMOVE Timing Chart
==============================================================
==============================================================
Bankswitching
==============================================================
==============================================================
For a bankswitching reference you'll want Kevin Horton's sizes.txt reference:
http://www.tripoint.org/kevtris/files/sizes.txt
One thing you'll probably want to know for any kind of bankswitching is the RORG
assembler directive. RORG is like ORG except it only affects the way label
addresses are handled, not where the code is placed in the ROM. So let's say
you're working on an 8K ROM. The first bank will start with ORG $1000 and all
the code and data for the first bank will follow. At the start of the second 4K
bank, you'll want:
    ORG $2000
    RORG $1000
If you don't use RORG, all your label addresses in the second bank will be in
the $2000-$2FFF range, which won't work because the console always sees the
active bank at $1000-$1FFF. With RORG, they will continue to address the
$1000-$1FFF range.
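As a rough sketch of how the pieces fit together, an 8K source file might be
laid out like this. The labels Bank1Code and Bank2Code are placeholders and not
part of any real program.
    processor 6502
    ORG $1000 ; bank 1: first 4K of the ROM image
Bank1Code
    ; [bank 1 code and data...]
    ORG $2000 ; bank 2: second 4K of the ROM image
    RORG $1000 ; labels from here on resolve to $1000-$1FFF again
Bank2Code
    ; [bank 2 code and data...]
In this layout Bank2Code sits at offset $1000 in the ROM file, but its label
value is $1000, the address the CPU will actually see once bank 2 is selected.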
Here's a basic template for doing F8 bankswitching. This can easily be modified
to work with similar bankswitching methods (F4,F6,etc). This code allows you to
call "Bank2Subroutine" from bank 1 using jsr CallBank2Subroutine.
;------------------------------------
;This code in bank 1
;Switches to bank 2 where the subroutine is called.
    org $1FE0
CallBank2Subroutine
    ldx $1FF9 ; switch to bank 2
    nop ; 1FE3 jsr Bank2Subroutine
    nop ; .
    nop ; .
    nop ; 1FE6 ldx $1FF8 (Switch back to bank 1)
    nop ; .
    nop ; .
    rts ; 1FE9 runs in bank 1 after the switch back
;------------------------------------
;This code in bank 2:
;Calls the subroutine and returns to bank 1
    org $2FE3
    rorg $1FE3
    jsr Bank2Subroutine
    ldx $1FF8 ;(Switch back to bank 1)
It's good practice to assume that your multi-bank ROM could start up in any
bank. In each bank, set up the startup vector so it points to code that switches
to the correct startup bank and then jumps to the start of your program.
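One possible way to do this for F8, shown only as a sketch: put the same
startup stub at the same address in both banks, so it does not matter which
bank the switch happens in. The address $1FD0 and the labels ColdStart and Main
are assumptions made for this example; Main is taken to live in bank 1.
;This code in bank 1
    org $1FD0
ColdStart
    lda $1FF8 ; select bank 1 (harmless if already there)
    jmp Main ; always fetched from bank 1
    org $1FFC
    .word ColdStart ; RESET vector
    .word ColdStart ; BRK vector
;This code in bank 2: same bytes at the same CPU addresses
    org $2FD0
    rorg $1FD0
    lda $1FF8
    jmp Main
    org $2FFC
    rorg $1FFC
    .word ColdStart
    .word ColdStart
If the console powers up in bank 2, the LDA triggers the switch and the next
opcode fetch at $1FD3 already comes from bank 1, where the JMP is waiting.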
==============================================================
==============================================================
BRK Subroutine Trick
--------------------------------------------------------------
Mark Lesser, Thomas Jentzsch
==============================================================
==============================================================
Thomas found this trick in Mark Lesser's Lord of the Rings prototype: You can
use BRK to call a subroutine that needs to be called often and save ROM space.
If you aren't familiar with BRK, it pushes the flags and PC on the stack and
jumps to wherever the vector $FFFE is pointing.
Thomas found BRK commands like this scattered through Lord of the Rings:
    brk
    .byte $0e ; id-byte
    lda $e3 ; <- here we will continue
    ora #$04
And the BRK vector was pointing to this routine:
BrkRoutine:
    plp ; remove flags from stack (not needed)
    tsx ; load x with stackpointer
    inx ; x++
    dec $00,x ; adjust return address
    lda ($00,x) ; read break-id...
    tay ; ...and store in y
Subroutine:
    [subroutine code...]
    rts
So it ended up being the equivalent of passing a value to a subroutine similar
to this:
    ldy #value
    jsr Subroutine
But it saves 3 bytes with each call and the overhead is only 8 bytes.
After only 3 subroutine calls (Lord of the Rings has about 20) you are saving
ROM space.
In the case of Lord of the Rings, the subroutine was sound related code that
selected the sound effect to be played based on a priority system.
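For completeness, here is a minimal sketch of how the vector itself could be
set up. It assumes a 4K ROM assembled at $F000-$FFFF and a hypothetical entry
point named Start; neither detail comes from Lord of the Rings.
    org $FFFC
    .word Start ; RESET vector
    .word BrkRoutine ; $FFFE: used by BRK
Since the 2600's CPU has no interrupt pins, nothing else uses the $FFFE vector,
which is what makes it free for this kind of trick.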
==============================================================
==============================================================
Checking the Number of Scanlines
--------------------------------------------------------------
Eckhard Stolberg
==============================================================
==============================================================
You'll want to verify that your program is drawing the desired number of
scanlines (around 262 NTSC and 312 PAL) and is not varying that number while
running. To do this, use the Z26 emulator in video mode 9. While Z26 is
running, press ALT-9 to enter this mode which will display the number of
scanlines in the upper right corner. The -v9 switch will start Z26 in this
mode.
==============================================================
==============================================================
Constant Cycle Count to Avoid WSYNC
--------------------------------------------------------------
==============================================================
==============================================================
Normally you use STA WSYNC towards the end of each scanline in the kernel to
stay in sync with the TV. But by carefully programming your kernel so each line
consistently takes exactly 76 cycles, loop branch included, you can avoid using
STA WSYNC at all in your kernel loop. This will save you at least 3 cycles per
scanline.
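To illustrate the bookkeeping, here is a small sketch of a loop that adds up to
exactly 76 cycles. The labels PFData and Waste12 and the 192-line count are
assumptions for this example. It also assumes that PFData-1 through PFData+191
sit in one page and that the loop itself does not straddle a page boundary,
since either would add a cycle.
    ldy #192 ; visible lines
    sta WSYNC ; sync once, just before entering the loop
KernelLoop
    lda PFData-1,y ; 4
    sta PF1 ; 3
    jsr Waste12 ; 12
    jsr Waste12 ; 12
    jsr Waste12 ; 12
    jsr Waste12 ; 12
    jsr Waste12 ; 12
    nop ; 2
    nop ; 2
    dey ; 2
    bne KernelLoop ; 3 when taken = 76 cycles per line
    ; somewhere outside the kernel
Waste12
    rts ; jsr (6) + rts (6)
In a real kernel the JSRs and NOPs would be replaced by useful work with the
same total, and every code path through the line has to cost the same.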
==============================================================
==============================================================
Counting Down When Looping
--------------------------------------------------------------
==============================================================
==============================================================
Have loops count down whenever possible since this can save a lot of cycles and
a little bit of ROM too.
------------------------------------------
Counting up requires a compare:
    ldx #0
Loop
    [your code...]
    inx
    cpx #20
    bne Loop
------------------------------------------
Counting down allows you to get rid of the compare:
    ldx #20
Loop
    [your code...]
    dex
    bne Loop
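Note that the count-down loop runs with X going from 20 down to 1. When X is
also a table index and entry 0 has to be included, a common variation is to
start at count-1 and branch with BPL. A minimal sketch, with Source and Dest as
made-up labels:
    ldx #19 ; index of the last entry
CopyLoop
    lda Source,x
    sta Dest,x
    dex
    bpl CopyLoop ; exits once X wraps from 0 to $FF
This is safe for starting indexes up to 127, since BPL tests bit 7 of X.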
==============================================================
==============================================================
Illegal Opcodes
--------------------------------------------------------------
Thomas Jentzsch
==============================================================
==============================================================
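These are the most commonly used illegal opcodes in 2600 programming and are
considered safe. Each entry gives the opcode for one addressing mode, followed
by its size in bytes and its cycle count (bytes/cycles).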
Decrement and Compare:
Decrements the memory location and compares it to the accumulator.
DCP Opcode:$C3 M <- (M)-1, (A-M) -> NZC (Ind,X) 2/8
Double NOP:
No operation command that takes 3 cycles and uses two bytes.
DOP Opcode:$04 [no operation] (Z-Page) 2/3
Load A and X:
Loads both A and X with the memory.
LAX Opcode:$AF A <- M, X <- M (Absolute) 3/4
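If your assembler does not know these mnemonics, the opcodes can be emitted as
raw bytes. The sketch below uses the zero page forms, which are the ones most
often wanted in a kernel. Temp and Counter are hypothetical zero page RAM
labels.
    .byte $04,$2D ; DOP $2D: 2 bytes, 3 cycles, does nothing
    .byte $A7,Temp ; LAX Temp: A and X <- Temp, 2 bytes, 3 cycles
    .byte $C7,Counter ; DCP Counter: dec Counter, then cmp, 2 bytes, 5 cycles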
==============================================================
==============================================================
Insurance Against Too Many VBLANK/Overscan Cycles
--------------------------------------------------------------
Paul Slocum
==============================================================
==============================================================
It's difficult to make sure that there is no unusual case where your game logic
in VBlank and Overscan will use too many cycles and cause the number of
scanlines in a frame to fluctuate. To insure against this problem, pad your
VBlank and Overscan with a few STA WSYNCs to add extra cycles during
development. Optimize your game so that it runs fine with these in. Then when
the game is finished and ready for release, comment out those extra WSYNCs.
This is also an easy way to estimate how much time you have left in VBlank and
Overscan: keep adding WSYNCs (each line is 76 cycles) until the screen jumps.
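In practice the padding can be as simple as the following sketch. The Overscan
label and the choice of three lines are only an illustration.
Overscan
    ; [overscan game logic...]
    sta WSYNC ; padding line 1, comment out for release
    sta WSYNC ; padding line 2
    sta WSYNC ; padding line 3
    ; [wait for the end of overscan as usual...]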
==============================================================
==============================================================
The Multi-Sprite Trick
--------------------------------------------------------------
Christopher Tumber
Todo: Add ASCII Images
==============================================================
==============================================================
The multi-sprite trick is a technique which allows a programmer to put more sprites on a scanline than would normally be
allowable. Using this trick, up to 18 [?] sprites may be displayed on a scanline.
This trick may be used with Player0, Player1, Missile0 and/or Missile1 in any combination. The most common use is with
P0 and P1 to create a row of sprites.
There are important limitations with this trick. This trick essentially allows you to create more than the normal limit
of 3 copies of a sprite. However, in doing so you probably will not have time to change the bitmap, colour or size of the
sprite (as applicable). So the kinds of displays created with this technique will usually be repetitive patterns.
The trick is accomplished by first setting NUSIZ0 and/or NUSIZ1 to display 2 or more copies of the sprites you want to
display [Is the exact setting significant?]. Your program then must strobe RESP0, RESP1, RESM0 and/or RESM1 repeatedly.
If this is timed correctly so that it occurs after the first copy is drawn but before the second then the TIA is tricked
into thinking it's just started drawing sprites and continues drawing the sprite as if it's the first copy. You just
continue this for every copy you need.
The tightest formation of sprites possible with this trick is:
[image]
which is done by the following code:
    sta RESP0
    sta RESP1
    sta RESP0
    sta RESP1
    sta RESP0
    sta RESP1
    sta RESP0
    sta RESP1
    sta RESP0
    sta RESP1
However, this formation has one problem in that the last sprite on a row is shifted one pixel to the left. There is no
known solution to this problem. If you can work this "glitch" into your game (for example as part of an asynchronous
display) then you can use this layout. If not, the closest "stable" formation is:
[image]
which is done by the following code:
sta RESP0,x
sta RESP1,x
sta RESP0,x
sta RESP1,x
sta RESP0,x
sta RESP1,x
sta RESP0,x
sta RESP1,x
sta RESP0,x
sta RESP1,x
or
sta.w RESP0
sta.w RESP1
sta.w RESP0
sta.w RESP1
sta.w RESP0
sta.w RESP1
sta.w RESP0
sta.w RESP1
sta.w RESP0
sta.w RESP1
The ,x and .w in this example are "dummies". They're only used to increase the length of time taken by each instruction
by 1 cycle. The tradeoff is that the former requires the X register to be set to zero and the latter results in slightly
larger code.
If you need to turn off some of the sprites, you can do this by simply skipping their spot by inserting a dummy command.
For example:
[image]
    sta RESP0
    sta RESP1
    sta Dummy
    sta RESP1
    sta RESP0
    sta Dummy
    sta RESP0
    sta RESP1
    sta RESP0
    sta RESP1
Where Dummy is an unused (or scratch or temporary) RAM location.
This trick is quite easy to use to generate static displays. However, if you want a fully dynamic display things get
considerably more complicated.
Since there is no time available while drawing the sprites to do any calculations, if you want a variable number of
sprites on and off you must predetermine which sprites to display. A simple way to do this is to create a subroutine
for every possible combination of on/off sprites. Then your program just needs to call the appropriate subroutine.
The problem with this approach is that if you have a lot of sprites, the number of subroutines becomes very large.
An alternative method is to place your drawing routine in RAM, adjusted for the current display. In the above example,
copy all the STA RESP0 and STA RESP1 commands into RAM and then, where sprites don't appear, substitute a dummy RAM
location for the relevant RESP0 or RESP1.
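A rough sketch of the RAM approach follows. The labels RowCode, RowCodeEnd, RamRow and CopyRow are invented for this
example, only three pairs of stores are shown to keep it short, and RamRow is assumed to be a free block of zero page
RAM large enough for the routine. Each zero page STA is two bytes (opcode, then register address), so the operand of
the third store sits at offset 5.
    ldx #RowCodeEnd-RowCode-1
CopyRow
    lda RowCode,x ; copy the template from ROM...
    sta RamRow,x ; ...into RAM
    dex
    bpl CopyRow
    lda #Dummy ; address of the scratch byte
    sta RamRow+5 ; third sprite is now switched off
    ; later, at the right cycle inside the kernel:
    jsr RamRow
    ; template, kept in ROM outside the kernel
RowCode
    sta RESP0 ; offsets 0-1
    sta RESP1 ; offsets 2-3
    sta RESP0 ; offsets 4-5
    sta RESP1
    sta RESP0
    sta RESP1
    rts
RowCodeEnd
Keep in mind that the JSR needs two bytes of stack, which come out of the same 128 bytes of RAM as the routine itself.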
There is one thing you must be aware of if you are allowing individual sprites to be switched on and off: there are a
number of combinations which you must treat as exceptions. They cannot be displayed using the general display as above.
For example, if either RESP0 or RESP1 needs to display only 1 copy of a sprite (because all other copies are off)
then you must reset NUSIZ0/NUSIZ1. This would also be the case when two copies of a sprite are set too far apart
for the second STA RESPn to occur before a "normal" copy is drawn.
In addition, these sprites are not positioned horizontally like normal sprites. Rather their position is determined by which
display cycle the first STA RESPn or RESMn occurs. So if you want to be able to reposition your sprites horizontally,
you will either need to add more subroutines (as above) or adjust your RAM routine further. Or some combination of the
two. Space Instigators uses a different subroutine for each possible horizontal position, copies that routine into RAM
and then modifies the STA RESPn commands to turn off dead Instigators. The new, single scanline repositioning routine may
be of help here, however the multi-sprite trick tends to use up so much of your scanline time that if you're
trying to do other things (e.g. display other sprites) on that scanline you may not have the luxury of enough
cycles for general purpose positioning code.
The multi-sprite trick has a side effect in that the formation of sprites is shifted [?exact number?] pixels right
as compared to where a normal sprite would appear with an STA RESPn at that cycle. This results in a left margin
that's not at the left edge of the screen. "Illegal" HMOVE/HMMn combination tricks may be used to fix this but with a
corresponding increase in complexity.
[Some more example code here]
References: The multi-sprite trick was originally used in Galaxian and was pioneered by Eckhard Stolberg, John
Saeger, Erik Mooney and Thomas Jentzsch. Search Stella List under "Grid demo","trick18","trick12" and "inv3".
==============================================================
==============================================================
Paddles
--------------------------------------------------------------
Thomas Jentzsch
Todo: Explain and include how to discharge cap
==============================================================
==============================================================
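Assumes Y is your kernel line counter. Both paths take 9 cycles: when the
branch is not taken, the $2C byte turns the following STY into the operand of a
BIT absolute instruction, so the store is skipped.
    lda INPT0 ;3
    bmi paddles1 ;2 or 3
    .byte $2c ;4 BIT abs opcode, swallows the sty below
paddles1:
    sty padVal1 ;3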
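The paddle capacitors also have to be discharged before each measurement. This
is done with bit 7 of VBLANK, which dumps the INPT0-INPT3 inputs to ground while
it is set. A minimal sketch, assuming the dump is combined with the normal
vertical blank:
    ; at the start of vertical blank
    lda #%10000010 ; D7 = dump paddle caps, D1 = blanking on
    sta VBLANK
    ; [vertical blank work...]
    lda #0 ; D7 = 0 lets the caps charge, D1 = 0 ends blanking
    sta VBLANK
    ; the kernel follows and samples INPT0 once per line
Bit 7 of INPT0 reads as set once the capacitor has charged, and how many lines
that takes depends on the paddle position.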
==============================================================
==============================================================
Showing Missiles/Ball using PHP
==============================================================
==============================================================
This trick is originally from Combat and is probably the most efficient way to
display the missiles and/or ball. This trick just requires that you don't use
the stack during your kernel. Recall that:
ENABL = $1F
ENAM1 = $1E
ENAM0 = $1D
In this example I'll show how to use the trick for both missiles. You can
easily adapt it for the ball too. To set the trick up, before your kernel save
the stack pointer and point it at ENAM1. PHP stores the flags at the address in
the stack pointer and then decrements it, so the first PHP writes ENAM1 and the
second writes ENAM0.
    tsx ; Transfer stack pointer to X
    stx SavedStackPointer ; Store it in RAM
    ldx #ENAM1
    txs ; Point the stack at ENAM1
Now during the kernel you can compare your scanline counter to your missile
position register and this will set the zero flag in the processor. Then to
enable/disable the missile for that scanline, just push the processor flags onto
the stack. The ENAxx registers use bit 1 to enable/disable which corresponds
with the zero flag in the processor, so the enable/disable will be automatic.
It takes few cycles and doesn't vary the number of cycles depending on the
result like branching usually does.
    ; On each line of your kernel...
    cpy MissilePos1 ; Assumes Y is your kernel line counter
    php ; writes ENAM1
    cpy MissilePos0
    php ; writes ENAM0
Then before you do it again, somewhere on each scanline you need to pull off the
stack again using two PLA's or PLP's, or you can manually reset the stack
pointer with ldx #ENAM1, txs.
After your kernal, restore the stack pointer:
    ldx SavedStackPointer
    txs
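To include the ball, start one register higher. Because each PHP stores at the
current stack pointer and then decrements it, three pushes starting from ENABL
land on ENABL, ENAM1 and ENAM0 in that order. BallPos is a hypothetical
variable holding the ball's line, in the same style as the missile variables.
    ldx #ENABL
    txs ; once per line, before the compares
    cpy BallPos
    php ; writes ENABL
    cpy MissilePos1
    php ; writes ENAM1
    cpy MissilePos0
    php ; writes ENAM0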
==============================================================
==============================================================
Skipdraw
--------------------------------------------------------------
Thomas Jentzsch
Todo: Explain and clean up
==============================================================
==============================================================
The best way I knew until now was (if Y contains the line counter):
    tya ; 2
    ; sec ; 2 <- this can sometimes be avoided
    sbc SpriteEnd ; 3
    adc #SPRITEHEIGHT ; 2
    bcc .skipDraw ; 2 = 9-11 cycles
    ...
---------- or ---------
If you like using illegal opcodes, you can use dcp (dec,cmp) here:
    lda #SPRITEHEIGHT ; 2
    dcp SpriteEnd ; 5 initial value has to be adjusted
    bcc .skipDraw ; 2 = 9
    ...
Advantages:
- state of carry flag doesn't matter anymore (may save 2 cycles)
- a remains constant, could be useful for a 2nd sprite
- you could use the content of SpriteEnd instead of y for accessing sprite data
;==================================
;An Example:
;
; skipDraw routine for right player
    TXA ; 2 A-> Current scanline
    SEC ; 2 Set Carry
    SBC slowP1YCoordFromBottom+1 ; 3
    ADC #SPRITEHEIGHT+1 ; 2 calc if sprite is drawn
    BCC skipDrawRight ; 2/3 To skip or not to skip?
    TAY ; 2
    lda P1Graphic,y ; 4
continueRight:
    STA GRP0
;----- this part outside of kernel
skipDrawRight ; 3 from BCC
    LDA #0 ; 2
    BEQ continueRight ; 3 Return...
==============================================================
==============================================================
Sound and Music
--------------------------------------------------------------
==============================================================
==============================================================
Atari 2600 Music Programming Guide and music driver code:
http://qotile.net/sequencer.html
Eckhard Stolberg's Frequency and Waveform Guide:
http://buerger.metropolis.de/estolberg
==============================================================
==============================================================
Using BRK with RESXX
--------------------------------------------------------------
Eckhard Stolberg
==============================================================
==============================================================
(I'm not sure what you'd do with this trick but it's pretty interesting. Maybe
somebody will figure out how to use it.)
Pole Position puts the stack pointer over the RESxx registers and then does a
BRK. There are three write cycles in a BRK instruction, so the three position
registers for the objects that make up the road in PP get accessed in three
consecutive cycles. This is how PP managed to get the road to meet so closely at
the horizon.
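A speculative sketch of the idea, not taken from the Pole Position code: the
text does not say which three registers the game targets, so RESBL, RESM1 and
RESM0 are an assumption here, as are the labels SavedSP and BrkTarget. BRK
pushes three bytes downward from the stack pointer, so starting at RESBL ($14)
the writes land on $14, $13 and $12.
    tsx
    stx SavedSP ; keep the real stack pointer
    ldx #RESBL
    txs
    brk ; strobes RESBL, RESM1, RESM0 on consecutive cycles
    ; execution continues wherever the $FFFE vector points:
BrkTarget
    ldx SavedSP
    txs ; restore the real stack pointer
The values written do not matter because the RESxx registers are strobes, but
the BRK vector has to point at code that carries on with the kernel.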
==============================================================
==============================================================
Wasting Cycles
--------------------------------------------------------------
Christopher Tumber, Chris Wilkson, Andrew Davie
==============================================================
==============================================================
These are the most efficient ways to waste processor cycles.
Note that locations $2D-$3F do nothing and aren't decoded, and so they are used
often here. In some bankswitching schemes this could cause problems though.
-----------------
1 Cycle (0 or 1 byte)
    .w (Change a zero page instruction to absolute, adds 1 byte of code)
    ,x (Change a zero page or absolute instruction to an indexed instruction.
        Make sure x=0. Can also use Y)
-----------------
2 Cycles (1 byte)
    nop
-----------------
3 Cycles (2 bytes)
    sta $2D
- or -
    lda $2D
- or -
    dop (Double NOP illegal opcode)
-----------------
4 Cycles (2 bytes)
    nop
    nop
-----------------
5 Cycles (2 bytes)
    dec $2D
- or - (3 bytes)
    sta $1800,X ; assumes you can write to ROM without problems
-----------------
6 Cycles (2 bytes)
    lda ($80,X) ; assumes possible reads from 0-$7f have no effect
6 Cycles (3 bytes)
    nop
    nop
    nop
-----------------
7 Cycles (2 bytes, need 1 byte free on stack)
    pha
    pla
-----------------
8 Cycles (3 bytes)
    lda ($80,X) ; assumes possible reads from 0-$7f have no effect
    nop
-----------------
9 Cycles (3 bytes, need 1 byte free on stack)
    pha
    pla
    nop
9 Cycles (4 bytes)
    dec $2D
    nop
    nop
-----------------
10 Cycles (4 bytes)
    dec $2D
    dec $2D
- or -
    rol $80
    ror $80 ; leaves $80 and the carry unchanged
-----------------
11 Cycles (5 bytes)
ASSUMING we can safely write to ROM and have nothing disastrous...
    STA $1800,X ; address must be in cartridge space (A12 set)
    LDA ($80,X) ; assumes possible reads from 0-$7f have no effect
-----------------
12 Cycles (3 bytes, need 2 bytes free on stack)
    jsr return
    ; somewhere else
return:
    rts
-----------------
12 Cycles (4 bytes)
    LDA ($80,X) ; assumes possible reads from 0-$7f have no effect
    LDA ($80,X) ; assumes possible reads from 0-$7f have no effect
Also:
You can use PHA/PHP (1 byte 3 cycles) or PLA/PLP (1 byte 4 cycles) alone but
you have to be careful not to mess up your stack (PLP/PHA would be useful if
you have no stack!)
==============================================================
==============================================================
HMOVE Timing Chart
--------------------------------------------------------------
Brad Mott
==============================================================
==============================================================
Typically HMOVE is executed right after WSYNC, but hitting HMOVE at other times
during the line has effects that can sometimes be useful. It's possible to move
objects without the black HMOVE bars and/or move objects farther than would
normally be possible with HMPx registers. This test was performed using player
graphics, but will probably work with the ball and missiles as well.
HMPx values
0 1 2 3 4 5 6 7 8 9 a b c d e f
Cyc
10 0 -1 -2 -2 -2 -2 -2 -2 8 7 6 5 4 3 2 1 ** HBLANK
11 0 -1 -1 -1 -1 -1 -1 -1 8 7 6 5 4 3 2 1 HBLANK
12 0 0 0 0 0 0 0 0 8 7 6 5 4 3 2 1 HBLANK
13 1 1 1 1 1 1 1 1 8 7 6 5 4 3 2 1 HBLANK
14 1 1 1 1 1 1 1 1 8 7 6 5 4 3 2 1 ** HBLANK
15 2 2 2 2 2 2 2 2 8 7 6 5 4 3 2 2 HBLANK
16 3 3 3 3 3 3 3 3 8 7 6 5 4 3 3 3 HBLANK
17 4 4 4 4 4 4 4 4 8 7 6 5 4 4 4 4 HBLANK
18 4 4 4 4 4 4 4 4 8 7 6 5 4 4 4 4 ** HBLANK
19 5 5 5 5 5 5 5 5 8 7 6 5 5 5 5 5 HBLANK
20 6 6 6 6 6 6 6 6 8 7 6 6 6 6 6 6 HBLANK
21 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
.
.
53 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
54 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
55 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0
56 0 0 0 0 0 0 -1 -2 0 0 0 0 0 0 0 0
57 0 0 0 0 0 -1 -2 -3 0 0 0 0 0 0 0 0
58 0 0 0 0 0 -1 -2 -3 0 0 0 0 0 0 0 0 **
59 0 0 0 0 -1 -2 -3 -4 0 0 0 0 0 0 0 0
60 0 0 0 -1 -2 -3 -4 -5 0 0 0 0 0 0 0 0
61 0 0 -1 -2 -3 -4 -5 -6 0 0 0 0 0 0 0 0
62 0 0 -1 -2 -3 -4 -5 -6 0 0 0 0 0 0 0 0 **
63 0 -1 -2 -3 -4 -5 -6 -7 0 0 0 0 0 0 0 0
64 -1 -2 -3 -4 -5 -6 -7 -8 0 0 0 0 0 0 0 0
65 -2 -3 -4 -5 -6 -7 -8 -9 0 0 0 0 0 0 0 -1
66 -2 -3 -4 -5 -6 -7 -8 -9 0 0 0 0 0 0 0 -1 **
67 -3 -4 -5 -6 -7 -8 -9 -10 0 0 0 0 0 0 -1 -2
68 -4 -5 -6 -7 -8 -9 -10 -11 0 0 0 0 0 -1 -2 -3
69 -5 -6 -7 -8 -9 -10 -11 -12 0 0 0 0 -1 -2 -3 -4
70 -5 -6 -7 -8 -9 -10 -11 -12 0 0 0 0 -1 -2 -3 -4 **
71 -6 -7 -8 -9 -10 -11 -12 -13 0 0 0 -1 -2 -3 -4 -5
72 -7 -8 -9 -10 -11 -12 -13 -14 0 0 -1 -2 -3 -4 -5 -6
73 -8 -9 -10 -11 -12 -13 -14 -15 0 -1 -2 -3 -4 -5 -6 -7
74 -8 -9 -10 -11 -12 -13 -14 -15 0 -1 -2 -3 -4 -5 -6 -7 **
75 0 -1 -2 -3 -4 -5 -6 -7 8 7 6 5 4 3 2 1 HBLANK
76 0 -1 -2 -3 -4 -5 -6 -7 8 7 6 5 4 3 2 1 HBLANK
77 0 -1 -2 -3 -4 -5 -6 -7 8 7 6 5 4 3 2 1 HBLANK
78 0 -1 -2 -3 -4 -5 -6 -7 8 7 6 5 4 3 2 1 HBLANK
79 0 -1 -2 -3 -4 -5 -6 -7 8 7 6 5 4 3 2 1 HBLANK
80 0 -1 -2 -3 -4 -5 -6 -6 8 7 6 5 4 3 2 1 HBLANK
81 0 -1 -2 -3 -4 -5 -5 -5 8 7 6 5 4 3 2 1 HBLANK
82 0 -1 -2 -3 -4 -5 -5 -5 8 7 6 5 4 3 2 1 ** HBLANK
83 0 -1 -2 -3 -4 -4 -4 -4 8 7 6 5 4 3 2 1 HBLANK
84 0 -1 -2 -3 -3 -3 -3 -3 8 7 6 5 4 3 2 1 HBLANK
85 0 -1 -2 -2 -2 -2 -2 -2 8 7 6 5 4 3 2 1 HBLANK
( table repeats at this point)