Checkpoint for sprite rewrite

All of the sprite rendering has been deferred down to the level of
the tile drawing. Sprites are no longer drawn/erased, but instead
a sprite sheet is generated in AddSprite and referenced by the
renderer.

Because there is no longer a single off-screen buffer that holds
a copy of all the rendered sprites, the TileStore size must be
expanded to hold a reference to the sprite data address for each
tile.  This increase in data structure size requires the TileStore
to be put into its own bank, along with appropriate code restructuring.

The benefits to the rewrite are significant:

  1. Sprites are never drawn/erased off-screen.  They are only
     ever drawn directly to the screen or play field.
  2. The concept of "damaged" sprites is gone.  Every dirty tile
     automatically renders just the portion of a sprite that it
     intersects.

These two properties result in a substantial increase in throughput.
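
As a rough illustration of the second point, the sketch below models in
Python (it is not the engine's 65816 code) how a tile-sized block copy from
a padded sprite stamp yields only the part of the sprite that overlaps a
tile. The names make_stamp and cut_tile, the one-cell-per-pixel
representation, and the 8-cell padding are assumptions made for this example.

```python
TILE = 8  # tiles are 8x8 cells in this model (an assumption for the sketch)

def make_stamp(sprite, pad=TILE):
    """Surround a sprite (a list of rows) with `pad` empty cells on every side."""
    width = len(sprite[0]) + 2 * pad
    rows = [[0] * width for _ in range(pad)]
    for row in sprite:
        rows.append([0] * pad + list(row) + [0] * pad)
    rows += [[0] * width for _ in range(pad)]
    return rows

def cut_tile(stamp, dx, dy, pad=TILE):
    """Copy the 8x8 block whose origin is (dx, dy) relative to the sprite's top-left corner."""
    x0, y0 = pad + dx, pad + dy
    return [stamp[y][x0:x0 + TILE] for y in range(y0, y0 + TILE)]

sprite = [[1] * 16 for _ in range(16)]   # a solid 16x16 sprite
stamp = make_stamp(sprite)
corner = cut_tile(stamp, -2, -3)         # tile origin is 2 cells left of and 3 above the sprite
covered = sum(map(sum, corner))          # number of tile cells the sprite covers
```

Because the padding is part of the stamp, the copy never needs clipping
logic: moving the origin is enough to select a different slice of the sprite.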
Lucas Scharenbroich 2022-02-18 12:12:32 -06:00
parent d96928e562
commit 95058fb969
11 changed files with 959 additions and 847 deletions


@ -177,15 +177,6 @@ _TileStoreOffsetX mac
adc TileStoreYTable,x
<<<
_; Macro variant to calculate inline from any source
_SpriteVBuffAddr mac
lda ]2
clc
adc #NUM_BUFF_LINES
xba
adc ]1
<<<
; Macro to define script steps
ScriptStep MAC
IF #=]5


@ -12,14 +12,76 @@
NO_INTERRUPTS equ 1 ; turn off for crossrunner debugging
NO_MUSIC equ 1 ; turn music + tool loading off
; External data provided by the main program segment
; External data space provided by the main program segment
tiledata EXT
TileStore EXT
; Sprite plane data and mask banks are provided as an external segment
;
; The sprite data holds a set of pre-rendered sprites that are optimized to support the rendering pipeline. There
; are four copies of each sprite, along with the corresponding mask laid out into 4x4 tile regions where the
; empty row and column is shared between adjacent blocks.
;
; Logically, the memory is laid out as 4 columns of sprites and 4 rows.
;
; +---+---+---+---+---+---+---+---+---+---+---+---+-...
; | | | | | | | | | | | | | ...
; +---+---+---+---+---+---+---+---+---+---+---+---+-...
; | | 0 | 0 | | 1 | 1 | | 2 | 2 | | 3 | 3 | ...
; +---+---+---+---+---+---+---+---+---+---+---+---+-...
; | | 0 | 0 | | 1 | 1 | | 2 | 2 | | 3 | 3 | ...
; +---+---+---+---+---+---+---+---+---+---+---+---+-...
; | | | | | | | | | | | | | ...
; +---+---+---+---+---+---+---+---+---+---+---+---+-...
; | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | ...
; +---+---+---+---+---+---+---+---+---+---+---+---+-...
; | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | ...
; +---+---+---+---+---+---+---+---+---+---+---+---+-...
; | | | | | | | | | | | | | ...
; +---+---+---+---+---+---+---+---+---+---+---+---+-...
;
; For each sprite, when it needs to be copied into an on-screen tile, it could exist at any offset compared to its
; natural alignment. By having a buffer around the sprite data, an address pointer can be set to a different origin
; and a simple 8x8 block copy can cut out the appropriate bit of the sprite. For example, here is a zoomed-in look
; at a sprite with an offset, O, at (-2,-3). As shown, by selecting an appropriate origin, just the top corner
; of the sprite data will be copied.
;
; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+..
; | | || | || | | | || | | | |
; +---+-- O----------------+ --+---++---+---+---+---++---+---+---+---+..
; | | | | | || | | | || | | | |
; +---+-- | | --+---++---+---+---+---++---+---+---+---+..
; | | | | | || | | | || | | | |
; +---+-- | | --+---++---+---+---+---++---+---+---+---+..
; | | | | | || | | | || | | | |
; +===+== | ++===+== | ==+===++===+===+===+===++===+===+===+===+..
; | | | || | S | S | S || S | S | S | || | | | |
; +---+-- +----------------+ --+---++---+---+---+---++---+---+---+---+..
; | | || S | S S | S || S | S | S | S || | | | |
; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+..
; | | | | || S | S | S | S || S | S | S | S || | | | |
; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+..
; | | | | || S | S | S | S || S | S | S | S || | | | |
; +===+===+===+===++===+===+===+===++===+===+===+===++===+===+===+===+..
; | | | | || S | S | S | S || S | S | S | S || | | | |
; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+..
; | | | | || S | S | S | S || S | S | S | S || | | | |
; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+..
; | | | | || S | S | S | S || S | S | S | S || | | | |
; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+..
; | | | | || | S | S | S || S | S | S | || | | | |
; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+..
; . . . . . . . . . . . . . . . . .
;
; Each sprite will effectively take up 9 tiles of storage space per
; instance (plus edges) and there are 4 instances for the H/V bits
; and 4 more for the masks. This results in a need for 43,264 bytes
; for all 16 sprites.
spritedata EXT
spritemask EXT
; IF there are overlays, they are provided as an external
; If there are overlays, they are provided as an external
Overlay EXT
; Core engine functionality. The idea is that that source file can be PUT into
@ -285,12 +347,13 @@ EngineReset
; 7. Ancient Land of Y's : 36 x 16 288 x 128 (18,432 bytes ( 57.6%))
; 8. Game Boy Color : 20 x 18 160 x 144 (11,520 bytes ( 36.0%))
; 9. Agony (Amiga) : 36 x 24 288 x 192 (27,648 bytes ( 86.4%))
; 10. Atari Lynx : 20 x 13 160 x 102 (8,160 bytes ( 25.5%))
;
; X = mode number OR width in pixels (must be multiple of 2)
; Y = height in pixels (if X > 8)
ScreenModeWidth dw 320,272,256,256,280,256,240,288,160,320
ScreenModeHeight dw 200,192,200,176,160,160,160,128,144,1
ScreenModeWidth dw 320,272,256,256,280,256,240,288,160,288,160,320
ScreenModeHeight dw 200,192,200,176,160,160,160,128,144,192,102,1
SetScreenMode ENT
phb
@ -301,8 +364,8 @@ SetScreenMode ENT
rtl
_SetScreenMode
cpx #9
bcs :direct ; if x > 8, then assume X and Y are the dimensions
cpx #11
bcs :direct ; if x > 10, then assume X and Y are the dimensions
txa
asl
@ -415,6 +478,7 @@ _ReadControl
put Graphics.s
put Sprite.s
put Sprite2.s
put SpriteRender.s
put Render.s
put Timer.s
put Script.s


@ -81,9 +81,10 @@ BG1TileMapPtr equ 86
SCBArrayPtr equ 90 ; Used for palette binding
SpriteBanks equ 94 ; Bank bytes for the sprite data and sprite mask
LastRender equ 96 ; Record which render function was last executed
DamagedSprites equ 98
; DamagedSprites equ 98
SpriteMap equ 100 ; Bitmap of open sprite slots.
Next equ 102
ActiveSpriteCount equ 102
Next equ 104
BankLoad equ 128
@ -154,13 +155,13 @@ SPRITE_HFLIP equ $0200
MAX_TILES equ {26*41} ; Number of tiles in the code field (41 columns * 26 rows)
TILE_STORE_SIZE equ {MAX_TILES*2} ; The tile store contains a tile descriptor in each slot
TS_TILE_ID equ TILE_STORE_SIZE*0
TS_DIRTY equ TILE_STORE_SIZE*1
TS_SPRITE_FLAG equ TILE_STORE_SIZE*2
TS_TILE_ADDR equ TILE_STORE_SIZE*3 ; const value
TS_CODE_ADDR_LOW equ TILE_STORE_SIZE*4 ; const value
TS_TILE_ID equ TILE_STORE_SIZE*0 ; tile descriptor for this location
TS_DIRTY equ TILE_STORE_SIZE*1 ; Flag. Used to prevent a tile from being queued multiple times per frame
TS_SPRITE_FLAG equ TILE_STORE_SIZE*2 ; Bitfield of all sprites that intersect this tile. 0 if no sprites.
TS_TILE_ADDR equ TILE_STORE_SIZE*3 ; cached value, the address of the tiledata for this tile
TS_CODE_ADDR_LOW equ TILE_STORE_SIZE*4 ; const value, address of this tile in the code fields
TS_CODE_ADDR_HIGH equ TILE_STORE_SIZE*5 ; const value
TS_WORD_OFFSET equ TILE_STORE_SIZE*6
TS_BASE_ADDR equ TILE_STORE_SIZE*7
TS_SPRITE_ADDR equ TILE_STORE_SIZE*8
TS_SCREEN_ADDR equ TILE_STORE_SIZE*9
TS_WORD_OFFSET equ TILE_STORE_SIZE*6 ; const value, word offset value for this tile if LDA (dp),y instructions are used
TS_BASE_ADDR equ TILE_STORE_SIZE*7 ; const value, because there are two rows of tiles per bank, this is set to $0000 or $8000.
TS_SCREEN_ADDR equ TILE_STORE_SIZE*8 ; cached value of on-screen location of tile. Used for DirtyRender.
TS_VBUFF_ARRAY_ADDR equ TILE_STORE_SIZE*9 ; const value to an aligned 32-byte array starting at $8000 in TileStore bank


@ -63,7 +63,7 @@ TileStore EXT ; Tile store internal data structure
RenderDirty EXT ; Render only dirty tiles + sprites directly to the SHR screen
GetSpriteVBuffAddr EXT ; X = x-coordinate (0 - 159), Y = y-coordinate (0 - 199). Return in Acc.
; GetSpriteVBuffAddr EXT ; X = x-coordinate (0 - 159), Y = y-coordinate (0 - 199). Return in Acc.
; Allocate a full 64K bank
AllocBank EXT


@ -32,4 +32,64 @@ The engine run through the following render loop on every frame
* The dirty tile list has a fast test to see if a tile has already been marked as dirty, so it is not added twice
* The tile renderer is where data from the sprite plane is combined with tile data to show the sprites on-screen.
* Typically, there will not be Overlays defined and the last step of the renderer is just a single render of all playfield lines at once.
* Typically, there will not be Overlays defined and the last step of the renderer is just a single render of all playfield lines at once.
= Sprite Redesign =
In the rendering function, for a given TileStore location, we need to be able to read an array of VBUFF addresses
for sprite data. This can be done by processing the SPRITE_BIT array into a list to get a set of offsets. These
VBUFF addresses also need to be set. Currently we are calculating the addresses in the sprite functions, but the
issue is that we need to find an addressing scheme that's essentially 2D because we have >TileStore+VARNAME,x and
Sprite+VARNAME,y, but we need something like >TileStore+VARNAME[x][y]
In a perfect scenario, we can use the following code sequence to render stacked sprites
lda tiledata,y ; tile addressed (bank register set)
ldx activeSprite+4 ; sprite VBUFF address cached on direct page
andl >spritemask,x
oral >spritedata,x
ldx activeSprite+2
andl >spritemask,x
oral >spritedata,x
ldx activeSprite
andl >spritemask,x
oral >spritedata,x
sta tmp0
; Core phases
; Convert bit field to compact array of sprite indexes
lda TileStore+VBUFF_ARR_PTR,x
sta cache
lda TileStore+SPRITE_BITS,x
bit #$0008
bne ...
lda cache ; This is 11 cycles. A PEA + TSB is 13, so a bit faster to do it at once
ora #$0006
pha
; When storing for a sprite, the corner VBUFF is calculated and stored at
base = TileStore+VBUFF_ARR_ADDR,x + SPRITE_ID
sta base
sta base+32 ; next column (mod columns)
sta base+(32*width) ; next row
sta base+(32*width+32) ; next corner
Possibilities
1. Have >TileStore+SPRITE_VBUFF be the address to an array and then manually add the y-register value so we can still use
absolute addressing
tya
adc >TileStore+SPRITE_VBUFF_ARR_ADDR,x ; Points to addresses 32 bytes apart and Y-reg is [0, 30]
tax
lda >TileStore,x ; Load the address
tay
lda 0000,y
lda 0002,y
...
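
The 2D addressing idea in the notes above can be modelled outside of assembly. The following Python sketch is only an illustration under stated assumptions: each tile owns a 32-byte array (16 sprite slots of 2 bytes), the arrays start at $8000 in the TileStore bank, and tiles are numbered in the order the arrays are handed out. The helper names (`vbuff_array_addr`, `slot_addr`, `store_vbuff`, `load_vbuff`) and the dictionary standing in for the bank are hypothetical.

```python
VBUFF_ARRAY_BASE = 0x8000  # start of the per-tile arrays, as described in the notes
ENTRY_BYTES = 32           # 16 sprite slots x 2 bytes per VBUFF address

def vbuff_array_addr(tile_index):
    """Address of the 32-byte VBUFF array that belongs to one tile."""
    return VBUFF_ARRAY_BASE + ENTRY_BYTES * tile_index

def slot_addr(tile_index, sprite_id):
    """sprite_id is the pre-doubled sprite index: an even value in [0, 30]."""
    if sprite_id % 2 or not 0 <= sprite_id <= 30:
        raise ValueError("sprite_id must be an even value in [0, 30]")
    return vbuff_array_addr(tile_index) + sprite_id

bank = {}  # sparse stand-in for the TileStore bank: address -> 16-bit word

def store_vbuff(tile_index, sprite_id, vbuff):
    bank[slot_addr(tile_index, sprite_id)] = vbuff & 0xFFFF

def load_vbuff(tile_index, sprite_id):
    return bank[slot_addr(tile_index, sprite_id)]

# Two sprites overlap tile 5; each one records its own VBUFF address.
store_vbuff(5, 6, 0x1234)
store_vbuff(5, 8, 0x2468)
```

Adding the sprite index to a per-tile base address is what stands in for the `[x][y]` lookup: one index selects the tile's array, the other selects the slot inside it.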


@ -190,29 +190,37 @@ _ApplyDirtyTiles
; Only render solid tiles and sprites
_RenderDirtyTile
pea >TileStore ; Need that addressing flexibility here. Callers responsible for restoring bank reg
plb
plb
lda TileStore+TS_SPRITE_FLAG,y ; This is a bitfield of all the sprites that intersect this tile, only care if non-zero or not
beq :nosprite
jsr BuildActiveSpriteArray ; Build the sprite index list from the bit field
; ldx TileStore+TS_SPRITE_ADDR,y
; stx _SPR_X_REG
lda TileStore+TS_TILE_ID,y ; build the finalized tile descriptor
and #TILE_VFLIP_BIT+TILE_HFLIP_BIT ; get the lookup value
xba
ldx TileStore+TS_SPRITE_FLAG,y ; This is a bitfield of all the sprites that intersect this tile, only care if non-zero or not
beq :nosprite
ldx TileStore+TS_SPRITE_ADDR,y
stx _SPR_X_REG
tax
lda DirtyTileSpriteProcs,x
ldal DirtyTileSpriteProcs,x
sta :tiledisp+1
bra :sprite
:nosprite
lda TileStore+TS_TILE_ID,y ; build the finalized tile descriptor
and #TILE_VFLIP_BIT+TILE_HFLIP_BIT ; get the lookup value
xba
tax
lda DirtyTileProcs,x ; load and patch in the appropriate subroutine
ldal DirtyTileProcs,x ; load and patch in the appropriate subroutine
sta :tiledisp+1
:sprite
ldx TileStore+TS_TILE_ADDR,y ; load the address of this tile's data (pre-calculated)
lda TileStore+TS_SCREEN_ADDR,y ; Get the on-screen address of this tile
lda TileStore+TS_SCREEN_ADDR,y ; Get the on-screen address of this tile
pha
lda TileStore+TS_WORD_OFFSET,y
@ -284,4 +292,125 @@ _TBApplyDirtySpriteData
sta: $0002+{]line*160},y
]line equ ]line+1
--^
rts
rts
; Input: A = bit field, assumed non-zero
; Output: A = number of bits set
; Side Effect: Fill in the ActiveSprite list with sprite indices.
;
; We try very hard to be fast and clever here. Early out, keeping everything in
; registers when possible, and reducing overhead.
spriteIdx equ tmp12
BuildActiveSpriteArray
; Push a sentinel value on the stack so we know where to end later. We can't count during the
; initial process, because the Z flag needs to be maintained and almost every opcode affects it.
; cmp lastActiveValue ; Assume that there is a decent chance of having the same
; beq early_out ; sprite bitfield in consecutive dirty tiles. Saves a lot.
tsx ; save the stack pointer
pea $FFFF ; sentinel value
; This first loop scans the bits in the accumulator and pushes a sprite index onto the stack. We
; could push any constant, which gives us some flexibility. This only works because the PEA
; instruction does not affect any register. We also check to see if the accumulator is zero as
; an early-out test, but only do that every 4 bits in order to amortize the overhead a bit.
]step equ 0
lup 4
ror
bcc :skip_1
pea ]step
:skip_1 ror
bcc :skip_2
pea ]step+2
:skip_2 ror
bcc :skip_3
pea ]step+4
:skip_3 ror
bcc :skip_4
pea ]step+6
:skip_4 beq :end_1
]step equ ]step+8
--^
:end_1
; This second loop pops values off of the stack and places them into a linear array. We also
; set the count on exit. As an optimization / restriction, we only allow up to four overlapping
; sprites. This is similar to the NES/C64 "8 sprites per line" restriction.
pla ; Can always assume at least one bit was set...
sta spriteIdx
pla
bmi :out_1
sta spriteIdx+2
pla
bmi :out_2
sta spriteIdx+4
pla
bmi :out_3
sta spriteIdx+6
; Reset the stack pointer if we did not pop everything off yet
txs
; These are the exit points which know exactly how many items (x2) have been processed
:out_4 lda #8
rts
:out_0 lda #0
rts
:out_1 lda #2
rts
:out_2 lda #4
rts
:out_3 lda #6
rts
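
The intent of BuildActiveSpriteArray, as described in its comments, can be modelled in a few lines of Python. This is a sketch of the intended behaviour only, not a cycle-accurate translation: the function name, the Python list used as the stack, and the returned tuple are assumptions for illustration, and the early-out test is omitted.

```python
MAX_OVERLAP = 4  # the routine keeps at most four overlapping sprites

def build_active_sprite_array(bits):
    """Return (count_x2, indices) for a non-zero 16-bit sprite bit field.

    Each set bit n contributes the pre-doubled index 2*n, matching the
    constants pushed by the PEA instructions.
    """
    bits &= 0xFFFF
    if bits == 0:
        raise ValueError("caller guarantees a non-zero bit field")
    stack = []
    for n in range(16):
        if bits & (1 << n):
            stack.append(2 * n)  # models `pea ]step+...`
    # Popping reverses the push order, so the highest set bit comes out first.
    indices = [stack.pop() for _ in range(min(len(stack), MAX_OVERLAP))]
    return 2 * len(indices), indices

count_x2, active = build_active_sprite_array(0b1001)
```

In this model the lowest-numbered sprites are the ones dropped when more than four bits are set, because they are pushed first and therefore popped last.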
; Run through all of the active sprites and put them on-screen. We have three different heuristics depending on
; how many active sprites there are intersecting this tile.
; Version 2. No sprite plane, instead each sprite has a set of pre-rendered panels and we render from
; those panels in tile-sized blocks.
;
; If there is only one sprite + tile background, then we can render directly to the screen
;
; ldal tiledata+0,x
; and sprite+MASK_OFFSET,y
; ora sprite,y
; sta 00
; ...
; sta 02
; ...
; sta A0
; ...
; sta A2
; tdc
; adc #320
; tcd
;
; Since this is a common case, it is reasonable to do so. Otherwise, we must explode the TS_SPRITE_FLAG to
; get a list of sprite origin addresses and then flatten against the tile
;
; ldal tiledata+0,x
; ldx spriteCount
; jmp (disp,x)
; ...
; ldy list+2
; and sprite+MASK_OFFSET,y
; ora sprite,y
; ldy list
; and sprite+MASK_OFFSET,y
; ora sprite,y
; sta 00
; sta 02
; sta A0
; sta A2
; tdc
; adc #320
; tcd
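
The and/ora chain sketched in the comments above flattens each sprite onto the tile word in turn. A minimal Python model follows, assuming the usual convention implied by that sequence: a mask bit of 1 keeps the background and a mask bit of 0 lets the sprite data through. The function name and the list-of-pairs input are hypothetical.

```python
def composite(tile_word, sprites):
    """Flatten (mask, data) word pairs onto a 16-bit tile word.

    The last entry in `sprites` is applied first, so entry 0 ends up on top,
    mirroring the list+2 then list order used in the comments.
    """
    word = tile_word & 0xFFFF
    for mask, data in reversed(sprites):
        word = (word & mask) | data  # models `and mask,y` followed by `ora data,y`
    return word & 0xFFFF

top = (0xFF00, 0x00AB)     # opaque in the low byte
bottom = (0x0FF0, 0x3003)  # opaque in the outer two nibbles
result = composite(0x1111, [top, bottom])
```

With a single sprite the loop runs once, which corresponds to the common case the comments single out for direct rendering.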

File diff suppressed because it is too large


@ -64,7 +64,7 @@ _LocalToTileStore
; code field offsets and then cache variations of this value needed in the rest of the subroutine
;
; The SpriteX is always the MAXIMUM value of the corner coordinates. We subtract (SpriteX + StartX) mod 4
; to find the coordinate in the sprite plane that matches up with the tile in the play field and
; to find the coordinate in the sprite cache that matches up with the tile in the play field and
; then use that to calculate the VBUFF address from which to copy sprite data.
;
; StartX SpriteX z = * mod 4 (SpriteX - z)
@ -92,21 +92,21 @@ _MarkDirtySprite
lda _Sprites+IS_OFF_SCREEN,y ; Check if the sprite is visible in the playfield
bne mdsOut
; At this point we know that we have to update the tiles that overlap the sprite plane rectangle defined
; by (Top, Left), (Bottom, Right). The general process is to figure out the top-left coordinate in the
; sprite plane that matches up with the code field and then calculate the number of tiles in each direction
; that need to be dirtied to cover the sprite.
; At this point we know that we have to update the tiles that overlap the sprite's rectangle defined
; by (Top, Left), (Bottom, Right).
clc
lda _Sprites+SPRITE_CLIP_TOP,y
adc StartYMod208 ; Adjust for the scroll offset (could be a negative number!)
tax ; Save this value
and #$0007 ; Get (StartY + SpriteY) mod 8
eor #$FFFF
inc
clc
adc _Sprites+SPRITE_CLIP_TOP,y ; subtract from the Y position (possible to go negative here)
sta TileTop ; This position will line up with the tile that the sprite overlaps with
sta TileTop ; This is the relative offset to the sprite stamp
; eor #$FFFF
; inc
; clc
; adc _Sprites+SPRITE_CLIP_TOP,y ; subtract from the Y position (possible to go negative here)
; sta TileTop ; This position will line up with the tile that the sprite overlaps with
txa ; Get back the position of the sprite top in the code field
cmp #208 ; check if we went too far positive
@ -120,7 +120,10 @@ _MarkDirtySprite
lda _Sprites+SPRITE_CLIP_BOTTOM,y ; Figure out how many tiles are needed to cover the sprite's area
sec
sbc TileTop
sbc _Sprites+SPRITE_CLIP_TOP,y
clc
adc TileTop
and #$0018 ; Clear out the lower bits and stash in bits 4 and 5
sta AreaIndex
@ -131,12 +134,14 @@ _MarkDirtySprite
adc StartXMod164
tax
and #$0003
eor #$FFFF
inc
clc
adc _Sprites+SPRITE_CLIP_LEFT,y
sta TileLeft
; eor #$FFFF
; inc
; clc
; adc _Sprites+SPRITE_CLIP_LEFT,y
; sta TileLeft
txa
cmp #164
bcc *+5
@ -146,21 +151,13 @@ _MarkDirtySprite
and #$FFFE ; Same pre-multiply by 2 for later
sta ColLeft
; Calculate the offset into the TileStore lookup array for the top-left tile
; ldx RowTop
; lda ColLeft
; clc
; adc TileStore2DYTable,x ; Fixed offset to the next row
; sta Origin ; This is the index into the TileStore2DLookup table
; Sneak a pre-calculation here. Calculate the tile-aligned upper-left corner of the sprite in the sprite plane.
; We can reuse this in all of the routines below. This is not the (x,y) of the sprite itself, but
; the corner of the tile it overlaps with
; the corner of the tile it overlaps with, relative to the sprite's VBUFF_ADDR.
clc
lda TileTop
adc #NUM_BUFF_LINES
; adc #NUM_BUFF_LINES
xba
clc
adc TileLeft
@ -170,7 +167,9 @@ _MarkDirtySprite
lda _Sprites+SPRITE_CLIP_RIGHT,y
sec
sbc TileLeft
sbc _Sprites+SPRITE_CLIP_LEFT,y
clc
adc TileLeft
and #$000C
lsr ; bit 0 is always zero and width stored in bits 1 and 2
ora AreaIndex
@ -323,20 +322,33 @@ _MarkDirtySprite
; If we had a double-sized 2D array to be able to look up the tile store address without
; adding rows and column, we could save ~6 cycles per tile
; If all that is needed is to record the Tile Store offset for the sprite and delay any
; actual calculations, then we just need to do
;
; lda TileStore2DArray,x
; sta _Sprites+TILE_STORE_ADDR_0,y
; lda TileStore2DArray+2,x
; sta _Sprites+TILE_STORE_ADDR_1,y
; lda TileStore2DArray+41,x
; sta _Sprites+TILE_STORE_ADDR_2,y
; ...
:mark_0_0
ldx RowTop
lda ColLeft
ldx RowTop
lda ColLeft
clc
adc TileStoreYTable,x ; Fixed offset to the next row
adc TileStoreYTable,x ; Fixed offset to the next row
tax
; ldx Origin
; lda TileStore2DLookup,x
; tax ; This is the tile store offset
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin ; This is an interesting case. The mapping between the tile store
lda VBuffOrigin
sta (tmp0),y
; lda VBuffOrigin ; This is an interesting case. The mapping between the tile store
; adc #{0*4}+{0*256} ; and the sprite buffers changes as the StartX, StartY values change
sta TileStore+TS_SPRITE_ADDR,x ; but don't depend on any sprite information. However, by setting the
; stal TileStore+TS_SPRITE_ADDR,x ; but don't depend on any sprite information. However, by setting the
; value only for the tiles that get added to the dirty tile list, we
; can avoid recalculating over 1,000 values whenever the screen scrolls
; (which is common) and just limit it to the number of tiles covered by
@ -344,10 +356,10 @@ _MarkDirtySprite
; moving and they are being dirtied, then we may do more work, but the
; odds are in our favor to just take care of it here.
lda TileStore+TS_SPRITE_FLAG,x
; lda TileStore+TS_SPRITE_FLAG,x
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX ; Needs X = tile store offset; destroys A,X. Returns X in A
@ -358,13 +370,16 @@ _MarkDirtySprite
adc TileStoreYTable+2,x
tax
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin
adc #{0*4}+{1*8*256}
sta TileStore+TS_SPRITE_ADDR,x
adc #{0*4}+{1*8*SPRITE_PLANE_SPAN}
sta (tmp0),y
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX
@ -375,13 +390,16 @@ _MarkDirtySprite
adc TileStoreYTable+4,x
tax
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin
adc #{0*4}+{2*8*256}
sta TileStore+TS_SPRITE_ADDR,x
adc #{0*4}+{2*8*SPRITE_PLANE_SPAN}
sta (tmp0),y
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX
@ -393,13 +411,16 @@ _MarkDirtySprite
adc TileStoreYTable,x
tax
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin
adc #{1*4}+{0*8*256}
sta TileStore+TS_SPRITE_ADDR,x
adc #{1*4}+{0*8*SPRITE_PLANE_SPAN}
sta (tmp0),y
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX
@ -411,13 +432,16 @@ _MarkDirtySprite
adc TileStoreYTable+2,x
tax
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin
adc #{1*4}+{1*8*256}
sta TileStore+TS_SPRITE_ADDR,x
adc #{1*4}+{1*8*SPRITE_PLANE_SPAN}
sta (tmp0),y
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX
@ -429,13 +453,16 @@ _MarkDirtySprite
adc TileStoreYTable+4,x
tax
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin
adc #{1*4}+{2*8*256}
sta TileStore+TS_SPRITE_ADDR,x
adc #{1*4}+{2*8*SPRITE_PLANE_SPAN}
sta (tmp0),y
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX
@ -447,13 +474,16 @@ _MarkDirtySprite
adc TileStoreYTable,x
tax
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin
adc #{2*4}+{0*8*256}
sta TileStore+TS_SPRITE_ADDR,x
adc #{2*4}+{0*8*SPRITE_PLANE_SPAN}
sta (tmp0),y
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX
@ -465,13 +495,16 @@ _MarkDirtySprite
adc TileStoreYTable+2,x
tax
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin
adc #{2*4}+{1*8*256}
sta TileStore+TS_SPRITE_ADDR,x
adc #{2*4}+{1*8*SPRITE_PLANE_SPAN}
sta (tmp0),y
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX
@ -483,13 +516,16 @@ _MarkDirtySprite
adc TileStoreYTable+4,x
tax
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x
sta tmp0
lda VBuffOrigin
adc #{2*4}+{2*8*256}
sta TileStore+TS_SPRITE_ADDR,x
adc #{2*4}+{2*8*SPRITE_PLANE_SPAN}
sta (tmp0),y
lda SpriteBit
ora TileStore+TS_SPRITE_FLAG,x
sta TileStore+TS_SPRITE_FLAG,x
oral TileStore+TS_SPRITE_FLAG,x
stal TileStore+TS_SPRITE_FLAG,x
jmp _PushDirtyTileX

389
src/SpriteRender.s Normal file

@ -0,0 +1,389 @@
; Function to render a sprite from a sprite definition into the internal data buffers
;
; X = sprite index
_DrawSpriteSheet
phx
lda _Sprites+VBUFF_ADDR,y
sta tmp1
lda _Sprites+TILE_DATA_OFFSET,y
sta tmp2
lda _Sprites+SPRITE_DISP,y
and #-{SPRITE_VFLIP+SPRITE_HFLIP} ; dispatch to all of the different orientations
sta tmp3
; Set bank
phb
pea #^tiledata ; Set the bank to the tile data
plb
ldx tmp3
ldy tmp2
lda tmp1
jsr _DrawSprite
lda tmp3
ora #SPRITE_VFLIP
tax
ldy tmp2
lda tmp1
clc
adc #4*3
jsr _DrawSprite
lda tmp3
ora #SPRITE_HFLIP
tax
ldy tmp2
lda tmp1
clc
adc #4*6
jsr _DrawSprite
lda tmp3
ora #SPRITE_VFLIP
tax
ldy tmp2
lda tmp1
clc
adc #4*9
jsr _DrawSprite
; Restore bank
plb ; pop extra byte
plb
plx
rts
;
; X = _Sprites array offset
_DrawSprite
; ldx _Sprites+SPRITE_DISP,y ; use bits 9, 10, 11, 12 and 13 to dispatch
jmp (draw_sprite,x)
draw_sprite dw draw_8x8,draw_8x8h,draw_8x8v,draw_8x8hv
dw draw_8x16,draw_8x16h,draw_8x16v,draw_8x16hv
dw draw_16x8,draw_16x8h,draw_16x8v,draw_16x8hv
dw draw_16x16,draw_16x16h,draw_16x16v,draw_16x16hv
dw :rtn,:rtn,:rtn,:rtn ; hidden bit is set
dw :rtn,:rtn,:rtn,:rtn
dw :rtn,:rtn,:rtn,:rtn
dw :rtn,:rtn,:rtn,:rtn
:rtn rts
draw_8x8
draw_8x8h
tax
jmp _DrawTile8x8
draw_8x8v
draw_8x8hv
tax
jmp _DrawTile8x8V
draw_8x16
draw_8x16h
tax
jsr _DrawTile8x8
clc
txa
adc #{8*SPRITE_PLANE_SPAN}
tax
tya
adc #{128*32} ; 32 tiles to the next vertical one, each tile is 128 bytes
tay
jmp _DrawTile8x8
draw_8x16v
draw_8x16hv
tax
jsr _DrawTile8x8V
clc
txa
adc #{8*SPRITE_PLANE_SPAN}
tax
tya
adc #{128*32}
tay
jmp _DrawTile8x8V
draw_16x8
tax
jsr _DrawTile8x8
clc
txa
adc #4
tax
tya
adc #128 ; Next tile is 128 bytes away
tay
jmp _DrawTile8x8
draw_16x8h
clc
tax
tya
pha
adc #128
tay
jsr _DrawTile8x8
txa
adc #4
tax
ply
jmp _DrawTile8x8
draw_16x8v
tax
jsr _DrawTile8x8V
clc
txa
adc #4
tax
tya
adc #128
tay
jmp _DrawTile8x8V
draw_16x8hv
clc
tax
tya
pha
adc #128
tay
jsr _DrawTile8x8V
txa
adc #4
tax
ply
jmp _DrawTile8x8V
draw_16x16
clc
tax
jsr _DrawTile8x8
txa
adc #4
tax
tya
adc #128
tay
jsr _DrawTile8x8
txa
adc #{8*SPRITE_PLANE_SPAN}-4
tax
tya
adc #{128*{32-1}}
tay
jsr _DrawTile8x8
txa
adc #4
tax
tya
adc #128
tay
jmp _DrawTile8x8
draw_16x16h
clc
tax
tya
pha
adc #128
tay
jsr _DrawTile8x8
txa
adc #4
tax
ply
jsr _DrawTile8x8
txa
adc #{8*SPRITE_PLANE_SPAN}-4
tax
tya
adc #{128*32}
pha
adc #128
tay
jsr _DrawTile8x8
txa
adc #4
tax
ply
jmp _DrawTile8x8
draw_16x16v
clc
tax
tya
pha ; store some copies
phx
pha
adc #{128*32}
tay
jsr _DrawTile8x8V
txa
adc #{8*SPRITE_PLANE_SPAN}
tax
ply
jsr _DrawTile8x8V
pla
adc #4
tax
lda 1,s
adc #{128*{32+1}}
tay
jsr _DrawTile8x8V
txa
adc #{8*SPRITE_PLANE_SPAN}
tax
pla
adc #128
tay
jmp _DrawTile8x8V
draw_16x16hv
clc
tax
tya
pha
adc #128+{128*32} ; Bottom-right source to top-left
tay
jsr _DrawTile8x8V
txa
adc #4
tax
lda 1,s
adc #{128*32}
tay
jsr _DrawTile8x8V
txa
adc #{8*SPRITE_PLANE_SPAN}-4
tax
lda 1,s
adc #128
tay
jsr _DrawTile8x8V
txa
adc #4
tax
ply
jmp _DrawTile8x8V
; X = sprite vbuff address
; Y = tile data pointer
_DrawTile8x8
_CopyTile8x8
]line equ 0
lup 8
lda: tiledata+32+{]line*4},y
stal spritemask+{]line*SPRITE_PLANE_SPAN},x
lda: tiledata+{]line*4},y
stal spritedata+{]line*SPRITE_PLANE_SPAN},x
lda: tiledata+32+{]line*4}+2,y
stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x
lda: tiledata+{]line*4}+2,y
stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x
]line equ ]line+1
--^
rts
_DrawTile8x8V
_CopyTile8x8V
]line equ 0
lup 8
lda: tiledata+32+{{7-]line}*4},y
stal spritemask+{]line*SPRITE_PLANE_SPAN},x
lda: tiledata+{{7-]line}*4},y
stal spritedata+{]line*SPRITE_PLANE_SPAN},x
lda: tiledata+32+{{7-]line}*4}+2,y
stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x
lda: tiledata+{{7-]line}*4}+2,y
stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x
]line equ ]line+1
--^
rts
; X = sprite vbuff address
; Y = tile data pointer
;_DrawTile8x8
; phb
; pea #^tiledata ; Set the bank to the tile data
; plb
;
;]line equ 0
; lup 8
; lda: tiledata+32+{]line*4},y
; andl spritemask+{]line*SPRITE_PLANE_SPAN},x
; stal spritemask+{]line*SPRITE_PLANE_SPAN},x
;
; ldal spritedata+{]line*SPRITE_PLANE_SPAN},x
; and: tiledata+32+{]line*4},y
; ora: tiledata+{]line*4},y
; stal spritedata+{]line*SPRITE_PLANE_SPAN},x
;
; lda: tiledata+32+{]line*4}+2,y
; andl spritemask+{]line*SPRITE_PLANE_SPAN}+2,x
; stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x
;
; ldal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x
; and: tiledata+32+{]line*4}+2,y
; ora: tiledata+{]line*4}+2,y
; stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x
;]line equ ]line+1
; --^
;
; plb ; pop extra byte
; plb
; rts
; X = sprite vbuff address
; Y = tile data pointer
;
; Draws the tile vertically flipped
;_DrawTile8x8V
; phb
; pea #^tiledata ; Set the bank to the tile data
; plb
;]line equ 0
; lup 8
; lda: tiledata+32+{{7-]line}*4},y
; andl spritemask+{]line*SPRITE_PLANE_SPAN},x
; stal spritemask+{]line*SPRITE_PLANE_SPAN},x
;
; ldal spritedata+{]line*SPRITE_PLANE_SPAN},x
; and: tiledata+32+{{7-]line}*4},y
; ora: tiledata+{{7-]line}*4},y
; stal spritedata+{]line*SPRITE_PLANE_SPAN},x
; lda: tiledata+32+{{7-]line}*4}+2,y
; andl spritemask+{]line*SPRITE_PLANE_SPAN}+2,x
; stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x
;
; ldal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x
; and: tiledata+32+{{7-]line}*4}+2,y
; ora: tiledata+{{7-]line}*4}+2,y
; stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x
;]line equ ]line+1
; --^
;
; plb ; pop extra byte
; plb
; rts


@ -137,7 +137,7 @@ SetScreenRect sty ScreenHeight ; Save the screen height and
ldx #0
ldy #0
:tsloop
sta TileStore+TS_SCREEN_ADDR,X
sta TileStore+TS_SCREEN_ADDR,x
clc
adc #4 ; Go to the next tile
@ -205,7 +205,7 @@ Counter equ tmp3
tax ; NOTE: Try to rework to use new TileStore2DLookup array
lda OnScreenAddr
sta TileStore+TS_SCREEN_ADDR,X
sta TileStore+TS_SCREEN_ADDR,x
clc
adc #4 ; Go to the next tile


@ -112,13 +112,17 @@ RenderTile ENT
rtl
_RenderTile2
pea >TileStore ; Need that addressing flexibility here. Callers responsible for restoring bank reg
plb
plb
lda TileStore+TS_TILE_ID,y ; build the finalized tile descriptor
ldx TileStore+TS_SPRITE_FLAG,y ; This is a bitfield of all the sprites that intersect this tile, only care if non-zero or not
beq :nosprite
ora #TILE_SPRITE_BIT
ldx TileStore+TS_SPRITE_ADDR,y
stx _SPR_X_REG
; ldx TileStore+TS_SPRITE_ADDR,y ; TODO: collapse sprites
; stx _SPR_X_REG
:nosprite
sta _TILE_ID ; Some tile blitters need to get the tile descriptor
@ -498,17 +502,34 @@ _CopyBG1Tile
;
; TileStore+TS_TILE_ID : Tile descriptor
; TileStore+TS_DIRTY : $FFFF is clean, otherwise stores a back-reference to the DirtyTiles array
; TileStore+TS_SPRITE_FLAG : Set to TILE_SPRITE_BIT if a sprite is present at this tile location
; TileStore+TS_SPRITE_ADDR ; Address of the tile in the sprite plane
; TileStore+TS_TILE_ADDR : Address of the tile in the tile data buffer
; TileStore+TS_CODE_ADDR_LOW : Low word of the address in the code field that receives the tile
; TileStore+TS_CODE_ADDR_HIGH : High word of the address in the code field that receives the tile
; TileStore+TS_WORD_OFFSET : Logical number of word for this location
; TileStore+TS_BASE_ADDR : Copy of BTableAddrLow
; TileStore+TS_SCREEN_ADDR : Address ont he physical screen corresponding to this tile (for direct rendering)
; TileStore+TS_SCREEN_ADDR : Address on the physical screen corresponding to this tile (for direct rendering)
; TileStore+TS_SPRITE_FLAG : A bit field of all sprites that intersect this tile
; TileStore+TS_SPRITE_ADDR_1 ; Address of the sprite data that aligns with this tile. These
; TileStore+TS_SPRITE_ADDR_2 ; values are 1:1 with the TS_SPRITE_FLAG bits and are not contiguous.
; TileStore+TS_SPRITE_ADDR_3 ; If the bit position in TS_SPRITE_FLAG is not set, then the value in
; TileStore+TS_SPRITE_ADDR_4 ; the TS_SPRITE_ADDR_* field is undefined.
; TileStore+TS_SPRITE_ADDR_5
; TileStore+TS_SPRITE_ADDR_6
; TileStore+TS_SPRITE_ADDR_7
; TileStore+TS_SPRITE_ADDR_8
; TileStore+TS_SPRITE_ADDR_9
; TileStore+TS_SPRITE_ADDR_10
; TileStore+TS_SPRITE_ADDR_11
; TileStore+TS_SPRITE_ADDR_12
; TileStore+TS_SPRITE_ADDR_13
; TileStore+TS_SPRITE_ADDR_14
; TileStore+TS_SPRITE_ADDR_15
; TileStore+TS_SPRITE_ADDR_16
TileStore ENT
ds TILE_STORE_SIZE*11
; TileStore+
;TileStore ENT
; ds TILE_STORE_SIZE*11
; A list of dirty tiles that need to be updated in a given frame
DirtyTileCount ds 2
@ -519,7 +540,7 @@ DirtyTiles ds TILE_STORE_SIZE ; At most this many tiles can possibly
InitTiles
:col equ tmp0
:row equ tmp1
:vbuff equ tmp2
; Fill in the TileStoreYTable. This is just a table of offsets into the Tile Store for each row. There
; are 26 rows with a stride of 41
ldy #0
@ -570,18 +591,27 @@ InitTiles
sta :row
lda #40
sta :col
lda #$8000
sta :vbuff
:loop
; The first set of values in the Tile Store are changed during each frame based on the actions
; that are happening
stz TileStore+TS_TILE_ID,x ; clear the tile store with the special zero tile
stz TileStore+TS_TILE_ADDR,x
lda #0
stal TileStore+TS_TILE_ID,x ; clear the tile store with the special zero tile
stal TileStore+TS_TILE_ADDR,x
stz TileStore+TS_SPRITE_FLAG,x ; no sprites are set at the beginning
stal TileStore+TS_SPRITE_FLAG,x ; no sprites are set at the beginning
lda #$FFFF ; none of the tiles are dirty
sta TileStore+TS_DIRTY,x
stal TileStore+TS_DIRTY,x
lda :vbuff ; array of sprite vbuff addresses per tile
stal TileStore+TS_VBUFF_ARRAY_ADDR,x
clc
adc #32
sta :vbuff
; The next set of values are constants that are simply used as cached parameters to avoid needing to
; calculate any of these values during tile rendering
@ -590,20 +620,20 @@ InitTiles
asl ; exists in the code fields
tay
lda BRowTableHigh,y
sta TileStore+TS_CODE_ADDR_HIGH,x ; High word of the tile address (just the bank)
stal TileStore+TS_CODE_ADDR_HIGH,x ; High word of the tile address (just the bank)
lda BRowTableLow,y
sta TileStore+TS_BASE_ADDR,x ; May not be needed later if we can figure out the right constant...
stal TileStore+TS_BASE_ADDR,x ; May not be needed later if we can figure out the right constant...
lda :col ; Set the offset values based on the column
asl ; of this tile
asl
sta TileStore+TS_WORD_OFFSET,x ; This is the offset from 0 to 82, used in LDA (dp),y instruction
stal TileStore+TS_WORD_OFFSET,x ; This is the offset from 0 to 82, used in LDA (dp),y instruction
tay
lda Col2CodeOffset+2,y
clc
adc TileStore+TS_BASE_ADDR,x
sta TileStore+TS_CODE_ADDR_LOW,x ; Low word of the tile address in the code field
adcl TileStore+TS_BASE_ADDR,x
stal TileStore+TS_CODE_ADDR_LOW,x ; Low word of the tile address in the code field
dec :col
bpl :hop
@ -719,7 +749,7 @@ _PushDirtyTileX
inx
stx DirtyTileCount
; Same speed, but preserved the Z register
; Same speed, but preserved the X register
; sta (DirtyTiles) ; 6
; lda DirtyTiles ; 4
; inc ; 2