From 95058fb96918d52ac9ca36c883b1c040c5e28420 Mon Sep 17 00:00:00 2001
From: Lucas Scharenbroich
Date: Fri, 18 Feb 2022 12:12:32 -0600
Subject: [PATCH] Checkpoint for sprite rewrite

All of the sprite rendering has been deferred down to the level of the tile drawing. Sprites are no longer drawn/erased, but instead a sprite sheet is generated in AddSprite and referenced by the renderer.

Because there is no longer a single off-screen buffer that holds a copy of all the rendered sprites, the TileStore size must be expanded to hold a reference to the sprite data address for each tile. This increase in data structure size requires the TileStore to be put into its own bank, with appropriate code restructuring.

The benefits of the rewrite are significant:

1. Sprites are never drawn/erased off-screen. They are only ever drawn directly to the screen or play field.
2. The concept of "damaged" sprites is gone. Every dirty tile automatically renders just the portion of a sprite that it intersects.

These two properties result in a substantial increase in throughput.
--- macros/CORE.MACS.S | 9 - src/Core.s | 76 +++- src/Defs.s | 23 +- src/GTE.s | 2 +- src/README.md | 62 ++- src/Render.s | 151 +++++++- src/Sprite.s | 850 +++++++---------------------------------- src/Sprite2.s | 174 +++++---- src/SpriteRender.s | 389 +++++++++++++++++++ src/blitter/Template.s | 4 +- src/blitter/Tiles.s | 66 +++- 11 files changed, 959 insertions(+), 847 deletions(-) create mode 100644 src/SpriteRender.s diff --git a/macros/CORE.MACS.S b/macros/CORE.MACS.S index 38b55d3..7aced11 100644 --- a/macros/CORE.MACS.S +++ b/macros/CORE.MACS.S @@ -177,15 +177,6 @@ _TileStoreOffsetX mac adc TileStoreYTable,x <<< -_; Macro variant to calculate inline from any source -_SpriteVBuffAddr mac - lda ]2 - clc - adc #NUM_BUFF_LINES - xba - adc ]1 - <<< - ; Macro to define script steps ScriptStep MAC IF #=]5 diff --git a/src/Core.s b/src/Core.s index 175265f..306f7ac 100644 --- a/src/Core.s +++ b/src/Core.s @@ -12,14 +12,76 @@ NO_INTERRUPTS equ 1 ; turn off for crossrunner debugging NO_MUSIC equ 1 ; turn music + tool loading off -; External data provided by the main program segment +; External data space provided by the main program segment tiledata EXT +TileStore EXT ; Sprite plane data and mask banks are provided as an exteral segment +; +; The sprite data holds a set of pre-rendered sprites that are optimized to support the rendering pipeline. There +; are four copies of each sprite, along with the corresponding mask laid out into 4x4 tile regions where the +; empty row and column is shared between adjacent blocks. +; +; Logically, the memory is laid out as 4 columns of sprites and 4 rows. +; +; +---+---+---+---+---+---+---+---+---+---+---+---+-... +; | | | | | | | | | | | | | ... +; +---+---+---+---+---+---+---+---+---+---+---+---+-... +; | | 0 | 0 | | 1 | 1 | | 2 | 2 | | 3 | 3 | ... +; +---+---+---+---+---+---+---+---+---+---+---+---+-... +; | | 0 | 0 | | 1 | 1 | | 2 | 2 | | 3 | 3 | ... +; +---+---+---+---+---+---+---+---+---+---+---+---+-... 
+; | | | | | | | | | | | | | ... +; +---+---+---+---+---+---+---+---+---+---+---+---+-... +; | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | ... +; +---+---+---+---+---+---+---+---+---+---+---+---+-... +; | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | ... +; +---+---+---+---+---+---+---+---+---+---+---+---+-... +; | | | | | | | | | | | | | ... +; +---+---+---+---+---+---+---+---+---+---+---+---+-... +; +; For each sprite, when it needs to be copied into an on-screen tile, it could exist at any offset compared to its +; natural alignment. By having a buffer around the sprite data, an address pointer can be set to a different origin +; and a simple 8x8 block copy can cut out the appropriate bit of the sprite. For example, here is a zoomed-in look +; at a sprite with an offset, O, at (-2,-3). As shown, by selecting an appropriate origin, just the top corner +; of the sprite data will be copied. +; +; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+.. +; | | || | || | | | || | | | | +; +---+-- O----------------+ --+---++---+---+---+---++---+---+---+---+.. +; | | | | | || | | | || | | | | +; +---+-- | | --+---++---+---+---+---++---+---+---+---+.. +; | | | | | || | | | || | | | | +; +---+-- | | --+---++---+---+---+---++---+---+---+---+.. +; | | | | | || | | | || | | | | +; +===+== | ++===+== | ==+===++===+===+===+===++===+===+===+===+.. +; | | | || | S | S | S || S | S | S | || | | | | +; +---+-- +----------------+ --+---++---+---+---+---++---+---+---+---+.. +; | | || S | S S | S || S | S | S | S || | | | | +; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+.. +; | | | | || S | S | S | S || S | S | S | S || | | | | +; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+.. +; | | | | || S | S | S | S || S | S | S | S || | | | | +; +===+===+===+===++===+===+===+===++===+===+===+===++===+===+===+===+.. +; | | | | || S | S | S | S || S | S | S | S || | | | | +; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+.. 
+; | | | | || S | S | S | S || S | S | S | S || | | | | +; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+.. +; | | | | || S | S | S | S || S | S | S | S || | | | | +; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+.. +; | | | | || | S | S | S || S | S | S | || | | | | +; +---+---+---+---++---+---+---+---++---+---+---+---++---+---+---+---+.. +; . . . . . . . . . . . . . . . . . +; +; Each sprite will take up, effectively 9 tiles of storage space per +; instance (plus edges) and there are 4 instances for the H/V bits +; and 4 more for the masks. This results in a need for 43,264 bytes +; for all 16 sprites. + spritedata EXT spritemask EXT -; IF there are overlays, they are provided as an external +; If there are overlays, they are provided as an external Overlay EXT ; Core engine functionality. The idea is that that source file can be PUT into @@ -285,12 +347,13 @@ EngineReset ; 7. Ancient Land of Y's : 36 x 16 288 x 128 (18,432 bytes ( 57.6%)) ; 8. Game Boy Color : 20 x 18 160 x 144 (11,520 bytes ( 36.0%)) ; 9. Agony (Amiga) : 36 x 24 288 x 192 (27,648 bytes ( 86.4%)) +; 10. 
Atari Lynx : 20 x 13 160 x 102 (8,160 bytes ( 25.5%)) ; ; X = mode number OR width in pixels (must be multiple of 2) ; Y = height in pixels (if X > 8) -ScreenModeWidth dw 320,272,256,256,280,256,240,288,160,320 -ScreenModeHeight dw 200,192,200,176,160,160,160,128,144,1 +ScreenModeWidth dw 320,272,256,256,280,256,240,288,160,288,160,320 +ScreenModeHeight dw 200,192,200,176,160,160,160,128,144,192,102,1 SetScreenMode ENT phb @@ -301,8 +364,8 @@ SetScreenMode ENT rtl _SetScreenMode - cpx #9 - bcs :direct ; if x > 8, then assume X and Y are the dimensions + cpx #11 + bcs :direct ; if x > 10, then assume X and Y are the dimensions txa asl @@ -415,6 +478,7 @@ _ReadControl put Graphics.s put Sprite.s put Sprite2.s + put SpriteRender.s put Render.s put Timer.s put Script.s diff --git a/src/Defs.s b/src/Defs.s index 2353ac7..46fe577 100644 --- a/src/Defs.s +++ b/src/Defs.s @@ -81,9 +81,10 @@ BG1TileMapPtr equ 86 SCBArrayPtr equ 90 ; Used for palette binding SpriteBanks equ 94 ; Bank bytes for the sprite data and sprite mask LastRender equ 96 ; Record which reder function was last executed -DamagedSprites equ 98 +; DamagedSprites equ 98 SpriteMap equ 100 ; Bitmap of open sprite slots. -Next equ 102 +ActiveSpriteCount equ 102 +Next equ 104 BankLoad equ 128 @@ -154,13 +155,13 @@ SPRITE_HFLIP equ $0200 MAX_TILES equ {26*41} ; Number of tiles in the code field (41 columns * 26 rows) TILE_STORE_SIZE equ {MAX_TILES*2} ; The tile store contains a tile descriptor in each slot -TS_TILE_ID equ TILE_STORE_SIZE*0 -TS_DIRTY equ TILE_STORE_SIZE*1 -TS_SPRITE_FLAG equ TILE_STORE_SIZE*2 -TS_TILE_ADDR equ TILE_STORE_SIZE*3 ; const value -TS_CODE_ADDR_LOW equ TILE_STORE_SIZE*4 ; const value +TS_TILE_ID equ TILE_STORE_SIZE*0 ; tile descriptor for this location +TS_DIRTY equ TILE_STORE_SIZE*1 ; Flag. Used to prevent a tile from being queued multiple times per frame +TS_SPRITE_FLAG equ TILE_STORE_SIZE*2 ; Bitfield of all sprites that intersect this tile. 0 if no sprites. 
+TS_TILE_ADDR equ TILE_STORE_SIZE*3 ; cached value, the address of the tiledata for this tile +TS_CODE_ADDR_LOW equ TILE_STORE_SIZE*4 ; const value, address of this tile in the code fields TS_CODE_ADDR_HIGH equ TILE_STORE_SIZE*5 ; const value -TS_WORD_OFFSET equ TILE_STORE_SIZE*6 -TS_BASE_ADDR equ TILE_STORE_SIZE*7 -TS_SPRITE_ADDR equ TILE_STORE_SIZE*8 -TS_SCREEN_ADDR equ TILE_STORE_SIZE*9 +TS_WORD_OFFSET equ TILE_STORE_SIZE*6 ; const value, word offset value for this tile if LDA (dp),y instructions are used +TS_BASE_ADDR equ TILE_STORE_SIZE*7 ; const value, because there are two rows of tiles per bank, this is set to $0000 or $8000. +TS_SCREEN_ADDR equ TILE_STORE_SIZE*8 ; cached value of on-screen location of tile. Used for DirtyRender. +TS_VBUFF_ARRAY_ADDR equ TILE_STORE_SIZE*9 ; const value to an aligned 32-byte array starting at $8000 in TileStore bank diff --git a/src/GTE.s b/src/GTE.s index da9627a..f0749f8 100644 --- a/src/GTE.s +++ b/src/GTE.s @@ -63,7 +63,7 @@ TileStore EXT ; Tile store internal data structure RenderDirty EXT ; Render only dirty tiles + sprites directly to the SHR screen -GetSpriteVBuffAddr EXT ; X = x-coordinate (0 - 159), Y = y-coordinate (0 - 199). Return in Acc. +; GetSpriteVBuffAddr EXT ; X = x-coordinate (0 - 159), Y = y-coordinate (0 - 199). Return in Acc. ; Allocate a full 64K bank AllocBank EXT diff --git a/src/README.md b/src/README.md index fc2c161..6ec0573 100644 --- a/src/README.md +++ b/src/README.md @@ -32,4 +32,64 @@ The engine run through the following render loop on every frame * The dirty tile list has a fast test to see if a tile has already been marked as dirty it is not added twice * The tile renderer is where data from the sprite plane is combined with tile data to show the sprites on-screen. -* Typically, there will not be Overlays defined and the last step of the renderer is just a single render of all playfield lines at once. 
\ No newline at end of file +* Typically, there will not be Overlays defined and the last step of the renderer is just a single render of all playfield lines at once. + += Sprite Redesign = + +In the rendering function, for a given TileStore location, we need to be able to read an array of VBUFF addresses +for sprite data. This can be done by processing the SPRITE_BIT array into a list to get a set of offsets. These +VBUFF addresses also need to be set. Currently we are calculating the addresses in the sprite functions, but the +issue is that we need to find an addressing scheme that's essentially 2D because we have >TileStore+VARNAME,x and +Sprite+VARNAME,y, but we need something like >TileStore+VARNAME[x][y] + +In a perfect scenario, we can use the following code sequence to render stacked sprites + + lda tiledata,y ; tile addressed (bank register set) + ldx activeSprite+4 ; sprite VBUFF address cached on direct page + andl >spritemask,x + oral >spritedata,x + ldx activeSprite+2 + andl >spritemask,x + oral >spritedata,x + ldx activeSprite + andl >spritemask,x + oral >spritedata,x + sta tmp0 + + +; Core phases + +; Convert bit field to compact array of sprite indexes + lda TileStore+VBUFF_ARR_PTR,x + sta cache + lda TileStore+SPRITE_BITS,x + bit #$0008 + bne ... + + lda cache ; This is 11 cycles. A PEA + TSB is 13, so a bit faster to do it at once + ora #$0006 + pha + +; When storing for a sprite, the corner VBUFF is calculated and stored at + + base = TileStore+VBUFF_ARR_ADDR,x + SPRITE_ID + + sta base + sta base+32 ; next column (mod columns) + sta base+(32*width) ; next row + sta base+(32*width+32) ; next corner + +Possibilities + +1. Have >TileStore+SPRITE_VBUFF be the address to an array and then manually add the y-register value so we can still use + absolute addressing + + tya + adc >TileStore+SPRITE_VBUFF_ARR_ADDR,x ; Points to addresses 32 bytes apart and Y-reg is [0, 30] + tax + lda >TileStore,x ; Load the address + tay + + lda 0000,y + lda 0002,y + ... 
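The README notes above describe three steps: expanding a sprite bit field into a compact list of offsets, using those offsets to read per-tile VBUFF addresses (the "2D" lookup), and flattening the mask and data words onto the tile. The following Python model is a minimal sketch of that flow, not the engine's 65816 code. The function names and the dictionary standing in for the TileStore bank are hypothetical. It assumes that bit n of the bit field maps to byte offset 2n, that at most four overlapping sprites are kept, and that a mask word is $FFFF where the sprite is transparent.

```python
# Illustrative model only; names and data layout are assumptions, not engine code.

MAX_OVERLAP = 4  # assumed cap: at most four sprites are kept per tile


def active_sprite_offsets(bitfield):
    """Expand a 16-bit sprite bit field into byte offsets (2 * bit index).

    Offsets come back highest bit first, mirroring values popped off a
    stack, and are capped at MAX_OVERLAP entries.
    """
    offsets = [bit * 2 for bit in range(16) if bitfield & (1 << bit)]
    offsets.reverse()
    return offsets[:MAX_OVERLAP]


def vbuff_addresses(tile_store, array_addr, offsets):
    """Emulate a [tile][sprite] lookup.

    Each tile owns a 32-byte array of 16-bit VBUFF addresses that starts at
    array_addr; the sprite's byte offset selects the entry.
    """
    return [tile_store[array_addr + off] for off in offsets]


def composite_word(tile_word, layers):
    """Flatten (mask, data) sprite words onto a tile word, back to front."""
    word = tile_word
    for mask, data in layers:
        word = (word & mask) | data
    return word & 0xFFFF


if __name__ == "__main__":
    # A tile whose per-sprite array starts at $8000; sprites 0 and 3 overlap it.
    tile_store = {0x8000 + 0: 0x0D04, 0x8000 + 6: 0x1A08}
    offsets = active_sprite_offsets(0b1001)
    print(offsets)
    print([hex(a) for a in vbuff_addresses(tile_store, 0x8000, offsets)])
    print(hex(composite_word(0x1234, [(0xFF00, 0x00AB)])))
```

Adding the sprite offset to a per-tile base address is what turns two one-dimensional indexed reads into the two-dimensional lookup the notes ask for. The four-entry cap is a design trade-off: it bounds the work per tile at the cost of dropping any further overlapping sprites.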
diff --git a/src/Render.s b/src/Render.s index 6580020..49bb692 100644 --- a/src/Render.s +++ b/src/Render.s @@ -190,29 +190,37 @@ _ApplyDirtyTiles ; Only render solid tiles and sprites _RenderDirtyTile + pea >TileStore ; Need that addressing flexibility here. Callers responsible for restoring bank reg + plb + plb + + lda TileStore+TS_SPRITE_FLAG,y ; This is a bitfield of all the sprites that intersect this tile, only care if non-zero or not + beq :nosprite + + jsr BuildActiveSpriteArray ; Build the sprite index list from the bit field + +; ldx TileStore+TS_SPRITE_ADDR,y +; stx _SPR_X_REG + lda TileStore+TS_TILE_ID,y ; build the finalized tile descriptor and #TILE_VFLIP_BIT+TILE_HFLIP_BIT ; get the lookup value xba - - ldx TileStore+TS_SPRITE_FLAG,y ; This is a bitfield of all the sprites that intersect this tile, only care if non-zero or not - beq :nosprite - - ldx TileStore+TS_SPRITE_ADDR,y - stx _SPR_X_REG - tax - lda DirtyTileSpriteProcs,x + ldal DirtyTileSpriteProcs,x sta :tiledisp+1 bra :sprite :nosprite + lda TileStore+TS_TILE_ID,y ; build the finalized tile descriptor + and #TILE_VFLIP_BIT+TILE_HFLIP_BIT ; get the lookup value + xba tax - lda DirtyTileProcs,x ; load and patch in the appropriate subroutine + ldal DirtyTileProcs,x ; load and patch in the appropriate subroutine sta :tiledisp+1 :sprite ldx TileStore+TS_TILE_ADDR,y ; load the address of this tile's data (pre-calculated) - lda TileStore+TS_SCREEN_ADDR,y ; Get the on-screen address of this tile + lda TileStore+TS_SCREEN_ADDR,y ; Get the on-screen address of this tile pha lda TileStore+TS_WORD_OFFSET,y @@ -284,4 +292,125 @@ _TBApplyDirtySpriteData sta: $0002+{]line*160},y ]line equ ]line+1 --^ - rts \ No newline at end of file + rts + +; Input: A = bit field, assumed non-zero +; Output: A = number of bits set +; Side Effect: Fill in the ActiveSprite list with sprite indices. +; +; We try very hard to be fast and clever here. 
Early out, keeping everything in +; registers when possible, and reducing overhead. + +spriteIdx equ tmp12 +BuildActiveSpriteArray + +; Push a sentinel value on the stack so we know where to end later. We can't count during the +; initial process, because the Z flag needs to be maintained and almost every opcode affects it. + +; cmp lastActiveValue ; Assume that there is a decent chance of having the same +; beq early_out ; sprite bitfield in consecutive dirty tiles. Saves a lot. + + tsx ; save the stack pointer + pea $FFFF ; sentinel value + +; This first loop scans the bits in the accumulator and pushes a sprite index onto the stack. We +; could push any constant, which gives us some flexibility. This only works because the PEA +; instruction does not affect any register. We also check to see if the accumulator is zero as +; an early-out test, but only do that every 4 bits in order to amortize the overhead a bit. + +]step equ 0 + lup 4 + ror + bcc :skip_1 + pea ]step +:skip_1 ror + bcc :skip_2 + pea ]step+2 +:skip_2 ror + bcc :skip_3 + pea ]step+4 +:skip_3 ror + bcc :skip_4 + pea ]step+6 +:skip_4 beq :end_1 +]step equ ]step+8 + --^ +:end_1 + +; This second loop pops values off of the stack and places them into a linear array. We also +; set the count on exit. As an optimization / restriction, we only allow up to four overlapping +; sprites. This is similar to the NES/C64 "8 sprites per line" restriction. + + pla ; Can always assume at least one bit was set... + sta spriteIdx + + pla + bmi :out_1 + sta spriteIdx+2 + + pla + bmi :out_2 + sta spriteIdx+4 + + pla + bmi :out_3 + sta spriteIdx+6 + +; Reset the stack pointer if we did not pop everything off yet + txs + +; These are the exit points which know exactly how many items (x2) have been processed +:out_4 lda #8 + rts +:out_0 lda #0 + rts +:out_1 lda #2 + rts +:out_2 lda #4 + rts +:out_3 lda #6 + rts + +; Run through all of the active sprites and put them on-screen. 
We have three different heuristics depending on +; how many active sprites there are intersecting this tile. + +; Version 2. No sprite plane, instead each sprite has a set of pre-rendered panels and we render from +; those panels in tile-sized blocks. +; +; If there is only one sprite + tile background, then we can render directly to the screen +; +; ldal tiledata+0,x +; and sprite+MASK_OFFSET,y +; ora sprite,y +; sta 00 +; ... +; sta 02 +; ... +; sta A0 +; ... +; sta A2 +; tdc +; adc #320 +; tcd +; +; Since this is a common case, it is reasonable to do so. Otherwise, we must explode the TS_SPRITE_FLAG to +; get a list of sprite origin addresses and then flatten against the tile +; +; ldal tiledata+0,x +; ldx spriteCount +; jmp (disp,x) +; ... +; ldy list+2 +; and sprite+MASK_OFFSET,y +; ora sprite,y +; ldy list +; and sprite+MASK_OFFSET,y +; ora sprite,y +; sta 00 + +; sta 02 +; sta A0 +; sta A2 +; tdc +; adc #320 +; tcd \ No newline at end of file diff --git a/src/Sprite.s b/src/Sprite.s index 9541670..f8425f6 100644 --- a/src/Sprite.s +++ b/src/Sprite.s @@ -44,7 +44,7 @@ ; set dirty tiles, identify DAMAGED sprites, and THEN perform the drawing. It is not possible to ; just do each sprite one at a time. ; -; Initialize the sprite plane data and mask banks (all data = $0000, all masks = $FFFF) +; Initialize the sprite data and mask banks (all data = $0000, all masks = $FFFF) InitSprites ldx #$FFFE lda #0 @@ -70,6 +70,24 @@ InitSprites dex bpl :loop3 +; Initialize the VBUFF address offsets in the data and mask banks for each sprite +; +; The internal grid is 13 tiles wide where each sprite has a 2x2 interior square with a +; tile-size buffer all around. 
We pre-render each sprite with all four vert/horz flips +VBUFF_STRIDE_BYTES equ 13*4 +VBUFF_TILE_ROW_BYTES equ 8*VBUFF_STRIDE_BYTES +VBUFF_SPRITE_STEP equ VBUFF_TILE_ROW_BYTES*3 +VBUFF_SPRITE_START equ {8*VBUFF_TILE_ROW_BYTES}+4 + + ldx #{MAX_SPRITES-1}*2 + lda #VBUFF_SPRITE_START + clc +:loop4 sta _Sprites+VBUFF_ADDR,x + adc #VBUFF_SPRITE_STEP + dex + dex + bpl :loop4 + ; Precalculate some bank values jsr _CacheSpriteBanks rts @@ -83,81 +101,73 @@ _ClearSpriteFromTileStore ldx _Sprites+TILE_STORE_ADDR_1,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x ; Clear the bit in the bit field. This seems wasteful, but + ldal TileStore+TS_SPRITE_FLAG,x ; Clear the bit in the bit field. This seems wasteful, but and _SpriteBitsNot,y ; there is no indexed form of TSB/TRB and caching the value in - tsb DamagedSprites ; Mark which other sprites are impacted by this one - sta TileStore+TS_SPRITE_FLAG,x ; a direct page location, only saves 1 or 2 cycles per and costs 10. + stal TileStore+TS_SPRITE_FLAG,x ; a direct page location, only saves 1 or 2 cycles per and costs 10. 
jsr _PushDirtyTileX ldx _Sprites+TILE_STORE_ADDR_2,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x + ldal TileStore+TS_SPRITE_FLAG,x and _SpriteBitsNot,y - tsb DamagedSprites - sta TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jsr _PushDirtyTileX ldx _Sprites+TILE_STORE_ADDR_3,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x + ldal TileStore+TS_SPRITE_FLAG,x and _SpriteBitsNot,y - tsb DamagedSprites - sta TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jsr _PushDirtyTileX ldx _Sprites+TILE_STORE_ADDR_4,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x + ldal TileStore+TS_SPRITE_FLAG,x and _SpriteBitsNot,y - tsb DamagedSprites - sta TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jsr _PushDirtyTileX ldx _Sprites+TILE_STORE_ADDR_5,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x + ldal TileStore+TS_SPRITE_FLAG,x and _SpriteBitsNot,y - sta TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jsr _PushDirtyTileX ldx _Sprites+TILE_STORE_ADDR_6,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x + ldal TileStore+TS_SPRITE_FLAG,x and _SpriteBitsNot,y - tsb DamagedSprites - sta TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jsr _PushDirtyTileX ldx _Sprites+TILE_STORE_ADDR_7,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x + ldal TileStore+TS_SPRITE_FLAG,x and _SpriteBitsNot,y - tsb DamagedSprites - sta TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jsr _PushDirtyTileX ldx _Sprites+TILE_STORE_ADDR_8,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x + ldal TileStore+TS_SPRITE_FLAG,x and _SpriteBitsNot,y - tsb DamagedSprites - sta TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jsr _PushDirtyTileX ldx _Sprites+TILE_STORE_ADDR_9,y bne *+3 rts - lda TileStore+TS_SPRITE_FLAG,x + ldal TileStore+TS_SPRITE_FLAG,x and _SpriteBitsNot,y - tsb DamagedSprites - sta TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX ; This function looks at the sprite list and renders the sprite plane 
data into the appropriate @@ -181,6 +191,8 @@ _ClearSpriteFromTileStore ; Tile Store locations are marked as dirty. It is important to recognize that the sprites themselves ; can be marked dirty, and the underlying tiles in the tile store are independently marked dirty. +activeSpriteList equ blttmp + phase1 dw :phase1_0 dw :phase1_1,:phase1_2,:phase1_3,:phase1_4 dw :phase1_5,:phase1_6,:phase1_7,:phase1_8 @@ -242,7 +254,6 @@ phase1 dw :phase1_0 ; all of the tile store locations that it occupied on the previous frame and add those ; tile store locations to the dirty tile list. _DoPhase1 - lda _Sprites+SPRITE_STATUS,y ora forceSpriteFlag bit #SPRITE_STATUS_MOVED+SPRITE_STATUS_REMOVED @@ -250,15 +261,6 @@ _DoPhase1 jsr _ClearSpriteFromTileStore :no_clear -; If this sprite has been MOVED, UPDATED or REMOVED, then it needs to be erased from the -; sprite plane buffer - - lda _Sprites+SPRITE_STATUS,y - bit #SPRITE_STATUS_MOVED+SPRITE_STATUS_UPDATED+SPRITE_STATUS_REMOVED - beq :no_erase - jsr _EraseSpriteY -:no_erase - ; Check to see if sprite was REMOVED If so, then this is where we return its Sprite ID to the ; list of open slots @@ -347,66 +349,65 @@ _DoPhase2 and #SPRITE_STATUS_ADDED+SPRITE_STATUS_MOVED+SPRITE_STATUS_UPDATED beq :out -; This is the complicated part; we need to draw the sprite into the sprite plane, but then -; calculate the tiles that overlap with the sprite potentially and mark those as dirty _AND_ -; store the appropriate sprite plane address from which those tiles need to copy. -; ; Mark the appropriate tiles as dirty and as occupied by a sprite so that the ApplyTiles -; subroutine will get the drawn data from the sprite plane into the code field where it -; can be drawn to the screen +; subroutine will combine the sprite data with the tile data into the code field where it +; can be drawn to the screen. 
This routine is also responsible for setting the specific +; VBUFF address for each sprite's tile sheet position - jsr _MarkDirtySprite - -; Draw the sprite into the sprite plane buffer(s) - - lda _Sprites+SPRITE_DISP2,y ; use bits 9, 10, 11, 12, and 13 to dispatch - jmp (draw_sprite,x) + jmp _MarkDirtySprite :out rts -; Optimization: Could use 8-bit registers to save +; Use the blttmp space to build the active sprite list. Since the sprite tiles are not drawn until later, +; it's OK to use that scratch space here. And it's just the right size, 32 bytes RebuildSpriteArray - ldx #0 ; Number of non-empty sprite locations lda SpriteMap ; Get the bit field - tay ; Cache to restore - bit #$0001 ; For each bit position, test and store a value - beq :chk1 - stz activeSpriteList ; Shortcut for the first one - ldx #2 +; Unrolled loop to get the sprite index values that correspond to the set bit positions -; A super-optimization here would be to put the activeSpriteList on the direct page (32 bytes) and then -; use PEA instructions to push the slot values. Calculate the count at the end based on the final stack -; address. Only 160 cycles to build the list. -:chk1 -]flag equ $0002 -]slot equ $0002 - lup 15 - bit #]flag - beq :chk2 - lda #]slot - sta activeSpriteList,x - tya - inx - inx -:chk2 -]flag equ ]flag*2 -]slot equ ]slot+2 + pea $FFFF ; end-of-list marker +]step equ 0 + lup 4 + ror + bcc :skip_1 + pea ]step +:skip_1 ror + bcc :skip_2 + pea ]step+2 +:skip_2 ror + bcc :skip_3 + pea ]step+4 +:skip_3 ror + bcc :skip_4 + pea ]step+6 +:skip_4 beq :end_1 +]step equ ]step+8 --^ +:end_1 - stx activeSpriteCount +; Now pop the values off of the stack until reaching the sentinel value. This could be unrolled, but +; it is only done once per frame. 
+ + ldx #0 +:loop + pla + bmi :out + sta blttmp,x + inx + inx + bra :loop +:out + stx ActiveSpriteCount rts forceSpriteFlag ds 2 _RenderSprites - stz DamagedSprites ; clear the potential set of damaged sprites - ; Check to see if any sprites have been added or removed. If so, then we regenerate the active ; sprite list. Since adding and removing sprites is rare, this is a worthwhile tradeoff, because -; there are several places where we want to interative over the all of the sprites, and having a list -; and not have to contantly load and test the SPRITE_STATUS just to skip unused slots can help streamline -; the code. +; there are several places where we want to iterate over all of the sprites, and having a list +; and not having to constantly load and test the SPRITE_STATUS just to skip unused slots can help +; streamline the code. lda #DIRTY_BIT_SPRITE_ARRAY trb DirtyBits ; clears the flag, if it was set @@ -440,16 +441,16 @@ _RenderSprites ; how many sprite to process and they are in a contiguous array. So we on't have to keep track ; of an iterating variable - ldx activeSpriteCount + ldx ActiveSpriteCount jmp (phase1,x) phase1_rtn ; Dispatch to the second phase of rendering the sprites. 
- ldx activeSpriteCount + ldx ActiveSpriteCount jmp (phase2,x) phase2_rtn -; Speite rendering complete +; Sprite rendering complete rts ; _GetTileAt @@ -495,464 +496,7 @@ _GetTileAt clc rts -; Y = _Sprites array offset -_EraseSpriteY - lda _Sprites+OLD_VBUFF_ADDR,y - beq :noerase - ldx _Sprites+SPRITE_DISP,y ; get the dispatch index for this sprite - jmp (:do_erase,x) -:noerase rts -:do_erase dw _EraseTileSprite8x8,_EraseTileSprite8x16 - dw _EraseTileSprite16x8,_EraseTileSprite16x16 - - -; X = _Sprites array offset -_DrawSpriteYA - lda _Sprites+SPRITE_DISP2,y ; use bits 9, 10, 11 and 12,13 to dispatch - jmp (draw_sprite,x) - -draw_sprite dw draw_8x8,draw_8x8h,draw_8x8v,draw_8x8hv - dw draw_8x16,draw_8x16h,draw_8x16v,draw_8x16hv - dw draw_16x8,draw_16x8h,draw_16x8v,draw_16x8hv - dw draw_16x16,draw_16x16h,draw_16x16v,draw_16x16hv - - dw :rtn,:rtn,:rtn,:rtn ; hidden bit is set - dw :rtn,:rtn,:rtn,:rtn - dw :rtn,:rtn,:rtn,:rtn - dw :rtn,:rtn,:rtn,:rtn -:rtn rts - -draw_8x8 -draw_8x8h - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - tay - jmp _DrawTile8x8 - -draw_8x8v -draw_8x8hv - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - tay - jmp _DrawTile8x8V - -draw_8x16 -draw_8x16h - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - tay - jsr _DrawTile8x8 - clc - txa - adc #{8*SPRITE_PLANE_SPAN} - tax - tya - adc #{128*32} ; 32 tiles to the next vertical one, each tile is 128 bytes - tay - jmp _DrawTile8x8 - -draw_8x16v -draw_8x16hv - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - tay - jsr _DrawTile8x8V - clc - txa - adc #{8*SPRITE_PLANE_SPAN} - tax - tya - adc #{128*32} - tay - jmp _DrawTile8x8V - -draw_16x8 - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - tay - jsr _DrawTile8x8 - clc - txa - adc #4 - tax - tya - adc #128 ; Next tile is 128 bytes away - tay - jmp _DrawTile8x8 - -draw_16x8h - clc - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - pha - adc #128 - tay - jsr _DrawTile8x8 - txa 
- adc #4 - tax - ply - jmp _DrawTile8x8 - -draw_16x8v - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - tay - jsr _DrawTile8x8V - clc - txa - adc #4 - tax - tya - adc #128 - tay - jmp _DrawTile8x8V - -draw_16x8hv - clc - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - pha - adc #128 - tay - jsr _DrawTile8x8V - txa - adc #4 - tax - ply - jmp _DrawTile8x8V - -draw_16x16 - clc - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - tay - jmp _DrawTile16x16 - -; jsr _DrawTile8x8 - txa - adc #4 - tax - tya - adc #128 - tay - jsr _DrawTile8x8 - txa - adc #{8*SPRITE_PLANE_SPAN}-4 - tax - tya - adc #{128*{32-1}} - tay - jsr _DrawTile8x8 - txa - adc #4 - tax - tya - adc #128 - tay - jmp _DrawTile8x8 - -draw_16x16h - clc - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - pha - adc #128 - tay - jsr _DrawTile8x8 - - txa - adc #4 - tax - ply - jsr _DrawTile8x8 - - txa - adc #{8*SPRITE_PLANE_SPAN}-4 - tax - tya - adc #{128*32} - pha - adc #128 - tay - jsr _DrawTile8x8 - - txa - adc #4 - tax - ply - jmp _DrawTile8x8 - -draw_16x16v - clc - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - pha ; store some copies - phx - pha - adc #{128*32} - tay - jsr _DrawTile8x8V - - txa - adc #{8*SPRITE_PLANE_SPAN} - tax - ply - jsr _DrawTile8x8V - - pla - adc #4 - tax - lda 1,s - adc #{128*{32+1}} - tay - jsr _DrawTile8x8V - - txa - adc #{8*SPRITE_PLANE_SPAN} - tax - pla - adc #128 - tay - jmp _DrawTile8x8V - -; TODO -draw_16x16hv - clc - ldx _Sprites+VBUFF_ADDR,y - lda _Sprites+TILE_DATA_OFFSET,y - pha - adc #128+{128*32} ; Bottom-right source to top-left - tay - jsr _DrawTile8x8V - - txa - adc #4 - tax - lda 1,s - adc #{128*32} - tay - jsr _DrawTile8x8V - - txa - adc #{8*SPRITE_PLANE_SPAN}-4 - tax - lda 1,s - adc #128 - tay - jsr _DrawTile8x8V - - txa - adc #4 - tax - ply - jmp _DrawTile8x8V - -DrawTileSprite ENT - jsr _DrawTile8x8 - rtl - -; Hypothetical compiled tile routine -; -; Need 1MB of memory to have 1:1 space for 512 
tiles -; for 16 sprites we have 8 variants: Vert, Horz, Shift. The shift sprites need an extra column -; -; 16x16 sprite = 4x16 words x 4 = 256, 5x16 words x 4 = 320 = 576 words * 21 bytes/word = 12K per sprite - -; pei SpriteBanks -; plb - -; lda spritedata+0,x ; skipped if mask = $ffff -; and #tilemask -; ora #tiledata -; sta spritedata+0,x ; 12 bytes / word = 12 * 16 = 216 < 256 in the worst case - -; lda spritedata+2,x ; if mask != 0 and data = 0 -; and #tilemask -; sta spritedata+0,x - -; lda #tiledata ; if mask = 0 and data != 0 -; sta spritedata+0,x - -; stz spritedata+0,x ; if mask = 0 and data = 0 -; ... - -; plb -; lda #tilemask -; and spritemask+0,x -; sta spritemask+0,x ; 9 * 16 = 144 i the worst case -; -; stz spritemask+2,x ; if mask is zero (often the case) - -_DrawTileTemplate -; X = sprite vbuff address -; Y = tile data pointer -_DrawTile8x8 - phb - pea #^tiledata ; Set the bank to the tile data - plb - -]line equ 0 - lup 8 - lda: tiledata+32+{]line*4},y - andl spritemask+{]line*SPRITE_PLANE_SPAN},x - stal spritemask+{]line*SPRITE_PLANE_SPAN},x - - ldal spritedata+{]line*SPRITE_PLANE_SPAN},x - and: tiledata+32+{]line*4},y - ora: tiledata+{]line*4},y - stal spritedata+{]line*SPRITE_PLANE_SPAN},x - - lda: tiledata+32+{]line*4}+2,y - andl spritemask+{]line*SPRITE_PLANE_SPAN}+2,x - stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x - - ldal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x - and: tiledata+32+{]line*4}+2,y - ora: tiledata+{]line*4}+2,y - stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x -]line equ ]line+1 - --^ - - plb ; pop extra byte - plb - rts - -; X = sprite vbuff address -; Y = tile data pointer -_DrawTile16x16 - phb - pea #^tiledata ; Set the bank to the tile data - plb - -]line equ 0 - lup 8 - lda: tiledata+32+{]line*4},y - andl spritemask+{]line*SPRITE_PLANE_SPAN},x - stal spritemask+{]line*SPRITE_PLANE_SPAN},x - - ldal spritedata+{]line*SPRITE_PLANE_SPAN},x - and: tiledata+32+{]line*4},y - ora: tiledata+{]line*4},y - stal 
spritedata+{]line*SPRITE_PLANE_SPAN},x - - lda: tiledata+32+{]line*4}+2,y - andl spritemask+{]line*SPRITE_PLANE_SPAN}+2,x - stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x - - ldal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x - and: tiledata+32+{]line*4}+2,y - ora: tiledata+{]line*4}+2,y - stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x - - lda: tiledata+32+128+{]line*4},y - andl spritemask+{]line*SPRITE_PLANE_SPAN}+4,x - stal spritemask+{]line*SPRITE_PLANE_SPAN}+4,x - - ldal spritedata+{]line*SPRITE_PLANE_SPAN}+4,x - and: tiledata+32+128+{]line*4},y - ora: tiledata+128+{]line*4},y - stal spritedata+{]line*SPRITE_PLANE_SPAN}+4,x - - lda: tiledata+32+128+{]line*4}+2,y - andl spritemask+{]line*SPRITE_PLANE_SPAN}+6,x - stal spritemask+{]line*SPRITE_PLANE_SPAN}+6,x - - ldal spritedata+{]line*SPRITE_PLANE_SPAN}+6,x - and: tiledata+32+128+{]line*4}+2,y - ora: tiledata+128+{]line*4}+2,y - stal spritedata+{]line*SPRITE_PLANE_SPAN}+6,x - -]line equ ]line+1 - --^ - -TILE_ROW_STRIDE equ 32*128 -SPRITE_ROW_STRIDE equ 8*SPRITE_PLANE_SPAN - -]line equ 0 - lup 8 - lda: tiledata+TILE_ROW_STRIDE+32+{]line*4},y - andl spritemask+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN},x - stal spritemask+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN},x - - ldal spritedata+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN},x - and: tiledata+TILE_ROW_STRIDE+32+{]line*4},y - ora: tiledata+TILE_ROW_STRIDE+{]line*4},y - stal spritedata+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN},x - - lda: tiledata+TILE_ROW_STRIDE+32+{]line*4}+2,y - andl spritemask+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+2,x - stal spritemask+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+2,x - - ldal spritedata+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+2,x - and: tiledata+TILE_ROW_STRIDE+32+{]line*4}+2,y - ora: tiledata+TILE_ROW_STRIDE+{]line*4}+2,y - stal spritedata+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+2,x - - lda: tiledata+TILE_ROW_STRIDE+32+128+{]line*4},y - andl spritemask+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+4,x - stal 
spritemask+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+4,x - - ldal spritedata+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+4,x - and: tiledata+TILE_ROW_STRIDE+32+128+{]line*4},y - ora: tiledata+TILE_ROW_STRIDE+128+{]line*4},y - stal spritedata+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+4,x - - lda: tiledata+TILE_ROW_STRIDE+32+128+{]line*4}+2,y - andl spritemask+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+6,x - stal spritemask+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+6,x - - ldal spritedata+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+6,x - and: tiledata+TILE_ROW_STRIDE+128+32+{]line*4}+2,y - ora: tiledata+TILE_ROW_STRIDE+128+{]line*4}+2,y - stal spritedata+SPRITE_ROW_STRIDE+{]line*SPRITE_PLANE_SPAN}+6,x - -]line equ ]line+1 - --^ - - plb ; pop extra byte - plb - rts - -; X = sprite vbuff address -; Y = tile data pointer -; -; Draws the tile vertically flipped -_DrawTile8x8V - phb - pea #^tiledata ; Set the bank to the tile data - plb - -]line equ 0 - lup 8 - lda: tiledata+32+{{7-]line}*4},y - andl spritemask+{]line*SPRITE_PLANE_SPAN},x - stal spritemask+{]line*SPRITE_PLANE_SPAN},x - - ldal spritedata+{]line*SPRITE_PLANE_SPAN},x - and: tiledata+32+{{7-]line}*4},y - ora: tiledata+{{7-]line}*4},y - stal spritedata+{]line*SPRITE_PLANE_SPAN},x - - lda: tiledata+32+{{7-]line}*4}+2,y - andl spritemask+{]line*SPRITE_PLANE_SPAN}+2,x - stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x - - ldal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x - and: tiledata+32+{{7-]line}*4}+2,y - ora: tiledata+{{7-]line}*4}+2,y - stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x -]line equ ]line+1 - --^ - - plb ; pop extra byte - plb - rts - -; Erase is easy -- set an 8x8 area of the data region to all $0000 and the corresponding mask -; resgion to all $FFFF -; -; X = address is sprite plane -- erases an 8x8 region +; Small initialization routine to cache the banks for the sprite data and mask _CacheSpriteBanks lda #>spritemask and #$FF00 @@ -960,152 +504,39 @@ _CacheSpriteBanks sta SpriteBanks rts 
-SPRITE_PLANE_SPAN equ 256 - -; A = bank address -_EraseTileSprite8x8 - tax - phb ; Save the bank to switch to the sprite plane - - pei SpriteBanks - plb ; pop the data bank (low byte) - -]line equ 0 - lup 8 - stz: {]line*SPRITE_PLANE_SPAN}+0,x - stz: {]line*SPRITE_PLANE_SPAN}+2,x -]line equ ]line+1 - --^ - - plb ; pop the mask bank (high byte) - lda #$FFFF -]line equ 0 - lup 8 - sta: {]line*SPRITE_PLANE_SPAN}+0,x - sta: {]line*SPRITE_PLANE_SPAN}+2,x -]line equ ]line+1 - --^ - - plb - rts - -_EraseTileSprite8x16 - tax - phb ; Save the bank to switch to the sprite plane - - pei SpriteBanks - plb ; pop the data bank (low byte) - -]line equ 0 - lup 16 - stz: {]line*SPRITE_PLANE_SPAN}+0,x - stz: {]line*SPRITE_PLANE_SPAN}+2,x -]line equ ]line+1 - --^ - - plb ; pop the mask bank (high byte) - lda #$FFFF -]line equ 0 - lup 16 - sta: {]line*SPRITE_PLANE_SPAN}+0,x - sta: {]line*SPRITE_PLANE_SPAN}+2,x -]line equ ]line+1 - --^ - - plb - rts - -_EraseTileSprite16x8 - tax - phb ; Save the bank to switch to the sprite plane - - pei SpriteBanks - plb ; pop the data bank (low byte) - -]line equ 0 - lup 8 - stz: {]line*SPRITE_PLANE_SPAN}+0,x - stz: {]line*SPRITE_PLANE_SPAN}+2,x - stz: {]line*SPRITE_PLANE_SPAN}+4,x - stz: {]line*SPRITE_PLANE_SPAN}+6,x -]line equ ]line+1 - --^ - - plb ; pop the mask bank (high byte) - lda #$FFFF -]line equ 0 - lup 8 - sta: {]line*SPRITE_PLANE_SPAN}+0,x - sta: {]line*SPRITE_PLANE_SPAN}+2,x - sta: {]line*SPRITE_PLANE_SPAN}+4,x - sta: {]line*SPRITE_PLANE_SPAN}+6,x -]line equ ]line+1 - --^ - - plb - rts - -_EraseTileSprite16x16 - tax - phb ; Save the bank to switch to the sprite plane - - pei SpriteBanks - plb ; pop the data bank (low byte) - -]line equ 0 - lup 16 - stz: {]line*SPRITE_PLANE_SPAN}+0,x - stz: {]line*SPRITE_PLANE_SPAN}+2,x - stz: {]line*SPRITE_PLANE_SPAN}+4,x - stz: {]line*SPRITE_PLANE_SPAN}+6,x -]line equ ]line+1 - --^ - - plb ; pop the mask bank (high byte) - - lda #$FFFF -]line equ 0 - lup 16 - sta: {]line*SPRITE_PLANE_SPAN}+0,x - sta: 
{]line*SPRITE_PLANE_SPAN}+2,x - sta: {]line*SPRITE_PLANE_SPAN}+4,x - sta: {]line*SPRITE_PLANE_SPAN}+6,x -]line equ ]line+1 - --^ - - plb - rts +; This is 13 blocks wide +SPRITE_PLANE_SPAN equ 52 ; 256 ; A = x coordinate ; Y = y coordinate -GetSpriteVBuffAddr ENT - jsr _GetSpriteVBuffAddr - rtl +;GetSpriteVBuffAddr ENT +; jsr _GetSpriteVBuffAddr +; rtl ; A = x coordinate ; Y = y coordinate -_GetSpriteVBuffAddr - pha - tya - clc - adc #NUM_BUFF_LINES ; The virtual buffer has 24 lines of off-screen space - xba ; Each virtual scan line is 256 bytes wide for overdraw space - clc - adc 1,s - sta 1,s - pla - rts +;_GetSpriteVBuffAddr +; pha +; tya +; clc +; adc #NUM_BUFF_LINES ; The virtual buffer has 24 lines of off-screen space +; xba ; Each virtual scan line is 256 bytes wide for overdraw space +; clc +; adc 1,s +; sta 1,s +; pla +; rts ; Version that uses temporary space (tmp15) -_GetSpriteVBuffAddrTmp - sta tmp15 - tya - clc - adc #NUM_BUFF_LINES ; The virtual buffer has 24 lines of off-screen space - xba ; Each virtual scan line is 256 bytes wide for overdraw space - clc - adc tmp15 - rts +;_GetSpriteVBuffAddrTmp +; sta tmp15 +; tya +; clc +; adc #NUM_BUFF_LINES ; The virtual buffer has 24 lines of off-screen space +; xba ; Each virtual scan line is 256 bytes wide for overdraw space +; clc +; adc tmp15 +; rts ; Add a new sprite to the rendering pipeline ; @@ -1149,7 +580,7 @@ _AddSprite sec ; Signal that no sprite slot was available rts -:open +:open sta _Sprites+SPRITE_ID,x ; Keep a copy of the full descriptor jsr _GetTileAddr ; This applies the TILE_ID_MASK sta _Sprites+TILE_DATA_OFFSET,x @@ -1162,12 +593,13 @@ _AddSprite pla ; X coordinate sta _Sprites+SPRITE_X,x - jsr _GetSpriteVBuffAddrTmp ; Preserves X-register - sta _Sprites+VBUFF_ADDR,x +; jsr _GetSpriteVBuffAddrTmp +; sta _Sprites+VBUFF_ADDR,x ; This is now pre-calculated since each sprite slot gets a fixed location - jsr _PrecalcAllSpriteInfo ; Cache stuff + jsr _PrecalcAllSpriteInfo ; Cache sprite 
property values (simple stuff) + jsr _DrawSpriteSheet ; Render the sprite into internal space -; Mark the dirty bit to indicate that the active sprite list needs to be rebuild in the next +; Mark the dirty bit to indicate that the active sprite list needs to be rebuilt in the next ; render call lda #DIRTY_BIT_SPRITE_ARRAY @@ -1194,7 +626,7 @@ _AddSprite ; Precalculate some cached values for a sprite. These are *only* to make other part of code, ; specifically the draw/erase routines more efficient. ; -; There are variations of thi routine based on whether we are adding a new sprite, updating +; There are variations of this routine based on whether we are adding a new sprite, updating ; it's tile information, or changing its position. ; ; X = sprite index @@ -1202,14 +634,7 @@ _PrecalcAllSpriteInfo lda _Sprites+SPRITE_ID,x and #$2E00 xba - sta _Sprites+SPRITE_DISP2,x ; use bits 9 through 13 for full dispatch - - lda _Sprites+SPRITE_ID,x - and #$1800 ; use bits 11 and 12 to dispatch (only care about size) - lsr - lsr - xba - sta _Sprites+SPRITE_DISP,x + sta _Sprites+SPRITE_DISP,x ; use bits 9 through 13 for full dispatch ; Clip the sprite's bounding box to the play field size and also set a flag if the sprite ; is fully offs-screen or not @@ -1344,13 +769,15 @@ _MoveSpriteXnc pha tya sta _Sprites+SPRITE_Y,x ; Update the Y coordinate - pla - jsr _GetSpriteVBuffAddrTmp ; A = x-coord, Y = y-coord - ldy _Sprites+VBUFF_ADDR,x ; Save the previous draw location for erasing - sta _Sprites+VBUFF_ADDR,x ; Overwrite with the new location - tya - sta _Sprites+OLD_VBUFF_ADDR,x +; pla +; jsr _GetSpriteVBuffAddrTmp ; A = x-coord, Y = y-coord +; ldy _Sprites+VBUFF_ADDR,x ; Save the previous draw location for erasing +; sta _Sprites+VBUFF_ADDR,x ; Overwrite with the new location +; tya +; sta _Sprites+OLD_VBUFF_ADDR,x + + jsr _PrecalcAllSpriteInfo ; Can be specialized to only update (x,y) values lda _Sprites+SPRITE_STATUS,x ora #SPRITE_STATUS_MOVED @@ -1361,17 +788,12 @@ 
_MoveSpriteXnc ; Sprite data structures. We cache quite a few pieces of information about the sprite ; to make calculations faster, so this is hidden from the caller. ; -; Each sprite record contains the following properties: ; -; +0: Sprite status word (0 = unoccupied) -; +2: Tile data address -; +4: Screen offset address (used for data and masks) - ; Number of "off-screen" lines above logical (0,0) -NUM_BUFF_LINES equ 24 +; NUM_BUFF_LINES equ 24 -MAX_SPRITES equ 16 -SPRITE_REC_SIZE equ 48 +MAX_SPRITES equ 16 +SPRITE_REC_SIZE equ 46 ; Mark each sprite as ADDED, UPDATED, MOVED, REMOVED depending on the actions applied to it ; on this frame. Quick note, the same Sprite ID cannot be removed and added in the same frame. @@ -1385,17 +807,13 @@ SPRITE_STATUS_MOVED equ $0002 ; Sprite's position was changed SPRITE_STATUS_UPDATED equ $0004 ; Sprite's non-position attributes were changed SPRITE_STATUS_REMOVED equ $0008 ; Sprite has been removed. -; Each subroutine just sets the relevant bits, so it's possible to call AddSprite / UpdateSprite / MoveSprite -; and RemoveSprite in a single frame. These bits have priorities, so in this case, the sprite is immediately -; removed and never displayed. - SPRITE_STATUS equ {MAX_SPRITES*0} TILE_DATA_OFFSET equ {MAX_SPRITES*2} -VBUFF_ADDR equ {MAX_SPRITES*4} +VBUFF_ADDR equ {MAX_SPRITES*4} ; Fixed address in sprite/mask banks SPRITE_ID equ {MAX_SPRITES*6} SPRITE_X equ {MAX_SPRITES*8} SPRITE_Y equ {MAX_SPRITES*10} -OLD_VBUFF_ADDR equ {MAX_SPRITES*12} +; OLD_VBUFF_ADDR equ {MAX_SPRITES*12} TILE_STORE_ADDR_1 equ {MAX_SPRITES*14} TILE_STORE_ADDR_2 equ {MAX_SPRITES*16} TILE_STORE_ADDR_3 equ {MAX_SPRITES*18} @@ -1412,7 +830,6 @@ SPRITE_CLIP_RIGHT equ {MAX_SPRITES*38} SPRITE_CLIP_TOP equ {MAX_SPRITES*40} SPRITE_CLIP_BOTTOM equ {MAX_SPRITES*42} IS_OFF_SCREEN equ {MAX_SPRITES*44} -SPRITE_DISP2 equ {MAX_SPRITES*46} ; Maintain the index of the next open sprite slot. This allows us to have amortized ; constant sprite add performance. 
A negative value means no slots are available. @@ -1420,9 +837,4 @@ _NextOpenSlot dw 0 _OpenListHead dw 0 _OpenList dw 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,$FFFF ; List with sentinel at the end -_Sprites ds SPRITE_REC_SIZE*MAX_SPRITES - -; On-demand cached list of active sprite slots -activeSpriteCount ds 2 -activeSpriteList ds 2*MAX_SPRITES - +_Sprites ds SPRITE_REC_SIZE*MAX_SPRITES diff --git a/src/Sprite2.s b/src/Sprite2.s index 0dd4556..8965a8e 100644 --- a/src/Sprite2.s +++ b/src/Sprite2.s @@ -64,7 +64,7 @@ _LocalToTileStore ; code field offsets and then cache variations of this value needed in the rest of the subroutine ; ; The SpriteX is always the MAXIMUM value of the corner coordinates. We subtract (SpriteX + StartX) mod 4 -; to find the coordinate in the sprite plane that matches up with the tile in the play field and +; to find the coordinate in the sprite cache that matches up with the tile in the play field and ; then use that to calculate the VBUFF address from which to copy sprite data. ; ; StartX SpriteX z = * mod 4 (SpriteX - z) @@ -92,21 +92,21 @@ _MarkDirtySprite lda _Sprites+IS_OFF_SCREEN,y ; Check if the sprite is visible in the playfield bne mdsOut -; At this point we know that we have to update the tiles that overlap the sprite plane rectangle defined -; by (Top, Left), (Bottom, Right). The general process is to figure out the top-left coordinate in the -; sprite plane that matches up with the code field and then calculate the number of tiles in each direction -; that need to be dirtied to cover the sprite. +; At this point we know that we have to update the tiles that overlap the sprite's rectangle defined +; by (Top, Left), (Bottom, Right). clc lda _Sprites+SPRITE_CLIP_TOP,y adc StartYMod208 ; Adjust for the scroll offset (could be a negative number!) 
tax ; Save this value and #$0007 ; Get (StartY + SpriteY) mod 8 - eor #$FFFF - inc - clc - adc _Sprites+SPRITE_CLIP_TOP,y ; subtract from the Y position (possible to go negative here) - sta TileTop ; This position will line up with the tile that the sprite overlaps with + sta TileTop ; This is the relative offset to the sprite stamp + +; eor #$FFFF +; inc +; clc +; adc _Sprites+SPRITE_CLIP_TOP,y ; subtract from the Y position (possible to go negative here) +; sta TileTop ; This position will line up with the tile that the sprite overlaps with txa ; Get back the position of the sprite top in the code field cmp #208 ; check if we went too far positive @@ -120,7 +120,10 @@ _MarkDirtySprite lda _Sprites+SPRITE_CLIP_BOTTOM,y ; Figure out how many tiles are needed to cover the sprite's area sec - sbc TileTop + sbc _Sprites+SPRITE_CLIP_TOP,y + clc + adc TileTop + and #$0018 ; Clear out the lower bits and stash in bits 4 and 5 sta AreaIndex @@ -131,12 +134,14 @@ _MarkDirtySprite adc StartXMod164 tax and #$0003 - eor #$FFFF - inc - clc - adc _Sprites+SPRITE_CLIP_LEFT,y sta TileLeft +; eor #$FFFF +; inc +; clc +; adc _Sprites+SPRITE_CLIP_LEFT,y +; sta TileLeft + txa cmp #164 bcc *+5 @@ -146,21 +151,13 @@ _MarkDirtySprite and #$FFFE ; Same pre-multiply by 2 for later sta ColLeft -; Calculate the offset into the TileStore lookup array for the top-left tile - -; ldx RowTop -; lda ColLeft -; clc -; adc TileStore2DYTable,x ; Fixed offset to the next row -; sta Origin ; This is the index into the TileStore2DLookup table - ; Sneak a pre-calculation here. Calculate the tile-aligned upper-left corner of the sprite in the sprite plane. ; We can reuse this in all of the routines below. This is not the (x,y) of the sprite itself, but -; the corner of the tile it overlaps with +; the corner of the tile it overlaps with, relative to the sprite's VBUFF_ADDR. 
clc lda TileTop - adc #NUM_BUFF_LINES +; adc #NUM_BUFF_LINES xba clc adc TileLeft @@ -170,7 +167,9 @@ _MarkDirtySprite lda _Sprites+SPRITE_CLIP_RIGHT,y sec - sbc TileLeft + sbc _Sprites+SPRITE_CLIP_LEFT,y + clc + adc TileLeft and #$000C lsr ; bit 0 is always zero and width stored in bits 1 and 2 ora AreaIndex @@ -323,20 +322,33 @@ _MarkDirtySprite ; If we had a double-sized 2D array to be able to look up the tile store address without ; adding rows and column, we could save ~6 cycles per tile +; If all that is needed is to record the Tile Store offset for the sprite and delay any +; actual calculations, then we just need to do +; +; lda TileStore2DArray,x +; sta _Sprites+TILE_STORE_ADDR_0,y +; lda TileStore2DArray+2,x +; sta _Sprites+TILE_STORE_ADDR_1,y +; lda TileStore2DArray+41,x +; sta _Sprites+TILE_STORE_ADDR_2,y +; ... + :mark_0_0 - ldx RowTop - lda ColLeft + ldx RowTop + lda ColLeft clc - adc TileStoreYTable,x ; Fixed offset to the next row + adc TileStoreYTable,x ; Fixed offset to the next row tax -; ldx Origin -; lda TileStore2DLookup,x -; tax ; This is the tile store offset + ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 - lda VBuffOrigin ; This is an interesting case. The mapping between the tile store + lda VBuffOrigin + sta (tmp0),y + +; lda VBuffOrigin ; This is an interesting case. The mapping between the tile store ; adc #{0*4}+{0*256} ; and the sprite buffers changes as the StartX, StartY values change - sta TileStore+TS_SPRITE_ADDR,x ; but don't depend on any sprite information. However, by setting the +; stal TileStore+TS_SPRITE_ADDR,x ; but don't depend on any sprite information. 
However, by setting the ; value only for the tiles that get added to the dirty tile list, we ; can avoid recalculating over 1,000 values whenever the screen scrolls ; (which is common) and just limit it to the number of tiles covered by @@ -344,10 +356,10 @@ _MarkDirtySprite ; moving and they are being dirtied, then we may do more work, but the ; odds are in our favor to just take care of it here. - lda TileStore+TS_SPRITE_FLAG,x + ; lda TileStore+TS_SPRITE_FLAG,x lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX ; Needs X = tile store offset; destroys A,X. Returns X in A @@ -358,13 +370,16 @@ _MarkDirtySprite adc TileStoreYTable+2,x tax + ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 + lda VBuffOrigin - adc #{0*4}+{1*8*256} - sta TileStore+TS_SPRITE_ADDR,x + adc #{0*4}+{1*8*SPRITE_PLANE_SPAN} + sta (tmp0),y lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX @@ -375,13 +390,16 @@ _MarkDirtySprite adc TileStoreYTable+4,x tax + ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 + lda VBuffOrigin - adc #{0*4}+{2*8*256} - sta TileStore+TS_SPRITE_ADDR,x + adc #{0*4}+{2*8*SPRITE_PLANE_SPAN} + sta (tmp0),y lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX @@ -393,13 +411,16 @@ _MarkDirtySprite adc TileStoreYTable,x tax + ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 + lda VBuffOrigin - adc #{1*4}+{0*8*256} - sta TileStore+TS_SPRITE_ADDR,x + adc #{1*4}+{0*8*SPRITE_PLANE_SPAN} + sta (tmp0),y lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX @@ -411,13 +432,16 @@ _MarkDirtySprite adc TileStoreYTable+2,x tax + 
ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 + lda VBuffOrigin - adc #{1*4}+{1*8*256} - sta TileStore+TS_SPRITE_ADDR,x + adc #{1*4}+{1*8*SPRITE_PLANE_SPAN} + sta (tmp0),y lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX @@ -429,13 +453,16 @@ _MarkDirtySprite adc TileStoreYTable+4,x tax + ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 + lda VBuffOrigin - adc #{1*4}+{2*8*256} - sta TileStore+TS_SPRITE_ADDR,x + adc #{1*4}+{2*8*SPRITE_PLANE_SPAN} + sta (tmp0),y lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX @@ -447,13 +474,16 @@ _MarkDirtySprite adc TileStoreYTable,x tax + ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 + lda VBuffOrigin - adc #{2*4}+{0*8*256} - sta TileStore+TS_SPRITE_ADDR,x + adc #{2*4}+{0*8*SPRITE_PLANE_SPAN} + sta (tmp0),y lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX @@ -465,13 +495,16 @@ _MarkDirtySprite adc TileStoreYTable+2,x tax + ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 + lda VBuffOrigin - adc #{2*4}+{1*8*256} - sta TileStore+TS_SPRITE_ADDR,x + adc #{2*4}+{1*8*SPRITE_PLANE_SPAN} + sta (tmp0),y lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX @@ -483,13 +516,16 @@ _MarkDirtySprite adc TileStoreYTable+4,x tax + ldal TileStore+TS_VBUFF_ARRAY_ADDR,x + sta tmp0 + lda VBuffOrigin - adc #{2*4}+{2*8*256} - sta TileStore+TS_SPRITE_ADDR,x + adc #{2*4}+{2*8*SPRITE_PLANE_SPAN} + sta (tmp0),y lda SpriteBit - ora TileStore+TS_SPRITE_FLAG,x - sta TileStore+TS_SPRITE_FLAG,x + oral TileStore+TS_SPRITE_FLAG,x + stal TileStore+TS_SPRITE_FLAG,x jmp _PushDirtyTileX diff --git 
a/src/SpriteRender.s b/src/SpriteRender.s new file mode 100644 index 0000000..a07e2cc --- /dev/null +++ b/src/SpriteRender.s @@ -0,0 +1,389 @@ +; Function to render a sprite from a sprite definition into the internal data buffers +; +; X = sprite index +_DrawSpriteSheet + phx + + lda _Sprites+VBUFF_ADDR,x + sta tmp1 + + lda _Sprites+TILE_DATA_OFFSET,x + sta tmp2 + + lda _Sprites+SPRITE_DISP,x + and #-{SPRITE_VFLIP+SPRITE_HFLIP} ; dispatch to all of the different orientations + sta tmp3 + +; Set bank + phb + pea #^tiledata ; Set the bank to the tile data + plb + + ldx tmp3 + ldy tmp2 + lda tmp1 + jsr _DrawSprite + + lda tmp3 + ora #SPRITE_VFLIP + tax + ldy tmp2 + lda tmp1 + clc + adc #4*3 + jsr _DrawSprite + + lda tmp3 + ora #SPRITE_HFLIP + tax + ldy tmp2 + lda tmp1 + clc + adc #4*6 + jsr _DrawSprite + + lda tmp3 + ora #SPRITE_VFLIP + tax + ldy tmp2 + lda tmp1 + clc + adc #4*9 + jsr _DrawSprite + +; Restore bank + plb ; pop extra byte + plb + + plx + rts +; +; X = _Sprites array offset +_DrawSprite +; ldx _Sprites+SPRITE_DISP,y ; use bits 9, 10, 11, 12 and 13 to dispatch + jmp (draw_sprite,x) + +draw_sprite dw draw_8x8,draw_8x8h,draw_8x8v,draw_8x8hv + dw draw_8x16,draw_8x16h,draw_8x16v,draw_8x16hv + dw draw_16x8,draw_16x8h,draw_16x8v,draw_16x8hv + dw draw_16x16,draw_16x16h,draw_16x16v,draw_16x16hv + + dw :rtn,:rtn,:rtn,:rtn ; hidden bit is set + dw :rtn,:rtn,:rtn,:rtn + dw :rtn,:rtn,:rtn,:rtn + dw :rtn,:rtn,:rtn,:rtn +:rtn rts + +draw_8x8 +draw_8x8h + tax + jmp _DrawTile8x8 + +draw_8x8v +draw_8x8hv + tax + jmp _DrawTile8x8V + +draw_8x16 +draw_8x16h + tax + jsr _DrawTile8x8 + clc + txa + adc #{8*SPRITE_PLANE_SPAN} + tax + tya + adc #{128*32} ; 32 tiles to the next vertical one, each tile is 128 bytes + tay + jmp _DrawTile8x8 + +draw_8x16v +draw_8x16hv + tax + jsr _DrawTile8x8V + clc + txa + adc #{8*SPRITE_PLANE_SPAN} + tax + tya + adc #{128*32} + tay + jmp _DrawTile8x8V + +draw_16x8 + tax + jsr _DrawTile8x8 + clc + txa + adc #4 + tax + tya + adc #128 ; Next tile is
128 bytes away + tay + jmp _DrawTile8x8 + +draw_16x8h + clc + tax + tya + pha + adc #128 + tay + jsr _DrawTile8x8 + txa + adc #4 + tax + ply + jmp _DrawTile8x8 + +draw_16x8v + tax + jsr _DrawTile8x8V + clc + txa + adc #4 + tax + tya + adc #128 + tay + jmp _DrawTile8x8V + +draw_16x8hv + clc + tax + tya + pha + adc #128 + tay + jsr _DrawTile8x8V + txa + adc #4 + tax + ply + jmp _DrawTile8x8V + +draw_16x16 + clc + tax + jsr _DrawTile8x8 + txa + adc #4 + tax + tya + adc #128 + tay + jsr _DrawTile8x8 + txa + adc #{8*SPRITE_PLANE_SPAN}-4 + tax + tya + adc #{128*{32-1}} + tay + jsr _DrawTile8x8 + txa + adc #4 + tax + tya + adc #128 + tay + jmp _DrawTile8x8 + +draw_16x16h + clc + tax + tya + pha + adc #128 + tay + jsr _DrawTile8x8 + + txa + adc #4 + tax + ply + jsr _DrawTile8x8 + + txa + adc #{8*SPRITE_PLANE_SPAN}-4 + tax + tya + adc #{128*32} + pha + adc #128 + tay + jsr _DrawTile8x8 + + txa + adc #4 + tax + ply + jmp _DrawTile8x8 + +draw_16x16v + clc + tax + tya + pha ; store some copies + phx + pha + adc #{128*32} + tay + jsr _DrawTile8x8V + + txa + adc #{8*SPRITE_PLANE_SPAN} + tax + ply + jsr _DrawTile8x8V + + pla + adc #4 + tax + lda 1,s + adc #{128*{32+1}} + tay + jsr _DrawTile8x8V + + txa + adc #{8*SPRITE_PLANE_SPAN} + tax + pla + adc #128 + tay + jmp _DrawTile8x8V + +draw_16x16hv + clc + tax + tya + pha + adc #128+{128*32} ; Bottom-right source to top-left + tay + jsr _DrawTile8x8V + + txa + adc #4 + tax + lda 1,s + adc #{128*32} + tay + jsr _DrawTile8x8V + + txa + adc #{8*SPRITE_PLANE_SPAN}-4 + tax + lda 1,s + adc #128 + tay + jsr _DrawTile8x8V + + txa + adc #4 + tax + ply + jmp _DrawTile8x8V + + +; X = sprite vbuff address +; Y = tile data pointer +_DrawTile8x8 +_CopyTile8x8 +]line equ 0 + lup 8 + lda: tiledata+32+{]line*4},y + stal spritemask+{]line*SPRITE_PLANE_SPAN},x + lda: tiledata+{]line*4},y + stal spritedata+{]line*SPRITE_PLANE_SPAN},x + + lda: tiledata+32+{]line*4}+2,y + stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x + lda: tiledata+{]line*4}+2,y + stal 
spritedata+{]line*SPRITE_PLANE_SPAN}+2,x +]line equ ]line+1 + --^ + rts + +_DrawTile8x8V +_CopyTile8x8V +]line equ 0 + lup 8 + lda: tiledata+32+{{7-]line}*4},y + stal spritemask+{]line*SPRITE_PLANE_SPAN},x + lda: tiledata+{{7-]line}*4},y + stal spritedata+{]line*SPRITE_PLANE_SPAN},x + + lda: tiledata+32+{{7-]line}*4}+2,y + stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x + lda: tiledata+{{7-]line}*4}+2,y + stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x +]line equ ]line+1 + --^ + rts + +; X = sprite vbuff address +; Y = tile data pointer +;_DrawTile8x8 +; phb +; pea #^tiledata ; Set the bank to the tile data +; plb +; +;]line equ 0 +; lup 8 +; lda: tiledata+32+{]line*4},y +; andl spritemask+{]line*SPRITE_PLANE_SPAN},x +; stal spritemask+{]line*SPRITE_PLANE_SPAN},x +; +; ldal spritedata+{]line*SPRITE_PLANE_SPAN},x +; and: tiledata+32+{]line*4},y +; ora: tiledata+{]line*4},y +; stal spritedata+{]line*SPRITE_PLANE_SPAN},x +; +; lda: tiledata+32+{]line*4}+2,y +; andl spritemask+{]line*SPRITE_PLANE_SPAN}+2,x +; stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x +; +; ldal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x +; and: tiledata+32+{]line*4}+2,y +; ora: tiledata+{]line*4}+2,y +; stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x +;]line equ ]line+1 +; --^ +; +; plb ; pop extra byte +; plb +; rts + +; X = sprite vbuff address +; Y = tile data pointer +; +; Draws the tile vertically flipped +;_DrawTile8x8V +; phb +; pea #^tiledata ; Set the bank to the tile data +; plb + +;]line equ 0 +; lup 8 +; lda: tiledata+32+{{7-]line}*4},y +; andl spritemask+{]line*SPRITE_PLANE_SPAN},x +; stal spritemask+{]line*SPRITE_PLANE_SPAN},x +; +; ldal spritedata+{]line*SPRITE_PLANE_SPAN},x +; and: tiledata+32+{{7-]line}*4},y +; ora: tiledata+{{7-]line}*4},y +; stal spritedata+{]line*SPRITE_PLANE_SPAN},x + +; lda: tiledata+32+{{7-]line}*4}+2,y +; andl spritemask+{]line*SPRITE_PLANE_SPAN}+2,x +; stal spritemask+{]line*SPRITE_PLANE_SPAN}+2,x +; +; ldal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x +; and: 
tiledata+32+{{7-]line}*4}+2,y +; ora: tiledata+{{7-]line}*4}+2,y +; stal spritedata+{]line*SPRITE_PLANE_SPAN}+2,x +;]line equ ]line+1 +; --^ +; +; plb ; pop extra byte +; plb +; rts \ No newline at end of file diff --git a/src/blitter/Template.s b/src/blitter/Template.s index 6f61682..07ecc17 100644 --- a/src/blitter/Template.s +++ b/src/blitter/Template.s @@ -137,7 +137,7 @@ SetScreenRect sty ScreenHeight ; Save the screen height and ldx #0 ldy #0 :tsloop - sta TileStore+TS_SCREEN_ADDR,X + sta TileStore+TS_SCREEN_ADDR,x clc adc #4 ; Go to the next tile @@ -205,7 +205,7 @@ Counter equ tmp3 tax ; NOTE: Try to rework to use new TileStore2DLookup array lda OnScreenAddr - sta TileStore+TS_SCREEN_ADDR,X + sta TileStore+TS_SCREEN_ADDR,x clc adc #4 ; Go to the next tile diff --git a/src/blitter/Tiles.s b/src/blitter/Tiles.s index cccdbf5..ae0b7db 100644 --- a/src/blitter/Tiles.s +++ b/src/blitter/Tiles.s @@ -112,13 +112,17 @@ RenderTile ENT rtl _RenderTile2 + pea >TileStore ; Need that addressing flexibility here. 
Callers responsible for restoring bank reg + plb + plb + lda TileStore+TS_TILE_ID,y ; build the finalized tile descriptor ldx TileStore+TS_SPRITE_FLAG,y ; This is a bitfield of all the sprites that intersect this tile, only care if non-zero or not beq :nosprite ora #TILE_SPRITE_BIT - ldx TileStore+TS_SPRITE_ADDR,y - stx _SPR_X_REG +; ldx TileStore+TS_SPRITE_ADDR,y ; TODO: collapse sprites +; stx _SPR_X_REG :nosprite sta _TILE_ID ; Some tile blitters need to get the tile descriptor @@ -498,17 +502,34 @@ _CopyBG1Tile ; ; TileStore+TS_TILE_ID : Tile descriptor ; TileStore+TS_DIRTY : $FFFF is clean, otherwise stores a back-reference to the DirtyTiles array -; TileStore+TS_SPRITE_FLAG : Set to TILE_SPRITE_BIT if a sprite is present at this tile location -; TileStore+TS_SPRITE_ADDR ; Address of the tile in the sprite plane ; TileStore+TS_TILE_ADDR : Address of the tile in the tile data buffer ; TileStore+TS_CODE_ADDR_LOW : Low word of the address in the code field that receives the tile ; TileStore+TS_CODE_ADDR_HIGH : High word of the address in the code field that receives the tile ; TileStore+TS_WORD_OFFSET : Logical number of word for this location ; TileStore+TS_BASE_ADDR : Copy of BTableAddrLow -; TileStore+TS_SCREEN_ADDR : Address ont he physical screen corresponding to this tile (for direct rendering) +; TileStore+TS_SCREEN_ADDR : Address on the physical screen corresponding to this tile (for direct rendering) +; TileStore+TS_SPRITE_FLAG : A bit field of all sprites that intersect this tile +; TileStore+TS_SPRITE_ADDR_1 ; Address of the sprite data that aligns with this tile. These +; TileStore+TS_SPRITE_ADDR_2 ; values are 1:1 with the TS_SPRITE_FLAG bits and are not contiguous. +; TileStore+TS_SPRITE_ADDR_3 ; If the bit position in TS_SPRITE_FLAG is not set, then the value in +; TileStore+TS_SPRITE_ADDR_4 ; the TS_SPRITE_ADDR_* field is undefined. 
+; TileStore+TS_SPRITE_ADDR_5 +; TileStore+TS_SPRITE_ADDR_6 +; TileStore+TS_SPRITE_ADDR_7 +; TileStore+TS_SPRITE_ADDR_8 +; TileStore+TS_SPRITE_ADDR_9 +; TileStore+TS_SPRITE_ADDR_10 +; TileStore+TS_SPRITE_ADDR_11 +; TileStore+TS_SPRITE_ADDR_12 +; TileStore+TS_SPRITE_ADDR_13 +; TileStore+TS_SPRITE_ADDR_14 +; TileStore+TS_SPRITE_ADDR_15 +; TileStore+TS_SPRITE_ADDR_16 -TileStore ENT - ds TILE_STORE_SIZE*11 + +; TileStore+ +;TileStore ENT +; ds TILE_STORE_SIZE*11 ; A list of dirty tiles that need to be updated in a given frame DirtyTileCount ds 2 @@ -519,7 +540,7 @@ DirtyTiles ds TILE_STORE_SIZE ; At most this many tiles can possibly InitTiles :col equ tmp0 :row equ tmp1 - +:vbuff equ tmp2 ; Fill in the TileStoreYTable. This is just a table of offsets into the Tile Store for each row. There ; are 26 rows with a stride of 41 ldy #0 @@ -570,18 +591,27 @@ InitTiles sta :row lda #40 sta :col + lda #$8000 + sta :vbuff :loop ; The first set of values in the Tile Store are changed during each frame based on the actions ; that are happening - stz TileStore+TS_TILE_ID,x ; clear the tile store with the special zero tile - stz TileStore+TS_TILE_ADDR,x + lda #0 + stal TileStore+TS_TILE_ID,x ; clear the tile store with the special zero tile + stal TileStore+TS_TILE_ADDR,x - stz TileStore+TS_SPRITE_FLAG,x ; no sprites are set at the beginning + stal TileStore+TS_SPRITE_FLAG,x ; no sprites are set at the beginning lda #$FFFF ; none of the tiles are dirty - sta TileStore+TS_DIRTY,x + stal TileStore+TS_DIRTY,x + + lda :vbuff ; array of sprite vbuff addresses per tile + stal TileStore+TS_VBUFF_ARRAY_ADDR,x + clc + adc #32 + sta :vbuff ; The next set of values are constants that are simply used as cached parameters to avoid needing to ; calculate any of these values during tile rendering @@ -590,20 +620,20 @@ InitTiles asl ; exists in the code fields tay lda BRowTableHigh,y - sta TileStore+TS_CODE_ADDR_HIGH,x ; High word of the tile address (just the bank) + stal 
TileStore+TS_CODE_ADDR_HIGH,x ; High word of the tile address (just the bank) lda BRowTableLow,y - sta TileStore+TS_BASE_ADDR,x ; May not be needed later if we can figure out the right constant... + stal TileStore+TS_BASE_ADDR,x ; May not be needed later if we can figure out the right constant... lda :col ; Set the offset values based on the column asl ; of this tile asl - sta TileStore+TS_WORD_OFFSET,x ; This is the offset from 0 to 82, used in LDA (dp),y instruction + stal TileStore+TS_WORD_OFFSET,x ; This is the offset from 0 to 82, used in LDA (dp),y instruction tay lda Col2CodeOffset+2,y clc - adc TileStore+TS_BASE_ADDR,x - sta TileStore+TS_CODE_ADDR_LOW,x ; Low word of the tile address in the code field + adcl TileStore+TS_BASE_ADDR,x + stal TileStore+TS_CODE_ADDR_LOW,x ; Low word of the tile address in the code field dec :col bpl :hop @@ -719,7 +749,7 @@ _PushDirtyTileX inx stx DirtyTileCount -; Same speed, but preserved the Z register +; Same speed, but preserves the X register ; sta (DirtyTiles) ; 6 ; lda DirtyTiles ; 4 ; inc ; 2