mirror of
https://github.com/deater/dos33fsprogs.git
The Challenges of an Apple II chiptune player.

The goal is to design a chiptune player that can play large (150k+ uncompressed) chiptune files on an Apple II with 48k of RAM and a Mockingboard sound card.

An interrupt routine wakes at 50Hz to write the registers and do a few other housekeeping things.

There is not enough RAM to hold the full raw ym5 sound data (one byte for each of 14 registers, 50 times a second). This compresses amazingly well: LZ4 shrinks it by at least a factor of 10. But the raw data still won't all fit in RAM, and we can't do disk I/O during playback (disk I/O disables interrupts), so we have to load the full compressed file from disk up front and then decompress it in chunks. So we need room for both the compressed file and the uncompressed data.

The problem is that decompression also takes a while, longer than one 50Hz tick (20ms). So if we just decompress the next chunk when it is needed, the sound will noticeably pause for a fraction of a second.

One solution to that is to have two decompress areas and flip between them, decompressing in the background to one while the other is playing. The problem with splitting the decompressed data into smaller chunks like this is that it doesn't compress as well, so the compressed file takes up more disk/memory space.

As can be seen from the memory map, if we assume our player can fit in 4k we have roughly from $2000 to $9600 for memory. That's $7600 (29.5k).

If we could single buffer, we could have 256*3*14 (10.5k) for decompress and 19k for file size, which would let us play most of the reasonably sized songs on our play list (KRW(3) in the table at the end).

If we need to double buffer, then we need 256*2*14*2 (14k) for decompress and 15.5k for file size, which still works, at least if the move to KRW(2) sized files doesn't bloat things too much.

Proposed plan

Decompress 3, but in the room of 4?

        1234    in memory
        ABCC    decode as ABC, then copy C to 4
                when playing C, play from 4, bring in next 3
        DEFF

This lets us have 14k of buffer, allowing 15.5k of compressed file.

Do we have the spare cycles for this?
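The buffer arithmetic above can be checked with a short sketch. This is only an illustration in Python, not part of the player: the names FREE_START, FREE_END, CHUNK, REGS and budget() are made up here, and reading 256 as "bytes per register per chunk" is an assumption about what the 256*3*14 expression means.

```python
# Memory budget sketch. All figures come from the text above; the
# names are hypothetical and exist only for this illustration.
FREE_START = 0x2000   # assumed start of free RAM (player fits in 4k)
FREE_END = 0x9600     # start of DOS3.3
REGS = 14             # one byte per register per 50Hz tick
CHUNK = 256           # assumed: bytes per register in one chunk

free = FREE_END - FREE_START          # $7600 bytes


def kib(n):
    """Convert bytes to k (1024-byte units)."""
    return n / 1024


def budget(chunks):
    """Return (decompress_bytes, bytes_left_for_compressed_file)."""
    buf = CHUNK * REGS * chunks
    return buf, free - buf


plans = {
    "single buffer (3 chunks)": budget(3),
    "double buffer (2 x 2 chunks)": budget(2 * 2),
    "3 in the room of 4": budget(4),
}

for name, (buf, left) in plans.items():
    print(f"{name:30s} buffer {kib(buf):4.1f}k  file {kib(left):4.1f}k")
```

The double-buffer plan and the "3 in the room of 4" plan reserve the same 14k, so both leave 15.5k for the compressed file. The difference is that the second plan decompresses three chunks at a time instead of two.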
Memory Map (not to scale)

        --------- $ffff
        | ROM/IO|
        --------- $c000
        |DOS3.3 |
        --------- $9600
        |       |
        |       |
        | FREE  |
        |       |
        |       |
        --------- $0c00
        |GR pg 1|
        --------- $0800
        |GR pg 0|
        --------- $0400
        |       |
        --------- $0200
        |stack  |
        --------- $0100
        |zero pg|
        --------- $0000

Sizes

                 time   ym5  KRW(3)  KRW(2)  Blocks(3)  Disk
                 ~~~~   ~~~  ~~~~~~  ~~~~~~  ~~~~~~~~~  ~~~~
KORO.KRW         0:54     ?    2740    3039         12    12
FIGHTING.KRW     1:40     ?    3086    3316         14    14
CAMOUFLAGE.KRW   1:32  1162    4054    4972         17    17
DEMO4.KRW        2:05  1393    4061    6336         17    17
SDEMO.KRW        2:12  1635    5266    7598         22    22
CHRISTMAS.KRW    1:32  1751    4975    5811         21    21
SPUTNIK.KRW      2:05  2164    8422   10779         34    34
DEATH2.KRW       2:27  2560    8064   10295         33    33
CRMOROS.KRW      1:29  2566    8045    9565         33    33
TECHNO.KRW       2:23  2630    8934   11126         36    36
WAVE.KRW         2:52  2655    8368   11318         34    34
LYRA2.KRW        3:04  2870    9826   14418         40    40
INTRO2.KRW       2:59  3217    9214    9294         37    37
ROBOT.KRW        1:26  3448    7724    8337         32    32
UNIVERSE.KRW     1:49  4320    9990   11225         41    41
NEURO.KRW        3:47  8681   22376   25168         89
AXELF.KRW       10:55  9692   47989   54420        189
                -----                                  -----
                30:29                                    423

Notes: my home-made songs don't have ym5 sizes as I don't have a working LHA encoder to make a real size.

Apple II disk file sizes: uses 256 byte blocks. Needs an extra for the catalog entry (and an additional for every X blocks used).

The Disk II / DOS3.3 can in theory hold 140k, but first 3 tracks are reserved for DOS (12k) and the Catalog track (4k) and the Hello program (512 bytes) and our chiptune player (4k), totalling 24.5k of overhead, with 115.5k free (462 blocks).

Interesting bugs that were hard to debug:

+ Bug in qkumba's LZ4 decoder, only happened when a copy-block size was exactly a multiple of 256, in which case it would copy an extra time.

+ Bug where the box-drawing was starting at 0 rather than at Y. Turns out I was padding the filename buffer with A0 but going one too far and it was writing A0 to the first byte of the hlin routine, and A0 is a LDY # instruction.
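The Blocks(3) column and the disk total can be reproduced from the KRW(3) sizes with a small sketch. It is written in Python purely as a check and is not part of the player. The names ON_DISK, NO_DISK_ENTRY and blocks() are made up for this example, the rule "round up to 256 byte blocks, plus one" is my reading of the note about the extra block, and the "additional for every X blocks" is left out since X is not given.

```python
import math

BLOCK = 256          # bytes per disk block, from the notes above
FREE_BLOCKS = 462    # free blocks quoted in the notes above

# KRW(3) sizes in bytes, copied from the table.
ON_DISK = {
    "KORO": 2740, "FIGHTING": 3086, "CAMOUFLAGE": 4054, "DEMO4": 4061,
    "SDEMO": 5266, "CHRISTMAS": 4975, "SPUTNIK": 8422, "DEATH2": 8064,
    "CRMOROS": 8045, "TECHNO": 8934, "WAVE": 8368, "LYRA2": 9826,
    "INTRO2": 9214, "ROBOT": 7724, "UNIVERSE": 9990,
}
# Songs with no entry in the Disk column.
NO_DISK_ENTRY = {"NEURO": 22376, "AXELF": 47989}


def blocks(size):
    """Data blocks rounded up, plus the one extra block from the notes.

    Assumption: the further 'additional for every X blocks' is ignored.
    """
    return math.ceil(size / BLOCK) + 1


used = sum(blocks(size) for size in ON_DISK.values())
spare = FREE_BLOCKS - used
print("used", used, "spare", spare)

for name, size in NO_DISK_ENTRY.items():
    need = blocks(size)
    print(name, need, "blocks, fits in spare space:", need <= spare)
```

Under these assumptions the fifteen songs with a Disk entry come to 423 blocks, which matches the total in the table and leaves 39 of the quoted 462 blocks spare.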