mirror of
https://github.com/deater/dos33fsprogs.git
synced 2024-11-19 12:32:35 +00:00
mode7: update README
This commit is contained in:
parent
4254cca36b
commit
29adca8ed4
@ -1,6 +1,331 @@
|
|||||||
Plan:
|
Challenges found writing an 8k Lores Apple II Demo
|
||||||
Load at $1000
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
Decompress to $2000
|
by DEATER (Vince Weaver, vince@deater.net)
|
||||||
|
|
||||||
|
http://www.deater.net/weave/vmwprod/mode7_demo/
|
||||||
|
====================================================
|
||||||
|
19 March 2018
|
||||||
|
|
||||||
|
GOAL:
|
||||||
|
~~~~~
|
||||||
|
This started out as some SNES style mode7 pseudo-3d graphics code
|
||||||
|
I came up with while working on my TF7 game. The graphics looked
|
||||||
|
pretty cool, so I started developing a demo around it.
|
||||||
|
|
||||||
|
The codesize ended up being roughly around 8kB, so I thought I'd
|
||||||
|
make it into an 8k demo. There aren't many out there for the Apple II.
|
||||||
|
and a Mockingboard sound card.
|
||||||
|
|
||||||
|
The demo tries to hit the lowest common denominator for Apple II systems,
|
||||||
|
so in theory you could have run this on an Apple II in 1977 if you
|
||||||
|
were rich enough to afford 48k of RAM. The Mockingboard sound wasn't
|
||||||
|
available until 1981, but still this all predates the Commodore 64.
|
||||||
|
|
||||||
|
USING:
|
||||||
|
~~~~~~
|
||||||
|
Boot disk on a real system, or emulator with Mockingboard support.
|
||||||
|
|
||||||
|
Applewin works fine (even under Wine on Linux).
|
||||||
|
MESS does too, it's harder to setup (ROMs) but the audio sounds clearer.
|
||||||
|
|
||||||
|
If you have no emulator you can try one of the online javascript ones.
|
||||||
|
https://www.scullinsteel.com/apple2/
|
||||||
|
|
||||||
|
|
||||||
|
Hardware:
|
||||||
|
~~~~~~~~~
|
||||||
|
The Apple II has a 6502 processor running at roughly 1.023MHz.
|
||||||
|
|
||||||
|
Early models only shipped with 4k of RAM, but later 48k, 64k, and 128k
|
||||||
|
systems were common.
|
||||||
|
|
||||||
|
The most common disk drive was the Disk II which typically held
|
||||||
|
140k of data (single-sided).
|
||||||
|
|
||||||
|
The only sound available was a bit-banged speaker. No timer,
|
||||||
|
if you wanted music you had to cycle-count via the CPU.
|
||||||
|
|
||||||
|
Later some sound cards were available. This demo uses the
|
||||||
|
Mockingboard which has dual AY-3-8910 sound chips. Each
|
||||||
|
chip provides 3 channels of square waves, with noise and
|
||||||
|
envelope effects available.
|
||||||
|
|
||||||
|
GRAPHICS
|
||||||
|
~~~~~~~~
|
||||||
|
|
||||||
|
The Apple II had nice graphics for its time, with this time being
|
||||||
|
around 1977. Otherwise it is quite limited.
|
||||||
|
Hardware Sprites? No
|
||||||
|
Linear framebuffer? No
|
||||||
|
User-defined charset? No
|
||||||
|
Blanking interrupts? No
|
||||||
|
Palette selection? No
|
||||||
|
Hardware scrolling? No
|
||||||
|
Hardware page flip? Yes
|
||||||
|
|
||||||
|
The hi-res graphics mode was a complex mess of NTSC hacks by Woz.
|
||||||
|
You got 280x192 graphics, with 6 colors available. However the colors
|
||||||
|
were from NTSC artifacts and there were limitations on which colors
|
||||||
|
could be next to each other (in blocks of 3.5 pixels) as well as
|
||||||
|
fringing. Also the addresses were interleaved, so not a linear
|
||||||
|
framebuffer. Hi-res page0 is at $2000 and page1 at $4000.
|
||||||
|
Optionally 4 lines of text can be shown at the bottom of the
|
||||||
|
screen instead of graphics.
|
||||||
|
|
||||||
|
The lo-res mode is a bit easier to use. It is 40x48 blocks
|
||||||
|
(40x40 if 4 lines of text are displayed at the bottom).
|
||||||
|
15 colors are available, though there is fringing at the edges.
|
||||||
|
Again the addresses are interleaved. Lo-res page0 is at $400
|
||||||
|
and page1 is at $800.
|
||||||
|
|
||||||
|
========================================
|
||||||
|
DETAILED STEP-BY-STEP REVIEW OF THE DEMO
|
||||||
|
========================================
|
||||||
|
|
||||||
|
BOOTLOADER
|
||||||
|
~~~~~~~~~~
|
||||||
|
A BASIC "HELLO" program loads the binary.
|
||||||
|
This just makes things auto-boot at startup, this doesn't count
|
||||||
|
towards the executable size, you could manually BRUN the 8k program
|
||||||
|
if you wanted.
|
||||||
|
|
||||||
|
The binary is loaded at $2000 (hi-res page0) and BASIC kicks into
|
||||||
|
HIRES mode before loading so you can watch as the memory is loaded
|
||||||
|
from disk in a seemingly random pattern.
|
||||||
|
|
||||||
|
Since this is an 8k demo, the entirety of the program is shown on
|
||||||
|
the screen (or would be if we POKEd the right address to turn off
|
||||||
|
the 4 lines of text on the bottom of the screen).
|
||||||
|
|
||||||
|
Execution starts at address $2000
|
||||||
|
|
||||||
|
DECOMPRESSER
|
||||||
|
~~~~~~~~~~~~
|
||||||
|
The binary is LZ4 encoded. The decompresser flips to HGR page 1 so
|
||||||
|
we can watch memory as the program is decompressed.
|
||||||
|
|
||||||
|
The LZ4 decompression code was written by qkumba (Peter Ferrie).
|
||||||
|
http://pferrie.host22.com/misc/appleii.htm
|
||||||
|
|
||||||
|
The actual program/data decompresses to around 22k starting at $4000.
|
||||||
|
It over-writes parts of DOS3.3, but since we won't be using the disk
|
||||||
|
anymore this isn't an issue.
|
||||||
|
|
||||||
|
At the top left corner of the screen you'll see the VMW triangles logo
|
||||||
|
as it decompresses. To do this I had to put the proper bit pattern
|
||||||
|
at $4000, $4400, $4800, and $4C00. I mean to have some words too
|
||||||
|
but ran out of disk space. The bit pattern at $4000 is executable
|
||||||
|
and is run as code.
|
||||||
|
|
||||||
|
Optimizing for code size inside of a compressed binary is a pain.
|
||||||
|
Removing instructions sometimes made the binary larger as it no longer
|
||||||
|
compressed as well. Long runs of values (such as 0 padding) are
|
||||||
|
essentially free. This was a difficult challenge.
|
||||||
|
|
||||||
|
FADE EFFECT
|
||||||
|
~~~~~~~~~~~
|
||||||
|
The title screen fades in from black.
|
||||||
|
|
||||||
|
This is a software hack, with a lookup table copying from an off-screen
|
||||||
|
buffer. The Apple II doesn't have any palette support.
|
||||||
|
|
||||||
|
TITLE SCREEN
|
||||||
|
~~~~~~~~~~~~
|
||||||
|
Once things are decompressed, we jump to $4000. We switch to low-res
|
||||||
|
mode for the rest of the DEMO.
|
||||||
|
|
||||||
|
A background image is loaded from disk. This is RLE encoded (probably
|
||||||
|
unnecessary when being further LZ4 encoded).
|
||||||
|
|
||||||
|
Why not just load the program at $400 and load the graphics image for
|
||||||
|
free? Well, remember the graphics are 40x48 (shared with the text).
|
||||||
|
Really it's 40x24, with each text char mapping to 4-bits top/bottom
|
||||||
|
for color. Do the math, we have 1k reserved for this mode but 40x24
|
||||||
|
is only 960 bytes. It turns out there are "holes" in the address range
|
||||||
|
that aren't displayed, and various pieces of hardware use these holes
|
||||||
|
as scratchpad memory. So if you just blindly uncompress graphics data
|
||||||
|
there you can corrupt the scratchpad. So you have to be careful
|
||||||
|
when uncompressing to skip the holes.
|
||||||
|
|
||||||
|
The title screen has scrolling text at the bottom. This is nothing fancy,
|
||||||
|
the text is in a buffer off screen and a 40x4 chunk of RAM is copied in
|
||||||
|
every so many cycles.
|
||||||
|
|
||||||
|
You might notice that there is tearing/jitter in the scrolling, even
|
||||||
|
though we are double-buffering the graphics. This is because there is
|
||||||
|
not a reliable cross-platform way to get the VBLANK info (especially
|
||||||
|
on older machines) so we are having some bad luck about when we flip
|
||||||
|
pages.
|
||||||
|
|
||||||
|
MOCKINGBOARD MUSIC
|
||||||
|
~~~~~~~~~~~~~~~~~~
|
||||||
|
I like chiptune music, especially that for AY-3-8910 based systems.
|
||||||
|
Before obtaining a Mockingboard I built a Raspberry Pi chiptune player
|
||||||
|
that is essentially the same hardware.
|
||||||
|
|
||||||
|
Most of my sound infrastructure involves YM5 files, which are often used
|
||||||
|
by ZX Spectrum and ATARI ST users. These are usually register dumps
|
||||||
|
taken typically at 50Hz. So to play them back you just have to interrupt
|
||||||
|
50 times a second and write the registers.
|
||||||
|
|
||||||
|
To program the Mockingboard, each AY-3-8910 chip has 14 sound related
|
||||||
|
registers that control the 3 channels. Each AY chip has a dedicated
|
||||||
|
VIA 6522 parallel I/O chip that handles the I/O.
|
||||||
|
|
||||||
|
Doing this quickly enough is a challenge on the Apple II. For each
|
||||||
|
register you have to do a handshake, set the register # and the value.
|
||||||
|
This can take upwards of 40 1MHz cycles per register.
|
||||||
|
|
||||||
|
For complex chiptune files (especially those written on an ST with much
|
||||||
|
faster hardware) it's sometimes not possible to get exact playback
|
||||||
|
due to the delay. Also one AY is on the left channel and one on the right
|
||||||
|
so you have to write both if you want sound from both speakers.
|
||||||
|
|
||||||
|
I have a whole suite of code for manipulating YM sound data, in my
|
||||||
|
vmw-meter git repository.
|
||||||
|
|
||||||
|
The first step for getting this to work is detecting if a mockingboard is
|
||||||
|
there. This can be in any slot 1-7 on the Apple II, though typically
|
||||||
|
Slot 4 is standard (in this demo we only check slot 4).
|
||||||
|
|
||||||
|
The board is initialized, and then one of the 6522 timers is set to
|
||||||
|
interrupt at 25Hz (it has to be an on-board timer as the default
|
||||||
|
Apple II has no timers).
|
||||||
|
|
||||||
|
Why 25Hz and not 50Hz? At 50Hz with 14 registers you use 700 bytes/s.
|
||||||
|
So a 2 minute song would take 84k of RAM, much more than is available.
|
||||||
|
|
||||||
|
For this demo I run at 25Hz, and also pack the 14 registers of the data
|
||||||
|
into 11 (there are various fields that are not packed well, we can
|
||||||
|
unpack at play time). Also I stripped out the envelope data as many
|
||||||
|
songs do not use it (so this is a lossy compression method).
|
||||||
|
|
||||||
|
Also, we keep track of the last values written last frame and only
|
||||||
|
write out to the board if things change, which helps with the latency
|
||||||
|
a bit.
|
||||||
|
|
||||||
|
The sound quality suffered a bit, but it's hard to fit a catchy chiptune
|
||||||
|
file in 8K.
|
||||||
|
|
||||||
|
The song being played is a stripped down and re-arranged version of
|
||||||
|
"Electric Wave" from CC'00 by EA (Ilya Abrosimov).
|
||||||
|
|
||||||
|
|
||||||
|
MODE7 BACKGROUND
|
||||||
|
~~~~~~~~~~~~~~~~
|
||||||
|
"MODE7" was a Super Nintendo (SNES) graphics mode that took a tiled
|
||||||
|
background and transformed it to look as if it was squashed out to
|
||||||
|
the horizon, giving a 3d look. The SNES did this in hardware, but
|
||||||
|
in this demo we do this in software.
|
||||||
|
|
||||||
|
As found on Wikipedia, the transform is of the type
|
||||||
|
|
||||||
|
[x'] = [a b]([x]-[x0])+[x0]
|
||||||
|
[y'] [c d]([y] [y0]) [y0]
|
||||||
|
|
||||||
|
For our code, we managed to reduce things to a small number of additions
|
||||||
|
and subtractions for each pixel on the screen. Of course the 6502 can't
|
||||||
|
do floating point, so we do fixed point math. We convert as much as we
|
||||||
|
can to table lookups that are pre-calculated. We also make liberal use
|
||||||
|
of self-modifying code.
|
||||||
|
|
||||||
|
Despite all of this there are still some cases where we have to do a
|
||||||
|
16bit x 16bit = 32bit multiply, something that is *really* slow on 6502,
|
||||||
|
around 700 cycles (for a 8.8 x 8.8 fixed point multiply).
|
||||||
|
|
||||||
|
To make this faster we use a method described by Stephen Judd.
|
||||||
|
|
||||||
|
The key to note is that (a+b)^2 = a^2+2ab+b^2 and (a-b)^2=a^2-2ab+b^2
|
||||||
|
and if you add them you can simplify to:
|
||||||
|
(a+b)^2 (a-b)^2
|
||||||
|
a*b = --------- - -------
|
||||||
|
4 4
|
||||||
|
This is you have a table of squares from 0..511 (all 8-bit a+b and a-b
|
||||||
|
will fall in this range) then you can convert a multiply into a table
|
||||||
|
lookup plus a subtract.
|
||||||
|
|
||||||
|
The downsize is you will need 2kB of squares lookup tables (which can
|
||||||
|
be generated at startup). This reduces the multiply cost to the order
|
||||||
|
of 200 to 250 cycles.
|
||||||
|
|
||||||
|
By using the fast multiply and a lot of careful optimization you can
|
||||||
|
generate a Mode7 background in 40x40 graphics mode at about 5 frames/second.
|
||||||
|
|
||||||
|
The engine can be parameterized with different tilesets to use, which we
|
||||||
|
do to provide both a black+white checkerboard background, as well as the
|
||||||
|
island background from the TFV game.
|
||||||
|
|
||||||
|
BOUNCING BALL ON CHECKERBOARD
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
What would a demo be without some sort of bouncing geometric shape.
|
||||||
|
|
||||||
|
This is just done with 16 sprites. The sphere was modeled in OpenGL
|
||||||
|
from a 2000-era game-engine that I never finished. I then took screenshots
|
||||||
|
and then reduced the size/color to an appropriate value.
|
||||||
|
|
||||||
|
The shadow is also just sprites.
|
||||||
|
|
||||||
|
The clicking noise on bounce is just touching the speaker at $C030.
|
||||||
|
It's mostly there to give some sound effects for those playing the demo
|
||||||
|
without a mockingboard.
|
||||||
|
|
||||||
|
TFV SPACESHIP FLYING
|
||||||
|
~~~~~~~~~~~~~~~~~~~~
|
||||||
|
The spaceship, water splash, and shadows are all sprites. This is all
|
||||||
|
done in software, the Apple II has no sprite hardware.
|
||||||
|
|
||||||
|
This is the TFV game engine flying-spaceship code, with the keyboard
|
||||||
|
routines replaced to read from memory instead (sort of like a script
|
||||||
|
of what to do when).
|
||||||
|
|
||||||
|
STARFIELD
|
||||||
|
~~~~~~~~~
|
||||||
|
The starfield is your typical starfield code. Only 16 stars are modeled.
|
||||||
|
It re-uses the fast-multiply code from the mode7 graphics.
|
||||||
|
|
||||||
|
Random number generation is not fast on the 6502, so we cheat.
|
||||||
|
Originally we had a 256-byte blob of "random" values generated earlier.
|
||||||
|
|
||||||
|
This wasted space, so now instead we just treat the executable code
|
||||||
|
at $5000 as if it were a block of random numbers. This was arbitrarily
|
||||||
|
chosen, I tried different areas of memory until I got one where the
|
||||||
|
stars seemed to move in a pleasing pattern.
|
||||||
|
|
||||||
|
A simple state machine controls if the stars move or not, whether the
|
||||||
|
background is cleared or not (the streak effect) and what color the
|
||||||
|
background is (for the blue flash).
|
||||||
|
|
||||||
|
The ship moving to the distance is just done with different sized sprites.
|
||||||
|
|
||||||
|
RASTERBARS/CREDITS
|
||||||
|
~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
The credits happen with the starfield continuing to run.
|
||||||
|
|
||||||
|
The text is written in the bottom 4 lines of the screen. Some inverse-mode
|
||||||
|
space characters are used to try to make it look like graphics are surrounding
|
||||||
|
the text. It's actually possible with careful cycle counting to switch
|
||||||
|
modes fast enough to have actual mixed graphics/text (See the FrenchTouch
|
||||||
|
demos) but I was too lazy to attempt that here.
|
||||||
|
|
||||||
|
The rasterbar effect isn't really rasterbars, it's just a rainbow assortment
|
||||||
|
of lines being drawn with a SINEWAVE lookup table.
|
||||||
|
|
||||||
|
It's the same rasterbar code from my chiptune player demo. I ended up
|
||||||
|
optimizing it a lot via inlining and a few other ways because it turned
|
||||||
|
out just drawing a horizontal line can take a very long time.
|
||||||
|
|
||||||
|
The rotating text is just taking the output string and rapidly rotating the
|
||||||
|
character values through the ASCII table.
|
||||||
|
|
||||||
|
The annoying clicking noise is the same speaker effect caused by hitting
|
||||||
|
$C030.
|
||||||
|
|
||||||
|
Choosing who to thank ended up being extremely critical to fitting in 8kB,
|
||||||
|
as unique text strings do not compress well. I'm also still not satisfied
|
||||||
|
with how the centering looks.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Memory Map
|
Memory Map
|
||||||
==========
|
==========
|
||||||
@ -35,51 +360,3 @@ Memory Map
|
|||||||
|zero pg | 0.25
|
|zero pg | 0.25
|
||||||
------- $0000
|
------- $0000
|
||||||
|
|
||||||
=============================================
|
|
||||||
Getting the VMW logo to appear on page2 HGR
|
|
||||||
==============================================
|
|
||||||
|
|
||||||
; Need to have lines at
|
|
||||||
; $4000 AA,AD,D5,AC,95
|
|
||||||
; $4400 A8,D5,95,35,85 1k
|
|
||||||
; $4800 A0,55,26,55,81 2k
|
|
||||||
; $4C00 00,00,00,00,00 3k
|
|
||||||
|
|
||||||
|
|
||||||
MAIN: 0000 - 013A = 0x13A = 314
|
|
||||||
.include "deater.scrolltext" 13DF - 1577 = 0x198 = 408
|
|
||||||
.include "a2.scrolltext" 1577 - 1695 = 0x11E = 286
|
|
||||||
=============
|
|
||||||
1008
|
|
||||||
|
|
||||||
.include "starfield_demo.s" 1695 - 19Ac = 0x317 = 791
|
|
||||||
.include "rasterbars.s" 19AC - 1A9E = 0xF2 = 242
|
|
||||||
=============
|
|
||||||
1033
|
|
||||||
|
|
||||||
.include "../asm_routines/gr_fast_clear.s" 01B6 - 02A0 = 0xEA = 234
|
|
||||||
.include "credits.s" 1A9E - 1CEA = 0x257 = 599
|
|
||||||
.include "interrupt_handler.s" 1CEA - 1DE3 = 0xD9 = 217
|
|
||||||
===================
|
|
||||||
1050
|
|
||||||
3D (61) too many, want 173
|
|
||||||
|
|
||||||
|
|
||||||
.include "../asm_routines/gr_unrle.s" 013A - 01B6
|
|
||||||
|
|
||||||
.include "../asm_routines/gr_hlin.s" 02A0 - 02FD
|
|
||||||
.include "../asm_routines/gr_setpage.s" 02FD - 0311
|
|
||||||
.include "../asm_routines/pageflip.s" 0311 - 032B
|
|
||||||
.include "../asm_routines/gr_fade.s" 032B - 0459
|
|
||||||
.include "../asm_routines/gr_copy.s" 0459 - 0491
|
|
||||||
.include "../asm_routines/gr_scroll.s" 0491 - 0565 = 0xC5 = 197
|
|
||||||
.include "../asm_routines/gr_offsets.s" 0565 - 0595
|
|
||||||
.include "../asm_routines/gr_plot.s" 0595 - 05C7
|
|
||||||
.include "../asm_routines/text_print.s" 05C7 - 060F
|
|
||||||
|
|
||||||
.include "../asm_routines/mockingboard_a.s" 060F - 06BC = 0xAD = 173
|
|
||||||
|
|
||||||
.include "mode7.s" 06BC - 1201 = 0xB43 = 2883
|
|
||||||
|
|
||||||
.include "mode7_demo_backgrounds.inc" 1201 - 13DF = 0x1DE = 478
|
|
||||||
|
|
||||||
|
Loading…
Reference in New Issue
Block a user