mirror of https://github.com/deater/dos33fsprogs.git (synced 2024-11-19 12:32:35 +00:00)
mode7: update README
parent 4254cca36b
commit 29adca8ed4
@ -1,6 +1,331 @@
Plan:
Load at $1000
Decompress to $2000
Challenges found writing an 8k Lores Apple II Demo
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
by DEATER (Vince Weaver, vince@deater.net)

http://www.deater.net/weave/vmwprod/mode7_demo/
====================================================
19 March 2018

GOAL:
~~~~~
This started out as some SNES-style mode7 pseudo-3d graphics code
I came up with while working on my TFV game. The graphics looked
pretty cool, so I started developing a demo around it.

The code size ended up being roughly 8kB, so I thought I'd
make it into an 8k demo. There aren't many out there for the Apple II.
It targets a machine with 48k of RAM and a Mockingboard sound card.

The demo tries to hit the lowest common denominator for Apple II systems,
so in theory you could have run this on an Apple II in 1977 if you
were rich enough to afford 48k of RAM. The Mockingboard sound wasn't
available until 1981, but still this all predates the Commodore 64.

USING:
~~~~~~
Boot the disk on a real system, or on an emulator with Mockingboard support.

AppleWin works fine (even under Wine on Linux).
MESS works too; it's harder to set up (it needs ROMs) but the audio sounds clearer.

If you have no emulator you can try one of the online JavaScript ones:
https://www.scullinsteel.com/apple2/


Hardware:
~~~~~~~~~
The Apple II has a 6502 processor running at roughly 1.023MHz.

Early models shipped with as little as 4k of RAM, but later 48k, 64k, and 128k
systems were common.

The most common disk drive was the Disk II, which typically held
140k of data (single-sided).

The only built-in sound was a bit-banged speaker. There was no timer,
so if you wanted music you had to cycle-count on the CPU.

Later some sound cards became available. This demo uses the
Mockingboard, which has dual AY-3-8910 sound chips. Each
chip provides 3 channels of square waves, with noise and
envelope effects available.

GRAPHICS
~~~~~~~~

The Apple II had nice graphics for its time, with this time being
around 1977. Otherwise it is quite limited.
    Hardware Sprites?      No
    Linear framebuffer?    No
    User-defined charset?  No
    Blanking interrupts?   No
    Palette selection?     No
    Hardware scrolling?    No
    Hardware page flip?    Yes

The hi-res graphics mode was a complex mess of NTSC hacks by Woz.
You got 280x192 graphics, with 6 colors available. However, the colors
came from NTSC artifacts and there were limitations on which colors
could be next to each other (in blocks of 3.5 pixels) as well as
fringing. Also the addresses were interleaved, so it's not a linear
framebuffer. Hi-res page0 is at $2000 and page1 at $4000.
Optionally 4 lines of text can be shown at the bottom of the
screen instead of graphics.

The lo-res mode is a bit easier to use. It is 40x48 blocks
(40x40 if 4 lines of text are displayed at the bottom).
15 colors are available, though there is fringing at the edges.
Again the addresses are interleaved. Lo-res page0 is at $400
and page1 is at $800.
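
To make the lo-res interleaving concrete, here is a small Python sketch of the standard Apple II text/lo-res address calculation. The function names are made up for illustration and are not taken from the demo source; the demo itself does this in 6502 assembly, most likely with a lookup table.

```python
def lores_row_address(text_row, page_base=0x400):
    """Start address of one 40-byte text row (0..23) on a lo-res page."""
    if not 0 <= text_row < 24:
        raise ValueError("text_row must be 0..23")
    # Rows are interleaved: 8 groups of 128 bytes, 3 rows of 40 bytes in each.
    return page_base + 0x80 * (text_row % 8) + 0x28 * (text_row // 8)


def lores_block(x, y, page_base=0x400):
    """Return (address, uses_high_nibble) for block x (0..39), y (0..47)."""
    if not (0 <= x < 40 and 0 <= y < 48):
        raise ValueError("block coordinates out of range")
    # Two blocks share one byte: the top block is the low nibble,
    # the bottom block is the high nibble.
    address = lores_row_address(y // 2, page_base) + x
    return address, bool(y & 1)
```

For example, `lores_row_address(1)` is $480 rather than $428, which is why a plain linear copy does not draw what you expect.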

========================================
DETAILED STEP-BY-STEP REVIEW OF THE DEMO
========================================

BOOTLOADER
~~~~~~~~~~
A BASIC "HELLO" program loads the binary.
This just makes things auto-boot at startup; it doesn't count
towards the executable size, since you could manually BRUN the 8k program
if you wanted.

The binary is loaded at $2000 (hi-res page0) and BASIC kicks into
HIRES mode before loading, so you can watch as the memory is loaded
from disk in a seemingly random pattern.

Since this is an 8k demo, the entirety of the program is shown on
the screen (or would be if we POKEd the right address to turn off
the 4 lines of text on the bottom of the screen).

Execution starts at address $2000.

DECOMPRESSOR
~~~~~~~~~~~~
The binary is LZ4 encoded. The decompressor flips to hi-res page1 ($4000) so
we can watch memory as the program is decompressed.

The LZ4 decompression code was written by qkumba (Peter Ferrie).
http://pferrie.host22.com/misc/appleii.htm

The actual program/data decompresses to around 22k starting at $4000.
It overwrites parts of DOS3.3, but since we won't be using the disk
anymore this isn't an issue.

At the top left corner of the screen you'll see the VMW triangles logo
as it decompresses. To do this I had to put the proper bit pattern
at $4000, $4400, $4800, and $4C00. I meant to have some words too
but ran out of disk space. The bit pattern at $4000 is executable
and is run as code.
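
Why those four addresses? They are the start of the top four scanlines of the hi-res page at $4000. A minimal Python sketch of the standard hi-res row interleave (the function name is made up for illustration):

```python
def hires_row_address(y, page_base=0x4000):
    """Start address of scanline y (0..191) on a hi-res page."""
    if not 0 <= y < 192:
        raise ValueError("y must be 0..191")
    # Consecutive scanlines are $400 apart within a group of 8 lines,
    # groups of 8 lines are $80 apart, and each third of the screen
    # is offset by $28.
    return (page_base
            + 0x400 * (y % 8)
            + 0x80 * ((y // 8) % 8)
            + 0x28 * (y // 64))


# The first four scanlines, where the logo bytes have to live.
logo_rows = [hires_row_address(y) for y in range(4)]
```

Here `logo_rows` comes out as $4000, $4400, $4800, $4C00, matching the addresses above.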

Optimizing for code size inside of a compressed binary is a pain.
Removing instructions sometimes made the binary larger, as it no longer
compressed as well. Long runs of values (such as 0 padding) are
essentially free. This was a difficult challenge.

FADE EFFECT
~~~~~~~~~~~
The title screen fades in from black.

This is a software hack, with a lookup table used while copying from an
off-screen buffer. The Apple II doesn't have any palette support.
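
A rough Python sketch of how a lookup-table fade like this could work. The `DARKER` table is entirely hypothetical (the demo's real table is not given here), and the function names are made up; the only solid assumption is that each lo-res byte holds two 4-bit colors.

```python
# HYPOTHETICAL "one step darker" mapping for the 16 lo-res colors.
# Every color reaches 0 (black) after at most 3 steps.
DARKER = [0, 0, 0, 2, 0, 0, 2, 6, 0, 1, 5, 9, 4, 8, 6, 10]


def fade_byte(value, table=DARKER):
    """Darken both 4-bit colors packed into one lo-res byte."""
    low = table[value & 0x0F]
    high = table[value >> 4]
    return (high << 4) | low


def fade_copy(offscreen, steps, table=DARKER):
    """Return a copy of the off-screen image darkened `steps` times."""
    out = bytes(offscreen)
    for _ in range(steps):
        out = bytes(fade_byte(b, table) for b in out)
    return out
```

A fade-in from black would then display `fade_copy(image, 3)`, then 2, 1, and finally 0 steps.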

TITLE SCREEN
~~~~~~~~~~~~
Once things are decompressed, we jump to $4000. We switch to lo-res
mode for the rest of the demo.

A background image is loaded from disk. This is RLE encoded (probably
unnecessary when being further LZ4 encoded).

Why not just load the program at $400 and load the graphics image for
free? Well, remember the graphics are 40x48 (shared with the text).
Really it's 40x24, with each text char mapping to 4-bits top/bottom
for color. Do the math: we have 1k reserved for this mode but 40x24
is only 960 bytes. It turns out there are "holes" in the address range
that aren't displayed, and various pieces of hardware use these holes
as scratchpad memory. So if you just blindly uncompress graphics data
there you can corrupt the scratchpad. You have to be careful
when uncompressing to skip the holes.
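
The 64 missing bytes (1024 - 960) are the last 8 bytes of each 128-byte chunk of the page. A Python sketch of where the holes are and how a copy can avoid them; the function names are made up, and the assumption that the source data is stored in memory order with the holes left out is mine, not necessarily how the demo's RLE data is laid out.

```python
def text_page_holes(page_base=0x400):
    """List the 64 addresses in a 1k text/lo-res page that are never displayed."""
    holes = []
    for chunk in range(8):
        # Each 128-byte chunk holds 3 rows x 40 bytes = 120 visible bytes.
        start = page_base + 0x80 * chunk + 0x78
        holes.extend(range(start, start + 8))
    return holes


def copy_skipping_holes(memory, data, page_base=0x400):
    """Copy 960 visible bytes into the page, leaving the holes untouched.

    ASSUMPTION: `data` is in memory order with the holes omitted.
    """
    if len(data) != 960:
        raise ValueError("expected exactly 960 bytes")
    for chunk in range(8):
        dst = page_base + 0x80 * chunk
        memory[dst:dst + 120] = data[120 * chunk:120 * (chunk + 1)]
```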

The title screen has scrolling text at the bottom. This is nothing fancy:
the text is in a buffer off screen and a 40x4 chunk of RAM is copied in
every so many cycles.

You might notice that there is tearing/jitter in the scrolling, even
though we are double-buffering the graphics. This is because there is
not a reliable cross-platform way to get the VBLANK info (especially
on older machines), so we are having some bad luck about when we flip
pages.

MOCKINGBOARD MUSIC
~~~~~~~~~~~~~~~~~~
I like chiptune music, especially that for AY-3-8910 based systems.
Before obtaining a Mockingboard I built a Raspberry Pi chiptune player
that is essentially the same hardware.

Most of my sound infrastructure involves YM5 files, which are often used
by ZX Spectrum and Atari ST users. These are usually register dumps,
typically taken at 50Hz. So to play them back you just have to interrupt
50 times a second and write the registers.

To program the Mockingboard, each AY-3-8910 chip has 14 sound-related
registers that control the 3 channels. Each AY chip has a dedicated
VIA 6522 parallel I/O chip that handles the I/O.

Doing this quickly enough is a challenge on the Apple II. For each
register you have to do a handshake, then set the register # and the value.
This can take upwards of 40 1MHz cycles per register.

For complex chiptune files (especially those written on an ST with much
faster hardware) it's sometimes not possible to get exact playback
due to the delay. Also one AY is on the left channel and one on the right,
so you have to write both if you want sound from both speakers.

I have a whole suite of code for manipulating YM sound data in my
vmw-meter git repository.

The first step for getting this to work is detecting if a Mockingboard is
there. This can be in any slot 1-7 on the Apple II, though typically
Slot 4 is standard (in this demo we only check slot 4).

The board is initialized, and then one of the 6522 timers is set to
interrupt at 25Hz (it has to be an on-board timer as the default
Apple II has no timers).
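
Some back-of-the-envelope numbers for the timer and the register writes, as a Python sketch. It uses the rounded 1.023MHz clock from the hardware section and the "upwards of 40 cycles per register" figure from above; the function names are made up, and the exact value loaded into the 6522 latch in the demo is not stated here.

```python
CPU_HZ = 1_023_000  # "roughly 1.023MHz", rounded


def cycles_per_interrupt(rate_hz, cpu_hz=CPU_HZ):
    """CPU cycles between timer interrupts at the given rate."""
    return cpu_hz // rate_hz


def worst_case_write_cycles(registers=14, cycles_per_register=40, chips=2):
    """Cycles to rewrite every register on both AY chips in one frame."""
    return registers * cycles_per_register * chips
```

At 25Hz this gives 40920 cycles between interrupts, which still fits in a 16-bit 6522 timer, and a full rewrite of both chips costs about 1120 of those cycles.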

Why 25Hz and not 50Hz? At 50Hz with 14 registers you use 700 bytes/s.
So a 2 minute song would take 84k of RAM, much more than is available.

For this demo I run at 25Hz, and also pack the 14 registers of data
into 11 bytes (various fields are not packed well, and we can
unpack at play time). Also I stripped out the envelope data, as many
songs do not use it (so this is a lossy compression method).
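
The arithmetic above, as a tiny Python sketch (the function name is made up for illustration). These are raw sizes before any LZ4 compression.

```python
def song_bytes(rate_hz, bytes_per_frame, seconds):
    """Raw size of a register-dump song, before any compression."""
    return rate_hz * bytes_per_frame * seconds


full_rate = song_bytes(50, 14, 120)   # 14 registers at 50Hz for 2 minutes
demo_rate = song_bytes(25, 11, 120)   # packed 11-byte frames at 25Hz
```

That is 84000 bytes versus 33000 bytes for the same 2 minute song.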

Also, we keep track of the values written last frame and only
write out to the board if things change, which helps with the latency
a bit.
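
A sketch of the "only write what changed" idea in Python. `write_register` stands in for the slow handshake with the 6522/AY pair; it and the other names here are hypothetical, not the demo's routine names.

```python
def write_changed(previous, current, write_register):
    """Write only the registers whose value differs from last frame.

    `previous` is updated in place; returns the number of writes done.
    """
    writes = 0
    for reg, value in enumerate(current):
        if previous[reg] != value:
            write_register(reg, value)  # the expensive part on real hardware
            previous[reg] = value
            writes += 1
    return writes
```

Starting `previous` as a list of `None` forces every register to be written on the first frame.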

The sound quality suffered a bit, but it's hard to fit a catchy chiptune
file in 8K.

The song being played is a stripped-down and re-arranged version of
"Electric Wave" from CC'00 by EA (Ilya Abrosimov).


MODE7 BACKGROUND
~~~~~~~~~~~~~~~~
"MODE7" was a Super Nintendo (SNES) graphics mode that took a tiled
background and transformed it to look as if it was squashed out to
the horizon, giving a 3d look. The SNES did this in hardware, but
in this demo we do this in software.

As found on Wikipedia, the transform is of the type

    [x'] = [a b]([x]-[x0])+[x0]
    [y']   [c d]([y] [y0]) [y0]
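
Written out per coordinate, the matrix form above is just two multiply-add expressions. A direct (and deliberately unoptimized) Python sketch, with a made-up function name; the demo's actual 6502 code restructures this heavily.

```python
def mode7_transform(x, y, a, b, c, d, x0, y0):
    """Apply the 2x2 matrix [a b; c d] around the center point (x0, y0)."""
    dx, dy = x - x0, y - y0
    return (a * dx + b * dy + x0,
            c * dx + d * dy + y0)
```

With a=d=1 and b=c=0 every point maps to itself; a=d=2 scales by two around (x0, y0).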

For our code, we managed to reduce things to a small number of additions
and subtractions for each pixel on the screen. Of course the 6502 can't
do floating point, so we do fixed point math. We convert as much as we
can to table lookups that are pre-calculated. We also make liberal use
of self-modifying code.
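
For readers who haven't used fixed point: an 8.8 value is a 16-bit number where the high byte is the integer part and the low byte is the fraction in 1/256ths. A small Python sketch, unsigned only to keep it short; the helper names are made up and whether the demo's values are signed is not covered here.

```python
def to_fixed(value):
    """Convert a non-negative number to unsigned 8.8 fixed point."""
    return int(round(value * 256)) & 0xFFFF


def to_float(fixed):
    """Convert unsigned 8.8 fixed point back to a float."""
    return fixed / 256


def fixed_mul(a, b):
    """Multiply two unsigned 8.8 values.

    The full product is 16.16; dropping the low 8 bits and keeping the
    next 16 gives an 8.8 result again.
    """
    return ((a * b) >> 8) & 0xFFFF
```

Stepping across a scanline then becomes repeated 16-bit adds: adding `to_fixed(0.25)` four times moves exactly one whole tile.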

Despite all of this there are still some cases where we have to do a
16bit x 16bit = 32bit multiply, something that is *really* slow on the 6502,
around 700 cycles (for an 8.8 x 8.8 fixed point multiply).

To make this faster we use a method described by Stephen Judd.

The key to note is that (a+b)^2 = a^2+2ab+b^2 and (a-b)^2 = a^2-2ab+b^2,
and if you subtract the second from the first you can simplify to:
          (a+b)^2   (a-b)^2
    a*b = ------- - -------
             4         4
This means that if you have a table of squares from 0..511 (for 8-bit
values, a+b and |a-b| always fall in this range) then you can convert
each 8-bit multiply into two table lookups plus a subtract.

The downside is you will need 2kB of squares lookup tables (which can
be generated at startup). This reduces the multiply cost to the order
of 200 to 250 cycles.
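
A Python sketch of the quarter-square trick, to show that it is exact even though the table holds rounded-down values. The assumption here is that the table stores floor(n^2/4); how the demo splits its 2kB of tables is not shown. Names are made up for illustration.

```python
# floor(n*n/4) for n = 0..511, generated once at startup.
QUARTER_SQUARES = [(n * n) // 4 for n in range(512)]


def mul8(a, b):
    """Unsigned 8-bit x 8-bit multiply using two lookups and a subtract.

    a+b and a-b are either both even or both odd, so the rounding in
    the two table entries cancels and the result is exact.
    """
    return QUARTER_SQUARES[a + b] - QUARTER_SQUARES[abs(a - b)]


def mul16(a, b):
    """Unsigned 16-bit x 16-bit = 32-bit multiply from four 8-bit products."""
    a_hi, a_lo = a >> 8, a & 0xFF
    b_hi, b_lo = b >> 8, b & 0xFF
    return ((mul8(a_hi, b_hi) << 16)
            + ((mul8(a_hi, b_lo) + mul8(a_lo, b_hi)) << 8)
            + mul8(a_lo, b_lo))
```

For an 8.8 fixed point multiply you would then keep the middle 16 bits of the `mul16` result.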

By using the fast multiply and a lot of careful optimization you can
generate a Mode7 background in 40x40 graphics mode at about 5 frames/second.

The engine can be parameterized with different tilesets to use, which we
do to provide both a black+white checkerboard background, as well as the
island background from the TFV game.

BOUNCING BALL ON CHECKERBOARD
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What would a demo be without some sort of bouncing geometric shape?

This is just done with 16 sprites. The sphere was modeled in OpenGL
from a 2000-era game engine that I never finished. I then took screenshots
and reduced the size/color to an appropriate value.

The shadow is also just sprites.

The clicking noise on bounce is just touching the speaker at $C030.
It's mostly there to give some sound effects for those playing the demo
without a Mockingboard.

TFV SPACESHIP FLYING
~~~~~~~~~~~~~~~~~~~~
The spaceship, water splash, and shadows are all sprites. This is all
done in software; the Apple II has no sprite hardware.

This is the TFV game engine flying-spaceship code, with the keyboard
routines replaced to read from memory instead (sort of like a script
of what to do when).

STARFIELD
~~~~~~~~~
The starfield is your typical starfield code. Only 16 stars are modeled.
It re-uses the fast-multiply code from the mode7 graphics.

Random number generation is not fast on the 6502, so we cheat.
Originally we had a 256-byte blob of "random" values generated earlier.

This wasted space, so now instead we just treat the executable code
at $5000 as if it were a block of random numbers. This was arbitrarily
chosen: I tried different areas of memory until I got one where the
stars seemed to move in a pleasing pattern.
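
The idea, sketched in Python: any block of bytes can serve as a cheap table of "random" values if you just walk through it. The class name and the sample bytes are made up for illustration; how the demo actually indexes the memory at $5000 is not described here.

```python
class TableRandom:
    """Cheap pseudo-random bytes: step through an existing block of memory."""

    def __init__(self, table):
        if len(table) == 0:
            raise ValueError("table must not be empty")
        self.table = bytes(table)
        self.index = 0

    def next_byte(self):
        value = self.table[self.index]
        # Wrap around at the end of the block.
        self.index = (self.index + 1) % len(self.table)
        return value


# Stand-in bytes; in the demo the "table" is whatever code sits at $5000.
rng = TableRandom(b"\xA9\x00\x8D\x30\xC0")
first_six = [rng.next_byte() for _ in range(6)]
```

The values are obviously not random at all, which is why picking the block was a matter of trying a few and keeping the one that looked best.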

A simple state machine controls if the stars move or not, whether the
background is cleared or not (the streak effect) and what color the
background is (for the blue flash).

The ship moving to the distance is just done with different sized sprites.

RASTERBARS/CREDITS
~~~~~~~~~~~~~~~~~~

The credits happen with the starfield continuing to run.

The text is written in the bottom 4 lines of the screen. Some inverse-mode
space characters are used to try to make it look like graphics are surrounding
the text. It's actually possible with careful cycle counting to switch
modes fast enough to have actual mixed graphics/text (see the FrenchTouch
demos) but I was too lazy to attempt that here.

The rasterbar effect isn't really rasterbars, it's just a rainbow assortment
of lines being drawn with a SINEWAVE lookup table.

It's the same rasterbar code from my chiptune player demo. I ended up
optimizing it a lot via inlining and a few other ways because it turned
out just drawing a horizontal line can take a very long time.

The rotating text is just taking the output string and rapidly rotating the
character values through the ASCII table.

The annoying clicking noise is the same speaker effect caused by hitting
$C030.

Choosing who to thank ended up being extremely critical to fitting in 8kB,
as unique text strings do not compress well. I'm also still not satisfied
with how the centering looks.



Memory Map
==========

@ -35,51 +360,3 @@ Memory Map
|zero pg | 0.25
------- $0000

=============================================
Getting the VMW logo to appear on page2 HGR
=============================================

; Need to have lines at
; $4000 AA,AD,D5,AC,95
; $4400 A8,D5,95,35,85 1k
; $4800 A0,55,26,55,81 2k
; $4C00 00,00,00,00,00 3k


MAIN:                                        0000 - 013A = 0x13A = 314
.include "deater.scrolltext"                 13DF - 1577 = 0x198 = 408
.include "a2.scrolltext"                     1577 - 1695 = 0x11E = 286
                                             =============
                                             1008

.include "starfield_demo.s"                  1695 - 19AC = 0x317 = 791
.include "rasterbars.s"                      19AC - 1A9E = 0xF2 = 242
                                             =============
                                             1033

.include "../asm_routines/gr_fast_clear.s"   01B6 - 02A0 = 0xEA = 234
.include "credits.s"                         1A9E - 1CEA = 0x257 = 599
.include "interrupt_handler.s"               1CEA - 1DE3 = 0xD9 = 217
                                             ===================
                                             1050
                                             3D (61) too many, want 173


.include "../asm_routines/gr_unrle.s"        013A - 01B6

.include "../asm_routines/gr_hlin.s"         02A0 - 02FD
.include "../asm_routines/gr_setpage.s"      02FD - 0311
.include "../asm_routines/pageflip.s"        0311 - 032B
.include "../asm_routines/gr_fade.s"         032B - 0459
.include "../asm_routines/gr_copy.s"         0459 - 0491
.include "../asm_routines/gr_scroll.s"       0491 - 0565 = 0xC5 = 197
.include "../asm_routines/gr_offsets.s"      0565 - 0595
.include "../asm_routines/gr_plot.s"         0595 - 05C7
.include "../asm_routines/text_print.s"      05C7 - 060F

.include "../asm_routines/mockingboard_a.s"  060F - 06BC = 0xAD = 173

.include "mode7.s"                           06BC - 1201 = 0xB43 = 2883

.include "mode7_demo_backgrounds.inc"        1201 - 13DF = 0x1DE = 478