=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
tiny_hgr8
an 8-byte hi-res Apple II demo

by Deater / dSr

Lovebyte 2023
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

TLDR: I wrote an Apple II graphics
demo that's only 8 bytes of
6502 assembly language

LINK:
https://youtu.be/8QYezzXC9PA

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

I really wanted to make a hi-res 8-byte
demo but that is trickier than you
might think.

=== THE CHALLENGE ===

The Apple II has a 6502 processor.

To enable hi-res graphics you need three
bytes, typically a jump to the HGR
routine in the Applesoft BASIC ROM:
        JSR HGR

The HGR routine will flip the proper
soft-switches to enable graphics mode,
enable split graphics/text mode, select
viewing the 8K of graphics info in PAGE1
and then clear the screen to black.
(The nearby HGR2 call is similar but
makes the graphics full-screen and
uses PAGE2 instead).

Once you set hi-res mode you still
need to draw some graphics. It is
hard to do this compactly. The most
obvious way is the ROM HPLOT call,
but this depends on the A, X, and Y
registers holding the screen
co-ordinates as well as the desired
color being set up at a zero page
location.

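As a rough sketch of why HPLOT is not
compact, the listing below plots a
single pixel. The addresses used here
(HGR_COLOR at $E4, HPLOT0 at $F457) and
the register convention (A = Y
co-ordinate, X/Y = low/high byte of the
X co-ordinate) are assumptions taken
from commonly published Applesoft ROM
listings, so check them before relying
on them.

```asm
; sketch only, addresses are assumptions
  LDA #$FF   ; A9 FF    color bits
  STA $E4    ; 85 E4    HGR_COLOR
  LDA #$50   ; A9 50    Y co-ord
  LDX #$10   ; A2 10    X co-ord low
  LDY #$00   ; A0 00    X co-ord high
  JSR $F457  ; 20 57 F4 HPLOT0
```

That is 13 bytes for one pixel, before
any loop or hi-res init.
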
When I create 16-byte demos I often
use the built-in ROM vector drawing
shapetable/XDRAW functionality which
avoids the need for color setting
because it just XORs pixels.
However you still usually need to
call the HPOSN routine to set up
the co-ordinate values in the zero
page such as GBASL/GBASH. The default
values from uninitialized RAM at boot
usually aren't useful.

You can try drawing directly to screen
memory at addresses $2000* (PAGE1)
or $4000 (PAGE2), but that takes
3 bytes and if you want to draw to
all 8K of the screen you need to
have a way to increment a 16-bit
pointer. If we were lucky at boot
there'd be an indirect pointer in
the zero page with a good address
for this, but alas there isn't.

So to summarize, to do hi-res graphics
it takes 3 bytes to init, at least 3
to draw a pixel, and then 2 bytes for
a loop. We're at 8 bytes already and
we haven't even done anything useful
like increment the pixel location or
change the color.

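The byte count above can be sketched
as a listing. This is a hypothetical
naive attempt, not code from the demo.
The HGR address ($F3E2) is an
assumption from common ROM listings,
and the BEQ assumes the Zero flag is
still set when HGR returns.

```asm
; naive attempt: all 8 bytes used up
      JSR $F3E2  ; 20 E2 F3  HGR
loop: STA $2000  ; 8D 00 20  A -> screen
      BEQ loop   ; F0 FB     back to STA
```

It only ever rewrites the same byte at
$2000 with whatever happens to be in A,
so nothing on screen changes.
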
So is all hope lost?

* note a leading $ is how you
traditionally indicate hexadecimal
numbers on 6502 computers

=== THE CHRGET TRICK ===

We can use a trick I found in a
previous lo-res graphics entry
shown at Lovebyte 2022.

We can abuse some code put into the
zero page by the Applesoft ROM at
boot (this is available on any
Apple II from the Apple II+ onward,
which is to say most of them).

Applesoft uses this code when parsing
BASIC programs, and it is apparently
put into the zero page so the address
being loaded can be self-modified.

The code looks like this:

CHRGET:
00B1-  E6 B8     INC $B8
00B3-  D0 02     BNE $00B7
00B5-  E6 B9     INC $B9
00B7-  AD 05 02  LDA $0205
00BA-  C9 3A     CMP #$3A
00BC-  B0 0A     BCS $00C8
00BE-  C9 20     CMP #$20
00C0-  F0 EF     BEQ $00B1

What the code originally does is not
important; what is interesting is that
it does a 16-bit increment of the
address of the LDA (load accumulator)
instruction at $B7, and there's
a convenient BEQ (branch if equal)
back to the beginning of the routine
at $C0. If we drop our code in
between these two chunks of code we
can just barely do some interesting
graphics.

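To make the gap concrete, here is the
same routine with the replaceable bytes
marked. The annotations are just a
reading of the disassembly above, not
extra ROM documentation.

```asm
00B1- INC $B8    ; operand low  += 1
00B3- BNE $00B7  ; no wrap: skip next
00B5- INC $B9    ; operand high += 1
00B7- LDA $hhll  ; $B8 = ll, $B9 = hh
00BA- ...        ; $BA-$BF: 6 bytes
00C0- BEQ $00B1  ; loops if Z is set
```

Loading 8 bytes at $B8 would overwrite
the 2-byte LDA operand plus the 6 bytes
after it, and leave the INCs and the
BEQ alone.
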
=== THE PLAN ===

The first thing we need to do is get
into hi-res graphics mode. As
discussed earlier doing a 3-byte
        jsr HGR2
will do this. It uses soft-switches
to enable graphics, switch to hi-res,
set it to full-screen (no text), and
finally to get the graphics from
PAGE2 ($4000). It then drops into
a routine that does a linear clear of
the screen to color 0 (black). This
might seem boring, but on the Apple II
due to the weird (and clever) way Woz
designed the DRAM/video refresh
circuitry this gives a venetian-blind
effect which looks pretty neat.

This is great, but we want some pretty
pixels on the screen too. It turns
out that if we jump into the middle
of the previously mentioned routine
we can hit the screen clearing
code at a point where it is drawing
the pattern in the A register to
the screen. So if we do a
        jsr BKGND0
it will fill the screen with a nice
pattern. This is an unofficial entry
point in the ROM, but for various
complex reasons involving the license
with Microsoft it turns out Apple never
updated the Applesoft BASIC ROMs despite
there being various known bugs.

So now we in theory have 6 bytes of
code we can drop into the middle of
the CHRGET routine and have it
repeatedly clear the screen to a color
and then clear it back to black, with a
nice blinds effect.

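As a sketch, those 6 bytes would sit in
the CHRGET gap like this. The addresses
are the ones that show up in the final
byte listing at the end of this writeup
($F3F4 for BKGND0, $F3D8 for HGR2).

```asm
00BA- 20 F4 F3  JSR $F3F4 ; BKGND0: fill
00BD- 20 D8 F3  JSR $F3D8 ; HGR2: clear
00C0- F0 EF     BEQ $00B1 ; already there
```
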
That's boring though, can we switch
up the colors drawn? It'd be nice
to load a random value into the
accumulator (A register) before the
call to fill the screen. The existing
code does a load from an always-
incrementing 16-bit address, let's
point it into the ROM code and that
can act as a random enough series
of bytes.

== LOAD ADDRESS CONSIDERATIONS ==

The CHRGET load address starts at
$800, the default load address of BASIC
programs. We want to point it to ROM
which is at the top of the address
space. The easiest way to do this
is just have some high address bytes
at the start of the code and just load
the program so it drops into the middle
of the LDA instruction.

If we were running code by entering
it into the assembly language monitor
that would be fine, we could load
the bytes and then jump to an arbitrary
memory offset. However for the
competition we are going to load from
disk so we have to start executing
from the start of our binary. This
means these address bytes also need
to be valid code with no bad side
effects. An obvious choice would
be the no-operation NOP instruction,
which is $EA. Convenient, as $EAEA
points nicely into the ROM. It turns
out there are some fun** complications
with doing this.

** As per 4am, no fun is actually
guaranteed in this process

=== WHEREIN WE GET A BEEP AND ===
====== A TEXT SCREEN OF Ws =======

So we set our code to load in
the middle of CHRGET, calling BKGND0
immediately after the LDA which
puts the needed color pattern into
the A register. We can't call HGR2
first as it will always reset A to
be $60.

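A reconstruction of what this first
attempt might look like in memory,
assuming the two NOP ($EA) bytes from
the previous section are used as the
address bytes. This is an illustration
and not necessarily the exact bytes
that were tried.

```asm
; first attempt (do not run as-is)
00B8- EA        NOP       ; also the
00B9- EA        NOP       ; LDA address
00BA- 20 F4 F3  JSR $F3F4 ; BKGND0 first
00BD- 20 D8 F3  JSR $F3D8 ; HGR2
00C0- F0 EF     BEQ $00B1
```

Execution starts at $B8, so BKGND0 runs
before HGR2 has ever been called.
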
Sadly, if you run this, you'll get
a text screen filled with characters
before crashing into the monitor.

The problem here is BKGND0 assumes the
value of the first page of graphics
you want to fill is in zero-page
location HGR_PAGE ($E6). On bootup
this is likely uninitialized (it
often ends up $00 or $FF), so when
you call the routine it happily writes
your color pattern across the first 8K
of RAM which unfortunately is where the
zero-page, stack, and your code live.
Not Good.

We need a way to skip BKGND0 the first
time through the loop.


=== SKIPPING CHUNKS OF INSTRUCTIONS ===
= SURPRISINGLY YOU DO THIS A LOT WHEN =
======= WRITING 6502 ASSEMBLY ========

There's one famous way to skip ahead
on the 6502. This is to use the BIT
instruction. By putting a $2C byte
in your code it will do a BIT
(logical AND to set bits but throw
away the result) and it will use
two bytes following (that you are
trying to skip) as an address.
This is usually harmless (unless those
address bytes point to a soft-switch).
You can use this trick to compactly
have code where you can jump into the
middle of the BIT instruction to
execute the two address bytes as code,
but otherwise execute the BIT as sort
of a 3-byte almost NOP.

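A generic sketch of the trick, separate
from the demo bytes: two entry points
share one tail. The labels, the .byte
directive (ca65-style syntax) and the
scratch location $06 are assumptions
made for illustration.

```asm
one:  LDA #$01   ; A9 01
      .byte $2C  ; BIT abs opcode
two:  LDA #$02   ; A9 02 = BIT operand
      STA $06    ; 85 06  shared tail
```

Entering at "one" runs LDA #$01, then
BIT $02A9 (which swallows the second
LDA), then the STA, so $06 gets $01.
Entering at "two" stores $02 instead.
The skip costs one byte rather than the
two a branch would need, though BIT
does change the N, V and Z flags.
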
We can construct our code so the
entry point is a BIT instruction
that skips the first JSR, but later
loop iterations branch earlier and
instead the BIT is part of the address
to the LDA instruction and the JSR
happens as normal.

So the first time through the loop
BKGND0 is skipped and HGR2 gets
called first. HGR2 usefully sets
up the HGR_PAGE value in $E6 to a good
value so the BKGND0 call works in
all future loop iterations.

=== ALMOST ON THE HOME STRETCH ===

We should be just about there, right?

There is a problem though: the first
time through the loop the BIT consumes
the next two bytes, avoiding the
JSR to BKGND0. However it means
the address of BKGND0, $F3F4
(actually $F4, $F3 as the 6502 is
little-endian), gets executed as code.
Is this a problem?

It turns out those two instructions
are invalid opcodes on both 6502
and 65C02 processors. Luckily, though,
instead of trapping like a modern
processor would, the processor tries
to execute them anyway. You can
look up the side effects for these
invalid instructions online; on the
NMOS 6502 at least you get behavior
based on the don't care terms in the
instruction PLA. Happily though in
our case the instructions are close
enough to NOPs that our code will
work.

=== POINTING TO ROM ===

So with the BIT in place the last
step is to make sure we are pointing
to ROM when we load the accumulator.

If we load our 8 bytes of code at
address $B8 we can have $2C of the
BIT as the low byte of the LDA
instruction address, and the high
byte can be anything we want.
I arbitrarily put a NOP there even
though the code never gets executed
as $EA works to give a nice "random"
set of color patterns starting
at $EA2C (If you're curious, this is
in the middle of the ROM Floating
Point addition routine).

=== FINALLY, THE LOOP ===

We can't forget we need to loop.
If we load our code at $B8, the
8 bytes stop just short of the BEQ
branch-if-equal instruction back to
the beginning. BEQ checks the Zero
flag, but luckily the HGR2 call always
ends with the Zero flag set so this
nicely turns the BEQ into a
branch-always.

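Putting the pieces together, here is a
sketch of how the 8 bytes at $B8 are
seen on the first pass and on every
later pass. The byte values are the
ones entered in the monitor in TRY IT
FOR YOURSELF. Showing F4 F3 as a single
harmless unit is an assumption; the
text above only says the bytes behave
closely enough to NOPs.

```asm
; first pass: entered at $B8
00B8- 2C EA 20  BIT $20EA ; skips a JSR
00BB- F4 F3     ???       ; invalid, ~NOP
00BD- 20 D8 F3  JSR $F3D8 ; HGR2
00C0- F0 EF     BEQ $00B1 ; Z set: taken

; later passes: entered at $B1
00B1- E6 B8     INC $B8   ; $2C, $2D, ...
00B3- D0 02     BNE $00B7
00B5- E6 B9     INC $B9   ; on wrap only
00B7- AD xx EA  LDA $EAxx ; ROM byte -> A
00BA- 20 F4 F3  JSR $F3F4 ; BKGND0 fill
00BD- 20 D8 F3  JSR $F3D8 ; HGR2 clear
00C0- F0 EF     BEQ $00B1 ; Z set: taken
```

The same bytes at $B8/$B9 are a BIT
opcode and operand on the first pass
and the LDA address on later passes.
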
=== ALL FINISHED ===

The program loads, it skips the
first color fill, inits the screen,
then loops back alternately setting
and clearing the screen based on
a color pattern from an incrementing
pointer into ROM, leading to a colorful
animated venetian-blind pattern.

It actually looks lovely, arguably
nicer than many of the 16-byte intros
I've done.


=== TRY IT FOR YOURSELF ===

On an Apple II (or emulator) get to
the ']' BASIC prompt and enter
these commands to run it for yourself:

        CALL -151
        B8: 2C EA 20 F4 F3 20 D8 F3
        B8G

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

by Vince `deater` Weaver
http://www.deater.net/weave
11 February 2023

with apologies to 4AM for vaguely
stealing his writeup format