40x40 lo-res rotozoomer for Apple II

by Vince "deater" Weaver vince _at_ deater.net

Working on this as it's part of a cutscene in my TFV game.


Theory:
~~~~~~~

In a rotozoomer you scan across the screen (in our case in
Apple lo-res, 40x40) and for each pixel do a mapping to find
out what color to draw it.

In this case you have a texture, and to find what point on the
texture maps to the screen co-ordinates you do a transform to
rotate and scale the co-ordinates. This usually involves some
multiplies and some sin/cos calls.

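As a rough sketch of that per-pixel mapping, before any optimization,
the transform might be written as below. The struct and function names
are made up for illustration, and ca and sa are assumed to already hold
cos(theta)*scale and sin(theta)*scale. The signs follow the C listing
later in this file, rotating around the screen center at 20,20.

```c
/* Hypothetical helper type: one texture co-ordinate. */
typedef struct { double x, y; } texcoord;

/* Direct (unoptimized) mapping from screen pixel (xx,yy) to a
 * texture co-ordinate.  ca and sa are assumed to be
 * cos(theta)*scale and sin(theta)*scale, computed by the caller. */
static texcoord roto_map(int xx, int yy, double ca, double sa,
                         double xcenter, double ycenter)
{
    texcoord t;
    double dx = xx - 20;   /* offset from the screen center */
    double dy = yy - 20;

    /* four multiplies per pixel: this is the work we want
     * out of the inner loop */
    t.x = dx * ca + dy * sa + xcenter;
    t.y = dy * ca - dx * sa + ycenter;
    return t;
}
```

With ca=1 and sa=0 (no rotation, scale 1) and the texture centered at
20,20 this maps every pixel to itself. Done this way for all 1600
pixels it costs 6400 multiplies per frame.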
Optimization:
~~~~~~~~~~~~~
This effect is often done on 8-bit computers; the trick is to
take as much work as possible out of the inner loop.
For our case, each cycle we save in the innermost loop saves
1600 cycles per frame (40x40 pixels).

The first optimization is to note that the transform is basically
a set of straight lines plotted across the texture. So you can
calculate the slope of these lines at the beginning (using sin/cos),
then calculate all the points using simple add instructions.

The code in C looks something like this. Some extra transformation
is done to have the center of rotation be the center of the screen
at 20,20.


	ca = cos(theta)*scale;
	sa = sin(theta)*scale;

	/* offsets so the rotation is centered on screen position 20,20 */
	cca = -20*ca;
	csa = -20*sa;

	yca=cca+ycenter;
	ysa=csa+xcenter;

	for(yy=0;yy<40;yy++) {

		/* texture co-ordinates for the start of this row */
		xp=cca+ysa;
		yp=yca-csa;

		for(xx=0;xx<40;xx++) {

			/* out-of-bounds texture reads are drawn as color 0 */
			if ((xp<0) || (xp>39)) color=0;
			else if ((yp<0) || (yp>39)) color=0;
			else {
				color=scrn_page(xp,yp,PAGE2);
			}

			color_equals(color);
			plot(xx,yy);

			/* step along the line: adds only, no multiplies */
			xp=xp+ca;
			yp=yp-sa;
		}
		/* step to the start of the next row */
		yca+=ca;
		ysa+=sa;
	}

Apple II/6502 optimizations
~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ We use an optimized multiply routine (using the difference of
	squares) to do 8.8 fixed point signed multiply
+ We use lookup tables for sin() [and save space by using an
	offset into the sin() table for cos()]
+ We use 8.8 fixed point values for math, even though that's
	a bit slow on an 8-bit processor like the 6502
+ Apple II screen-read/pixel plotting is a pain as memory is not
	linear and has holes in it. We use lookup tables
	to find the address of each line
+ Apple II lores mode lines are grouped together into the
	top/bottom nibbles of a byte. So typically to draw
	an arbitrary pixel you have to read the old value, mask
	off top or bottom, then OR in the new value.
	Our code avoids this... since we are drawing the entire
	screen destructively we don't have to save old values
	when drawing the bytes.
+ In addition, we unroll the Y loop by one, which allows us to
	have custom code for odd/even rows, which allows optimizing
	away a lot of conditional code
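The difference-of-squares multiply can be sketched in C as below. This
is an illustration of the general table-based technique, not the
routine actually used here: the names are invented, the table is
filled in at startup rather than stored, and handling the sign by
multiplying magnitudes is an assumption. It relies on
a*b = ((a+b)^2 - (a-b)^2)/4, so a multiply becomes two table lookups
and a subtract.

```c
#include <stdint.h>

/* qsq[n] = floor(n*n/4) for n = 0..510.  On a 6502 this would
 * normally be precomputed and split into low/high byte tables. */
static uint16_t qsq[511];

static void qsq_init(void)
{
    for (uint32_t n = 0; n < 511; n++)
        qsq[n] = (uint16_t)((n * n) / 4);
}

/* Unsigned 8x8 -> 16 bit multiply.  (a+b) and (a-b) always have the
 * same parity, so the two floor() roundings cancel and the result
 * is exact. */
static uint16_t mul8(uint8_t a, uint8_t b)
{
    unsigned sum  = (unsigned)a + b;
    unsigned diff = (a > b) ? (unsigned)(a - b) : (unsigned)(b - a);
    return (uint16_t)(qsq[sum] - qsq[diff]);
}

/* Signed 8.8 x 8.8 -> 8.8 multiply built from four 8x8 partial
 * products.  Overflow of the integer part is not handled. */
static int16_t fixmul88(int16_t a, int16_t b)
{
    int negative = (a < 0) != (b < 0);
    uint16_t ua = (uint16_t)(a < 0 ? -a : a);
    uint16_t ub = (uint16_t)(b < 0 ? -b : b);
    uint8_t al = (uint8_t)(ua & 0xff), ah = (uint8_t)(ua >> 8);
    uint8_t bl = (uint8_t)(ub & 0xff), bh = (uint8_t)(ub >> 8);

    uint32_t p = ((uint32_t)mul8(ah, bh) << 16)
               + (((uint32_t)mul8(ah, bl) + mul8(al, bh)) << 8)
               + mul8(al, bl);
    int32_t r = (int32_t)(p >> 8);   /* drop the low 8 fraction bits */
    return (int16_t)(negative ? -r : r);
}
```

Call qsq_init() once before the first multiply. In 8.8 fixed point 1.5
is 0x0180 and 2.0 is 0x0200, and fixmul88() of the two gives 0x0300
(3.0).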
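The shared sin()/cos() table works because cos(x) = sin(x + 90 degrees),
so a cosine lookup is just a sine lookup a quarter of the way further
into the table. A minimal sketch, assuming a made-up 16-step circle
(the table size actually used here is not stated) and 8.8 fixed point
entries where 256 means 1.0:

```c
#include <stdint.h>

#define ANGLE_STEPS 16   /* hypothetical size; must be a multiple of 4 */

/* sin(2*pi*i/16) in 8.8 fixed point, rounded to nearest */
static const int16_t sin_table[ANGLE_STEPS] = {
       0,   98,  181,  237,  256,  237,  181,   98,
       0,  -98, -181, -237, -256, -237, -181,  -98
};

static int16_t fixed_sin(unsigned angle)
{
    return sin_table[angle % ANGLE_STEPS];
}

/* same table, read a quarter circle further along */
static int16_t fixed_cos(unsigned angle)
{
    return sin_table[(angle + ANGLE_STEPS / 4) % ANGLE_STEPS];
}
```

The cost of cos() is one extra add on the index, and no second table
is needed.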
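The screen layout points can be sketched as below, with a plain array
standing in for one 1K lo-res page. The names are invented and this is
not the code used here. Each byte holds two vertically stacked pixels
(top pixel in the low nibble, bottom pixel in the high nibble), and the
24 byte-rows are interleaved in memory, hence the offset table.

```c
#include <stdint.h>

#define PAGE_SIZE 0x400           /* one lo-res page, e.g. $400-$7FF */

static uint8_t page[PAGE_SIZE];   /* stand-in for screen memory */
static uint16_t row_offset[24];   /* byte-row -> offset within page */

static void row_offset_init(void)
{
    /* rows are interleaved: 0,8,16 share a 128-byte block, etc. */
    for (int r = 0; r < 24; r++)
        row_offset[r] = (uint16_t)(0x80 * (r % 8) + 0x28 * (r / 8));
}

/* Generic plot: must preserve the other pixel sharing the byte,
 * so it reads, masks, and ORs. */
static void plot_rmw(int x, int y, uint8_t color)
{
    uint8_t *p = &page[row_offset[y / 2] + x];
    if (y & 1)
        *p = (uint8_t)((*p & 0x0f) | (color << 4));    /* bottom */
    else
        *p = (uint8_t)((*p & 0xf0) | (color & 0x0f));  /* top */
}

/* Destructive plot: both pixels of the byte are known, so the
 * byte is simply overwritten with no read or mask. */
static void plot_pair(int x, int row_pair, uint8_t top, uint8_t bottom)
{
    page[row_offset[row_pair] + x] =
        (uint8_t)((bottom << 4) | (top & 0x0f));
}
```

A full-screen effect that computes rows two at a time can use the
plot_pair() style everywhere, which is the saving the last two list
items describe.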
Notes on making it even faster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are other lo-res rotozoomer implementations.
They are faster, but mostly because they usually don't do the
full 40x40 resolution.

+ If we used a smaller texture (rather than 40x40) things would
	be much faster. Other demos use 20x20, which is
	blockier but also 4x faster, since there are a quarter
	as many pixels to compute

+ If we wrapped the texture at the edges (instead of filling with
	a solid color out of bounds) we could save at least 20
	cycles, which would improve the frame rate.
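To illustrate the wrapping idea: the sketch below compares a
bounds-checked texture read with a wrapped one. It assumes a
hypothetical 32x32 power-of-two texture so the wrap is a single AND
mask; the 40x40 texture used here is not a power of two, so wrapping
it would need a compare and subtract instead. Co-ordinates are 8.8
fixed point, with the integer part in the high byte, and all names
are invented.

```c
#include <stdint.h>

#define TEX_SIZE 32   /* hypothetical; must be a power of two */

static uint8_t texture[TEX_SIZE][TEX_SIZE];

/* Out-of-bounds reads return color 0: up to four comparisons
 * for every pixel. */
static uint8_t sample_clamped(int16_t xp, int16_t yp)
{
    if (xp < 0 || yp < 0) return 0;
    int x = xp >> 8;
    int y = yp >> 8;
    if (x >= TEX_SIZE || y >= TEX_SIZE) return 0;
    return texture[y][x];
}

/* Wrapped reads: mask the integer part, no comparisons at all. */
static uint8_t sample_wrapped(int16_t xp, int16_t yp)
{
    unsigned x = ((uint16_t)xp >> 8) & (TEX_SIZE - 1);
    unsigned y = ((uint16_t)yp >> 8) & (TEX_SIZE - 1);
    return texture[y][x];
}
```

With wrapping the texture tiles across the whole plane, so an x of
-1.0 reads column 31 rather than the fill color.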