Added Woz's new routine

Michael 2016-01-15 17:34:19 -08:00
parent dfc7d2bf49
commit 2f9786ab6a
1 changed file with 52 additions and 0 deletions

Home.md

@@ -80,7 +80,59 @@ F430: A5 27 LDA GBASH ; 0 cdefghcd cdefghcd eabab000
F432: 29 1F AND #$1F ; 0 000fghcd cdefghcd eabab000
F434: 05 E6 ORA HPAG ; 0 pppfghcd cdefghcd eabab000
F436: 85 27 STA GBASH ; 0 pppfghcd pppfghcd eabab000
RTS
```
* http://www.txbobsc.com/aal/1986/aal8612.html#a9
Woz Re-Codes Hi-Res Address Calculations, by Bob Sander-Cederlof
In the October or November issue of the Washington Apple Pi newsletter, Rick Chapman wrote a review of various methods of calculating the hi-res base addresses. Steve Wozniak liked the article, and responded with a long "letter to the editor" in the December issue. Steve also presented a new version of the hi-res address calculator which is both shorter and faster. In fact, as far as I am aware, it is the fastest method ever, except for table-lookups.
In the September 1983 issue of Apple Assembly Line, I presented both the original Woz code and a shorter, faster version by Harry Cheung of Nigeria. Here are the specs:
* Applesoft ROM version: 33 bytes, 61 cycles
* Harry Cheung version: 25 bytes, 46 cycles
* New Wozniak version: 26 bytes, 36 or 37 cycles

The byte counts do not include an RTS at the end of the code, nor do the times include a JSR-RTS. After all, if you are really working for speed you will put the code in its place, not make it a subroutine.
Woz's new version takes either 36 or 37 cycles, depending on the values of the first two bits of the line number. Remember that the line number can be any value from 0 to 191, or $00...$BF. That means the first two bits are either 00, 01, or 10. If you look at lines 1090-1120 below, you will see that the shortest path is for 00, which takes both branches and gives a running time for the whole calculation of 36 cycles. If the first two bits are 01 or 10, one branch will be taken and the other not, making the total time 37 cycles. In Woz's letter he shortchanged himself by allowing for the case where neither branch is taken, which would give a total running time of 38 cycles. That case requires the first two bits to be 11, so it cannot happen with legal line numbers.
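The cycle arithmetic in the paragraph above can be tallied with a short Python sketch. The per-instruction costs are the standard 6502 timings; the sketch assumes zero-page operands for GBASL, GBASH and HPAG2 and branches that do not cross a page boundary. The names `ALWAYS`, `BRANCH_TAKEN`, `FALL_THROUGH` and `cycles` are illustrative only and do not come from the article.

```python
# Cycle costs of the instructions that always run, keyed by listing line.
# Zero-page operands are assumed for GBASL, GBASH and HPAG2.
ALWAYS = {
    1060: 2,  # ASL
    1070: 2,  # TAX
    1080: 2,  # AND #$F0
    1130: 2,  # ASL
    1140: 2,  # ASL
    1150: 3,  # STA GBASL
    1160: 2,  # TXA
    1170: 2,  # AND #$0E
    1180: 3,  # ADC HPAG2
    1200: 5,  # ASL GBASL
    1210: 2,  # ROL
    1220: 3,  # STA GBASH
}
BRANCH_TAKEN = 3      # BPL/BCC taken, no page crossing (assumed)
FALL_THROUGH = 2 + 2  # branch not taken (2) plus the ORA #imm it guards (2)


def cycles(line):
    """Cycles for a scan line number, excluding the final RTS."""
    total = sum(ALWAYS.values())
    for bit in (0x40, 0x80):  # B is tested by BPL, A by BCC
        total += FALL_THROUGH if line & bit else BRANCH_TAKEN
    return total


# One representative line number for each value of the first two bits.
for top_bits in (0b00, 0b01, 0b10, 0b11):
    print(f"{top_bits:02b}: {cycles(top_bits << 6)} cycles")
```

The loop prints 36, 37, 37 and 38 cycles. The 38-cycle case corresponds to the top bits 11, that is, line 192 and above, which is outside the legal range.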
Line 1180 adds in either $10 or $20, depending on which hi-res page you are using. The Applesoft code here adds in a value of either $20 or $40, so if this version were to be inserted into Applesoft the generation of HPAG2 would have to be changed. No problem, and not likely anyway. By the way, if you are only using one specific hi-res page, you can change line 1180 to an immediate mode form, saving yet another cycle.
Here is Woz's new version, reformatted for the S-C Assembler and with some changes in comments:
```assembly
1000 *SAVE S.NEW.WOZ.HIRES.CALC
1010 *--------------------------------
1020 GBASL .EQ $26
1030 GBASH .EQ $27
1040 HPAG2 .EQ $E6 Applesoft puts it here anyway.
1050 *--------------------------------
1060 CALC ASL A--BCDEFGH0
1070 TAX TAX...TXA could be TAY...TYA
1080 AND #$F0 A--BCDE0000
1090 BPL .1 B=0
1100 ORA #$05 A--BCDE0B0B
1110 .1 BCC .2 A=0
1120 ORA #$0A A--BCDEABAB
1130 .2 ASL B--CDEABAB0
1140 ASL C--DEABAB00
1150 STA GBASL
1160 TXA C--BCDEFGH0
1170 AND #$0E C--0000FGH0
1180 ADC HPAG2 0--00xxFGHC
1190 * HPAG2 = $10 for base $2000, $20 for base $4000
1200 ASL GBASL D--00xxFGHC GBASL=EABAB000
1210 ROL 0--0xxFGHCD
1220 STA GBASH
1230 RTS
1240 *--------------------------------
```
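As a sanity check on the bit-shuffling in the listing above, here is a small Python model of the routine, written in Python rather than 6502 so it can be run anywhere. It is a sketch under assumptions: the function names `asl`, `woz_calc` and `reference` are made up for illustration, the ADC is assumed to run in binary (not decimal) mode, and `reference` uses the usual closed-form hi-res layout (8 lines per $400 step, 8 groups per $80 step, 3 sections per $28 step).

```python
def asl(value):
    """8-bit shift left; returns (result, carry_out)."""
    return (value << 1) & 0xFF, value >> 7


def woz_calc(line, hpag2=0x10):
    """Model of CALC for line 0..191; returns (GBASH << 8) | GBASL."""
    a, carry = asl(line)            # 1060 ASL
    x = a                           # 1070 TAX
    a &= 0xF0                       # 1080 AND #$F0
    if a & 0x80:                    # 1090 BPL skips the ORA when B=0
        a |= 0x05                   # 1100 ORA #$05
    if carry:                       # 1110 BCC skips the ORA when A=0
        a |= 0x0A                   # 1120 ORA #$0A
    a, carry = asl(a)               # 1130 ASL
    a, carry = asl(a)               # 1140 ASL, carry now holds bit C
    gbasl = a                       # 1150 STA GBASL
    a = x & 0x0E                    # 1160-1170 TXA, AND #$0E (carry untouched)
    total = a + hpag2 + carry       # 1180 ADC HPAG2 (binary mode assumed)
    a, carry = total & 0xFF, total >> 8
    gbasl, carry = asl(gbasl)       # 1200 ASL GBASL, carry now holds bit D
    a = ((a << 1) | carry) & 0xFF   # 1210 ROL
    return (a << 8) | gbasl         # 1220 STA GBASH


def reference(line, base=0x2000):
    """Closed-form hi-res base address for a scan line."""
    return (base + (line & 7) * 0x400
            + ((line >> 3) & 7) * 0x80
            + (line >> 6) * 0x28)


# Compare the model with the closed form on both hi-res pages.
for y in range(192):
    assert woz_calc(y, 0x10) == reference(y, 0x2000)
    assert woz_calc(y, 0x20) == reference(y, 0x4000)
print(hex(woz_calc(191)))
```

The `hpag2` parameter takes the $10 or $20 values this version expects, not the $20 or $40 that Applesoft stores.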
Of course the table/array lookup is the fastest:
```assembly
y2hgrL EQU $1000
y2hgrH EQU $1100