mirror of https://github.com/irmen/prog8.git (synced 2025-01-10 20:30:23 +00:00)
remove non-functional verafx.mult(). note: muls() is still there and just fine!
added documentation and source code comments noting that the cpu word*word multiplication routine does not produce the correct upper 16 bits for negative (signed) values.
parent 4acf38031a
commit 9c7a645e18
@@ -111,15 +111,17 @@ verafx {
 
     ; unsigned multiplication just passes the values as signed to muls
     ; if you do this yourself in your call to muls, it will save a few instructions.
-    inline asmsub mult(uword value1 @R0, uword value2 @R1) clobbers(X) -> uword @AY, uword @R0 {
-        ; Returns the 32 bits unsigned result in AY and R0 (lower word, upper word).
-        %asm {{
-            jsr verafx.muls
-        }}
-    }
+    ; TODO fix this: verafx.muls doesn't support unsigned values like this
+    ; inline asmsub mult(uword value1 @R0, uword value2 @R1) clobbers(X) -> uword @AY, uword @R0 {
+    ; ; Returns the 32 bits unsigned result in AY and R0 (lower word, upper word).
+    ; %asm {{
+    ; jsr verafx.muls
+    ; }}
+    ; }
 
     asmsub muls(word value1 @R0, word value2 @R1) clobbers(X) -> word @AY, word @R0 {
         ; Returns the 32 bits signed result in AY and R0 (lower word, upper word).
+        ; Vera Fx multiplication support only works on signed values!
         %asm {{
             lda #(2 << 1)
             sta cx16.VERA_CTRL ; $9F25
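The TODO in this hunk can be checked with a small model. The Python sketch below is not the VERA FX code: `muls_model` and `unsigned_reference` are hypothetical helpers, and the only assumption is that `muls` behaves like a plain two's-complement signed 16x16 to 32-bit multiply. Under that assumption, forwarding `uword` values to the signed multiplier gives the right answer while both operands stay below 32768, and a wrong upper word as soon as one of them does not.

```python
def to_signed16(value: int) -> int:
    """Reinterpret a 16-bit pattern as a two's-complement signed word."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def muls_model(value1: int, value2: int) -> tuple[int, int]:
    """Hypothetical model of a signed 16x16 -> 32 bit multiply.

    Returns (lower word, upper word) as raw 16-bit patterns, mirroring the
    AY / R0 split named in the comments above. Illustration only.
    """
    product = (to_signed16(value1) * to_signed16(value2)) & 0xFFFFFFFF
    return product & 0xFFFF, product >> 16


def unsigned_reference(value1: int, value2: int) -> tuple[int, int]:
    """What a true unsigned 16x16 -> 32 bit multiply would return."""
    product = (value1 & 0xFFFF) * (value2 & 0xFFFF)
    return product & 0xFFFF, product >> 16


if __name__ == "__main__":
    # Both operands below 32768: signed and unsigned agree.
    print(muls_model(1000, 50), unsigned_reference(1000, 50))
    # 40000 has bit 15 set, so the signed multiplier sees -25536.
    print(muls_model(40000, 2), unsigned_reference(40000, 2))
```

In this model the lower word is the same in both cases; only the upper word differs, which is the part a caller of `mult()` would have read from R0.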
@@ -55,13 +55,14 @@ _multiplier = P8ZP_SCRATCH_REG
 
 
 multiply_words .proc
-    ; -- multiply two 16-bit words into a 32-bit result (signed and unsigned)
+    ; -- multiply two 16-bit words into a 32-bit result (UNSIGNED)
     ; input: A/Y = first 16-bit number, multiply_words.multiplier = second 16-bit number
     ; output: multiply_words.result, 4-bytes/32-bits product, LSB order (low-to-high) low 16 bits also in AY.
+    ; you can retrieve the upper 16 bits via math.mul16_last_upper()
 
-    ; NOTE: the result (which includes the multiplier parameter on entry) is a 4-byte array.
-    ; this routine could be faster if we could stick that into zeropage,
-    ; but there currently is no way to use 4 consecutive bytes in ZP (without disabling irq and saving/restoring them)...
+    ; NOTE FOR NEGATIVE VALUES:
+    ; The routine also works for NEGATIVE (signed) word values, but ONLY the lower 16 bits of the result are correct then!
+    ; Prog8 only uses those so that's not an issue, but math.mul16_last_upper() no longer gives the correct result here.
 
     ; mult62.a
     ; from: https://github.com/TobyLobster/multiply_test/blob/main/tests/mult62.a
@@ -179,7 +180,7 @@ _inner_loop2
     ldy result+1
     rts
 
-result .byte 0,0,0,0
+result .byte 0,0,0,0 ; routine could be faster if this were in Zeropage...
 
     .pend
 
@@ -168,6 +168,9 @@ _sinecosR8 .char trunc(127.0 * sin(range(180+45) * rad(360.0/180.0)))
     ; for instance, simply printing a number may already result in new multiplication calls being performed
     ; - not all multiplications in the source code result in an actual multiplication call:
     ; some simpler multiplications will be optimized away into faster routines. These will not set the upper 16 bits at all!
+    ; - THE RESULT IS ONLY VALID IF THE MULTIPLICATION WAS DONE WITH UWORD ARGUMENTS (or two positive WORD arguments)
+    ; as soon as a negative word value (or 2) was used in the multiplication, these upper 16 bits are not valid!!
+    ; Suggestion (if you are on the Commander X16): use verafx.muls() to get a hardware accelerated 32 bit signed multiplication.
     %asm {{
         lda multiply_words.result+2
         ldy multiply_words.result+3
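The caveat about optimized multiplications can be pictured with a toy model. The Python sketch below is an assumption-laden illustration, not Prog8's code generator: `MulModel` is a hypothetical class, and it assumes, purely for the example, that a multiplication by 2 is compiled to a shift that never touches the 4-byte `result` buffer. Reading the upper word afterwards then returns whatever the last real routine call left behind.

```python
class MulModel:
    """Toy model of the multiply routine plus its 4-byte result buffer."""

    def __init__(self) -> None:
        # Stands in for multiply_words.result (32 bits, kept as one int here).
        self.result = 0

    def multiply_words(self, a: int, b: int) -> int:
        """Real routine call: updates the buffer, returns the lower 16 bits."""
        self.result = ((a & 0xFFFF) * (b & 0xFFFF)) & 0xFFFFFFFF
        return self.result & 0xFFFF

    def times_two_optimized(self, a: int) -> int:
        """Hypothetical optimized path: a shift that leaves the buffer alone."""
        return (a << 1) & 0xFFFF

    def mul16_last_upper(self) -> int:
        """Upper 16 bits of whatever the last real routine call produced."""
        return self.result >> 16


if __name__ == "__main__":
    m = MulModel()
    m.multiply_words(50000, 3)
    print(m.mul16_last_upper())   # upper word of 50000 * 3
    m.times_two_optimized(40000)
    print(m.mul16_last_upper())   # unchanged: still refers to 50000 * 3
```

The safe pattern this suggests is to read the upper word immediately after the multiplication it belongs to, and only when that multiplication is known to go through the routine.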
@@ -293,6 +293,8 @@ math {
     ; for instance, simply printing a number may already result in new multiplication calls being performed
     ; - not all multiplications in the source code result in an actual multiplication call:
     ; some simpler multiplications will be optimized away into faster routines. These will not set the upper 16 bits at all!
+    ; - THE RESULT IS ONLY VALID IF THE MULTIPLICATION WAS DONE WITH UWORD ARGUMENTS (or two positive WORD arguments)
+    ; as soon as a negative word value (or 2) was used in the multiplication, these upper 16 bits are not valid!!
     %ir {{
         syscall 33 (): r0.w
         returnr.w r0
@@ -787,6 +787,10 @@ but perhaps the provided ones can be of service too.
     It does not work for the verafx multiplication routines on the Commander X16!
     These have a different way to obtain the upper 16 bits of the result: just read cx16.r0.
 
+    **NOTE:** the result is only valid if the multiplication was done with uword arguments (or two positive word arguments).
+    As soon as a single negative word value (or both) was used in the multiplication, these upper 16 bits are not valid!
+    Suggestion (if you are on the Commander X16): use ``verafx.muls()`` to get a hardware accelerated 32 bit signed multiplication.
+
 ``crc16 (uword data, uword length) -> uword``
     Returns a CRC-16 (XMODEM) checksum over the given data buffer.
     Note: on the Commander X16, there is a CRC-16 routine in the kernal: cx16.memory_crc().
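Why only the upper 16 bits go wrong follows from two's complement: an unsigned routine sees a negative word `a` as `a + 65536`, so the product picks up extra multiples of 65536 that land entirely in the upper word. The Python sketch below works through that arithmetic. All function names are hypothetical and nothing here is part of Prog8; the `corrected_upper` step is included only to confirm the reasoning, not as a recommended replacement for ``verafx.muls()``.

```python
def unsigned_mul16(a_bits: int, b_bits: int) -> tuple[int, int]:
    """Unsigned 16x16 -> 32 multiply on raw bit patterns: (lower, upper)."""
    product = (a_bits & 0xFFFF) * (b_bits & 0xFFFF)
    return product & 0xFFFF, product >> 16


def signed_mul16(a: int, b: int) -> tuple[int, int]:
    """True signed product, returned as raw 16-bit words: (lower, upper)."""
    product = (a * b) & 0xFFFFFFFF
    return product & 0xFFFF, product >> 16


def corrected_upper(a: int, b: int) -> int:
    """Derive the signed upper word from the unsigned product.

    With a_u = a + 65536*[a < 0] and b_u likewise, modulo 2**32:
        a*b = a_u*b_u - 65536 * ([a < 0]*b_u + [b < 0]*a_u)
    so only the upper word needs adjusting.
    """
    a_bits, b_bits = a & 0xFFFF, b & 0xFFFF
    _, upper = unsigned_mul16(a_bits, b_bits)
    if a < 0:
        upper -= b_bits
    if b < 0:
        upper -= a_bits
    return upper & 0xFFFF


if __name__ == "__main__":
    a, b = -2, 3
    print(unsigned_mul16(a & 0xFFFF, b & 0xFFFF))  # lower word is right, upper is not
    print(signed_mul16(a, b))                      # the intended 32-bit result (-6)
    print(corrected_upper(a, b))                   # matches the signed upper word
```

For `-2 * 3` the unsigned routine yields an upper word of 2 where the signed result needs `0xFFFF`, while the lower word `0xFFFA` is identical in both.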
@@ -14,7 +14,6 @@ Compiler:
 - Can we support signed % (remainder) somehow?
 - Don't add "random" rts to %asm blocks but instead give a warning about it? (but this breaks existing behavior that others already depend on... command line switch? block directive?)
 - IR: implement missing operators in AssignmentGen (array shifts etc)
-- IR: CMPI+BSTEQ --> new BEQ reg,value,label instruction (like BGT etc)
 - instead of copy-pasting inline asmsubs, make them into a 64tass macro and use that instead.
   that will allow them to be reused from custom user written assembly code as well.
 - Multidimensional arrays and chained indexing, purely as syntactic sugar over regular arrays.