mirror of
https://github.com/irmen/prog8.git
synced 2024-11-25 19:31:36 +00:00
------------------------------------------------------------
il65 - "Intermediate Language for 6502/6510 microprocessors"
------------------------------------------------------------
Written by Irmen de Jong (irmen@razorvine.net)
License: GNU GPL 3.0, see LICENSE
------------------------------------------------------------


The Python program parses the IL65 source code and generates 6502 assembler code.
It uses the 64tass macro cross assembler to assemble that into binary files.


Memory Model
------------

Zero page:        $00 - $ff
Hardware stack:   $100 - $1ff
Free RAM/ROM:     $0200 - $ffff

Reserved:

data direction    $00
bank select       $01
NMI VECTOR        $fffa
RESET VECTOR      $fffc
IRQ VECTOR        $fffe

A particular 6502/6510 machine such as the Commodore-64 will have many other
special addresses due to:
- ROMs installed in the machine (BASIC, KERNAL and character generator ROMs)
- memory-mapped I/O registers (for the video and sound chip for example)
- RAM areas used for screen graphics and sprite data.


Usable Hardware registers:
    A, X, Y,
    AX, AY, XY      (16-bit combined register pairs)
    SC              (status register Carry flag)
These cannot occur as variable names - they will always refer to the hardware registers.


The zero page locations $02-$ff can be regarded as 254 other registers.
Free zero page addresses on the C-64:
    $02,$03     # reserved as scratch addresses
    $04,$05
    $06
    $0a
    $2a
    $52
    $93
    $f7,$f8
    $f9,$fa
    $fb,$fc
    $fd,$fe



IL program parsing structure:
-----------------------------


OUTPUT MODES:
-------------
output raw      ; no load address bytes
output prg      ; include the first two load address bytes, (default is $0801), no basic program
output prg,sys  ; include the first two load address bytes, basic start program with sys call to code, default code start
                ; immediately after the basic program at $081d, or beyond.

address $0801   ; override program start address (default is set to $c000 for raw mode and $0801 for c-64 prg mode)
                ; cannot be used if output mode is prg,sys because basic programs always have to start at $0801
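
The difference between the raw and prg output modes can be sketched in a few
lines of Python. This is only an illustration and not the compiler's actual code:
the function name make_output and its parameters are made up for this example, and
the prg,sys mode (which also needs a BASIC start program) is left out.

```python
from typing import Optional


def make_output(code: bytes, mode: str = "prg", address: Optional[int] = None) -> bytes:
    """Return the file contents for the 'raw' and 'prg' output modes (hypothetical helper)."""
    if mode == "raw":
        # raw: just the machine code, no load address bytes
        return code
    if mode == "prg":
        if address is None:
            address = 0x0801  # default load address for prg mode
        # prg: two load address bytes (low byte first), followed by the code
        return address.to_bytes(2, "little") + code
    raise ValueError("unsupported output mode: " + mode)


if __name__ == "__main__":
    rts = b"\x60"  # a single RTS instruction
    print(make_output(rts, "raw").hex())
    print(make_output(rts, "prg").hex())
    print(make_output(rts, "prg", 0xC000).hex())
```

The load address is stored low byte first, which is why $0801 ends up in the file
as the bytes $01 $08.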


data types:
    byte        8 bits      $8f         (unsigned, @todo signed bytes)
    int         16 bits     $8fee       (unsigned, @todo signed ints)
    bool        true/false              (aliases for the integer values 1 and 0, not a true datatype by itself)
    char        '@'                     (converted to a byte)
    float       40 bits     1.2345      (stored in 5-byte cbm MFLPT format)
                @todo 24 and 32 bits integers, unsigned and signed?
    string      0-terminated sequence of bytes  "hello."  (implicit 0-termination byte)
    pstring     sequence of bytes where first byte is the length.  (no 0-termination byte)
For strings, both petscii and screencode variants can be written in source, they will be translated at compile/assembler time.


Note: for many floating point operations, the compiler uses routines in the C64 BASIC and KERNAL ROMs.
So they will only work if the BASIC ROM (and KERNAL ROM) are banked in.
largest 5-byte MFLPT float: 1.7014118345e+38 (negative: -1.7014118345e+38)
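
To make the 5-byte MFLPT layout concrete, here is a minimal Python sketch of
packing and unpacking such a float. It assumes the usual CBM layout (one exponent
byte with a bias of 128, followed by a 32-bit mantissa whose top bit is replaced
by the sign bit, and an exponent byte of 0 meaning zero). The function names
to_mflpt5 and from_mflpt5 are made up for this example, and the rounding mode is
an assumption that may differ from what the ROM routines do.

```python
import math


def to_mflpt5(value: float) -> bytes:
    """Pack a Python float into 5 bytes in CBM MFLPT layout (illustrative sketch)."""
    if value == 0:
        return bytes(5)  # exponent byte 0 means zero
    sign = 0x80 if value < 0 else 0x00
    # abs(value) == mant * 2**exp with 0.5 <= mant < 1
    mant, exp = math.frexp(abs(value))
    mantissa = round(mant * (1 << 32))
    if mantissa == 1 << 32:  # rounding carried into the next power of two
        mantissa >>= 1
        exp += 1
    exponent = exp + 128
    if exponent < 1:
        return bytes(5)  # too small to represent: flush to zero
    if exponent > 255:
        raise OverflowError("value too large for 5-byte MFLPT")
    m = mantissa.to_bytes(4, "big")
    # the top mantissa bit is always 1, so its place is used for the sign
    return bytes([exponent, (m[0] & 0x7F) | sign, m[1], m[2], m[3]])


def from_mflpt5(data: bytes) -> float:
    """Unpack 5 MFLPT bytes into a Python float (illustrative sketch)."""
    if data[0] == 0:
        return 0.0
    sign = -1.0 if data[1] & 0x80 else 1.0
    mantissa = int.from_bytes(bytes([data[1] | 0x80]) + data[2:5], "big")
    return sign * math.ldexp(mantissa, data[0] - 128 - 32)


if __name__ == "__main__":
    for number in (1.0, -1.0, 10.0, 0.5):
        print(number, to_mflpt5(number).hex())
```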


Note: with the # prefix you can take the address of something. This is sometimes useful,
for instance when you want to manipulate the ADDRESS of a memory mapped variable rather than
the value it represents. You can take the address of a string as well, but the compiler already
treats those as a value that you manipulate via its address, so the # is ignored here.



BLOCKS
------

~ blockname [address] {
        statements
}

The blockname "ZP" is reserved and always means the ZeroPage. Its start address is always set to $04,
because $00/$01 are used by the hardware and $02/$03 are reserved as general purpose scratch registers.

Block names cannot occur more than once, EXCEPT 'ZP' where the contents of every occurrence of it are merged.
Block address must be >= $0200 (because $00-$ff is the ZP and $100-$1ff is the cpu stack)

You can omit the blockname but then you can only refer to the contents of the block via its absolute address,
which is required in this case. If you omit both, the block is ignored altogether (and a warning is displayed).


IMPORTING, INCLUDING and BINARY-INCLUDING files
-----------------------------------------------

import "filename[.ill]"
        Can only be used outside of a block (usually at the top of your file).
        Reads everything from the named IL65 file at this point and compiles it as a normal part of the program.

asminclude "filename.txt", scopelabel
        Can only be used in a block.
        The assembler will include the file as asm source text at this point, il65 will not process this at all.
        The scopelabel will be used as a prefix to access the labels from the included source code,
        otherwise you would risk symbol redefinitions or duplications.

asmbinary "filename.bin" [, <offset>[, <length>]]
        Can only be used in a block.
        The assembler will include the file as binary bytes at this point, il65 will not process this at all.
        The optional offset and length can be used to select a particular piece of the file.



MACROS
------

@todo macros are meta-code (written in Python syntax) that actually runs in a preprocessing step
during the compilation, and produces an output value that then replaces the macro at that point in the input source.
Allows us to create pre calculated sine tables and such. Something like:

        var .array sinetable ``[sin(x) * 10 for x in range(100)]``



EXPRESSIONS
-----------

In most places where a number or other value is expected, you can use just the number, or a full constant expression.
The expression is parsed and evaluated by Python itself at compile time, and the (constant) resulting value is used in its place.
Of course the special il65 syntax for hexadecimal numbers ($xxxx), binary numbers (%bbbbbb),
and the address-of (#xxxx) is supported. Other than that it must be valid Python syntax.
Expressions can contain function calls to the math library (sin, cos, etc) and you can also use
all builtin functions (max, avg, min, sum etc). They can also reference identifiers defined elsewhere in your code,
if this makes sense.
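
As an illustration of how such an expression could be handed to Python, the sketch
below rewrites the $ and % number prefixes into Python literals and then evaluates
the result with the math functions in scope. This is not the compiler's actual
code: the names evaluate, NAMESPACE and _binary_literal are invented for this
example, the address-of (#) prefix and identifier lookups are not handled, and a
'%' is only treated as a binary prefix when it does not directly follow an operand
(otherwise it is left alone as Python's modulo operator).

```python
import math
import re

# Names made available to expressions: the public math functions and constants
# plus a few Python builtins. (Hypothetical selection for this sketch.)
NAMESPACE = {name: getattr(math, name) for name in dir(math) if not name.startswith("_")}
NAMESPACE.update({"max": max, "min": min, "sum": sum, "abs": abs, "round": round})


def _binary_literal(match):
    """Turn %1010 into 0b1010, unless the % looks like a modulo operator."""
    before = match.string[:match.start()].rstrip()
    if before and (before[-1].isalnum() or before[-1] in "_)]"):
        return match.group(0)  # follows an operand: leave it as modulo
    return "0b" + match.group(1)


def evaluate(expression: str):
    """Evaluate a constant expression that may use $hex and %binary literals."""
    source = re.sub(r"\$([0-9a-fA-F]+)", r"0x\1", expression)
    source = re.sub(r"%([01]+)\b", _binary_literal, source)
    # eval() is fine for a compiler that runs trusted source code,
    # but it is not a security sandbox.
    return eval(source, {"__builtins__": {}, **NAMESPACE})


if __name__ == "__main__":
    print(evaluate("$c000 + %1010 * 2"))
    print(evaluate("max($10, 3) + 7 % 4"))
```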

The syntax "[address]" means: the contents of the memory at address.
By default, if not otherwise known, a single byte is assumed. You can add the ".byte" or ".word" or ".float" suffix
to make it clear what data type the address points to.
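
What the suffix changes is how many bytes are read and how they are combined.
The following Python sketch models this on a plain byte buffer. The helper name
read_memory is hypothetical, and it assumes that a ".word" is stored low byte
first, as is usual on the 6502. The ".float" case is left out.

```python
def read_memory(memory, address: int, suffix: str = ".byte") -> int:
    """Return the value stored at address, interpreted according to suffix."""
    if suffix == ".byte":
        return memory[address]  # default: a single byte
    if suffix == ".word":
        # assumed layout: 16 bits, low byte first
        return int.from_bytes(memory[address:address + 2], "little")
    raise ValueError("unsupported suffix: " + suffix)


if __name__ == "__main__":
    ram = bytearray(65536)
    ram[0xC000] = 0x34
    ram[0xC001] = 0x12
    print(hex(read_memory(ram, 0xC000)))           # [$c000] or [$c000.byte]
    print(hex(read_memory(ram, 0xC000, ".word")))  # [$c000.word]
```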

Everything after a semicolon ';' is a comment and is ignored by the compiler. The comment is, however, copied into the resulting assembly source code.


FLOW CONTROL
------------

Required building blocks: additional forms of the 'go' statement (including an if clause), and a comparison statement.

- a primitive conditional branch instruction (special case of 'go'): directly translates to a branch instruction:
        if[_XX] go <label>
  XX is one of: (cc, cs, vc, vs, eq, ne, pos, min,
                 lt==cc, lts==min, gt==eq+cs, gts==eq+pos, le==cc+eq, les==neg+eq, ge==cs, ges==pos)
  and when left out, defaults to ne (not-zero, i.e. true)
  NOTE: some combination branches such as cc+eq can be peephole optimized, see http://www.6502.org/tutorials/compare_beyond.html#2.2

- conditional go with expression: where the if[_XX] is followed by a <expression>
  in that case, evaluate the <expression> first (whatever it is) and then emit the primitive if[_XX] go
        if[_XX] <expression> go <label>
  eventually translates to:
        <expression-code>
        bXX <label>

- comparison statement: compares left with right:  compare <first_value>, <second_value>
  (and keeps the comparison result in the status register.)
  this translates into a lda first_value, cmp second_value sequence after which a conditional branch is possible.



IF_XX:
------
if[_XX] [<expression>] {
        ...
}
[ else {
        ...     ; evaluated when the condition is not met
} ]


==> DESUGARING ==>

(no else:)

                if[_!XX] [<expression>] go il65_if_999_end          ; !XX being the conditional inverse of XX
                .... (true part)
il65_if_999_end ; code continues after this


(with else):
                if[_XX] [<expression>] go il65_if_999
                ... (else part)
                go il65_if_999_end
il65_if_999     ... (true part)
il65_if_999_end ; code continues after this
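
The two desugarings above can be expressed as a small Python function. This is a
sketch for illustration only: the names desugar_if and INVERSE are made up, only
the eight primitive conditions are covered (the combined ones such as le or gt
are not), and the optional <expression> is left out.

```python
# Conditional inverse (!XX) of each primitive condition.
INVERSE = {
    "cc": "cs", "cs": "cc",
    "vc": "vs", "vs": "vc",
    "eq": "ne", "ne": "eq",
    "pos": "min", "min": "pos",
}


def desugar_if(cond, true_part, else_part=None, label="il65_if_999"):
    """Return the desugared statement list for an if_XX block (illustrative)."""
    end = label + "_end"
    if else_part is None:
        # no else: skip the true part when the inverse condition holds
        return ["if_" + INVERSE[cond] + " go " + end, *true_part, end]
    # with else: the else part comes first, the true part gets its own label
    return ["if_" + cond + " go " + label, *else_part, "go " + end,
            label, *true_part, end]


if __name__ == "__main__":
    for line in desugar_if("eq", ["A = 1"], ["A = 2"]):
        print(line)
```

In a real compiler the 999 in the label would be replaced by a unique number for
every if statement, so that nested or repeated ifs do not clash.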


IF X <COMPARISON> Y:
--------------------

==> DESUGARING ==>
        compare X, Y
        if_XX go ....
        XX based on <COMPARISON>.



WHILE:
------
while[_XX] <expression> {
        ...
        continue
        break
}

==> DESUGARING ==>

                go il65_while_999_check         ; jump to the check
il65_while_999
                ... (code)
                go il65_while_999               ;continue
                go il65_while_999_end           ;break
il65_while_999_check
                if[_XX] <expression> go il65_while_999  ; loop condition
il65_while_999_end      ; code continues after this
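
The same transformation, written out as a Python sketch. The function name
desugar_while is invented for this example. It follows the listing above
literally, including the jump targets used there for continue and break, and it
treats the loop body as a flat list of statement strings (no nested loops).

```python
def desugar_while(cond, expression, body, label="il65_while_999"):
    """Return the desugared statement list for a while_XX loop (illustrative)."""
    jumps = {
        "continue": "go " + label,           # as in the listing above
        "break": "go " + label + "_end",
    }
    return [
        "go " + label + "_check",            # jump to the check first
        label,
        *[jumps.get(statement, statement) for statement in body],
        label + "_check",
        "if_" + cond + " " + expression + " go " + label,   # loop condition
        label + "_end",
    ]


if __name__ == "__main__":
    for line in desugar_while("ne", "X", ["X--", "break"]):
        print(line)
```

Placing the check at the bottom means each iteration needs only one conditional
branch; the price is a single extra jump when the loop is entered.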



REPEAT:
-------

repeat {
        ...
        continue
        break
} until[_XX] <expression>

==> DESUGARING ==>

il65_repeat_999
                ... (code)
                go il65_repeat_999              ;continue
                go il65_repeat_999_end          ;break
                if[_!XX] <expression> go il65_repeat_999        ; loop condition via conditional inverse of XX
il65_repeat_999_end      ; code continues after this



FOR:
----

for <loopvar> = <from_expression> to <to_expression> [step <step_expression>] {
        ...
        break
        continue
}


@todo how to do signed integer loopvars?


==> DESUGARING ==>

                loopvar = <from_expression>
                compare loopvar, <to_expression>
                if_ge go il65_for_999_end       ; loop condition
                step = <step_expression>        ; (store only if step < -1 or step > 1)
il65_for_999
                go il65_for_999_end             ;break
                go il65_for_999_loop            ;continue
                ....  (code)
il65_for_999_loop
                loopvar += step                 ; (if step > 1 or step < -1)
                loopvar++                       ; (if step == 1)
                loopvar--                       ; (if step == -1)
                go il65_for_999                 ; continue the loop
il65_for_999_end        ; code continues after this



MEMORY BLOCK OPERATIONS:

@todo matrix,list,string memory block operations:
- matrix type operations (whole matrix, per row, per column, individual row/column)
  operations: set, get, copy (from another matrix with the same dimensions, or list with same length),
              shift-N (up, down, left, right, and diagonals, meant for scrolling)
              rotate-N (up, down, left, right, and diagonals, meant for scrolling)
              clear (set whole matrix to the given value, default 0)

- list operations (whole list, individual element)
  operations: set, get, copy (from another list with the same length), shift-N(left,right), rotate-N(left,right)
              clear (set whole list to the given value, default 0)

- list and matrix operations of course work identically on vars and on memory mapped vars of these types.

- strings: identical operations as on lists.


these call (or emit inline) optimized pieces of assembly code, so they run as fast as possible



SUBROUTINE DEFINITIONS
----------------------

External subroutines, for instance those defined in ROM, can be defined using the 'subx' statement.

        subx <identifier> ([proc_parameters]) -> ([proc_results])  <address>

        proc_parameters = sequence of "<parametername>:<register>" pairs that specify what the input parameters are
        proc_results = sequence of <register> names that specify in which register(s) the output is returned
                       if the name ends with a '?', that means the register doesn't contain a real return value but
                       is clobbered in the process so the original value it had before calling the sub is no longer valid.

        example:  "subx CLOSE (logical: A) -> (A?, X?, Y?)  $FFC3"
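
To show how the parts of a subx line relate to each other, here is a small Python
sketch that splits such a line into its components. It is not the compiler's
parser: the names SubX, SUBX_PATTERN and parse_subx are invented for this example,
and only the syntax shown above (with a $-prefixed hexadecimal address) is
accepted.

```python
import re
from typing import NamedTuple


class SubX(NamedTuple):
    name: str
    parameters: list    # (parametername, register) tuples
    results: list       # registers that hold a real return value
    clobbered: list     # registers marked with '?'
    address: int


SUBX_PATTERN = re.compile(
    r"subx\s+(\w+)\s*\(([^)]*)\)\s*->\s*\(([^)]*)\)\s*\$([0-9a-fA-F]+)\s*$")


def parse_subx(line: str) -> SubX:
    """Split a subx definition into name, parameters, results and address."""
    match = SUBX_PATTERN.match(line.strip())
    if not match:
        raise ValueError("not a valid subx definition: " + line)
    name, params, results, address = match.groups()
    parameters = []
    for item in filter(None, (p.strip() for p in params.split(","))):
        pname, register = (part.strip() for part in item.split(":"))
        parameters.append((pname, register))
    returned, clobbered = [], []
    for register in filter(None, (r.strip() for r in results.split(","))):
        if register.endswith("?"):
            clobbered.append(register[:-1])   # value is lost, not returned
        else:
            returned.append(register)
    return SubX(name, parameters, returned, clobbered, int(address, 16))


if __name__ == "__main__":
    print(parse_subx("subx CLOSE (logical: A) -> (A?, X?, Y?)  $FFC3"))
```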


ISOLATION (register preservation when calling subroutines):  @todo isolation

isolate [regs] { .... }    that adds register preservation around the containing code; default = all 3 regs, or specify which.
fcall -> fastcall, doesn't do register preservations
call -> as before, always does it, even when already in isolate block



@todo user defined subroutines


SUBROUTINE CALLS
----------------

CALL and FCALL:
        They are just inserting a call to the specified location or subroutine.
        [F]CALL: calls the subroutine and continues afterwards ('gosub'):
                [f]call <subroutine> / <label> / <address> / `[`indirect-pointer`]`  [arguments...]

        A 'call' preserves all registers when doing the procedure call and restores them afterwards.
        'fcall' (fast call) doesn't preserve registers, so generates code that is a lot faster.
        It's basically one jmp or jsr instruction. It can clobber register values because of this.
        If you provide arguments (not required) these will be matched to the subroutine's parameters.
        If you don't provide arguments, it is assumed you have prepared the correct registers etc yourself.


The following contemporary syntax to call a subroutine is also available:
        subroutine `(` [arguments...] `)`
        subroutine! `(` [arguments...] `)`
        These are understood as: "call subroutine arguments" and "fcall subroutine arguments" respectively.
        You can only call a subroutine or label this way. This syntax cannot be used
        to call a memory address or variable, you have to use the call statement for that.

GO:
        'go' continues execution with the specified routine or address and doesn't return (it is a 'goto'):
                go <subroutine> / <label> / <address> / [indirect-pointer]


@todo support call non-register args (variable parameter passing)
@todo support call return values (so that you can assign these to other variables, and allows the line to be a full expression)


@todo BITMAP DEFINITIONS:
        to define CHARACTERS (8x8 monochrome or 4x8 multicolor = 8 bytes)
                --> PLACE in memory on correct address (???k aligned)
        and SPRITES (24x21 monochrome or 12x21 multicolor = 63 bytes)
                --> PLACE in memory on correct address (base+sprite pointer, 64-byte aligned)
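
The arithmetic behind the sprite numbers above can be sketched in Python. This is
only an illustration of the sizes and the alignment rule, not a proposal for the
il65 syntax: the function names pack_sprite and sprite_pointer are made up, and it
assumes that "base" is the start of the 16 KiB memory bank that the C64 video chip
currently looks at.

```python
def pack_sprite(rows):
    """Pack 21 rows of 24 pixels ('#' = set, anything else = clear) into 63 bytes."""
    if len(rows) != 21 or any(len(row) != 24 for row in rows):
        raise ValueError("a monochrome sprite is 24 pixels wide and 21 rows high")
    data = bytearray()
    for row in rows:
        bits = "".join("1" if pixel == "#" else "0" for pixel in row)
        data += int(bits, 2).to_bytes(3, "big")    # 24 pixels = 3 bytes per row
    return bytes(data)


def sprite_pointer(address: int) -> int:
    """Return the sprite pointer value for sprite data stored at address."""
    if address % 64:
        raise ValueError("sprite data must be 64-byte aligned")
    # assumed: the pointer counts 64-byte blocks from the start of the 16 KiB bank
    return (address % 0x4000) // 64


if __name__ == "__main__":
    sprite = pack_sprite(["#" * 24] * 21)
    print(len(sprite), sprite_pointer(0x0340), sprite_pointer(0x2000))
```

A sprite occupies 63 bytes but is placed in a 64-byte slot, which is what makes a
one-byte pointer per sprite sufficient to address it within the bank.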