mirror of https://github.com/KarolS/millfork.git synced 2026-04-21 09:16:34 +00:00

Rename documentation files

This commit is contained in:
Karol Stasiak
2018-03-28 19:31:10 +02:00
parent 1fcbf9fd5b
commit 1a0737e4c9
21 changed files with 3 additions and 3 deletions
@@ -0,0 +1,55 @@
# Guide to generated label names
Many Millfork constructs generate labels.
Knowing what they mean can be useful when reading the generated assembly code.
Every generated label is of the form `.xx__11111`
where `11111` is a sequential number and `xx` is the type:
* `ah` optimized addition of carry
* `an` logical conjunction short-circuiting
* `bc` array bounds checking (`-fbounds-checking`)
* `c8` constant `#8` for `BIT` when immediate addressing is not available
* `co` greater-than comparison
* `cp` equality comparison for larger types
* `de` decrement for larger types
* `do` beginning of a `do-while` statement
* `ds` decimal right shift operation
* `el` beginning of the "else" block in an `if` statement
* `ew` end of a `while` statement
* `fi` end of an `if` statement
* `he` beginning of the body of a `while` statement
* `in` increment for larger types
* `is` optimized addition of carry using undocumented instructions
* `no` nonet to word extension caused by the `nonet` operator
* `od` end of a `do-while` statement
* `or` logical alternative short-circuiting
* `sx` sign extension, from a smaller signed type to a larger type
* `th` beginning of the "then" block in an `if` statement
* `to` end of a `for-to` loop
* `ur` a copy due to loop unrolling
* `wh` beginning of a `while` statement
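
When reading a listing, the naming scheme above can be decoded mechanically. The following Python sketch is only an illustration and not part of Millfork: the helper `describe_label`, the regular expression, and the abbreviated lookup table are assumptions derived from the `.xx__11111` pattern described above.

```python
import re

# Two-character type codes copied from the list above (a subset, for brevity).
LABEL_TYPES = {
    "bc": "array bounds checking",
    "do": "beginning of a do-while statement",
    "el": "beginning of the else block in an if statement",
    "ew": "end of a while statement",
    "fi": "end of an if statement",
    "he": "beginning of the body of a while statement",
    "od": "end of a do-while statement",
    "th": "beginning of the then block in an if statement",
    "wh": "beginning of a while statement",
}

# Assumed shape: a period, a two-character type code, two underscores, digits.
LABEL_RE = re.compile(r"^\.([a-z0-9]{2})__([0-9]+)$")


def describe_label(label):
    """Return (type_code, number, description), or None for other labels."""
    match = LABEL_RE.match(label)
    if match is None:
        return None
    code = match.group(1)
    number = int(match.group(2))
    return code, number, LABEL_TYPES.get(code, "unknown type")
```

Codes missing from the abbreviated table are reported as `unknown type` rather than rejected, so the table can be extended without touching the parser.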
@@ -0,0 +1,11 @@
# Macros and inlining
## Macros
`macro` keyword
## Automatic inlining
`--inline` command-line option
`inline` and `noinline` keyword
@@ -0,0 +1,31 @@
# Undefined behaviour
Since Millfork is only a middle-level programming language and attempts to eschew runtime checks in favour of performance,
there are many situations in which the program may not behave as expected.
In the following list, "undefined value" means an arbitrary value that cannot be relied upon,
and "undefined behaviour" means arbitrary and unpredictable behaviour that may lead to anything,
even up to hardware damage.
* array overruns: indexing past the end of an array leads to undefined behaviour
* stray pointers: indexing a pointer that doesn't point to a valid object or indexing it past the end of the pointed object leads to undefined behaviour
* reading uninitialized variables: will return undefined values
* reading variables used by return dispatch statements but not assigned a value: will return undefined values
* returning a value from a function by return dispatch to a function of different return type: will return undefined values
* passing an index out of range for a return dispatch statement
* stack overflow: exhausting the hardware stack due to excess recursion, excess function calls or excess stack-allocated variables
* on ROM-based platforms: writing to arrays
* on ROM-based platforms: using global variables with an initial value
* violating the [safe assembly rules](../lang/assembly.md)
* violating the [safe reentrancy rules](../lang/reentrancy.md)
The above list is not exhaustive.
@@ -0,0 +1,61 @@
# Undocumented opcodes
Original 6502 processors accidentally supported a bunch of extra undocumented instructions.
Millfork can emit them if so desired.
## Mnemonics
Since various assemblers use different mnemonics for undocumented opcodes,
Millfork supports multiple mnemonics per opcode. The default one is given first:
* **AHX**, AXA, SHA
* **ALR**
* **ANC**
* **ARR**
* **DCP**, DCM
* **ISC**, INS
* **LAS**
* **LAX**
* **LXA**, OAL
* **RLA**
* **RRA**
* **SAX**\*
* **SHX**, XAS
* **SHY**, SAY\*
* **SBX**, AXS\*\*
* **SRE**, LSE
* **SLO**, ASO
* **TAS**
* **XAA**, ANE
\* HuC6280 has different instructions also called SAX and SAY,
but Millfork can distinguish between them and the NMOS illegal instructions based on the addressing mode.
\*\* AXS is also used for SAX in some assemblers. Millfork interprets AXS based on the addressing mode.
## Generation
In order for the compiler to emit one of those opcodes,
an appropriate CPU architecture must be chosen (`nmos` or `ricoh`)
and the opcode must either appear in an assembly block or be a result of optimization.
Optimization will never emit any of the following opcodes due to their instability and/or uselessness:
AHX, LAS, LXA, SHX, SHY, TAS, XAA.
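
The alias list and the optimizer exclusions can be summarised as a small lookup. This Python sketch is illustrative only: the names `canonical` and `optimizer_may_emit` are hypothetical, and it deliberately ignores the addressing-mode-based disambiguation of SAX, SAY and AXS described in the footnotes.

```python
# Alias -> default mnemonic, taken from the Mnemonics list.
ALIASES = {
    "AXA": "AHX", "SHA": "AHX",
    "DCM": "DCP",
    "INS": "ISC",
    "OAL": "LXA",
    "XAS": "SHX",
    "SAY": "SHY",
    "AXS": "SBX",
    "LSE": "SRE",
    "ASO": "SLO",
    "ANE": "XAA",
}

# Default mnemonics (the bold entries in the list).
DEFAULTS = {
    "AHX", "ALR", "ANC", "ARR", "DCP", "ISC", "LAS", "LAX", "LXA", "RLA",
    "RRA", "SAX", "SHX", "SHY", "SBX", "SRE", "SLO", "TAS", "XAA",
}

# Opcodes the optimizer never emits, per the Generation section.
NEVER_EMITTED = {"AHX", "LAS", "LXA", "SHX", "SHY", "TAS", "XAA"}


def canonical(mnemonic):
    """Map any listed mnemonic to its default spelling (KeyError if unknown)."""
    name = mnemonic.upper()
    if name in DEFAULTS:
        return name
    return ALIASES[name]


def optimizer_may_emit(mnemonic):
    """True if the optimizer is allowed to produce this opcode."""
    return canonical(mnemonic) not in NEVER_EMITTED
```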
@@ -0,0 +1,70 @@
# Variable storage
Variables in Millfork can belong to one of the following storage classes:
* static: all global variables; local variables declared with `static`
* stack: local variables declared with `stack`
* automatic: other local variables
* parameter: function parameters
Variables can also belong to one of the following memory segments
(unless overridden with the `@` operator):
* zeropage: all `pointer` variables and parameters
* high RAM: all the other variables and parameters
All arrays can be considered static.
## Static variables
Static variables have a fixed and unique memory location.
Their lifetime is for the entire runtime of the program.
If they do not have an initial value declared, reading them before initialization yields an undefined value.
## Stack variables
Stack variables, as their name suggests, live on the stack.
Their lifetime starts with the beginning of the function they're in
and ends when the function returns.
They are not automatically initialized; reading them before initialization yields an undefined value.
The main advantage is that they are perfectly safe to use in reentrant code,
but the main disadvantages are:
* slower access
* bigger code
* increased stack usage
* cannot take their addresses
* cannot use them in inline assembly code blocks
## Automatic variables
Automatic variables have lifetime starting with the beginning of the function they're in
and ending when the function returns.
Most automatic variables reside in memory.
They can share their memory location with other automatic variables and parameters,
to conserve memory usage.
Some small automatic variables may be inlined to index registers.
They are not automatically initialized; reading them before initialization yields an undefined value.
Automatic local variables are not safe to use with reentrant functions; see the [relevant documentation](../lang/reentrancy.md) for more details.
Automatic variables defined with the `register` keyword have priority when it comes to register allocation.
## Parameters
Parameters have a lifetime starting with the beginning
of the function call to the function they're defined in
and ending when the function returns.
They reside in memory and can share their memory location with other parameters and automatic variables,
to conserve memory usage.
Unlike automatic variables, they are never inlined into index registers.
Parameters are not safe to use with reentrant functions; see the [relevant documentation](../lang/reentrancy.md) for more details.
@@ -0,0 +1,102 @@
# Command-line options
## General options
* `--version` Display the version number and quit.
* `--help` Display help for the command-line options.
* `--` End of the options, all the following parameters will be treated as input files, even if they look like options.
## I/O options
* `-o <file>` Output filename, without extension. Extension will be added automatically, `.prg` for Commodore, `.a2` for Apple and `.xex` for Atari.
* `-s` Also generate the assembly output. It is not compatible with any assembler and serves a purely informational purpose. The file has the same name as the output file and the extension `.asm`.
* `-g` Also generate the label file. The label file contains labels with their addresses, with duplicates removed. It can be loaded into the monitor of the VICE emulator for debugging purposes. The file has the same name as the output file and the extension `.lbl`.
* `-I <dir>;<dir>` The include directories. The current working directory is also an include directory. Those directories are searched for modules and platform definitions.
* `-t <platform>` Target platform. It is loaded from an `.ini` file found in any of the include directories. See also [this document](target-platforms.md).
* `-r <program>` Run given program after successful compilation. Useful for automatically launching emulators without any external scripting.
## Verbosity options
* `-q` Suppress all messages except for errors.
* `-v`, `-vv`, `-vvv` Increase verbosity, various levels.
## Code generation options
* `-fcmos-ops`, `-fno-cmos-ops` Whether the compiler should emit CMOS opcodes.
`.ini` equivalent: `emit_cmos`.
Default: yes if targeting a 65C02-compatible architecture, no otherwise.
* `-fillegals`, `-fno-illegals` Whether the compiler should emit illegal (undocumented) NMOS opcodes.
`.ini` equivalent: `emit_illegals`.
Default: no.
* `-f65ce02-ops`, `-fno-65ce02-ops` Whether the compiler should emit 65CE02 opcodes.
`.ini` equivalent: `emit_65ce02`.
Default: yes if targeting 65CE02, no otherwise.
* `-fhuc6280-ops`, `-fno-huc6280-ops` Whether the compiler should emit HuC6280 opcodes.
`.ini` equivalent: `emit_huc6280`.
Default: yes if targeting HuC6280, no otherwise.
* `-fno-65816-ops`, `-femulation-65816-ops`, `-fnative-65816-ops` Which subset of 65816 instructions to support.
`-fnative-65816-ops` is required to use any 16-bit operations.
Currently, there is not much support in the compiler for the native mode.
`.ini` equivalent: `emit_65816`.
Default: native if targeting 65816, no otherwise.
* `-fjmp-fix`, `-fno-jmp-fix` Whether the compiler should prevent the indirect JMP bug on page boundaries.
`.ini` equivalent: `prevent_jmp_indirect_bug`.
Default: no if targeting a 65C02-compatible architecture, yes otherwise.
* `-fzp-register`, `-fno-zp-register` Whether the compiler should reserve 2 bytes of zero page as a pseudoregister.
Enabling it makes more language features available.
`.ini` equivalent: `zeropage_register`.
Default: yes.
* `-fdecimal-mode`, `-fno-decimal-mode` Whether decimal mode should be available.
`.ini` equivalent: `decimal_mode`.
Default: no if targeting Ricoh, yes otherwise.
* `-fvariable-overlap`, `-fno-variable-overlap` Whether variables should overlap if their scopes do not intersect.
Default: yes.
* `-fbounds-checking`, `-fno-bounds-checking` Whether the compiler should insert bounds checking on array access.
Default: no.
* `-fcompact-dispatch-params`, `-fno-compact-dispatch-params`
Whether parameter values in return dispatch statements may overlap other objects.
This may cause problems if the parameter table is stored next to a hardware register that has side effects when reading.
`.ini` equivalent: `compact_dispatch_params`. Default: yes.
## Optimization options
* `-O0` Disable all optimizations.
* `-O`, `-O2`, `-O3` Optimize code, various levels.
* `-O9` Optimize code using superoptimizer (experimental). Computationally expensive, decent results.
* `--inline` Inline functions automatically (experimental). See the [documentation about inlining](../abi/inlining.md). Computationally easy, can give decent gains.
* `--size` Optimize for size, sacrificing some speed (experimental).
* `--fast` Optimize for speed, even if it increases the size a bit (experimental).
* `--blast-processing` Optimize for speed, even if it increases the size a lot (experimental).
* `--dangerous-optimizations` Use dangerous optimizations (experimental). Dangerous optimizations are more likely to result in broken code.
## Warning options
* `-Wall` Enable extra warnings.
* `-Wfatal` Treat warnings as errors.
@@ -0,0 +1,65 @@
# Famicom/NES programming guide
## Program lifecycle
The default Famicom vectors are defined as follows:
* on reset, the predefined `on_reset` routine is called, which in turn calls `main`.
The `main` routine is not allowed to return, or the program will crash.
* on NMI, the default interrupt handler calls the `nmi` routine.
It should not be declared as `interrupt`: the handler already is, so your routine must not be.
* on IRQ, the default interrupt handler calls the `irq` routine.
It should not be declared as `interrupt`: the handler already is, so your routine must not be.
The minimal Famicom program thus looks like this:

    void main() {
        // initialize things
        while(true) { }
    }

    void irq() {
        // do things
    }

    void nmi() {
        // do things
    }

## Mappers
To use a mapper of your choice, create a new `.ini` file with the definitions you need.
The most important ones are `[output]format` and `[allocation]segments`.
Currently, it's a bit inconvenient to create programs using mappers that change the bank containing the interrupt vectors.
Therefore, it's recommended to stick to mappers that have a fixed bank at the end of the address space.
Mappers that should be fine: NROM (0), CNROM (3), UxROM (2), MMC2 (9), MMC3 (4), MMC4 (10), MMC6 (4).
Mappers that can have an arbitrary bank at the end and are therefore not recommended: MMC1 (1), MMC5 (5).
You should define at least three segments:
* `default` from $200 to $7FF, it will represent the physical RAM of the console.
* `chrrom` (sample name) from $0000 to $1FFF, it will represent the CHRROM
(if you need more, you can make it bigger, up to $ffff, or even add another segment of CHRROM).
Put there only arrays with pattern tables. Don't read from them directly, it won't work.
* `prgrom` (sample name) it will contain the code of your program and read-only data.
Each segment should be defined in a range it is going to be switched into.
You should set the `default_code_segment` to the segment that contains the $FFxx addresses.
If your mapper supports it, you can add more CHRROM or PRGROM segments,
just specify them correctly in the `[output]format` tag.
The `[output]format` tag should contain a valid iNES or NES 2.0 header of the mapper of your choice
and then all the segments in proper order (first PRGROM, then CHRROM).
See [the MMC4 example](../../include/nes_mmc4.ini) to see how it can be done.
See [the NesDev wiki](https://wiki.nesdev.com/w/index.php/NES_2.0) for more info about the NES 2.0 file format.
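
As a rough aid for filling in the leading bytes of the `[output]format` tag, the Python sketch below computes a plain 16-byte iNES header. It is a minimal sketch under stated assumptions: the function name `ines_header` is hypothetical, only mapper numbers 0 to 255 are handled, all mirroring, battery and NES 2.0 bits are left at zero, and how the resulting bytes are spelled inside the `.ini` file is not covered here.

```python
def ines_header(mapper, prg_rom_bytes, chr_rom_bytes):
    """Build a basic 16-byte iNES header (no mirroring/battery flags set)."""
    if not 0 <= mapper <= 255:
        raise ValueError("this sketch only handles mapper numbers 0..255")
    if prg_rom_bytes % 0x4000 != 0:
        raise ValueError("PRGROM size must be a multiple of 16K")
    if chr_rom_bytes % 0x2000 != 0:
        raise ValueError("CHRROM size must be a multiple of 8K")
    flags6 = (mapper & 0x0F) << 4   # low nibble of the mapper number
    flags7 = mapper & 0xF0          # high nibble of the mapper number
    return bytes([
        0x4E, 0x45, 0x53, 0x1A,     # "NES" followed by an MS-DOS EOF byte
        prg_rom_bytes // 0x4000,    # PRGROM size in 16K units
        chr_rom_bytes // 0x2000,    # CHRROM size in 8K units
        flags6,
        flags7,
    ]) + bytes(8)                   # remaining header bytes left as zero
```

For a 32K PRGROM + 8K CHRROM NROM program, `ines_header(0, 0x8000, 0x2000)` starts with the bytes `4E 45 53 1A 02 01 00 00`.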
@@ -0,0 +1,54 @@
# Getting started
## Hello world example
Save the following as `hello_world.mfk`:
```
import stdio
array hello_world = "hello world" petscii
void main(){
putstr(hello_world, hello_world.length)
while(true){}
}
```
Compile it using the following command line:
```
java -jar millfork.jar hello_world.mfk -o hello_world -t c64 -I path_to_millfork\include
```
Run the output executable (here using the VICE emulator):
```
x64 hello_world.prg
```
## Basic commandline usage
The following options are crucial when compiling your sources:
* `-o FILENAME` specifies the base name for your output file, an appropriate file extension will be appended (`prg` for Commodore, `xex` for Atari, `a2` for Apple, `asm` for assembly output, `lbl` for label file)
* `-I DIR;DIR;DIR;...` specifies the paths to directories with modules to include.
* `-t PLATFORM` specifies the target platform (`c64` is the default). Each platform is defined in an `.ini` file in the include directory. For the list of supported platforms, see [Supported platforms](target-platforms.md)
You may also be interested in the following:
* `-O`, `-O2`, `-O3` enable optimization (various levels)
* `--inline` automatically inline functions for better optimization
* `-s` additionally generate assembly output
* `-g` additionally generate a label file, in format compatible with VICE emulator
* `-r PROGRAM` automatically launch given program after successful compilation
* `-Wall` enable all warnings
* `--help` list all commandline options
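
For build scripts, the options above can be assembled programmatically. The Python sketch below only builds the argument list and does not run anything; the function `millfork_command` and its parameters are hypothetical, and `millfork.jar` is assumed to be reachable under that name, as in the hello world example.

```python
def millfork_command(source, output, platform="c64", include_dirs=(),
                     optimization=None, assembly=False, labels=False,
                     jar="millfork.jar"):
    """Build an argument list for invoking the compiler via java -jar."""
    command = ["java", "-jar", jar, source, "-o", output, "-t", platform]
    if include_dirs:
        # Multiple include directories are separated with semicolons.
        command += ["-I", ";".join(include_dirs)]
    if optimization is not None:
        if optimization not in ("-O", "-O2", "-O3"):
            raise ValueError("unsupported optimization flag: " + optimization)
        command.append(optimization)
    if assembly:
        command.append("-s")   # additionally generate assembly output
    if labels:
        command.append("-g")   # additionally generate a label file
    return command
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with the semicolon-separated include path.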
@@ -0,0 +1,154 @@
# Target platforms
Currently, Millfork supports creating disk- or tape-based programs for Commodore, Apple and Atari 8-bit computers,
but it may be expanded to support other 6502-based platforms in the future.
## Supported platforms
The following platforms are currently supported:
* `c64` Commodore 64
* `c64_scpu` Commodore 64 with SuperCPU in emulation mode
* `c64_scpu16` Commodore 64 with SuperCPU in native, 16-bit mode (very buggy)
* `c16` Commodore 16
* `plus4` Commodore Plus/4
* `vic20` Commodore VIC-20 without memory expansion
* `vic20_3k` Commodore VIC-20 with 3K memory expansion
* `vic20_8k` Commodore VIC-20 with 8K or 16K memory expansion
* `c128` Commodore 128 in its native mode
* `pet` Commodore PET
* `nes_small` a tiny 32K PRGROM + 8K CHRROM Famicom/NES program, using iNES mapper 0 (NROM)
* `nes_mmc4` a 128K PRGROM + 128K CHRROM + extra 8K RAM Famicom/NES program, using iNES mapper 10 (MMC4)
For more complex programs, you need to create your own "platform" definition.
Read [the NES programming guide](./famicom-programming-guide.md) for more info.
* `a8` Atari 8-bit computers
* `apple2` Apple II+/IIe/Enhanced IIe
The primary and most tested platform is Commodore 64.
Currently, targets that assume that the program will be loaded from disk or tape are better tested.
Cartridge targets may exhibit unexpected bugs.
### A note about Apple II
Apple II variants other than II+/IIe/Enhanced IIe are untested;
this includes the original II, IIc and IIc+, but also later compatible computers (Apple III and IIgs).
They may or may not work.
The compiler output is a raw machine code file, which then has to be put on a disk.
You can do it using [CiderPress](http://a2ciderpress.com/),
[AppleCommander](https://applecommander.github.io/),
or some other tool.
The file has to be loaded from $0C00. An example of how to put such a file onto a disk using AppleCommander:

    java -jar AppleCommander-1.3.5.jar -p disk_image.dsk FILENAME B 0xc00 < compiler_output.a2

Creating a bootable disk is beyond the scope of this document.
## Adding a custom platform
Every platform is defined in an `.ini` file with an appropriate name.
#### `[compilation]` section
* `arch` CPU architecture. It defines which instructions are available. Available values:
* `nmos` (original 6502)
* `strict` (NMOS without illegal instructions)
* `ricoh` (Ricoh 2A03/2A07, NMOS without decimal mode)
* `strictricoh` (Ricoh 2A03/2A07 without illegal instructions)
* `cmos` (WDC 65C02 or 65SC02)
* `65ce02` (CSG 65CE02; experimental)
* `huc6280` (Hudson HuC6280; experimental)
* `65816` (WDC 65816/65802; experimental; currently only programs that use only 16-bit addressing are supported)
* `modules` comma-separated list of modules that will be automatically imported
* other compilation options (they can be overridden using commandline options):
* `emit_illegals` whether the compiler should emit illegal instructions, default `false`
* `emit_cmos` whether the compiler should emit CMOS instructions, default is `true` on compatible processors and `false` elsewhere
* `emit_65816` which 65816 instructions the compiler should emit, either `no`, `emulation` or `native`
* `decimal_mode` whether the compiler should emit decimal instructions, default is `false` on `ricoh` and `strictricoh` and `true` elsewhere
* `ro_arrays` whether the compiler should warn upon array writes, default is `false`
* `prevent_jmp_indirect_bug` whether the compiler should try to avoid the indirect JMP bug,
default is `false` on 65C02-compatible processors and `true` elsewhere
* `compact_dispatch_params` whether parameter values in return dispatch statements may overlap other objects, default is `true`
This may cause problems if the parameter table is stored next to a hardware register that has side effects when reading.
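
The architecture-dependent defaults listed above can be condensed into one function. This Python sketch is an illustration, not the compiler's actual logic: `default_options` is a hypothetical name, and treating `cmos`, `65ce02`, `huc6280` and `65816` as the "65C02-compatible" architectures is an assumption of this sketch.

```python
ARCHITECTURES = {
    "nmos", "strict", "ricoh", "strictricoh",
    "cmos", "65ce02", "huc6280", "65816",
}

# Assumption: these are the architectures counted as 65C02-compatible.
CMOS_COMPATIBLE = {"cmos", "65ce02", "huc6280", "65816"}

RICOH = {"ricoh", "strictricoh"}


def default_options(arch):
    """Return the documented defaults of the [compilation] options."""
    if arch not in ARCHITECTURES:
        raise ValueError("unknown architecture: " + arch)
    cmos = arch in CMOS_COMPATIBLE
    return {
        "emit_illegals": False,
        "emit_cmos": cmos,
        "decimal_mode": arch not in RICOH,
        "ro_arrays": False,
        "prevent_jmp_indirect_bug": not cmos,
        "compact_dispatch_params": True,
    }
```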
#### `[allocation]` section
* `zp_pointers` either a list of comma separated zeropage addresses that can be used by the program as zeropage pointers, or `all` for all. Each value should be the address of the first of two free bytes in the zeropage.
* `segments` a comma-separated list of segment names.
A segment named `default` is always required.
Default: `default`. In all options below, `NAME` refers to a segment name.
* `default_code_segment` the default segment for code and initialized arrays.
Note that the default segment for uninitialized arrays and variables is always `default`.
Default: `default`
* `segment_NAME_start` the first address used for automatic allocation in the segment.
Note that the `default` segment shouldn't start before $200, as the $0-$1FF range is reserved for the zeropage and the stack.
The `main` function will be placed as close to the beginning of its segment as possible, but not necessarily at `segment_NAME_start`
* `segment_NAME_end` the last address in the segment
* `segment_NAME_codeend` the last address in the segment for code and initialized arrays.
Only uninitialized variables are allowed between `segment_NAME_codeend` and `segment_NAME_end`.
Default: the same as `segment_NAME_end`.
* `segment_NAME_datastart` the first address used for non-zeropage variables, or `after_code` if the variables should be allocated after the code.
Default: `after_code`.
#### `[output]` section
* `style` how multi-segment programs should be output:
* `single` output a single file, based mostly, but not necessarily only on data in the `default` segment (the default)
* `per_segment` generate a separate file with each segment
* `format` output file format; a comma-separated list of tokens:
* literal byte values
* `startaddr` little-endian 16-bit address of the first used byte of the compiled output (not necessarily the segment start)
* `endaddr` little-endian 16-bit address of the last used byte of the compiled output (usually not the segment end)
* `allocated` all used bytes
* `<addr>:<addr>` - inclusive range of bytes
* `<segment>:<addr>:<addr>` - inclusive range of bytes in a given segment
* `extension` target file extension, with or without the dot
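
To make the `format` tokens more concrete, the Python sketch below renders a token list against a toy memory image. It is a simplified model under several assumptions: `render_output` is a hypothetical helper, numbers are written as Python-style literals (such as `0x4E`) rather than in whatever syntax the `.ini` parser accepts, `allocated` is interpreted as every byte from the first to the last used address, and the `<segment>:<addr>:<addr>` form is not handled.

```python
def render_output(tokens, memory):
    """Render format tokens; memory maps used addresses to byte values."""
    first, last = min(memory), max(memory)
    out = bytearray()
    for token in tokens:
        if token == "startaddr":
            out += first.to_bytes(2, "little")
        elif token == "endaddr":
            out += last.to_bytes(2, "little")
        elif token == "allocated":
            # Gaps between used bytes are filled with zeros in this model.
            out += bytes(memory.get(a, 0) for a in range(first, last + 1))
        elif ":" in token:
            # Inclusive address range, e.g. "0x080D:0x080E".
            low, high = (int(part, 0) for part in token.split(":"))
            out += bytes(memory.get(a, 0) for a in range(low, high + 1))
        else:
            out.append(int(token, 0))   # literal byte value
    return bytes(out)
```

With this model, the token list `startaddr,allocated` yields a two-byte load address followed by the program bytes, which is the layout of a Commodore `.prg` file.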
@@ -0,0 +1,37 @@
# Documentation
**★ WORK IN PROGRESS ★**
## Compiler usage
* [Getting started](api/getting-started.md)
* [Command-line option reference](api/command-line.md)
* [Target platform reference](api/target-platforms.md)
## Language reference
* [Syntax](lang/syntax.md)
* [Types](lang/types.md)
* [Operators reference](lang/operators.md)
* [Functions](lang/functions.md)
* [Inline assembly syntax](lang/assembly.md)
* [Important guidelines regarding reentrancy](lang/reentrancy.md)
## Implementation details
* [Variable storage](abi/variable-storage.md)
* [Undefined behaviour](abi/undefined-behaviour.md)
* [Undocumented instruction support](abi/undocumented.md)
* [Reference for labels in generated assembly code](abi/generated-labels.md)
@@ -0,0 +1,184 @@
# Using assembly within Millfork programs
There are two ways to include raw assembly code in your Millfork programs:
* inline assembly code blocks
* whole assembly functions
## Assembly syntax
Millfork inline assembly uses the same three-letter opcodes as most other 6502 assemblers.
Indexing syntax is also the same. Only instructions available on the current CPU architecture are available.
**Work in progress**:
Currently, `RMBx`/`SMBx`/`BBRx`/`BBSx` and some extra 65CE02/HuC6280/65816 instructions are not supported yet.
Undocumented instructions are supported under various mnemonics.
Labels have to be followed by a colon and they can optionally be on a separate line.
Indentation is not important:

    first: INC x
    second:
        INC y
    INC z

Label names have to start with a letter and can contain digits, underscores and letters.
This means that they cannot start with a period like in many other assemblers.
Similarly, anonymous labels designated with `+` or `-` are also not supported.
Labels are global,
which means that they live in the same namespace as functions, types and global variables.
Assembly can refer to variables and constants defined in Millfork,
but you need to be careful with using absolute vs immediate addressing:

    const byte fiveConstant = 5
    byte fiveVariable = 5

    byte ten() {
        byte result
        asm {
            LDA #fiveConstant
            CLC
            ADC fiveVariable
            STA result
        }
        return result
    }

Any assembly opcode can be prefixed with `?`, which allows the optimizer to change it or elide it if needed.
Opcodes without that prefix will always be compiled as written.
You can insert macros into assembly, by prefixing them with `+` and using the same syntax as in Millfork:

    macro void run(byte x) {
        output = x
    }

    byte output @$c000

    void main () {
        byte a
        a = 7
        asm {
            + run(a)
        }
    }

Currently there is no way to insert raw bytes into inline assembly
(required for certain optimizations and calling conventions).
## Assembly functions
Assembly functions can be declared as `macro` or not.
A macro assembly function is inserted into the calling function like an inline assembly block,
and therefore usually it shouldn't end with `RTS` or `RTI`.
A non-macro assembly function should end with `RTS`, `JMP` or `RTI` as appropriate,
or it should be an external function.
For both macro and non-macro assembly functions,
the return type can be any valid return type, like for Millfork functions.
If the size of the return type is one byte,
then the result is passed via the accumulator.
If the size of the return type is two bytes,
then the low byte of the result is passed via the accumulator
and the high byte of the result is passed via the X register.
### Assembly function parameters
An assembly function can have parameters.
They differ from what is used by Millfork functions.
Macro assembly functions can have the following parameter types:
* reference parameters: `byte ref paramname`: every occurrence of the parameter will be replaced with the variable given as an argument
* constant parameters: `byte const paramname`: every occurrence of the parameter will be replaced with the constant value given as an argument
For example, if you have:

    macro asm void increase(byte ref v, byte const inc) {
        LDA v
        CLC
        ADC #inc
        STA v
    }

and call `increase(score, 10)`, the entire call will compile into:

    LDA score
    CLC
    ADC #10
    STA score

Non-macro functions can only have their parameters passed via registers:
* `byte a`, `byte x`, `byte y`: a single byte passed via the given CPU register
* `word xa`, `word ax`, `word ay`, `word ya`, `word xy`, `word yx`: a 2-byte word passed via the given two CPU registers, with the high byte passed through the first register and the low byte passed through the second register
Macro assembly functions can have at most one parameter passed via a register.
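
The register-pair naming can be easy to get backwards, so here is a small Python model of it. The helper `split_word` is hypothetical and exists only to illustrate the rule that the first letter names the register holding the high byte and the second letter the register holding the low byte.

```python
def split_word(pair, value):
    """Return the register contents for a word passed as e.g. `word xa`."""
    if len(pair) != 2 or pair[0] == pair[1]:
        raise ValueError("expected two distinct registers, e.g. 'xa'")
    if any(register not in "axy" for register in pair):
        raise ValueError("registers must be a, x or y")
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in a word")
    high_register, low_register = pair
    return {high_register: value >> 8, low_register: value & 0xFF}
```

Note that the two-byte return convention described earlier (low byte in the accumulator, high byte in X) corresponds to the `xa` ordering in this model.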
### External functions
An external function should be declared with a defined memory address
and the `extern` keyword instead of the body:

    asm void putchar(byte a) @$FFD2 extern

## Safe assembly
Since assembly gives the programmer unlimited access to all machine features,
certain assumptions about the code may be broken.
In order to make assembly cooperate with the rest of the Millfork code,
it should abide by the following rules:
* don't leave the D flag set
* don't jump between functions if either of functions has stack variables
* don't do `RTS` or `RTI` if the function has stack variables
* don't jump or call things that are not functions or labels
* don't store data in locations other than variables or arrays
* don't change the stack pointer
* end non-macro assembly functions with `RTS`, `JMP` or `RTI` as appropriate
* on NMOS 6502:
* don't use `XAA`, `LXA`, `AHX`, `SHX`, `SHY`, `LAS` and `TAS` instructions
* on 65816:
* keep the direct page register set to $0000
* keep the M and X flags set to 1 (8-bit registers by default, native mode)
* if running in the native mode, be careful with the stack pointer (you should keep it between $000100 and $0001FF)
* do not change the data page register (keep an eye on the `PLD`, `MVN`, `MVP` instructions)
* explicitly use 16-bit immediate operands when appropriate; the assembler doesn't track flags and assumes 8-bit immediates by default
* use far jumps unless you're sure that the called function returns with an `RTS`
* on 65CE02:
* keep the `B` register set to $00
* don't change the `E` flag
* on HuC6280:
* don't use the `SET` instruction
The above list is not exhaustive.
@@ -0,0 +1,45 @@
# Function definitions
Syntax:
`[segment (<segment>)] [<modifiers>] <return_type> <name> ( <params> ) [@ <address>] { <body> }`
`[segment (<segment>)] asm <return_type> <name> ( <params> ) @ <address> extern`
* `<segment>`: segment name; if absent, then defaults to `default_code_segment` as defined for the platform
* `<modifiers>`: zero or more of the following:
* `asm` the function is written in assembly, not in Millfork (obligatory for `extern` functions),
see [Using assembly within Millfork programs#Assembly functions](./assembly.md#assembly-functions)
* `macro` the function is a macro,
see [Macros and inlining#Macros](../abi/inlining.md#macros)
* `inline` the function should preferably be inlined,
see [Macros and inlining#Automatic inlining](../abi/inlining.md#automatic-inlining)
* `noinline` the function should never be inlined
* `interrupt` the function is a hardware interrupt handler.
You are not allowed to call such functions directly.
The function cannot have parameters and the return type should be `void`.
* `kernal_interrupt` the function is an interrupt handler called from a generic vendor-provided hardware interrupt handler.
The hardware interrupt handler is assumed to have preserved the CPU registers,
so this function only has to preserve the zeropage pseudoregisters.
An example is the Commodore 64 interrupt handler that calls the function at an address read from $314/$315.
Unlike hardware handlers with `interrupt`, you can treat functions with `kernal_interrupt` like normal functions.
* `<return_type>` is a valid return type, see [Types](./types.md)
* `<params>` is a comma-separated list of parameters, in form `type name`. Allowed types are the same as for local variables.
* `<address>` is a constant expression that defines where in the memory the function is or will be located.
* `extern` is a keyword that marks functions that are not defined in the current program,
but are likely to be available at a certain address in memory.
Such functions should be marked as written in assembly and should have their parameters passed through registers.
* `<body>` is a newline-separated list of either Millfork or assembly statements
@@ -0,0 +1,39 @@
# Interfacing with external code
## Calling external functions at a static address
To call an external function, you need to declare it as `asm extern`. For example:
```
asm void putchar(byte a) @$FFD2 extern
```
The function parameter will be passed via the accumulator,
the function itself is located in ROM at $FFD2. A call like this:
```
putchar(13)
```
will be compiled to something like this:
```
LDA #13
JSR $FFD2
```
For more details about how to pass parameters to `asm` functions,
see [Using assembly within Millfork programs#Assembly functions](./assembly.md#assembly-functions).
## Calling external functions at a dynamic address
To call a function that has its address calculated dynamically,
you just need to do the same as what you would do in assembly:
```
asm void call_function(byte a) {
JMP (function_address)
}
```
where `function_address` is a variable that contains the address of the function to call.
@@ -0,0 +1,46 @@
# Literals and initializers
## Numeric literals
Decimal: `1`, `10`
Binary: `%0101`, `0b101001`
Quaternary: `0q2131`
Octal: `0o172`
Hexadecimal: `$D323`, `0x2a2`
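
The prefixes above map directly onto number bases. The following Python sketch shows one way to decode them; `parse_literal` is a hypothetical helper written for illustration, and it is more permissive than a real lexer would be, since Python's `int` also accepts signs, surrounding whitespace and underscores.

```python
# (prefix, base) pairs for the literal forms listed above.
PREFIXES = [
    ("0x", 16), ("$", 16),
    ("0b", 2), ("%", 2),
    ("0q", 4),
    ("0o", 8),
]


def parse_literal(text):
    """Convert a numeric literal in any of the listed forms to an int."""
    for prefix, base in PREFIXES:
        if text.startswith(prefix):
            return int(text[len(prefix):], base)
    return int(text, 10)   # no prefix: plain decimal
```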
## String literals
String literals are surrounded with double quotes and followed by the name of the encoding:

    "this is a string" ascii

Characters between the quotes are interpreted literally,
there is no way to escape special characters or quotes.
Currently available encodings:
* `ascii` standard ASCII
* `pet` or `petscii` PETSCII (ASCII-like character set used by Commodore machines)
* `scr` Commodore screencodes
When programming for Commodore,
use `pet` for strings you're printing using standard I/O routines
and `scr` for strings you're copying to screen memory directly.
## Array initialisers
An array is initialized with either a string literal,
or a list of byte literals and strings, surrounded by brackets:

    array a = [1, 2]
    array b = "----" scr
    array c = ["hello world!" ascii, 13]

Trailing commas (`[1, 2,]`) are not allowed.
@@ -0,0 +1,201 @@
# Operators
Unlike in high-level languages, operators in Millfork have limited applicability.
Not every well-formed expression is actually compilable.
Most expressions involving single bytes compile,
but for larger types usually you need to use in-place modification operators.
Further improvements to the compiler may increase the number of acceptable combinations.
Certain expressions require the command-line flag `-fzp-register` (`.ini` equivalent: `zeropage_register`) to be enabled.
They will be marked with (zpreg) next to them.
The flag is enabled by default, but you can disable it if you need to.
## Precedence
Millfork has different operator precedence compared to most other languages. From highest to lowest it goes:
* `*`, `*'`
* `+`, `+'`, `-`, `-'`, `|`, `&`, `^`, `>>`, `>>'`, `<<`, `<<'`, `>>>>`
* `:`
* `==`, `!=`, `<`, `>`, `<=`, `>=`
* `&&`
* `||`
* assignment and in-place modification operators
You cannot use two different operators at the same precedence levels without using parentheses to disambiguate.
This is to prevent confusion about whether `a + b & c << d` means `(a + b) & (c << d)`, `((a + b) & c) << d` or something else.
The only exceptions are `+` and `-`, and `+'` and `-'`.
They are interpreted as expected: `5 - 3 + 2 == 4` and `5 -' 3 +' 2 == 4`.
Note that you cannot mix `+'` and `-'` with `+` and `-`.
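
The rules above can be illustrated with a short sketch. The function and variable names are hypothetical, and the lines marked as rejected are commented out on the assumption that the compiler refuses them as described.

```
void precedence_demo() {
    byte a
    byte b
    byte c
    byte d
    byte r
    a = 5
    b = 3
    c = 2
    d = 1
    r = (a + b) & (c << d)   // parentheses disambiguate operators of the same level
    r = a - b + c            // + and - may be mixed: (5 - 3) + 2
    r = a -' b +' c          // +' and -' may be mixed with each other
    // r = a + b & c         // rejected: + and & share a precedence level
    // r = a + b -' c        // rejected: + mixed with -'
}
```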
## Argument types
In the descriptions below, arguments to the operators are explained as follows:
* `byte` means any one-byte type
* `word` means any two-byte type, or a byte expanded to a word
* `long` means any type longer than two bytes, or a shorter type expanded to such length to match the other argument
* `constant` means a compile-time constant
* `simple` means either: a constant, a non-stack variable,
a pointer indexed with a constant, a pointer indexed with a non-stack variable,
an array indexed with a constant, an array indexed with a non-stack variable,
an array indexed with a sum of a constant and a non-stack variable,
or a split-word expression made of two simple expressions.
Examples: `1`, `a`, `p[2]`, `p[i]`, `arr[2]`, `arr[i]`, `arr[i+2]`, `h:l`, `h[i]:l[i]`
Such expressions have the property that the only register they may clobber is Y.
* `mutable` means an expression that can be assigned to
## Split-word operator
Expressions of the shape `h:l` where `h` and `l` are of type byte, are considered expressions of type word.
If and only if both `h` and `l` are assignable expressions, then `h:l` is also an assignable expression.
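
A minimal sketch of both directions, assuming global variables with the hypothetical names `hi`, `lo` and `w`:

```
byte hi
byte lo
word w

void split_demo() {
    w = hi:lo    // read: hi becomes the high byte, lo the low byte
    hi:lo = w    // write: allowed because both hi and lo are assignable
}
```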
## Binary arithmetic operators
* `+`, `-`:
`byte + byte`
`constant word + constant word`
`constant long + constant long`
`constant word + byte`
`word + word` (zpreg)
* `*`: multiplication; the size of the result is the same as the size of the arguments
`byte * constant byte`
`constant byte * byte`
`constant word * constant word`
`constant long * constant long`
`byte * byte` (zpreg)
There are no division, remainder or modulo operators.
## Bitwise operators
* `|`, `^`, `&`: OR, EXOR and AND
`byte | byte`
`constant word | constant word`
`constant long | constant long`
`word | word` (zpreg)
* `<<`, `>>`: bit shifting; shifting pads the result with zeroes
`byte << byte`
`word << byte` (zpreg)
`constant word << constant byte`
`constant long << constant byte`
* `>>>>`: shifting a 9-bit value and returning a byte; `a >>>> b` is equivalent to `(a & $1FF) >> b`
`word >>>> constant byte`
## Decimal arithmetic operators
These operators work using the decimal arithmetic and will not work on Ricoh CPUs.
The compiler issues a warning if these operators appear in the code.
* `+'`, `-'`: decimal addition/subtraction
`byte +' byte`
`constant word +' constant word`
`constant long +' constant long`
`word +' word` (zpreg)
* `*'`: decimal multiplication
`constant *' constant`
* `<<'`, `>>'`: decimal multiplication/division by power of two
`byte <<' constant byte`
## Comparison operators
These operators (except for `!=`) can accept more than 2 arguments.
In such case, the result is true if each comparison in the group is true.
Note you cannot mix those operators, so `a <= b < c` is not valid.
Note that currently in cases like `a < f() < b`, `f()` will be evaluated twice!
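
The following sketch shows a chained comparison and one possible way to avoid the double evaluation mentioned above, by storing the result of the call in a variable first. The functions `f` and `chain_demo` are hypothetical.

```
// hypothetical function with a side effect worth not repeating
byte f() {
    return 15
}

byte chain_demo(byte a, byte b) {
    byte t
    t = f()          // evaluate f() exactly once
    if a < t < b {   // true only if a < t and t < b
        return 1
    }
    return 0
}
```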
* `==`: equality
`byte == byte`
`simple word == simple word`
`simple long == simple long`
* `!=`: inequality
`byte != byte`
`simple word != simple word`
`simple long != simple long`
* `>`, `<`, `<=`, `>=`: ordering comparison
`byte > byte`
`simple word > simple word`
`simple long > simple long`
Currently, `>`, `<`, `<=`, `>=` operators perform unsigned comparison
if none of the types of their arguments is signed,
and fail to compile otherwise. This will be changed in the future.
## Assignment and in-place modification operators
* `=`: normal assignment
`mutable byte = byte`
`mutable word = word`
`mutable long = long`
* `+=`, `+'=`, `|=`, `^=`, `&=`: modification in place
`mutable byte += byte`
`mutable word += word`
`mutable long += long`
* `<<=`, `>>=`: shift in place
`mutable byte <<= byte`
`mutable word <<= byte`
`mutable long <<= byte`
* `<<'=`, `>>'=`: decimal shift in place
`mutable byte <<'= constant byte`
`mutable word <<'= constant byte`
`mutable long <<'= constant byte`
* `-=`, `-'=`: subtraction in place
`mutable byte -= byte`
`mutable word -= simple word`
`mutable long -= simple long`
* `*=`: multiplication in place
`mutable byte *= constant byte`
`mutable byte *= byte` (zpreg)
* `*'=`: decimal multiplication in place
`mutable byte *'= constant byte`
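
As an illustration of why in-place operators matter for larger types, here is a sketch that sticks to the forms listed above. The variable names are hypothetical, and the commented-out line is an assumption based on `long + long` being listed only for constants.

```
long total
long delta

void accumulate() {
    total += delta             // mutable long += long
    total -= delta             // mutable long -= simple long (delta is a non-stack variable)
    total <<= 1                // mutable long <<= byte
    // total = total + delta   // not among the listed forms for non-constant longs
}
```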
## Indexing
While Millfork does not consider indexing an operator, this is a place as good as any to discuss it.
An expression of form `a[i]`, where `i` is an expression of type `byte`, is:
* when `a` is an array: an access to the `i`-th element of the array `a`
* when `a` is a pointer variable: an access to the byte in memory at address `a + i`
Those expressions are of type `byte`. If `a` is any other kind of expression, `a[i]` is invalid.
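
A short sketch of both kinds of indexing follows. The names are hypothetical, and `$0400` is only an assumed example address (screen memory on the Commodore 64).

```
array buffer[16]
pointer screen

void indexing_demo() {
    byte i
    i = 3
    buffer[i] = 65           // array access: element 3 of buffer
    screen = $0400           // assumed address, for illustration only
    screen[i] = buffer[i]    // pointer access: the byte at address screen + 3
}
```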
## Built-in functions
* `not`: negation of a boolean expression
`not(bool)`
* `nonet`: expansion of an 8-bit operation to a 9-bit operation
`nonet(byte + byte)`
`nonet(byte +' byte)`
`nonet(byte << constant byte)`
`nonet(byte <<' constant byte)`
Other kinds of expressions than the above (even `nonet(byte + byte + byte)`) will not work as expected.
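
One way `nonet` and `>>>>` might be combined is averaging two bytes without losing the carry. This is only a sketch and the function name is hypothetical. For `a = 200` and `b = 100`, the 9-bit sum is 300 and the result is 150, whereas a plain 8-bit addition would wrap around to 44 before shifting.

```
byte average(byte a, byte b) {
    // nonet keeps the ninth bit of the sum; >>>> shifts it back into a byte
    return nonet(a + b) >>>> 1
}
```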
+70
View File
@@ -0,0 +1,70 @@
# Reentrancy
A function is called reentrant
if its execution can be interrupted and the function can then be safely called again.
When programming in Millfork, you need to distinguish conceptually three kinds of reentrant functions:
* nesting-safe
* recursion-safe
* interrupt-safe
As Millfork is a middle-level language, it leaves taking care of those issues to the programmer.
## Nesting safety
Nesting occurs when a function is called when calculating parameters for another call of the same function:
f(f(4))
f(0, f(1,1))
f(g(f(5)))
f(g()) // where g calls f, directly or indirectly
Since parameters are passed via global variables,
calling a function while preparing parameters for another call to the same function may cause undefined behaviour.
For that reason, a function is considered nesting-safe if it has at most one parameter.
It is possible to make a safe nested call to a non-nesting safe function, provided two conditions are met:
* the function cannot modify its parameters
* the non-nested parameters have to have the same values in all co-occurring calls: `f(5, f(5, 6, 7), 7)`
In all other cases, the nested call may cause undefined behaviour.
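
When neither condition holds, one possible workaround is to avoid the nesting altogether by storing the inner result in a variable first. The sketch below assumes a hypothetical two-parameter function `add`.

```
byte add(byte x, byte y) {
    return x + y
}

void nesting_demo() {
    byte inner
    byte r
    // r = add(0, add(1, 1))   // nested call to a function with two parameters
    inner = add(1, 1)          // the inner call completes first
    r = add(0, inner)          // parameters for the outer call are prepared afterwards
}
```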
## Recursion safety
A function is recursive if it calls itself, either directly or indirectly.
Since most automatic variables will be overwritten by the inner call, the function is recursion-safe if:
* parameters are no longer read after the recursive call is made
* an automatic variable is not read from without reinitialization after each recursive call
* all the other variables are stack variables
In all other cases, the recursive call may cause undefined behaviour.
The easiest, but suboptimal, way to make a function recursion-safe is to make all local variables stack-allocated
and to assign all parameters to variables as soon as possible. This is slow though, so don't do it unless really necessary.
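
A minimal sketch of that approach, with hypothetical names: the parameter is copied to a stack variable before the recursive call, and only the stack variable is read afterwards.

```
byte last_level

void descend(byte n) {
    stack byte level
    level = n                  // copy the parameter before any recursive call
    if level != 0 {
        descend(level - 1)     // the inner call overwrites n, but not level
        last_level = level     // level is stack-allocated, so it is still valid
    }
}
```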
## Interrupt safety
A function is interrupt-safe if it can be safely called, either directly or indirectly,
simultaneously by the main code and by an interrupt routine.
The only way to make a function interrupt-safe is to have no parameters and make all local variables stack-allocated.
## Reentrancy safety violations
Each of the following things is a violation of reentrancy safety rules and will cause undefined behaviour with high probability:
* calling a non-nesting-safe function without extra precautions as above while preparing another call to that function
* calling a non-recursion-safe function from within itself recursively
* calling a non-interrupt-safe function from both the main code and an interrupt
+222
View File
@@ -0,0 +1,222 @@
# Syntax
For information about types, see [Types](./types.md).
For information about literals, see [Literals](./literals.md).
For information about assembly, see [Using assembly within Millfork programs](./assembly.md).
## Comments
Comments start with `//` and last until the end of line.
## Declarations
### Variable declarations
A variable declaration can appear either at the top level of a file (*global* variables),
or at the top level of a function (*local* variables).
Syntax:
`[segment(<segment>)] [<storage>] <type> <name> [@<address>] [= <initial_value>]`
* `<segment>`: segment name; if absent, then defaults to `default`.
* `<storage>` can be only specified for local variables. It can be either `stack`, `static`, `register` or nothing.
`register` is only a hint for the optimizer.
See [the description of variable storage](../abi/variable-storage.md).
* `<address>` is a constant expression that defines where in the memory the variable will be located.
If not specified, it will be located according to the usual allocation rules.
`stack` variables cannot have a defined address.
* `<initial_value>` is a constant expression that contains the initial value of the variable.
Only global variables can be initialized that way.
The behaviour is undefined when targeting a ROM-based platform.
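
A few declarations following the syntax above, as a sketch. All names are hypothetical and `$D020` is only an assumed example address.

```
byte counter = 5        // global variable with an initial value
byte border @$D020      // global variable at a fixed address

void storage_demo() {
    stack byte depth    // stack variable; cannot have a defined address
    static byte calls   // explicitly static local variable
    register byte i     // only a hint for the optimizer
    depth = 0           // locals cannot be initialized in the declaration
    calls = 0
    i = 0
}
```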
### Constant declarations
`const <type> <name> = <value>`
TODO
### Array declarations
An array is a contiguous sequence of bytes in memory.
Syntax:
`[segment(<segment>)] array <name> [[<size>]] [@<address>] [= <initial_values>]`
* `<segment>`: segment name; if absent,
then defaults to `default_code_segment` as defined for the platform if the array has initial values,
or to `default` if it doesn't.
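
Some array declarations following the syntax above, as a sketch with hypothetical names; `$0400` is only an assumed example address.

```
array buffer[256]                // uninitialized array of 256 bytes
array screen[1000] @$0400        // array placed at a fixed address
array primes = [2, 3, 5, 7]      // size inferred from the initial values
array message = "hello" ascii    // initialized from a string literal
```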
TODO
### Function declarations
A function can be declared at the top level. For more details, see [Functions](./functions.md)
## `import` statements
TODO
## Statements
### Expression statement
TODO
### `if` statement
Syntax:
```
if <expression> {
<body>
}
```
```
if <expression> {
<body>
} else {
<body>
}
```
### `return` statement
Syntax:
```
return
```
```
return <expression>
```
### `return[]` statement (return dispatch)
Syntax examples:
```
return [a + b] {
0 @ underflow
255 @ overflow
default @ nothing
}
```
```
return [getF()] {
1 @ function1
2 @ function2
default(5) @ functionDefault
}
```
```
return [i] (param1, param2) {
1,5,8 @ function1(4, 6)
2 @ function2(9)
default(0,20) @ functionDefault
}
```
Return dispatch calculates the value of an index, picks the correct branch,
assigns some global variables and jumps to another function.
The index has to evaluate to a byte. The functions cannot be `macro` and shouldn't have parameters.
Jumping to a function with parameters gives those parameters undefined values.
The functions are not called, so they don't return to the function the return dispatch statement is in, but to its caller.
The return values are passed along. If the dispatching function has a non-`void` return type different from the type
of the function dispatched to, the return value is undefined.
If the `default` branch exists, then it is used for every missing index value between other supported values.
Optional parameters to `default` specify the maximum, or both the minimum and maximum supported index value.
In the above examples: the first example supports values 0–255, the second 1–5, and the third 0–20.
If the index has an unsupported value, the behaviour is formally undefined, but in practice the program will simply crash.
Before jumping to the function, the chosen global variables will be assigned parameter values.
Variables have to be global byte-sized. Some simple array indexing expressions are also allowed.
Parameter values have to be constants.
For example, in the third example one of the following will happen:
* if `i` is 1, 5 or 8, then `param1` is assigned 4, `param2` is assigned 6 and then `function1` is called;
* if `i` is 2, then `param1` is assigned 9, `param2` is assigned an undefined value and then `function2` is called;
* if `i` is any other value from 0 to 20, then `param1` and `param2` are assigned undefined values and then `functionDefault` is called;
* if `i` has any other value, then undefined behaviour.
### `while` and `do-while` statements
Syntax:
```
while <expression> {
<body>
}
```
```
do {
<body>
} while <expression>
```
### `for` statements
**Warning: `for` loops are a bit buggy.**
Syntax:
```
for <variable>,<start>,<direction>,<end> {
<body>
}
```
* `<variable>` an already defined numeric variable
* `<direction>` the range to traverse:
* `to` from `<start>` inclusive to `<end>` inclusive, in ascending order
(e.g. `0,to,9` to traverse 0, 1,... 9)
* `downto` from `<start>` inclusive to `<end>` inclusive, in descending order
(e.g. `9,downto,0` to traverse 9, 8,... 0)
* `until` from `<start>` inclusive to `<end>` exclusive, in ascending order
(e.g. `0,until,10` to traverse 0, 1,... 9)
* `parallelto` the same as `to`, but the iterations may be executed in any order
* `paralleluntil` the same as `until`, but the iterations may be executed in any order
There is no `paralleldownto`, because it would do the same as `parallelto`.
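
The directions above can be sketched as follows, using a hypothetical 10-element array. Given the warning about `for` loops, treat this as an illustration of the syntax rather than a guarantee of behaviour.

```
array buffer[10]

void loop_demo() {
    byte i
    for i,0,until,10 {           // i = 0, 1, ... 9
        buffer[i] = i
    }
    for i,9,downto,0 {           // i = 9, 8, ... 0
        buffer[i] = 0
    }
    for i,0,paralleluntil,10 {   // same range as until, in any order
        buffer[i] = 1
    }
}
```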
### `break` and `continue` statements
Syntax:
```
break
break for
break while
break do
break <variable>
continue
continue for
continue while
continue do
continue <variable>
```
### `asm` statements
See [Using assembly within Millfork programs](./assembly.md).
+36
View File
@@ -0,0 +1,36 @@
# Types
Millfork puts extra limitations on which types can be used in which contexts.
## Numeric types
* `byte` 1-byte value of undefined signedness, defaulting to unsigned
* `word` 2-byte value of undefined signedness, defaulting to unsigned
* `long` 4-byte value of undefined signedness, defaulting to unsigned
* `sbyte` signed 1-byte value
* `ubyte` unsigned 1-byte value
* `pointer` the same as `word`, but variables of this type default to be zero-page-allocated
and you can index `pointer` variables (not arbitrary `pointer`-typed expressions though, `f()[0]` won't compile)
Functions cannot return types longer than 2 bytes.
Numeric types can be converted automatically:
* from a smaller type to a bigger type (`byte`→`word`)
* from a type of undefined signedness to a type of defined signedness (`byte`→`sbyte`)
* from a type of defined signedness to a type of undefined signedness (`sbyte`→`byte`)
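
The three conversions can be sketched with plain assignments; the variable names are hypothetical.

```
void conversion_demo() {
    byte b
    word w
    sbyte s
    b = 200
    w = b    // smaller to bigger: byte → word
    s = b    // undefined to defined signedness: byte → sbyte
    b = s    // defined to undefined signedness: sbyte → byte
}
```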
## Boolean types
TODO
## Special types
* `void` a unit type containing no information, can be only used as a return type for a function.