mirror of https://github.com/KarolS/millfork.git synced 2024-06-02 00:41:40 +00:00

Some more documentation

Karol Stasiak 2018-01-04 01:15:04 +01:00
parent 1f020e2ced
commit 76122a2dd7
16 changed files with 525 additions and 3 deletions


@ -1,7 +1,32 @@
# Documentation
**★ WORK IN PROGRESS ★**
## Tutorial
* [Getting started](tutorial/01-getting-started.md)
* [Basic functions and variables](tutorial/02-functions-variables.md)
## Compiler usage
* [Command-line option reference](api/command-line.md)
* [Target platform reference](api/target-platforms.md)
## Language reference
* [Inline assembly syntax](lang/assembly.md)
* [Important guidelines regarding reentrancy](lang/reentrancy.md)
## Implementation details
* [Variable storage](abi/variable-storage.md)
* [Undefined behaviour](abi/undefined-behaviour.md)
* [Undocumented instruction support](abi/undocumented.md)
* [Reference for labels in generated assembly code](abi/generated-labels.md)


@ -0,0 +1,47 @@
# Guide to generated label names
Many Millfork constructs generate labels.
Knowing what they mean can be useful when reading the generated assembly code.
Every generated label is of the form `.xx__11111`
where `11111` is a sequential number and `xx` is the type:
* `ah` optimized addition of carry
* `an` logical conjunction short-circuiting
* `c8` constant `#8` for `BIT` when immediate addressing is not available
* `co` greater-than comparison
* `cp` equality comparison for larger types
* `de` decrement for larger types
* `do` beginning of a `do-while` statement
* `ds` decimal right shift operation
* `el` beginning of the "else" block in an `if` statement
* `ew` end of a `while` statement
* `fi` end of an `if` statement
* `he` beginning of the body of a `while` statement
* `in` increment for larger types
* `is` optimized addition of carry using undocumented instructions
* `od` end of a `do-while` statement
* `or` logical alternative short-circuiting
* `sx` sign extension, from a smaller signed type to a larger type
* `th` beginning of the "then" block in an `if` statement
* `wh` beginning of a `while` statement
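
When reading a long assembly listing, it can help to annotate generated labels automatically. The following is a minimal Python sketch, not part of Millfork itself: the function name `describe_label` and the regular expression are assumptions made for illustration (in particular, the sketch accepts any number of digits after the two underscores), while the prefix meanings are copied from the list above.

```python
import re

# Prefix meanings copied from the list above.
LABEL_KINDS = {
    "ah": "optimized addition of carry",
    "an": "logical conjunction short-circuiting",
    "c8": "constant #8 for BIT when immediate addressing is not available",
    "co": "greater-than comparison",
    "cp": "equality comparison for larger types",
    "de": "decrement for larger types",
    "do": "beginning of a do-while statement",
    "ds": "decimal right shift operation",
    "el": 'beginning of the "else" block in an if statement',
    "ew": "end of a while statement",
    "fi": "end of an if statement",
    "he": "beginning of the body of a while statement",
    "in": "increment for larger types",
    "is": "optimized addition of carry using undocumented instructions",
    "od": "end of a do-while statement",
    "or": "logical alternative short-circuiting",
    "sx": "sign extension, from a smaller signed type to a larger type",
    "th": 'beginning of the "then" block in an if statement',
    "wh": "beginning of a while statement",
}

# Assumed shape: a period, a two-character type, two underscores, then digits.
LABEL_PATTERN = re.compile(r"^\.([a-z0-9]{2})__(\d+)$")


def describe_label(label):
    """Return (type, number, meaning) for a generated label, or None otherwise."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        return None  # not a generated label, e.g. a function name
    kind, number = match.group(1), int(match.group(2))
    return kind, number, LABEL_KINDS.get(kind, "unknown label type")
```

Calling `describe_label(".wh__00012")` yields the type `wh`, the sequence number 12 and the description of a `while` loop start, whereas an ordinary name such as `main` yields `None`.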

9
doc/abi/inlining.md Normal file

@ -0,0 +1,9 @@
# Function inlining
## Explicit inlining
`inline` keyword
## Automatic inlining
`--inline` command-line option


@ -0,0 +1,21 @@
# Undefined behaviour
Since Millfork is only a middle-level programming language and attempts to eschew runtime checks in favour of performance,
there are many situations in which the program may not behave as expected.
In the following list, "undefined value" means an arbitrary value that cannot be relied upon,
and "undefined behaviour" means arbitrary and unpredictable behaviour that may lead to anything,
even up to hardware damage.
* array overruns: indexing past the end of an array leads to undefined behaviour
* stray pointers: indexing a pointer that doesn't point to a valid object or indexing it past the end of the pointed object leads to undefined behaviour
* reading uninitialized variables: will return undefined values
* stack overflow: exhausting the hardware stack due to excess recursion, excess function calls or excess stack-allocated variables
* violating the [safe assembly rules](../lang/assembly.md)
* violating the [safe reentrancy rules](../lang/reentrancy.md)
The above list is not exhaustive.

58
doc/abi/undocumented.md Normal file

@ -0,0 +1,58 @@
# Undocumented opcodes
Original 6502 processors accidentally supported a bunch of extra undocumented instructions.
Millfork can emit them if so desired.
## Mnemonics
Since various assemblers use different mnemonics for undocumented opcodes,
Millfork supports multiple mnemonics per opcode. The default one is given first:
* **AHX**, AXA, SHA
* **ALR**
* **ANC**
* **ARR**
* **DCP**, DCM
* **ISC**, INS
* **LAS**
* **LAX**
* **LXA**, OAL
* **RLA**
* **RRA**
* **SAX**
* **SHX**, XAS
* **SHY**, SAY
* **SBX**, AXS\*
* **SRE**, LSE
* **SLO**, ASO
* **TAS**
* **XAA**, ANE
\* AXS is also used for SAX in some assemblers, but Millfork always interprets AXS as a synonym for SBX
## Generation
In order for the compiler to emit one of those opcodes,
an appropriate CPU architecture must be chosen (`nmos` or `ricoh`),
and the opcode must either appear in an assembly block or be produced by optimization.
Optimization will never emit any of the following opcodes due to their instability and/or uselessness:
AHX, LAS, LXA, SHX, SHY, TAS, XAA.
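
The alias table and the optimizer exclusion list can be combined into a small lookup. This is an illustrative Python sketch and not Millfork's actual implementation; the names `default_mnemonic` and `optimizer_may_emit` are hypothetical, and the second function only models the exclusion list above (it ignores the CPU architecture requirement).

```python
# Default mnemonic -> alternative spellings, as listed in the Mnemonics section.
ALIASES = {
    "AHX": ["AXA", "SHA"],
    "ALR": [],
    "ANC": [],
    "ARR": [],
    "DCP": ["DCM"],
    "ISC": ["INS"],
    "LAS": [],
    "LAX": [],
    "LXA": ["OAL"],
    "RLA": [],
    "RRA": [],
    "SAX": [],
    "SHX": ["XAS"],
    "SHY": ["SAY"],
    "SBX": ["AXS"],  # AXS is always read as SBX, never as SAX
    "SRE": ["LSE"],
    "SLO": ["ASO"],
    "TAS": [],
    "XAA": ["ANE"],
}

# Opcodes the optimizer never emits, per the paragraph above.
NEVER_EMITTED_BY_OPTIMIZER = {"AHX", "LAS", "LXA", "SHX", "SHY", "TAS", "XAA"}

# Reverse lookup: every accepted spelling -> default mnemonic.
TO_DEFAULT = {default: default for default in ALIASES}
for default, aliases in ALIASES.items():
    for alias in aliases:
        TO_DEFAULT[alias] = default


def default_mnemonic(mnemonic):
    """Return the default spelling of an undocumented mnemonic, or None."""
    return TO_DEFAULT.get(mnemonic.upper())


def optimizer_may_emit(mnemonic):
    """True if the opcode is undocumented and not on the exclusion list."""
    default = default_mnemonic(mnemonic)
    return default is not None and default not in NEVER_EMITTED_BY_OPTIMIZER
```

For example, `default_mnemonic("axs")` resolves to `SBX`, and `optimizer_may_emit("ANE")` is false because `ANE` is a synonym for the unstable `XAA`.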


@ -0,0 +1,68 @@
# Variable storage
Variables in Millfork can belong to one of the following storage classes:
* static: all global variables; local variables declared with `static`
* stack: local variables declared with `stack`
* automatic: other local variables
* parameter: function parameters
Variables can also belong to one of the following memory segments
(unless overridden with the `@` operator):
* zeropage: all `pointer` variables and parameters
* high RAM: all the other variables and parameters
All arrays can be considered static.
## Static variables
Static variables have a fixed and unique memory location.
Their lifetime is for the entire runtime of the program.
If they do not have an initial value declared, reading them before initialization yields an undefined value.
## Stack variables
Stack variables, as their name suggests, live on the stack.
Their lifetime starts with the beginning of the function they're in
and ends when the function returns.
They are not automatically initialized; reading them before initialization yields an undefined value.
The main advantage is that they are perfectly safe to use in reentrant code,
but the main disadvantages are:
* slower access
* bigger code
* increased stack usage
* cannot take their addresses
* cannot use them in inline assembly code blocks
## Automatic variables
Automatic variables have a lifetime starting with the beginning of the function they're in
and ending when the function returns.
Most automatic variables reside in memory.
They can share their memory location with other automatic variables and parameters,
to conserve memory usage.
Some small automatic variables may be inlined to index registers.
They are not automatically initialized; reading them before initialization yields an undefined value.
Automatic local variables are not safe to use with reentrant functions, see the [relevant documentation](../lang/reentrancy.md) for more details.
## Parameters
Parameters have a lifetime starting with the beginning
of the function call to the function they're defined in
and ending when the function returns.
They reside in memory and can share their memory location with other parameters and automatic variables,
to conserve memory usage.
Unlike automatic variables, they are never inlined into index registers.
Parameters are not safe to use with reentrant functions, see the [relevant documentation](../lang/reentrancy.md) for more details.

62
doc/api/command-line.md Normal file

@ -0,0 +1,62 @@
# Command-line options
## General options
* `--version` Display the version number and quit.
* `--help` Display help for the command-line options.
* `--` End of the options, all the following parameters will be treated as input files, even if they look like options.
## I/O options
* `-o <file>` Output filename, without extension. Extension will be added automatically, `.prg` for Commodore and `.xex` for Atari.
* `-s` Also generate the assembly output. It is not compatible with any assembler and serves a purely informational purpose. The file has the same name as the output file and the extension `.asm`.
* `-g` Also generate the label file. The label file contains labels with their addresses, with duplicates removed. It can be loaded into the monitor of the Vice emulator for debugging purposes. The file has the same name as the output file and the extension `.lbl`.
* `-I <dir>;<dir>` The include directories. The current working directory is also an include directory. Those directories are searched for modules and platform definitions.
* `-t <platform>` Target platform. It is loaded from an `.ini` file found in any of the include directories. See also [this document](target-platforms.md).
* `-r <program>` Run given program after successful compilation. Useful for automatically launching emulators without any external scripting.
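
A build script may want to assemble these options programmatically. The sketch below is a hypothetical Python helper, not something shipped with Millfork: the function `build_command`, the executable name `millfork` and the file names in the usage note are assumptions, and only flags documented above are used.

```python
def build_command(sources, output=None, platform=None, include_dirs=(),
                  emit_asm=False, executable="millfork"):
    """Build an argument list from the I/O options described above.

    `executable` is an assumed launcher name; adjust it to however the
    compiler is started on your system.
    """
    command = [executable]
    if output is not None:
        command += ["-o", output]          # filename without extension
    if platform is not None:
        command += ["-t", platform]        # loaded from an .ini file
    if include_dirs:
        command += ["-I", ";".join(include_dirs)]  # semicolon-separated
    if emit_asm:
        command.append("-s")               # also write the .asm listing
    command.append("--")                   # everything after this is an input file
    command += list(sources)
    return command
```

With `build_command(["-weird.mfk"], output="game", platform="c64")`, the `--` separator ensures that the oddly named source file is treated as an input file rather than as an option.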
## Verbosity options
* `-q` Suppress all messages except for errors.
* `-v`, `-vv`, `-vvv` Increase verbosity, various levels.
## Code generation options
* `-fcmos-ops`, `-fno-cmos-ops` Whether to emit CMOS opcodes. `.ini` equivalent: `emit_cmos`.
* `-fillegals`, `-fno-illegals` Whether to emit illegal (undocumented) NMOS opcodes. `.ini` equivalent: `emit_illegals`.
* `-fjmp-fix`, `-fno-jmp-fix` Whether to prevent the indirect JMP bug on page boundaries. `.ini` equivalent: `prevent_jmp_indirect_bug`.
* `-fdecimal-mode`, `-fno-decimal-mode` Whether decimal mode should be available. `.ini` equivalent: `decimal_mode`.
* `-fvariable-overlap`, `-fno-variable-overlap` Whether variables should overlap if their scopes do not intersect. Default: yes.
## Optimization options
* `-O0` Disable all optimizations.
* `-O`, `-O2`, `-O3` Optimize code, various levels.
* `-O9` Optimize code using superoptimizer (experimental). Computationally expensive, decent results.
* `--inline` Inline functions automatically (experimental). See the [documentation about inlining](../abi/inlining.md). Computationally easy, can give decent gains.
* `--detailed-flow` Use detailed flow analysis (experimental). Very computationally expensive and not that great.
* `--dangerous-optimizations` Use dangerous optimizations (experimental). Dangerous optimizations are more likely to result in broken code.
## Warning options
* `-Wall` Enable extra warnings.
* `-Wfatal` Treat warnings as errors.

134
doc/lang/assembly.md Normal file

@ -0,0 +1,134 @@
# Using assembly within Millfork programs
There are two ways to include raw assembly code in your Millfork programs:
* inline assembly code blocks
* whole assembly functions
## Assembly syntax
Millfork inline assembly uses the same three-letter opcodes as most other 6502 assemblers.
Indexing syntax is also the same. Only instructions supported by the current CPU architecture can be used.
Currently, `RMBx`/`SMBx`/`BBRx`/`BBSx` are not supported.
Undocumented instructions are supported using various mnemonics.
Labels have to be followed by a colon and they can optionally be on a separate line:
    first: INC x
    second:
        INC y
Label names have to start with a letter and can contain digits, underscores and letters.
This means that they cannot start with a period like in many other assemblers.
Similarly, anonymous labels designated with `+` or `-` are also not supported.
Labels are global,
which means that they live in the same namespace as functions, types and global variables.
Assembly can refer to variables and constants defined in Millfork,
but you need to be careful with using absolute vs immediate addressing:
    const byte fiveConstant = 5
    byte fiveVariable = 5
    byte ten() {
        byte result
        asm {
            LDA #fiveConstant
            CLC
            ADC fiveVariable
            STA result
        }
        return result
    }
Any assembly opcode can be prefixed with `?`, which allows the optimizer to change it or elide it if needed.
Opcodes without that prefix will always be compiled as written.
Currently there is no way to insert raw bytes into inline assembly
(required for certain optimizations and calling conventions).
## Assembly functions
Assembly functions can be declared as `inline` or not.
An inline assembly function is inserted into the calling function like an inline assembly block,
and therefore usually it shouldn't end with `RTS` or `RTI`.
The return type on inline functions has to be `void`.
A non-inline assembly function should end with `RTS`, `JMP` or `RTI` as appropriate,
or it should be an external function.
Their return type can be any valid return type, like for Millfork functions.
If the size of the return type is one byte,
then the result is passed via the accumulator.
If the size of the return type is two bytes,
then the low byte of the result is passed via the accumulator
and the high byte of the result is passed via the X register.
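
The return convention can be modelled in a few lines of Python. This is only an illustration of the rule stated above; the function `return_registers` is hypothetical, and sizes other than one and two bytes are rejected because the text does not describe them.

```python
def return_registers(value, size):
    """Model which CPU registers carry a return value of `size` bytes."""
    if size == 1:
        return {"A": value & 0xFF}
    if size == 2:
        # Low byte in the accumulator, high byte in the X register.
        return {"A": value & 0xFF, "X": (value >> 8) & 0xFF}
    raise ValueError("only 1- and 2-byte return types are described here")
```

For instance, a two-byte result of `$1234` would leave `$34` in the accumulator and `$12` in the X register.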
### Assembly function parameters
An assembly function can have parameters.
They differ from what is used by Millfork functions.
Inline assembly functions can have the following parameter types:
* reference parameters: `byte ref paramname`: every occurrence of the parameter will be replaced with the variable given as an argument
* constant parameters: `byte const paramname`: every occurrence of the parameter will be replaced with the constant value given as an argument
For example, if you have:
    inline asm void increase(byte ref v, byte const inc) {
        LDA v
        CLC
        ADC #inc
        STA v
    }
and call `increase(score, 10)`, the entire call will compile into:
    LDA score
    CLC
    ADC #10
    STA score
Non-inline functions can only have their parameters passed via registers:
* `byte a`, `byte x`, `byte y`: a single byte passed via the given CPU register
* `word xa`, `word ax`, `word ay`, `word ya`, `word xy`, `word yx`: a 2-byte word passed via the two given CPU registers, with the high byte passed through the first register and the low byte passed through the second register
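
The register-pair naming rule for word parameters is easy to get backwards, so here is a small Python model of it. The helper `split_word` is hypothetical and exists only to illustrate the rule above: the first letter of the pair names the register holding the high byte, the second letter the register holding the low byte.

```python
VALID_PAIRS = {"xa", "ax", "ay", "ya", "xy", "yx"}


def split_word(pair, value):
    """Return the register contents for a word passed via `pair`."""
    pair = pair.lower()
    if pair not in VALID_PAIRS:
        raise ValueError("unsupported register pair: " + pair)
    high_register, low_register = pair[0].upper(), pair[1].upper()
    return {
        high_register: (value >> 8) & 0xFF,  # first register: high byte
        low_register: value & 0xFF,          # second register: low byte
    }
```

Under this rule, passing `$FFD2` as `word xa` places `$FF` in X and `$D2` in A, while `word ax` would swap the two.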
### External functions
An external function should be declared with a defined memory address
and the `extern` keyword instead of the body:
    asm void putchar(byte a) @$FFD2 extern
## Safe assembly
Since assembly gives the programmer unlimited access to all machine features,
certain assumptions about the code may be broken.
In order to make assembly cooperate with the rest of the Millfork code,
it should abide by the following rules:
* don't leave the D flag set
* don't jump between functions if either of functions has stack variables
* don't do `RTS` or `RTI` if the function has stack variables
* don't jump or call things that are not functions or labels
* don't store data in locations other than variables or arrays
* don't change the stack pointer
* end non-inline assembly functions with `RTS`, `JMP` or `RTI` as appropriate
The above list is not exhaustive.

2
doc/lang/functions.md Normal file

@ -0,0 +1,2 @@
# Function definitions

2
doc/lang/interfacing.md Normal file

@ -0,0 +1,2 @@
# Interfacing with external code

1
doc/lang/literals.md Normal file

@ -0,0 +1 @@
# Literals and initializers

70
doc/lang/reentrancy.md Normal file

@ -0,0 +1,70 @@
# Reentrancy
A function is called reentrant,
when its execution can be interrupted and the function can be then safely called again.
When programming in Millfork, you need to distinguish conceptually three kinds of reentrant functions:
* nesting-safe
* recursion-safe
* interrupt-safe
As Millfork is a middle-level language, it leaves taking care of those issues to the programmer.
## Nesting safety
Nesting occurs when a function is called when calculating parameters for another call of the same function:
    f(f(4))
    f(0, f(1,1))
    f(g(f(5)))
    f(g()) // where g calls f, directly or indirectly
Since parameters are passed via global variables,
calling a function while preparing parameters for another call to the same function may cause undefined behaviour.
For that reason, a function is considered nesting-safe if it has at most one parameter.
It is possible to make a safe nested call to a non-nesting-safe function, provided two conditions are met:
* the function cannot modify its parameters
* the non-nested parameters have to have the same values in all co-occurring calls: `f(5, f(5, 6, 7), 7)`
In all other cases, the nested call may cause undefined behaviour.
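
To see why nesting goes wrong, the following Python sketch models a two-parameter function whose parameters live in fixed memory locations. It is a simulation under stated assumptions, not Millfork code: the names `PARAMS`, `f_body` and `call_f` are hypothetical, and the sketch assumes that arguments are evaluated and stored left to right, which the text above does not specify.

```python
# Hypothetical static storage for the two parameters of f(a, b).
PARAMS = {"a": 0, "b": 0}


def f_body():
    """Body of the modelled function: reads its parameters from static storage."""
    return PARAMS["a"] + PARAMS["b"]


def call_f(make_a, make_b):
    """Model a call: evaluate and store each argument in turn, then run f.

    Arguments are passed as thunks so that a nested call to f can happen
    while the outer call's parameters are still being prepared.
    """
    PARAMS["a"] = make_a()
    PARAMS["b"] = make_b()  # a nested call here overwrites PARAMS["a"]
    return f_body()


# f(0, f(1, 1)): the intended result is 0 + (1 + 1) = 2,
# but the inner call replaces the outer a = 0 with a = 1.
nested = call_f(lambda: 0, lambda: call_f(lambda: 1, lambda: 1))

# f(5, f(5, 0)): the non-nested parameter is 5 in both calls,
# so the overwrite is harmless and the result is the intended 10.
safe = call_f(lambda: 5, lambda: call_f(lambda: 5, lambda: 0))
```

In this model `nested` ends up as 3 rather than 2, while `safe` keeps its intended value, which mirrors the two conditions listed above.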
## Recursion safety
A function is recursive if it calls itself, either directly or indirectly.
Since most automatic variables will be overwritten by the inner call, the function is recursion-safe if:
* parameters are no longer read after the recursive call is made
* an automatic variable is not read from without reinitialization after each recursive call
* all the other variables are stack variables
In all other cases, the recursive call may cause undefined behaviour.
The easiest, but suboptimal, way to make a function recursion-safe is to make all local variables stack-allocated
and to assign all parameters to variables as soon as possible. This is slow though, so don't do it unless really necessary.
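
The effect of reading a parameter after a recursive call can be simulated in Python. This is a model and not Millfork code: `STATIC` stands in for the single memory location of a parameter `n`, the function names are hypothetical, and a Python local variable plays the role of a stack variable.

```python
STATIC = {"n": 0}  # one shared slot for the parameter, reused by every call


def call_unsafe(n):
    STATIC["n"] = n  # passing the argument overwrites the shared slot
    return unsafe_body()


def unsafe_body():
    if STATIC["n"] == 0:
        return 1
    inner = call_unsafe(STATIC["n"] - 1)
    # Violation: the parameter is read after the recursive call,
    # when the slot already holds the innermost value (0).
    return STATIC["n"] * inner


def call_safe(n):
    STATIC["n"] = n
    return safe_body()


def safe_body():
    saved = STATIC["n"]  # copied at once; models a stack variable
    if saved == 0:
        return 1
    return saved * call_safe(saved - 1)
```

Here `call_unsafe(4)` returns 0 instead of 24, because every level multiplies by the clobbered parameter, whereas `call_safe(4)` returns 24 at the cost of the extra copy.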
## Interrupt safety
A function is interrupt-safe if it can be safely called, either directly or indirectly,
simultaneously by the main code and by an interrupt routine.
The only way to make a function interrupt-safe is to have no parameters and make all local variables stack-allocated.
## Reentrancy safety violations
Each of the following things is a violation of reentrancy safety rules and will cause undefined behaviour with high probability:
* calling a non-nesting-safe function without extra precautions as above while preparing another call to that function
* calling a non-recursion-safe function from within itself recursively
* calling a non-interrupt-safe function from both the main code and an interrupt

1
doc/lang/syntax.md Normal file

@ -0,0 +1 @@
# Syntax


@ -35,7 +35,7 @@ The following options are crucial when compiling your sources:
* `-I DIR;DIR;DIR;...` specifies the paths to directories with modules to include.
* `-t PLATFORM` specifies the target platform (`c64` is the default). Each platform is defined in an `.ini` file in the include directory. For the list of supported platforms, see [Supported platforms](../target-platforms.md)
* `-t PLATFORM` specifies the target platform (`c64` is the default). Each platform is defined in an `.ini` file in the include directory. For the list of supported platforms, see [Supported platforms](../api/target-platforms.md)
You may be also interested in the following:
@ -51,4 +51,4 @@ You may be also interested in the following:
* `-Wall` enable all warnings
* `--help` list all commandline options


@ -110,4 +110,26 @@ class AssemblySuite extends FunSuite with Matchers {
""".stripMargin)(_.readWord(0xc000) should equal(0x100))
}
test("Example from docs") {
EmuBenchmarkRun(
"""
| byte output @$c000
| void main () {
| output = ten()
| }
| const byte fiveConstant = 5
| byte fiveVariable = 5
|
| byte ten() {
| byte result
| asm {
| LDA #fiveConstant
| CLC
| ADC fiveVariable
| STA result
| }
| return result
| }
""".stripMargin)(_.readByte(0xc000) should equal(10))
}
}