mirror of
https://github.com/irmen/prog8.git
synced 2024-12-24 16:29:21 +00:00
1219 lines
57 KiB
ReStructuredText
1219 lines
57 KiB
ReStructuredText
====================
|
|
Programming in Prog8
|
|
====================
|
|
|
|
This chapter describes a high level overview of the elements that make up a program.
|
|
|
|
|
|
Elements of a program
|
|
---------------------
|
|
|
|
Program
|
|
Consists of one or more *modules*.
|
|
|
|
Module
|
|
A file on disk with the ``.p8`` suffix. It can contain *directives* and *code blocks*.
|
|
Whitespace and indentation in the source code are arbitrary and can be mixed tabs or spaces.
|
|
A module file can *import* other modules, including *library modules*.
|
|
It should be saved in UTF-8 encoding.
|
|
Line endings are significant because *only one* declaration, statement or other instruction can occur on every line.
|
|
Other whitespace and line indentation is arbitrary and ignored by the compiler.
|
|
You can use tabs or spaces as you wish.
|
|
|
|
Comments
|
|
Everything on the line after a semicolon ``;`` is a comment and is ignored by the compiler.
|
|
If the whole line is just a comment, this line will be copied into the resulting assembly source code for reference.
|
|
There's also a block-comment: everything surrounded with ``/*`` and ``*/`` is ignored and this can span multiple lines.
|
|
This block comment is experimental for now: it may change or even be removed again in a future compiler version.
|
|
The recommended way to comment out a bunch of lines remains to just bulk comment them individually with ``;``.
|
|
|
|
Directive
|
|
These are special instructions for the compiler, to change how it processes the code
|
|
and what kind of program it creates. A directive is on its own line in the file, and
|
|
starts with ``%``, optionally followed by some arguments. See the syntax reference for all directives.
|
|
The list of directives is given below at :ref:`directives`.
|
|
|
|
Code block
|
|
A block of actual program code. It has a starting address in memory,
|
|
and defines a *scope* (also known as 'namespace').
|
|
It contains variables and subroutines.
|
|
More details about this below: :ref:`blocks`.
|
|
|
|
Variable declarations
|
|
The data that the code works on is stored in variables ('named values that can change').
|
|
They are described in the chapter :ref:`variables`.
|
|
|
|
Code
|
|
These are the instructions that make up the program's logic.
|
|
Code can only occur inside a subroutine.
|
|
There are different kinds of instructions ('statements' is a better name) such as:
|
|
|
|
- value assignment
|
|
- looping (for, while, do-until, repeat, unconditional jumps)
|
|
- conditional execution (if - then - else, when, and conditional jumps)
|
|
- subroutine calls
|
|
- label definition
|
|
|
|
Subroutine
|
|
Defines a piece of code that can be called by its name from different locations in your code.
|
|
It accepts parameters and can return a value (optional).
|
|
It can define its own variables, and it is also possible to define subroutines within other subroutines.
|
|
Nested subroutines can access the variables from outer scopes easily, which removes the need and overhead to pass everything via parameters all the time.
|
|
Subroutines do not have to be declared in the source code before they can be called.
|
|
|
|
Label
|
|
This is a named position in your code where you can jump to from another place.
|
|
You can jump to it with a jump statement elsewhere. It is also possible to use a
|
|
subroutine call to a label (but without parameters and return value).
|
|
A label is an identifier followed by a colon ``:``. It's ok to put the next statement on
|
|
the same line, immediately after the label.
|
|
|
|
Scope
|
|
Also known as 'namespace', this is a named box around the symbols defined in it.
|
|
This prevents name collisions (or 'namespace pollution'), because the name of the scope
|
|
is needed as prefix to be able to access the symbols in it.
|
|
Anything *inside* the scope can refer to symbols in the same scope without using a prefix.
|
|
There are three scope levels in Prog8:
|
|
|
|
- global (no prefix), everything in a module file goes in here;
|
|
- block;
|
|
- subroutine, can be nested in another subroutine.
|
|
|
|
Even though modules are separate files, they are *not* separate scopes!
|
|
Everything defined in a module is merged into the global scope.
|
|
This is different from most other languages that have modules.
|
|
The global scope can only contain blocks and some directives, while the others can contain variables and subroutines too.
|
|
Some more details about how to deal with scopes and names is discussed below.
|
|
|
|
|
|
Identifiers
|
|
-----------
|
|
|
|
Naming things in Prog8 is done via valid *identifiers*. They start with a letter,
|
|
and after that, a combination of letters, numbers, or underscores.
|
|
Note that any Unicode Letter symbol is accepted as a letter!
|
|
Examples of valid identifiers::
|
|
|
|
a
|
|
A
|
|
monkey
|
|
COUNTER
|
|
Better_Name_2
|
|
something_strange__
|
|
knäckebröd
|
|
приблизительно
|
|
π
|
|
|
|
**Scoped names**
|
|
|
|
Sometimes called "qualified names" or "dotted names", a scoped name is a sequence of identifiers separated by a dot.
|
|
They are used to reference symbols in other scopes. Note that unlike many other programming languages,
|
|
scoped names always need to be fully scoped (because they always start in the global scope). Also see :ref:`blocks`::
|
|
|
|
main.start ; the entrypoint subroutine
|
|
main.start.variable ; a variable in the entrypoint subroutine
|
|
|
|
**Aliases**
|
|
|
|
The ``alias`` statement makes it easier to refer to symbols from other places, and they can save
|
|
you from having to type the fully scoped name everytime you need to access that symbol.
|
|
Aliases can be created in any scope except at the module level.
|
|
An alias is created with ``alias <name> = <target>`` and then you can use ``<name>`` as if it were ``<target>``.
|
|
It is possible to alias variables, labels and subroutines, but not whole blocks.
|
|
The name has to be an unscoped identifier name, the target can be any symbol.
|
|
|
|
|
|
.. _blocks:
|
|
|
|
Blocks, Scopes, and accessing Symbols
|
|
-------------------------------------
|
|
|
|
**Blocks** are the top level separate pieces of code and data of your program. They have a
|
|
starting address in memory and will be combined together into a single output program.
|
|
They can only contain *directives*, *variable declarations*, *subroutines* and *inline assembly code*::
|
|
|
|
<blockname> [<address>] {
|
|
<directives>
|
|
<variables>
|
|
<subroutines>
|
|
<inline asm>
|
|
}
|
|
|
|
The <blockname> must be a valid identifier, and must be unique in the entire program (there's
|
|
a directive to merge multiple occurences).
|
|
The <address> is optional. If specified it must be a valid memory address such as ``$c000``.
|
|
It's used to tell the compiler to put the block at a certain position in memory.
|
|
|
|
.. sidebar::
|
|
Using qualified names ("dotted names") to reference symbols defined elsewhere
|
|
|
|
Every symbol is 'public' and can be accessed from anywhere else, when given its *full* "dotted name".
|
|
So, accessing a variable ``counter`` defined in subroutine ``worker`` in block ``main``,
|
|
can be done from anywhere by using ``main.worker.counter``.
|
|
Unlike most other programming langues, as soon as a name is scoped,
|
|
Prog8 treats it as a name starting in the *global* namespace.
|
|
Relative name lookup is only performed for *non-scoped* names.
|
|
|
|
The address can be used to place a block at a specific location in memory.
|
|
Usually it is omitted, and the compiler will automatically choose the location (usually immediately after
|
|
the previous block in memory).
|
|
It must be >= ``$0200`` (because ``$00``--``$ff`` is the ZP and ``$100``--``$1ff`` is the cpu stack).
|
|
|
|
*Symbols* are names defined in a certain *scope*. Inside the same scope, you can refer
|
|
to them by their 'short' name directly. If the symbol is not found in the same scope,
|
|
the enclosing scope is searched for it, and so on, up to the top level block, until the symbol is found.
|
|
If the symbol was not found the compiler will issue an error message.
|
|
|
|
**Subroutines** create a new scope. All variables inside a subroutine are hoisted up to the
|
|
scope of the subroutine they are declared in. Note that you can define **nested subroutines** in Prog8,
|
|
and such a nested subroutine has its own scope! This also means that you have to use a fully qualified name
|
|
to access a variable from a nested subroutine::
|
|
|
|
main {
|
|
sub start() {
|
|
sub nested() {
|
|
ubyte counter
|
|
...
|
|
}
|
|
...
|
|
txt.print_ub(counter) ; Error: undefined symbol
|
|
txt.print_ub(main.start.nested.counter) ; OK
|
|
}
|
|
}
|
|
|
|
**Aliases** make it easier to refer to symbols from other places. They save
|
|
you from having to type the fully scoped name everytime you need to access that symbol.
|
|
Aliases can be created in any scope except at the module level.
|
|
You can create and use an alias with the ``alias`` statement like so::
|
|
|
|
alias score = cx16.r7L ; 'name' the virtual register
|
|
alias prn = txt.print_ub ; shorter name for a subroutine elsewhere
|
|
...
|
|
prn(score)
|
|
|
|
|
|
.. important::
|
|
Emphasizing this once more: unlike most other programming languages, a new scope is *not* created inside
|
|
for, while, repeat, and do-until statements, the if statement, and the branching conditionals.
|
|
These all share the same scope from the subroutine they're defined in.
|
|
You can define variables in these blocks, but these will be treated as if they
|
|
were defined in the subroutine instead.
|
|
|
|
|
|
Program Start and Entry Point
|
|
-----------------------------
|
|
|
|
Your program must have a single entry point where code execution begins.
|
|
The compiler expects a ``start`` subroutine in the ``main`` block for this,
|
|
taking no parameters and having no return value.
|
|
|
|
As any subroutine, it has to end with a ``return`` statement (or a ``goto`` call)::
|
|
|
|
main {
|
|
sub start () {
|
|
; program entrypoint code here
|
|
return
|
|
}
|
|
}
|
|
|
|
|
|
The ``main`` module is always relocated to the start of your programs
|
|
address space, and the ``start`` subroutine (the entrypoint) will be on the
|
|
first address. This will also be the address that the BASIC loader program (if generated)
|
|
calls with the SYS statement.
|
|
|
|
|
|
.. _directives:
|
|
|
|
Directives
|
|
-----------
|
|
|
|
.. data:: %address <address>
|
|
|
|
Level: module.
|
|
Global setting, set the program's start memory address. It's usually fixed at ``$0801`` because the
|
|
default launcher type is a CBM-BASIC program. But you have to specify this address yourself when
|
|
you don't use a CBM-BASIC launcher.
|
|
|
|
|
|
.. data:: %align <interval>
|
|
|
|
Level: not at module scope.
|
|
Tells the assembler to continue assembling on the given alignment interval. For example, ``%align $100``
|
|
will insert an assembler command to align on the next page boundary.
|
|
Note that this has no impact on variables following this directive! Prog8 reallocates all variables
|
|
using different rules. If you want to align a specific variable (array or string), you should use
|
|
one of the alignment tags for variable declarations instead.
|
|
Valid intervals are from 2 to 65536.
|
|
**Warning:** if you use this directive in between normal statements, it will disrupt the output
|
|
of the machine code instructions by making gaps between them, this will probably crash the program!
|
|
|
|
|
|
.. data:: %asm {{ ... }}
|
|
|
|
Level: not at module scope.
|
|
Declares that a piece of *assembly code* is inside the curly braces.
|
|
This code will be copied as-is into the generated output assembly source file.
|
|
Note that the start and end markers are both *double curly braces* to minimize the chance
|
|
that the assembly code itself contains either of those. If it does contain a ``}}``,
|
|
it will confuse the parser.
|
|
|
|
If you use the correct scoping rules you can access symbols from the prog8 program from inside
|
|
the assembly code. Sometimes you'll have to declare a variable in prog8 with `@shared` if it
|
|
is only used in such assembly code.
|
|
|
|
.. note::
|
|
64tass syntax is required for the assembly code. As such, mnemonics need to be written in lowercase.
|
|
|
|
.. caution::
|
|
Avoid using single-letter symbols in included assembly code, as they could be confused with CPU registers.
|
|
Also, note that all prog8 symbols are prefixed in assembly code, see :ref:`symbol-prefixing`.
|
|
|
|
|
|
.. data:: %asmbinary "<filename>" [, <offset>[, <length>]]
|
|
|
|
Level: not at module scope.
|
|
This directive can only be used inside a block.
|
|
The assembler itself will include the file as binary bytes at this point, prog8 will not process this at all.
|
|
This means that the filename must be spelled exactly as it appears on your computer's file system.
|
|
Note that this filename may differ in case compared to when you chose to load the file from disk from within the
|
|
program code itself (for example on the C64 and X16 there's the PETSCII encoding difference).
|
|
The file is located relative to the current working directory!
|
|
The optional offset and length can be used to select a particular piece of the file.
|
|
To reference the contents of the included binary data, you can put a label in your prog8 code
|
|
just before the %asmbinary. To find out where the included binary data ends, add another label directly after it.
|
|
An example program for this can be found below at the description of %asminclude.
|
|
|
|
|
|
.. data:: %asminclude "<filename>"
|
|
|
|
Level: not at module scope.
|
|
This directive can only be used inside a block.
|
|
The assembler will include the file as raw assembly source text at this point,
|
|
prog8 will not process this at all. Symbols defined in the included assembly can not be referenced
|
|
from prog8 code. However they can be referenced from other assembly code if properly prefixed.
|
|
You can of course use a label in your prog8 code just before the %asminclude directive, and reference
|
|
that particular label to get to (the start of) the included assembly.
|
|
Be careful: you risk symbol redefinitions or duplications if you include a piece of
|
|
assembly into a prog8 block that already defines symbols itself.
|
|
The compiler first looks for the file relative to the same directory as the module containing this statement is in,
|
|
if the file can't be found there it is searched relative to the current directory.
|
|
|
|
.. caution::
|
|
Avoid using single-letter symbols in included assembly code, as they could be confused with CPU registers.
|
|
Also, note that all prog8 symbols are prefixed in assembly code, see :ref:`symbol-prefixing`.
|
|
|
|
Here is a small example program to show how to use labels to reference the included contents from prog8 code::
|
|
|
|
%import textio
|
|
%zeropage basicsafe
|
|
|
|
main {
|
|
|
|
sub start() {
|
|
txt.print("first three bytes of included asm:\n")
|
|
uword included_addr = &included_asm
|
|
txt.print_ub(@(included_addr))
|
|
txt.spc()
|
|
txt.print_ub(@(included_addr+1))
|
|
txt.spc()
|
|
txt.print_ub(@(included_addr+2))
|
|
|
|
txt.print("\nfirst three bytes of included binary:\n")
|
|
included_addr = &included_bin
|
|
txt.print_ub(@(included_addr))
|
|
txt.spc()
|
|
txt.print_ub(@(included_addr+1))
|
|
txt.spc()
|
|
txt.print_ub(@(included_addr+2))
|
|
txt.nl()
|
|
return
|
|
|
|
included_asm:
|
|
%asminclude "inc.asm"
|
|
|
|
included_bin:
|
|
%asmbinary "inc.bin"
|
|
end_of_included_bin:
|
|
|
|
}
|
|
}
|
|
|
|
|
|
.. data:: %breakpoint
|
|
|
|
Level: not at module scope.
|
|
Defines a debugging breakpoint at this location. See :ref:`debugging`
|
|
|
|
|
|
.. data:: %encoding <encodingname>
|
|
|
|
Overrides, in the module file it occurs in,
|
|
the default text encoding to use for strings and characters that have no explicit encoding prefix.
|
|
You can use one of the recognised encoding names, see :ref:`encodings`.
|
|
|
|
|
|
.. data:: %import <name>
|
|
|
|
Level: module.
|
|
This reads and compiles the named module source file as part of your current program.
|
|
Symbols from the imported module become available in your code,
|
|
without a module or filename prefix.
|
|
You can import modules one at a time, and importing a module more than once has no effect.
|
|
|
|
|
|
.. data:: %launcher <type>
|
|
|
|
Level: module.
|
|
Global setting, selects the program launcher stub to use.
|
|
Only relevant when using the ``prg`` output type. Defaults to ``basic``.
|
|
|
|
- type ``basic`` : add a tiny C64 BASIC program, with a SYS statement calling into the machine code
|
|
- type ``none`` : no launcher logic is added at all
|
|
|
|
|
|
.. data:: %memtop <address>
|
|
|
|
Level: module.
|
|
Global setting, changes the program's top memory address. This is usually specified internally by the compiler target,
|
|
but with this you can change it to another value. This can be useful for example to 'reserve' a piece
|
|
of memory at the end of program space where other data such as external library files can be loaded into.
|
|
This memtop value is used for a check instruction for the assembler to see if the resulting program size
|
|
exceeds the given memtop address. This value is exclusive, so $a000 means that $a000 is the first address
|
|
that program can no longer use. Everything up to and including $9fff is still usable.
|
|
|
|
|
|
.. data:: %option <option> [, <option> ...]
|
|
|
|
Level: module, block.
|
|
Sets special compiler options.
|
|
|
|
- ``enable_floats`` (module level) tells the compiler
|
|
to deal with floating point numbers (by using various subroutines from the Kernal).
|
|
Otherwise, floating point support is not enabled. Normally you don't have to use this yourself as
|
|
importing the ``floats`` library is required anyway and that will enable it for you automatically.
|
|
- ``no_sysinit`` (module level) which cause the resulting program to *not* include
|
|
the system re-initialization logic of clearing the screen, resetting I/O config, setting memory bank configuration etc.
|
|
You'll have to take care of that yourself. The program will just start running from whatever state the machine is in when the
|
|
program was launched.
|
|
- ``force_output`` (in a block) will force the block to be outputted in the final program.
|
|
Can be useful to make sure some data is generated that would otherwise be discarded because the compiler thinks it's not referenced (such as sprite data)
|
|
- ``merge`` (in a block) will merge this block's contents into an already existing block with the same name.
|
|
Can be used to add or override subroutines to an existing library block, for instance.
|
|
Overriding (monkeypatching) happens only if the signature of the subroutine exactly matches the original subroutine, including the exact names and types of the parameters.
|
|
Where blocks with this option are merged into is intricate: it looks for the first other block with the same name that does not have %option merge,
|
|
if that can't be found, select the first occurrence regardless. If no other blocks are found, no merge is done. Blocks in libraries are considered first to merge into.
|
|
- ``splitarrays`` (block or module) makes all word-arrays in this scope lsb/msb split arrays (as if they all have the @split tag). See Arrays.
|
|
- ``no_symbol_prefixing`` (block or module) makes the compiler *not* use symbol-prefixing when translating prog8 code into assembly.
|
|
Only use this if you know what you're doing because it could result in invalid assembly code being generated.
|
|
This option can be useful when writing library modules that you don't want to be exposing prefixed assembly symbols.
|
|
- ``ignore_unused`` (block or module) suppress warnings about unused variables and subroutines. Instead, these will be silently stripped.
|
|
This option is useful in library modules that contain many more routines beside the ones that you actually use.
|
|
- ``verafxmuls`` (block, cx16 target only) uses Vera FX hardware word multiplication on the CommanderX16 for all word multiplications in this block. Warning: this may interfere with IRQs and other Vera operations, so use this only when you know what you're doing. It's safer to explicitly use ``verafx.muls()``.
|
|
|
|
|
|
.. data:: %output <type>
|
|
|
|
Level: module.
|
|
Global setting, selects program output type. Default is ``prg``.
|
|
|
|
- type ``raw`` : no header at all, just the raw machine code data
|
|
- type ``prg`` : C64 program (with load address header)
|
|
|
|
|
|
.. data:: %zeropage <style>
|
|
|
|
Level: module.
|
|
Global setting, select zeropage handling style. Defaults to ``kernalsafe``.
|
|
|
|
- style ``kernalsafe`` -- use the part of the ZP that is 'free' or only used by BASIC routines,
|
|
and don't change anything else. This allows full use of Kernal ROM routines (but not BASIC routines),
|
|
including default IRQs during normal system operation.
|
|
It's not possible to return cleanly to BASIC when the program exits. The only choice is
|
|
to perform a system reset. (A ``system_reset`` subroutine is available in the syslib to help you do this)
|
|
- style ``floatsafe`` -- like the previous one but also reserves the addresses that
|
|
are required to perform floating point operations (from the BASIC Kernal). No clean exit is possible.
|
|
- style ``basicsafe`` -- the most restricted mode; only use the handful 'free' addresses in the ZP, and don't
|
|
touch change anything else. This allows full use of BASIC and Kernal ROM routines including default IRQs
|
|
during normal system operation.
|
|
When the program exits, it simply returns to the BASIC ready prompt.
|
|
- style ``full`` -- claim the whole ZP for variables for the program, overwriting everything,
|
|
except for a few addresses that are used by the system's IRQ handler.
|
|
Even though that default IRQ handler is still active, it is impossible to use most BASIC and Kernal ROM routines.
|
|
This includes many floating point operations and several utility routines that do I/O, such as ``print``.
|
|
This option makes programs smaller and faster because even more variables can
|
|
be stored in the ZP (which allows for more efficient assembly code).
|
|
It's not possible to return cleanly to BASIC when the program exits. The only choice is
|
|
to perform a system reset. (A ``system_reset`` subroutine is available in the syslib to help you do this)
|
|
- style ``dontuse`` -- don't use *any* location in the zeropage.
|
|
|
|
.. note::
|
|
``kernalsafe`` and ``full`` on the C64 leave enough room in the zeropage to reallocate the
|
|
16 virtual registers cx16.r0...cx16.r15 from the Commander X16 into the zeropage as well
|
|
(but not on the same locations). They are relocated automatically by the compiler.
|
|
The other options need those locations for other things so those virtual registers have
|
|
to be put into memory elsewhere (outside of the zeropage). Trying to use them as zeropage
|
|
variables or pointers etc. will be a lot slower in those cases!
|
|
On the Commander X16 the registers are always in zeropage. On other targets, for now, they
|
|
are always outside of the zeropage.
|
|
|
|
|
|
.. data:: %zpallowed <fromaddress>,<toaddress>
|
|
|
|
Level: module.
|
|
Global setting, can occur multiple times. It allows you to designate a part of the zeropage that
|
|
the compiler is allowed to use (if other options don't prevent usage).
|
|
|
|
|
|
.. data:: %zpreserved <fromaddress>,<toaddress>
|
|
|
|
Level: module.
|
|
Global setting, can occur multiple times. It allows you to reserve or 'block' a part of the zeropage so
|
|
that it will not be used by the compiler.
|
|
|
|
|
|
Loops
|
|
-----
|
|
|
|
The *for*-loop is used to let a variable iterate over a range of values. Iteration is done in steps of 1, but you can change this.
|
|
|
|
.. sidebar::
|
|
Optimization
|
|
|
|
Usually a loop in descending order downto 0 or 1, produces more efficient assembly code than the same loop in ascending order.
|
|
|
|
The loop variable must be declared separately as byte or word earlier, so that you can reuse it for multiple occasions.
|
|
Iterating with a floating point variable is not supported. If you want to loop over a floating-point array, use a loop with an integer index variable instead.
|
|
If the from value is already outside of the loop range, the whole for loop is skipped.
|
|
|
|
The *while*-loop is used to repeat a piece of code while a certain condition is still true.
|
|
The *do--until* loop is used to repeat a piece of code until a certain condition is true.
|
|
The *repeat* loop is used as a short notation of a for loop where the loop variable doesn't matter and you're only interested in the number of iterations.
|
|
(without iteration count specified it simply loops forever). A repeat loop will result in the most efficient code generated so use this if possible.
|
|
|
|
You can also create loops by using the ``goto`` statement, but this should usually be avoided.
|
|
|
|
Breaking out of a loop prematurely is possible with the ``break`` statement,
|
|
immediately continue into the next cycle of the loop with the ``continue`` statement.
|
|
(These are just shorthands for a goto + a label)
|
|
|
|
The *unroll* loop is not really a loop, but looks like one. It actually duplicates the statements in its block on the spot by
|
|
the given number of times. It's meant to "unroll loops" - trade memory for speed by avoiding the actual repeat loop counting code.
|
|
Only simple statements are allowed to be inside an unroll loop (assignments, function calls etc.).
|
|
|
|
.. attention::
|
|
The value of the loop variable after executing the loop *is undefined* - you cannot rely
|
|
on it to be the last value in the range for instance! The value of the variable should only be used inside the for loop body.
|
|
(this is an optimization issue to avoid having to deal with mostly useless post-loop logic to adjust the loop variable's value)
|
|
|
|
|
|
for loop
|
|
^^^^^^^^
|
|
|
|
The loop variable must be a byte or word variable, and it must be defined separately first.
|
|
The expression that you loop over can be anything that supports iteration (such as ranges like ``0 to 100``,
|
|
array variables and strings) *except* floating-point arrays (because a floating-point loop variable is not supported).
|
|
Remember that a step value in a range must be a constant value.
|
|
|
|
You can use a single statement, or a statement block like in the example below::
|
|
|
|
for <loopvar> in <expression> [ step <amount> ] {
|
|
; do something...
|
|
break ; break out of the loop
|
|
continue ; immediately next iteration
|
|
}
|
|
|
|
For example, this is a for loop using a byte variable ``i``, defined before, to loop over a certain range of numbers::
|
|
|
|
ubyte i
|
|
|
|
...
|
|
|
|
for i in 20 to 155 {
|
|
; do something
|
|
}
|
|
|
|
To loop over a decreasing or descending range, use the ``downto`` keyword::
|
|
|
|
ubyte i
|
|
|
|
...
|
|
|
|
for i in 155 downto 20 { ; 155, 154, 153, ..., 20
|
|
; do something
|
|
}
|
|
|
|
Similarly, a descending range may be specified by using ``to`` in combination with a ``step`` that is ``< 0``::
|
|
|
|
ubyte i
|
|
|
|
...
|
|
|
|
for i in 155 to 20 step -1 { ; 155, 154, 153, ..., 20
|
|
; do something
|
|
}
|
|
|
|
The following example is a loop over the values of the array ``fibonacci_numbers``::
|
|
|
|
uword[] fibonacci_numbers = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
|
|
|
|
uword number
|
|
for number in fibonacci_numbers {
|
|
; do something with number...
|
|
break ; break out of the loop early
|
|
}
|
|
|
|
See :ref:`range-expression` for all of the details.
|
|
|
|
while loop
|
|
^^^^^^^^^^
|
|
|
|
As long as the condition is true (1), repeat the given statement(s).
|
|
You can use a single statement, or a statement block like in the example below::
|
|
|
|
while <condition> {
|
|
; do something...
|
|
break ; break out of the loop
|
|
continue ; immediately next iteration
|
|
}
|
|
|
|
|
|
do-until loop
|
|
^^^^^^^^^^^^^
|
|
|
|
Until the given condition is true (1), repeat the given statement(s).
|
|
You can use a single statement, or a statement block like in the example below::
|
|
|
|
do {
|
|
; do something...
|
|
break ; break out of the loop
|
|
continue ; immediately next iteration
|
|
} until <condition>
|
|
|
|
|
|
repeat loop
|
|
^^^^^^^^^^^
|
|
|
|
When you're only interested in repeating something a given number of times.
|
|
It's a short hand for a for loop without an explicit loop variable::
|
|
|
|
repeat 15 {
|
|
; do something...
|
|
break ; you can break out of the loop
|
|
continue ; immediately next iteration
|
|
}
|
|
|
|
If you omit the iteration count, it simply loops forever.
|
|
You can still ``break`` out of such a loop if you want though.
|
|
|
|
|
|
unroll loop
|
|
^^^^^^^^^^^
|
|
|
|
Like a repeat loop, but trades memory for speed by not generating the code
|
|
for the counter. Instead it duplicates the code inside the loop on the spot for
|
|
the given number of iterations. This means that only a constant number of iterations can be specified.
|
|
Also, only simple statements such as assignments and function calls can be inside the loop::
|
|
|
|
unroll 80 {
|
|
cx16.VERA_DATA0 = 255
|
|
}
|
|
|
|
A `break` or `continue` statement cannot occur in an unroll loop, as there is no actual loop to break out of.
|
|
|
|
|
|
Conditional Execution
|
|
---------------------
|
|
|
|
if statement
|
|
^^^^^^^^^^^^
|
|
|
|
Conditional execution means that the flow of execution changes based on certain conditions,
|
|
rather than having fixed gotos or subroutine calls::
|
|
|
|
if xx==5 {
|
|
yy = 99
|
|
zz = 42
|
|
} else {
|
|
aa = 3
|
|
bb = 9
|
|
}
|
|
|
|
if xx==5
|
|
yy = 42
|
|
else if xx==6
|
|
yy = 43
|
|
else
|
|
yy = 44
|
|
|
|
if aa>4 goto some_label
|
|
|
|
if xx==3 yy = 4
|
|
|
|
if xx==3 yy = 4 else aa = 2
|
|
|
|
|
|
Conditional jumps (``if condition goto label``) are compiled using 6502's branching instructions (such as ``bne`` and ``bcc``) so
|
|
the rather strict limit on how *far* it can jump applies. The compiler itself can't figure this
|
|
out unfortunately, so it is entirely possible to create code that cannot be assembled successfully.
|
|
Thankfully the ``64tass`` assembler that is used has the option to automatically
|
|
convert such branches to their opposite + a normal jmp. This is slower and takes up more space
|
|
and you will get warning printed if this happens. You may then want to restructure your branches (place target labels closer to the branch,
|
|
or reduce code complexity).
|
|
|
|
|
|
There is a special form of the if-statement that immediately translates into one of the 6502's branching instructions.
|
|
This allows you to write a conditional jump or block execution directly acting on the current values of the CPU's status register bits.
|
|
The eight branching instructions of the CPU each have an if-equivalent (and there are some easier to understand aliases):
|
|
|
|
====================== =====================
|
|
condition meaning
|
|
====================== =====================
|
|
``if_cs`` if carry status is set
|
|
``if_cc`` if carry status is clear
|
|
``if_vs`` if overflow status is set
|
|
``if_vc`` if overflow status is clear
|
|
``if_eq`` / ``if_z`` if result is equal to zero
|
|
``if_ne`` / ``if_nz`` if result is not equal to zero
|
|
``if_pl`` / ``if_pos`` if result is 'plus' (>= zero)
|
|
``if_mi`` / ``if_neg`` if result is 'minus' (< zero)
|
|
====================== =====================
|
|
|
|
So ``if_cc goto target`` will directly translate into the single CPU instruction ``BCC target``.
|
|
|
|
.. caution::
|
|
These special ``if_XX`` branching statements are only useful in certain specific situations where you are *certain*
|
|
that the status register (still) contains the correct status bits.
|
|
This is not always the case after a function call or other operations!
|
|
If in doubt, check the generated assembly code!
|
|
|
|
.. note::
|
|
For now, the symbols used or declared in the statement block(s) are shared with
|
|
the same scope the if statement itself is in.
|
|
Maybe in the future this will be a separate nested scope, but for now, that is
|
|
only possible when defining a subroutine.
|
|
|
|
|
|
if expression
|
|
^^^^^^^^^^^^^
|
|
|
|
Similar to the if statement, but this time selects one of two possible values as the outcome of the expression,
|
|
depending on the condition. You write it as ``if <condition> <value1> else <value2>`` and it can be
|
|
used anywhere an expression is used to assign or pass a value.
|
|
The first value will be used if the condition is true, otherwise the second value is used.
|
|
Sometimes it may be more legible if you surround the condition expression with parentheses so it is better
|
|
separated visually from the first value following it.
|
|
You must always provide two alternatives to choose from, and they can only be values (expressions).
|
|
An example, to select the number of cards to use depending on what game is played::
|
|
|
|
ubyte numcards = if game_is_piquet 32 else 52
|
|
|
|
; it's more verbose with an if statement:
|
|
ubyte numcards
|
|
if game_is_piquet
|
|
numcards = 32
|
|
else
|
|
numcards = 52
|
|
|
|
|
|
|
|
when statement ('jump table')
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Instead of writing a bunch of sequential if-elseif statements, it is more readable to
|
|
use a ``when`` statement. (It will also result in greatly improved assembly code generation)
|
|
Use a ``when`` statement if you have a set of fixed choices that each should result in a certain
|
|
action. It is possible to combine several choices to result in the same action::
|
|
|
|
when value {
|
|
4 -> txt.print("four")
|
|
5 -> txt.print("five")
|
|
10,20,30 -> {
|
|
txt.print("ten or twenty or thirty")
|
|
}
|
|
else -> txt.print("don't know")
|
|
}
|
|
|
|
The when-*value* can be any expression but the choice values have to evaluate to
|
|
compile-time constant integers (bytes or words). They also have to be the same
|
|
datatype as the when-value, otherwise no efficient comparison can be done.
|
|
The else part is optional.
|
|
Choices can result in a single statement or a block of multiple statements in which
|
|
case you have to use { } to enclose them.
|
|
|
|
|
|
.. note::
|
|
Instead of chaining several value equality checks together using ``or`` (ex.: ``if x==1 or xx==5 or xx==9``),
|
|
consider using a ``when`` statement or ``in`` containment check instead. These are more efficient.
|
|
|
|
|
|
Unconditional jump: goto
|
|
------------------------
|
|
|
|
To jump to another part of the program, you use a ``goto`` statement with an address or the name
|
|
of a label or subroutine. Referencing labels or subroutines outside of their defined scope requires
|
|
using qualified "dotted names"::
|
|
|
|
goto $c000 ; address
|
|
goto name ; label or subroutine
|
|
goto main.mysub.name ; qualified dotted name; see, "Blocks, Scopes, and accessing Symbols"
|
|
|
|
uword address = $4000
|
|
goto address ; jump via address variable
|
|
|
|
Notice that this is a valid way to end a subroutine (you can either ``return`` from it, or jump
|
|
to another piece of code that eventually returns).
|
|
|
|
If you jump to an address variable (uword), it is doing an 'indirect' jump: the jump will be done
|
|
to the address that's currently in the variable.
|
|
|
|
|
|
Assignments
|
|
-----------
|
|
|
|
Assignment statements assign a single value to a target variable or memory location.
|
|
Augmented assignments (such as ``aa += xx``) are also available, but these are just shorthands
|
|
for normal assignments (``aa = aa + xx``).
|
|
|
|
It is possible to "chain" assignments: ``x = y = z = 42``, this is just a shorthand
|
|
for the three individual assignments with the same value 42.
|
|
|
|
Only for certain subroutines that return multiple values it is possible to write a "multi assign" statement
|
|
with comma separated assignment targets, that assigns those multiple values to different targets in one statement.
|
|
Details can be found here: :ref:`multiassign`.
|
|
|
|
|
|
.. attention::
|
|
**Data type conversion (in assignments):**
|
|
When assigning a value with a 'smaller' datatype to variable with a 'larger' datatype,
|
|
the value will be automatically converted to the target datatype: byte --> word --> float.
|
|
So assigning a byte to a word variable, or a word to a floating point variable, is fine.
|
|
The reverse is *not* true: it is *not* possible to assign a value of a 'larger' datatype to
|
|
a variable of a smaller datatype without an explicit conversion. Otherwise you'll get an error telling you
|
|
that there is a loss of precision. You can use builtin functions such as ``round`` and ``lsb`` to convert
|
|
to a smaller datatype, or revert to integer arithmetic.
|
|
|
|
|
|
Expressions
|
|
-----------
|
|
|
|
Expressions tell the program to *calculate* something. They consist of
|
|
values, variables, operators such as ``+`` and ``-``, function calls, type casts, or other expressions.
|
|
Here is an example that calculates to number of seconds in a certain time period::
|
|
|
|
num_hours * 3600 + num_minutes * 60 + num_seconds
|
|
|
|
Long expressions can be split over multiple lines by inserting a line break before or after an operator::
|
|
|
|
num_hours * 3600
|
|
+ num_minutes * 60
|
|
+ num_seconds
|
|
|
|
In most places where a number or other value is expected, you can use just the number, or a constant expression.
|
|
If possible, the expression is parsed and evaluated by the compiler itself at compile time, and the (constant) resulting value is used in its place.
|
|
Expressions that cannot be compile-time evaluated will result in code that calculates them at runtime.
|
|
Expressions can contain procedure and function calls.
|
|
There are various built-in functions that can be used in expressions (see :ref:`builtinfunctions`).
|
|
You can also reference identifiers defined elsewhere in your code.
|
|
|
|
.. note::
|
|
**Order of evaluation:**
|
|
|
|
The order of evaluation of expression operands is *unspecified* and should not be relied upon.
|
|
There is no guarantee of a left-to-right or right-to-left evaluation. But don't confuse this with
|
|
operator precedence order (multiplication comes before addition etcetera).
|
|
|
|
.. attention::
|
|
**Floating point values used in expressions:**
|
|
|
|
When a floating point value is used in a calculation, the result will be a floating point, and byte or word values
|
|
will be automatically converted into floats in this case. The compiler will issue a warning though when this happens, because floating
|
|
point calculations are very slow and possibly unintended!
|
|
|
|
Calculations with integer variables will not result in floating point values.
|
|
if you divide two integer variables say 32500 and 99 the result will be the integer floor
|
|
division (328) rather than the floating point result (328.2828282828283). If you need the full precision,
|
|
you'll have to make sure at least the first operand is a floating point. You can do this by
|
|
using a floating point value or variable, or use a type cast.
|
|
When the compiler can calculate the result during compile-time, it will try to avoid loss
|
|
of precision though and gives an error if you may be losing a floating point result.
|
|
|
|
|
|
|
|
Arithmetic and Logical expressions
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
Arithmetic expressions are expressions that calculate a numeric result (integer or floating point).
|
|
Many common arithmetic operators can be used and follow the regular precedence rules.
|
|
Logical expressions are expressions that calculate a boolean result: true or false
|
|
(which in reality are just a 1 or 0 integer value). When using variables of the type ``bool``,
|
|
logical expressions will compile more efficiently than when you're using regular integer type operands
|
|
(because these have to be converted to 0 or 1 every time)
|
|
Prog8 applies short-circuit aka McCarthy evaluation for ``and`` and ``or`` on boolean expressions.
|
|
|
|
You can use parentheses to group parts of an expression to change the precedence.
|
|
Usually the normal precedence rules apply (``*`` goes before ``+`` etc.) but subexpressions
|
|
within parentheses will be evaluated first. So ``(4 + 8) * 2`` is 24 and not 20,
|
|
and ``(true or false) and false`` is false instead of true.
|
|
|
|
.. attention::
|
|
**calculations keep their datatype even if the target variable is larger:**
|
|
When you do calculations on a BYTE type, the result will remain a BYTE.
|
|
When you do calculations on a WORD type, the result will remain a WORD.
|
|
For instance::
|
|
|
|
byte b = 44
|
|
word w = b*55 ; the result will be 116! (even though the target variable is a word)
|
|
w *= 999 ; the result will be -15188 (the multiplication stays within a word, but overflows)
|
|
|
|
*The compiler does NOT warn about this!* It's doing this for
|
|
performance reasons - so you won't get sudden 16 bit (or even float)
|
|
calculations where you needed only simple fast byte arithmetic.
|
|
If you do need the extended resulting value, cast at least one of the
|
|
operands explicitly to the larger datatype. For example::
|
|
|
|
byte b = 44
|
|
w = (b as word)*55
|
|
w = b*(55 as word)
|
|
|
|
|
|
Operators
|
|
---------
|
|
|
|
arithmetic: ``+`` ``-`` ``*`` ``/`` ``%``
|
|
``+``, ``-``, ``*``, ``/`` are the familiar arithmetic operations.
|
|
``/`` is division (will result in integer division when using on integer operands, and a floating point division when at least one of the operands is a float)
|
|
``%`` is the remainder operator: ``25 % 7`` is 4. Be careful: without a space after the %, it will be parsed as a binary number.
|
|
So ``25 %10`` will be parsed as the number 25 followed by the binary number 2, which is a syntax error.
|
|
Note that remainder is only supported on integer operands (not floats).
|
|
|
|
bitwise arithmetic: ``&`` ``|`` ``^`` ``~`` ``<<`` ``>>``
|
|
``&`` is bitwise and, ``|`` is bitwise or, ``^`` is bitwise xor, ``~`` is bitwise invert (this one is an unary operator)
|
|
``<<`` is bitwise left shift and ``>>`` is bitwise right shift (both will not change the datatype of the value)
|
|
|
|
assignment: ``=``
|
|
Sets the target on the LHS (left hand side) of the operator to the value of the expression on the RHS (right hand side).
|
|
Note that an assignment sometimes is not possible or supported.
|
|
It's possible to chain assignments like ``x = y = z = 42`` as a shorthand for the three assignments with the same value.
|
|
|
|
augmented assignment: ``+=`` ``-=`` ``*=`` ``/=`` ``&=`` ``|=`` ``^=`` ``<<=`` ``>>=``
|
|
This is syntactic sugar; ``aa += xx`` is equivalent to ``aa = aa + xx``
|
|
|
|
postfix increment and decrement: ``++`` ``--``
|
|
Syntactic sugar: ``aa++`` is equivalent to ``aa += 1``, and ``aa--`` is equivalent to ``aa -= 1``.
|
|
Because these operations are so common, and often used in other languages, we have these short forms.
|
|
*Notes:* unlike some other languages, they are *not* expressions in prog8, but statements. You cannot
|
|
increment or decrement something inside an expression like, for example, ``x = value[aa++]`` is invalid.
|
|
Also because of this, there is no *prefix* increment and decrement.
|
|
|
|
comparison: ``==`` ``!=`` ``<`` ``>`` ``<=`` ``>=``
|
|
Equality, Inequality, Less-than, Greater-than, Less-or-Equal-than, Greater-or-Equal-than comparisons.
|
|
The result is a boolean, true or false.
|
|
|
|
logical: ``not`` ``and`` ``or`` ``xor``
|
|
These operators are the usual logical operations that are part of a logical expression to reason
|
|
about truths (boolean values). The result of such an expression is a boolean, true or false.
|
|
Prog8 applies short-circuit aka McCarthy evaluation for ``and`` and ``or``.
|
|
|
|
range creation: ``to``, ``downto``
|
|
Creates a range of values from the LHS value to the RHS value, inclusive.
|
|
These are mainly used in for loops to set the loop range.
|
|
See :ref:`range-expression` for details.
|
|
|
|
containment check: ``in``
|
|
Tests if a value is present in a list of values, which can be a string, or an array, or a range expression.
|
|
The result is a simple boolean true or false.
|
|
Consider using this instead of chaining multiple value tests with ``or``, because the
|
|
containment check is more efficient.
|
|
Checking N in a range from x to y, is identical to x<=N and N<=y; the actual range of values is never created.
|
|
Examples::
|
|
|
|
ubyte cc
|
|
if cc in [' ', '@', 0] {
|
|
txt.print("cc is one of the values")
|
|
}
|
|
|
|
if cc in 10 to 20 {
|
|
txt.print("10 <= cc and cc <=20")
|
|
}
|
|
|
|
str email_address = "name@test.com"
|
|
if '@' in email_address {
|
|
txt.print("email address seems ok")
|
|
}
|
|
|
|
|
|
address of: ``&``
|
|
This is a prefix operator that can be applied to a string or array variable or literal value.
|
|
It results in the memory address (UWORD) of that string or array in memory: ``uword a = &stringvar``
|
|
Sometimes the compiler silently inserts this operator to make it easier for instance
|
|
to pass strings or arrays as subroutine call arguments.
|
|
This operator can also be used as a prefix to a variable's data type keyword to indicate that
|
|
it is a memory mapped variable (for instance: ``&ubyte screencolor = $d021``)
|
|
|
|
ternary:
|
|
Prog8 doesn't have a ternary operator to choose one of two values (``x? y : z`` in many other languages)
|
|
instead it provides this feature in the form of an *if expression*. See below under "Conditional Execution".
|
|
|
|
precedence grouping in expressions, or subroutine parameter list: ``(`` *expression* ``)``
|
|
Parentheses are used to group parts of an expression to change the order of evaluation.
|
|
(the subexpression inside the parentheses will be evaluated first):
|
|
``(4 + 8) * 2`` is 24 instead of 20.
|
|
|
|
Parentheses are also used in a subroutine call, they follow the name of the subroutine and contain
|
|
the list of arguments to pass to the subroutine: ``big_function(1, 99)``
|
|
|
|
|
|
Subroutines
|
|
-----------
|
|
|
|
Defining a subroutine
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
You define a subroutine like so::
|
|
|
|
sub <identifier> ( [parameters] ) [ -> returntype ] {
|
|
... statements ...
|
|
}
|
|
|
|
; example:
|
|
sub triple (word amount) -> word {
|
|
return amount * 3
|
|
}
|
|
|
|
The parameters is a (possibly empty) comma separated list of "<datatype> <parametername>" pairs specifying the input parameters.
|
|
The return type has to be specified if the subroutine returns a value.
|
|
|
|
Subroutines can be defined in a Block, but also nested inside another subroutine. Everything is scoped accordingly.
|
|
There are three different types of subroutines: regular subroutines (the one above), assembly-only, and
|
|
external subroutines. These last two are described in detail below.
|
|
|
|
Reusing *virtual registers* R0-R15 for parameters
|
|
*************************************************
|
|
.. sidebar::
|
|
🦶🔫 Footgun warning
|
|
|
|
when using this the program can clobber the contents of R0-R15 when doing other operations that also
|
|
use these registers, or when calling other routines because Prog8 doesn't have a callstack.
|
|
Be very aware of what you are doing, the compiler can't guarantee correct values by itself anymore.
|
|
|
|
Normally, every subroutine parameter will get its own local variable in the subroutine where the argument value
|
|
will be stored when the subroutine is called. In certain situations, this may lead to many variables being allocated.
|
|
You *can* instruct the compiler to not allocate a new variable, but instead to reuse one of the *virtual registers* R0-R15
|
|
(accessible in the code as ``cx16.r0`` - ``cx16.r15``) for the parameter. This is done by adding a ``@Rx`` tag
|
|
to the parameter. This can only be done for byte and word types.
|
|
Note: the R0-R15 *virtual registers* are described in more detail below for the Assembly subroutines.
|
|
Here's an example that reuses the R0 and the R1L (lower byte of R1) virtual registers for the paremeters::
|
|
|
|
sub get_indexed_byte(uword pointer @R0, ubyte index @R1) -> ubyte {
|
|
return @(cx16.r0 + cx16.r1L)
|
|
}
|
|
|
|
|
|
Assembly-Subroutines
|
|
^^^^^^^^^^^^^^^^^^^^
|
|
These are user-written subroutines in the program source code itself, implemented purely in assembly and
|
|
which have an assembly calling convention (i.e. the parameters are strictly passed via cpu registers).
|
|
Such subroutines are defined with ``asmsub`` like this::
|
|
|
|
asmsub clear_screenchars (ubyte char @ A) clobbers(Y) {
|
|
%asm {{
|
|
ldy #0
|
|
_loop sta cbm.Screen,y
|
|
sta cbm.Screen+$0100,y
|
|
sta cbm.Screen+$0200,y
|
|
sta cbm.Screen+$02e8,y
|
|
iny
|
|
bne _loop
|
|
rts
|
|
}}
|
|
}
|
|
|
|
the statement body of such a subroutine can only consist of just inline assembly.
|
|
|
|
The ``@ <register>`` part is required for rom and assembly-subroutines, as it specifies for the compiler
|
|
what cpu registers should take the routine's arguments. You can use the regular set of registers
|
|
(A, X, Y), special 16-bit register pairs to take word values (AX, AY and XY) and even a processor status
|
|
flag such as Carry (Pc).
|
|
|
|
It is not possible to use floating point arguments or return values in an asmsub.
|
|
|
|
**inline:** Trivial ``asmsub`` routines can be tagged as ``inline`` to tell the compiler to copy their code
|
|
in-place to the locations where the subroutine is called, rather than inserting an actual call and return to the
|
|
subroutine. This may increase code size significantly and can only be used in limited scenarios, so YMMV.
|
|
Note that the routine's code is copied verbatim into the place of the subroutine call in this case,
|
|
so pay attention to any jumps and rts instructions in the inlined code!
|
|
|
|
.. note::
|
|
Asmsubs can also be tagged as ``inline asmsub`` to make trivial pieces of assembly inserted
|
|
directly instead of a call to them. Note that it is literal copy-paste of code that is done,
|
|
so make sure the assembly is actually written to behave like such - which probably means you
|
|
don't want a ``rts`` or ``jmp`` or ``bra`` in it!
|
|
|
|
.. note::
|
|
The 'virtual' 16-bit registers from the Commander X16 can also be specified as ``R0`` .. ``R15`` .
|
|
This means you don't have to set them up manually before calling a subroutine that takes
|
|
one or more parameters in those 'registers'. You can just list the arguments directly.
|
|
*This also works on the Commodore 64!* (however they are not as efficient there because they're not in zeropage)
|
|
In prog8 and assembly code these 'registers' are directly accessible too via
|
|
``cx16.r0`` .. ``cx16.r15`` (these are memory mapped uword values),
|
|
``cx16.r0s`` .. ``cx16.r15s`` (these are memory mapped word values),
|
|
and ``L`` / ``H`` variants are also available to directly access the low and high bytes of these.
|
|
You can use them directly but their name isn't very descriptive, so it may be useful to define
|
|
an alias for them when using them regularly.
|
|
|
|
|
|
External subroutines
|
|
^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Thse define an external subroutine that's implemented outside of the program
|
|
(for instance, a ROM routine, or a routine in a library loaded elsewhere in RAM).
|
|
External subroutines are usually defined by compiler library files, with the following syntax::
|
|
|
|
extsub $FFD5 = LOAD(ubyte verify @ A, uword address @ XY) clobbers()
|
|
-> bool @Pc, ubyte @ A, ubyte @ X, ubyte @ Y
|
|
|
|
This defines the ``LOAD`` subroutine at memory address $FFD5, taking arguments in all three registers A, X and Y,
|
|
and returning stuff in several registers as well. The ``clobbers`` clause is used to signify to the compiler
|
|
what CPU registers are clobbered by the call instead of being unchanged or returning a meaningful result value.
|
|
Note that the address ($ffd5 in the example above) can actually be an expression as long as it is a compile time constant. This can
|
|
make it easier to define jump tables for example, like this::
|
|
|
|
const uword APIBASE = $8000
|
|
extsub APIBASE+0 = firstroutine()
|
|
extsub APIBASE+10 = secondroutine()
|
|
extsub APIBASE+20 = thirdroutine()
|
|
|
|
**Banks:** it is possible to declare a non-standard ROM or RAM bank that the routine is living in, with ``@bank`` like this:
|
|
``extsub @bank 10 $C09F = audio_init()`` to define a routine at $C09F in bank 10. You can also specify a variable for the bank.
|
|
See :ref:`banking` for more information.
|
|
|
|
|
|
Calling a subroutine
|
|
^^^^^^^^^^^^^^^^^^^^
|
|
|
|
You call a subroutine like this::
|
|
|
|
[ void / result = ] subroutinename_or_address ( [argument...] )
|
|
|
|
; example:
|
|
resultvariable = subroutine(arg1, arg2, arg3)
|
|
void noresultvaluesub(arg)
|
|
|
|
Arguments are separated by commas. The argument list can also be empty if the subroutine
|
|
takes no parameters. If the subroutine returns a value, usually you assign it to a variable.
|
|
If you're not interested in the return value, prefix the function call with the ``void`` keyword.
|
|
Otherwise the compiler will warn you about discarding the result of the call.
|
|
|
|
.. note::
|
|
**Order of evaluation:**
|
|
|
|
The order of evaluation of arguments to a single function call is *unspecified* and should not be relied upon.
|
|
There is no guarantee of a left-to-right or right-to-left evaluation of the call arguments.
|
|
|
|
.. caution::
|
|
Note that due to the way parameters are processed by the compiler,
|
|
subroutines are *non-reentrant*. This means you cannot create recursive calls.
|
|
If you do need a recursive algorithm, you'll have to hand code it in embedded assembly for now,
|
|
or rewrite it into an iterative algorithm.
|
|
Also, subroutines used in the main program should not be used from an IRQ handler. This is because
|
|
the subroutine may be interrupted, and will then call itself from the IRQ handler. Results are
|
|
then undefined because the variables will get overwritten.
|
|
|
|
|
|
.. _multiassign:
|
|
|
|
Multiple return values
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
|
Normal subroutines can only return zero or one return values.
|
|
However, the special ``asmsub`` routines (implemented in assembly code) or ``extsub`` routines
|
|
(referencing an external routine in ROM or elsewhere in RAM) can return more than one return value.
|
|
For example a status in the carry bit and a number in A, or a 16-bit value in A/Y registers and some more values in R0 and R1.
|
|
In all of these cases, you have to "multi assign" all return values of the subroutine call to something.
|
|
You simply write the assignment targets as a comma separated list,
|
|
where the element's order corresponds to the order of the return values declared in the subroutine's signature.
|
|
So for instance::
|
|
|
|
bool flag
|
|
ubyte bytevar
|
|
uword wordvar
|
|
|
|
wordvar, flag, bytevar = multisub() ; call and assign the three result values
|
|
|
|
asmsub multisub() -> uword @AY, bool @Pc, ubyte @X { ... }
|
|
|
|
.. sidebar:: Using just one of the values
|
|
|
|
Sometimes it is easier to just have a single return value in the subroutine's signagure (even though it
|
|
actually may return multiple values): this avoids having to put ``void`` for all other values.
|
|
It also allows it to be called in expressions such as if-statements again.
|
|
Examples of these second 'convenience' definition are library routines such as ``cbm.STOP2`` and ``cbm.GETIN2``,
|
|
that only return a single value where the "official" versions ``STOP`` and ``GETIN`` always return multiple values.
|
|
|
|
**Skipping values:** Instead of using ``void`` to ignore the result of a subroutine call altogether,
|
|
you can also use it as a placeholder name in a multi-assignment. This skips assignment of the return value in that place.
|
|
One of the cases where this is useful, is with boolean values returned in status flags such as the carry flag.
|
|
Storing that flag as a boolean in a variable first, and then possibly adding an ``if flag...`` statement afterwards, is a lot less
|
|
efficient than just keeping the flag as-is and using a conditional branch such as ``if_cs`` to do something with it.
|
|
So in the case above that could be::
|
|
|
|
wordvar, void, bytevar = multisub()
|
|
if_cs
|
|
something()
|
|
|
|
Notice that a call to a subroutine that returns multiple values cannot be used inside an expression,
|
|
because expression terms always need to be a single value. You'll have to use a separate multi-assignment
|
|
first and then use the result of that in the expression. However, also read the sidebar about a possible alternative.
|
|
|
|
|
|
Deferred ("cleanup") code
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Usually when a subroutine exits, it has to clean up things that it worked on. For example, it has to close
|
|
a file that it opened before to read data from, or it has to free a piece of memory that it allocated via
|
|
a dynamic memory allocation library, etc.
|
|
Every spot where the subroutine exits (return statement, jump, or the end of the routine) you have to take care
|
|
of doing the cleanups required. This can get tedious, and the cleanup code is separated from the place where
|
|
the resource allocation was done at the start.
|
|
|
|
The ``defer`` keyword can be used to schedule a statement (or block of statements) to be executed
|
|
just before exiting of the current subroutine. That can be via a return statement or a jump to somewhere else,
|
|
or just the normal ending of the subroutine. This is often useful to "not forget" to clean up stuff,
|
|
and if the subroutine has multiple ways or places where it can exit, it saves you from repeating
|
|
the cleanup code at every exit spot. Multiple defers can be scheduled in a single subroutine (up to a maximum of 8).
|
|
They are handled in reversed order. Return values are evaluated before any deferred code is executed.
|
|
You write defers like so::
|
|
|
|
sub example() -> bool {
|
|
ubyte file = open_file()
|
|
defer close_file(file) ; "close it when we exit from here"
|
|
|
|
uword memory = allocate(1000)
|
|
if memory==0
|
|
return false
|
|
defer deallocate(memory) ; "deallocate when we exit from here"
|
|
|
|
process(file, memory)
|
|
return true
|
|
}
|
|
|
|
In this example, the two deferred statements are not immediately executed. Instead, they are executed when the
|
|
subroutine exits at any point. So for example the ``return false`` after the memory check will automatically also close
|
|
the file that was opened earlier because the close_file() call was scheduled there.
|
|
At the bottom when the ``return true`` appears, *both* deferred cleanup calls are executed: first the deallocation of
|
|
the memory, and then the file close. As you can see this saves you from duplicating the cleanup logic,
|
|
and the logic is declared very close to the spot where the allocation of the resource happens, so it's easier to read and understand.
|
|
|
|
It's possible to write a defer for a block of statements, but the advice is to keep such cleanup code as simple and short as possible.
|
|
|
|
.. caution::
|
|
Defers only work for subroutines that are written in regular Prog8 code.
|
|
If a piece of inlined assembly somehow causes the routine to exit, the compiler cannot detect this,
|
|
and defers won't be handled in such cases.
|
|
|
|
|
|
Library routines and builtin functions
|
|
--------------------------------------
|
|
|
|
There are many routines available in the compiler libraries or as builtin functions.
|
|
The most important ones can be found in the :doc:`libraries` chapter.
|