mirror of
https://github.com/irmen/prog8.git
synced 2025-07-15 21:24:11 +00:00
new docs
This commit is contained in:
@@ -1,73 +1,335 @@
|
||||
==============================
|
||||
Writing and building a program
|
||||
==============================
|
||||
.. _programstructure:
|
||||
|
||||
What is a "Program" anyway?
|
||||
---------------------------
|
||||
===================
|
||||
Programming in IL65
|
||||
===================
|
||||
|
||||
A "complete runnable program" is a compiled, assembled, and linked together single unit.
|
||||
It contains all of the program's code and data and has a certain file format that
|
||||
allows it to be loaded directly on the target system. IL65 currently has no built-in
|
||||
support for programs that exceed 64 Kb of memory, nor for multi-part loaders.
|
||||
|
||||
For the Commodore-64, most programs will have a tiny BASIC launcher that does a SYS into the generated machine code.
|
||||
This way the user can load it as any other program and simply RUN it to start. (This is a regular ".prg" program).
|
||||
Il65 can create those, but it is also possible to output plain binary programs
|
||||
that can be loaded into memory anywhere.
|
||||
This chapter describes a high level overview of the elements that make up a program.
|
||||
Details about the syntax can be found in the :ref:`syntaxreference` chapter.
|
||||
|
||||
|
||||
Compiling program code
|
||||
----------------------
|
||||
|
||||
Compilation of program code is done by telling the IL65 compiler to compile a main source code module file.
|
||||
Other modules that this code needs will be loaded and processed via imports from within that file.
|
||||
The compiler will link everything together into one output program at the end.
|
||||
|
||||
The compiler is invoked with the command:
|
||||
|
||||
``$ @todo``
|
||||
|
||||
|
||||
Module source code files
|
||||
------------------------
|
||||
|
||||
A module source file is a text file with the ``.ill`` suffix, containing the program's source code.
|
||||
It consists of compilation options and other directives, imports of other modules,
|
||||
and source code for one or more code blocks.
|
||||
|
||||
IL65 has a couple of *LIBRARY* modules that are defined in special internal files provided by the compiler:
|
||||
``c64lib``, ``il65lib``, ``mathlib``.
|
||||
You should not overwrite these or reuse their names.
|
||||
|
||||
|
||||
.. _debugging:
|
||||
|
||||
Debugging (with Vice)
|
||||
Elements of a program
|
||||
---------------------
|
||||
|
||||
There's support for using the monitor and debugging capabilities of the rather excellent
|
||||
`Vice emulator <http://vice-emu.sourceforge.net/>`_.
|
||||
Program
|
||||
Consists of one or more *modules*.
|
||||
|
||||
The ``%breakpoint`` directive (see :ref:`directives`) in the source code instructs the compiler to put
|
||||
a *breakpoint* at that position. Some systems use a BRK instruction for this, but
|
||||
this will usually halt the machine altogether instead of just suspending execution.
|
||||
IL65 issues a NOP instruction instead and creates a 'virtual' breakpoint at this position.
|
||||
All breakpoints are then written to a file called "programname.vice-mon-list",
|
||||
which is meant to be used by the Vice emulator.
|
||||
It contains a series of commands for Vice's monitor, including source labels and the breakpoint settings.
|
||||
If you use the vice autostart feature of the compiler, it will be processed by Vice automatically and immediately.
|
||||
If you launch Vice manually, you'll have to use a command line option to load this file:
|
||||
Module
|
||||
A file on disk with the ``.ill`` suffix. It contains *directives* and *code blocks*.
|
||||
Whitespace and indentation in the source code are arbitrary and can be tabs or spaces or both.
|
||||
You can also add *comments* to the source code.
|
||||
One moudule file can *import* others, and also import *library modules*.
|
||||
|
||||
``$ x64 -moncommands programname.vice-mon-list``
|
||||
Comments
|
||||
Everything after a semicolon ``;`` is a comment and is ignored by the compiler.
|
||||
If the whole line is just a comment, it will be copied into the resulting assembly source code.
|
||||
This makes it easier to understand and relate the generated code. Examples::
|
||||
|
||||
Vice will then use the label names in memory disassembly, and will activate the breakpoints as well.
|
||||
If your running program hits one of the breakpoints, Vice will halt execution and drop you into the monitor.
|
||||
A = 42 ; set the initial value to 42
|
||||
; next is the code that...
|
||||
|
||||
Directive
|
||||
These are special instructions for the compiler, to change how it processes the code
|
||||
and what kind of program it creates. A directive is on its own line in the file, and
|
||||
starts with ``%``, optionally followed by some arguments.
|
||||
|
||||
Code block
|
||||
A block of actual program code. It defines a *scope* (also known as 'namespace') and
|
||||
can contain IL65 *code*, *variable declarations* and *subroutines*.
|
||||
More details about this below: :ref:`blocks`.
|
||||
|
||||
Variable declarations
|
||||
The data that the code works on is stored in variables ('named values that can change').
|
||||
The compiler allocates the required memory for them.
|
||||
There is *no dynamic memory allocation*. The storage size of all variables
|
||||
is fixed and is determined at compile time.
|
||||
Variable declarations tend to appear at the top of the code block that uses them.
|
||||
They define the name and type of the variable, and its initial value.
|
||||
IL65 supports a small list of data types, including special 'memory mapped' types
|
||||
that don't allocate storage but instead point to a fixed location in the address space.
|
||||
|
||||
Code
|
||||
These are the instructions that make up the program's logic. There are different kinds of instructions
|
||||
('statements' is a better name):
|
||||
|
||||
- value assignment
|
||||
- looping (for, while, repeat, unconditional jumps)
|
||||
- conditional execution (if - then - else, and conditional jumps)
|
||||
- subroutine calls
|
||||
- label definition
|
||||
|
||||
Subroutine
|
||||
Defines a piece of code that can be called by its name from different locations in your code.
|
||||
It accepts parameters and can return result values.
|
||||
It can define its own variables but it's not possible to define subroutines nested in other subroutines.
|
||||
To keep things simple, you can only define subroutines inside code blocks from a module.
|
||||
|
||||
Label
|
||||
This is a named position in your code where you can jump to from another place.
|
||||
You can jump to it with a jump statement elsewhere. It is also possible to use a
|
||||
subroutine call to a label (but without parameters and return value).
|
||||
|
||||
|
||||
Troubleshooting
|
||||
---------------
|
||||
Scope
|
||||
Also known as 'namespace', this is a named box around the symbols defined in it.
|
||||
This prevents name collisions (or 'namespace pollution'), because the name of the scope
|
||||
is needed as prefix to be able to access the symbols in it.
|
||||
Anything *inside* the scope can refer to symbols in the same scope without using a prefix.
|
||||
There are three scopes in IL65:
|
||||
|
||||
Getting an assembler error about undefined symbols such as ``not defined 'c64flt'``?
|
||||
This happens when your program uses floating point values, and you forgot to import the ``c64lib``.
|
||||
If you use floating points, the program will need routines from that library.
|
||||
Fix it by adding an ``%import c64lib``.
|
||||
- global (no prefix)
|
||||
- code block
|
||||
- subroutine
|
||||
|
||||
Modules are *not* a scope! Everything defined in a module is merged into the global scope.
|
||||
|
||||
|
||||
.. _blocks:
|
||||
|
||||
Blocks, Scopes, and accessing Symbols
|
||||
-------------------------------------
|
||||
|
||||
Blocks are the separate pieces of code and data of your program. They are combined
|
||||
into a single output program. No code or data can occur outside a block. Here's an example::
|
||||
|
||||
~ main $c000 {
|
||||
; this is code inside the block...
|
||||
}
|
||||
|
||||
|
||||
The name of a block must be unique in your entire program.
|
||||
Also be careful when importing other modules; blocks in your own code cannot have
|
||||
the same name as a block defined in an imported module or library.
|
||||
|
||||
It's possible to omit this name, but then you can only refer to the contents of the block via its absolute address,
|
||||
which is required in this case. If you omit *both* name and address, the block is *ignored* by the compiler (and a warning is displayed).
|
||||
This is a way to quickly "comment out" a piece of code that is unfinshed or may contain errors that you
|
||||
want to work on later, because the contents of the ignored block are not fully parsed either.
|
||||
|
||||
The address can be used to place a block at a specific location in memory.
|
||||
Usually it is omitted, and the compiler will automatically choose the location (usually immediately after
|
||||
the previous block in memory).
|
||||
The address must be >= ``$0200`` (because ``$00``--``$ff`` is the ZP and ``$100``--``$200`` is the cpu stack).
|
||||
|
||||
A block is also a *scope* in your program so the symbols in the block don't clash with
|
||||
symbols of the same name defined elsewhere in the same file or in another file.
|
||||
You can refer to the symbols in a particular block by using a *dotted name*: ``blockname.symbolname``.
|
||||
Labels inside a subroutine are appended again to that; ``blockname.subroutinename.label``.
|
||||
|
||||
Every symbol is 'public' and can be accessed from elsewhere given its dotted name.
|
||||
|
||||
|
||||
**The special "ZP" ZeroPage block**
|
||||
|
||||
Blocks named "ZP" are treated a bit differently: they refer to the ZeroPage.
|
||||
The contents of every block with that name (this one may occur multiple times) are merged into one.
|
||||
Its start address is always set to ``$04``, because ``$00 - $01`` are used by the hardware
|
||||
and ``$02 - $03`` are reserved as general purpose scratch registers.
|
||||
|
||||
|
||||
Program Start and Entry Point
|
||||
-----------------------------
|
||||
|
||||
Your program must have a single entry point where code execution begins.
|
||||
The compiler expects a ``start`` subroutine in the ``main`` block for this,
|
||||
taking no parameters and having no return value.
|
||||
As any subroutine, it has to end with a ``return`` statement (or a ``goto`` call)::
|
||||
|
||||
~ main {
|
||||
sub start () -> () {
|
||||
; program entrypoint code here
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
The ``main`` module is always relocated to the start of your programs
|
||||
address space, and the ``start`` subroutine (the entrypoint) will be on the
|
||||
first address. This will also be the address that the BASIC loader program (if generated)
|
||||
calls with the SYS statement.
|
||||
|
||||
|
||||
Variables and data
|
||||
------------------
|
||||
|
||||
::
|
||||
|
||||
12345 ; integer number
|
||||
"Hi, I am a string" ; text string
|
||||
-33.456e52 ; floating point number
|
||||
|
||||
byte counter = 42 ; variable of size 8 bits, with initial value 42
|
||||
|
||||
|
||||
Integers
|
||||
^^^^^^^^
|
||||
|
||||
Integers are 8 or 16 bit numbers and can be written in normal decimal notation,
|
||||
in hexadecimal and in binary notation.
|
||||
|
||||
@todo right now only unsinged integers are supported (>=0)
|
||||
|
||||
|
||||
Strings
|
||||
^^^^^^^
|
||||
|
||||
Strings are a sequence of characters enclosed in ``"`` quotes.
|
||||
They're stored and treated much the same as a byte array,
|
||||
but they have some special properties because they are considered to be *text*.
|
||||
Strings in your source code files will be encoded (translated from ASCII/UTF-8) into either CBM PETSCII or C-64 screencodes.
|
||||
PETSCII is the default choice. If you need screencodes (also called 'poke' codes) instead,
|
||||
you have to use the ``str_s`` variants of the string type identifier.
|
||||
If you assign a string literal of length 1 to a non-string variable, it is treated as a *byte* value instead
|
||||
with has the PETSCII value of that single character,
|
||||
|
||||
|
||||
Floating point numbers
|
||||
^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Floats are stored in the 5-byte 'MFLPT' format that is used on CBM machines,
|
||||
and also most float operations are specific to the Commodore-64.
|
||||
This is because routines in the C-64 BASIC and KERNAL ROMs are used for that.
|
||||
So floating point operations will only work if the C-64 BASIC ROM (and KERNAL ROM)
|
||||
are banked in (and your code imports the ``c64lib.ill``)
|
||||
|
||||
The largest 5-byte MFLPT float that can be stored is: **1.7014118345e+38** (negative: **-1.7014118345e+38**)
|
||||
|
||||
|
||||
Initial values across multiple runs of the program
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The initial values of your variables will be restored automatically when the program is (re)started,
|
||||
*except for string variables*. It is assumed these are left unchanged by the program.
|
||||
If you do modify them in-place, you should take care yourself that they work as
|
||||
expected when the program is restarted.
|
||||
|
||||
|
||||
|
||||
Indirect addressing and address-of
|
||||
----------------------------------
|
||||
|
||||
The ``#`` operator is used to take the address of the symbol following it.
|
||||
It can be used for example to work with the *address* of a memory mapped variable rather than
|
||||
the value it holds. You could take the address of a string as well, but that is redundant:
|
||||
the compiler already treats those as a value that you manipulate via its address.
|
||||
For most other types this prefix is not supported and will result in a compilation error.
|
||||
The resulting value is simply a 16 bit word. Example::
|
||||
|
||||
AX = #somevar
|
||||
|
||||
|
||||
|
||||
**Indirect addressing:**
|
||||
@todo
|
||||
|
||||
**Indirect addressing in jumps:**
|
||||
@todo
|
||||
For an indirect ``goto`` statement, the compiler will issue the 6502 CPU's special instruction
|
||||
(``jmp`` indirect). A subroutine call (``jsr`` indirect) is emitted
|
||||
using a couple of instructions.
|
||||
|
||||
|
||||
Loops
|
||||
-----
|
||||
|
||||
The *for*-loop is used to iterate over a range of values. Iteration steps by 1,
|
||||
but you can set it to something else as well.
|
||||
The *while*-loop is used to repeat a piece of code while a certain condition is still true.
|
||||
The *repeat--until* loop is used to repeat a piece of code until a certain condition is true.
|
||||
|
||||
You can also create loops by using the ``goto`` statement, but this should be avoided.
|
||||
|
||||
|
||||
Conditional Execution
|
||||
---------------------
|
||||
|
||||
@todo
|
||||
|
||||
Conditional execution means that the flow of execution changes based on certiain conditions,
|
||||
rather than having fixed gotos or subroutine calls. IL65 has a *conditional goto* statement for this,
|
||||
that is translated into a comparison (if needed) and then a conditional branch instruction::
|
||||
|
||||
if[_XX] [<expression>] goto <label>
|
||||
|
||||
|
||||
The if-status XX is one of: [cc, cs, vc, vs, eq, ne, true, not, zero, pos, neg, lt, gt, le, ge]
|
||||
It defaults to 'true' (=='ne', not-zero) if omitted. ('pos' will translate into 'pl', 'neg' into 'mi')
|
||||
@todo signed: lts==neg?, gts==eq+pos?, les==neg+eq?, ges==pos?
|
||||
|
||||
The <expression> is optional. If it is provided, it will be evaluated first. Only the [true] and [not] and [zero]
|
||||
if-statuses can be used when such a *comparison expression* is used. An example is::
|
||||
|
||||
if_not A > 55 goto more_iterations
|
||||
|
||||
|
||||
Conditional jumps are compiled into 6502's branching instructions (such as ``bne`` and ``bcc``) so
|
||||
the rather strict limit on how *far* it can jump applies. The compiler itself can't figure this
|
||||
out unfortunately, so it is entirely possible to create code that cannot be assembled successfully.
|
||||
You'll have to restructure your gotos in the code (place target labels closer to the branch)
|
||||
if you run into this type of assembler error.
|
||||
|
||||
|
||||
Assignments
|
||||
-----------
|
||||
|
||||
Assignment statements assign a single value to a target variable or memory location.
|
||||
Augmented assignments (such as ``A += X``) are also available, but these are just shorthands
|
||||
for normal assignments (``A = A + X``).
|
||||
|
||||
|
||||
Expressions
|
||||
-----------
|
||||
|
||||
In most places where a number or other value is expected, you can use just the number, or a full constant expression.
|
||||
The expression is parsed and evaluated by Python itself at compile time, and the (constant) resulting value is used in its place.
|
||||
Ofcourse the special il65 syntax for hexadecimal numbers (``$xxxx``), binary numbers (``%bbbbbbbb``),
|
||||
and the address-of (``#xxxx``) is supported. Other than that it must be valid Python syntax.
|
||||
Expressions can contain function calls to the math library (sin, cos, etc) and you can also use
|
||||
all builtin functions (max, avg, min, sum etc). They can also reference idendifiers defined elsewhere in your code,
|
||||
if this makes sense.
|
||||
|
||||
|
||||
Arithmetic
|
||||
^^^^^^^^^^
|
||||
@todo
|
||||
|
||||
|
||||
Logical expressions
|
||||
^^^^^^^^^^^^^^^^^^^
|
||||
@todo
|
||||
|
||||
|
||||
Subroutines
|
||||
-----------
|
||||
|
||||
Defining a subroutine
|
||||
^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Subroutines are parts of the code that can be repeatedly invoked using a subroutine call from elsewhere.
|
||||
Their definition, using the sub statement, includes the specification of the required input- and output parameters.
|
||||
For now, only register based parameters are supported (A, X, Y and paired registers,
|
||||
the carry status bit SC and the interrupt disable bit SI as specials).
|
||||
For subroutine return values, the special SZ register is also available, it means the zero status bit.
|
||||
|
||||
|
||||
Calling a subroutine
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The output variables must occur in the correct sequence of return registers as specified
|
||||
in the subroutine's definiton. It is possible to not specify any of them but the compiler
|
||||
will issue a warning then if the result values of a subroutine call are discarded.
|
||||
If you don't have a variable to store the output register in, it's then required
|
||||
to list the register itself instead as output variable.
|
||||
|
||||
Arguments should match the subroutine definition. You are allowed to omit the parameter names.
|
||||
If no definition is available (because you're directly calling memory or a label or something else),
|
||||
you can freely add arguments (but in this case they all have to be named).
|
||||
|
||||
To jump to a subroutine (without returning), prefix the subroutine call with the word 'goto'.
|
||||
Unlike gotos in other languages, here it take arguments as well, because it
|
||||
essentially is the same as calling a subroutine and only doing something different when it's finished.
|
||||
|
||||
**Register preserving calls:** use the ``!`` followed by a combination of A, X and Y (or followed
|
||||
by nothing, which is the same as AXY) to tell the compiler you want to preserve the origial
|
||||
value of the given registers after the subroutine call. Otherwise, the subroutine may just
|
||||
as well clobber all three registers. Preserving the original values does result in some
|
||||
stack manipulation code to be inserted for every call like this, which can be quite slow.
|
||||
|
Reference in New Issue
Block a user