Irmen de Jong 2018-08-07 01:23:34 +02:00
parent 0e785fcfb3
commit b34ae4c91c
5 changed files with 595 additions and 357 deletions

docs/source/building.rst Normal file

@ -0,0 +1,73 @@
==============================
Writing and building a program
==============================
What is a "Program" anyway?
---------------------------
A "complete runnable program" is a compiled, assembled, and linked together single unit.
It contains all of the program's code and data and has a certain file format that
allows it to be loaded directly on the target system. IL65 currently has no built-in
support for programs that exceed 64 KB of memory, nor for multi-part loaders.
For the Commodore-64, most programs will have a tiny BASIC launcher that does a SYS into the generated machine code.
This way the user can load it as any other program and simply RUN it to start. (This is a regular ".prg" program).
IL65 can create those, but it is also possible to output plain binary programs
that can be loaded into memory anywhere.
Compiling program code
----------------------
Compilation of program code is done by telling the IL65 compiler to compile a main source code module file.
Other modules that this code needs will be loaded and processed via imports from within that file.
The compiler will link everything together into one output program at the end.
The compiler is invoked with the command:
``$ @todo``
Module source code files
------------------------
A module source file is a text file with the ``.ill`` suffix, containing the program's source code.
It consists of compilation options and other directives, imports of other modules,
and source code for one or more code blocks.
IL65 has a couple of *LIBRARY* modules that are defined in special internal files provided by the compiler:
``c64lib``, ``il65lib``, ``mathlib``.
You should not overwrite these or reuse their names.
.. _debugging:
Debugging (with Vice)
---------------------
There's support for using the monitor and debugging capabilities of the rather excellent
`Vice emulator <http://vice-emu.sourceforge.net/>`_.
The ``%breakpoint`` directive (see :ref:`directives`) in the source code instructs the compiler to put
a *breakpoint* at that position. Some systems use a BRK instruction for this, but
this will usually halt the machine altogether instead of just suspending execution.
IL65 issues a NOP instruction instead and creates a 'virtual' breakpoint at this position.
All breakpoints are then written to a file called "programname.vice-mon-list",
which is meant to be used by the Vice emulator.
It contains a series of commands for Vice's monitor, including source labels and the breakpoint settings.
If you use the Vice autostart feature of the compiler, this file will be processed by Vice automatically and immediately.
If you launch Vice manually, you'll have to use a command line option to load this file:
``$ x64 -moncommands programname.vice-mon-list``
Vice will then use the label names in memory disassembly, and will activate the breakpoints as well.
If your running program hits one of the breakpoints, Vice will halt execution and drop you into the monitor.
Troubleshooting
---------------
Getting an assembler error about undefined symbols such as ``not defined 'c64flt'``?
This happens when your program uses floating point values, and you forgot to import the ``c64lib``.
If you use floating point values, the program will need routines from that library.
Fix it by adding an ``%import c64lib``.


@ -40,6 +40,7 @@ Design principles
- The compiler outputs a regular 6502 assembly code file, it doesn't assemble this itself.
A third party cross-assembler tool is used to do this final step.
- Goto is considered harmful, but not here; arbitrary control flow jumps are allowed.
- No complicated error handling or overflow checks that would slow things down.
Required tools
@ -56,8 +57,8 @@ Required tools
:caption: Contents of this manual:
targetsystem.rst
building.rst
programming.rst
progstructure.rst
syntaxreference.rst
todo.rst


@ -1,73 +1,335 @@
.. _programstructure:
===================
Programming in IL65
===================
This chapter describes a high level overview of the elements that make up a program.
Details about the syntax can be found in the :ref:`syntaxreference` chapter.
Elements of a program
---------------------
Program
Consists of one or more *modules*.
Module
A file on disk with the ``.ill`` suffix. It contains *directives* and *code blocks*.
Whitespace and indentation in the source code are arbitrary and can be tabs or spaces or both.
You can also add *comments* to the source code.
One module file can *import* others, and also import *library modules*.
Comments
Everything after a semicolon ``;`` is a comment and is ignored by the compiler.
If the whole line is just a comment, it will be copied into the resulting assembly source code.
This makes it easier to understand and relate the generated code. Examples::
A = 42 ; set the initial value to 42
; next is the code that...
Directive
These are special instructions for the compiler, to change how it processes the code
and what kind of program it creates. A directive is on its own line in the file, and
starts with ``%``, optionally followed by some arguments.
Code block
A block of actual program code. It defines a *scope* (also known as 'namespace') and
can contain IL65 *code*, *variable declarations* and *subroutines*.
More details about this below: :ref:`blocks`.
Variable declarations
The data that the code works on is stored in variables ('named values that can change').
The compiler allocates the required memory for them.
There is *no dynamic memory allocation*. The storage size of all variables
is fixed and is determined at compile time.
Variable declarations tend to appear at the top of the code block that uses them.
They define the name and type of the variable, and its initial value.
IL65 supports a small list of data types, including special 'memory mapped' types
that don't allocate storage but instead point to a fixed location in the address space.
Code
These are the instructions that make up the program's logic. There are different kinds of instructions
('statements' is a better name):
- value assignment
- looping (for, while, repeat, unconditional jumps)
- conditional execution (if - then - else, and conditional jumps)
- subroutine calls
- label definition
Subroutine
Defines a piece of code that can be called by its name from different locations in your code.
It accepts parameters and can return result values.
It can define its own variables but it's not possible to define subroutines nested in other subroutines.
To keep things simple, you can only define subroutines inside code blocks from a module.
Label
This is a named position in your code where you can jump to from another place.
You can jump to it with a jump statement elsewhere. It is also possible to use a
subroutine call to a label (but without parameters and return value).
Scope
Also known as 'namespace', this is a named box around the symbols defined in it.
This prevents name collisions (or 'namespace pollution'), because the name of the scope
is needed as prefix to be able to access the symbols in it.
Anything *inside* the scope can refer to symbols in the same scope without using a prefix.
There are three scopes in IL65:
- global (no prefix)
- code block
- subroutine
Modules are *not* a scope! Everything defined in a module is merged into the global scope.
.. _blocks:
Blocks, Scopes, and accessing Symbols
-------------------------------------
Blocks are the separate pieces of code and data of your program. They are combined
into a single output program. No code or data can occur outside a block. Here's an example::
~ main $c000 {
; this is code inside the block...
}
The name of a block must be unique in your entire program.
Also be careful when importing other modules; blocks in your own code cannot have
the same name as a block defined in an imported module or library.
It's possible to omit this name, but then you can only refer to the contents of the block via its absolute address,
which is required in this case. If you omit *both* name and address, the block is *ignored* by the compiler (and a warning is displayed).
This is a way to quickly "comment out" a piece of code that is unfinished or may contain errors that you
want to work on later, because the contents of the ignored block are not fully parsed either.
The address can be used to place a block at a specific location in memory.
Usually it is omitted, and the compiler will automatically choose the location (usually immediately after
the previous block in memory).
The address must be >= ``$0200`` (because ``$00``--``$ff`` is the ZP and ``$100``--``$1ff`` is the cpu stack).
A block is also a *scope* in your program so the symbols in the block don't clash with
symbols of the same name defined elsewhere in the same file or in another file.
You can refer to the symbols in a particular block by using a *dotted name*: ``blockname.symbolname``.
Labels inside a subroutine are appended again to that; ``blockname.subroutinename.label``.
Every symbol is 'public' and can be accessed from elsewhere given its dotted name.
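The dotted-name lookup described above can be pictured as a walk through nested scopes. The following is a minimal Python sketch, not the compiler's actual symbol table: the ``symbols`` dictionary, its contents, and the ``resolve`` helper are all hypothetical names chosen for illustration.

```python
# Hypothetical model of IL65 scopes: nested dicts keyed by block name,
# then by subroutine name. The entries are made up for illustration.
symbols = {
    "main": {
        "counter": "byte variable",
        "start": {"loop": "label"},
    },
}


def resolve(dotted_name, table=symbols):
    """Walk a dotted name such as 'main.start.loop' through the nested scopes."""
    node = table
    for part in dotted_name.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"undefined symbol: {dotted_name}")
        node = node[part]
    return node
```

Because every symbol is public, a lookup like ``resolve("main.start.loop")`` needs no access check; it only fails when a part of the name does not exist.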
**The special "ZP" ZeroPage block**
Blocks named "ZP" are treated a bit differently: they refer to the ZeroPage.
The contents of every block with that name (this one may occur multiple times) are merged into one.
Its start address is always set to ``$04``, because ``$00 - $01`` are used by the hardware
and ``$02 - $03`` are reserved as general purpose scratch registers.
Program Start and Entry Point
-----------------------------
Your program must have a single entry point where code execution begins.
The compiler expects a ``start`` subroutine in the ``main`` block for this,
taking no parameters and having no return value.
As any subroutine, it has to end with a ``return`` statement (or a ``goto`` call)::
~ main {
sub start () -> () {
; program entrypoint code here
return
}
}
The ``main`` block is always relocated to the start of your program's
address space, and the ``start`` subroutine (the entrypoint) will be on the
first address. This will also be the address that the BASIC loader program (if generated)
calls with the SYS statement.
Variables and data
------------------
::
12345 ; integer number
"Hi, I am a string" ; text string
-33.456e52 ; floating point number
byte counter = 42 ; variable of size 8 bits, with initial value 42
Integers
^^^^^^^^
Integers are 8 or 16 bit numbers and can be written in normal decimal notation,
in hexadecimal and in binary notation.
@todo right now only unsigned integers are supported (>=0)
Strings
^^^^^^^
Strings are a sequence of characters enclosed in ``"`` quotes.
They're stored and treated much the same as a byte array,
but they have some special properties because they are considered to be *text*.
Strings in your source code files will be encoded (translated from ASCII/UTF-8) into either CBM PETSCII or C-64 screencodes.
PETSCII is the default choice. If you need screencodes (also called 'poke' codes) instead,
you have to use the ``str_s`` variants of the string type identifier.
If you assign a string literal of length 1 to a non-string variable, it is treated as a *byte* value instead,
holding the PETSCII value of that single character.
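To illustrate the difference between the two encodings, here is a small Python sketch. It is not the compiler's encoder: ``to_petscii`` and ``petscii_to_screencode`` are hypothetical helpers, and they deliberately handle only space, digits, basic punctuation and the uppercase letters A-Z, a range where ASCII and unshifted PETSCII use the same codes.

```python
def to_petscii(text):
    """Encode text limited to space, digits, basic punctuation and A-Z."""
    out = bytearray()
    for ch in text:
        code = ord(ch)
        # In this restricted range ASCII and unshifted PETSCII coincide.
        if not (0x20 <= code <= 0x3F or 0x41 <= code <= 0x5A):
            raise ValueError(f"character {ch!r} is outside this sketch's range")
        out.append(code)
    return bytes(out)


def petscii_to_screencode(data):
    """Convert PETSCII bytes in the range 0x20-0x5F to C-64 screencodes."""
    out = bytearray()
    for code in data:
        # 0x20-0x3F keep their value; 0x40-0x5F map down to 0x00-0x1F.
        out.append(code - 0x40 if 0x40 <= code <= 0x5F else code)
    return bytes(out)
```

With these helpers the text ``HELLO`` stays ``48 45 4C 4C 4F`` (hex) in PETSCII but becomes ``08 05 0C 0C 0F`` as screencodes, which is why the two string types are not interchangeable when writing directly to screen memory.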
Floating point numbers
^^^^^^^^^^^^^^^^^^^^^^
Floats are stored in the 5-byte 'MFLPT' format that is used on CBM machines,
and also most float operations are specific to the Commodore-64.
This is because routines in the C-64 BASIC and KERNAL ROMs are used for that.
So floating point operations will only work if the C-64 BASIC ROM (and KERNAL ROM)
are banked in (and your code imports the ``c64lib.ill``).
The largest 5-byte MFLPT float that can be stored is: **1.7014118345e+38** (negative: **-1.7014118345e+38**)
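As an illustration of the 5-byte layout, the following Python sketch decodes an MFLPT value. It assumes the commonly documented Commodore BASIC layout (one excess-128 exponent byte, then four mantissa bytes with the sign in the top bit of the first mantissa byte and an implied leading 1 bit); that layout is not spelled out in this manual, and ``decode_mflpt`` is a hypothetical helper, not part of IL65.

```python
import math


def decode_mflpt(data):
    """Decode a 5-byte MFLPT value into a Python float (layout assumed, see text)."""
    if len(data) != 5:
        raise ValueError("MFLPT values are exactly 5 bytes")
    exponent = data[0]
    if exponent == 0:
        # An exponent byte of zero represents the value 0.0.
        return 0.0
    sign = -1.0 if data[1] & 0x80 else 1.0
    # Replace the sign bit with the implied leading 1 of the mantissa.
    mantissa = int.from_bytes(data[1:5], "big") | 0x80000000
    # The mantissa is a 32-bit fraction in [0.5, 1), scaled by 2**(exponent - 128).
    return sign * math.ldexp(mantissa, exponent - 128 - 32)
```

Under this assumption the byte sequence ``FF 7F FF FF FF`` decodes to roughly 1.70141183e+38, which matches the largest value quoted above.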
Initial values across multiple runs of the program
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The initial values of your variables will be restored automatically when the program is (re)started,
*except for string variables*. It is assumed these are left unchanged by the program.
If you do modify them in-place, you should take care yourself that they work as
expected when the program is restarted.
Indirect addressing and address-of
----------------------------------
The ``#`` operator is used to take the address of the symbol following it.
It can be used for example to work with the *address* of a memory mapped variable rather than
the value it holds. You could take the address of a string as well, but that is redundant:
the compiler already treats those as a value that you manipulate via its address.
For most other types this prefix is not supported and will result in a compilation error.
The resulting value is simply a 16 bit word. Example::
AX = #somevar
**Indirect addressing:**
@todo
**Indirect addressing in jumps:**
@todo
For an indirect ``goto`` statement, the compiler will issue the 6502 CPU's special instruction
(``jmp`` indirect). A subroutine call (``jsr`` indirect) is emitted
using a couple of instructions.
Loops
-----
The *for*-loop is used to iterate over a range of values. Iteration steps by 1,
but you can set it to something else as well.
The *while*-loop is used to repeat a piece of code while a certain condition is still true.
The *repeat--until* loop is used to repeat a piece of code until a certain condition is true.
You can also create loops by using the ``goto`` statement, but this should be avoided.
Conditional Execution
---------------------
@todo
Conditional execution means that the flow of execution changes based on certain conditions,
rather than having fixed gotos or subroutine calls. IL65 has a *conditional goto* statement for this,
that is translated into a comparison (if needed) and then a conditional branch instruction::
if[_XX] [<expression>] goto <label>
The if-status XX is one of: [cc, cs, vc, vs, eq, ne, true, not, zero, pos, neg, lt, gt, le, ge]
It defaults to 'true' (=='ne', not-zero) if omitted. ('pos' will translate into 'pl', 'neg' into 'mi')
@todo signed: lts==neg?, gts==eq+pos?, les==neg+eq?, ges==pos?
The <expression> is optional. If it is provided, it will be evaluated first. Only the [true] and [not] and [zero]
if-statuses can be used when such a *comparison expression* is used. An example is::
if_not A > 55 goto more_iterations
Conditional jumps are compiled into 6502's branching instructions (such as ``bne`` and ``bcc``) so
the rather strict limit on how *far* it can jump applies. The compiler itself can't figure this
out unfortunately, so it is entirely possible to create code that cannot be assembled successfully.
You'll have to restructure your gotos in the code (place target labels closer to the branch)
if you run into this type of assembler error.
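The limit mentioned above comes from the 6502's relative addressing: a branch instruction stores a signed 8-bit offset, counted from the instruction that follows the 2-byte branch. A small Python sketch of the check that the assembler effectively performs is shown here; ``branch_in_range`` is a hypothetical helper and not part of IL65 or any assembler.

```python
def branch_in_range(branch_address, target_address):
    """Return True if a 6502 relative branch at branch_address can reach the target."""
    # The offset is relative to the address of the next instruction,
    # which is 2 bytes past the start of the branch instruction.
    offset = target_address - (branch_address + 2)
    # The offset is stored as a signed 8-bit value.
    return -128 <= offset <= 127
```

For a branch located at ``$c000`` this means targets from ``$bf82`` up to and including ``$c081`` are reachable; anything further away needs the label moved closer or a different jump construct.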
Assignments
-----------
Assignment statements assign a single value to a target variable or memory location.
Augmented assignments (such as ``A += X``) are also available, but these are just shorthands
for normal assignments (``A = A + X``).
Expressions
-----------
In most places where a number or other value is expected, you can use just the number, or a full constant expression.
The expression is parsed and evaluated by Python itself at compile time, and the (constant) resulting value is used in its place.
Of course the special IL65 syntax for hexadecimal numbers (``$xxxx``), binary numbers (``%bbbbbbbb``),
and the address-of (``#xxxx``) is supported. Other than that it must be valid Python syntax.
Expressions can contain function calls to the math library (sin, cos, etc) and you can also use
all builtin functions (max, avg, min, sum etc). They can also reference identifiers defined elsewhere in your code,
if this makes sense.
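The idea of letting Python evaluate a constant expression can be sketched as follows. This is an assumption-laden illustration, not the compiler's actual code: ``to_python``, ``evaluate`` and the ``_ALLOWED`` table are hypothetical, the address-of syntax is left out, and the binary prefix is only recognised at the start of the expression or directly after an operator so that ``%`` can still act as Python's modulo operator.

```python
import math
import re

# Hypothetical whitelist of names an expression may use in this sketch.
_ALLOWED = {"sin": math.sin, "cos": math.cos, "max": max, "min": min, "sum": sum}


def to_python(expr):
    """Rewrite $hex and %binary literals into Python's 0x... and 0b... notation."""
    expr = re.sub(r"\$([0-9a-fA-F]+)", r"0x\1", expr)
    # Treat % as a binary prefix only where an operand is expected:
    # at the very start, or right after an operator, '(' or ','.
    return re.sub(r"(^|[-+*/%(,]\s*)%([01]+)\b", r"\g<1>0b\2", expr)


def evaluate(expr):
    """Evaluate a constant expression with only the whitelisted names available."""
    # Note: eval() with an empty __builtins__ is not a real sandbox.
    return eval(to_python(expr), {"__builtins__": {}}, dict(_ALLOWED))
```

For example, ``evaluate("$c000 + %00001111")`` first becomes the Python expression ``0xc000 + 0b00001111`` and then yields a plain integer that could be used wherever a constant is expected.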
Arithmetic
^^^^^^^^^^
@todo
Logical expressions
^^^^^^^^^^^^^^^^^^^
@todo
Subroutines
-----------
Defining a subroutine
^^^^^^^^^^^^^^^^^^^^^
Subroutines are parts of the code that can be repeatedly invoked using a subroutine call from elsewhere.
Their definition, using the ``sub`` statement, includes the specification of the required input and output parameters.
For now, only register based parameters are supported (A, X, Y and paired registers,
the carry status bit SC and the interrupt disable bit SI as specials).
For subroutine return values, the special SZ register is also available; it refers to the zero status bit.
Calling a subroutine
^^^^^^^^^^^^^^^^^^^^
The output variables must occur in the correct sequence of return registers as specified
in the subroutine's definition. It is possible to not specify any of them but the compiler
will issue a warning then if the result values of a subroutine call are discarded.
If you don't have a variable to store the output register in, it's then required
to list the register itself instead as output variable.
Arguments should match the subroutine definition. You are allowed to omit the parameter names.
If no definition is available (because you're directly calling memory or a label or something else),
you can freely add arguments (but in this case they all have to be named).
To jump to a subroutine (without returning), prefix the subroutine call with the word 'goto'.
Unlike gotos in other languages, here it takes arguments as well, because it
essentially is the same as calling a subroutine and only doing something different when it's finished.
**Register preserving calls:** use the ``!`` followed by a combination of A, X and Y (or followed
by nothing, which is the same as AXY) to tell the compiler you want to preserve the original
value of the given registers after the subroutine call. Otherwise, the subroutine may just
as well clobber all three registers. Preserving the original values does result in some
stack manipulation code to be inserted for every call like this, which can be quite slow.


@ -1,164 +0,0 @@
.. _programstructure:
=================
Program Structure
=================
This chapter describes a high level overview of the elements that make up a program.
Details about each of them, and the syntax, are discussed in the :ref:`syntaxreference` chapter.
Elements of a program
---------------------
.. data:: Program
Consists of one or more *modules*.
.. data:: Module
A file on disk with the ``.ill`` suffix. It contains *directives* and *code blocks*.
Whitespace and indentation in the source code are arbitrary and can be tabs or spaces or both.
You can also add *comments* to the source code.
One module file can *import* others, and also import *library modules*.
.. data:: Comments
Everything after a semicolon ``;`` is a comment and is ignored by the compiler.
If the whole line is just a comment, it will be copied into the resulting assembly source code.
This makes it easier to understand and relate the generated code. Examples::
A = 42 ; set the initial value to 42
; next is the code that...
.. data:: Directive
These are special instructions for the compiler, to change how it processes the code
and what kind of program it creates. A directive is on its own line in the file, and
starts with ``%``, optionally followed by some arguments.
.. data:: Code block
A block of actual program code. It defines a *scope* (also known as 'namespace') and
can contain IL65 *code*, *variable declarations* and *subroutines*.
More details about this below: :ref:`blocks`.
.. data:: Variable declarations
The data that the code works on is stored in variables ('named values that can change').
The compiler allocates the required memory for them.
There is *no dynamic memory allocation*. The storage size of all variables
is fixed and is determined at compile time.
Variable declarations tend to appear at the top of the code block that uses them.
They define the name and type of the variable, and its initial value.
IL65 supports a small list of data types, including special 'memory mapped' types
that don't allocate storage but instead point to a fixed location in the address space.
.. data:: Code
These are the instructions that make up the program's logic. There are different kinds of instructions
('statements' is a better name):
- value assignment
- looping (for, while, repeat, unconditional jumps)
- conditional execution (if - then - else, and conditional jumps)
- subroutine calls
- label definition
.. data:: Subroutine
Defines a piece of code that can be called by its name from different locations in your code.
It accepts parameters and can return result values.
It can define its own variables but it's not possible to define subroutines nested in other subroutines.
To keep things simple, you can only define subroutines inside code blocks from a module.
.. data:: Label
To label a position in your code where you can jump to from another place, you use a label like this::
nice_place:
; code ...
It's an identifier followed by a colon ``:``. It's allowed to put the next statement on
the same line, after the label.
You can jump to it with a jump statement elsewhere. It is also possible to use a
subroutine call to a label (but without parameters and return value).
.. data:: Scope
Also known as 'namespace', this is a named box around the symbols defined in it.
This prevents name collisions (or 'namespace pollution'), because the name of the scope
is needed as prefix to be able to access the symbols in it.
Anything *inside* the scope can refer to symbols in the same scope without using a prefix.
There are three scopes in IL65:
- global (no prefix)
- code block
- subroutine
Modules are *not* a scope! Everything defined in a module is merged into the global scope.
.. _blocks:
Blocks, Scopes, and accessing Symbols
-------------------------------------
Blocks are the separate pieces of code and data of your program. They are combined
into a single output program. No code or data can occur outside a block. Here's an example::
~ main $c000 {
; this is code inside the block...
}
The name of a block must be unique in your entire program.
Also be careful when importing other modules; blocks in your own code cannot have
the same name as a block defined in an imported module or library.
It's possible to omit this name, but then you can only refer to the contents of the block via its absolute address,
which is required in this case. If you omit *both* name and address, the block is *ignored* by the compiler (and a warning is displayed).
This is a way to quickly "comment out" a piece of code that is unfinished or may contain errors that you
want to work on later, because the contents of the ignored block are not fully parsed either.
The address can be used to place a block at a specific location in memory.
Usually it is omitted, and the compiler will automatically choose the location (usually immediately after
the previous block in memory).
The address must be >= ``$0200`` (because ``$00``--``$ff`` is the ZP and ``$100``--``$1ff`` is the cpu stack).
A block is also a *scope* in your program so the symbols in the block don't clash with
symbols of the same name defined elsewhere in the same file or in another file.
You can refer to the symbols in a particular block by using a *dotted name*: ``blockname.symbolname``.
Labels inside a subroutine are appended again to that; ``blockname.subroutinename.label``.
Every symbol is 'public' and can be accessed from elsewhere given its dotted name.
**The special "ZP" ZeroPage block**
Blocks named "ZP" are treated a bit differently: they refer to the ZeroPage.
The contents of every block with that name (this one may occur multiple times) are merged into one.
Its start address is always set to ``$04``, because ``$00 - $01`` are used by the hardware
and ``$02 - $03`` are reserved as general purpose scratch registers.
Program Start and Entry Point
-----------------------------
Your program must have a single entry point where code execution begins.
The compiler expects a ``start`` subroutine in the ``main`` block for this,
taking no parameters and having no return value.
As any subroutine, it has to end with a ``return`` statement (or a ``goto`` call)::
~ main {
sub start () -> () {
; program entrypoint code here
return
}
}
The ``main`` module is always relocated to the start of your programs
address space, and the ``start`` subroutine (the entrypoint) will be on the
first address. This will also be the address that the BASIC loader program (if generated)
calls with the SYS statement.


@ -10,20 +10,22 @@ Module file
This is a file with the ``.ill`` suffix, containing *directives* and *code blocks*, described below.
The file is a text file which can also contain:
Lines, whitespace, indentation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Line endings are significant because *only one* declaration, statement or other instruction can occur on every line.
Other whitespace and line indentation is arbitrary and ignored by the compiler.
You can use tabs or spaces as you wish.
Source code comments
^^^^^^^^^^^^^^^^^^^^
Everything after a semicolon ``;`` is a comment and is ignored.
If the whole line is just a comment, it will be copied into the resulting assembly source code.
This makes it easier to understand and relate the generated code. Examples::
A = 42 ; set the initial value to 42
; next is the code that...
.. _directives:
@ -168,6 +170,19 @@ Also read :ref:`blocks`. Here is an example of a code block, to be loaded at ``
}
Labels
------
To label a position in your code where you can jump to from another place, you use a label::
nice_place:
; code ...
It's just an identifier followed by a colon ``:``. It's allowed to put the next statement on
the same line, after the label.
Variables and value literals
----------------------------
@ -178,7 +193,23 @@ data types below you can see how they should be written.
Variable declarations
^^^^^^^^^^^^^^^^^^^^^
Variables should be declared with their exact type and size so the compiler can allocate storage
for them. You can give them an initial value as well. That value can be a simple literal value,
but you can put a (constant) expression there as well. The syntax is::
<datatype> <variable name> [ = <initial value> ]
Various examples::
word thing
byte counter = 0
byte age = 2018 - 1974
float wallet = 55.25
str name = "my name is Irmen"
word address = #counter
byte[5] values = [11, 22, 33, 44, 55]
byte[5][6] empty_matrix
Data types
@ -211,121 +242,132 @@ type identifier type storage size example var declara
=============== ======================= ================= =========================================
**String encoding:**
Strings in your code will be encoded (translated from ASCII/UTF-8) into either CBM PETSCII or C-64 screencodes.
PETSCII is the default, so if you need screencodes (also called 'poke' codes)
you have to use the ``_s`` variants of the string type identifier.
A string literal of length 1 is considered to be a *byte* instead with
that single character's PETSCII value. If you really need a *string* of length 1 you
can only do so by assigning it to a variable with one of the string types.
**Floating point numbers:**
Floats are stored in the 5-byte 'MFLPT' format that is used on CBM machines,
and also most float operations are specific to the Commodore-64.
This is because routines in the C-64 BASIC and KERNAL ROMs are used for that.
So floating point operations will only work if the C-64 BASIC ROM (and KERNAL ROM)
are banked in (and your code imports the ``c64lib.ill``)
The largest 5-byte MFLPT float that can be stored is: **1.7014118345e+38** (negative: **-1.7014118345e+38**)
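As an illustration of the 5-byte storage, the Python sketch below packs a float into the layout commonly documented for CBM BASIC: one exponent byte (biased by 128 for a mantissa between 0.5 and 1), followed by four mantissa bytes whose top bit is replaced by the sign. This layout is an assumption stated here, not something taken from the compiler, and the function name ``to_mflpt`` is hypothetical. The sketch truncates the mantissa instead of rounding it.

```python
import math


def to_mflpt(value: float) -> bytes:
    """Pack a float into 5 bytes: exponent byte plus 4 mantissa bytes."""
    if value == 0.0:
        return bytes(5)
    sign = 0x80 if value < 0 else 0x00
    # abs(value) == mantissa * 2**exponent, with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(abs(value))
    exponent_byte = exponent + 128
    if exponent_byte > 255:
        raise OverflowError("value too large for a 5-byte float")
    if exponent_byte < 1:
        return bytes(5)  # too small to represent: treated as zero here
    bits = int(mantissa * (1 << 32))            # 32 mantissa bits, top bit is always 1
    bits = (bits & 0x7FFFFFFF) | (sign << 24)   # replace the top bit by the sign
    return bytes([exponent_byte]) + bits.to_bytes(4, "big")


print(to_mflpt(1.0).hex())
print(to_mflpt(55.25).hex())
```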
**Initial values over multiple runs:**
The initial values of your variables will be restored automatically when the program is (re)started,
*except for string variables*. It is assumed these are left unchanged by the program.
If you do modify them in-place, you should take care yourself that they work as
expected when the program is restarted.
**@todo pointers/addresses? (as opposed to normal WORDs)** **@todo pointers/addresses? (as opposed to normal WORDs)**
**@todo signed integers (byte and word)?** **@todo signed integers (byte and word)?**
Reserved names
^^^^^^^^^^^^^^
Indirect addressing and address-of The following names are reserved and have a special meaning::
----------------------------------
**Address-of:** A X Y ; 6502 hardware registers
The ``#`` prefix is used to take the address of something. AX AY XY ; 16-bit pseudo register pairs
It can be used for example to work with the *address* of a memory mapped variable rather than SC SI SZ ; bits of the 6502 hardware status register
the value it holds. You could take the address of a string as well, but that is redundant:
the compiler already treats those as a value that you manipulate via its address.
For most other types this prefix is not supported and will result in a compilation error.
The resulting value is simply a 16 bit word.
**Indirect addressing:**
@todo
**Indirect addressing in jumps:**
@todo
For an indirect ``goto`` statement, the compiler will issue the 6502 CPU's special instruction
(``jmp`` indirect). A subroutine call (``jsr`` indirect) is emitted
using a couple of instructions.
Conditional Execution Operators
--------------------- ---------
Conditional execution means that the flow of execution changes based on certain conditions, .. data:: # (address-of)
rather than having fixed gotos or subroutine calls. IL65 has a *conditional goto* statement for this,
that is translated into a comparison (if needed) and then a conditional branch instruction::
if[_XX] [<expression>] goto <label> Takes the address of the symbol following it: ``word address = #somevar``
The if-status XX is one of: [cc, cs, vc, vs, eq, ne, true, not, zero, pos, neg, lt, gt, le, ge] .. data:: + - * / // ** % (arithmetic)
It defaults to 'true' (=='ne', not-zero) if omitted. ('pos' will translate into 'pl', 'neg' into 'mi')
@todo signed: lts==neg?, gts==eq+pos?, les==neg+eq?, ges==pos?
The <expression> is optional. If it is provided, it will be evaluated first. Only the [true] and [not] and [zero] ``+``, ``-``, ``*``, ``/`` are the familiar arithmetic operations.
if-statuses can be used when such a *comparison expression* is used. An example is::
if_not A > 55 goto more_iterations ``//`` means *integer division* even when the operands are floating point values: ``9.5 // 2.5`` is 3 (and not 3.8)
``**`` is the power operator: ``3 ** 5`` is equal to 3*3*3*3*3 and is 243.
``%`` is the modulo operator: ``25 % 7`` is 4.
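The three less familiar operators can be tried out in Python, whose operators of the same spelling give the results quoted above. This is only a cross-check of the arithmetic, under the assumption that IL65 follows the same rules; note that Python shows the result of a float ``//`` as ``3.0``.

```python
# Integer (floor) division versus true division on floating point operands.
floor_quotient = 9.5 // 2.5
true_quotient = 9.5 / 2.5

# Power and modulo.
power = 3 ** 5
remainder = 25 % 7

print(floor_quotient, true_quotient, power, remainder)
```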
Conditional jumps are compiled into 6502's branching instructions (such as ``bne`` and ``bcc``) so .. data:: << >> <<@ >>@ & | ^ ~ (bitwise arithmetic)
the rather strict limit on how *far* it can jump applies. The compiler itself can't figure this
out unfortunately, so it is entirely possible to create code that cannot be assembled successfully. ``<<`` and ``>>`` are bitwise shifts (left and right), ``<<@`` and ``>>@`` are bitwise rotations (left and right)
You'll have to restructure your gotos in the code (place target labels closer to the branch)
if you run into this type of assembler error. ``&`` is bitwise and, ``|`` is bitwise or, ``^`` is bitwise xor, ``~`` is bitwise invert (this one is a unary operator)
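The difference between a shift and a rotation is easiest to see on a single byte. The Python sketch below assumes a plain 8-bit rotation in which the bit that falls off one end re-enters at the other end. Whether IL65 rotates through the CPU's carry flag instead is not stated here, so treat this as an illustration of the concept only; the function names are hypothetical.

```python
def rotate_left(value: int, count: int = 1) -> int:
    """Rotate an 8-bit value left; bits leaving at the top re-enter at the bottom."""
    count %= 8
    value &= 0xFF
    return ((value << count) | (value >> (8 - count))) & 0xFF


def rotate_right(value: int, count: int = 1) -> int:
    """Rotate an 8-bit value right; bits leaving at the bottom re-enter at the top."""
    count %= 8
    value &= 0xFF
    return ((value >> count) | (value << (8 - count))) & 0xFF


value = 0b10000001
print(format((value << 1) & 0xFF, "08b"))    # shift left: the top bit is lost
print(format(rotate_left(value), "08b"))     # rotate left: the top bit wraps around
print(format(rotate_right(value), "08b"))    # rotate right: the bottom bit wraps around
```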
Assignments .. data:: = (assignment)
-----------
Assignment statements assign a single value to a target variable or memory location.:: Sets the target on the LHS (left hand side) of the operator to the value of the expression on the RHS (right hand side).
target = value-expression
Augmented Assignments .. data:: += -= *= /= //= **= <<= >>= <<@= >>@= &= |= ^= (augmented assignment)
---------------------
A special assignment is the *augmented assignment* where the value is modified in-place. Syntactic sugar; ``A += X`` is equivalent to ``A = A + X``
Several assignment operators are available: ``+=``, ``-=``, ``&=``, ``|=``, ``^=``, ``<<=``, ``>>=``
Expressions .. data:: ++ -- (postfix increment and decrement)
-----------
In most places where a number or other value is expected, you can use just the number, or a full constant expression. Syntactic sugar; ``A++`` is equivalent to ``A = A + 1``, and ``A--`` is equivalent to ``A = A - 1``.
The expression is parsed and evaluated by Python itself at compile time, and the (constant) resulting value is used in its place. Because these operations are so common, we have these short forms.
Of course the special il65 syntax for hexadecimal numbers (``$xxxx``), binary numbers (``%bbbbbbbb``),
and the address-of (``#xxxx``) is supported. Other than that it must be valid Python syntax.
Expressions can contain function calls to the math library (sin, cos, etc) and you can also use
all builtin functions (max, avg, min, sum etc). They can also reference identifiers defined elsewhere in your code,
if this makes sense.
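The number notations mentioned above map directly onto integer values. As a minimal sketch (not the compiler's parser), the hypothetical Python helper below converts a single ``$`` hexadecimal, ``%`` binary or plain decimal literal to an integer. It deliberately ignores the address-of prefix and full expressions.

```python
def parse_number(text: str) -> int:
    """Convert a single il65-style number literal to an integer."""
    text = text.strip()
    if text.startswith("$"):
        return int(text[1:], 16)   # hexadecimal, e.g. $FFC3
    if text.startswith("%"):
        return int(text[1:], 2)    # binary, e.g. %00001111
    return int(text, 10)           # plain decimal


print(parse_number("$FFC3"), parse_number("%00001111"), parse_number("55"))
```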
Subroutines .. data:: == != < > <= >= (comparison)
-----------
Defining a subroutine Equality, Inequality, Less-than, Greater-than, Less-or-Equal-than, Greater-or-Equal-than comparisons.
^^^^^^^^^^^^^^^^^^^^^ The result is a 'boolean' value 'true' or 'false' (which in reality is just a byte value of 1 or 0).
Subroutines are parts of the code that can be repeatedly invoked using a subroutine call from elsewhere.
Their definition, using the sub statement, includes the specification of the required input- and output parameters. .. data:: not and or xor (logical)
For now, only register based parameters are supported (A, X, Y and paired registers,
the carry status bit SC and the interrupt disable bit SI as specials). These operators are the usual logical operations that are part of a logical expression to reason
For subroutine return values, the special SZ register is also available, it means the zero status bit. about truths (boolean values). The result of such an expression is a 'boolean' value 'true' or 'false'
(which in reality is just a byte value of 1 or 0).
.. data:: .. (range creation)
Creates a range of values from the LHS value to the RHS value, inclusive.
These are mainly used in for loops to set the loop range. Example::
0 .. 7 ; range of values 0, 1, 2, 3, 4, 5, 6, 7 (constant)
A = 5
X = 10
A .. X ; range of 5, 6, 7, 8, 9, 10
byte[4] array = 10 .. 13 ; sets the array to [10, 11, 12, 13]
for i in 0 .. 127 {
; i loops 0, 1, 2, ... 127
}
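Note that the range includes its end value, unlike Python's ``range``. The Python sketch below mimics the inclusive behaviour for comparison; the helper name ``inclusive_range`` is hypothetical and only positive step sizes are handled.

```python
def inclusive_range(first: int, last: int, step: int = 1) -> list:
    """Return the values from first up to and including last."""
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    return list(range(first, last + 1, step))


print(inclusive_range(0, 7))
print(inclusive_range(10, 13))
print(len(inclusive_range(0, 127)))
```

A range of ``0 .. 127`` therefore produces 128 values, not 127.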
.. data:: [ ... ] (array indexing)
When put after a sequence type (array, string or matrix) it means to point to the given element in that sequence::
array[2] ; the third byte in the array (index is 0-based)
matrix[4,2] ; the byte at the 5th column and 3rd row in the matrix
.. data:: ( ... ) (precedence grouping in expressions, or subroutine parameter list)
Parentheses are used to choose the evaluation precedence in expressions.
Usually the normal precedence rules apply (``*`` goes before ``+`` etc.) but with
parentheses you can change this: ``4 + 8 * 2`` is 20, but ``(4 + 8) * 2`` is 24.
Parentheses are also used in a subroutine call, they follow the name of the subroutine and contain
the list of arguments to pass to the subroutine: ``big_function(1, 99)``
Subroutine calls
----------------
You call a subroutine like this::
[ result = ] subroutinename_or_address ( [argument...] )
; example:
outputvar1, outputvar2 = subroutine ( arg1, arg2, arg3 )
Arguments are separated by commas. The argument list can also be empty if the subroutine
takes no parameters.
If the subroutine returns one or more result values, you must use an assignment statement
to store those values somewhere. If the subroutine has no result values, you must
omit the assignment.
Subroutine definitions
----------------------
The syntax is:: The syntax is::
@ -333,13 +375,33 @@ The syntax is::
... statements ... ... statements ...
} }
**proc_parameters =** ; example:
sub triple_something (amount: X) -> A {
return X * 3
}
The open curly brace must immediately follow the subroutine result specification on the same line,
and can have nothing following it. The close curly brace must be on its own line as well.
Pre-defined subroutines that are available on specific memory addresses
(in system ROM for instance) can be defined by assigning the routine's memory address to the sub,
and not specifying a code block::
sub <identifier> ([proc_parameters]) -> ([proc_results]) = <address>
; example:
sub CLOSE (logical: A) -> (A?, X?, Y?) = $FFC3
.. data:: proc_parameters
comma separated list of "<parametername>:<register>" pairs specifying the input parameters. comma separated list of "<parametername>:<register>" pairs specifying the input parameters.
You can omit the parameter names as long as the arguments "line up". You can omit the parameter names as long as the arguments "line up".
(actually, the Python parameter passing rules apply, so you can also mix positional (actually, the Python parameter passing rules apply, so you can also mix positional
and keyword arguments, as long as the keyword arguments come last) and keyword arguments, as long as the keyword arguments come last)
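For readers who do not know the Python rules referred to here, the sketch below shows them on a made-up Python function (``plot`` and its parameters are hypothetical and not part of any IL65 library). Positional arguments are matched in order, keyword arguments are matched by name, and a positional argument after a keyword argument is rejected.

```python
def plot(column, row):
    """Hypothetical two-parameter routine used only to show argument matching."""
    return (column, row)


all_positional = plot(10, 20)
mixed = plot(10, row=20)
all_keywords = plot(row=20, column=10)

# A positional argument after a keyword argument is a syntax error.
try:
    compile("plot(column=10, 20)", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True

print(all_positional, mixed, all_keywords, rejected)
```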
**proc_results =** .. data:: proc_results
comma separated list of <register> names specifying in which register(s) the output is returned. comma separated list of <register> names specifying in which register(s) the output is returned.
If the register name ends with a '?', that means the register doesn't contain a real return value but If the register name ends with a '?', that means the register doesn't contain a real return value but
is clobbered in the process so the original value it had before calling the sub is no longer valid. is clobbered in the process so the original value it had before calling the sub is no longer valid.
@ -349,49 +411,53 @@ The syntax is::
what the changed registers are, assume the worst") what the changed registers are, assume the worst")
Pre-defined subroutines that are available on specific memory addresses Loops
(in system ROM for instance) can also be defined using the 'sub' statement. -----
To do this you assign the routine's memory address to the sub::
sub <identifier> ([proc_parameters]) -> ([proc_results]) = <address> for loop
^^^^^^^^
@todo::
example:: for <loopvar> in <range> [ step <amount> ] {
; do something...
sub CLOSE (logical: A) -> (A?, X?, Y?) = $FFC3 }
Calling a subroutine while loop
^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^
@todo::
You call a subroutine like this:: while <condition> {
; do something...
}
subroutinename_or_address ( [arguments...] )
or:: repeat--until loop
^^^^^^^^^^^^^^^^^^
@todo::
subroutinename_or_address ![register(s)] ( [arguments...] ) repeat {
; do something...
} until <condition>
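The loop sections are still marked as to-do, so the following is an assumption about the intended behaviour rather than a description of the compiler. If ``repeat ... until`` tests its condition after the body, as the syntax suggests, the body always runs at least once. Python has no such loop, so the sketch below emulates it with ``while True`` and a ``break``; the function name is hypothetical.

```python
def repeat_until_demo(start: int, limit: int) -> list:
    """Collect counter values, testing the exit condition after each pass."""
    visited = []
    counter = start
    while True:
        visited.append(counter)   # loop body
        counter += 1
        if counter >= limit:      # the 'until' condition, checked after the body
            break
    return visited


print(repeat_until_demo(0, 3))
print(repeat_until_demo(5, 3))   # condition already true, body still runs once
```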
If the subroutine returns one or more values as results, you must use an assignment statement
to store those values somewhere::
outputvar1, outputvar2 = subroutine ( arg1, arg2, arg3 ) Conditional Execution
---------------------
The output variables must occur in the correct sequence of return registers as specified Must align this with the various status bits in the cpu... not only true/false....
in the subroutine's definition. It is possible to not specify any of them but the compiler
will issue a warning then if the result values of a subroutine call are discarded.
If you don't have a variable to store the output register in, it's then required
to list the register itself instead as output variable.
Arguments should match the subroutine definition. You are allowed to omit the parameter names. @todo::
If no definition is available (because you're directly calling memory or a label or something else),
you can freely add arguments (but in this case they all have to be named).
To jump to a subroutine (without returning), prefix the subroutine call with the word 'goto'. if <condition> {
Unlike gotos in other languages, here it takes arguments as well, because it ; do something....
essentially is the same as calling a subroutine and only doing something different when it's finished. }
[ else {
; do something else...
} ]
@todo::
if <condition> goto <location>
**Register preserving calls:** use the ``!`` followed by a combination of A, X and Y (or followed
by nothing, which is the same as AXY) to tell the compiler you want to preserve the original
value of the given registers after the subroutine call. Otherwise, the subroutine may just
as well clobber all three registers. Preserving the original values does result in some
stack manipulation code to be inserted for every call like this, which can be quite slow.