diff --git a/docs/source/building.rst b/docs/source/building.rst
new file mode 100644
index 000000000..058734881
--- /dev/null
+++ b/docs/source/building.rst
@@ -0,0 +1,73 @@
+==============================
+Writing and building a program
+==============================
+
+What is a "Program" anyway?
+---------------------------
+
+A "complete runnable program" is a compiled, assembled, and linked together single unit.
+It contains all of the program's code and data and has a certain file format that
+allows it to be loaded directly on the target system. IL65 currently has no built-in
+support for programs that exceed 64 KB of memory, nor for multi-part loaders.
+
+For the Commodore-64, most programs will have a tiny BASIC launcher that does a SYS into the generated machine code.
+This way the user can load it as any other program and simply RUN it to start. (This is a regular ".prg" program).
+IL65 can create those, but it is also possible to output plain binary programs
+that can be loaded into memory anywhere.
+
+
+Compiling program code
+----------------------
+
+Compilation of program code is done by telling the IL65 compiler to compile a main source code module file.
+Other modules that this code needs will be loaded and processed via imports from within that file.
+The compiler will link everything together into one output program at the end.
+
+The compiler is invoked with the command:
+
+ ``$ @todo``
+
+
+Module source code files
+------------------------
+
+A module source file is a text file with the ``.ill`` suffix, containing the program's source code.
+It consists of compilation options and other directives, imports of other modules,
+and source code for one or more code blocks.
+
+IL65 has a couple of *LIBRARY* modules that are defined in special internal files provided by the compiler:
+``c64lib``, ``il65lib``, ``mathlib``.
+You should not overwrite these or reuse their names.
+
+
+.. _debugging:
+
+Debugging (with Vice)
+---------------------
+
+There's support for using the monitor and debugging capabilities of the rather excellent
+`Vice emulator `_.
+
+The ``%breakpoint`` directive (see :ref:`directives`) in the source code instructs the compiler to put
+a *breakpoint* at that position. Some systems use a BRK instruction for this, but
+this will usually halt the machine altogether instead of just suspending execution.
+IL65 issues a NOP instruction instead and creates a 'virtual' breakpoint at this position.
+All breakpoints are then written to a file called "programname.vice-mon-list",
+which is meant to be used by the Vice emulator.
+It contains a series of commands for Vice's monitor, including source labels and the breakpoint settings.
+If you use the Vice autostart feature of the compiler, this file will be processed by Vice automatically and immediately.
+If you launch Vice manually, you'll have to use a command line option to load this file:
+
+ ``$ x64 -moncommands programname.vice-mon-list``
+
+Vice will then use the label names in memory disassembly, and will activate the breakpoints as well.
+If your running program hits one of the breakpoints, Vice will halt execution and drop you into the monitor.
+
+
+Troubleshooting
+---------------
+
+Getting an assembler error about undefined symbols such as ``not defined 'c64flt'``?
+This happens when your program uses floating point values, and you forgot to import the ``c64lib``.
+If you use floating point values, the program will need routines from that library.
+Fix it by adding an ``%import c64lib``.
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 69f358ee4..643bbf1a7 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -40,6 +40,7 @@ Design principles
- The compiler outputs a regular 6502 assembly code file, it doesn't assemble this itself.
A third party cross-assembler tool is used to do this final step.
- Goto is considered harmful, but not here; arbitrary control flow jumps are allowed.
+- No complicated error handling or overflow checks that would slow things down.
Required tools
@@ -56,8 +57,8 @@ Required tools
:caption: Contents of this manual:
targetsystem.rst
+ building.rst
programming.rst
- progstructure.rst
syntaxreference.rst
todo.rst
diff --git a/docs/source/programming.rst b/docs/source/programming.rst
index 058734881..f1cfc81ed 100644
--- a/docs/source/programming.rst
+++ b/docs/source/programming.rst
@@ -1,73 +1,335 @@
-==============================
-Writing and building a program
-==============================
+.. _programstructure:
-What is a "Program" anyway?
----------------------------
+===================
+Programming in IL65
+===================
-A "complete runnable program" is a compiled, assembled, and linked together single unit.
-It contains all of the program's code and data and has a certain file format that
-allows it to be loaded directly on the target system. IL65 currently has no built-in
-support for programs that exceed 64 Kb of memory, nor for multi-part loaders.
-
-For the Commodore-64, most programs will have a tiny BASIC launcher that does a SYS into the generated machine code.
-This way the user can load it as any other program and simply RUN it to start. (This is a regular ".prg" program).
-Il65 can create those, but it is also possible to output plain binary programs
-that can be loaded into memory anywhere.
+This chapter gives a high-level overview of the elements that make up a program.
+Details about the syntax can be found in the :ref:`syntaxreference` chapter.
-Compiling program code
-----------------------
-
-Compilation of program code is done by telling the IL65 compiler to compile a main source code module file.
-Other modules that this code needs will be loaded and processed via imports from within that file.
-The compiler will link everything together into one output program at the end.
-
-The compiler is invoked with the command:
-
- ``$ @todo``
-
-
-Module source code files
-------------------------
-
-A module source file is a text file with the ``.ill`` suffix, containing the program's source code.
-It consists of compilation options and other directives, imports of other modules,
-and source code for one or more code blocks.
-
-IL65 has a couple of *LIBRARY* modules that are defined in special internal files provided by the compiler:
-``c64lib``, ``il65lib``, ``mathlib``.
-You should not overwrite these or reuse their names.
-
-
-.. _debugging:
-
-Debugging (with Vice)
+Elements of a program
---------------------
-There's support for using the monitor and debugging capabilities of the rather excellent
-`Vice emulator `_.
+Program
+ Consists of one or more *modules*.
-The ``%breakpoint`` directive (see :ref:`directives`) in the source code instructs the compiler to put
-a *breakpoint* at that position. Some systems use a BRK instruction for this, but
-this will usually halt the machine altogether instead of just suspending execution.
-IL65 issues a NOP instruction instead and creates a 'virtual' breakpoint at this position.
-All breakpoints are then written to a file called "programname.vice-mon-list",
-which is meant to be used by the Vice emulator.
-It contains a series of commands for Vice's monitor, including source labels and the breakpoint settings.
-If you use the vice autostart feature of the compiler, it will be processed by Vice automatically and immediately.
-If you launch Vice manually, you'll have to use a command line option to load this file:
+Module
+ A file on disk with the ``.ill`` suffix. It contains *directives* and *code blocks*.
+ Whitespace and indentation in the source code are arbitrary and can be tabs or spaces or both.
+ You can also add *comments* to the source code.
+ One module file can *import* others, and also import *library modules*.
- ``$ x64 -moncommands programname.vice-mon-list``
+Comments
+ Everything after a semicolon ``;`` is a comment and is ignored by the compiler.
+ If the whole line is just a comment, it will be copied into the resulting assembly source code.
+ This makes it easier to understand and relate the generated code. Examples::
-Vice will then use the label names in memory disassembly, and will activate the breakpoints as well.
-If your running program hits one of the breakpoints, Vice will halt execution and drop you into the monitor.
+ A = 42 ; set the initial value to 42
+ ; next is the code that...
+
+Directive
+ These are special instructions for the compiler, to change how it processes the code
+ and what kind of program it creates. A directive is on its own line in the file, and
+ starts with ``%``, optionally followed by some arguments.
+
+Code block
+ A block of actual program code. It defines a *scope* (also known as 'namespace') and
+ can contain IL65 *code*, *variable declarations* and *subroutines*.
+ More details about this below: :ref:`blocks`.
+
+Variable declarations
+ The data that the code works on is stored in variables ('named values that can change').
+ The compiler allocates the required memory for them.
+ There is *no dynamic memory allocation*. The storage size of all variables
+ is fixed and is determined at compile time.
+ Variable declarations tend to appear at the top of the code block that uses them.
+ They define the name and type of the variable, and its initial value.
+ IL65 supports a small list of data types, including special 'memory mapped' types
+ that don't allocate storage but instead point to a fixed location in the address space.
+
+Code
+ These are the instructions that make up the program's logic. There are different kinds of instructions
+ ('statements' is a better name):
+
+ - value assignment
+ - looping (for, while, repeat, unconditional jumps)
+ - conditional execution (if - then - else, and conditional jumps)
+ - subroutine calls
+ - label definition
+
+Subroutine
+ Defines a piece of code that can be called by its name from different locations in your code.
+ It accepts parameters and can return result values.
+ It can define its own variables but it's not possible to define subroutines nested in other subroutines.
+ To keep things simple, you can only define subroutines inside code blocks from a module.
+
+Label
+ This is a named position in your code that you can jump to from another place
+ by using a jump statement. It is also possible to use a
+ subroutine call to a label (but without parameters and return value).
-Troubleshooting
----------------
+Scope
+ Also known as 'namespace', this is a named box around the symbols defined in it.
+ This prevents name collisions (or 'namespace pollution'), because the name of the scope
+ is needed as prefix to be able to access the symbols in it.
+ Anything *inside* the scope can refer to symbols in the same scope without using a prefix.
+ There are three scopes in IL65:
-Getting an assembler error about undefined symbols such as ``not defined 'c64flt'``?
-This happens when your program uses floating point values, and you forgot to import the ``c64lib``.
-If you use floating points, the program will need routines from that library.
-Fix it by adding an ``%import c64lib``.
+ - global (no prefix)
+ - code block
+ - subroutine
+
+ Modules are *not* a scope! Everything defined in a module is merged into the global scope.
+
+
+.. _blocks:
+
+Blocks, Scopes, and accessing Symbols
+-------------------------------------
+
+Blocks are the separate pieces of code and data of your program. They are combined
+into a single output program. No code or data can occur outside a block. Here's an example::
+
+ ~ main $c000 {
+ ; this is code inside the block...
+ }
+
+
+The name of a block must be unique in your entire program.
+Also be careful when importing other modules; blocks in your own code cannot have
+the same name as a block defined in an imported module or library.
+
+It's possible to omit this name, but then you can only refer to the contents of the block via its absolute address,
+which is required in this case. If you omit *both* name and address, the block is *ignored* by the compiler (and a warning is displayed).
+This is a way to quickly "comment out" a piece of code that is unfinished or may contain errors that you
+want to work on later, because the contents of the ignored block are not fully parsed either.
+
+The address can be used to place a block at a specific location in memory.
+Usually it is omitted, and the compiler will automatically choose the location (usually immediately after
+the previous block in memory).
+The address must be >= ``$0200`` (because ``$00``--``$ff`` is the ZP and ``$0100``--``$01ff`` is the CPU stack).
+
+A block is also a *scope* in your program so the symbols in the block don't clash with
+symbols of the same name defined elsewhere in the same file or in another file.
+You can refer to the symbols in a particular block by using a *dotted name*: ``blockname.symbolname``.
+Labels inside a subroutine are appended to that again: ``blockname.subroutinename.label``.
+
+Every symbol is 'public' and can be accessed from elsewhere given its dotted name.
+
+
+**The special "ZP" ZeroPage block**
+
+Blocks named "ZP" are treated a bit differently: they refer to the ZeroPage.
+The contents of every block with that name (this one may occur multiple times) are merged into one.
+Its start address is always set to ``$04``, because ``$00 - $01`` are used by the hardware
+and ``$02 - $03`` are reserved as general purpose scratch registers.
+
+
+Program Start and Entry Point
+-----------------------------
+
+Your program must have a single entry point where code execution begins.
+The compiler expects a ``start`` subroutine in the ``main`` block for this,
+taking no parameters and having no return value.
+Like any subroutine, it has to end with a ``return`` statement (or a ``goto`` call)::
+
+ ~ main {
+ sub start () -> () {
+ ; program entrypoint code here
+ return
+ }
+ }
+
+The ``main`` module is always relocated to the start of your program's
+address space, and the ``start`` subroutine (the entrypoint) will be on the
+first address. This will also be the address that the BASIC loader program (if generated)
+calls with the SYS statement.
+
+
+Variables and data
+------------------
+
+::
+
+ 12345 ; integer number
+ "Hi, I am a string" ; text string
+ -33.456e52 ; floating point number
+
+ byte counter = 42 ; variable of size 8 bits, with initial value 42
+
+
+Integers
+^^^^^^^^
+
+Integers are 8 or 16 bit numbers and can be written in normal decimal notation,
+in hexadecimal and in binary notation.
+
+@todo right now only unsigned integers are supported (>=0)
+
+
+Strings
+^^^^^^^
+
+Strings are a sequence of characters enclosed in ``"`` quotes.
+They're stored and treated much the same as a byte array,
+but they have some special properties because they are considered to be *text*.
+Strings in your source code files will be encoded (translated from ASCII/UTF-8) into either CBM PETSCII or C-64 screencodes.
+PETSCII is the default choice. If you need screencodes (also called 'poke' codes) instead,
+you have to use the ``str_s`` variants of the string type identifier.
+If you assign a string literal of length 1 to a non-string variable, it is treated as a *byte* value instead,
+which has the PETSCII value of that single character.
+
+
+Floating point numbers
+^^^^^^^^^^^^^^^^^^^^^^
+
+Floats are stored in the 5-byte 'MFLPT' format that is used on CBM machines,
+and also most float operations are specific to the Commodore-64.
+This is because routines in the C-64 BASIC and KERNAL ROMs are used for that.
+So floating point operations will only work if the C-64 BASIC ROM (and KERNAL ROM)
+are banked in (and your code imports the ``c64lib.ill``).
+
+The largest 5-byte MFLPT float that can be stored is: **1.7014118345e+38** (negative: **-1.7014118345e+38**)
+
+
+Initial values across multiple runs of the program
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The initial values of your variables will be restored automatically when the program is (re)started,
+*except for string variables*. It is assumed these are left unchanged by the program.
+If you do modify them in-place, you should take care yourself that they work as
+expected when the program is restarted.
+
+
+
+Indirect addressing and address-of
+----------------------------------
+
+The ``#`` operator is used to take the address of the symbol following it.
+It can be used for example to work with the *address* of a memory mapped variable rather than
+the value it holds. You could take the address of a string as well, but that is redundant:
+the compiler already treats those as a value that you manipulate via its address.
+For most other types this prefix is not supported and will result in a compilation error.
+The resulting value is simply a 16 bit word. Example::
+
+ AX = #somevar
+
+
+
+**Indirect addressing:**
+@todo
+
+**Indirect addressing in jumps:**
+@todo
+For an indirect ``goto`` statement, the compiler will issue the 6502 CPU's special instruction
+(``jmp`` indirect). The 6502 has no indirect ``jsr`` instruction, so an indirect subroutine call
+is emitted as a short sequence of instructions.
+
+
+Loops
+-----
+
+The *for*-loop is used to iterate over a range of values. By default the iteration steps by 1,
+but you can set a different step size as well.
+The *while*-loop is used to repeat a piece of code while a certain condition is still true.
+The *repeat--until* loop is used to repeat a piece of code until a certain condition is true.
+
+You can also create loops by using the ``goto`` statement, but this should be avoided.
+
+
+Conditional Execution
+---------------------
+
+@todo
+
+Conditional execution means that the flow of execution changes based on certain conditions,
+rather than having fixed gotos or subroutine calls. IL65 has a *conditional goto* statement for this,
+that is translated into a comparison (if needed) and then a conditional branch instruction::
+
+ if[_XX] [] goto