====================
Programming in Prog8
====================

This chapter gives a high-level overview of the elements that make up a program.


Elements of a program
---------------------

Program
    Consists of one or more *modules*.

Module
    A file on disk with the ``.p8`` suffix. It can contain *directives* and *code blocks*.
    A module file can *import* other modules, including *library modules*.
    It should be saved in UTF-8 encoding.
    Line endings are significant because *only one* declaration, statement or other instruction can occur on every line.
    Other whitespace and line indentation is arbitrary and ignored by the compiler.
    You can use tabs or spaces as you wish, and even mix them.

Comments
    Everything on the line after a semicolon ``;`` is a comment and is ignored by the compiler.
    If the whole line is just a comment, this line will be copied into the resulting assembly source code for reference.
    There's also a block comment: everything surrounded with ``/*`` and ``*/`` is ignored and this can span multiple lines.
    This block comment is experimental for now: it may change or even be removed again in a future compiler version.
    The recommended way to comment out a bunch of lines remains to just bulk comment them individually with ``;``.

Directive
    These are special instructions for the compiler, to change how it processes the code
    and what kind of program it creates. A directive is on its own line in the file, and
    starts with ``%``, optionally followed by some arguments.
    The list of directives is given below at :ref:`directives`.

Code block
    A block of actual program code. It has a starting address in memory,
    and defines a *scope* (also known as 'namespace').
    It contains variables and subroutines.
    More details about this below: :ref:`blocks`.

Variable declarations
    The data that the code works on is stored in variables ('named values that can change').
    They are described in the chapter :ref:`variables`.

Code
    These are the instructions that make up the program's logic.
    Code can only occur inside a subroutine.
    There are different kinds of instructions ('statements' is a better name) such as:

    - value assignment
    - looping (for, while, do-until, repeat, unconditional jumps)
    - conditional execution (if - then - else, when, and conditional jumps)
    - subroutine calls
    - label definition

Subroutine
    Defines a piece of code that can be called by its name from different locations in your code.
    It accepts parameters and can optionally return a value.
    It can define its own variables, and it is also possible to define subroutines within other subroutines.
    Nested subroutines can access the variables from outer scopes easily, which removes the need
    and overhead to pass everything via parameters all the time.
    Subroutines do not have to be declared in the source code before they can be called.

Label
    This is a named position in your code that you can jump to with a jump statement elsewhere.
    It is also possible to use a subroutine call to a label (but without parameters and return value).
    A label is an identifier followed by a colon ``:``. It's ok to put the next statement on
    the same line, immediately after the label.

Scope
    Also known as 'namespace', this is a named box around the symbols defined in it.
    This prevents name collisions (or 'namespace pollution'), because the name of the scope
    is needed as prefix to be able to access the symbols in it.
    Anything *inside* the scope can refer to symbols in the same scope without using a prefix.
    There are three scope levels in Prog8:

    - global (no prefix), everything in a module file goes in here;
    - block;
    - subroutine, can be nested in another subroutine.

    Even though modules are separate files, they are *not* separate scopes!
    Everything defined in a module is merged into the global scope.
    This is different from most other languages that have modules.
    The global scope can only contain blocks and some directives, while the others can contain variables and subroutines too.
    Some more details about how to deal with scopes and names are discussed below.


Identifiers
-----------

Naming things in Prog8 is done via valid *identifiers*. They start with a letter,
and after that, a combination of letters, numbers, or underscores.
Note that any Unicode Letter symbol is accepted as a letter!
Examples of valid identifiers::

    a
    A
    monkey
    COUNTER
    Better_Name_2
    something_strange__
    knäckebröd
    приблизительно
    π

**Scoped names**

Sometimes called "qualified names" or "dotted names", a scoped name is a sequence of identifiers separated by a dot.
They are used to reference symbols in other scopes. Note that unlike many other programming languages,
scoped names always need to be fully scoped (because they always start in the global scope). Also see :ref:`blocks`::

    main.start              ; the entrypoint subroutine
    main.start.variable     ; a variable in the entrypoint subroutine

**Aliases**

The ``alias`` statement makes it easier to refer to symbols from other places, and it can save
you from having to type the fully scoped name every time you need to access that symbol.
Aliases can be created in any scope except at the module level.
An alias is created with ``alias <name> = <target>`` and then you can use ``<name>`` as if it were ``<target>``.
It is possible to alias variables, labels and subroutines, but not whole blocks.
The name has to be an unscoped identifier name, the target can be any symbol.


.. _blocks:

Blocks, Scopes, and accessing Symbols
-------------------------------------

**Blocks** are the top level separate pieces of code and data of your program. They have a
starting address in memory and will be combined together into a single output program.
They can only contain *directives*, *variable declarations*, *subroutines* and *inline assembly code*::

    <blockname> [<address>] {
        <directives>
        <variable declarations>
        <subroutines>
        <inline assembly code>
    }

The <blockname> must be a valid identifier, and must be unique in the entire program (there's
a directive to merge multiple occurrences).
The <address> is optional. If specified it must be a valid memory address such as ``$c000``.
It's used to tell the compiler to put the block at a certain position in memory.

.. sidebar::
    Using qualified names ("dotted names") to reference symbols defined elsewhere

    Every symbol is 'public' and can be accessed from anywhere else, when given its *full* "dotted name".
    So, accessing a variable ``counter`` defined in subroutine ``worker`` in block ``main``,
    can be done from anywhere by using ``main.worker.counter``.
    Unlike most other programming languages, as soon as a name is scoped,
    Prog8 treats it as a name starting in the *global* namespace.
    Relative name lookup is only performed for *non-scoped* names.

The address can be used to place a block at a specific location in memory.
Usually it is omitted, and the compiler will automatically choose the location (usually immediately after
the previous block in memory).
It must be >= ``$0200`` (because ``$00``--``$ff`` is the zero page (ZP) and ``$100``--``$1ff`` is the CPU stack).

*Symbols* are names defined in a certain *scope*. Inside the same scope, you can refer
to them by their 'short' name directly. If the symbol is not found in the same scope,
the enclosing scope is searched for it, and so on, up to the top level block, until the symbol is found.
If the symbol was not found the compiler will issue an error message.

**Subroutines** create a new scope. All variables inside a subroutine are hoisted up to the
scope of the subroutine they are declared in. Note that you can define **nested subroutines** in Prog8,
and such a nested subroutine has its own scope! This also means that you have to use a fully qualified name
to access a variable from a nested subroutine::

    main {
        sub start() {
            sub nested() {
                ubyte counter
                ...
            }
            ...
            txt.print_ub(counter)                       ; Error: undefined symbol
            txt.print_ub(main.start.nested.counter)     ; OK
        }
    }

**Aliases** make it easier to refer to symbols from other places.
They save you from having to type the fully scoped name every time you need to access that symbol.
Aliases can be created in any scope except at the module level.
You can create and use an alias with the ``alias`` statement like so::

    alias  score   = cx16.r7L      ; 'name' the virtual register
    alias  prn     = txt.print_ub  ; shorter name for a subroutine elsewhere
    ...
    prn(score)

.. important::
    Emphasizing this once more: unlike most other programming languages, a new scope is *not* created inside
    for, while, repeat, and do-until statements, the if statement, and the branching conditionals.
    These all share the same scope from the subroutine they're defined in.
    You can define variables in these blocks, but these will be treated as if they
    were defined in the subroutine instead.


Program Start and Entry Point
-----------------------------

Your program must have a single entry point where code execution begins.
The compiler expects a ``start`` subroutine in the ``main`` block for this,
taking no parameters and having no return value.

Like any subroutine, it has to end with a ``return`` statement (or a ``goto`` call)::

    main {
        sub start () {
            ; program entrypoint code here
            return
        }
    }

The ``main`` block is always relocated to the start of your program's address space,
and the ``start`` subroutine (the entrypoint) will be on the first address. This will also
be the address that the BASIC loader program (if generated) calls with the SYS statement.


.. _directives:

Directives
-----------

.. data:: %address <address>

    Level: module.
    Global setting, set the program's start memory address. It's usually fixed at ``$0801`` because the
    default launcher type is a CBM-BASIC program. But you have to specify this address yourself when
    you don't use a CBM-BASIC launcher.

.. data:: %align <interval>

    Level: not at module scope.
    Tells the assembler to continue assembling on the given alignment interval. For example, ``%align $100``
    will insert an assembler command to align on the next page boundary.
    Note that this has no impact on variables following this directive! Prog8 reallocates all variables
    using different rules. If you want to align a specific variable (array or string), you should use
    one of the alignment tags for variable declarations instead.
    Valid intervals are from 2 to 65536.

    **Warning:** if you use this directive in between normal statements, it will disrupt the output
    of the machine code instructions by making gaps between them. This will probably crash the program!

.. data:: %asm {{ ... }}

    Level: not at module scope.
    Declares that a piece of *assembly code* is inside the curly braces.
    This code will be copied as-is into the generated output assembly source file.
    Note that the start and end markers are both *double curly braces* to minimize the chance
    that the assembly code itself contains either of those. If it does contain a ``}}``,
    it will confuse the parser.

    If you use the correct scoping rules you can access symbols from the prog8 program from inside
    the assembly code. Sometimes you'll have to declare a variable in prog8 with ``@shared`` if it
    is only used in such assembly code.

    .. note::
        64tass syntax is required for the assembly code. As such, mnemonics need to be written in lowercase.

    .. caution::
        Avoid using single-letter symbols in included assembly code, as they could be confused with CPU registers.
        Also, note that all prog8 symbols are prefixed in assembly code, see :ref:`symbol-prefixing`.

.. data:: %asmbinary "<filename>" [, <offset>[, <length>]]

    Level: not at module scope.
    This directive can only be used inside a block.
    The assembler itself will include the file as binary bytes at this point, prog8 will not process this at all.
    This means that the filename must be spelled exactly as it appears on your computer's file system.
    Note that this filename may differ in case compared to when you chose to load the file from disk from within the
    program code itself (for example on the C64 and X16 there's the PETSCII encoding difference).
    The file is located relative to the current working directory!
    The optional offset and length can be used to select a particular piece of the file.
    To reference the contents of the included binary data, you can put a label in your prog8 code
    just before the %asmbinary. To find out where the included binary data ends, add another label directly after it.
    An example program for this can be found below at the description of %asminclude.

.. data:: %asminclude "<filename>"

    Level: not at module scope.
    This directive can only be used inside a block.
    The assembler will include the file as raw assembly source text at this point,
    prog8 will not process this at all. Symbols defined in the included assembly cannot be referenced
    from prog8 code. However they can be referenced from other assembly code if properly prefixed.
    You can of course use a label in your prog8 code just before the %asminclude directive, and reference
    that particular label to get to (the start of) the included assembly.
    Be careful: you risk symbol redefinitions or duplications if you include a piece of
    assembly into a prog8 block that already defines symbols itself.
    The compiler first looks for the file relative to the directory of the module containing this statement;
    if the file can't be found there, it is searched for relative to the current directory.

    .. caution::
        Avoid using single-letter symbols in included assembly code, as they could be confused with CPU registers.
        Also, note that all prog8 symbols are prefixed in assembly code, see :ref:`symbol-prefixing`.

    Here is a small example program to show how to use labels to reference the included contents from prog8 code::

        %import textio
        %zeropage basicsafe

        main {

            sub start() {
                txt.print("first three bytes of included asm:\n")
                uword included_addr = &included_asm
                txt.print_ub(@(included_addr))
                txt.spc()
                txt.print_ub(@(included_addr+1))
                txt.spc()
                txt.print_ub(@(included_addr+2))

                txt.print("\nfirst three bytes of included binary:\n")
                included_addr = &included_bin
                txt.print_ub(@(included_addr))
                txt.spc()
                txt.print_ub(@(included_addr+1))
                txt.spc()
                txt.print_ub(@(included_addr+2))
                txt.nl()
                return

        included_asm:
                %asminclude "inc.asm"

        included_bin:
                %asmbinary "inc.bin"
        end_of_included_bin:

            }
        }

.. data:: %breakpoint

    Level: not at module scope.
    Defines a debugging breakpoint at this location. See :ref:`debugging`

.. data:: %encoding <encodingname>

    Overrides, in the module file it occurs in,
    the default text encoding to use for strings and characters that have no explicit encoding prefix.
    You can use one of the recognised encoding names, see :ref:`encodings`.

.. data:: %import <name>

    Level: module.
    This reads and compiles the named module source file as part of your current program.
    Symbols from the imported module become available in your code,
    without a module or filename prefix.
    You can import modules one at a time, and importing a module more than once has no effect.

.. data:: %launcher <type>

    Level: module.
    Global setting, selects the program launcher stub to use.
    Only relevant when using the ``prg`` output type. Defaults to ``basic``.

    - type ``basic`` : add a tiny C64 BASIC program, with a SYS statement calling into the machine code
    - type ``none`` : no launcher logic is added at all

.. data:: %memtop <address>

    Level: module.
    Global setting, changes the program's top memory address. This is usually specified internally by the compiler target,
    but with this you can change it to another value. This can be useful for example to 'reserve' a piece
    of memory at the end of program space where other data such as external library files can be loaded into.
    This memtop value is used for a check instruction for the assembler to see if the resulting program size
    exceeds the given memtop address. This value is exclusive, so ``$a000`` means that ``$a000`` is the first address
    that the program can no longer use. Everything up to and including ``$9fff`` is still usable.

.. data:: %option