diff --git a/docs/source/index.rst b/docs/source/index.rst index eff8dc433..0775e2914 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -217,7 +217,7 @@ Look in the `syntax-files = `` and then you can use ```` as if it were ````. +It is possible to alias variables, labels and subroutines, but not whole blocks. +The name has to be an unscoped identifier name, the target can be any symbol. + + .. _blocks: Blocks, Scopes, and accessing Symbols @@ -94,19 +130,19 @@ Blocks, Scopes, and accessing Symbols **Blocks** are the top level separate pieces of code and data of your program. They have a starting address in memory and will be combined together into a single output program. -They can only contain *directives*, *variable declarations*, *subroutines* and *inline assembly code*. -Your actual program code can only exist inside these subroutines. -(except the occasional inline assembly) +They can only contain *directives*, *variable declarations*, *subroutines* and *inline assembly code*:: -Here's an example:: - - main $c000 { - ; this is code inside the block... + [
] { + + + + } -The name of a block must be unique in your entire program. -Be careful when importing other modules; blocks in your own code cannot have -the same name as a block defined in an imported module or library. +The block name must be a valid identifier, and must be unique in the entire program (there's +a directive to merge multiple occurrences). +The address
is optional. If specified it must be a valid memory address such as ``$c000``. +It's used to tell the compiler to put the block at a certain position in memory. .. sidebar:: Using qualified names ("dotted names") to reference symbols defined elsewhere @@ -187,370 +223,253 @@ first address. This will also be the address that the BASIC loader program (if g calls with the SYS statement. +.. _directives: + +Directives +----------- + +.. data:: %address
+ + Level: module. + Global setting, sets the program's start memory address. It's usually fixed at ``$0801`` because the + default launcher type is a CBM-BASIC program. But you have to specify this address yourself when + you don't use a CBM-BASIC launcher. -Variables and values --------------------- +.. data:: %align -Variables are named values that can change during the execution of the program. -They can be defined inside any scope (blocks, subroutines etc.) See :ref:`blocks`. -When declaring a numeric variable it is possible to specify the initial value, if you don't want it to be zero. -For other data types it is required to specify that initial value it should get. -Values will usually be part of an expression or assignment statement:: - - 12345 ; integer number - $aa43 ; hex integer number - %100101 ; binary integer number (% is also remainder operator so be careful) - false ; boolean false - -33.456e52 ; floating point number - "Hi, I am a string" ; text string, encoded with default encoding - 'a' ; byte value (ubyte) for the letter a - sc:"Alternate" ; text string, encoded with c64 screencode encoding - sc:'a' ; byte value of the letter a in c64 screencode encoding - - byte counter = 42 ; variable of size 8 bits, with initial value 42 + Level: not at module scope. + Tells the assembler to continue assembling on the given alignment interval. For example, ``%align $100`` + will insert an assembler command to align on the next page boundary. + Note that this has no impact on variables following this directive! Prog8 reallocates all variables + using different rules. If you want to align a specific variable (array or string), you should use + one of the alignment tags for variable declarations instead. + Valid intervals are from 2 to 65536. + **Warning:** if you use this directive in between normal statements, it will disrupt the output + of the machine code instructions by making gaps between them; this will probably crash the program! 
-**putting a variable in zeropage:** -If you add the ``@zp`` tag to the variable declaration, the compiler will prioritize this variable -when selecting variables to put into zeropage (but no guarantees). If there are enough free locations in the zeropage, -it will try to fill it with as much other variables as possible (before they will be put in regular memory pages). -Use ``@requirezp`` tag to *force* the variable into zeropage, but if there is no more free space the compilation will fail. -It's possible to put strings, arrays and floats into zeropage too, however because Zp space is really scarce -this is not advised as they will eat up the available space very quickly. It's best to only put byte or word -variables in zeropage. By the way, there is also ``@nozp`` to keep a variable *out of the zeropage* at all times. +.. data:: %asm {{ ... }} -Example:: + Level: not at module scope. + Declares that a piece of *assembly code* is inside the curly braces. + This code will be copied as-is into the generated output assembly source file. + Note that the start and end markers are both *double curly braces* to minimize the chance + that the assembly code itself contains either of those. If it does contain a ``}}``, + it will confuse the parser. - byte @zp smallcounter = 42 - uword @requirezp zppointer = $4000 + If you use the correct scoping rules you can access symbols from the prog8 program from inside + the assembly code. Sometimes you'll have to declare a variable in prog8 with ``@shared`` if it + is only used in such assembly code. + + .. note:: + 64tass syntax is required for the assembly code. As such, mnemonics need to be written in lowercase. + + .. caution:: + Avoid using single-letter symbols in included assembly code, as they could be confused with CPU registers. + Also, note that all prog8 symbols are prefixed in assembly code, see :ref:`symbol-prefixing`. 
-**shared variables:** -If you add the ``@shared`` tag to the variable declaration, the compiler will know that this variable -is a prog8 variable shared with some assembly code elsewhere. This means that the assembly code can -refer to the variable even if it's otherwise not used in prog8 code itself. -(usually, these kinds of 'unused' variables are optimized away by the compiler, resulting in an error -when assembling the rest of the code). Example:: +.. data:: %asmbinary "" [, [, ]] - byte @shared assemblyVariable = 42 + Level: not at module scope. + This directive can only be used inside a block. + The assembler itself will include the file as binary bytes at this point, prog8 will not process this at all. + This means that the filename must be spelled exactly as it appears on your computer's file system. + Note that this filename may differ in case compared to when you chose to load the file from disk from within the + program code itself (for example on the C64 and X16 there's the PETSCII encoding difference). + The file is located relative to the current working directory! + The optional offset and length can be used to select a particular piece of the file. + To reference the contents of the included binary data, you can put a label in your prog8 code + just before the %asmbinary. To find out where the included binary data ends, add another label directly after it. + An example program for this can be found below at the description of %asminclude. -**uninitialized variables:** -All variables will be initialized by prog8 at startup, they'll get their assigned initialization value, or be cleared to zero. -This (re)initialization is also done on each subroutine entry for the variables declared in the subroutine. +.. data:: %asminclude "" -There may be certain scenarios where this initialization is redundant and/or where you want to avoid the overhead of it. 
-In some cases, Prog8 itself can detect that a variable doesn't need a separate automatic initialization to zero, if -it's trivial that it is not being read between the variable's declaration and the first assignment. For instance, when -you declare a variable immediately before a for loop where it is the loop variable. However Prog8 is not yet very smart -at detecting these redundant initializations. If you want to be sure, check the generated assembly output. + Level: not at module scope. + This directive can only be used inside a block. + The assembler will include the file as raw assembly source text at this point, + prog8 will not process this at all. Symbols defined in the included assembly cannot be referenced + from prog8 code. However they can be referenced from other assembly code if properly prefixed. + You can of course use a label in your prog8 code just before the %asminclude directive, and reference + that particular label to get to (the start of) the included assembly. + Be careful: you risk symbol redefinitions or duplications if you include a piece of + assembly into a prog8 block that already defines symbols itself. + The compiler first looks for the file relative to the directory of the module containing this statement; + if the file can't be found there, it is searched for relative to the current directory. -In any case, you can use the ``@dirty`` tag on the variable declaration to make the variable *not* being (re)initialized by Prog8. -This means its value will be undefined (it can be anything) until you assign a value yourself! Don't use such -a variable before you have done so. 🦶🔫 Footgun warning. + .. caution:: + Avoid using single-letter symbols in included assembly code, as they could be confused with CPU registers. + Also, note that all prog8 symbols are prefixed in assembly code, see :ref:`symbol-prefixing`. 
+ + Here is a small example program to show how to use labels to reference the included contents from prog8 code:: + + %import textio + %zeropage basicsafe + + main { + + sub start() { + txt.print("first three bytes of included asm:\n") + uword included_addr = &included_asm + txt.print_ub(@(included_addr)) + txt.spc() + txt.print_ub(@(included_addr+1)) + txt.spc() + txt.print_ub(@(included_addr+2)) + + txt.print("\nfirst three bytes of included binary:\n") + included_addr = &included_bin + txt.print_ub(@(included_addr)) + txt.spc() + txt.print_ub(@(included_addr+1)) + txt.spc() + txt.print_ub(@(included_addr+2)) + txt.nl() + return + + included_asm: + %asminclude "inc.asm" + + included_bin: + %asmbinary "inc.bin" + end_of_included_bin: + + } + } -**memory alignment:** -A string or array variable can be aligned to a couple of possible interval sizes in memory. -The use for this is very situational, but two examples are: sprite data for the C64 that needs -to be on a 64 byte aligned memory address, or an array aligned on a full page boundary to avoid -any possible extra page boundary clock cycles on certain instructions when accessing the array. -You can align on word, 64 bytes, and page boundaries:: +.. data:: %breakpoint - ubyte[] @alignword array = [1, 2, 3, 4, ...] - ubyte[] @align64 spritedata = [ %00000000, %11111111, ...] - ubyte[] @alignpage lookup = [11, 22, 33, 44, ...] + Level: not at module scope. + Defines a debugging breakpoint at this location. See :ref:`debugging` -Integers -^^^^^^^^ +.. data:: %encoding -Integers are 8 or 16 bit numbers and can be written in normal decimal notation, -in hexadecimal and in binary notation. There is no octal notation. -You can use underscores to group digits to make long numbers more readable. -A single character in single quotes such as ``'a'`` is translated into a byte integer, -which is the PETSCII value for that character. 
- -Unsigned integers are in the range 0-255 for unsigned byte types, and 0-65535 for unsigned word types. -The signed integers integers are in the range -128..127 for bytes, -and -32768..32767 for words. - -Only for ``const`` numbers, you can use larger values (32 bits signed integers). The compiler can handle those -internally in expressions. As soon as you have to actually store it into a variable, -you have to make sure the resulting value fits into the byte or word size of the variable. - -.. attention:: - Doing math on signed integers can result in code that is a lot larger and slower than - when using unsigned integers. Make sure you really need the signed numbers, otherwise - stick to unsigned integers for efficiency. + Overrides, in the module file it occurs in, + the default text encoding to use for strings and characters that have no explicit encoding prefix. + You can use one of the recognised encoding names, see :ref:`encodings`. -Booleans -^^^^^^^^ +.. data:: %import -Booleans are a distinct type in Prog8 and can have only the values ``true`` or ``false``. -It can be casted to and from other integer types though -where a nonzero integer is considered to be true, and zero is false. -Logical expressions, comparisons and some other code tends to compile more efficiently if -you explicitly use ``bool`` types instead of 0/1 integers. -The in-memory representation of a boolean value is just a byte containing 0 or 1. - -If you find that you need a whole bunch of boolean variables or perhaps even an array of them, -consider using integer bit mask variable + bitwise operators instead. -This saves a lot of memory and may be faster as well. + Level: module. + This reads and compiles the named module source file as part of your current program. + Symbols from the imported module become available in your code, + without a module or filename prefix. + You can import modules one at a time, and importing a module more than once has no effect. 
-Floating point numbers -^^^^^^^^^^^^^^^^^^^^^^ +.. data:: %launcher -You can use underscores to group digits to make long numbers more readable. + Level: module. + Global setting, selects the program launcher stub to use. + Only relevant when using the ``prg`` output type. Defaults to ``basic``. -Floats are stored in the 5-byte 'MFLPT' format that is used on CBM machines. -Floating point support is available on the c64 and cx16 (and virtual) compiler targets. -On the c64 and cx16, the rom routines are used for floating point operations, -so on both systems the correct rom banks have to be banked in to make this work. -Although the C128 shares the same floating point format, Prog8 currently doesn't support -using floating point on that system (because the c128 fp routines require the fp variables -to be in another ram bank than the program, something Prog8 doesn't do). - -Also your code needs to import the ``floats`` library to enable floating point support -in the compiler, and to gain access to the floating point routines. -(this library contains the directive to enable floating points, you don't have -to worry about this yourself) - -The largest 5-byte MFLPT float that can be stored is: **1.7014118345e+38** (negative: **-1.7014118345e+38**) + - type ``basic`` : add a tiny C64 BASIC program, with a SYS statement calling into the machine code + - type ``none`` : no launcher logic is added at all -Arrays -^^^^^^ -Array types are also supported. They can be formed from a list of booleans, bytes, words, floats, or addresses of other variables -(such as explicit address-of expressions, strings, or other array variables) - values in an array literal -always have to be constants. Here are some examples of arrays:: +.. data:: %memtop
- byte[10] array ; array of 10 bytes, initially set to 0 - byte[] array = [1, 2, 3, 4] ; initialize the array, size taken from value - ubyte[99] array = [255]*99 ; initialize array with 99 times 255 [255, 255, 255, 255, ...] - byte[] array = 100 to 199 ; initialize array with [100, 101, ..., 198, 199] - str[] names = ["ally", "pete"] ; array of string pointers/addresses (equivalent to array of uwords) - uword[] others = [names, array] ; array of pointers/addresses to other arrays - bool[2] flags = [true, false] ; array of two boolean values (take up 1 byte each, like a byte array) + Level: module. + Global setting, changes the program's top memory address. This is usually specified internally by the compiler target, + but with this you can change it to another value. This can be useful for example to 'reserve' a piece + of memory at the end of program space where other data such as external library files can be loaded into. + This memtop value is used for a check instruction for the assembler to see if the resulting program size + exceeds the given memtop address. This value is exclusive, so $a000 means that $a000 is the first address + that the program can no longer use. Everything up to and including $9fff is still usable. - value = array[3] ; the fourth value in the array (index is 0-based) - char = string[4] ; the fifth character (=byte) in the string - char = string[-2] ; the second-to-last character in the string (Python-style indexing from the end) + +.. data:: %option