@alignword aligns string or array variable on an even memory address
@align64 aligns string or array variable on a 64 byte address interval (example: for C64 sprite data)
@alignpage aligns string or array variable on a 256 byte address interval (example: to avoid page boundaries)
@dirty the variable won't be initialized by Prog8 which means that its value is undefined. You'll have to set it yourself before using the variable. Used to reduce overhead in certain scenarios. 🦶🔫 Footgun warning.
========== ======
Variables can be defined inside any scope (blocks, subroutines etc.) See :ref:`blocks`.
When declaring a numeric variable it is possible to specify the initial value, if you don't want it to be zero.
For other data types it is required to specify that initial value it should get.
Values will usually be part of an expression or assignment statement::
12345 ; integer number
$aa43 ; hex integer number
%100101 ; binary integer number (% is also remainder operator so be careful)
false ; boolean false
-33.456e52 ; floating point number
"Hi, I am a string" ; text string, encoded with default encoding
'a' ; byte value (ubyte) for the letter a
sc:"Alternate" ; text string, encoded with c64 screencode encoding
sc:'a' ; byte value of the letter a in c64 screencode encoding
byte counter = 42 ; variable of size 8 bits, with initial value 42
**putting a variable in zeropage:**
If you add the ``@zp`` tag to the variable declaration, the compiler will prioritize this variable
when selecting variables to put into zeropage (but no guarantees). If there are enough free locations in the zeropage,
it will try to fill it with as much other variables as possible (before they will be put in regular memory pages).
Use ``@requirezp`` tag to *force* the variable into zeropage, but if there is no more free space the compilation will fail.
It's possible to put strings, arrays and floats into zeropage too, however because Zp space is really scarce
this is not advised as they will eat up the available space very quickly. It's best to only put byte or word
variables in zeropage. By the way, there is also ``@nozp`` to keep a variable *out of the zeropage* at all times.
Example::
byte @zp smallcounter = 42
uword @requirezp zppointer = $4000
**shared variables:**
If you add the ``@shared`` tag to the variable declaration, the compiler will know that this variable
is a prog8 variable shared with some assembly code elsewhere. This means that the assembly code can
refer to the variable even if it's otherwise not used in prog8 code itself.
(usually, these kinds of 'unused' variables are optimized away by the compiler, resulting in an error
when assembling the rest of the code). Example::
byte @shared assemblyVariable = 42
**uninitialized variables:**
All variables will be initialized by prog8 at startup, they'll get their assigned initialization value, or be cleared to zero.
This (re)initialization is also done on each subroutine entry for the variables declared in the subroutine.
There may be certain scenarios where this initialization is redundant and/or where you want to avoid the overhead of it.
In some cases, Prog8 itself can detect that a variable doesn't need a separate automatic initialization to zero, if
it's trivial that it is not being read between the variable's declaration and the first assignment. For instance, when
you declare a variable immediately before a for loop where it is the loop variable. However Prog8 is not yet very smart
at detecting these redundant initializations. If you want to be sure, check the generated assembly output.
In any case, you can use the ``@dirty`` tag on the variable declaration to make the variable *not* being (re)initialized by Prog8.
This means its value will be undefined (it can be anything) until you assign a value yourself! Don't use such
a variable before you have done so. 🦶🔫 Footgun warning.
**memory alignment:**
A string or array variable can be aligned to a couple of possible interval sizes in memory.
The use for this is very situational, but two examples are: sprite data for the C64 that needs
to be on a 64 byte aligned memory address, or an array aligned on a full page boundary to avoid
any possible extra page boundary clock cycles on certain instructions when accessing the array.
You can align on word, 64 bytes, and page boundaries::
``byte[x]`` signed byte array x bytes ``byte[4] myvar``
``ubyte[x]`` unsigned byte array x bytes ``ubyte[4] myvar``
``word[x]`` signed word array 2*x bytes ``word[4] myvar``
``uword[x]`` unsigned word array 2*x bytes ``uword[4] myvar``
``float[x]`` floating-point array 5*x bytes ``float[4] myvar``. The 5 bytes per float is on CBM targets.
``bool[x]`` boolean array x bytes ``bool[4] myvar`` note: consider using bit flags in a byte or word instead to save space
``byte[]`` signed byte array depends on value ``byte[] myvar = [1, 2, 3, 4]``
``ubyte[]`` unsigned byte array depends on value ``ubyte[] myvar = [1, 2, 3, 4]``
``word[]`` signed word array depends on value ``word[] myvar = [1, 2, 3, 4]``
``uword[]`` unsigned word array depends on value ``uword[] myvar = [1, 2, 3, 4]``
``float[]`` floating-point array depends on value ``float[] myvar = [1.1, 2.2, 3.3, 4.4]``
``bool[]`` boolean array depends on value ``bool[] myvar = [true, false, true]`` note: consider using bit flags in a byte or word instead to save space
An uword variable can be used in limited scenarios as a 'pointer' to a byte in memory at a specific,
dynamic, location. You can use array indexing on a pointer variable to use it as a byte array at
a dynamic location in memory: currently this is equivalent to directly referencing the bytes in
memory at the given index. In contrast to a real array variable, the index value can be the size of a word.
Unlike array variables, negative indexing for pointer variables does *not* mean it will be counting from the end, because the size of the buffer is unknown.
Instead, it simply addresses memory that lies *before* the pointer variable.
You can concatenate two string literals using '+', which can be useful to
split long strings over separate lines. But remember that the length
of the total string still cannot exceed 255 characters.
A string literal can also be repeated a given number of times using '*', where the repeat number must be a constant value.
And a new string value can be assigned to another string, but no bounds check is done!
So be sure the destination string is large enough to contain the new value (it is overwritten in memory)::
str string1 = "first part" + "second part"
str string2 = "hello!" * 10
string1 = string2
string1 = "new value"
There are several escape sequences available to put special characters into your string value:
-``\\`` - the backslash itself, has to be escaped because it is the escape symbol by itself
-``\n`` - newline character (move cursor down and to beginning of next line)
-``\r`` - carriage return character (more or less the same as newline if printing to the screen)
-``\"`` - quote character (otherwise it would terminate the string)
-``\'`` - apostrophe character (has to be escaped in character literals, is okay inside a string)
-``\uHHHH`` - a unicode codepoint \u0000 - \uffff (16-bit hexadecimal)
-``\xHH`` - 8-bit hex value that will be copied verbatim *without encoding*
- String literals can contain many symbols directly if they have a PETSCII equivalent, such as "♠♥♣♦π▚●○╳".
Characters like ^, _, \\, {, } and | (that have no direct PETSCII counterpart) are still accepted and converted to the closest PETSCII equivalents. (Make sure you save the source file in UTF-8 encoding if you use this.)
Using the ``in`` operator you can easily check if a character is present in a string,
example: ``if '@' in email_address {....}`` (however this gives no clue about the location
in the string where the character is present, if you need that, use the ``strings.find()``
library function instead)
**Caution:**
This checks *all* elements in the string with the length as it was initially declared.
Even when a string was changed and is terminated early with a 0-byte early,
the containment check with ``in`` will still look at all character positions in the initial string.
Consider using ``strings.find`` followed by ``if_cs`` (for instance) to do a "safer" search
for a character in such strings (one that stops at the first 0 byte)
..hint::
Strings/arrays and uwords (=memory address) can often be interchanged.
An array of strings is actually an array of uwords where every element is the memory
address of the string. You can pass a memory address to assembly functions
that require a string as an argument.
For regular assignments you still need to use an explicit ``&`` (address-of) to take
the address of the string or array.
..hint::
You can declare parameters and return values of subroutines as ``str``,
but in this case that is equivalent to declaring them as ``uword`` (because
in this case, the address of the string is passed as argument or returned as value).
..note:: Strings and their (im)mutability
*String literals outside of a string variable's initialization value*,
are considered to be "constant", i.e. the string isn't going to change
during the execution of the program. The compiler takes advantage of this in certain
ways. For instance, multiple identical occurrences of a string literal are folded into
just one string allocation in memory. Examples of such strings are the string literals
passed to a subroutine as arguments.
*Strings that aren't such string literals are considered to be unique*, even if they
are the same as a string defined elsewhere. This includes the strings assigned to
a string variable in its declaration! These kind of strings are not deduplicated and
are just copied into the program in their own unique part of memory. This means that
it is okay to treat those strings as mutable; you can safely change the contents
of such a string without destroying other occurrences (as long as you stay within
the size of the allocated string!)
.._range-expression:
Ranges
^^^^^^
A special value is the *range expression* which represents a range of integer numbers or characters,
from the starting value to (and including) the ending value::
<start> to <end> [ step <step> ]
<start> downto <end> [ step <step> ]
You an provide a step value if you need something else than the default increment which is one (or,
in case of downto, a decrement of one). Unlike the start and end values, the step value must be a constant.
Because a step of minus one is so common you can just use
the downto variant to avoid having to specify the step as well::
0 to 7 ; range of values 0, 1, 2, 3, 4, 5, 6, 7
20 downto 10 step -3 ; range of values 20, 17, 14, 11
aa = 5
xx = 10
aa to xx ; range of 5, 6, 7, 8, 9, 10
for i in 0 to 127 {
; i loops 0, 1, 2, ... 127
}
Range expressions are most often used in for loops, but can be also be used to create array initialization values::
byte[] array = 100 to 199 ; initialize array with [100, 101, ..., 198, 199]