
.. _variables:

====================
Variables and Values
====================

Because this is such a big subject, variables and values have their own chapter.


Variables
---------

Variables are named values that can be modified during the execution of the program.
The compiler allocates the required memory for them.
There is *no dynamic memory allocation*. The storage size of all variables
is fixed and is determined at compile time.
Variable declarations tend to appear at the top of the code block that uses them, but this is not mandatory.
They define the name and type of the variable, and its initial value.
Prog8 supports a small list of data types, including special memory-mapped types
that don't allocate storage but instead point to a fixed location in the address space.


Declaring a variable
^^^^^^^^^^^^^^^^^^^^

Variables should be declared with their exact type and size so the compiler can allocate storage
for them. You can give them an initial value as well. That value can be a simple literal value,
or an expression. If you don't provide an initial value yourself, zero will be used.
The syntax for variable declarations is::

	<datatype>  [ @tag ]  <variable name>   [ = <initial value> ]

For boolean and numeric variables, you can declare several variables in one go by listing their names separated by commas.
Type tags, and the optional initialization value, are applied equally to all variables in such a list.
Various examples::

    word        thing   = 0
    byte        counter = len([1, 2, 3]) * 20
    byte        age     = 2018 - 1974
    float       wallet  = 55.25
    ubyte       x,y,z                   ; declare three ubyte variables x y and z
    str         name    = "my name is Alice"
    uword       address = &counter
    bool        flag    = true
    byte[]      values  = [11, 22, 33, 44, 55]
    byte[5]     values                  ; array of 5 bytes, initially set to zero
    ubyte[5]    values  = [255]*5       ; initialize with five 255 bytes

    word  @zp         zpword = 9999     ; prioritize this when selecting vars for zeropage storage
    uword @requirezp  zpaddr = $3000    ; we require this variable in zeropage
    word  @shared asmvar                ; variable is used in assembly code but not elsewhere
    byte  @nozp memvar                  ; variable that is never in zeropage


Here are the tags you can add to a variable:

==========  ======
Tag         Effect
==========  ======
@zp         prioritize the variable for putting it into Zero page. No guarantees; if ZP is full the variable will be placed in another memory location.
@requirezp  force the variable into Zero page. If ZP is full, compilation will fail.
@nozp       force the variable to normal system ram, never place it into zeropage.
@shared     means the variable is shared with some assembly code and that it cannot be optimized away if not used elsewhere.
@nosplit    (only valid on (u)word arrays) Store the array as a single linear array instead of a separate array for lsb and msb values
@alignword  aligns string or array variable on an even memory address
@align64    aligns string or array variable on a 64 byte address interval (example: for C64 sprite data)
@alignpage  aligns string or array variable on a 256 byte address interval (example: to avoid page boundaries)
@dirty      the variable won't be set to zero when entering the subroutine (note: it will still be set to zero once on program startup, like all other uninitialized variables). You'll usually have to make sure to assign a value yourself before using the variable! This is used to reduce overhead in certain scenarios. 🦶🔫 Footgun warning.
==========  ======


Variables can be defined inside any scope (blocks, subroutines etc.). See :ref:`blocks`.
When declaring a numeric variable you can specify an initial value if you don't want it to be zero.
For other data types you are required to specify the initial value.
Values will usually be part of an expression or assignment statement::

    12345                 ; integer number
    $aa43                 ; hex integer number
    %100101               ; binary integer number (% is also remainder operator so be careful)
    false                 ; boolean false
    -33.456e52            ; floating point number
    "Hi, I am a string"   ; text string, encoded with default encoding
    'a'                   ; byte value (ubyte) for the letter a
    sc:"Alternate"        ; text string, encoded with c64 screencode encoding
    sc:'a'                ; byte value of the letter a in c64 screencode encoding

    byte  counter  = 42   ; variable of size 8 bits, with initial value 42


**putting a variable in zeropage:**
If you add the ``@zp`` tag to the variable declaration, the compiler will prioritize this variable
when selecting variables to put into zeropage (but no guarantees). If there are enough free locations in the zeropage,
it will try to fill it with as many other variables as possible (before they will be put in regular memory pages).
Use the ``@requirezp`` tag to *force* the variable into zeropage, but if there is no more free space the compilation will fail.
It's possible to put strings, arrays and floats into zeropage too, however because zeropage space is really scarce
this is not advised as they will eat up the available space very quickly. It's best to only put byte or word
variables in zeropage.  By the way, there is also ``@nozp`` to keep a variable *out of the zeropage* at all times.

Example::

    byte   @zp  smallcounter = 42
    uword  @requirezp  zppointer = $4000


**shared variables:**
If you add the ``@shared`` tag to the variable declaration, the compiler will know that this variable
is a prog8 variable shared with some assembly code elsewhere. This means that the assembly code can
refer to the variable even if it's otherwise not used in prog8 code itself.
(Usually, these kinds of 'unused' variables are optimized away by the compiler, resulting in an error
when assembling the rest of the code.) Example::

    byte  @shared  assemblyVariable = 42


**uninitialized variables:**
All variables are initialized by prog8 at startup: they get their assigned initialization value, or are cleared to zero.
This (re)initialization is also done on each subroutine entry for the variables declared in the subroutine.

There may be certain scenarios where this initialization is redundant and/or where you want to avoid the overhead of it.
In some cases, Prog8 itself can detect that a variable doesn't need a separate automatic initialization to zero, if
it's trivial that it is not being read between the variable's declaration and the first assignment. For instance, when
you declare a variable immediately before a for loop where it is the loop variable. However Prog8 is not yet very smart
at detecting these redundant initializations. If you want to be sure, check the generated assembly output.

In any case, you can use the ``@dirty`` tag on the variable declaration to prevent the variable from being reinitialized
when entering the subroutine (it will still be set to 0 once at program startup).
This means you usually have to make sure to assign a value yourself, before using the variable. 🦶🔫 Footgun warning.
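
As a minimal sketch of the above (the subroutine and variable names are hypothetical), a ``@dirty`` variable is safe
as long as it is assigned before it is read::

    %import textio
    %zeropage basicsafe

    main {
        sub start() {
            show_double(5)
            show_double(7)
        }

        sub show_double(ubyte value) {
            ubyte @dirty doubled        ; not reset to zero on entry to show_double
            doubled = value * 2         ; assigned before it is read, so that is fine
            txt.print_ub(doubled)
            txt.nl()
        }
    }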


**memory alignment:**
A string or array variable can be aligned to a couple of possible interval sizes in memory.
The use for this is very situational, but two examples are: sprite data for the C64 that needs
to be on a 64 byte aligned memory address, or an array aligned on a full page boundary to avoid
any possible extra page boundary clock cycles on certain instructions when accessing the array.
You can align on word, 64 bytes, and page boundaries::

    ubyte[] @alignword array = [1, 2, 3, 4, ...]
    ubyte[] @align64 spritedata = [ %00000000, %11111111, ...]
    ubyte[] @alignpage lookup = [11, 22, 33, 44, ...]


Initializing a variable
^^^^^^^^^^^^^^^^^^^^^^^

You can specify an initialization value in the variable declaration.
This value is then used to initialize the variable at the start of the subroutine, instead of the default value 0.
The provided value doesn't have to be a constant; it can be any expression.
It is a shorter notation for declaring the variables and then assigning the values to them in separate assignment statement(s).

There are a few special situations:

initializing an array: ``ubyte[3] array = [11,22,33]``
    The initialization value has to be a range value or an array literal (remember you can use '[4] * 3' and such).
    Of course the size of the range or the number of values in the array has to match the declared array size.

initializing a multi variable declaration: ``ubyte a,b,c = multi()``
    The initialization value can be a single constant value which will then be assigned to each of the variables.
    It can also be a subroutine call to a subroutine returning multiple result values, which will then be put
    into the declared variables in order.  Of course the number of values has to match the number of variables.
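
A small sketch of both situations, assuming a hypothetical subroutine ``multi`` that returns three values.
The way the multiple return values are declared here is an assumption; check the chapter on subroutines for the exact syntax::

    main {
        sub start() {
            ubyte x,y,z = 99            ; x, y and z all start at 99
            ubyte a,b,c = multi()       ; a=11, b=22, c=33, assigned in order
        }

        ; hypothetical subroutine that returns three ubyte values
        sub multi() -> ubyte, ubyte, ubyte {
            return 11, 22, 33
        }
    }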


Data Types
----------

Prog8 supports the following data types:

===============  =======================  =================  =========================================
type identifier  type                     storage size       example var declaration and literal value
===============  =======================  =================  =========================================
``byte``         signed byte              1 byte = 8 bits    ``byte myvar = -22``
``ubyte``        unsigned byte            1 byte = 8 bits    ``ubyte myvar = $8f``,   ``ubyte c = 'a'``
``bool``         boolean                  1 byte = 8 bits    ``bool myvar = true`` or ``bool myvar = false``
``word``         signed word              2 bytes = 16 bits  ``word myvar = -12345``
``uword``        unsigned word            2 bytes = 16 bits  ``uword myvar = $8fee``
``long``         signed 32 bits integer   n/a                ``const long LARGE = $12345678``
                                          (only for consts)
``float``        floating-point           5 bytes = 40 bits  ``float myvar = 1.2345``
                                                             stored in 5-byte cbm MFLPT format
``byte[x]``      signed byte array        x bytes            ``byte[4] myvar``
``ubyte[x]``     unsigned byte array      x bytes            ``ubyte[4] myvar``
``word[x]``      signed word array        2*x bytes          ``word[4] myvar``
``uword[x]``     unsigned word array      2*x bytes          ``uword[4] myvar``
``float[x]``     floating-point array     5*x bytes          ``float[4] myvar``.   The 5 bytes per float is on CBM targets.
``bool[x]``      boolean array            x bytes            ``bool[4] myvar``  note: consider using bit flags in a byte or word instead to save space
``byte[]``       signed byte array        depends on value   ``byte[] myvar = [1, 2, 3, 4]``
``ubyte[]``      unsigned byte array      depends on value   ``ubyte[] myvar = [1, 2, 3, 4]``
``word[]``       signed word array        depends on value   ``word[] myvar = [1, 2, 3, 4]``
``uword[]``      unsigned word array      depends on value   ``uword[] myvar = [1, 2, 3, 4]``
``float[]``      floating-point array     depends on value   ``float[] myvar = [1.1, 2.2, 3.3, 4.4]``
``bool[]``       boolean array            depends on value   ``bool[] myvar = [true, false, true]``  note: consider using bit flags in a byte or word instead to save space
``str[]``        array with string ptrs   2*x bytes + strs   ``str[] names = ["ally", "pete"]``  note: equivalent to a uword array.
``str``          string (PETSCII)         varies             ``str myvar = "hello."``
                                                             implicitly terminated by a 0-byte
===============  =======================  =================  =========================================

Integers (bytes, words)
^^^^^^^^^^^^^^^^^^^^^^^

Integers are 8 or 16 bit numbers and can be written in normal decimal notation,
in hexadecimal and in binary notation. There is no octal notation. Hexadecimal has the '$' prefix,
binary has the '%' prefix. Note that ``%`` is also the remainder operator so be careful: if you want to take the remainder
of something with an operand starting with 1 or 0, you'll have to add a space in between, otherwise
the parser thinks you've typed an invalid binary number.

You can use underscores to group digits to make long numbers more readable: any underscores in the number are ignored by the compiler.
For instance ``3_000_000`` is a valid decimal number and so is ``%1001_0001`` a valid binary number.
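
The notations above can be sketched as follows (the variable names are only illustrative).
Note the space after ``%`` in the last line, which makes it the remainder operator rather than a binary prefix::

    main {
        sub start() {
            uword population = 50_000       ; underscores are ignored: same as 50000
            uword address = $c000           ; hexadecimal
            ubyte mask = %1001_0001         ; binary, same as $91
            ubyte total = 47
            ubyte lastdigit = total % 10    ; remainder of total divided by 10
        }
    }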

A single character in single quotes such as ``'a'`` is translated into a byte integer,
which is the PETSCII value for that character. You can prefix it with the desired encoding, like with strings, see :ref:`encodings`.

**bytes versus words:**

Prog8 tries to determine the data type of integer values according to the table below,
and sometimes the context in which they are used.

========================= =================
value                     datatype
========================= =================
-128 .. 127               byte
0 .. 255                  ubyte
-32768 .. 32767           word
0 .. 65535                uword
-2147483647 .. 2147483647 long (only for const)
========================= =================

If the number fits in a byte but you really require it as a word value, you'll have to explicitly cast it: ``60 as uword``
or you can use the full word hexadecimal notation ``$003c``.  This is useful in expressions where you want a calculation
to be done on word values, and don't want to explicitly have to cast everything all the time. For instance::

    ubyte  column
    uword  offset = column * 64       ; does (column * 64) as uword, wrong result?
    uword  offset = column * $0040    ; does (column as uword) * 64 , a word calculation

Only for ``const`` numbers, you can use larger values (32 bits signed integers). The compiler can handle those
internally in expressions. As soon as you have to actually store it into a variable,
you have to make sure the resulting value fits into the byte or word size of the variable.

.. attention::
    Doing math on signed integers can result in code that is a lot larger and slower than
    when using unsigned integers. Make sure you really need the signed numbers, otherwise
    stick to unsigned integers for efficiency.


Booleans
^^^^^^^^

Booleans are a distinct type called ``bool`` in Prog8 and can have only the values ``true`` or ``false``.
In memory, they are stored as a byte containing 0 or 1.
You can cast any numeric to a bool, in which case 0 will become ``false`` and any nonzero value will become ``true``.
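
For illustration, a minimal sketch of such a cast (the variable names are made up for this example)::

    main {
        sub start() {
            ubyte count = 3
            bool  nonempty = count as bool      ; true, because count is nonzero
            count = 0
            nonempty = count as bool            ; false, because count is zero
        }
    }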


Floating point numbers
^^^^^^^^^^^^^^^^^^^^^^

Floats are stored in the 5-byte 'MFLPT' format that is used on CBM machines.
Floating point support is available on the c64 and cx16 (and virtual) compiler targets.
On the c64 and cx16, the rom routines are used for floating point operations,
so on both systems the correct rom banks have to be banked in to make this work.
Although the C128 shares the same floating point format, Prog8 currently doesn't support
using floating point on that system (because the c128 fp routines require the fp variables
to be in another ram bank than the program, something Prog8 doesn't support yet).

Also your code needs to import the ``floats`` library to enable floating point support
in the compiler, and to gain access to the floating point routines.
(this library contains the directive to enable floating points, you don't have
to worry about this yourself)
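
A minimal sketch of a program that uses floats. It assumes that the ``floats`` library offers a ``floats.print`` routine
to print a float value; check the library reference of your Prog8 version for the exact name::

    %import floats
    %import textio

    main {
        sub start() {
            float wallet = 55.25
            wallet = wallet * 2.0
            floats.print(wallet)        ; assumed name of the print routine in the floats library
            txt.nl()
        }
    }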

The largest 5-byte MFLPT float that can be stored is: **1.7014118345e+38**   (negative: **-1.7014118345e+38**)

You can use underscores to group digits in floating point literals to make long numbers more readable:
any underscores in the number are ignored by the compiler.
For instance ``30_000.999_999`` is a valid floating point number 30000.999999.

.. attention::
    On the X16, make sure rom bank 4 is still active before doing floating point operations (it's the bank that contains the fp routines).
    On the C64, you have to make sure the Basic ROM is still banked in (same reason).


.. _arrayvars:

Arrays
^^^^^^
Arrays can be created from a list of booleans, bytes, words, floats, or addresses of other variables
(such as explicit address-of expressions, strings, or other array variables). Values in an array literal
always have to be constants. A trailing comma is allowed, which can make it easier to copy values
or to add more elements to the array later. Here are some examples of arrays::

    byte[10]  array                   ; array of 10 bytes, initially set to 0
    byte[]  array = [1, 2, 3, 4]      ; initialize the array, size taken from value
    ubyte[99] array = [255]*99        ; initialize array with 99 times 255 [255, 255, 255, 255, ...]
    byte[] array = 100 to 199         ; initialize array with [100, 101, ..., 198, 199]
    str[] names = ["ally", "pete"]    ; array of string pointers/addresses (equivalent to array of uwords)
    uword[] others = [names, array]   ; array of pointers/addresses to other arrays
    bool[2] flags = [true, false]     ; array of two boolean values  (take up 1 byte each, like a byte array)

    value = array[3]            ; the fourth value in the array (index is 0-based)
    char = string[4]            ; the fifth character (=byte) in the string
    char = string[-2]           ; the second-to-last character in the string (Python-style indexing from the end)

.. note::
    To allow the 6502 CPU to efficiently access values in an array, the array should be small enough to be
    indexable by a single byte index.
    This means byte arrays should be <= 256 elements, word arrays <= 256 elements as well (if split, which
    is the default. When not split, the maximum length is 128. See below for details about this distinction).
    Float arrays should be <= 51 elements.

Arrays can be initialized with a range expression or an array literal value.
You can write out such an initializer value over several lines if you want to improve readability.
When an initialization value is given, you are allowed to omit the array size in the declaration,
because it can be inferred from the initialization value.
You can use '*' to repeat array fragments to build up a larger array.

You can assign a new value to an element in the array, but you can't assign a whole
new array to another array at once. This is usually a costly operation. If you really
need this you have to write it out depending on the use case: you can copy the memory using
``sys.memcopy(sourcearray, targetarray, sizeof(targetarray))``. Or perhaps use ``sys.memset`` instead to
set it all to the same value, or maybe even simply assign the individual elements.
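
As a rough sketch of those alternatives, using two hypothetical byte arrays of the same size::

    main {
        sub start() {
            ubyte[] source = [1, 2, 3, 4]
            ubyte[4] target

            ; copy all bytes of source over target
            sys.memcopy(source, target, sizeof(target))

            ; or fill the whole array with a single value
            sys.memset(target, sizeof(target), 99)

            ; or simply assign individual elements
            target[0] = source[3]
        }
    }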

Note that the various keywords for the data type and variable type (``byte``, ``word``, ``const``, etc.)
can't be used as *identifiers* elsewhere. You can't make a variable, block or subroutine with the name ``byte``
for instance.

Using the ``in`` operator you can easily check if a value is present in an array,
example: ``if choice in [1,2,3,4] {....}``

**Arrays at a specific memory location:**

Using the memory-mapped syntax it is possible to define an array to be located at a specific memory location.
For instance to reference the first 5 rows of the Commodore 64's screen matrix as an array, you can define::

    &ubyte[5*40]  top5screenrows = $0400

This way you can set the second character on the second row from the top like this::

    top5screenrows[41] = '!'

**Array indexing on a pointer variable:**

An uword variable can be used in limited scenarios as a 'pointer' to a byte in memory at a specific,
dynamic, location. You can use array indexing on a pointer variable to use it as a byte array at
a dynamic location in memory: currently this is equivalent to directly referencing the bytes in
memory at the given index. In contrast to a real array variable, the index value can be the size of a word.
Unlike array variables, negative indexing for pointer variables does *not* mean it will be counting from the end, because the size of the buffer is unknown.
Instead, it simply addresses memory that lies *before* the pointer variable.
See also :ref:`pointervars`

**LSB/MSB split word and str arrays:**

As an optimization, (u)word arrays and str arrays are split by the compiler in memory as two separate arrays,
one with the LSBs and one with the MSBs of the word values. This is more efficient to access by the 6502 cpu.
It also allows a maximum length of 256 for word arrays, where normally it would have been 128.

For normal prog8 array indexing, the compiler takes care of the distinction for you behind the scenes.
*But for assembly code, or code that otherwise accesses the array elements directly, you have to be aware of the distinction from 'normal' arrays.*
In the assembly code, the array is generated as two byte arrays namely ``name_lsb`` and ``name_msb``, immediately following each other in memory.

The ``@split`` tag can be added to the variable declaration to *always* split the array even when the command line option ``-dontsplitarrays`` is set.
The ``@nosplit`` tag can be added to the variable declaration to *never* split the array. This is useful for compatibility with
code that expects the words to be sequentially in memory (such as the cx16.FB_set_palette routine).

There is a command line option ``-dontsplitarrays`` that avoids splitting word arrays by default,
so every word array is laid out sequentially in memory (this is what older versions of Prog8 used to do).
It reduces the maximum word array length to 128. You can still override this by adding ``@split`` explicitly.
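
A minimal sketch of why this matters when you bypass normal array indexing (the names are illustrative).
With ``@nosplit`` the words are sequential in memory, so reading a word through a pointer with ``peekw`` works as expected::

    main {
        sub start() {
            uword[] @nosplit table = [1111, 2222, 3333]    ; words stay sequential in memory
            uword ptr = &table
            uword second = peekw(ptr + 2)                  ; element 1 starts 2 bytes in: 2222
        }
    }

Without ``@nosplit`` the same ``peekw`` would combine two unrelated LSB bytes, because the LSBs and MSBs are stored in separate arrays.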

.. note::
    Most but not all array operations are supported yet on "split word arrays".
    If you get a compiler error message, simply revert to a regular sequential word array using ``@nosplit``,
    and please report the issue.

.. note::
    Array literals are stored as split arrays if they're initializing a split word array, otherwise,
    they are stored as sequential words!  So if you pass one directly to a subroutine (like ``func([1111,2222,3333])``),
    the array values are sequential in memory.  If this is undesirable (i.e. the subroutine expects a split word array),
    you have to create a normal array variable first and then pass that to the subroutine.

.. caution::
    Be aware that the default is to split word arrays. Normal array access is taken care of by Prog8, so you won't
    notice this optimization. However if you are accessing the array's values using other ways (for example via a pointer,
    and then using ``peekw`` to get the value) you have to be aware of this. In that ``peekw`` example you have
    to make sure to use ``@nosplit`` on the word array so that the words stay sequentially in memory which is what ``peekw`` needs.
    Also be careful when passing arrays to library routines (this is via a pointer!): you have to make sure
    the library routine can deal with the split array otherwise you have to use ``@nosplit`` as well.


.. _encodings:

Strings
^^^^^^^

Strings are a sequence of characters enclosed in double quotes. The length is limited to 255 characters.
They're stored and treated much the same as a byte array,
but they have some special properties because they are considered to be *text*.
Strings without an encoding prefix are encoded (translated from ASCII/UTF-8) into bytes using the
*default encoding* for the target platform. On the CBM machines this is PETSCII,
but it can be something else on other targets.

There are ways to change the encoding: prefix the string with an encoding name, or use the ``%encoding`` directive to
change it for the whole file at once. Here are examples of the possible encodings:

    - ``"hello"``   a string translated into the default character encoding (PETSCII on the CBM machines)
    - ``petscii:"hello"``               string in CBM PETSCII encoding
    - ``sc:"my name is Alice"``         string in CBM screencode encoding
    - ``iso:"Ich heiße François"``      string in iso-8859-15 encoding (Latin)
    - ``iso5:"Хозяин и Работник"``      string in iso-8859-5 encoding (Cyrillic)
    - ``iso16:"zażółć gęślą jaźń"``     string in iso-8859-16 encoding (Eastern Europe)
    - ``atascii:"I am Atari!"``         string in "atascii" encoding (Atari 8-bit)
    - ``cp437:"≈ IBM Pc ≈ ♂♀♪☺¶"``     string in "cp437" encoding (IBM PC codepage 437) See note below!
    - ``kata:"ｱﾉ ﾆﾎﾝｼﾞﾝ ﾜ ｶﾞｲｺｸｼﾞﾝ｡ # が # ガ"``  string in "kata" encoding (Katakana)
    - ``c64os:"^Hello_World! \\ ~{_}~"`` string in "c64os" encoding (C64 OS)

So what follows below is a string literal that will be encoded into memory bytes using the iso encoding.
It can be correctly displayed on the screen only if an iso-8859-15 charset has been activated first
(the Commander X16 has this capability)::

    iso:"Käse, Straße"

You can concatenate two string literals using '+', which can be useful to
split long strings over separate lines. But remember that the length
of the total string still cannot exceed 255 characters.
A string literal can also be repeated a given number of times using '*', where the repeat number must be a constant value.
And a new string value can be assigned to another string, but no bounds check is done!
So be sure the destination string is large enough to contain the new value (it is overwritten in memory)::

    str string1 = "first part" + "second part"
    str string2 = "hello!" * 10

    string1 = string2
    string1 = "new value"

There are several escape sequences available to put special characters into your string value:

- ``\\`` - the backslash itself, has to be escaped because it is the escape symbol by itself
- ``\n`` - newline character (move cursor down and to beginning of next line)
- ``\r`` - carriage return character (more or less the same as newline if printing to the screen)
- ``\"`` - quote character (otherwise it would terminate the string)
- ``\'`` - apostrophe character (has to be escaped in character literals, is okay inside a string)
- ``\uHHHH`` - a unicode codepoint \u0000 - \uffff (16-bit hexadecimal)
- ``\xHH`` - 8-bit hex value that will be copied verbatim *without encoding*

- String literals can contain many symbols directly if they have a PETSCII equivalent, such as "♠♥♣♦π▚●○╳".
  Characters like ^, _, \\, {, } and | (that have no direct PETSCII counterpart) are still accepted and converted to the closest PETSCII equivalents. (Make sure you save the source file in UTF-8 encoding if you use this.)
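
A short sketch that uses a few of these escape sequences, assuming the ``textio`` library is used for printing::

    %import textio

    main {
        sub start() {
            txt.print("she said \"hello\"\n")   ; escaped quotes, then a newline
            txt.chrout('\'')                    ; apostrophe escaped in a character literal
            txt.nl()
        }
    }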

Using the ``in`` operator you can easily check if a character is present in a string,
example: ``if '@' in email_address {....}`` (however this gives no clue about the location
in the string where the character is present; if you need that, use the ``strings.find()``
library function instead).

**Caution:**
This checks *all* elements in the string with the length as it was initially declared.
Even when a string was changed and is terminated early with a 0-byte,
the containment check with ``in`` will still look at all character positions in the initial string.
Consider using ``strings.find`` followed by ``if_cs`` (for instance) to do a "safer" search
for a character in such strings (one that stops at the first 0 byte).


.. hint::
    Strings/arrays and uwords (=memory address) can often be interchanged.
    An array of strings is actually an array of uwords where every element is the memory
    address of the string. You can pass a memory address to assembly functions
    that require a string as an argument.
    For regular assignments you still need to use an explicit ``&`` (address-of) to take
    the address of the string or array.

.. hint::
    You can declare parameters and return values of subroutines as ``str``,
    but in this case that is equivalent to declaring them as ``uword`` (because
    in this case, the address of the string is passed as argument or returned as value).

.. note:: Strings and their (im)mutability

    *String literals outside of a string variable's initialization value*,
    are considered to be "constant", i.e. the string isn't going to change
    during the execution of the program. The compiler takes advantage of this in certain
    ways. For instance, multiple identical occurrences of a string literal are folded into
    just one string allocation in memory. Examples of such strings are the string literals
    passed to a subroutine as arguments.

    *Strings that aren't such string literals are considered to be unique*, even if they
    are the same as a string defined elsewhere. This includes the strings assigned to
    a string variable in its declaration! These kinds of strings are not deduplicated and
    are just copied into the program in their own unique part of memory. This means that
    it is okay to treat those strings as mutable; you can safely change the contents
    of such a string without destroying other occurrences (as long as you stay within
    the size of the allocated string!)

.. note:: printing **cp437** encoded strings

    To print strings in the **cp437** encoding, you will probably need ``txt.print_lit(message)`` to properly print
    them to the screen. This is because this encoding has symbols in place of where normally ASCII
    control characters such as Line feed would be. A regular ``txt.print(message)`` will likely get confused
    by such symbols and print them as control characters, messing up the output.


.. _range-expression:

Ranges
^^^^^^

A special value is the *range expression* which represents a range of integer numbers or characters,
from the starting value to (and including) the ending value::

    <start>  to  <end>   [ step  <step> ]
    <start>  downto  <end>   [ step  <step> ]

You can provide a step value if you need something other than the default increment which is one (or,
in case of downto, a decrement of one).  Unlike the start and end values, the step value must be a constant.
Because a step of minus one is so common you can just use
the downto variant to avoid having to specify the step as well::

    0 to 7                   ; range of values 0, 1, 2, 3, 4, 5, 6, 7
    20 downto 10 step -3     ; range of values 20, 17, 14, 11

    aa = 5
    xx = 10
    aa to xx                 ; range of 5, 6, 7, 8, 9, 10

    for  i  in  0 to 127  {
        ; i loops 0, 1, 2, ... 127
    }


Range expressions are most often used in for loops, but can also be used to create array initialization values::

	byte[] array = 100 to 199     ; initialize array with [100, 101, ..., 198, 199]


Constants
^^^^^^^^^

When using ``const``, the value of the 'variable' cannot be changed; it has become a compile-time constant value instead.
You'll have to specify the initial value expression. This value is then used
by the compiler everywhere you refer to the constant (and no memory is allocated
for the constant itself). Only the simple numeric types (byte, word, float) can be defined as a constant.
If something is defined as a constant, very efficient code can usually be generated from it.
Variables on the other hand can't be optimized as much, need memory, and more code to manipulate them.
Note that a subset of the library routines in the ``math``, ``strings`` and ``floats`` modules are recognised in
compile time expressions. For example, the compiler knows what ``math.sin8u(12)`` is and replaces it with the computed result.
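
As an illustration (the constant names are hypothetical), constants can be built from other constants
and no memory is reserved for any of them::

    %import textio

    main {
        const uword SCREEN_WIDTH = 40
        const uword SCREEN_HEIGHT = 25
        const uword SCREEN_SIZE = SCREEN_WIDTH * SCREEN_HEIGHT    ; folded to 1000 by the compiler

        sub start() {
            txt.print_uw(SCREEN_SIZE)       ; compiles as if you wrote txt.print_uw(1000)
            txt.nl()
        }
    }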


Memory-mapped
^^^^^^^^^^^^^
When using ``&`` (the address-of operator but now applied to the datatype in the variable's declaration),
the variable will be placed at a designated position in memory rather than being newly allocated somewhere.
The initial value in the declaration should be the valid memory address where the variable should be placed.
Reading the variable will then read its value from that address, and setting the variable will directly modify those memory location(s)::

	const  byte  max_age = 2000 - 1974      ; max_age will be the constant value 26
	&word  SCREENCOLORS = $d020             ; a 16-bit word at the address $d020-$d021

If you need to use the variable's memory address instead of the value placed there, you can still use ``&variable`` as usual.
You can memory map all datatypes except strings.
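
A minimal sketch of reading, writing and taking the address of a memory-mapped variable.
It assumes the c64 target, where ``$d020`` is the screen border color register; the name ``BORDER`` is made up for this example::

    main {
        &ubyte BORDER = $d020           ; no storage is allocated, this is the byte at $d020

        sub start() {
            ubyte previous = BORDER     ; reads the byte at $d020
            BORDER = 0                  ; writes 0 to $d020
            uword where = &BORDER       ; the address itself: $d020
        }
    }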


.. _pointervars:

Direct access to memory locations ('peek' and 'poke')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Usually specific memory locations are accessed through a memory-mapped variable, such as ``cbm.BGCOL0`` that is defined
as the background color register at the memory address $d021 (on the c64 target).

If you want to access any memory location directly (by using the address itself or via an uword pointer variable),
without defining a memory-mapped location, you can do so by enclosing the address in ``@(...)``::

    color = @($d020)  ; set the variable 'color' to the current c64 screen border color ("peek(53280)")
    @($d020) = 0      ; set the c64 screen border to black ("poke 53280,0")
    @(vic+$20) = 6    ; you can also use expressions to 'calculate' the address

This is the official syntax to 'dereference a pointer' as it is often named in other languages.
You can actually also use the array indexing notation for this. It will be silently converted into
the direct memory access expression as explained above. Note that unlike regular arrays,
the index is not limited to an ubyte value. You can use a full uword to index a pointer variable like this::

    pointervar[999] = 0     ; set memory byte to zero at location pointervar + 999.


Converting/Casting types into other types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sometimes you need an unsigned word where you have an unsigned byte, or you need some other type conversion.
Many type conversions are possible by just writing ``as <type>`` at the end of an expression::

    uword  uw = $ea31
    ubyte  ub = uw as ubyte     ; ub will be $31, identical to lsb(uw)
    float  f = uw as float      ; f will be 59953, but this conversion can be omitted in this case
    word   w = uw as word       ; w will be -5583 (simply reinterpret $ea31 as two's complement negative number)
    f = 56.777
    ub = f as ubyte             ; ub will be 56

Sometimes it is a straight reinterpretation of the given value as being of the other type,
sometimes an actual value conversion is done to convert it into the other type.
Try to avoid those type conversions as much as possible.


Initial values across multiple runs of the program
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When declaring variables with an initial value, this value will be set into the variable each time
the program reaches the declaration again. This can be in loops, multiple subroutine calls,
or even multiple invocations of the entire program.
If you omit the initial value, zero will be used instead.

This only works for simple types, *and not for string variables and arrays*.
It is assumed these are left unchanged by the program; they are not re-initialized on
a second run.
If you do modify them in-place, you should take care yourself that they work as
expected when the program is restarted.
(This is an optimization choice to avoid having to store two copies of every string and array)
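
One possible way to deal with this, sketched here with a hypothetical array ``lives``, is to restore the
starting values yourself at the beginning of the program::

    main {
        ubyte[] lives = [3, 3, 3]       ; array: not restored automatically on a second run

        sub start() {
            ; restore the starting values by hand before using the array
            sys.memset(lives, sizeof(lives), 3)

            lives[0] -= 1               ; in-place change that would otherwise survive a restart
        }
    }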