[< back to index](../index.md)

# Literals and initializers

## Numeric literals

Decimal: `1`, `10`

Binary: `%0101`, `0b101001`

Quaternary: `0q2131`

Octal: `0o172`

Hexadecimal: `$D323`, `0x2a2`

When using Intel syntax for inline assembly, another hexadecimal syntax is available: `0D323H`, `2a2h`.
It is not allowed anywhere else.
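
As a rough illustration of how the prefixes above map to number bases, here is a small Python model. It is not part of Millfork: the function name `parse_literal` is made up for this sketch, it covers only the forms listed above, and it leaves out the Intel-style `H` suffix because that form is limited to inline assembly.

```python
# Illustrative model of the literal prefixes listed above; not compiler code.
PREFIXES = {
    "0b": 2,   # binary
    "%": 2,    # binary
    "0q": 4,   # quaternary
    "0o": 8,   # octal
    "0x": 16,  # hexadecimal
    "$": 16,   # hexadecimal
}


def parse_literal(text: str) -> int:
    """Return the value of a numeric literal written in one of the forms above."""
    for prefix, base in PREFIXES.items():
        if text.startswith(prefix):
            return int(text[len(prefix):], base)
    return int(text, 10)  # no prefix: decimal


for literal in ["10", "%0101", "0b101001", "0q2131", "0o172", "$D323", "0x2a2"]:
    print(literal, "=", parse_literal(literal))
```

For instance, `0q2131` and `$9D` denote the same value (157) in this model.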

## String literals

String literals can be used as either array initializers or expressions of type `pointer`.

String literals are surrounded with double quotes and optionally followed by the name of the encoding:

    "this is a string" ascii
    "this is also a string"

If there is no encoding name specified, then the `default` encoding is used. 
Two encoding names are special and refer to platform-specific encodings:
`default` and `scr`.

You can also append `z` to the name of the encoding to make the string zero-terminated.
This means that the string will have one extra byte appended, equal to 0.

    "this is a zero-terminated string" asciiz
    "this is also a zero-terminated string"z

Most characters between the quotes are interpreted literally.
To allow characters that cannot be inserted normally,
each encoding may define escape sequences.
Every encoding is guaranteed to support at least 
`{q}` for double quote 
and `{apos}` for single quote/apostrophe.
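
The effect of the `z` suffix and of the two guaranteed escape sequences can be sketched in Python for the `ascii` encoding. This is a simplified model under stated assumptions: `encode_ascii_literal` is a hypothetical helper, only `{q}` and `{apos}` are handled, and the way a literal brace is written is not modelled.

```python
# Simplified model of an `ascii` string literal; not compiler code.
ESCAPES = {
    "{q}": '"',     # double quote
    "{apos}": "'",  # single quote / apostrophe
}


def encode_ascii_literal(body: str, zero_terminated: bool = False) -> bytes:
    """Encode the text between the quotes, optionally appending a 0 byte."""
    for escape, char in ESCAPES.items():
        body = body.replace(escape, char)
    data = body.encode("ascii")
    if zero_terminated:
        data += b"\x00"  # the extra byte added by the `z` suffix
    return data


print(encode_ascii_literal("say {q}hi{q}"))
print(len(encode_ascii_literal("abc")), len(encode_ascii_literal("abc", True)))
```

In this model the zero-terminated form is always exactly one byte longer than the plain form.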

For the list of all text encodings and escape sequences, see [this page](./text.md).

In some encodings, multiple characters are mapped to the same byte value,
for compatibility with multiple variants.

If a character in the literal cannot be encoded in the given encoding, an error is raised.
However, if the command-line option `-flenient-encoding` is used,
then literals using the `default` and `scr` encodings replace unsupported characters with supported ones
and skip unsupported escape sequences, and a warning is issued.
For example, if `-flenient-encoding` is enabled, then a literal `"£¥↑ž©ß"` is equivalent to:

* `"£Y↑z(C)ss"` if the default encoding is `pet`

* `"£Y↑z©ss"` if the default encoding is `bbc`

* `"?Y^z(C)ss"` if the default encoding is `ascii`

* `"?Y^ž(C)ss"` if the default encoding is `iso_yu`

* `"?Y^z(C)ß"` if the default encoding is `iso_de`

* `"?¥^z(C)ss"` if the default encoding is `jisx`

Note that replacements such as `(C)` and `ss` are longer than the characters they replace, so the final length of the string may vary.
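
The lenient substitution can be modelled in Python for the `ascii` case. The replacement table below is copied from the `ascii` line of the example above and is certainly incomplete; the helper name `lenient_ascii` and the choice to raise an error for characters missing from the table are assumptions of this sketch, not documented compiler behaviour.

```python
# Replacements taken from the `ascii` line of the example above.
# Illustrative only: a real encoding presumably knows many more.
ASCII_FALLBACKS = {
    "£": "?",
    "¥": "Y",
    "↑": "^",
    "ž": "z",
    "©": "(C)",
    "ß": "ss",
}


def lenient_ascii(text):
    """Return (replaced_text, warnings) for a literal in lenient mode."""
    out = []
    warnings = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # already representable in ASCII
        elif ch in ASCII_FALLBACKS:
            out.append(ASCII_FALLBACKS[ch])
            warnings.append(f"replaced {ch!r} with {ASCII_FALLBACKS[ch]!r}")
        else:
            # Assumption of this sketch: unknown characters are rejected.
            raise ValueError(f"no replacement known for {ch!r}")
    return "".join(out), warnings


replaced, warnings = lenient_ascii("£¥↑ž©ß")
print(replaced, len(replaced))
```

Here a six-character literal becomes a nine-character string, which is why code that depends on the length of a string should not rely on the source text alone.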

## Character literals

Character literals are surrounded by single quotes and optionally followed by the name of the encoding: 

    'x' ascii
    'W'

From the point of view of the type system, they are constants of type `byte`.

For the list of all text encodings and escape sequences, see [this page](./text.md).

If the character in the literal cannot be encoded in the given encoding, an error is raised.
However, if the command-line option `-flenient-encoding` is used,
then literals using the `default` and `scr` encodings replace an unsupported character with a supported one.
If the replacement is one character long, only a warning is issued; otherwise, an error is raised.
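
The one-character rule can be sketched the same way. This Python model reuses the `ascii` replacements from the string-literal example earlier on this page; `lenient_ascii_char` is a hypothetical name, and the table is illustrative rather than complete.

```python
# Replacements borrowed from the `ascii` string-literal example; illustrative only.
ASCII_FALLBACKS = {"£": "?", "¥": "Y", "↑": "^", "ž": "z", "©": "(C)", "ß": "ss"}


def lenient_ascii_char(ch):
    """Return (byte_value, warning_or_None) for a character literal in lenient mode."""
    if ord(ch) < 128:
        return ord(ch), None  # encodable as-is, no warning
    replacement = ASCII_FALLBACKS.get(ch)
    if replacement is None or len(replacement) != 1:
        # A multi-character replacement cannot fit in a single byte constant.
        raise ValueError(f"{ch!r} cannot be used as a character literal")
    return ord(replacement), f"replaced {ch!r} with {replacement!r}"


print(lenient_ascii_char("x"))
print(lenient_ascii_char("ž"))
```

In this model `'ž'` yields the byte for `z` plus a warning, while `'ß'` is rejected because its replacement `ss` is two characters long.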

## Array initializers

An array is initialized with one of the following:

* a string literal

* a `file` expression

* a `for`-style expression

* a format, followed by an array initializer:

   *   `@word` (same as `@word_le`): for every term of the array initializer, emit two bytes, the low byte of the value first and the high byte second;
       for example, `@word [$1122]` is equivalent to `[$22, $11]`

   *   `@word_be`: like the above, but with the high byte first;
       for example, `@word_be [$1122]` is equivalent to `[$11, $22]`
   

* a list of byte literals and/or other array initializers, surrounded by brackets

Examples:

    array a = [1, 2]
    array b = "----" scr
    array c = ["hello world!" ascii, 13]
    array d = file("d.bin")
    array e = file("d.bin", 128, 256)
    array f = for x,0,until,8 [x * 3 + 5]  // equivalent to [5, 8, 11, 14, 17, 20, 23, 26]

Trailing commas (`[1, 2,]`) are not allowed.
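
The byte order produced by the `@word` and `@word_be` formats described above can be checked with a short Python model. The function names `word_le` and `word_be` are invented for this sketch, and each term is assumed to fit in 16 bits.

```python
def word_le(values):
    """Model of `@word` / `@word_le`: low byte first, then high byte."""
    out = []
    for v in values:
        out += [v & 0xFF, (v >> 8) & 0xFF]
    return out


def word_be(values):
    """Model of `@word_be`: high byte first, then low byte."""
    out = []
    for v in values:
        out += [(v >> 8) & 0xFF, v & 0xFF]
    return out


print([hex(b) for b in word_le([0x1122])])
print([hex(b) for b in word_be([0x1122])])
```

Either way, an initializer with _k_ terms produces an array of 2·_k_ bytes.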

The parameters for `file` are: file path, optional start offset, optional length
(start offset and length have to be either both present or both absent).
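
The `file` parameters behave much like a byte slice. The Python sketch below models that reading; `file_initializer` is a hypothetical helper, offsets are assumed to be counted in bytes, and what the compiler does when the file is shorter than requested is not covered here (the slice simply comes back shorter in this model).

```python
from pathlib import Path


def file_initializer(path, start=None, length=None):
    """Model of `file(path)` and `file(path, start, length)`."""
    if (start is None) != (length is None):
        # Mirrors the rule that both must be present or both absent.
        raise ValueError("start offset and length must be given together")
    data = Path(path).read_bytes()
    if start is None:
        return data  # whole file
    return data[start:start + length]
```

With these assumptions, `file("d.bin", 128, 256)` corresponds to bytes 128 through 383 of `d.bin`.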

The `for`-style expression has a variable, a starting index, a direction, a final index, 
and a parameterizable array initializer.
The initializer is repeated for every value of the variable in the given range.
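
To make the expansion concrete, the following Python model reproduces the `until` example shown earlier. Only the `until` direction is modelled, on the assumption (consistent with that example) that the final index is excluded; `for_until` is a made-up name.

```python
def for_until(start, end, element):
    """Model of `for x,start,until,end [ ... ]`: `end` itself is excluded."""
    out = []
    for x in range(start, end):
        out.extend(element(x))  # the inner initializer may yield several bytes
    return out


# Corresponds to: for x,0,until,8 [x * 3 + 5]
print(for_until(0, 8, lambda x: [x * 3 + 5]))
```

Because the inner initializer is itself a list, an expression such as `lambda x: [x, x]` would emit two bytes per iteration in this model.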

The compiler also provides built-in trigonometric functions, which are available only in constant expressions
and can be useful in array initializers:

* `sin(x, n)` – returns _n_·**sin**(*x*π/128)

* `cos(x, n)` – returns _n_·**cos**(*x*π/128)

* `tan(x, n)` – returns _n_·**tan**(*x*π/128)
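
The formulas above treat a full circle as 256 steps, so `x = 64` is a quarter turn. The Python sketch below evaluates them; the names `sin_n`, `cos_n` and `tan_n` are chosen for this example, and rounding to the nearest integer is an assumption, since the formulas above do not say how the result is converted to an integer.

```python
import math


def sin_n(x, n):
    """n * sin(x * pi / 128), rounded to the nearest integer (assumed)."""
    return round(n * math.sin(x * math.pi / 128))


def cos_n(x, n):
    """n * cos(x * pi / 128), rounded to the nearest integer (assumed)."""
    return round(n * math.cos(x * math.pi / 128))


def tan_n(x, n):
    """n * tan(x * pi / 128), rounded to the nearest integer (assumed)."""
    return round(n * math.tan(x * math.pi / 128))


# Eight evenly spaced samples of a sine wave with amplitude 127.
print([sin_n(x, 127) for x in range(0, 256, 32)])
```

Combined with a `for`-style expression, these functions make it possible to build lookup tables at compile time instead of computing them at run time.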