Literals and initializers
Numeric literals
Decimal: 1, 10
Binary: %0101, 0b101001
Quaternary: 0q2131
Octal: 0o172
Hexadecimal: $D323, 0x2a2
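To make the prefixes concrete, the following Python sketch computes the value each literal form denotes. It is only a model of the notation listed above, not the compiler's parser; the name parse_literal is hypothetical.

```python
# Prefix -> base, taken from the list of literal forms above.
PREFIXES = [("0b", 2), ("0q", 4), ("0o", 8), ("0x", 16), ("%", 2), ("$", 16)]

def parse_literal(text):
    """Return the integer value of a numeric literal written in one of the forms above."""
    for prefix, base in PREFIXES:
        if text.startswith(prefix):
            return int(text[len(prefix):], base)
    return int(text, 10)  # no prefix: decimal

print(parse_literal("%0101"))   # 5
print(parse_literal("0q2131"))  # 157
print(parse_literal("$D323"))   # 54051
```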
String literals
String literals can be used as either array initializers or expressions of type pointer.
String literals are surrounded with double quotes and optionally followed by the name of the encoding:
"this is a string" ascii
"this is also a string"
If there is no encoding name specified, then the default encoding is used.
Two encoding names are special and refer to platform-specific encodings: default and scr.
You can also append z to the name of the encoding to make the string zero-terminated.
This means that the string will have one extra byte appended, equal to 0.
"this is a zero-terminated string" asciiz
"this is also a zero-terminated string"z
Most characters between the quotes are interpreted literally.
To allow characters that cannot be inserted normally,
each encoding may define escape sequences.
Every encoding is guaranteed to support at least {n} for new line, {q} for double quote, and {apos} for single quote/apostrophe.
For the list of all text encodings and escape sequences, see this page.
In some encodings, multiple characters are mapped to the same byte value, for compatibility with multiple variants.
If the characters in the literal cannot be encoded in the chosen encoding, an error is raised.
However, if the command-line option -flenient-encoding is used, then literals using the default and scr encodings replace unsupported characters with supported ones and skip unsupported escape sequences, and a warning is issued.
For example, if -flenient-encoding is enabled, then the literal "£¥↑ž©ß{lbrace}" is equivalent to:
- "£Y↑z(C)ss" if the default encoding is pet
- "£Y↑z©ss" if the default encoding is bbc
- "?Y^z(C)ss" if the default encoding is ascii
- "?Y^ž(C)ss" if the default encoding is iso_yu
- "?Y^z(C)ß" if the default encoding is iso_de
- "?¥^z(C)ss" if the default encoding is jisx
Note that the final length of the string may vary.
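The lenient behaviour can be modelled in a few lines of Python. This is a rough sketch, not the compiler's implementation: the replacement table is read off the ascii line of the example above, the toy encoding accepts only the three guaranteed escape sequences, and the names lenient_encode, REPLACEMENTS and ESCAPES are hypothetical.

```python
import re

# Replacements read off the "ascii" line of the example above; the compiler's
# real tables are not part of this document, so treat these as assumptions.
REPLACEMENTS = {"£": "?", "¥": "Y", "↑": "^", "ž": "z", "©": "(C)", "ß": "ss"}
# Escape sequences this toy encoding understands (the guaranteed minimum).
ESCAPES = {"n": "\n", "q": '"', "apos": "'"}

def lenient_encode(literal):
    """Return (encoded_text, warnings) for a toy 7-bit encoding in lenient mode."""
    out, warnings = [], []
    # A token is either a whole {name} escape sequence or a single character.
    for token in re.findall(r"\{[a-z]+\}|.", literal, flags=re.DOTALL):
        if len(token) > 1:                      # escape sequence
            name = token[1:-1]
            if name in ESCAPES:
                out.append(ESCAPES[name])
            else:
                warnings.append(f"skipped unsupported escape {token}")
        elif ord(token) < 128:                  # directly encodable
            out.append(token)
        elif token in REPLACEMENTS:             # unsupported, but replaceable
            out.append(REPLACEMENTS[token])
            warnings.append(f"replaced {token!r} with {REPLACEMENTS[token]!r}")
        else:
            raise ValueError(f"cannot encode {token!r}")
    return "".join(out), warnings

text, warnings = lenient_encode("£¥↑ž©ß{lbrace}")
print(text)        # ?Y^z(C)ss
print(len(text))   # 9
```

The six source characters become nine output characters, which is why the final length of the string may vary.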
Character literals
Character literals are surrounded by single quotes and optionally followed by the name of the encoding:
'x' ascii
'W'
From the type system point of view, they are constants of type byte.
For the list of all text encodings and escape sequences, see this page.
If the character in the literal cannot be encoded in the chosen encoding, an error is raised.
However, if the command-line option -flenient-encoding is used, then literals using the default and scr encodings replace unsupported characters with supported ones.
If the replacement is one character long, only a warning is issued; otherwise an error is raised.
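The one-character rule can be sketched in Python as follows. The replacement table is an assumption borrowed from the ascii line of the string literal example, and lenient_char is a hypothetical name; the sketch models the rule and is not the compiler's code.

```python
# Assumed replacement table, read off the ascii line of the string example.
REPLACEMENTS = {"£": "?", "¥": "Y", "↑": "^", "ž": "z", "©": "(C)", "ß": "ss"}

def lenient_char(ch):
    """Return (byte_value, warning_or_None) for a character literal in a toy 7-bit encoding."""
    if ord(ch) < 128:
        return ord(ch), None                    # directly encodable, no warning
    replacement = REPLACEMENTS.get(ch)
    if replacement is None or len(replacement) != 1:
        # A character literal is a constant of type byte, so a longer
        # replacement such as "(C)" is rejected in this model.
        raise ValueError(f"cannot encode {ch!r} as a single byte")
    return ord(replacement), f"replaced {ch!r} with {replacement!r}"

print(lenient_char("x"))      # (120, None)
print(lenient_char("¥")[0])   # 89, the code of 'Y'
```

Calling lenient_char("©") raises an error in this model, because the assumed replacement "(C)" is three characters long.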
Array initialisers
An array is initialized with either:
- a string literal
- a file expression
- a for-style expression
- a format, followed by an array initializer:
  - @word (=@word_le): for every term of the array initializer, emit two bytes, the first being the low byte of the value and the second being the high byte: @word [$1122] is equivalent to [$22, $11]
  - @word_be: like the above, but with the bytes in the opposite order: @word_be [$1122] is equivalent to [$11, $22]
- a list of byte literals and/or other array initializers, surrounded by brackets:
array a = [1, 2]
array b = "----" scr
array c = ["hello world!" ascii, 13]
array d = file("d.bin")
array e = file("d.bin", 128, 256)
array f = for x,0,until,8 [x * 3 + 5] // equivalent to [5, 8, 11, 14, 17, 20, 23, 26]
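The byte order produced by the @word and @word_be formats can be illustrated with a small Python model. The function names word_le and word_be are hypothetical, and the masking to 16 bits is an assumption of the sketch.

```python
def word_le(values):
    """Model of @word / @word_le: low byte first, then high byte."""
    out = []
    for v in values:
        out += [v & 0xFF, (v >> 8) & 0xFF]
    return out

def word_be(values):
    """Model of @word_be: high byte first, then low byte."""
    out = []
    for v in values:
        out += [(v >> 8) & 0xFF, v & 0xFF]
    return out

print([hex(b) for b in word_le([0x1122])])  # ['0x22', '0x11']
print([hex(b) for b in word_be([0x1122])])  # ['0x11', '0x22']
```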
Trailing commas ([1, 2,]) are not allowed.
The parameters for file are: file path, optional start offset, optional length (start offset and length have to be either both present or both absent).
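As a rough model of those parameters, the Python sketch below reads either a whole file or a slice of it and rejects a call that supplies only one of the two optional arguments. The name file_bytes is hypothetical, and what the compiler does when the requested range runs past the end of the file is not described here, so the sketch does not check for it.

```python
import os
import tempfile

def file_bytes(path, start=None, length=None):
    """Model of file(path) and file(path, start, length) as a list of byte values."""
    if (start is None) != (length is None):
        raise ValueError("start offset and length must be given together")
    with open(path, "rb") as f:
        data = f.read()
    if start is None:
        return list(data)
    return list(data[start:start + length])

# Demo with a throwaway file standing in for "d.bin".
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "d.bin")
    with open(path, "wb") as f:
        f.write(bytes(range(256)) * 2)      # 512 bytes: 0..255 twice
    whole = file_bytes(path)                # like file("d.bin")
    part = file_bytes(path, 128, 256)       # like file("d.bin", 128, 256)

print(len(whole), len(part))  # 512 256
```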
The for-style expression has a variable, a starting index, a direction, a final index, and a parameterizable array initializer.
The initializer is repeated for every value of the variable in the given range.
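The expansion can be modelled in Python as below. Only the until direction is covered, since it is the only one shown on this page; its final index is exclusive, as the eight-element result of the example indicates. The name expand_until is hypothetical.

```python
def expand_until(start, end, initializer):
    """Model of `for x,start,until,end [...]`: the final index is exclusive."""
    out = []
    for x in range(start, end):
        out.extend(initializer(x))   # the initializer is repeated for each x
    return out

# Counterpart of: for x,0,until,8 [x * 3 + 5]
print(expand_until(0, 8, lambda x: [x * 3 + 5]))
# [5, 8, 11, 14, 17, 20, 23, 26]

# An initializer with two terms contributes two elements per value of x.
print(expand_until(0, 3, lambda x: [x, x * 2]))
# [0, 0, 1, 2, 2, 4]
```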