mirror of
https://github.com/KarolS/millfork.git
synced 2026-04-19 10:42:10 +00:00
Text encoding improvements
@@ -81,14 +81,18 @@ Default: no if targeting Ricoh, yes otherwise.
* `-fvariable-overlap`, `-fno-variable-overlap` – Whether variables should overlap if their scopes do not intersect.
Default: yes.

* `-fbounds-checking`, `-fnobounds-checking` – Whether the compiler should insert bounds checking on array access.
* `-fbounds-checking`, `-fno-bounds-checking` – Whether the compiler should insert bounds checking on array access.
Default: no.
* `-fcompact-dispatch-params`, `-fnocompact-dispatch-params` –
* `-fcompact-dispatch-params`, `-fno-compact-dispatch-params` –
Whether parameter values in return dispatch statements may overlap other objects.
This may cause problems if the parameter table is stored next to a hardware register that has side effects when reading.
`.ini` equivalent: `compact_dispatch_params`. Default: yes.
* `-flenient-encoding`, `-fno-lenient-encoding` –
Whether the compiler should allow invalid characters in string/character literals that use the default encodings and replace them with alternatives.
`.ini` equivalent: `lenient_encoding`. Default: no.
## Optimization options

* `-O0` – Disable all optimizations.
@@ -26,6 +26,13 @@ Every platform is defined in an `.ini` file with an appropriate name.

* `z80` (Zilog Z80; experimental and very incomplete)
* `encoding` – default encoding for console I/O, one of
`ascii`, `pet`/`petscii`, `petscr`/`cbmscr`, `atascii`, `bbc`, `jis`/`jisx`, `apple2`,
`iso_de`, `iso_no`/`iso_dk`, `iso_se`/`iso_fi`, `iso_yu`. Default: `ascii`.

* `screen_encoding` – default encoding for screencodes (literals with encoding specified as `scr`).
Default: the same as `encoding`.
* `modules` – comma-separated list of modules that will be automatically imported

* other compilation options (they can be overridden using command-line options):
@@ -54,6 +61,8 @@ Every platform is defined in an `.ini` file with an appropriate name.
* `inline` - inline functions automatically by default, default is `false`.

* `ipo` - enable interprocedural optimization, default is `false`.

* `lenient_encoding` - allow for automatic substitution of invalid characters in string literals using the default encodings, default is `false`.


#### `[allocation]` section
+36
-4
@@ -16,9 +16,10 @@ Hexadecimal: `$D323`, `0x2a2`

## String literals

String literals are surrounded with double quotes and followed by the name of the encoding:
String literals are surrounded with double quotes and optionally followed by the name of the encoding:

    "this is a string" ascii
    "this is also a string"

Characters between the quotes are interpreted literally;
there is no way to escape special characters or quotes.
@@ -28,11 +29,16 @@ for compatibility with multiple variants.

Currently available encodings:

* `default` – default console encoding (can be omitted)

* `scr` – default screencodes
(usually the same as `default`; Commodore computers are a notable exception)

* `ascii` – standard ASCII

* `pet` or `petscii` – PETSCII (ASCII-like character set used by Commodore machines)

* `scr` – Commodore screencodes
* `cbmscr` or `petscr` – Commodore screencodes

* `apple2` – Apple II charset ($A0–$FE)
@@ -46,16 +52,42 @@ Currently available encodings:

When programming for Commodore,
use `pet` for strings you're printing using standard I/O routines
and `scr` for strings you're copying to screen memory directly.
and `petscr` for strings you're copying to screen memory directly.

If the characters in the literal cannot be encoded in the given encoding, an error is raised.
However, if the command-line option `-flenient-encoding` is used,
then literals using the `default` and `scr` encodings replace unsupported characters with supported ones
and a warning is issued.
For example, if `-flenient-encoding` is enabled, then the literal `"£¥↑ž©ß"` is equivalent to:
* `"£Y↑z(C)ss"` if the default encoding is `pet`

* `"£Y↑z©ss"` if the default encoding is `bbc`

* `"?Y^z(C)ss"` if the default encoding is `ascii`

* `"?Y^ž(C)ss"` if the default encoding is `iso_yu`

* `"?Y^z(C)ß"` if the default encoding is `iso_de`

* `"?¥^z(C)ss"` if the default encoding is `jisx`

Note that the final length of the string may vary.
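
The substitution behaviour described for lenient string literals can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Millfork's implementation: the `FALLBACKS` table is limited to the six characters from the example, `ASCII_LIKE` and `PET_LIKE` are simplified, hypothetical stand-ins for the real character sets, and `lenient_encode` is a made-up name.

```python
import string

# Hypothetical fallback table, covering only the characters from the example.
FALLBACKS = {"£": "?", "¥": "Y", "↑": "^", "ž": "z", "©": "(C)", "ß": "ss"}

# Simplified stand-ins for real encodings (assumption: printable ASCII, plus
# the two extra symbols that the `pet` example keeps unchanged).
ASCII_LIKE = set(string.ascii_letters + string.digits + string.punctuation + " ")
PET_LIKE = ASCII_LIKE | {"£", "↑"}


def lenient_encode(text, supported):
    """Replace unsupported characters; return the new text and any warnings."""
    out, warnings = [], []
    for ch in text:
        if ch in supported:
            out.append(ch)
        elif ch in FALLBACKS:
            out.append(FALLBACKS[ch])
            warnings.append(f"replaced {ch!r} with {FALLBACKS[ch]!r}")
        else:
            # No known substitute: this stays an error even in lenient mode.
            raise ValueError(f"cannot encode {ch!r}")
    return "".join(out), warnings


encoded, notes = lenient_encode("£¥↑ž©ß", ASCII_LIKE)
print(encoded, len(encoded))  # the 6-character literal grows to 9 characters
```

Because `©` and `ß` expand to several characters, code that relies on the length of a literal should not assume it matches the length of the source text.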
## Character literals

Character literals are surrounded by single quotes and followed by the name of the encoding:
Character literals are surrounded by single quotes and optionally followed by the name of the encoding:

    'x' ascii
    'W'

From the type system point of view, they are constants of type byte.

If the characters in the literal cannot be encoded in the given encoding, an error is raised.
However, if the command-line option `-flenient-encoding` is used,
then literals using the `default` and `scr` encodings replace unsupported characters with supported ones.
If the replacement is one character long, only a warning is issued; otherwise an error is raised.
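
The stricter rule for character literals can be sketched in Python as well. Again, this is a hedged illustration rather than Millfork's code: `FALLBACKS`, `SUPPORTED` and `lenient_char` are hypothetical names, and the supported set is a simplified ASCII-only assumption.

```python
import string

# Hypothetical fallback table and a simplified, ASCII-only supported set.
FALLBACKS = {"¥": "Y", "©": "(C)", "ß": "ss"}
SUPPORTED = set(string.ascii_letters + string.digits + string.punctuation + " ")


def lenient_char(ch):
    """Return the single character to encode, or raise if no one-character form exists."""
    if ch in SUPPORTED:
        return ch
    replacement = FALLBACKS.get(ch)
    if replacement is None or len(replacement) != 1:
        # A character literal is a single byte constant, so a
        # multi-character replacement such as "(C)" cannot be used.
        raise ValueError(f"cannot encode {ch!r} as a single character")
    print(f"warning: replaced {ch!r} with {replacement!r}")
    return replacement
```

With these assumptions, `lenient_char("¥")` returns `"Y"` after printing a warning, while `lenient_char("©")` raises an error because its replacement is three characters long.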
## Array initialisers

An array is initialized with either: