[< back to index](../doc_index.md)

### Defining custom encodings

Every encoding is defined in its own `<encoding name>.tbl` file.
The file is looked up in the directories on the include path, first directly, then in the `encoding` subdirectory.

The file is a UTF-8 text file, with each line having a specific meaning.
In the specifications below, the angle brackets `<>` mark placeholders and are not meant literally:

* lines starting with `#`, `;` or `//` are comments.

* `ALIAS=<another encoding name>` defines this encoding to be an alias for another encoding.
No other lines are allowed in the file.

* `NAME=<name>` defines the name for this encoding. Required.

* `BUILTIN=<internal name>` defines this encoding to be a UTF-based encoding.
`<internal name>` may be one of `UTF-8`, `UTF-16LE`, `UTF-16BE`, `UTF-32LE`, `UTF-32BE`.
If this directive is present, the only other directive allowed in the file is the `NAME` directive.

* `EOT=<xx>`, where `<xx>` is two hex digits, defines the string terminator byte.
Required, unless `BUILTIN` is present.
There have to be two digits; `EOT=0` is invalid.

* lines like `<xx>=<c>`, where `<xx>` is two hex digits
and `<c>` is either a **non-whitespace** character or a **BMP** Unicode codepoint written as `U+xxxx`,
define the byte `<xx>` to correspond to the character `<c>`.
There have to be two digits; `0=@` is invalid.

* lines like `<xx>-<xx>=<c><c><c><c>`, where `<c>` is repeated once for each byte value in the range,
define characters for multiple byte values.
In lines of this kind, characters cannot be written as Unicode codepoints.

* lines like `<c>=<xx>`, `<c>=<xx><xx>` etc.
define secondary or alternate characters that are going to be represented as one or more bytes.
Each byte has to be written with two digits; `@=0` is invalid.
Problematic characters (space, `=`, `#`, `;`) can be written as Unicode codepoints `U+xxxx`.

* a line like `a-z=<xx>` is equivalent to the lines `a=<xx>`, `b=<xx+1>`, and so on, all the way to `z=<xx+25>`.

* a line like `KATAKANA=>DECOMPOSE` means that katakana characters with dakuten or handakuten
should be split into the base character and the standalone dakuten/handakuten.

* similarly with `HIRAGANA=>DECOMPOSE`.

* lines like `{<escape code>}=<xx>`, `{<escape code>}=<xx><xx>` etc.
define escape codes. It is good practice to define these when possible:
`{q}`, `{apos}`, `{n}`, `{lbrace}`, `{rbrace}`, `{yen}`, `{pound}`, `{cent}`, `{euro}`, `{copy}`, `{pi}`, `{nbsp}`, `{shy}`.
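
To make the line forms above more concrete, here is a minimal Python sketch that reads a small subset of them: comments, `NAME`, `EOT`, single-byte definitions, byte ranges, character ranges and alternate characters. It is only an illustration under stated assumptions, not the actual loader: the function names `parse_char` and `parse_tbl`, the sample encoding `demo` and all byte values in it are hypothetical. It also assumes that a character range such as `a-z` is assigned consecutive byte values and that a byte definition can be used in both directions. `ALIAS`, `BUILTIN`, `=>DECOMPOSE` and escape codes are not handled.

```python
import re

# Hypothetical sample file contents, for illustration only.
SAMPLE_TBL = """\
# hypothetical sample encoding
NAME=demo
EOT=00
20=U+0020
21=!
30-39=0123456789
a-z=61
"""

HEX2 = r"[0-9A-Fa-f]{2}"


def parse_char(token):
    """Return the character for a literal token or a U+xxxx codepoint."""
    m = re.fullmatch(r"U\+([0-9A-Fa-f]{4})", token)
    ch = chr(int(m.group(1), 16)) if m else token
    if len(ch) != 1:
        raise ValueError(f"not a single character: {token!r}")
    return ch


def parse_tbl(text):
    """Parse a subset of the directives into a plain dict (assumed layout)."""
    enc = {"name": None, "eot": None, "decode": {}, "encode": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "//")):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=': {line!r}")
        if key == "NAME":
            enc["name"] = value
        elif key == "EOT":
            if not re.fullmatch(HEX2, value):
                raise ValueError("EOT needs exactly two hex digits")
            enc["eot"] = int(value, 16)
        elif re.fullmatch(HEX2, key):
            # <xx>=<c>: one byte value, one character
            byte, ch = int(key, 16), parse_char(value)
            enc["decode"][byte] = ch
            enc["encode"][ch] = bytes([byte])
        elif re.fullmatch(f"{HEX2}-{HEX2}", key):
            # <xx>-<xx>=<c><c>...: one literal character per byte value
            lo, hi = (int(part, 16) for part in key.split("-"))
            if len(value) != hi - lo + 1:
                raise ValueError(f"wrong character count: {line!r}")
            for offset, ch in enumerate(value):
                enc["decode"][lo + offset] = ch
                enc["encode"][ch] = bytes([lo + offset])
        elif re.fullmatch(r".-.", key):
            # <c>-<c>=<xx>: consecutive byte values (an assumption)
            if not re.fullmatch(HEX2, value):
                raise ValueError(f"need two hex digits: {line!r}")
            base = int(value, 16)
            for offset in range(ord(key[2]) - ord(key[0]) + 1):
                enc["encode"][chr(ord(key[0]) + offset)] = bytes([base + offset])
        else:
            # <c>=<xx>, <c>=<xx><xx>, ...: alternate character as bytes
            if not re.fullmatch(f"(?:{HEX2})+", value):
                raise ValueError(f"unsupported line: {line!r}")
            enc["encode"][parse_char(key)] = bytes.fromhex(value)
    return enc


if __name__ == "__main__":
    print(parse_tbl(SAMPLE_TBL))
```

Calling `parse_tbl(SAMPLE_TBL)` returns a dictionary with the encoding name, the terminator byte, and two lookup tables, one from byte values to characters and one from characters to byte sequences. Lines that the sketch does not understand raise a `ValueError` rather than being silently skipped, which makes typos such as `0=@` visible early.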