mirror of
https://github.com/depp/syncfiles.git
Table Formats
Each character map table starts with a single nonzero byte, indicating the table's format.
Extended ASCII
Format 1 is for "extended ASCII". Encoded values 0-127 are identical to ASCII, and encoded values 128-255 are mapped to single Unicode characters.
The table contains 128 entries, for encoded values 128-255, with the following format:
u8 Length of Unicode character
u8[] Unicode character, UTF-8
u8 Length of normalized Unicode character, may be zero
u8[] Unicode character in NFD normal form, UTF-8
The second copy of the character is present only if the character decomposes into multiple characters. Otherwise, the length of the normalized character is zero and no bytes follow it.
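The entry layout above can be read with a short loop. The following Python is a minimal sketch, not the project's own code. It assumes that the 128 entries follow the format byte directly with no padding, and that each length counts UTF-8 bytes. The names `parse_extended_ascii` and `decode` are hypothetical. Treating a zero-length normalized copy as "same as the first copy" is an interpretation of the rule above.

```python
FORMAT_EXTENDED_ASCII = 1  # format byte for "extended ASCII" tables


def _read_string(data, pos):
    """Read a u8 length followed by that many UTF-8 bytes.

    Returns the decoded string and the position just past it.
    """
    if pos >= len(data):
        raise ValueError("table is truncated")
    start = pos + 1
    end = start + data[pos]
    if end > len(data):
        raise ValueError("table is truncated")
    return data[start:end].decode("utf-8"), end


def parse_extended_ascii(data):
    """Parse a format 1 table into 128 (character, nfd) pairs.

    Assumption: entries are packed immediately after the format byte.
    """
    if not data or data[0] != FORMAT_EXTENDED_ASCII:
        raise ValueError("not a format 1 table")
    pos = 1
    entries = []
    for _ in range(128):
        char, pos = _read_string(data, pos)
        nfd, pos = _read_string(data, pos)
        # Interpretation: a zero-length second copy means the first
        # copy is used for the normalized form as well.
        entries.append((char, nfd or char))
    return entries


def decode(encoded, entries, normalized=False):
    """Decode bytes using the parsed table; 0-127 map to ASCII."""
    out = []
    for b in encoded:
        if b < 128:
            out.append(chr(b))
        else:
            char, nfd = entries[b - 128]
            out.append(nfd if normalized else char)
    return "".join(out)
```

Given a table whose entry for encoded value 233 holds U+00E9 with a decomposed second copy, `decode(b"caf\xe9", entries)` would return the precomposed character, and passing `normalized=True` would return "e" followed by U+0301 instead.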