mirror of
https://github.com/depp/syncfiles.git
synced 2024-11-22 03:30:57 +00:00
db4187b65b
Extract table generation to its own file, table.go, and refactor the interface. This exposed an inconsistency in the way that line breaks were handled: both CR and LF on the Mac side were mapped to LF on the UTF-8 side, but when the conversion table was inverted, the reverse mappings would conflict. Previously, there was no explicit handling for this, and whichever Mac character had the higher byte value would take precedence. Conflicts are now detected and return an error, so line breaks must be mapped explicitly. The new code maps CR, LF, and CRLF to CR when converting UTF-8 to Mac.
.gitignore
go.mod
go.sum
macroman.go
README.md
table.go
Character Conversion Tables
Used by SyncFiles.
This program generates the tables necessary to convert from UTF-8 to Mac OS Roman.
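Generating the UTF-8 to Mac OS Roman table involves inverting a Mac-to-Unicode mapping, and the inversion is ambiguous if two Mac bytes map to the same character (CR and LF both mapping to LF, for example). The following is a minimal sketch of that conflict check, not the code in table.go. The function name `invert` and the toy mapping are assumptions made for illustration.

```go
package main

import "fmt"

// invert builds a rune-to-byte map from a byte-to-rune map. It returns an
// error if two bytes map to the same rune. (Hypothetical helper, not the
// actual generator code.)
func invert(forward map[byte]rune) (map[rune]byte, error) {
	reverse := make(map[rune]byte, len(forward))
	// Visit bytes in ascending order so the reported conflict is deterministic.
	for b := 0; b < 256; b++ {
		r, ok := forward[byte(b)]
		if !ok {
			continue
		}
		if prev, dup := reverse[r]; dup {
			return nil, fmt.Errorf("conflict: bytes 0x%02x and 0x%02x both map to U+%04X", prev, b, r)
		}
		reverse[r] = byte(b)
	}
	return reverse, nil
}

func main() {
	// Toy mapping in which both LF (0x0a) and CR (0x0d) map to U+000A.
	forward := map[byte]rune{0x0a: '\n', 0x0d: '\n', 0x41: 'A'}
	if _, err := invert(forward); err != nil {
		fmt.Println("error:", err)
	}
	// Removing the line breaks from the generic mapping removes the conflict.
	// They would then need an explicit mapping of their own.
	delete(forward, 0x0a)
	delete(forward, 0x0d)
	reverse, err := invert(forward)
	fmt.Println(reverse['A'], err)
}
```

Returning an error instead of letting one byte silently win forces the caller to decide how ambiguous characters such as line breaks are handled.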
The conversion process is entirely table-driven. The table maps a (state, input) pair to a (state, output) pair. The initial state is 0. A transition to state 0 is considered invalid.
A transition may have both a next state and an output. This means that the input may be translated in different ways depending on the bytes that follow. The translation code prefers the longest path through the state table that results in an output.
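The longest-match rule can be sketched as follows. This is an illustration under stated assumptions, not the actual SyncFiles decoder. The types `key` and `entry`, the `hasOut` flag, the use of a Go map in place of a packed table, and the toy table itself are all hypothetical. An entry with `next == 0` is taken to mean that no longer path exists, and a missing entry is taken to mean an invalid transition.

```go
package main

import "fmt"

// key is a (state, input) pair. entry is a (state, output) pair.
// Both types are assumptions made for this sketch.
type key struct{ state, input byte }

type entry struct {
	next   uint8 // next state; 0 means no longer path exists
	out    byte  // output byte, valid only if hasOut is set
	hasOut bool
}

// toyTable converts a few UTF-8 sequences to Mac OS Roman: "a", "e",
// precomposed U+00E9, and "e" followed by U+0301 (combining acute accent).
var toyTable = map[key]entry{
	{0, 'a'}:  {out: 0x61, hasOut: true},
	{0, 'e'}:  {next: 1, out: 0x65, hasOut: true},
	{1, 0xcc}: {next: 2},
	{2, 0x81}: {out: 0x8e, hasOut: true},
	{0, 0xc3}: {next: 3},
	{3, 0xa9}: {out: 0x8e, hasOut: true},
}

// convert walks the table from state 0, remembers the last transition that
// produced an output, and restarts just after that point when the path ends.
func convert(table map[key]entry, in []byte) ([]byte, error) {
	var out []byte
	for pos := 0; pos < len(in); {
		state := uint8(0)
		matchEnd := -1
		var matchOut byte
		for i := pos; i < len(in); i++ {
			e, ok := table[key{state, in[i]}]
			if !ok {
				break
			}
			if e.hasOut {
				matchEnd, matchOut = i+1, e.out
			}
			if e.next == 0 {
				break
			}
			state = e.next
		}
		if matchEnd < 0 {
			return nil, fmt.Errorf("cannot convert input at offset %d", pos)
		}
		out = append(out, matchOut)
		pos = matchEnd
	}
	return out, nil
}

func main() {
	for _, s := range []string{"e", "e\u0301", "\u00e9a", "e\u0300"} {
		out, err := convert(toyTable, []byte(s))
		fmt.Printf("%q -> % x, err = %v\n", s, out, err)
	}
}
```

In this toy table, "e" alone produces 0x65, while "e" followed by a combining acute accent takes the longer path and produces 0x8e. The entry for "e" therefore carries both an output and a next state.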
The table is compressed with PackBits, which reduces its size by a factor of 22.
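For reference, PackBits is a simple run-length encoding: each header byte, read as a signed value n, introduces either a literal run of n+1 bytes (n from 0 to 127) or a single byte repeated 1-n times (n from -1 to -127), and -128 is skipped. The decoder below is a minimal sketch of the format. The function name `unpackBits` is an assumption, and this is not the decoder SyncFiles uses.

```go
package main

import (
	"errors"
	"fmt"
)

// unpackBits decodes PackBits-compressed data. (Hypothetical helper written
// for illustration.)
func unpackBits(src []byte) ([]byte, error) {
	var dst []byte
	for i := 0; i < len(src); {
		n := int(int8(src[i])) // header byte, interpreted as signed
		i++
		switch {
		case n >= 0:
			// Literal run: copy the next n+1 bytes unchanged.
			if i+n+1 > len(src) {
				return nil, errors.New("packbits: truncated literal run")
			}
			dst = append(dst, src[i:i+n+1]...)
			i += n + 1
		case n != -128:
			// Repeat run: emit the next byte 1-n times (2 to 128 copies).
			if i >= len(src) {
				return nil, errors.New("packbits: truncated repeat run")
			}
			for j := 0; j < 1-n; j++ {
				dst = append(dst, src[i])
			}
			i++
		}
		// A header of -128 is a no-op and is skipped.
	}
	return dst, nil
}

func main() {
	// 0xfe is -2: repeat 0xaa three times. 0x02: copy three literal bytes.
	packed := []byte{0xfe, 0xaa, 0x02, 0x80, 0x00, 0x2a}
	out, err := unpackBits(packed)
	fmt.Printf("% x, err = %v\n", out, err)
}
```

Run-length encoding suits a state table because most entries in such a table are typically identical (invalid) transitions, which collapse into long repeat runs.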