hfsutils - tools for reading and writing Macintosh HFS volumes
Copyright (C) 1996-1998 Robert Leslie
$Id: charset.txt,v 1.4 1998/11/02 22:08:46 rob Exp $
===============================================================================
HFS uses the MacOS Standard Roman character set for volume, file, and path
names, and potentially for the data contained within some files. Although
MacOS Standard Roman is identical to US ASCII for values 0x00 through 0x7F,
the characters at 0x80 through 0xFF are assigned values that are largely
specific to MacOS Standard Roman.

The following is a translation table from MacOS Standard Roman to Unicode.
Note that ISO 8859-1 is a subset of Unicode, mapped directly to U+0000
through U+00FF. Since the Unicode values for MacOS Standard Roman characters
do not all fall within this range, it is not possible to map MacOS Standard
Roman completely and directly to ISO 8859-1 (and, incidentally, neither is it
possible to map ISO 8859-1 completely and directly to MacOS Standard Roman).

(All numeric values in this table are hexadecimal. The ASR column gives the
MacOS Standard Roman value.)
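
To make the ISO 8859-1 point concrete, the following minimal Python sketch
checks a few entries copied from Part III of the table. The names SAMPLES,
fits_iso_8859_1, and describe are illustrative only and are not part of
hfsutils.

```python
# A few (ASR value, Unicode code point, name) entries copied from Part III.
SAMPLES = [
    (0x80, 0x00C4, "LATIN CAPITAL LETTER A WITH DIAERESIS"),
    (0xA0, 0x2020, "DAGGER"),
    (0xB9, 0x03C0, "GREEK SMALL LETTER PI"),
    (0xDE, 0xFB01, "LATIN SMALL LIGATURE FI"),
]


def fits_iso_8859_1(code_point: int) -> bool:
    """ISO 8859-1 covers exactly the Unicode range U+0000 through U+00FF."""
    return 0 <= code_point <= 0xFF


def describe(asr: int, code_point: int, name: str) -> str:
    """Report whether one table entry has an ISO 8859-1 equivalent."""
    if fits_iso_8859_1(code_point):
        # Representable, but generally at a different byte value.
        return f"{asr:02X} -> ISO 8859-1 {code_point:02X} ({name})"
    return f"{asr:02X} -> U+{code_point:04X}, no ISO 8859-1 equivalent ({name})"


if __name__ == "__main__":
    for entry in SAMPLES:
        print(describe(*entry))
```

Even where a character exists in both sets, as with 80 mapping to C4, the
byte value changes, so a conversion still needs a lookup table.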
Part I: Control Characters (00 - 1F)

+----+--------+-----------------------+----+--------+--------------------------
|ASR |Unicode | Character             |ASR |Unicode | Character
+----+--------+-----------------------+----+--------+--------------------------
| 00 | U+0000 | NULL                  | 10 | U+0010 | DATA LINK ESCAPE
| 01 | U+0001 | START OF HEADING      | 11 | U+0011 | DEVICE CONTROL ONE
| 02 | U+0002 | START OF TEXT         | 12 | U+0012 | DEVICE CONTROL TWO
| 03 | U+0003 | END OF TEXT           | 13 | U+0013 | DEVICE CONTROL THREE
| 04 | U+0004 | END OF TRANSMISSION   | 14 | U+0014 | DEVICE CONTROL FOUR
| 05 | U+0005 | ENQUIRY               | 15 | U+0015 | NEGATIVE ACKNOWLEDGE
| 06 | U+0006 | ACKNOWLEDGE           | 16 | U+0016 | SYNCHRONOUS IDLE
| 07 | U+0007 | BELL                  | 17 | U+0017 | END OF TRANSMISSION BLOCK
| 08 | U+0008 | BACKSPACE             | 18 | U+0018 | CANCEL
| 09 | U+0009 | HORIZONTAL TABULATION | 19 | U+0019 | END OF MEDIUM
| 0A | U+000A | LINE FEED             | 1A | U+001A | SUBSTITUTE
| 0B | U+000B | VERTICAL TABULATION   | 1B | U+001B | ESCAPE
| 0C | U+000C | FORM FEED             | 1C | U+001C | FILE SEPARATOR
| 0D | U+000D | CARRIAGE RETURN       | 1D | U+001D | GROUP SEPARATOR
| 0E | U+000E | SHIFT OUT             | 1E | U+001E | RECORD SEPARATOR
| 0F | U+000F | SHIFT IN              | 1F | U+001F | UNIT SEPARATOR
+----+--------+-----------------------+----+--------+--------------------------
Part II: Printable ASCII Characters and Delete (20 - 7E, and 7F)

+----+--------+-------------------------+----+--------+------------------------
|ASR |Unicode | Character               |ASR |Unicode | Character
+----+--------+-------------------------+----+--------+------------------------
| 20 | U+0020 | SPACE                   | 30 | U+0030 | DIGIT ZERO
| 21 | U+0021 | EXCLAMATION MARK        | 31 | U+0031 | DIGIT ONE
| 22 | U+0022 | QUOTATION MARK          | 32 | U+0032 | DIGIT TWO
| 23 | U+0023 | NUMBER SIGN             | 33 | U+0033 | DIGIT THREE
| 24 | U+0024 | DOLLAR SIGN             | 34 | U+0034 | DIGIT FOUR
| 25 | U+0025 | PERCENT SIGN            | 35 | U+0035 | DIGIT FIVE
| 26 | U+0026 | AMPERSAND               | 36 | U+0036 | DIGIT SIX
| 27 | U+0027 | APOSTROPHE              | 37 | U+0037 | DIGIT SEVEN
| 28 | U+0028 | LEFT PARENTHESIS        | 38 | U+0038 | DIGIT EIGHT
| 29 | U+0029 | RIGHT PARENTHESIS       | 39 | U+0039 | DIGIT NINE
| 2A | U+002A | ASTERISK                | 3A | U+003A | COLON
| 2B | U+002B | PLUS SIGN               | 3B | U+003B | SEMICOLON
| 2C | U+002C | COMMA                   | 3C | U+003C | LESS-THAN SIGN
| 2D | U+002D | HYPHEN-MINUS            | 3D | U+003D | EQUALS SIGN
| 2E | U+002E | FULL STOP               | 3E | U+003E | GREATER-THAN SIGN
| 2F | U+002F | SOLIDUS                 | 3F | U+003F | QUESTION MARK
+----+--------+-------------------------+----+--------+------------------------
| 40 | U+0040 | COMMERCIAL AT           | 50 | U+0050 | LATIN CAPITAL LETTER P
| 41 | U+0041 | LATIN CAPITAL LETTER A  | 51 | U+0051 | LATIN CAPITAL LETTER Q
| 42 | U+0042 | LATIN CAPITAL LETTER B  | 52 | U+0052 | LATIN CAPITAL LETTER R
| 43 | U+0043 | LATIN CAPITAL LETTER C  | 53 | U+0053 | LATIN CAPITAL LETTER S
| 44 | U+0044 | LATIN CAPITAL LETTER D  | 54 | U+0054 | LATIN CAPITAL LETTER T
| 45 | U+0045 | LATIN CAPITAL LETTER E  | 55 | U+0055 | LATIN CAPITAL LETTER U
| 46 | U+0046 | LATIN CAPITAL LETTER F  | 56 | U+0056 | LATIN CAPITAL LETTER V
| 47 | U+0047 | LATIN CAPITAL LETTER G  | 57 | U+0057 | LATIN CAPITAL LETTER W
| 48 | U+0048 | LATIN CAPITAL LETTER H  | 58 | U+0058 | LATIN CAPITAL LETTER X
| 49 | U+0049 | LATIN CAPITAL LETTER I  | 59 | U+0059 | LATIN CAPITAL LETTER Y
| 4A | U+004A | LATIN CAPITAL LETTER J  | 5A | U+005A | LATIN CAPITAL LETTER Z
| 4B | U+004B | LATIN CAPITAL LETTER K  | 5B | U+005B | LEFT SQUARE BRACKET
| 4C | U+004C | LATIN CAPITAL LETTER L  | 5C | U+005C | REVERSE SOLIDUS
| 4D | U+004D | LATIN CAPITAL LETTER M  | 5D | U+005D | RIGHT SQUARE BRACKET
| 4E | U+004E | LATIN CAPITAL LETTER N  | 5E | U+005E | CIRCUMFLEX ACCENT
| 4F | U+004F | LATIN CAPITAL LETTER O  | 5F | U+005F | LOW LINE
+----+--------+-------------------------+----+--------+------------------------
| 60 | U+0060 | GRAVE ACCENT            | 70 | U+0070 | LATIN SMALL LETTER P
| 61 | U+0061 | LATIN SMALL LETTER A    | 71 | U+0071 | LATIN SMALL LETTER Q
| 62 | U+0062 | LATIN SMALL LETTER B    | 72 | U+0072 | LATIN SMALL LETTER R
| 63 | U+0063 | LATIN SMALL LETTER C    | 73 | U+0073 | LATIN SMALL LETTER S
| 64 | U+0064 | LATIN SMALL LETTER D    | 74 | U+0074 | LATIN SMALL LETTER T
| 65 | U+0065 | LATIN SMALL LETTER E    | 75 | U+0075 | LATIN SMALL LETTER U
| 66 | U+0066 | LATIN SMALL LETTER F    | 76 | U+0076 | LATIN SMALL LETTER V
| 67 | U+0067 | LATIN SMALL LETTER G    | 77 | U+0077 | LATIN SMALL LETTER W
| 68 | U+0068 | LATIN SMALL LETTER H    | 78 | U+0078 | LATIN SMALL LETTER X
| 69 | U+0069 | LATIN SMALL LETTER I    | 79 | U+0079 | LATIN SMALL LETTER Y
| 6A | U+006A | LATIN SMALL LETTER J    | 7A | U+007A | LATIN SMALL LETTER Z
| 6B | U+006B | LATIN SMALL LETTER K    | 7B | U+007B | LEFT CURLY BRACKET
| 6C | U+006C | LATIN SMALL LETTER L    | 7C | U+007C | VERTICAL LINE
| 6D | U+006D | LATIN SMALL LETTER M    | 7D | U+007D | RIGHT CURLY BRACKET
| 6E | U+006E | LATIN SMALL LETTER N    | 7E | U+007E | TILDE
| 6F | U+006F | LATIN SMALL LETTER O    | 7F | U+007F | DELETE
+----+--------+-------------------------+----+--------+------------------------
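
Parts I and II amount to an identity mapping: every ASR value from 00 through
7F equals its Unicode code point. A minimal Python sketch of that rule
follows; the function name decode_low_half is hypothetical and does not come
from hfsutils.

```python
def decode_low_half(data: bytes) -> str:
    """Decode ASR bytes in the range 00-7F, where value == code point."""
    for offset, value in enumerate(data):
        if value > 0x7F:
            # Values 80-FF need the Part III table; they are not handled here.
            raise ValueError(
                f"byte {value:02X} at offset {offset} is outside 00-7F"
            )
    # Each byte value is used directly as the Unicode code point.
    return "".join(chr(value) for value in data)


if __name__ == "__main__":
    print(decode_low_half(b"Read Me"))
```

Rejecting values above 7F, rather than passing them through, keeps this
shortcut from silently treating extended characters as ISO 8859-1.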
Part III: Extended Characters (80 - FF)

+----+--------+----------------------------------------------------------------
|ASR |Unicode | Character
+----+--------+----------------------------------------------------------------
| 80 | U+00C4 | LATIN CAPITAL LETTER A WITH DIAERESIS
| 81 | U+00C5 | LATIN CAPITAL LETTER A WITH RING ABOVE
| 82 | U+00C7 | LATIN CAPITAL LETTER C WITH CEDILLA
| 83 | U+00C9 | LATIN CAPITAL LETTER E WITH ACUTE
| 84 | U+00D1 | LATIN CAPITAL LETTER N WITH TILDE
| 85 | U+00D6 | LATIN CAPITAL LETTER O WITH DIAERESIS
| 86 | U+00DC | LATIN CAPITAL LETTER U WITH DIAERESIS
| 87 | U+00E1 | LATIN SMALL LETTER A WITH ACUTE
| 88 | U+00E0 | LATIN SMALL LETTER A WITH GRAVE
| 89 | U+00E2 | LATIN SMALL LETTER A WITH CIRCUMFLEX
| 8A | U+00E4 | LATIN SMALL LETTER A WITH DIAERESIS
| 8B | U+00E3 | LATIN SMALL LETTER A WITH TILDE
| 8C | U+00E5 | LATIN SMALL LETTER A WITH RING ABOVE
| 8D | U+00E7 | LATIN SMALL LETTER C WITH CEDILLA
| 8E | U+00E9 | LATIN SMALL LETTER E WITH ACUTE
| 8F | U+00E8 | LATIN SMALL LETTER E WITH GRAVE
+----+--------+----------------------------------------------------------------
| 90 | U+00EA | LATIN SMALL LETTER E WITH CIRCUMFLEX
| 91 | U+00EB | LATIN SMALL LETTER E WITH DIAERESIS
| 92 | U+00ED | LATIN SMALL LETTER I WITH ACUTE
| 93 | U+00EC | LATIN SMALL LETTER I WITH GRAVE
| 94 | U+00EE | LATIN SMALL LETTER I WITH CIRCUMFLEX
| 95 | U+00EF | LATIN SMALL LETTER I WITH DIAERESIS
| 96 | U+00F1 | LATIN SMALL LETTER N WITH TILDE
| 97 | U+00F3 | LATIN SMALL LETTER O WITH ACUTE
| 98 | U+00F2 | LATIN SMALL LETTER O WITH GRAVE
| 99 | U+00F4 | LATIN SMALL LETTER O WITH CIRCUMFLEX
| 9A | U+00F6 | LATIN SMALL LETTER O WITH DIAERESIS
| 9B | U+00F5 | LATIN SMALL LETTER O WITH TILDE
| 9C | U+00FA | LATIN SMALL LETTER U WITH ACUTE
| 9D | U+00F9 | LATIN SMALL LETTER U WITH GRAVE
| 9E | U+00FB | LATIN SMALL LETTER U WITH CIRCUMFLEX
| 9F | U+00FC | LATIN SMALL LETTER U WITH DIAERESIS
+----+--------+----------------------------------------------------------------
| A0 | U+2020 | DAGGER
| A1 | U+00B0 | DEGREE SIGN
| A2 | U+00A2 | CENT SIGN
| A3 | U+00A3 | POUND SIGN
| A4 | U+00A7 | SECTION SIGN
| A5 | U+2022 | BULLET
| A6 | U+00B6 | PILCROW SIGN
| A7 | U+00DF | LATIN SMALL LETTER SHARP S
| A8 | U+00AE | REGISTERED SIGN
| A9 | U+00A9 | COPYRIGHT SIGN
| AA | U+2122 | TRADE MARK SIGN
| AB | U+00B4 | ACUTE ACCENT
| AC | U+00A8 | DIAERESIS
| AD | U+2260 | NOT EQUAL TO
| AE | U+00C6 | LATIN CAPITAL LETTER AE
| AF | U+00D8 | LATIN CAPITAL LETTER O WITH STROKE
+----+--------+----------------------------------------------------------------
| B0 | U+221E | INFINITY
| B1 | U+00B1 | PLUS-MINUS SIGN
| B2 | U+2264 | LESS-THAN OR EQUAL TO
| B3 | U+2265 | GREATER-THAN OR EQUAL TO
| B4 | U+00A5 | YEN SIGN
| B5 | U+00B5 | MICRO SIGN
| B6 | U+2202 | PARTIAL DIFFERENTIAL
| B7 | U+2211 | N-ARY SUMMATION
| B8 | U+220F | N-ARY PRODUCT
| B9 | U+03C0 | GREEK SMALL LETTER PI
| BA | U+222B | INTEGRAL
| BB | U+00AA | FEMININE ORDINAL INDICATOR
| BC | U+00BA | MASCULINE ORDINAL INDICATOR
| BD | U+2126 | OHM SIGN
| BE | U+00E6 | LATIN SMALL LETTER AE
| BF | U+00F8 | LATIN SMALL LETTER O WITH STROKE
+----+--------+----------------------------------------------------------------
| C0 | U+00BF | INVERTED QUESTION MARK
| C1 | U+00A1 | INVERTED EXCLAMATION MARK
| C2 | U+00AC | NOT SIGN
| C3 | U+221A | SQUARE ROOT
| C4 | U+0192 | LATIN SMALL LETTER F WITH HOOK
| C5 | U+2248 | ALMOST EQUAL TO
| C6 | U+2206 | INCREMENT
| C7 | U+00AB | LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
| C8 | U+00BB | RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
| C9 | U+2026 | HORIZONTAL ELLIPSIS
| CA | U+00A0 | NO-BREAK SPACE
| CB | U+00C0 | LATIN CAPITAL LETTER A WITH GRAVE
| CC | U+00C3 | LATIN CAPITAL LETTER A WITH TILDE
| CD | U+00D5 | LATIN CAPITAL LETTER O WITH TILDE
| CE | U+0152 | LATIN CAPITAL LIGATURE OE
| CF | U+0153 | LATIN SMALL LIGATURE OE
+----+--------+----------------------------------------------------------------
| D0 | U+2013 | EN DASH
| D1 | U+2014 | EM DASH
| D2 | U+201C | LEFT DOUBLE QUOTATION MARK
| D3 | U+201D | RIGHT DOUBLE QUOTATION MARK
| D4 | U+2018 | LEFT SINGLE QUOTATION MARK
| D5 | U+2019 | RIGHT SINGLE QUOTATION MARK
| D6 | U+00F7 | DIVISION SIGN
| D7 | U+25CA | LOZENGE
| D8 | U+00FF | LATIN SMALL LETTER Y WITH DIAERESIS
| D9 | U+0178 | LATIN CAPITAL LETTER Y WITH DIAERESIS
| DA | U+2044 | FRACTION SLASH
| DB | U+00A4 | CURRENCY SIGN
| DC | U+2039 | SINGLE LEFT-POINTING ANGLE QUOTATION MARK
| DD | U+203A | SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
| DE | U+FB01 | LATIN SMALL LIGATURE FI
| DF | U+FB02 | LATIN SMALL LIGATURE FL
+----+--------+----------------------------------------------------------------
| E0 | U+2021 | DOUBLE DAGGER
| E1 | U+00B7 | MIDDLE DOT
| E2 | U+201A | SINGLE LOW-9 QUOTATION MARK
| E3 | U+201E | DOUBLE LOW-9 QUOTATION MARK
| E4 | U+2030 | PER MILLE SIGN
| E5 | U+00C2 | LATIN CAPITAL LETTER A WITH CIRCUMFLEX
| E6 | U+00CA | LATIN CAPITAL LETTER E WITH CIRCUMFLEX
| E7 | U+00C1 | LATIN CAPITAL LETTER A WITH ACUTE
| E8 | U+00CB | LATIN CAPITAL LETTER E WITH DIAERESIS
| E9 | U+00C8 | LATIN CAPITAL LETTER E WITH GRAVE
| EA | U+00CD | LATIN CAPITAL LETTER I WITH ACUTE
| EB | U+00CE | LATIN CAPITAL LETTER I WITH CIRCUMFLEX
| EC | U+00CF | LATIN CAPITAL LETTER I WITH DIAERESIS
| ED | U+00CC | LATIN CAPITAL LETTER I WITH GRAVE
| EE | U+00D3 | LATIN CAPITAL LETTER O WITH ACUTE
| EF | U+00D4 | LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+----+--------+----------------------------------------------------------------
| F0 | U+F8FF | Apple logo (code point in the Unicode Private Use Area)
| F1 | U+00D2 | LATIN CAPITAL LETTER O WITH GRAVE
| F2 | U+00DA | LATIN CAPITAL LETTER U WITH ACUTE
| F3 | U+00DB | LATIN CAPITAL LETTER U WITH CIRCUMFLEX
| F4 | U+00D9 | LATIN CAPITAL LETTER U WITH GRAVE
| F5 | U+0131 | LATIN SMALL LETTER DOTLESS I
| F6 | U+02C6 | MODIFIER LETTER CIRCUMFLEX ACCENT
| F7 | U+02DC | SMALL TILDE
| F8 | U+00AF | MACRON
| F9 | U+02D8 | BREVE
| FA | U+02D9 | DOT ABOVE
| FB | U+02DA | RING ABOVE
| FC | U+00B8 | CEDILLA
| FD | U+02DD | DOUBLE ACUTE ACCENT
| FE | U+02DB | OGONEK
| FF | U+02C7 | CARON
+----+--------+----------------------------------------------------------------
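
The complete table can be turned into a small converter. The Python sketch
below transcribes the Part III code points row by row and treats 00 through
7F as an identity mapping. All names in it (HIGH_HALF, asr_to_unicode,
unicode_to_asr, count_in_iso_8859_1) are illustrative and do not correspond
to anything in hfsutils.

```python
# Unicode code points for ASR values 80-FF, one table block per line,
# transcribed from Part III above.
HIGH_HALF = """
00C4 00C5 00C7 00C9 00D1 00D6 00DC 00E1 00E0 00E2 00E4 00E3 00E5 00E7 00E9 00E8
00EA 00EB 00ED 00EC 00EE 00EF 00F1 00F3 00F2 00F4 00F6 00F5 00FA 00F9 00FB 00FC
2020 00B0 00A2 00A3 00A7 2022 00B6 00DF 00AE 00A9 2122 00B4 00A8 2260 00C6 00D8
221E 00B1 2264 2265 00A5 00B5 2202 2211 220F 03C0 222B 00AA 00BA 2126 00E6 00F8
00BF 00A1 00AC 221A 0192 2248 2206 00AB 00BB 2026 00A0 00C0 00C3 00D5 0152 0153
2013 2014 201C 201D 2018 2019 00F7 25CA 00FF 0178 2044 00A4 2039 203A FB01 FB02
2021 00B7 201A 201E 2030 00C2 00CA 00C1 00CB 00C8 00CD 00CE 00CF 00CC 00D3 00D4
F8FF 00D2 00DA 00DB 00D9 0131 02C6 02DC 00AF 02D8 02D9 02DA 00B8 02DD 02DB 02C7
"""

HIGH = [int(token, 16) for token in HIGH_HALF.split()]
assert len(HIGH) == 128

# Reverse lookup: Unicode character -> ASR value in 80-FF.
ENCODE = {chr(cp): 0x80 + index for index, cp in enumerate(HIGH)}


def asr_to_unicode(data: bytes) -> str:
    """Decode MacOS Standard Roman bytes using the table in this document."""
    return "".join(
        chr(value) if value < 0x80 else chr(HIGH[value - 0x80])
        for value in data
    )


def unicode_to_asr(text: str) -> bytes:
    """Encode text as MacOS Standard Roman; fail on unmappable characters."""
    out = bytearray()
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ord(ch))
        elif ch in ENCODE:
            out.append(ENCODE[ch])
        else:
            raise ValueError(f"U+{ord(ch):04X} has no ASR equivalent")
    return bytes(out)


def count_in_iso_8859_1() -> int:
    """Count the 80-FF entries whose code point lies within U+0000-U+00FF."""
    return sum(1 for cp in HIGH if cp <= 0xFF)


if __name__ == "__main__":
    print(asr_to_unicode(b"R\x8esum\x8e"))
    print(unicode_to_asr("caf\u00e9").hex())
    print(count_in_iso_8859_1(), "of", len(HIGH), "fit in ISO 8859-1")
```

With the values as transcribed here, 81 of the 128 extended characters fall
within U+0000 through U+00FF and the remaining 47 do not, which is why no
complete mapping to ISO 8859-1 exists. In the other direction, an ISO 8859-1
character such as U+00D7 (MULTIPLICATION SIGN) does not appear in the table,
so unicode_to_asr raises ValueError for it.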
===============================================================================