mirror of
https://github.com/fadden/nulib2.git
synced 2024-05-28 23:41:29 +00:00
Update NuFX Addendum
parent
fe4080a2d7
commit
eaa0ecc9c8
|
@ -199,8 +199,10 @@ disk images used for emulators is considered. At first glance, it seems
|
||||||
useful to be able to store a hierarchy of disk images. In practice, such
|
useful to be able to store a hierarchy of disk images. In practice, such
|
||||||
images would either be archived as a hierarchy of .PO files, or as an archive of
|
images would either be archived as a hierarchy of .PO files, or as an archive of
|
||||||
.SDK archives.</p>
|
.SDK archives.</p>
|
||||||
|
<p>Ultimately, the disk volume name is embedded in the disk image itself. The
|
||||||
|
name stored in the archive is purely decorative.</p>
|
||||||
<p align="left"><b>Adding/renaming</b> Applications must
|
<p align="left"><b>Adding/renaming</b> Applications must
|
||||||
strip any leading path components from disk image "storage names".
|
strip any leading path components from disk image "storage names".
|
||||||
(The NuFX specification does explicitly forbid the use of a filesystem separator
|
(The NuFX specification does explicitly forbid the use of a filesystem separator
|
||||||
character in a disk volume name.)</p>
|
character in a disk volume name.)</p>
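<p>A minimal sketch of the stripping step, assuming the record's filesystem separator is available as a one-character string. The function name and parameters are hypothetical and are not part of NuLib2 or NufxLib.</p>

```python
def strip_path_components(storage_name: str, fssep: str) -> str:
    """Return only the final component of a disk image storage name."""
    if not fssep:
        # No separator defined for this record: nothing to strip.
        return storage_name
    # Drop trailing separators first so "a/b/" yields "b" rather than "".
    trimmed = storage_name.rstrip(fssep)
    return trimmed.rsplit(fssep, 1)[-1]
```

<p>For example, a storage name of "Disks/Games/MyDisk" with a "/" separator would be reduced to "MyDisk" before being stored.</p>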
|
||||||
<p align="left"><b>Extracting:</b>
|
<p align="left"><b>Extracting:</b>
|
||||||
|
@ -272,30 +274,30 @@ For GSHK compatibility, the filename thread compThreadEOF must be the greater of
|
||||||
free space remaining after a file is renamed. However, if the filename
|
free space remaining after a file is renamed. However, if the filename
|
||||||
itself exceeds the buffer size and the thread must be rebuilt, the 8-byte
|
itself exceeds the buffer size and the thread must be rebuilt, the 8-byte
|
||||||
padding should be added.</p>
|
padding should be added.</p>
|
||||||
|
|
||||||
<p align="left"> </p>
|
<p align="left"> </p>
|
||||||
<h3 align="left">Thread ordering</h3>
|
<h3 align="left">Thread ordering</h3>
|
||||||
<p align="left">The NuFX specification does not require that threads appear in
|
<p align="left">The NuFX specification specifies a general ordering for
|
||||||
any particular order. However, writing them in a certain order can make
|
threads ("blocks must occur in the following fashion"), but doesn't indicate
|
||||||
some operations significantly easier.</p>
|
what should be done if they appear out of order. Handling out-of-order
|
||||||
|
threads isn't impossible, but it can be inconvenient.</p>
|
||||||
<p align="left">For example, if an archive is being unpacked as it is received,
|
<p align="left">For example, if an archive is being unpacked as it is received,
|
||||||
it is important to know the filename before receiving the data. If the
|
it is important to know the filename before receiving the data. If the
|
||||||
filename thread comes after the data threads, the application has to write the
|
filename thread comes after the data threads, the application has to write the
|
||||||
incoming data into a temp file, and then rename it later when the filename
|
incoming data into a temp file, and then rename it later when the filename
|
||||||
thread finally shows up. It would also be nice to be able to display file
|
thread finally shows up. It would also be nice to be able to display file
|
||||||
comments as the file is being downloaded.</p>
|
comments as the file is being downloaded.</p>
|
||||||
<p align="left"><b>Creating:</b> The filename thread must precede all other
|
<p align="left"><b>Creating:</b> The filename thread must precede all other
|
||||||
threads. The recommended (but not required) ordering for common thread
|
threads. The recommended ordering for common thread types is:</p>
|
||||||
types is:</p>
|
|
||||||
<ul>
|
<ul>
|
||||||
<li>Filename</li>
|
<li>Filename</li>
|
||||||
<li>Message(s) (i.e. comments)</li>
|
<li>Message(s) (i.e. comments)</li>
|
||||||
<li>Data fork</li>
|
<li>Data threads (data fork, resource fork, disk image)</li>
|
||||||
<li>Disk image</li>
|
|
||||||
<li>Resource fork</li>
|
|
||||||
<li>all other threads</li>
|
<li>all other threads</li>
|
||||||
</ul>
|
</ul>
|
||||||
<p align="left"><b>Extracting:</b> If the filename thread does not appear before
|
<p align="left"><b>Extracting:</b> If the filename thread does not appear before
|
||||||
the first data-class thread, the record may be ignored.</p>
|
the first data-class thread, the record may be ignored.</p>
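<p>The ordering rules can be sketched as two small helpers, one for writing and one for reading. The numeric thread class values below are assumptions taken from the NuFX specification and should be checked against it; the function names are hypothetical. Records that carry their name in the record header rather than in a filename thread would need separate handling that this sketch does not attempt.</p>

```python
from typing import Iterable

# Thread class values (assumed from the NuFX specification).
CLASS_MESSAGE = 0x0000
CLASS_CONTROL = 0x0001
CLASS_DATA = 0x0002
CLASS_FILENAME = 0x0003


def creation_order_key(thread_class: int) -> int:
    """Sort key: filename, then messages, then data threads, then the rest."""
    order = {CLASS_FILENAME: 0, CLASS_MESSAGE: 1, CLASS_DATA: 2}
    return order.get(thread_class, 3)


def filename_precedes_data(thread_classes: Iterable[int]) -> bool:
    """True if no data-class thread appears before a filename thread."""
    seen_filename = False
    for thread_class in thread_classes:
        if thread_class == CLASS_FILENAME:
            seen_filename = True
        elif thread_class == CLASS_DATA and not seen_filename:
            return False
    return True
```

<p>Because Python's sort is stable, <code>sorted(classes, key=creation_order_key)</code> keeps the relative order of threads within each group, so a data fork written before a resource fork stays that way.</p>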
|
||||||
|
|
||||||
<p align="left"> </p>
|
<p align="left"> </p>
|
||||||
<h3 align="left">Incompatible thread types</h3>
|
<h3 align="left">Incompatible thread types</h3>
|
||||||
<p align="left">There are some combinations of threads that must never appear in
|
<p align="left">There are some combinations of threads that must never appear in
|
||||||
|
@ -329,8 +331,8 @@ systems. However, certain values are significant.</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li>
|
<li>
|
||||||
<p align="left">For records with <b>only a data fork</b>, the storage type
|
<p align="left">For records with <b>only a data fork</b>, the storage type
|
||||||
must be one of 0, 1, 2, or 3. The value "2" is recommended
|
must be one of 0, 1, 2, or 3. The specific choice is not useful to
|
||||||
for applications that don't wish to mimic ProDOS behavior exactly.</li>
|
anyone, but a nonzero value (say, 1) should be used.</li>
|
||||||
<li>
|
<li>
|
||||||
<p align="left">For records with <b>a resource fork</b>, the storage type
|
<p align="left">For records with <b>a resource fork</b>, the storage type
|
||||||
must be "5" (ProDOS extended file).</li>
|
must be "5" (ProDOS extended file).</li>
|
||||||
|
@ -345,6 +347,102 @@ systems. However, certain values are significant.</p>
|
||||||
not be used.</p>
|
not be used.</p>
|
||||||
<p align="left">It is important to update the storage type as threads are added
|
<p align="left">It is important to update the storage type as threads are added
|
||||||
and deleted, so that it always accurately reflects the contents of the record.</p>
|
and deleted, so that it always accurately reflects the contents of the record.</p>
|
||||||
|
<p>The spec seems to claim that HFS volumes have 524 bytes per block (though
|
||||||
|
the assertion was weakened from "would" to "might" in the final version).
|
||||||
|
This refers to the 12 "tag" bytes available on 3.5" floppies, which are
|
||||||
|
accessible from Mac OS but not actually required by HFS.</p>
|
||||||
|
|
||||||
|
<p> </p>
|
||||||
|
<h3>GS/OS option lists and HFS file types</h3>
|
||||||
|
<p>GS/OS was designed to work with a variety of different filesystems.
|
||||||
|
Instead of trying to handle all conceivable file attributes explicitly,
|
||||||
|
GS/OS returns filesystem-specific values in "option lists". These can
|
||||||
|
be provided to the get/set file info calls when copying files around.</p>
|
||||||
|
<p>Files on HFS volumes have two four-byte values, called file type and
|
||||||
|
creator, that identify the file contents. These are part of the Macintosh
|
||||||
|
Finder info structures, called FInfo and FXInfo. Files copied from HFS
|
||||||
|
to ProDOS may have this data stored in the extended key block of a forked
|
||||||
|
file (see ProDOS technical note #25). This appears as two 18-byte chunks,
|
||||||
|
consisting of a size byte followed by a type byte, and then 16 bytes of
|
||||||
|
FInfo or FXInfo data (which are defined in <i>Inside Macintosh: Macintosh
|
||||||
|
Toolbox Essentials</i>, page 7-47). To expose the data to applications,
|
||||||
|
certain GS/OS calls pass an "option list" with the contents. Most of
|
||||||
|
the fields are uninteresting to anything but the Mac Finder on the system
|
||||||
|
where the files were stored, so for our purposes the option list may be
|
||||||
|
viewed simply as a way to preserve the file type and creator.</p>
|
||||||
|
|
||||||
|
<p>Experiments with the GS/OS Exerciser reveal that the option list returned
|
||||||
|
doesn't include the size/type bytes. For an HFS file copied to ProDOS
|
||||||
|
with GS/OS, the GetFileInfo call returns a 32-byte buffer that begins
|
||||||
|
with FInfo. When called on an HFS volume, the option list is 36 bytes,
|
||||||
|
with the last four bytes set to 02 00 00 00. GSHK appears to record these
|
||||||
|
exactly as it receives them, which means the first four bytes hold the
|
||||||
|
HFS file type, and the second four bytes hold the HFS creator, in
|
||||||
|
big-endian byte order. Because most of the fields only have meaning to the
|
||||||
|
Macintosh Finder, the rest of the data is zeroes. Files archived from an
|
||||||
|
HFS volume created by a Macintosh would presumably have nonzero data in
|
||||||
|
more places.</p>
|
||||||
|
<p>When archiving files from an HFS volume under GS/OS, GSHK records the
|
||||||
|
ProDOS type/auxtype rather than the full HFS file type and creator,
|
||||||
|
because that's what the GS/OS file info query returns. The only way to
|
||||||
|
recover the original Mac Finder types is through the option list.</p>
|
||||||
|
<p>Sometimes the option list found in a NuFX archive is a little messed up,
|
||||||
|
e.g. the size field says 36 bytes, but there's only space for 18 bytes in
|
||||||
|
the record header.</p>
|
||||||
|
<p>Side note: the NuFX specification reversed the values of MFS and HFS
|
||||||
|
in the file_sys_id enumeration. In practice, GS/ShrinkIt
|
||||||
|
correctly uses the GS/OS FST definitions: MFS=5, HFS=6.</p>
|
||||||
|
<p><b>Opening:</b> Assume the option_size field is correct
|
||||||
|
unless it exceeds attrib_count-2. If it's too large, clip it down to size.
|
||||||
|
If the filesystem type is ProDOS or HFS, the option list is at least 16 bytes
|
||||||
|
long, and the second 4 bytes are nonzero,
|
||||||
|
use the first 4 bytes of the option list data as the file type and
|
||||||
|
the second 4 bytes as the creator. If a secondary test is desired to
|
||||||
|
avoid garbage, the creator value is usually ASCII.</p>
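<p>A rough sketch of the "Opening" rule, applying the attrib_count-2 limit literally as stated above. The function names and parameters are hypothetical; <code>data</code> is assumed to hold the bytes that follow the option_size field, and clipping to <code>len(data)</code> is an extra defensive step that the text does not call for.</p>

```python
from typing import Optional, Tuple


def read_hfs_types(option_size: int, attrib_count: int, data: bytes,
                   fs_is_prodos_or_hfs: bool) -> Optional[Tuple[bytes, bytes]]:
    """Return (file_type, creator) from an option list, or None."""
    # Clip an oversized size field instead of trusting it.
    limit = max(attrib_count - 2, 0)
    size = min(option_size, limit, len(data))
    if not fs_is_prodos_or_hfs or size < 16:
        return None
    file_type, creator = data[0:4], data[4:8]
    if creator == b"\x00\x00\x00\x00":
        return None
    return file_type, creator


def creator_looks_ascii(creator: bytes) -> bool:
    """Optional secondary test: every byte is printable ASCII."""
    return all(0x20 <= b < 0x7F for b in creator)
```

<p>Returning the two values as raw 4-byte strings avoids committing to an integer byte order at this stage; the caller can convert them if it needs numeric values.</p>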
|
||||||
|
<p><b>Creating:</b> If a record has HFS type values, generate a
|
||||||
|
filesystem-specific option list (32 bytes for ProDOS or 36 bytes for HFS)
|
||||||
|
and store them there.</p>
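<p>One way to generate such an option list is sketched below. It assumes the layout observed above: file type and creator in the first eight bytes, zeroes for the remaining Finder info fields, and, for HFS, the trailing 02 00 00 00 that GS/OS was seen to return. The function name is hypothetical, and whether consumers depend on those trailing bytes is not established here.</p>

```python
def build_option_list(file_type: bytes, creator: bytes, for_hfs: bool) -> bytes:
    """Build a 32-byte (ProDOS) or 36-byte (HFS) option list body."""
    if len(file_type) != 4 or len(creator) != 4:
        raise ValueError("file type and creator must be 4 bytes each")
    # FInfo + FXInfo area: types first, all other fields left as zero.
    body = (file_type + creator).ljust(32, b"\x00")
    if for_hfs:
        # Trailing bytes as observed in option lists from HFS volumes.
        body += b"\x02\x00\x00\x00"
    return body
```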
|
||||||
|
<p><b>Updating:</b> Always output the actual record size. Do not propagate
|
||||||
|
incorrect size values. Retaining option lists for ProDOS and HFS entries
|
||||||
|
is required, since they may have the only copy of the original file type
|
||||||
|
and creator, but only if at least one of the first 8 bytes of the option
|
||||||
|
list is nonzero. Updates to the archive attributes that alter the file/aux
|
||||||
|
type should usually retain the option list, since the purpose may be to
|
||||||
|
improve ProDOS usability without losing the original type information.</p>
|
||||||
|
|
||||||
|
<p> </p>
|
||||||
|
<h3>ProDOS vs. HFS file types</h3>
|
||||||
|
<p>The initial release of the specification stated that the HFS file type and
|
||||||
|
creator should be stored in the record header. The final version of the
|
||||||
|
specification abdicates responsibility for defining the field, stating simply,
|
||||||
|
"For ProDOS 8 or GS/OS, this field should always be what the operating system
|
||||||
|
returns when asked".</p>
|
||||||
|
<p>For reference, when an application asks GS/OS to get the information for
|
||||||
|
a file on an HFS volume, it returns a ProDOS file type and aux type (usually
|
||||||
|
BIN), and puts the HFS type and creator into an option list. If this
|
||||||
|
behavior defines the field, then this is how the types should be stored.</p>
|
||||||
|
<p>However, the vague wording of the specification raises the possibility that
|
||||||
|
a Mac OS-based archiver should store the file type and creator directly in
|
||||||
|
the record header, because that's what "the operating system" returned. The
|
||||||
|
record header does not provide a way to define the source of the type values,
|
||||||
|
so an extraction program attempting to set the file info would need to draw
|
||||||
|
conclusions based on whether the types are small enough values to be valid
|
||||||
|
for ProDOS.</p>
|
||||||
|
<p>It's worth noting that files on an AppleShare volume have independent
|
||||||
|
ProDOS and HFS file types. When a ProDOS file is written to the AppleShare
|
||||||
|
FST, Mac OS type and creator values are generated according to a scheme
|
||||||
|
documented in the AppleShare FST public ERS document. It's possible that a
|
||||||
|
Mac archiver could store ProDOS file types encoded as HFS type and creator
|
||||||
|
values, which would then have to be decoded based on a collection of
|
||||||
|
rules.</p>
|
||||||
|
<p>To avoid ambiguity, we want to follow the GS/OS behavior, regardless of
|
||||||
|
what the host operating system does.</p>
|
||||||
|
<p><b>Creating:</b> store the ProDOS file type and aux type in the record
|
||||||
|
header. For files on HFS volumes, put a simple ProDOS type (BIN or TXT)
|
||||||
|
in the record header, and put the file type and creator in an option list.</p>
|
||||||
|
<p><b>Extracting:</b> if the file type and aux type do not fit in 8 and 16
|
||||||
|
bits, respectively, treat them as values from HFS.</p>
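<p>The extraction test amounts to a range check on the two record header fields. The sketch below reads the rule as "either value is too large for ProDOS", which is one interpretation of the wording; the function name is hypothetical.</p>

```python
def header_types_are_hfs(file_type: int, aux_type: int) -> bool:
    """True if the record header values cannot be ProDOS type/auxtype."""
    # ProDOS file types fit in 8 bits and aux types in 16 bits.
    return file_type > 0xFF or aux_type > 0xFFFF
```

<p>A four-character HFS type will always have a nonzero byte above the low 8 bits unless it begins with three zero bytes, so in practice the check separates the two cases cleanly.</p>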
|
||||||
|
|
||||||
<p align="left"> </p>
|
<p align="left"> </p>
|
||||||
<h3 align="left">Disk block size and block count</h3>
|
<h3 align="left">Disk block size and block count</h3>
|
||||||
<p align="left">For a compressed disk image, the "storage_type" and
|
<p align="left">For a compressed disk image, the "storage_type" and
|
||||||
|
@ -449,58 +547,6 @@ proportional font, so there is no need to worry about formatting to preserve &qu
|
||||||
comments.</p>
|
comments.</p>
|
||||||
<p align="left"> </p>
|
<p align="left"> </p>
|
||||||
|
|
||||||
<h3>GS/OS option lists and HFS file types</h3>
|
|
||||||
<p>Files on HFS volumes have two four-byte values, called file type and
|
|
||||||
creator, that identify the file contents. These are part of the Macintosh
|
|
||||||
Finder info structures, called FInfo and FXInfo. Files copied from HFS
|
|
||||||
to ProDOS may have this data stored in the extended key block of a forked
|
|
||||||
file (see ProDOS technical note #25). This appears as two 18-byte chunks,
|
|
||||||
consisting of a size byte followed by a type byte, and then 16 bytes of
|
|
||||||
FInfo or FXInfo data (which are defined in <i>Inside Macintosh: Macintosh
|
|
||||||
Toolbox Essentials</i>, page 7-47). To expose the data to applications,
|
|
||||||
certain GS/OS calls pass an "option list" with the contents. Most of
|
|
||||||
the fields are uninteresting to anything but the Mac Finder, so for our
|
|
||||||
purposes the option list may be viewed simply as a way to preserve the
|
|
||||||
file type and creator.</p>
|
|
||||||
|
|
||||||
<p>Experiments with the GS/OS exerciser reveal that the option list doesn't
|
|
||||||
include the size/type bytes. For an HFS file copied to ProDOS with GS/OS,
|
|
||||||
the GetFileInfo call returns a 32-byte buffer that begins with FInfo.
|
|
||||||
When called on an HFS volume, the option list is 36 bytes, with the last
|
|
||||||
four bytes set to 02 00 00 00. GSHK appears to record these exactly as
|
|
||||||
it receives them, which means the first four bytes hold the HFS file type,
|
|
||||||
and the second four bytes hold the HFS creator. Because most of the
|
|
||||||
fields only have meaning to the Macintosh finder, the rest of the data is
|
|
||||||
zeroes. Files archived from an HFS volume created by a Macintosh would
|
|
||||||
presumably have nonzero data in more places.</p>
|
|
||||||
|
|
||||||
<p>Sometimes the option list is a little messed up, e.g. the size field
|
|
||||||
says 36 bytes, but there's only space for 18 bytes in the record header.</p>
|
|
||||||
<p>When archiving files from an HFS volume under GS/OS, GSHK records the
|
|
||||||
ProDOS type/auxtype rather than the full HFS file type and creator,
|
|
||||||
because that's what GS/OS provides. The only way to
|
|
||||||
recover the original Mac Finder types is through the option list.</p>
|
|
||||||
<p>Side note: the NuFX specification reversed the values of MFS and HFS
|
|
||||||
in the file_sys_id enumeration. In practice, GS/ShrinkIt
|
|
||||||
correctly uses the GS/OS FST definitions: MFS=5, HFS=6.</p>
|
|
||||||
<p><b>Opening:</b> Assume the option_size field is correct
|
|
||||||
unless it exceeds attrib_count-2. If it's too large, clip it down to size.
|
|
||||||
If the filesystem type is ProDOS or HFS, and the second 4 bytes are nonzero,
|
|
||||||
use the first 4 bytes of the option list data as the file type and
|
|
||||||
the second 4 bytes as the creator. If a secondary test is desired to
|
|
||||||
avoid garbage, the creator value is usually ASCII.</p>
|
|
||||||
<p><b>Creating:</b> The specification says that applications should store
|
|
||||||
whatever the OS gives them, which means putting a ProDOS type/auxtype in the
|
|
||||||
record header, and generating a filesystem-specific option list with the
|
|
||||||
HFS Finder types (i.e. 32 bytes for ProDOS or 36 bytes for HFS). Simply
|
|
||||||
storing the four-byte HFS types in the record header is not allowed.</p>
|
|
||||||
<p><b>Updating:</b> Always output the actual record size. Do not propagate
|
|
||||||
incorrect size values. Retaining option lists for ProDOS and HFS entries
|
|
||||||
is required, since they may have the only copy of the original file type
|
|
||||||
and creator. Updates to the archive attributes that alter the file/aux
|
|
||||||
type should usually retain the option list, since the purpose may be to
|
|
||||||
improve ProDOS usability without losing the original type information.</p>
|
|
||||||
|
|
||||||
<p align="left"> </p>
|
<p align="left"> </p>
|
||||||
<h3 align="left">Master EOF</h3>
|
<h3 align="left">Master EOF</h3>
|
||||||
<p align="left">For the most part, ShrinkIt correctly sets the MasterEOF field
|
<p align="left">For the most part, ShrinkIt correctly sets the MasterEOF field
|
||||||
|
@ -554,27 +600,34 @@ got left behind. If a record has two filenames, they'd better have the
|
||||||
same fssep char, or interpreting one of them will be impossible. (This is
|
same fssep char, or interpreting one of them will be impossible. (This is
|
||||||
one of the reasons why it's important to clearly define which filename takes
|
one of the reasons why it's important to clearly define which filename takes
|
||||||
precedence in all circumstances.)</p>
|
precedence in all circumstances.)</p>
|
||||||
|
|
||||||
<h3 align="left">Files with zero or two CRCs</h3>
|
<h3 align="left">Files with zero or two CRCs</h3>
|
||||||
<p align="left">The "threadCRC" field in the thread header block can
|
<p align="left">The "threadCRC" field in the thread header block can
|
||||||
have one of three meanings: nothing (v0, v1), the CRC of the compressed data
|
have one of three meanings: nothing (v0, v1), the CRC of the compressed data
|
||||||
(v2), or the CRC of the uncompressed data (v3). The version 2 meaning
|
(v2), or the CRC of the uncompressed data (v3). Version 2 records weren't
|
||||||
wasn't used in anything significant, and can be ignored.</p>
|
generated by anything significant, and can be ignored. (If you actually find
|
||||||
|
an archive with v2 records, it's reasonable to just treat them as v1.)</p>
|
||||||
<p align="left">Version 1 records generally have threads compressed with LZW/1
|
<p align="left">Version 1 records generally have threads compressed with LZW/1
|
||||||
data. The LZW/1 compression format includes a 16-bit CRC at the start of
|
data. The LZW/1 compression format includes the 16-bit CRC of the uncompressed
|
||||||
the thread. Version 3 records generally have threads compressed with LZW/2
|
data at the start of the thread. Version 3 records generally have threads compressed
|
||||||
data, which does not include a CRC.</p>
|
with LZW/2 data, which does not include a CRC.</p>
|
||||||
<p align="left">Applications like P8 ShrinkIt and NuLib creation v1 records and
|
<p align="left">Applications like P8 ShrinkIt and NuLib create v1 records and
|
||||||
compress with LZW/1, while GS/ShrinkIt and NuLib2 create v3 records and compress
|
compress with LZW/1, while GS/ShrinkIt and NuLib2 create v3 records and compress
|
||||||
with LZW/2. This means that each compressed thread has exactly one CRC.
|
with LZW/2. This means that each compressed thread has exactly one CRC.
|
||||||
|
(Uncompressed data stored by P8 ShrinkIt has no CRC at all.)
|
||||||
So what happens if you tell NuLib2 to create a new record with
|
So what happens if you tell NuLib2 to create a new record with
|
||||||
LZW/1, or tell it to add a new LZW/2 thread to an existing v1 record?</p>
|
LZW/1, or tell it to add a new LZW/2 thread to an existing v1 record?</p>
|
||||||
<p align="left">In one case, you end up with two CRCS; in the other, you end up
|
<p align="left">In one case, you end up with two CRCs; in the other, you end up
|
||||||
with no CRC on your data at all. For some bizarre reason, the v3 thread
|
with no CRC on your data at all. Unfortunately, the v3 thread
|
||||||
CRC is computed with a different initial value, so it is necessary to compute
|
CRC is computed with a different initial value, so it is necessary to compute
|
||||||
the CRC twice, not merely store the same value twice.</p>
|
the CRC twice for LZW/1 data, not merely store the same value twice.</p>
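<p>To illustrate why the value cannot simply be copied, here is a sketch of a 16-bit CRC with a selectable initial value. It assumes the CRC-16 polynomial 0x1021 (the XMODEM variant), a seed of 0x0000 for the CRC embedded in LZW/1 data, and a seed of 0xFFFF for the v3 thread CRC. Those specifics are assumptions that should be checked against the NuFX specification or NufxLib, and the sketch ignores exactly which bytes each CRC covers.</p>

```python
def crc16(data: bytes, initial: int) -> int:
    """Bitwise CRC-16, polynomial 0x1021, MSB first, no final XOR."""
    crc = initial & 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


# Assumed seeds; see the note above.
LZW1_CRC_SEED = 0x0000
V3_THREAD_CRC_SEED = 0xFFFF
```

<p>The same input gives different results under the two seeds: the standard check string "123456789" yields 0x31C3 with a seed of 0x0000 and 0x29B1 with a seed of 0xFFFF, so both passes over the data are needed.</p>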
|
||||||
<p align="left">Please select your compression methods appropriately.
|
<p>When replacing a data thread in an existing record, it's tempting to
|
||||||
Also, bear in mind that uncompressed data stored with P8 ShrinkIt has no CRC
|
update the record to the latest (v3), but this may come at a cost. For
|
||||||
whatsoever.</p>
|
example, if the record has both resource and data forks, and only the data fork
|
||||||
|
is being replaced, it would be necessary to uncompress the resource fork to
|
||||||
|
calculate its uncompressed CRC. Programs that rewrite records should be
|
||||||
|
prepared to output v1 or v3.</p>
|
||||||
|
|
||||||
<h3 align="left">Extra data in compressed threads</h3>
|
<h3 align="left">Extra data in compressed threads</h3>
|
||||||
<p align="left">ShrinkIt adds an extra byte at the end of all LZW compressed
|
<p align="left">ShrinkIt adds an extra byte at the end of all LZW compressed
|
||||||
data, probably due to an off-by-one bug in the compression code. It turns
|
data, probably due to an off-by-one bug in the compression code. It turns
|
||||||
|
|