From eaa0ecc9c8e97eb7b69ccce833aeeaf8cfeab7db Mon Sep 17 00:00:00 2001
From: Andy McFadden
Ultimately, the disk volume name is embedded in the disk image itself. The +name stored in the archive is purely decorative.
Adding/renaming Applications must -strip any leading path components from disk image "storage names". +strip any leading path components from disk image "storage names" (The NuFX specification does explicitly forbid the use of a filesystem separator character in a disk volume name.)
Extracting: @@ -272,30 +274,30 @@ For GSHK compatibility, the filename thread compThreadEOF must be the greater of free space remaining after a file is renamed. However, if the filename itself exceeds the buffer size and the thread must be rebuilt, the 8-byte padding should be added.
+
The NuFX specification does not require that threads appear in -any particular order. However, writing them in a certain order can make -some operations significantly easier.
+The NuFX specification defines a general ordering for +threads ("blocks must occur in the following fashion"), but doesn't indicate +what should be done if they appear out of order. Handling out-of-order +threads isn't impossible, but it can be inconvenient.
For example, if an archive is being unpacked as it is received, -it is important to know the filename before receiving the data. If the +it is important to know the filename before receiving the data. If the filename thread comes after the data threads, the application has to write the incoming data into a temp file, and then rename it later when the filename -thread finally shows up. It would also be nice to be able to display file +thread finally shows up. It would also be nice to be able to display file comments as the file is being downloaded.
Creating: The filename thread must precede all other -threads. The recommended (but not required) ordering for common thread -types is:
+threads. The recommended ordering for common thread types is:Extracting: If the filename thread does not appear before the first data-class thread, the record may be ignored.
+
There are some combinations of threads that must never appear in @@ -329,8 +331,8 @@ systems. However, certain values are significant.
For records with only a data fork, the storage type - must be one of 0, 1, 2, or 3. The value "2" is recommended - for applications that don't wish to mimic ProDOS behavior exactly.
For records with a resource fork, the storage type must be "5" (ProDOS extended file).
It is important to update the storage type as threads are added and deleted, so that it always accurately reflects the contents of the record.
+The spec seems to claim that HFS volumes have 524 bytes per block (though +the assertion was weakened from "would" to "might" in the final version). +This refers to the 12 "tag" bytes available on 3.5" floppies, which are +accessible from Mac OS but not actually required by HFS.
+ ++
GS/OS was designed to work with a variety of different filesystems. +Instead of trying to handle all conceivable file attributes explicitly, +GS/OS returns filesystem-specific values in "option lists". These can +be provided to the get/set file info calls when copying files around.
+Files on HFS volumes have two four-byte values, called file type and +creator, that identify the file contents. These are part of the Macintosh +Finder info structures, called FInfo and FXInfo. Files copied from HFS +to ProDOS may have this data stored in the extended key block of a forked +file (see ProDOS technical note #25). This appears as two 18-byte chunks, +consisting of a size byte followed by a type byte, and then 16 bytes of +FInfo or FXInfo data (which are defined in Inside Macintosh: Macintosh +Toolbox Essentials, page 7-47). To expose the data to applications, +certain GS/OS calls pass an "option list" with the contents. Most of +the fields are uninteresting to anything but the Mac Finder on the system +where the files were stored, so for our purposes the option list may be +viewed simply as a way to preserve the file type and creator.
+ +Experiments with the GS/OS Exerciser reveal that the option list returned +doesn't include the size/type bytes. For an HFS file copied to ProDOS +with GS/OS, the GetFileInfo call returns a 32-byte buffer that begins +with FInfo. When called on an HFS volume, the option list is 36 bytes, +with the last four bytes set to 02 00 00 00. GSHK appears to record these +exactly as it receives them, which means the first four bytes hold the +HFS file type, and the second four bytes hold the HFS creator, in +big-endian byte order. Because most of the fields only have meaning to the +Macintosh Finder, the rest of the data is zeroes. Files archived from an +HFS volume created by a Macintosh would presumably have nonzero data in +more places.
+When archiving files from an HFS volume under GS/OS, GSHK records the +ProDOS type/auxtype rather than the full HFS file type and creator, +because that's what the GS/OS file info query returns. The only way to +recover the original Mac Finder types is through the option list.
+Sometimes the option list found in a NuFX archive is a little messed up, +e.g. the size field says 36 bytes, but there's only space for 18 bytes in +the record header.
+Side note: the NuFX specification reversed the values of MFS and HFS +in the file_sys_id enumeration. In practice, GS/ShrinkIt +correctly uses the GS/OS FST definitions: MFS=5, HFS=6.
+Opening: Assume the option_size field is correct +unless it exceeds attrib_count-2. If it's too large, clip it down to size. +If the filesystem type is ProDOS or HFS, the option list is at least 16 bytes +long, and the second 4 bytes are nonzero, +use the first 4 bytes of the option list data as the file type and +the second 4 bytes as the creator. If a secondary test is desired to +avoid garbage, the creator value is usually ASCII.
+Creating: If a record has HFS type values, generate a +filesystem-specific option list (32 bytes for ProDOS or 36 bytes for HFS) +and store them there.
+Updating: Always output the actual record size. Do not propagate +incorrect size values. Retaining option lists for ProDOS and HFS entries +is required, since they may have the only copy of the original file type +and creator, but only if at least one of the first 8 bytes of the option +list is nonzero. Updates to the archive attributes that alter the file/aux +type should usually retain the option list, since the purpose may be to +improve ProDOS usability without losing the original type information.
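The "Opening" rule above can be sketched roughly as follows. This is a minimal illustration, not code from any existing archiver: the function name, the `available` parameter (the number of bytes the record header actually has room for, which the caller is assumed to have derived from attrib_count), and the `require_ascii` flag are all hypothetical. The file_sys_id value 6 for HFS comes from the side note above; the value 1 for ProDOS is assumed from the GS/OS FST definitions.

```python
FS_PRODOS = 1   # assumed GS/OS FST ID for ProDOS
FS_HFS = 6      # GS/OS FST ID for HFS, as used by GS/ShrinkIt


def hfs_types_from_option_list(file_sys_id, option_size, data, available,
                               require_ascii=False):
    """Return (file_type, creator) as 4-byte strings, or None.

    option_size is the size field from the record; available is the
    space the record header really has for the option list.
    """
    # Trust option_size unless it is too large, then clip it down.
    size = min(option_size, available, len(data))

    if file_sys_id not in (FS_PRODOS, FS_HFS):
        return None
    if size < 16:
        return None

    file_type = bytes(data[0:4])   # big-endian, as recorded by GSHK
    creator = bytes(data[4:8])
    if creator == b"\x00\x00\x00\x00":
        return None

    # Optional secondary test: creators are usually printable ASCII.
    if require_ascii and not all(0x20 <= b <= 0x7E for b in creator):
        return None

    return file_type, creator
```

A record whose size field claims 36 bytes but only has room for 18 still yields usable types, because the first 16 bytes are intact after clipping.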
+ ++
The initial release of the specification stated that the HFS file type and +creator should be stored in the record header. The final version of the +specification abdicates responsibility for defining the field, stating simply, +"For ProDOS 8 or GS/OS, this field should always be what the operating system +returns when asked".
+For reference, when an application asks GS/OS to get the information for +a file on an HFS volume, it returns a ProDOS file type and aux type (usually +BIN), and puts the HFS type and creator into an option list. If this +behavior defines the field, then this is how the types should be stored.
+However, the vague wording of the specification raises the possibility that +a Mac OS-based archiver should store the file type and creator directly in +the record header, because that's what "the operating system" returned. The +record header does not provide a way to define the source of the type values, +so an extraction program attempting to set the file info would need to draw +conclusions based on whether the types are small enough values to be valid +for ProDOS.
+It's worth noting that files on an AppleShare volume have independent +ProDOS and HFS file types. When a ProDOS file is written to the AppleShare +FST, Mac OS type and creator values are generated according to a scheme +documented in the AppleShare FST public ERS document. It's possible that a +Mac archiver could store HFS file types that are really encoded +ProDOS file types, which must be decoded based on a collection of +rules.
+To avoid ambiguity, we want to follow the GS/OS behavior, regardless of +what the host operating system does.
+Creating: store the ProDOS file type and aux type in the record +header. For files on HFS volumes, put a simple ProDOS type (BIN or TXT) +in the record header, and put the file type and creator in an option list.
+Extracting: if the file type and aux type do not fit in 8 and 16 +bits, respectively, treat them as values from HFS.
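The "Extracting" test can be expressed as a short check. This is only a sketch under stated assumptions: the function name is hypothetical, and it assumes that a value out of range in either field is enough to treat both as HFS values, which the rule above does not spell out.

```python
def classify_header_types(file_type, aux_type):
    """Decide whether record header type values look like ProDOS or HFS.

    Both arguments are the integer values read from the record header.
    """
    # ProDOS file types fit in 8 bits, aux types in 16 bits.
    if file_type <= 0xFF and aux_type <= 0xFFFF:
        return "prodos"
    # Assumption: either field being too large marks the pair as HFS.
    return "hfs"
```

For example, a BIN file with aux type $2000 is classified as ProDOS, while any value that needs more than the expected number of bits is classified as HFS.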
+
For a compressed disk image, the "storage_type" and @@ -449,58 +547,6 @@ proportional font, so there is no need to worry about formatting to preserve &qu comments.
-
Files on HFS volumes have two four-byte values, called file type and -creator, that identify the file contents. These are part of the Macintosh -Finder info structures, called FInfo and FXInfo. Files copied from HFS -to ProDOS may have this data stored in the extended key block of a forked -file (see ProDOS technical note #25). This appears as two 18-byte chunks, -consisting of a size byte followed by a type byte, and then 16 bytes of -FInfo or FXInfo data (which are defined in Inside Macintosh: Macintosh -Toolbox Essentials, page 7-47). To expose the data to applications, -certain GS/OS calls pass an "option list" with the contents. Most of -the fields are uninteresting to anything but the Mac Finder, so for our -purposes the option list may be viewed simply as a way to preserve the -file type and creator.
- -Experiments with the GS/OS exerciser reveal that the option list doesn't -include the size/type bytes. For an HFS file copied to ProDOS with GS/OS, -the GetFileInfo call returns a 32-byte buffer that begins with FInfo. -When called on an HFS volume, the option list is 36 bytes, with the last -four bytes set to 02 00 00 00. GSHK appears to record these exactly as -it receives them, which means the first four bytes hold the HFS file type, -and the second four bytes hold the HFS creator. Because most of the -fields only have meaning to the Macintosh finder, the rest of the data is -zeroes. Files archived from an HFS volume created by a Macintosh would -presumably have nonzero data in more places.
- -Sometimes the option list is a little messed up, e.g. the size field -says 36 bytes, but there's only space for 18 bytes in the record header.
-When archiving files from an HFS volume under GS/OS, GSHK records the -ProDOS type/auxtype rather than the full HFS file type and creator, -because that's what GS/OS provides. The only way to -recover the original Mac Finder types is through the option list.
-Side note: the NuFX specification reversed the values of MFS and HFS -in the file_sys_id enumeration. In practice, GS/ShrinkIt -correctly uses the GS/OS FST definitions: MFS=5, HFS=6.
-Opening: Assume the option_size field is correct -unless it exceeds attrib_count-2. If it's too large, clip it down to size. -If the filesystem type is ProDOS or HFS, and the second 4 bytes are nonzero, -use the first 4 bytes of the option list data as the file type and -the second 4 bytes as the creator. If a secondary test is desired to -avoid garbage, the creator value is usually ASCII.
-Creating: The specification says that applications should store -whatever the OS gives them, which means putting a ProDOS type/auxtype in the -record header, and generating a filesystem-specific option list with the -HFS Finder types (i.e. 32 bytes for ProDOS or 36 bytes for HFS). Simply -storing the four-byte HFS types in the record header is not allowed.
-Updating: Always output the actual record size. Do not propagate -incorrect size values. Retaining option lists for ProDOS and HFS entries -is required, since they may have the only copy of the original file type -and creator. Updates to the archive attributes that alter the file/aux -type should usually retain the option list, since the purpose may be to -improve ProDOS usability without losing the original type information.
-
For the most part, ShrinkIt correctly sets the MasterEOF field @@ -554,27 +600,34 @@ got left behind. If a record has two filenames, they'd better have the same fssep char, or interpreting one of them will be impossible. (This is one of the reasons why it's important to clearly define which filename takes precedence in all circumstances.)
+The "threadCRC" field in the thread header block can have one of three meanings: nothing (v0, v1), the CRC of the compressed data -(v2), or the CRC of the uncompressed data (v3). The version 2 meaning -wasn't used in anything significant, and can be ignored.
+(v2), or the CRC of the uncompressed data (v3). Version 2 records weren't +generated by anything significant, and can be ignored. (If you actually find +an archive with v2 records, it's reasonable to just treat them as v1.)Version 1 records generally have threads compressed with LZW/1 -data. The LZW/1 compression format includes a 16-bit CRC at the start of -the thread. Version 3 records generally have threads compressed with LZW/2 -data, which does not include a CRC.
-Applications like P8 ShrinkIt and NuLib creation v1 records and +data. The LZW/1 compression format includes the 16-bit CRC of the uncompressed +data at the start of the thread. Version 3 records generally have threads compressed +with LZW/2 data, which does not include a CRC.
+Applications like P8 ShrinkIt and NuLib create v1 records and compress with LZW/1, while GS/ShrinkIt and NuLib2 create v3 records and compress -with LZW/2. This means that each compressed thread has exactly one CRC. +with LZW/2. This means that each compressed thread has exactly one CRC. +(Uncompressed data stored by P8 ShrinkIt has no CRC at all.) So what happens if you tell NuLib2 to create a new record with LZW/1, or tell it to add a new LZW/2 thread to an existing v1 record?
-In one case, you end up with two CRCS; in the other, you end up -with no CRC on your data at all. For some bizarre reason, the v3 thread +
In one case, you end up with two CRCs; in the other, you end up +with no CRC on your data at all. Unfortunately, the v3 thread CRC is computed with a different initial value, so it is necessary to compute -the CRC twice, not merely store the same value twice.
-Please select your compression methods appropriately. -Also, bear in mind that uncompressed data stored with P8 ShrinkIt has no CRC -whatsoever.
+the CRC twice for LZW/1 data, not merely store the same value twice. +When replacing a data thread in an existing record, it's tempting to +update the record to the latest (v3), but this may come at a cost. For +example, if the record has both resource and data forks, and only the data fork +is being replaced, it would be necessary to uncompress the resource fork to +calculate its uncompressed CRC. Programs that rewrite records should be +prepared to output v1 or v3.
+ShrinkIt adds an extra byte at the end of all LZW compressed data, probably due to an off-by-one bug in the compression code. It turns