diff --git a/library/nufx-addendum.htm b/library/nufx-addendum.htm index 433beb8..a61247f 100644 --- a/library/nufx-addendum.htm +++ b/library/nufx-addendum.htm @@ -24,7 +24,7 @@ -
NuFX Addendum - By Andy McFadden - Last revised 2022/10/26
+
NuFX Addendum - By Andy McFadden - Last revised 2022/11/06

This addendum clarifies and extends certain aspects of the NuFX specification. This is not an "official" modification @@ -140,7 +140,7 @@ The thread filename takes precedence over the record header filename.  

Filename character set

Filenames in NuFX archives use the Mac OS Roman character set, which is ASCII plus some symbols and the usual set of latin language characters -(see +(see Unicode definition).  The NuFX filename definition was intended to accommodate files from HFS volumes, which may contain any character except ':'.  Control characters, including NUL ('\0'), were allowed but discouraged. @@ -454,35 +454,45 @@ in the record header, and put the file type and creator in an option list.

bits, respectively, treat them as values from HFS.

 

-

Disk block size and block count

-

For a compressed disk image, the "storage_type" and -"extra_type" fields take on a different meaning, notably the block -size (typically 512) and block count (e.g. 280 for a 140K disk) of the disk.

-

These fields are more important than you might expect, because -some older versions of ShrinkIt would set the thread EOF to a strange value like -68096 (which, curiously enough, is 133 * 512).  These same versions of -ShrinkIt tended to leave the "storage_type" set to 2.  -Apparently, ShrinkIt just used extra_type * 512 as the uncompressed size when -trying to figure out what sort of disk it had.  An early version of +

Disk image size values

+

For a compressed disk image, the "storage_type" and +"extra_type" fields take on different meanings: the storage_type +field holds the block size (usually 512), and the extra_type field holds +the block count (e.g. 280 for a 140KB disk).

+

These fields are more important than you might expect, because +ShrinkIt doesn't appear to set the thread EOF value for disk images. (A quick +test with ShrinkIt v3.4 on a 5.25" DOS disk yielded a thread EOF of zero, +while GS/ShrinkIt v1.1 on a 3.5" ProDOS disk generated a mysterious +thread EOF of $4a00.) +Worse, some older versions of ShrinkIt tended to leave the +"storage_type" set to 2. +Apparently, ShrinkIt just uses extra_type * 512 as the uncompressed size when +trying to figure out what sort of disk it has. An early version of GS/ShrinkIt went one step further: it used a block count of 280 with a block size of 256, resulting in archives that apparently held 70K disk images.

-

It is simple enough to disregard the thread EOF value, and +

It is simple enough to disregard the thread EOF value, and replace the storage_type when it is absurdly small, but there is a deeper -problem.  If you delete a 140K disk image thread and replace it with an -800K disk image thread, the block count stored in the extra_type no longer -accurately reflects the contents of the record.  (This linkage between the +problem. If you delete a 140KB disk image thread and replace it with an +800KB disk image thread, the block count stored in the extra_type no longer +accurately reflects the contents of the record. (This linkage between the record header and the thread contents is the reason why this document forbids mixing of disk image threads with any other data-class thread, including other disk images.)

-

Creating: Applications must update the extra_type -whenever a disk image thread is added.  The value (storage_type * -extra_type) must be equal to the uncompressed size.  The application may -wish to reject threads that are not a multiple of 512 bytes.

-

Extracting: The application must normalize storage_type -to 512 if it is less than 16 (0x0f is the largest possible ProDOS storage -type).  The value storage_type * extra_type must then be used as the -uncompressed size.  If the uncompressed size is zero, the thread may be -ignored.

+

Because the length of the disk image thread can only be determined from +the extra_type field, it is important for applications that support changing +the file and aux types to prevent such changes in records with disk images.

+

Creating: Applications must update the record's storage_type and +extra_type fields whenever a disk image thread is added. The value +(storage_type * extra_type) must be equal to the uncompressed size. The +application should reject disk image files that are not a multiple of +512 bytes. For consistency with other applications, the thread EOF field +should be zeroed.

+

Extracting: The application must ignore the thread EOF, and +normalize storage_type to 512 if it is less than 16 (0x0f is the largest +valid ProDOS storage type). The value (512 * extra_type) should be +used as the uncompressed size. If the uncompressed size is zero, the +thread may be ignored.
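
The extraction rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not code from any existing NuFX library; the function and constant names are made up for the example.

```python
# Hypothetical helper illustrating the "Extracting" rule for disk image threads.
# All names here are invented for this sketch.

PRODOS_MAX_STORAGE_TYPE = 0x0F  # largest valid ProDOS storage type
STANDARD_BLOCK_SIZE = 512


def disk_image_size(storage_type, extra_type):
    """Return (normalized_block_size, uncompressed_size) for a disk image record.

    The thread EOF is deliberately not a parameter: it must be ignored.
    """
    block_size = storage_type
    if block_size <= PRODOS_MAX_STORAGE_TYPE:
        # A value this small is a leftover ProDOS storage type, not a block size.
        block_size = STANDARD_BLOCK_SIZE
    # The uncompressed size is 512 * extra_type, regardless of the stored block size.
    uncompressed_size = STANDARD_BLOCK_SIZE * extra_type
    return block_size, uncompressed_size


if __name__ == "__main__":
    # Old ShrinkIt archive with storage_type left at 2, 280 blocks.
    print(disk_image_size(2, 280))
    # Early GS/ShrinkIt archive claiming 256-byte blocks.
    print(disk_image_size(256, 280))
```

A caller that gets an uncompressed size of zero may skip the thread.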

+

 

Access permissions

NuFX supports four boolean access permission flags (read, write, @@ -542,34 +552,60 @@ for which no files are added to the archive.

thread is present.  As noted in the NuFX specification, the application must also create any directories listed in the record's pathname that don't yet exist.

+

 

Message thread format

The specification says that message threads are ASCII text, but -doesn't specify an EOL character.  For the benefit of Apple II utilities, -it's best to use a carriage return (ctrl-M).  The comments are expected to +doesn't specify an EOL character. For the benefit of Apple II utilities, +it's best to use a carriage return (Ctrl+M). The comments are expected to be readable on 8-bit Apple IIs, so plain ASCII rather than Mac OS Roman should be used.

Creating: Convert any EOL markers to CR, and any non-ASCII characters (i.e. bytes with the high bit set) to ASCII.

Extracting: Assume that the comment may be using CR, LF, -or CRLF, and convert as needed for display.  GS/ShrinkIt used a -proportional font, so there is no need to worry about formatting to preserve "ASCII art" in -comments.

-

 

+or CRLF, and convert as needed for display. GS/ShrinkIt used a +proportional font, so there is no need to worry about formatting to preserve +"ASCII art" in comments.
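
A minimal Python sketch of the two conversions follows. The function names are hypothetical, and substituting '?' for non-ASCII characters is an assumption made for the example; the text above only requires that the result be plain ASCII.

```python
# Hypothetical helpers for message thread text; names are invented for this sketch.


def comment_to_thread(text):
    """Prepare comment text for storage: CR line endings, plain ASCII only."""
    # Handle CRLF first so that it becomes a single CR rather than two.
    text = text.replace("\r\n", "\r").replace("\n", "\r")
    # Assumption: anything outside ASCII is replaced with '?'.
    return text.encode("ascii", errors="replace")


def thread_to_display(data):
    """Convert stored comment bytes to text with LF line endings for display."""
    # Bytes with the high bit set are not valid ASCII; show a replacement character.
    text = data.decode("ascii", errors="replace")
    # Accept CR, LF, or CRLF on input.
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    stored = comment_to_thread("Line one\r\nLine two\nLine three")
    print(stored)
    print(thread_to_display(stored))
```

An application that displays text with a different native EOL would substitute it in the last step.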

-

 

-

Master EOF

-

For the most part, ShrinkIt correctly sets the MasterEOF field -in the Master Header block.  A very old version of ShrinkIt left it set to +

 

+

Message thread maximum length

+

Comments are rarely used, and when they are they tend to be fairly short. +The contents are never compressed, aren't covered by a CRC, and aren't +extracted to files, making them a bad way to convey vital information. +Adding and editing the comment field was introduced with GS/ShrinkIt, which +creates a pre-sized comment on the first entry in each batch. The editor +does not expand or reduce the length of the field, which is limited to +1,000 bytes. It does support longer comments created by other programs.

+

It's convenient to assign a maximum length to comments, so that +code manipulating them doesn't need to handle the theoretical maximum +of 4GB. A cap of 64KB (same as ZIP) seems reasonable as an +absolute maximum, considering likely content and what Apple II software can +support.

+

Creating: Limit comments to 64KB. Applications may establish a +lower limit, but should allow comments of at least 1,000 bytes.

+

Updating: Truncation of comments longer than 64KB is +discouraged but allowed.
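
One way an application might enforce these limits is sketched below in Python. The names are hypothetical, and treating 64KB as exactly 65536 bytes is an assumption (ZIP's own 16-bit comment length field tops out at 65535).

```python
# Hypothetical length checks for message threads; names are invented for this sketch.

MAX_COMMENT_LEN = 64 * 1024  # assumption: "64KB" taken as 65536 bytes
MIN_APP_LIMIT = 1000         # applications should allow at least this much


def check_new_comment(data, app_limit=MAX_COMMENT_LEN):
    """Validate a comment being created; raise ValueError if it is too long."""
    if not MIN_APP_LIMIT <= app_limit <= MAX_COMMENT_LEN:
        raise ValueError("application limit must be between 1000 bytes and 64KB")
    if len(data) > app_limit:
        raise ValueError("comment is %d bytes, limit is %d" % (len(data), app_limit))
    return data


def truncate_existing_comment(data):
    """Cut an over-long comment from another program down to the cap.

    Discouraged but allowed when updating an archive.
    """
    return data[:MAX_COMMENT_LEN]


if __name__ == "__main__":
    print(len(check_new_comment(b"x" * 1000, app_limit=1000)))
    print(len(truncate_existing_comment(b"x" * 70000)))
```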

+ +

 

+

Master EOF

+

For the most part, ShrinkIt correctly sets the MasterEOF field +in the Master Header block. The field was introduced with version 1 of the +header definition. A very old version of ShrinkIt left it set to zero (this is the same version that completely omitted the filename for DOS 3.3 -disk images).  GS/ShrinkIt appears to initialize it to 48 (the size of the +disk images). GS/ShrinkIt appears to initialize it to 48 (the size of the MH block), and if the creation process is interrupted you can end up with a partial archive with a nonzero EOF.

-

Opening: Accept a MasterEOF of zero, but reject a -MasterEOF of 48.  Don't assume the MasterEOF is accurate.

-

Updating: Applications must write the correct MasterEOF +

The master EOF is useful as a quick file truncation test, but provides +no other value. The record count in the header is more important.

+

Opening: Don't assume the master EOF is accurate. Walk through the +list of records to determine the actual end-of-file before appending new +records.

+

Updating: Applications must write the correct MasterEOF value if an archive is modified.

+ +

 


+

Extensions

Unofficial extensions to the NuFX specification.  Anyone working with NuFX archives should take heed.

@@ -578,14 +614,14 @@ working with NuFX archives should take heed.

following thread format values have been added:

  • -

    0x0006 - deflate.  The thread contains data conforming - to RFC 1951 (deflate1.3 specification).  A more practical way of - putting it is it contains exactly the data that zlib v1.1.4 outputs.  Visit http://www.zlib.org/ - for more details.

  • +

    0x0006 - deflate. The thread contains data conforming + to RFC 1951 (deflate 1.3 specification), which is the compression format + used by ZIP and gzip. The canonical implementation is "zlib". + Visit zlib.net for more details.

  • 0x0007 - bzip2.  The thread contains BWT+Huffman - compressed data as output by Julian Seward's "libbz2" v1.0.2.  Visit - http://sources.redhat.com/bzip2/ + compressed data as output by "libbz2". Visit + sourceware.org/bzip2 for more information.

Support for these formats is nonexistent on the Apple II, so @@ -656,20 +692,22 @@ problems, but can make you wonder where all the extra bits came from.

The SQ compression algorithm, as implemented by Don Elton's SQ3, appears to add an extra 0xff to the end of the compressed data.  It can safely be ignored.

+

Preserving BXY and SEA wrappers

Preserving BXY wrappers is pretty easy, since the Binary II format is well documented.  Updating block counts and file lengths is all that is required.

-

Preserving SEA wrappers is a little harder, since (as far as I -can tell) there is no documentation on the format.  A little -experimentation shows that the SEA header is always 12005 bytes long, and the -only part that changes from file to file is a short piece right before the NuFX -archive begins.

-

It is necessary to update the file length in three different -places, all right next to each other, one of which is offset by 64 bytes.  -I would guess the header allows for more than one archive to be present, but -since such things have never actually been created, the possibility can be -ignored.

+

Preserving SEA wrappers is a little more obscure, since there +is no documentation on the format. A bit of reverse engineering reveals that +SEA files are OMF executables with two segments. The first segment holds the +extraction code, and is the same for all archives. The second holds the NuFX +data, and requires that a few length values in the segment header be adjusted. +Also, to be correct, the file must have a $00 byte appended after the NuFX +data (it's an OMF "END" opcode).

+

The archives have a minor bug: an offset field in the header is off by one, +so actually loading the segment in GS/OS would likely fail. The segment +header has the "skip" flag set, though, so this isn't a problem in practice.

+

Y2K

The NuFX standard says that the Date/Time format is the same as that returned by the IIgs ReadTimeHex toolbox call.  That call returns the @@ -682,10 +720,10 @@ also accept the year 0.  However, if you find a Date/Time with zero in all useful fields (second, minute, hour, day, month, year), treat it as an unspecified date rather than midnight of January 1, 2000.


-

This document is Copyright © 2000-2004 by Andy +

This document is Copyright © 2000-2004 by Andy McFadden.  All Rights Reserved.

The latest version can be found on the NuLib web site at -http://www.nulib.com/.

+https://www.nulib.com/.