macutils/README

484 lines
23 KiB
Plaintext
Raw Normal View History

2018-03-22 19:30:02 -05:00
This is version 2.0b3 of macutil (22-OCT-1992).
This package contains the following utilities:
macunpack
hexbin
macsave
macstream
binhex
tomac
frommac
Requirements:
a. Of course a C compiler.
b. A 32-bit machine with large memory (or at least the ability to 'malloc'
large chunks of memory). For reasons of efficiency and simplicity the
programs work 'in-core', also many files are first read in core.
If somebody can take the trouble to do it differently, go ahead!
There are also probably in a number of places implicit assumptions that
an int is 32 bits. If you encounter such occurrences feel free to
notify me.
c. A Unix (tm) machine, or something very close. There are probably quite
a lot of Unix dependencies. Also here, if you have replacements, feel
free to send comments.
d. This version normally uses the 'mkdir' system call available on BSD Unix
and some versions of SysV Unix. You can change that, see the makefile for
details.
File name translation:
The programs use a table driven program to do Mac filename -> Unix filename
translation. When compiled without further changes the translation is as
follows:
Printable ASCII characters except space and slash are not changed.
Slash and space are changed to underscore, as are all characters that
do not fall in the following group.
Accented letters are translated to their unaccented counterparts.
If your system supports the Latin-1 character set, you can change this
translation scheme by specifying '-DLATIN1' for the 'CF' macro in the
makefile. This will translate all accented letters (and some symbols)
to their Latin-1 counterpart. This feature is untested (I do not have
access to systems that cater for Latin-1), so use with care.
Future revisions of the program will have user settable conversions.
Another feature of filename translation is that when the -DNODOT flag is
specified in the CF macro an initial period will be translated to underscore.
MacBinary stream:
Most programs allow MacBinary streams as either input or output. A
MacBinary stream is a series of files in MacBinary format pasted
together. Embedded within a MacBinary stream can be information about
folders. So a MacBinary stream can contain all information about a
folder and its constituents.
Appleshare support:
Optionally the package can be compiled for systems that support the sharing
of Unix and Mac filesystems. The package supports AUFS (AppleTalk Unix File
Server) from CAP (Columbia AppleTalk Package) and AppleDouble (from Apple).
It will not support both at the same time. Moreover this support requires
the existence of the 'mkdir' system call. And finally, as implemented it
probably will work on big-endian BSD compatible systems. If you have a SysV
system with restricted filename lengths you can get problems. I do not know
also whether the structures are stored native or Apple-wise on little-endian
systems. And also, I did not test it fully; having no access to either AUFS
or AppleDouble systems.
Acknowledgements:
a. Macunpack is for a large part based on the utilities 'unpit' and 'unsit'
written by:
Allan G. Weber
weber%brand.usc.edu@oberon.usc.edu
(wondering whether that is still valid!). I combined the two into a
single program and did a lot of modification. For information on the
originals, see the files README.unpit and README.unsit.
b. The crc-calculating routines are based on a routine originally written by:
Mark G. Mendel
UUCP: ihnp4!umn-cs!hyper!mark
(this will not work anymore for sure!). Also here I modified the stuff
and expanded it, see the files README.crc and README.crc.orig.
c. LZW-decompression is taken from the sources of compress that are floating
around. Probably I did not use the most efficient version, but this
program was written to get it done. The version I based it on (4.0) is
authored by:
Steve Davies (decvax!vax135!petsd!peora!srd)
Jim McKie (decvax!mcvax!jim) (Hi Jim!)
Joe Orost (decvax!vax135!petsd!joe)
Spencer W. Thomas (decvax!harpo!utah-cs!utah-gr!thomas)
Ken Turkowski (decvax!decwrl!turtlevax!ken)
James A. Woods (decvax!ihnp4!ames!jaw)
I am sure those e-mail addresses also will not work!
d. Optional AUFS support comes from information supplied by:
Casper H.S. Dik
University of Amsterdam
Kruislaan 409
1098 SJ Amsterdam
Netherlands
phone: +31205922022
email: casper@fwi.uva.nl
This is an e-mail address that will workm but the address and phone
number ar no longer valid.
See the makefile.
Some caveats are applicable:
1. I did not fully test it (we do not use it). But the unpacking
appears to be correct. Anyhow, as the people who initially compile
it and use it will be system administrators I am confident they are
able to locate bugs! (What if an archive contains a Macfile with
the name .finderinfo or .resource? I have had two inputs for AUFS
support [I took Caspers; his came first], but both do not deal with
that. Does CAP deal with it?) Also I have no idea whether this
version supports it under SysV, so beware.
2. From one of the README's supplied by Casper:
Files will not appear in an active folder, because Aufs doesn't like
people working behind it's back.
Simply opening and closing the folder will suffice.
Appears to be the same problem as when you are unpacking or in some
other way creating files in a folder open to multifinder. I have seen
bundle bits disappear this way. So if after unpacking you see the
generic icon; check whether a different icon should appear and check
the bundle bit.
The desktop isn't updated, but that doesn't seem to matter.
I dunno, not using it.
e. Man pages are now supplied. The base was provided by:
Douglas Siebert
ISCA
dsiebert@icaen.uiowa.edu
f. Because of some problems the Uncompactor has been rewritten, it is now
based on sources from the dearchiver unzip (of PC fame). Apparently the
base code is by:
Samuel H. Smith
I have no further address available, but as soon as I find a better
attribution, I will include it.
g. UnstuffIt's LZAH code comes from lharc (also of PC fame) by:
Haruhiko Okumura,
Haruyasu Yoshizaki,
Yooichi Tagawa.
h. Zoom's code comes from information supplied by Jon W{tte
(d88-jwa@nada.kth.se). The Zoo decompressor is based on the routine
written by Rahul Dhesi (dhesi@cirrus.COM). This again is based on
code by Haruhiko Okumura. See also the file README.zoom.
i. MacLHa's decompressors are identical to the ones mentioned in g and h.
j. Most of hexbin's code is based on code written/modified by:
Dave Johnson, Brown University Computer Science
Darin Adler, TMQ Software
Jim Budler, amdcad!jimb
Dan LaLiberte, liberte@uiucdcs
ahm (?)
Jeff Meyer, John Fluke Company
Guido van Rossum, guido@cwi.nl (Hi!)
(most of the e-mail addresses will not work, the affiliation may also
be incorrect by now.) See also the file README.hexbin.
k. The dl code in hexbin comes is based on the original distribution of
SUMacC.
l. The mu code in hexbin is a slight modification of the hcx code (the
compressions are identical).
m. The MW code for StuffIt is loosely based on code by Daniel H. Bernstein
(brnstnd@acf10.nyu.edu).
n. Tomac and frommac are loosely based on the original macput and macget
by (the e-mail address will not work anymore):
Dave Johnson
ddj%brown@csnet-relay.arpa
Brown University Computer Science
-------------------------------------------------------------------------------
Macunpack will unpack PackIt, StuffIt, Diamond, Compactor/Compact Pro, most
StuffItClassic/StuffItDeluxe, and all Zoom and LHarc/MacLHa archives, and
archives created by later versions of DiskDoubler.
Also it will decode files created by BinHex5.0, MacBinary, UMCP,
Compress It, ShrinkToFit, MacCompress, DiskDoubler and AutoDoubler.
(PackIt, StuffIt, Diamond, Compactor, Compact/Pro, Zoom and LHarc/MacLHa are
archivers written by respectively: Harry R. Chesley, Raymond Lau, Denis Sersa,
Bill Goodman, Jon W{tte* and Kazuaki Ishizaki. BinHex 5.0, MacBinary and
UMCP are by respectively: Yves Lempereur, Gregory J. Smith, Information
Electronics. ShrinkToFit is by Roy T. Hashimoto, Compress It by Jerry
Whitnell, and MacCompress, DiskDoubler and AutoDoubler are all by
Lloyd Chambers.)
* from his signature:
Jon W{tte - Yes, that's a brace - Damn Swede.
Actually it is an a with two dots above; some (German inclined) people
refer to it (incorrectly) as a-umlaut.
It does not deal with:
a. Password protected archives.
b. Multi-segment archives.
c. Plugin methods for Zoom.
d. MacLHa archives not packed in MacBinary mode (the program deals very
poorly with that!).
Background:
There are millions of ways to pack files, and unfortunately, all have been
implemented one way or the other. Below I will give some background
information about the packing schemes used by the different programs
mentioned above. But first some background about compression (I am no
expert, more comprehensive information can be found in for instance:
Tomothy Bell, Ian H. Witten and John G. Cleary, Modelling for Text
Compression, ACM Computing Surveys, Vol 21, No 4, Dec 1989, pp 557-591).
Huffman encoding (also called Shannon-Fano coding or some other variation
of the name). An encoding where the length of the code for the symbols
depends on the frequency of the symbols. Frequent symbols have shorter
codes than infrequent symbols. The normal method is to first scan the
file to be compressed, and to assign codes when this is done (see for
instance: D. E. Knuth, the Art of Computer Programming). Later methods
have been designed to create the codes adaptively; for a survey see:
Jeremy S. Vetter, Design and Analysis of Dynamic Huffman Codes, JACM,
Vol 34, No 4, Oct 1987, pp 825-845.
LZ77: The first of two Ziv-Lempel methods. Using a window of past encoded
text, output consists of triples for each sequence of newly encoded
symbols: a back pointer and length of past text to be repeated and the
first symbol that is not part of that sequence. Later versions allowed
deviation from the strict alternation of pointers and uncoded symbols
(LZSS by Bell). Later Brent included Huffman coding of the pointers
(LZH).
LZ78: While LZ77 uses a window of already encoded text as a dictionary,
LZ78 dynamically builds the dictionary. Here again pointers are strictly
alternated with unencoded new symbols. Later Welch (LZW) managed to
eliminate the output of unencoded symbols. This algorithm is about
the same as the one independently invented by Miller and Wegman (MW).
A problem with these two schemes is that they are patented. Thomas
modified LZW to LZC (as used in the Unix compress command). While LZ78
and LZW become static once the dictionary is full, LZC has possibilities
to reset the dictionary. Many LZC variants are in use, depending on the
size of memory available. They are distinguished by the maximum number
of bits that are used in a code.
A number of other schemes are proposed and occasionally used. The main
advantage of the LZ type schemes is that (especially) decoding is fairly fast.
Programs background:
Plain programs:
BinHex 5.0:
Unlike what its name suggest this is not a true successor of BinHex 4.0.
BinHex 5.0 takes the MacBinary form of a file and stores it in the data
fork of the newly created file.
Although BinHex 5.0 does not create BinHex 4.0 compatible files, StuffIt
will give the creator type of BinHex 5.0 (BnHq) to its binhexed files,
rather than the creator type of BinHex 4.0 (BNHQ). The program knows
about that.
MacBinary:
As its name suggests, it does the same as BinHex 5.0.
UMCP:
Looks similar, but the file as stored by UMCP is not true MacBinary.
Size fields are modified, the result is not padded to a multiple of 128,
etc. Macunpack deals with all that, but until now is not able to
correctly restore the finder flags of the original file. Also, UMCP
created files have type "TEXT" and creator "ttxt", which can create a
bit of confusion. Macunpack will recognize these files only if the
creator has been modified to "UMcp".
Compressors:
ShrinkToFit:
This program uses a Huffman code to compress. It has an option (default
checked for some reason), COMP, for which I do not yet know the
meaning. Compressing more than a single file in a single run results
in a failure for the second and subsequent files.
Compress It:
Also uses a Huffman code to compress.
MacCompress:
MacCompress has two modes of operation, the first mode is (confusingly)
MacCompress, the second mode is (again confusingly) UnixCompress. In
MacCompress mode both forks are compressed using the LZC algorithm.
In UnixCompress mode only the data fork is compressed, and some shuffling
of resources is performed. Upto now macunpack only deals with MacCompress
mode. The LZC variant MacCompress uses depends on memory availability.
12 bit to 16 bit LZC can be used.
Archivers:
ArcMac:
Nearly PC-Arc compatible. Arc knows 8 compression methods, I have seen
all of them used by ArcMac, except the LZW techniques. Here they are:
1: No compression, shorter header
2: No compression
3: (packing) Run length encoding
4: (squeezing) RLE followed by Huffman encoding
5: (crunching) LZW
6: (crunching) RLE followed by LZW
7: (crunching) as the previous but with a different hash function
8: (crunching) RLE followed by 12-bit LZC
9: (squashing) 13-bit LZC
PackIt:
When archiving a file PackIt either stores the file uncompressed or
stores the file Huffman encoded. In the latter case both forks are
encoded using the same Huffman tree.
StuffIt and StuffIt Classic/Deluxe:
These have the ability to use different methods for the two forks of a
file. The following standard methods I do know about (the last three
are only used by the Classic/Deluxe version 2.0 of StuffIt):
0: No compression
1: Run length encoding
2: 14-bit LZC compression
3: Huffman encoding
5: LZAH: like LZH, but the Huffman coding used is adaptive
6: A Huffman encoding using a fixed (built-in) Huffman tree
8: A MW encoding
Diamond:
Uses a LZ77 like frontend plus a Fraenkel-Klein like backend (see
Apostolico & Galil, Combinatorial Algorithms on Words, pages 169-183).
Compactor/Compact Pro:
Like StuffIt, different encodings are possible for data and resource fork.
Only two possible methods are used:
0: Run length encoding
1: RLE followed by some form of LZH
Zoom:
Data and resource fork are compressed with the same method. The standard
uses either no compression or some form of LZH
MacLHa:
Has two basic modes of operation, Mac mode and Normal mode. In Mac mode
the file is archived in MacBinary form. In normal mode only the forks
are archived. Normal mode should not be used (and can not be unpacked
by macunpack) as all information about data fork size/resource fork size,
type, creator etc. is lost. It knows quite a few methods, some are
probably only used in older versions, the only methods I have seen used
are -lh0-, -lh1- and -lh5-. Methods known by MacLHa:
-lz4-: No compression
-lz5-: LZSS
-lzs-: LZSS, another variant
-lh0-: No compression
-lh1-: LZAH (see StuffIt)
-lh2-: Another form of LZAH
-lh3-: A form of LZH, different from the next two
-lh4-: LZH with a 4096 byte buffer (as far as I can see the coding in
MacLHa is wrong)
-lh5-: LZH with a 8192 byte buffer
DiskDoubler:
The older version of DiskDoubler is compatible with MacCompress. It does
not create archives, it only compresses files. The newer version (since
3.0) does both archiving and compression. The older version uses LZC as
its compression algorithm, the newer version knows a number of different
compression algorithms. Many (all?) are algorithms used in other
archivers. Probably this is done to simplify conversion from other formats
to DiskDoubler format archives. I have seen actual DiskDoubler archives
that used methods 0, 1 and 8:
0: No compression
1: LZC
2: unknown
3: RLE
4: Huffman (or no compression)
5: unknown
6: unknown
7: An improved form of LZSS
8: Compactor/Compact Pro compatible RLE/LZH or RLE only
9: unknown
The DiskDoubler archive format contains many subtle twists that make it
difficult to properly read the archive (or perhaps this is on purpose?).
Naming:
Some people have complained about the name conflict with the unpack utility
that is already available on Sys V boxes. I had forgotten it, so there
really was a problem. The best way to solve it was to trash pack/unpack/pcat
and replace it by compress/uncompress/zcat. Sure, man uses it; but man uses
pcat, so you can retain pcat. If that was not an option you were able to feel
free to rename the program. But finally I relented. It is now macunpack.
When you have problems unpacking an archive feel free to ask for information.
I am especially keen when the program detects an unknown method. If you
encounter such an archive, please, mail a 'binhexed' copy of the archive
to me so that I can deal with it. Password protected archives are (as
already stated) not implemented. I do not have much inclination to do that.
Also I feel no inclination to do multi-segment archives.
-------------------------------------------------------------------------------
Hexbin will de-hexify files created in BinHex 4.0 compatible format (hqx)
but also the older format (dl, hex and hcx). Moreover it will uudecode
files uuencoded by UUTool (the only program I know that does UU hexification
of all Mac file information).
There are currently many programs that are able to create files in BinHex 4.0
compatible format. There are however some slight differences, and most
de-hexifiers are not able to deal with all the variations. This program is
very simple minded. First it will intuit (based on the input) whether the
file is in dl, hex, hcx or hqx format. Next it will de-hexify the file.
When the format is hqx, it will check whether more files follow, and continue
processing. So you can catenate multiple (hqx) hexified files together and
feed them as a single file to hexbin. Also hexbin does not mind whether lines
are separated by CR's, LF's or combinations of the two. Moreover, it will
strip all leading, trailing and intermediate garbage introduced by mailers
etc. Next, it does not mind if a file is not terminated by a CR or an LF
(as StuffIt 1.5.1 and earlier did), but in that case a second file is not
allowed to follow it. Last, while most hexifiers output lines of equal length,
some do not. Hexbin will deal with that, but there are some caveats; see the
-c option in the man page.
Background:
dl format:
This was the first hexified format used. Programs to deal with it came
from SUMacC. This format only coded resource forks, 4 bits in a byte.
hex format:
I think this is the first format from Yves Lempereur. Like dl format,
it codes 4 bits in a byte, but is able to code both resource and
data fork. Is it BinHex 2.0?
hcx format:
A compressing variant of hex format. Codes 6 bits in a byte.
Is it BinHex 3.0?
hqx format:
Like hcx, but using a different coding (possibly to allow for ASCII->EBCDIC
and EBCDIC->ASCII translation, which not always results in an identical
file). Moreover this format also encodes the original Mac filename.
mu format:
The conversion can be done by the UUTool program from Octavian Micro
Development. It encodes both forks and also some finder info. You will
in general not use this with uudecode on non Mac systems, with uudecode
only the data fork will be uudecoded. UU hexification is well known (and
fairly old) in Unix environments. Moreover it has been ported to lots of
other systems.
-------------------------------------------------------------------------------
Macsave reads a MacBinary stream from standard input and writes the
files according to the options.
-------------------------------------------------------------------------------
Macstream reads files from the Unix host and will output a MacBinary stream
containing all those files together with information about the directory
structure.
-------------------------------------------------------------------------------
Binhex will read a MacBinary stream, or will read files/directories as
indicated on the command line, and will output all files in binhexed (.hqx)
format. Information about the directory structure is lost.
-------------------------------------------------------------------------------
Tomac will transmit a MacBinary stream, or named files to the Mac using
the XMODEM protocol.
-------------------------------------------------------------------------------
Frommac will receive one or more files from the Mac using the XMODEM protocol.
-------------------------------------------------------------------------------
This is an ongoing project, more stuff will appear.
All comments are still welcome. Thanks for the comments I already received.
dik t. winter, amsterdam, nederland
email: dik@cwi.nl
--
Note:
In these programs all algorithms are implemented based on publicly available
software to prevent any claim that would prevent redistribution due to
Copyright. Although parts of the code would indeed fall under the Copyright
by the original author, use and redistribution of all such code is explicitly
allowed. For some parts of it the GNU software license does apply.
--
Appendix.
BinHex 4.0 compatible file creators:
Type Creator Created by
"TEXT" "BthX" BinHqx
"TEXT" "BNHQ" BinHex
"TEXT" "BnHq" StuffIt and StuffIt Classic
"TEXT" "ttxt" Compactor
Files recognized by macunpack:
Type Creator Recognized as
"APPL" "DSEA" "DiskDoubler" Self extracting
"APPL" "EXTR" "Compactor" Self extracting
"APPL" "Mooz" "Zoom" Self extracting
"APPL" "Pack" "Diamond" Self extracting
"APPL" "arc@" "ArcMac" Self extracting (not yet)
"APPL" "aust" "StuffIt" Self extracting
"ArCv" "TrAS" "AutoSqueeze" (not yet)
"COMP" "STF " "ShrinkToFit"
"DD01" "DDAP" "DiskDoubler"
"DDAR" "DDAP" "DiskDoubler"
"DDF." "DDAP" "DiskDoubler" (any fourth character)
"DDf." "DDAP" "DiskDoubler" (any fourth character)
"LARC" "LARC" "MacLHa (LHARC)"
"LHA " "LARC" "MacLHa (LHA)"
"PACT" "CPCT" "Compactor"
"PIT " "PIT " "PackIt"
"Pack" "Pack" "Diamond"
"SIT!" "SIT!" "StuffIt"
"SITD" "SIT!" "StuffIt Deluxe"
"Smal" "Jdw " "Compress It"
"TEXT" "BnHq" "BinHex 5.0"
"TEXT" "GJBU" "MacBinary 1.0"
"TEXT" "UMcp" "UMCP"
"ZIVM" "LZIV" "MacCompress(M)"
"ZIVU" "LZIV" "MacCompress(U)" (not yet)
"mArc" "arc*" "ArcMac" (not yet)
"zooM" "zooM" "Zoom"