convert-binhex/docs/Convert/BinHex.pm.html
2013-08-20 23:04:15 -07:00

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<!-- Generated by pod2coolhtml 1.101
-- Using Pod::CoolHTML 1.104 , (C) 1997 by Eryq (eryq@enteract.com).
--
-- DO NOT EDIT THIS HTML FILE! All your changes will be lost.
-- Edit the POD or Perl file that was used to create it.
-->
<HTML>
<HEAD>
<TITLE>Convert::BinHex</TITLE>
</HEAD>
<BODY LINK=#C00000 ALINK=#FF2020 VLINK=#800030
BGCOLOR=#FFFFFF>
<A NAME="__top"> </A><TABLE WIDTH="100%">
<TR VALIGN="TOP"><TD ALIGN="LEFT"><CENTER>
<H1><FONT SIZE=7 COLOR=#800030><B>Convert::<BR>BinHex</B></FONT></H1><IMG SRC="BinHex/hqxred.gif" ALT="HQX"> </CENTER>
<TD>
<UL>
<LI><A HREF="#name">NAME</A>
</LI><LI><A HREF="#synopsis">SYNOPSIS</A>
</LI><LI><A HREF="#description">DESCRIPTION</A>
</LI><LI><A HREF="#format">FORMAT</A>
</LI><LI><A HREF="#functions">FUNCTIONS</A>
</LI><UL>
<LI><A HREF="#crc_computation">CRC computation</A>
</LI></UL>
<LI><A HREF="#oo_interface">OO INTERFACE</A>
</LI><UL>
<LI><A HREF="#conversion">Conversion</A>
</LI><LI><A HREF="#construction">Construction</A>
</LI><LI><A HREF="#getset_header_information">Get/set header information</A>
</LI><LI><A HREF="#decode_highlevel">Decode, high-level</A>
</LI><LI><A HREF="#encode_highlevel">Encode, high-level</A>
</LI></UL>
<LI><A HREF="#submodules">SUBMODULES</A>
</LI><UL>
<LI><A HREF="#convertbinhexbin2hex">Convert::BinHex::Bin2Hex</A>
</LI><LI><A HREF="#convertbinhexhex2bin">Convert::BinHex::Hex2Bin</A>
</LI><LI><A HREF="#convertbinhexfork">Convert::BinHex::Fork</A>
</LI></UL>
<LI><A HREF="#under_the_hood">UNDER THE HOOD</A>
</LI><UL>
<LI><A HREF="#design_issues">Design issues</A>
</LI><LI><A HREF="#how_it_works">How it works</A>
</LI></UL>
<LI><A HREF="#warnings">WARNINGS</A>
</LI><LI><A HREF="#change_log">CHANGE LOG</A>
</LI><LI><A HREF="#author_and_credits">AUTHOR AND CREDITS</A>
</LI><LI><A HREF="#terms_and_conditions">TERMS AND CONDITIONS</A>
</LI></UL>
</TABLE>
<P><HR>
<A NAME="name">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
NAME</FONT></H1>
</A>
<P>
Convert::BinHex - extract data from Macintosh BinHex files
<P>
<I>ALPHA WARNING: this code is currently in its Alpha release.
Things may change drastically until the interface is hammered out:
if you have suggestions or objections, please speak up now!</I>
<P><HR>
<A NAME="synopsis">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
SYNOPSIS</FONT></H1>
</A>
<P>
<B>Simple functions:</B>
<P>
<PRE> use Convert::BinHex qw(binhex_crc macbinary_crc);</PRE>
<P>
<PRE> # Compute HQX7-style CRC for data, pumping in old CRC if desired:
&#36;crc = binhex_crc(&#36;data, &#36;crc);</PRE>
<P>
<PRE> # Compute the MacBinary-II-style CRC for the data:
&#36;crc = macbinary_crc(&#36;data, &#36;crc);</PRE>
<P>
<B>Hex to bin, low-level interface.</B>
Conversion is actually done via an object (<A HREF="#convertbinhexhex2bin">&quot;Convert::BinHex::Hex2Bin&quot;</A>)
which keeps internal conversion state:
<P>
<PRE> # Create and use a &quot;translator&quot; object:
my &#36;H2B = Convert::BinHex-&gt;hex2bin; # get a converter object
while (&lt;STDIN&gt;) {
print STDOUT &#36;H2B-&gt;next(&#36;_); # convert some more input
}
print STDOUT &#36;H2B-&gt;done; # no more input: finish up</PRE>
<P>
<B>Hex to bin, OO interface.</B>
The following operations <I>must</I> be done in the order shown!
<P>
<PRE> # Read data in piecemeal:
&#36;HQX = Convert::BinHex-&gt;open(FH=&gt;\*STDIN) || die &quot;open: &#36;!&quot;;
&#36;HQX-&gt;read_header; # read header info
@data = &#36;HQX-&gt;read_data; # read in all the data
@rsrc = &#36;HQX-&gt;read_resource; # read in all the resource</PRE>
<P>
<B>Bin to hex, low-level interface.</B>
Conversion is actually done via an object (<A HREF="#convertbinhexbin2hex">&quot;Convert::BinHex::Bin2Hex&quot;</A>)
which keeps internal conversion state:
<P>
<PRE> # Create and use a &quot;translator&quot; object:
my &#36;B2H = Convert::BinHex-&gt;bin2hex; # get a converter object
while (&lt;STDIN&gt;) {
print STDOUT &#36;B2H-&gt;next(&#36;_); # convert some more input
}
print STDOUT &#36;B2H-&gt;done; # no more input: finish up</PRE>
<P>
<B>Bin to hex, file interface.</B> Yes, you can convert <I>to</I> BinHex
as well as from it!
<P>
<PRE> # Create new, empty object:
my &#36;HQX = Convert::BinHex-&gt;new;</PRE>
<P>
<PRE> # Set header attributes:
&#36;HQX-&gt;filename(&quot;logo.gif&quot;);
&#36;HQX-&gt;type(&quot;GIFA&quot;);
&#36;HQX-&gt;creator(&quot;CNVS&quot;);</PRE>
<P>
<PRE> # Give it the data and resource forks (either can be absent):
&#36;HQX-&gt;data(Path =&gt; &quot;/path/to/data&quot;); # here, data is on disk
&#36;HQX-&gt;resource(Data =&gt; &#36;resourcefork); # here, resource is in core</PRE>
<P>
<PRE> # Output as a BinHex stream, complete with leading comment:
&#36;HQX-&gt;encode(\*STDOUT);</PRE>
<P>
<B>PLANNED!!!! Bin to hex, &quot;CAP&quot; interface.</B>
<I>Thanks to Ken Lunde for suggesting this</I>.
<P>
<PRE> # Create new, empty object from CAP tree:
my &#36;HQX = Convert::BinHex-&gt;from_cap(&quot;/path/to/root/file&quot;);
&#36;HQX-&gt;encode(\*STDOUT);</PRE>
<P><HR>
<A NAME="description">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
DESCRIPTION</FONT></H1>
</A>
<P>
<B>BinHex</B> is a format used on the Macintosh for transporting Mac files
safely through electronic mail, as short-lined, 7-bit, semi-compressed
data streams. This module provides a means of converting those
data streams back into binary data.
<P><HR>
<A NAME="format">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
FORMAT</FONT></H1>
</A>
<P>
<I>(Some text taken from RFC-1741.)</I>
Files on the Macintosh consist of two parts, called <I>forks</I>:
<DL>
<P><DT><B><A NAME="data">Data fork</A></B><DD>
The actual data included in the file. The Data fork is typically the
only meaningful part of a Macintosh file on a non-Macintosh computer system.
For example, if a Macintosh user wants to send a file of data to a
user on an IBM-PC, she would only send the Data fork.
<P><DT><B><A NAME="resource">Resource fork</A></B><DD>
Contains a collection of arbitrary attribute/value pairs, including
program segments, icon bitmaps, and parametric values.
</DL>
<P>
Additional information regarding Macintosh files is stored by the
Finder in a hidden file, called the &quot;Desktop Database&quot;.
<P>
Because of the complications in storing different parts of a
Macintosh file in a non-Macintosh filesystem that only handles
consecutive data in one part, it is common to convert the Macintosh
file into some other format before transferring it over the network.
The BinHex format squashes that data into transmittable ASCII as follows:
<UL>
<P><LI><B>1.</B>
The file is output as a <B>byte stream</B> consisting of some basic header
information (filename, type, creator), then the data fork, then the
resource fork.
<P><LI><B>2.</B>
The byte stream is <B>compressed</B> by looking for series of duplicated
bytes and representing them using a special binary escape sequence
(of course, any occurrences of the escape character must also be escaped).
<P><LI><B>3.</B>
The compressed stream is <B>encoded</B> via the &quot;6/8 hemiola&quot; common
to <I>base64</I> and <I>uuencode</I>: each group of three 8-bit bytes (24 bits)
is chopped into four 6-bit numbers, which are used as indexes into
an ASCII &quot;alphabet&quot;.
(I assume that leftover bytes are zero-padded; documentation is thin).
</UL>
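<P>
Steps 2 and 3 can be sketched in a few lines of Perl. This is only an
illustration, not the code this module uses: the helper names
<CODE>rle_compress</CODE> and <CODE>encode_6bit</CODE> are made up for the
example, the alphabet is the one listed in RFC-1741, the <CODE>\x90</CODE>
escape byte is the one described under &quot;Design issues&quot;, the choice to
compress only runs of three or more bytes is an assumption, and the
zero-padding follows the guess stated in step 3.
<P>
<PRE>
use strict;
use warnings;

# The 64-character alphabet listed in RFC-1741.
my &#36;ALPHABET = q{!&quot;#&#36;%&amp;'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr};

# Step 2 (hypothetical helper): collapse runs of 3..255 equal bytes into
# &quot;byte, 0x90, count&quot;, and write a literal 0x90 as &quot;0x90, 0x00&quot;.
sub rle_compress {
    my (&#36;in) = @_;
    my &#36;out = '';
    while (&#36;in =~ /\G((.)\2{0,254})/gs) {
        my (&#36;run, &#36;byte) = (&#36;1, &#36;2);
        my &#36;n   = length &#36;run;
        my &#36;lit = &#36;byte eq &quot;\x90&quot; ? &quot;\x90\x00&quot; : &#36;byte;
        &#36;out .= &#36;n &gt; 2 ? &#36;lit . &quot;\x90&quot; . chr(&#36;n) : &#36;lit x &#36;n;
    }
    return &#36;out;
}

# Step 3 (hypothetical helper): cut the bit string into 6-bit groups,
# zero-padding the last group, and map each group to the alphabet.
sub encode_6bit {
    my (&#36;bytes) = @_;
    my &#36;bits = unpack 'B*', &#36;bytes;
    &#36;bits .= '0' x ((6 - length(&#36;bits) % 6) % 6);
    return join '', map { substr &#36;ALPHABET, oct(&quot;0b&#36;_&quot;), 1 } &#36;bits =~ /.{6}/g;
}

# Six identical bytes shrink to three before being encoded:
print encode_6bit(rle_compress(&quot;AAAAAA&quot;)), &quot;\n&quot;;</PRE>
<P>
A real encoder also has to emit the header, the CRCs, the surrounding
<CODE>&quot;:&quot;</CODE> delimiters and the line breaks, none of which is shown here.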
<P><HR>
<A NAME="functions">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
FUNCTIONS</FONT></H1>
</A>
<P><HR>
<A NAME="crc_computation">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
CRC computation</FONT></H2>
</A>
<DL>
<P><DT><B><A NAME="macbinary_crc">macbinary_crc DATA, SEED</A></B><DD>
Compute the MacBinary-II-style CRC for the given DATA, with the CRC
seeded to SEED. Normally, you start with a SEED of 0, and you pump in
the previous CRC as the SEED if you're handling a lot of data one chunk
at a time. That is:
<P>
<PRE> &#36;crc = 0;
while (&lt;STDIN&gt;) {
&#36;crc = macbinary_crc(&#36;_, &#36;crc);
}</PRE>
<P>
<I>Note:</I> Extracted from the <I>mcvert</I> utility (Doug Moore, April '87),
using a &quot;magic array&quot; algorithm by Jim Van Verth for efficiency.
Converted to Perl5 by Eryq. <B>Untested.</B>
<P><DT><B><A NAME="binhex_crc">binhex_crc DATA, SEED</A></B><DD>
Compute the HQX-style CRC for the given DATA, with the CRC seeded to SEED.
Normally, you start with a SEED of 0, and you pump in the previous CRC as
the SEED if you're handling a lot of data one chunk at a time. That is:
<P>
<PRE> &#36;crc = 0;
while (&lt;STDIN&gt;) {
&#36;crc = binhex_crc(&#36;_, &#36;crc);
}</PRE>
<P>
<I>Note:</I> Extracted from the <I>mcvert</I> utility (Doug Moore, April '87),
using a &quot;magic array&quot; algorithm by Jim Van Verth for efficiency.
Converted to Perl5 by Eryq.
</DL>
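<P>
The seeding convention is easier to see with a self-contained routine.
The sketch below is a generic bit-by-bit CRC-16 (polynomial 0x1021, no bit
reflection) under the made-up name <CODE>crc16_sketch</CODE>; it is not the
table-driven &quot;magic array&quot; code described above, and whether it produces
the same values as <CODE>binhex_crc()</CODE> is an assumption that has not
been checked here. What it does show is that feeding the previous result
back in as the seed gives the same answer as one pass over all the data.
<P>
<PRE>
use strict;
use warnings;

# Hypothetical helper: bitwise CRC-16, polynomial 0x1021, seed defaults to 0.
sub crc16_sketch {
    my (&#36;data, &#36;crc) = @_;
    &#36;crc = 0 unless defined &#36;crc;
    for my &#36;byte (unpack 'C*', &#36;data) {
        &#36;crc ^= &#36;byte &lt;&lt; 8;                  # bring the byte into the top bits
        for (1 .. 8) {                        # shift it out one bit at a time
            &#36;crc = (&#36;crc &amp; 0x8000) ? ((&#36;crc &lt;&lt; 1) ^ 0x1021) : (&#36;crc &lt;&lt; 1);
            &#36;crc &amp;= 0xFFFF;                   # keep the register at 16 bits
        }
    }
    return &#36;crc;
}

my &#36;whole   = crc16_sketch(&quot;123456789&quot;);
my &#36;chained = crc16_sketch(&quot;6789&quot;, crc16_sketch(&quot;12345&quot;));
printf &quot;%04X %04X\n&quot;, &#36;whole, &#36;chained;     # the two values are equal</PRE>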
<P><HR>
<A NAME="oo_interface">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
OO INTERFACE</FONT></H1>
</A>
<P><HR>
<A NAME="conversion">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Conversion</FONT></H2>
</A>
<DL>
<P><DT><B><A NAME="bin2hex">bin2hex</A></B><DD>
<I>Class method, constructor.</I>
Return a converter object. Just creates a new instance of
<A HREF="#convertbinhexbin2hex">&quot;Convert::BinHex::Bin2Hex&quot;</A>; see that class for details.
<P><DT><B><A NAME="hex2bin">hex2bin</A></B><DD>
<I>Class method, constructor.</I>
Return a converter object. Just creates a new instance of
<A HREF="#convertbinhexhex2bin">&quot;Convert::BinHex::Hex2Bin&quot;</A>; see that class for details.
</DL>
<P><HR>
<A NAME="construction">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Construction</FONT></H2>
</A>
<DL>
<P><DT><B><A NAME="new">new PARAMHASH</A></B><DD>
<I>Class method, constructor.</I>
Return a handle on a BinHex'able entity. In general, the data and resource
forks for such an entity are stored in native (binary) format.
<P>
Parameters in the PARAMHASH are the same as header-oriented method names,
and may be used to set attributes:
<P>
<PRE> &#36;HQX = new Convert::BinHex filename =&gt; &quot;icon.gif&quot;,
type =&gt; &quot;GIFB&quot;,
creator =&gt; &quot;CNVS&quot;;</PRE>
<P><DT><B><A NAME="open">open PARAMHASH</A></B><DD>
<I>Class method, constructor.</I>
Return a handle on a new BinHex'ed stream, for parsing.
Params are:
<DL>
<P><DT><B><A NAME="data">Data</A></B><DD>
Input a HEX stream from the given data. This can be a scalar, or a
reference to an array of scalars.
<P><DT><B><A NAME="expr">Expr</A></B><DD>
Input a HEX stream from any open()able expression. It will be opened and
binmode'd, and the filehandle will be closed either on a <CODE>close()</CODE>
or when the object is destructed.
<P><DT><B><A NAME="fh">FH</A></B><DD>
Input a HEX stream from the given filehandle.
<P><DT><B><A NAME="nocomment">NoComment</A></B><DD>
If true, the parser should not attempt to skip a leading &quot;(This file...)&quot;
comment. That means that the first nonwhite characters encountered
must be the binhex'ed data.
</DL>
</DL>
<P><HR>
<A NAME="getset_header_information">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Get/set header information</FONT></H2>
</A>
<DL>
<P><DT><B><A NAME="creator">creator [VALUE]</A></B><DD>
<I>Instance method.</I>
Get/set the creator of the file. This is a four-character
string (though I don't know if it's guaranteed to be printable ASCII!)
that serves as part of the Macintosh's version of a MIME &quot;content-type&quot;.
<P>
For example, a document created by &quot;Canvas&quot; might have
creator <CODE>&quot;CNVS&quot;</CODE>.
<P><DT><B><A NAME="data">data [PARAMHASH]</A></B><DD>
<I>Instance method.</I>
Get/set the data fork. Any arguments are passed into the
new() method of <A HREF="#convertbinhexfork">&quot;Convert::BinHex::Fork&quot;</A>.
<P><DT><B><A NAME="filename">filename [VALUE]</A></B><DD>
<I>Instance method.</I>
Get/set the name of the file.
<P><DT><B><A NAME="flags">flags [VALUE]</A></B><DD>
<I>Instance method.</I>
Return the flags, as an integer. Use bitmasking to get at the values
you need.
<P><DT><B><A NAME="header_as_string">header_as_string</A></B><DD>
Return a stringified version of the header that you might
use for logging/debugging purposes. It looks like this:
<P>
<PRE> X-HQX-Software: BinHex 4.0 (Convert::BinHex 1.102)
X-HQX-Filename: Something_new.eps
X-HQX-Version: 0
X-HQX-Type: EPSF
X-HQX-Creator: ART5
X-HQX-Data-Length: 49731
X-HQX-Rsrc-Length: 23096</PRE>
<P>
As some of you might have guessed, this is RFC-822-style, and
may be easily plunked down into the middle of a mail header, or
split into lines, etc.
<P><DT><B><A NAME="requires">requires [VALUE]</A></B><DD>
<I>Instance method.</I>
Get/set the software version required to convert this file, as
extracted from the comment that preceded the actual binhex'ed
data; e.g.:
<P>
<PRE> (This file must be converted with BinHex 4.0)</PRE>
<P>
In this case, after parsing in the comment, the code:
<P>
<PRE> &#36;HQX-&gt;requires;</PRE>
<P>
would get back &quot;4.0&quot;.
<P><DT><B><A NAME="resource">resource [PARAMHASH]</A></B><DD>
<I>Instance method.</I>
Get/set the resource fork. Any arguments are passed into the
new() method of <A HREF="#convertbinhexfork">&quot;Convert::BinHex::Fork&quot;</A>.
<P><DT><B><A NAME="type">type [VALUE]</A></B><DD>
<I>Instance method.</I>
Get/set the type of the file. This is a four-character
string (though I don't know if it's guaranteed to be printable ASCII!)
that serves as part of the Macintosh's version of a MIME &quot;content-type&quot;.
<P>
For example, a GIF89a file might have type <CODE>&quot;GF89&quot;</CODE>.
<P><DT><B><A NAME="version">version [VALUE]</A></B><DD>
<I>Instance method.</I>
Get/set the version, as an integer.
</DL>
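<P>
Because the <CODE>header_as_string()</CODE> output is RFC-822-style, it can be
taken apart again with ordinary line-oriented code. The sketch below works
on a hard-coded sample copied from the output shown above rather than on a
live object, so it runs without this module; the variable names are
arbitrary.
<P>
<PRE>
use strict;
use warnings;

# Sample lines copied from the header_as_string() output shown above.
my &#36;header = join &quot;\n&quot;,
    'X-HQX-Filename: Something_new.eps',
    'X-HQX-Type: EPSF',
    'X-HQX-Creator: ART5',
    'X-HQX-Data-Length: 49731';

# Split each &quot;Name: value&quot; line into a hash entry.
my %field;
for my &#36;line (split /\n/, &#36;header) {
    my (&#36;name, &#36;value) = &#36;line =~ /^([\w-]+):\s*(.*)&#36;/ or next;
    &#36;field{&#36;name} = &#36;value;
}

print &quot;&#36;field{'X-HQX-Filename'} is &#36;field{'X-HQX-Data-Length'} bytes\n&quot;;</PRE>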
<P><HR>
<A NAME="decode_highlevel">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Decode, high-level</FONT></H2>
</A>
<DL>
<P><DT><B><A NAME="read_comment">read_comment</A></B><DD>
<I>Instance method.</I>
Skip past the opening comment in the file, which is of the form:
<P>
<PRE> (This file must be converted with BinHex 4.0)</PRE>
<P>
As per RFC-1741, <I>this comment must immediately precede the BinHex data,</I>
and any text before it will be ignored.
<P>
<I>You don't need to invoke this method yourself;</I> <CODE>read_header()</CODE> will
do it for you. After the call, the version number in the comment is
accessible via the <CODE>requires()</CODE> method.
<P><DT><B><A NAME="read_header">read_header</A></B><DD>
<I>Instance method.</I>
Read in the BinHex file header. You must do this first!
<P><DT><B><A NAME="read_data">read_data [NBYTES]</A></B><DD>
<I>Instance method.</I>
Read information from the data fork. Use it in an array context to
slurp all the data into an array of scalars:
<P>
<PRE> @data = &#36;HQX-&gt;read_data;</PRE>
<P>
Or use it in a scalar context to get the data piecemeal:
<P>
<PRE> while (defined(&#36;data = &#36;HQX-&gt;read_data)) {
# do stuff with &#36;data
}</PRE>
<P>
The NBYTES to read defaults to 2048.
<P><DT><B><A NAME="read_resource">read_resource [NBYTES]</A></B><DD>
<I>Instance method.</I>
Read in all/some of the resource fork.
See <CODE>read_data()</CODE> for usage.
</DL>
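<P>
Put together, a complete decode might look like the sketch below. It uses
only the methods documented on this page, called in the required order
(header, then data, then resource), and reads both forks piecemeal so that
nothing large is held in core. The output names <CODE>data.out</CODE> and
<CODE>rsrc.out</CODE> are placeholders, and passing a plain path as
<CODE>Expr</CODE> is an assumption based on the &quot;any open()able expression&quot;
wording above.
<P>
<PRE>
use strict;
use warnings;
use Convert::BinHex;

my &#36;in = shift @ARGV or die &quot;usage: &#36;0 file.hqx\n&quot;;

my &#36;HQX = Convert::BinHex-&gt;open(Expr =&gt; &#36;in) or die &quot;open &#36;in: &#36;!&quot;;
&#36;HQX-&gt;read_header;                         # must come first
print STDERR &#36;HQX-&gt;header_as_string;       # log what we found

# &quot;data.out&quot; and &quot;rsrc.out&quot; are placeholder output names.
open my &#36;data, '&gt;', 'data.out' or die &quot;data.out: &#36;!&quot;;
binmode &#36;data;
while (defined(my &#36;chunk = &#36;HQX-&gt;read_data)) {      # scalar context: piecemeal
    print {&#36;data} &#36;chunk;
}
close &#36;data or die &quot;data.out: &#36;!&quot;;

open my &#36;rsrc, '&gt;', 'rsrc.out' or die &quot;rsrc.out: &#36;!&quot;;
binmode &#36;rsrc;
while (defined(my &#36;chunk = &#36;HQX-&gt;read_resource)) {
    print {&#36;rsrc} &#36;chunk;
}
close &#36;rsrc or die &quot;rsrc.out: &#36;!&quot;;</PRE>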
<P><HR>
<A NAME="encode_highlevel">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Encode, high-level</FONT></H2>
</A>
<DL>
<P><DT><B><A NAME="encode">encode OUT</A></B><DD>
Encode the object as a BinHex stream to the given output handle OUT.
OUT can be a filehandle, or any blessed object that responds to a
<CODE>print()</CODE> message.
<P>
The leading comment is output, using the <CODE>requires()</CODE> attribute.
</DL>
<P><HR>
<A NAME="submodules">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
SUBMODULES</FONT></H1>
</A>
<P><HR>
<A NAME="convertbinhexbin2hex">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Convert::BinHex::Bin2Hex</FONT></H2>
</A>
<P>
A BINary-to-HEX converter. This kind of conversion requires
a certain amount of state information; it cannot be done by
just calling a simple function repeatedly. Use it like this:
<P>
<PRE> # Create and use a &quot;translator&quot; object:
my &#36;B2H = Convert::BinHex-&gt;bin2hex; # get a converter object
while (&lt;STDIN&gt;) {
print STDOUT &#36;B2H-&gt;next(&#36;_); # convert some more input
}
print STDOUT &#36;B2H-&gt;done; # no more input: finish up</PRE>
<P>
<PRE> # Re-use the object:
&#36;B2H-&gt;rewind; # ready for more action!
while (&lt;MOREIN&gt;) { ...</PRE>
<P>
On each iteration, <CODE>next()</CODE> (and <CODE>done()</CODE>) may return either
a decent-sized non-empty string (indicating that more converted data
is ready for you) or an empty string (indicating that the converter
is waiting to amass more input in its private buffers before handing
you more stuff to output).
<P>
Note that <CODE>done()</CODE> <I>always</I> converts and hands you whatever is left.
<P>
This may have been a good approach. It may not. Someday, the converter
may also allow you to give it an object that responds to read(), or
a FileHandle, and it will do all the nasty buffer-filling on its own,
serving you stuff line by line:
<P>
<PRE> # Someday, maybe...
my &#36;B2H = Convert::BinHex-&gt;bin2hex(\*STDIN);
while (defined(&#36;_ = &#36;B2H-&gt;getline)) {
print STDOUT &#36;_;
}</PRE>
<P>
Someday, maybe. Feel free to voice your opinions.
<P><HR>
<A NAME="convertbinhexhex2bin">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Convert::BinHex::Hex2Bin</FONT></H2>
</A>
<P>
A HEX-to-BINary converter. This kind of conversion requires
a certain amount of state information; it cannot be done by
just calling a simple function repeatedly. Use it like this:
<P>
<PRE> # Create and use a &quot;translator&quot; object:
my &#36;H2B = Convert::BinHex-&gt;hex2bin; # get a converter object
while (&lt;STDIN&gt;) {
print STDOUT &#36;H2B-&gt;next(&#36;_); # convert some more input
}
print STDOUT &#36;H2B-&gt;done; # no more input: finish up</PRE>
<P>
<PRE> # Re-use the object:
&#36;H2B-&gt;rewind; # ready for more action!
while (&lt;MOREIN&gt;) { ...</PRE>
<P>
On each iteration, <CODE>next()</CODE> (and <CODE>done()</CODE>) may return either
a decent-sized non-empty string (indicating that more converted data
is ready for you) or an empty string (indicating that the converter
is waiting to amass more input in its private buffers before handing
you more stuff to output).
<P>
Note that <CODE>done()</CODE> <I>always</I> converts and hands you whatever is left.
<P>
Note that this converter does <I>not</I> find the initial
&quot;BinHex version&quot; comment. You have to skip that yourself. It
only handles data between the opening and closing <CODE>&quot;:&quot;</CODE>.
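<P>
One way to do that skipping is sketched below. The helper name
<CODE>extract_payload</CODE> is made up for the example, and the sketch
assumes the whole message is already in one string with <CODE>&quot;\n&quot;</CODE>
line endings; it only locates the encoded text, and leaves open how the
delimiters should be treated when the text is handed to the converter.
<P>
<PRE>
use strict;
use warnings;

# Hypothetical helper: return the text between the two &quot;:&quot; delimiters that
# follow the &quot;(This file must be converted with BinHex ...)&quot; comment.
sub extract_payload {
    my (&#36;text) = @_;
    # Drop everything up to and including the comment line.
    &#36;text =~ s/\A.*?^\(This file must be converted with BinHex[^\n]*\n//ms
        or return;
    # The data runs from the first &quot;:&quot; to the next one.
    my (&#36;payload) = &#36;text =~ /:([^:]*):/ or return;
    &#36;payload =~ s/\s+//g;    # line breaks are not part of the data
    return &#36;payload;
}

my &#36;message = &quot;Hi!\n(This file must be converted with BinHex 4.0)\n\n:abc\ndef:\n&quot;;
print extract_payload(&#36;message), &quot;\n&quot;;</PRE>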
<P><HR>
<A NAME="convertbinhexfork">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Convert::BinHex::Fork</FONT></H2>
</A>
<P>
A fork in a Macintosh file.
<P>
<PRE> # How to get them...
&#36;data_fork = &#36;HQX-&gt;data; # get the data fork
&#36;rsrc_fork = &#36;HQX-&gt;resource; # get the resource fork</PRE>
<P>
<PRE> # Make a new fork:
&#36;FORK = Convert::BinHex::Fork-&gt;new(Path =&gt; &quot;/tmp/file.data&quot;);
&#36;FORK = Convert::BinHex::Fork-&gt;new(Data =&gt; &#36;scalar);
&#36;FORK = Convert::BinHex::Fork-&gt;new(Data =&gt; \@array_of_scalars);</PRE>
<P>
<PRE> # Get/set the length of the data fork:
&#36;len = &#36;FORK-&gt;length;
&#36;FORK-&gt;length(170); # this overrides the REAL value: be careful!</PRE>
<P>
<PRE> # Get/set the path to the underlying data (if in a disk file):
&#36;path = &#36;FORK-&gt;path;
&#36;FORK-&gt;path(&quot;/tmp/file.data&quot;);</PRE>
<P>
<PRE> # Get/set the in-core data itself, which may be a scalar or an arrayref:
&#36;data = &#36;FORK-&gt;data;
&#36;FORK-&gt;data(&#36;scalar);
&#36;FORK-&gt;data(\@array_of_scalars);</PRE>
<P>
<PRE> # Get/set the CRC:
&#36;crc = &#36;FORK-&gt;crc;
&#36;FORK-&gt;crc(&#36;crc);</PRE>
<P><HR>
<A NAME="under_the_hood">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
UNDER THE HOOD</FONT></H1>
</A>
<P><HR>
<A NAME="design_issues">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
Design issues</FONT></H2>
</A>
<DL>
<P><DT><B><A NAME="binhex">BinHex needs a stateful parser</A></B><DD>
Unlike its cousins <I>base64</I> and <I>uuencode</I>, BinHex format is not
amenable to being parsed line-by-line. There appears to be no
guarantee that lines contain 4n encoded characters... and even if there
is one, the BinHex compression algorithm interferes: even when you
can <I>decode</I> one line at a time, you can't necessarily
<I>decompress</I> a line at a time.
<P>
For example: a decoded line ending with the byte <CODE>\x90</CODE> (the escape
or &quot;mark&quot; character) is ambiguous: depending on the next decoded byte,
it could mean a literal <CODE>\x90</CODE> (if the next byte is a <CODE>\x00</CODE>), or
it could mean n-1 more repetitions of the previous character (if
the next byte is some nonzero <CODE>n</CODE>).
<P>
For this reason, a BinHex parser has to be somewhat stateful: you
cannot have code like this:
<P>
<PRE> #### NO! #### NO! #### NO! #### NO! #### NO! ####
while (&lt;STDIN&gt;) { # read HEX
print hexbin(&#36;_); # convert and write BIN
}</PRE>
<P>
unless something is happening &quot;behind the scenes&quot; to keep track of
what was last done. <I>The dangerous thing, however, is that this
approach will <B>seem</B> to work, if you only test it on BinHex files
which do not use compression and which have 4n HEX characters
on each line.</I>
<P>
Since we have to be stateful anyway, we use the parser object to
keep our state.
<P><DT><B><A NAME="we">We need to be able to handle large input files</A></B><DD>
Solutions that demand reading everything into core don't cut
it in my book. The first MPEG file that comes along can louse
up your whole day. So, there are no size limitations in this
module: the data is read on-demand, and filehandles are always
an option.
<P><DT><B><A NAME="boy">Boy, is this slow!</A></B><DD>
A lot of the byte-level manipulation that has to go on, particularly
the CRC computing (which involves intensive bit-shifting and masking),
slows this module down significantly. What is needed perhaps is an
<I>optional</I> extension library where the slow pieces can be done more
quickly... a Convert::BinHex::CRC, if you will. Volunteers, anyone?
<P>
Even considering that, however, it's slower than I'd like. I'm
sure many improvements can be made in the HEX-to-BIN end of things.
No doubt I'll attempt some as time goes on...
</DL>
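<P>
The <CODE>\x90</CODE> ambiguity described under &quot;BinHex needs a stateful
parser&quot; is the kind of thing a converter object has to remember between
calls. The class below is a minimal sketch of the decompression layer only,
under the made-up name <CODE>RunDecoder</CODE>; it is not this module's
implementation, and it borrows the <CODE>next()</CODE> method name purely
for familiarity.
<P>
<PRE>
use strict;
use warnings;

package RunDecoder;    # hypothetical class, for illustration only

sub new { return bless { prev =&gt; undef, mark =&gt; 0 }, shift }

# Decompress one chunk; the state survives between calls.
sub next {
    my (&#36;self, &#36;chunk) = @_;
    my &#36;out = '';
    for my &#36;byte (split //, &#36;chunk) {
        if (&#36;self-&gt;{mark}) {                  # previous byte was the 0x90 mark
            &#36;self-&gt;{mark} = 0;
            my &#36;n = ord &#36;byte;
            if (&#36;n == 0) {                    # 0x90 0x00: a literal 0x90
                &#36;out .= &#36;self-&gt;{prev} = &quot;\x90&quot;;
            }
            else {                            # n-1 more copies of the previous byte
                die &quot;run with nothing to repeat&quot; unless defined &#36;self-&gt;{prev};
                &#36;out .= &#36;self-&gt;{prev} x (&#36;n - 1);
            }
        }
        elsif (&#36;byte eq &quot;\x90&quot;) { &#36;self-&gt;{mark} = 1 }    # wait for the next byte
        else                    { &#36;out .= &#36;self-&gt;{prev} = &#36;byte }
    }
    return &#36;out;
}

package main;

my &#36;dec = RunDecoder-&gt;new;
# The mark ends the first chunk; its count only arrives with the second.
print &#36;dec-&gt;next(&quot;A\x90&quot;), &#36;dec-&gt;next(&quot;\x03B&quot;), &quot;\n&quot;;    # prints AAAB</PRE>
<P>
A stateless function given the first chunk alone could not know whether
the trailing <CODE>\x90</CODE> was a literal byte or the start of a run.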
<P><HR>
<A NAME="how_it_works">
<H2><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-tiny.gif" ALT="" BORDER="0"></A>
How it works</FONT></H2>
</A>
<P>
Since BinHex is a layered format, consisting of...
<P>
<PRE> A Macintosh file [the &quot;BIN&quot;]...
Encoded as a structured 8-bit bytestream, then...
Compressed to reduce duplicate bytes, then...
Encoded as 7-bit ASCII [the &quot;HEX&quot;]</PRE>
<P>
...there is a layered parsing algorithm to reverse the process.
Basically, it works in a similar fashion to stdio's fread():
<P>
<PRE> 0. There is an internal buffer of decompressed (BIN) data,
initially empty.
1. Application asks to read() n bytes of data from object
2. If the buffer is not full enough to accommodate the request:
2a. The read() method grabs the next available chunk of input
data (the HEX).
2b. HEX data is converted and decompressed into as many BIN
bytes as possible.
2c. BIN bytes are added to the read() buffer.
2d. Go back to step 2a. until the buffer is full enough
or we hit end-of-input.</PRE>
<P>
The conversion-and-decompression algorithms need their own internal
buffers and state (since the next input chunk may not contain all the
data needed for a complete conversion/decompression operation).
These are maintained in the object, so parsing two different
input streams simultaneously is possible.
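<P>
The numbered steps above can be sketched as a small class. Everything in
it is hypothetical: the class name <CODE>ChunkReader</CODE>, the two
callbacks it takes (one that supplies the next chunk of HEX input, one
that converts a chunk to BIN), and the stand-in conversion used in the
demonstration, which merely upper-cases its input.
<P>
<PRE>
use strict;
use warnings;

package ChunkReader;    # hypothetical class, for illustration only

# &#36;source returns the next HEX chunk (or undef at end of input);
# &#36;convert turns a HEX chunk into BIN bytes.
sub new {
    my (&#36;class, &#36;source, &#36;convert) = @_;
    return bless { source =&gt; &#36;source, convert =&gt; &#36;convert,
                   buffer =&gt; '', eof =&gt; 0 }, &#36;class;    # step 0: empty buffer
}

sub read {
    my (&#36;self, &#36;n) = @_;
    # Step 2: refill until the buffer can satisfy the request, or input ends.
    until (length(&#36;self-&gt;{buffer}) &gt;= &#36;n or &#36;self-&gt;{eof}) {
        my &#36;hex = &#36;self-&gt;{source}-&gt;();                    # 2a
        if (defined &#36;hex) {
            &#36;self-&gt;{buffer} .= &#36;self-&gt;{convert}-&gt;(&#36;hex);  # 2b, 2c
        }
        else {
            &#36;self-&gt;{eof} = 1;
        }
    }
    return undef if &#36;self-&gt;{eof} and &#36;self-&gt;{buffer} eq '';
    return substr &#36;self-&gt;{buffer}, 0, &#36;n, '';    # hand back up to n bytes
}

package main;

my @chunks = ('ab', 'cd', 'e');
my &#36;reader = ChunkReader-&gt;new(sub { shift @chunks }, sub { uc shift });
while (defined(my &#36;bin = &#36;reader-&gt;read(3))) {
    print &quot;&#36;bin\n&quot;;    # prints ABC, then DE
}</PRE>
<P>
Because the buffer and the end-of-input flag live in the object, two
readers over two different inputs do not interfere with each other.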
<P><HR>
<A NAME="warnings">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
WARNINGS</FONT></H1>
</A>
<P>
Only handles <CODE>Hqx7</CODE> files, as per RFC-1741.
<P>
Remember that Macintosh text files use <CODE>&quot;\r&quot;</CODE> as end-of-line:
this means that if you want a textual file to look normal on
a non-Mac system, you probably want to do this to the data:
<P>
<PRE> # Get the data, and output it according to normal conventions:
foreach (&#36;HQX-&gt;read_data) { s/\r/\n/g; print }</PRE>
<P><HR>
<A NAME="change_log">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
CHANGE LOG</FONT></H1>
</A>
<P>
Current version: &#36;Id: BinHex.pm,v 1.119 1997/06/28 05:12:42 eryq Exp &#36;
<DL>
<P><DT><B><A NAME="version">Version 1.118</A></B><DD>
Ready to go public (with Paul's version, patched for native Mac support)!
Warnings have been suppressed in a few places where undefined values
appear.
<P><DT><B><A NAME="version">Version 1.115</A></B><DD>
Fixed another bug in comp2bin, related to the MARK falling on a
boundary between inputs. Added testing code.
<P><DT><B><A NAME="version">Version 1.114</A></B><DD>
Added BIN-to-HEX conversion. Eh. It's a start.
Also, a lot of documentation additions and cleanups.
Some methods were also renamed.
<P><DT><B><A NAME="version">Version 1.103</A></B><DD>
Fixed bug in decompression (wasn't saving last character).
Fixed &quot;NoComment&quot; bug.
<P><DT><B><A NAME="version">Version 1.102</A></B><DD>
Initial release.
</DL>
<P><HR>
<A NAME="author_and_credits">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
AUTHOR AND CREDITS</FONT></H1>
</A>
<P>
Written by Eryq, <I><A HREF="http://www.enteract.com/~eryq">http://www.enteract.com/~eryq</A></I> / <I><A HREF="mailto:eryq@enteract.com">eryq@enteract.com</A></I>
<P>
Support for native-Mac conversion, <I>plus</I> invaluable contributions in
Alpha Testing, <I>plus</I> a few patches, <I>plus</I> the baseline binhex/debinhex
programs, were provided by Paul J. Schinder (NASA/GSFC).
<P>
Ken Lunde (Adobe) suggested incorporating the CAP file representation.
<P><HR>
<A NAME="terms_and_conditions">
<H1><FONT COLOR=#800030>
<A HREF="#__top"><IMG SRC="BinHex/redapple-sm.gif" ALT="" BORDER="0"></A>
TERMS AND CONDITIONS</FONT></H1>
</A>
<P>
Copyright (c) 1997 by Eryq. All rights reserved. This program is free
software; you can redistribute it and/or modify it under the same terms as
Perl itself.
<P>
This software comes with <B>NO WARRANTY</B> of any kind.
See the COPYING file in the distribution for details.
<P><HR>
<SMALL>
Apple Computer Corporation
neither endorses nor is in any way connected with
the development of this software.
<P>
Last updated: Sat Jun 28 00:17:41 1997 <BR>
Generated by pod2coolhtml 1.101. Want a copy? Just email
<A HREF="mailto:eryq@enteract.com">eryq@enteract.com</A>.
(Yes, it's free.)
</SMALL></BODY>
</HTML>