.sp2
DATA FIELD ENCODING

Because of the way Apple's disk
hardware works, not all 256 possible
byte values can be read back from a
diskette.
This is not a great problem, but it
does require that the data written to
the disk be encoded.  Three different
techniques have been used.  The first
one, which is currently used in
Address Fields, involves
writing a data byte as two disk
bytes, one containing the odd bits,
and the other containing the even
bits.  It thus requires 512
"disk" bytes for each 256 byte sector
of data.  Had this technique been used
for sector data, no more than 10
sectors would have fit on a track.
This amounts to about 88K of data per
diskette, or roughly 72K of space
available to the user, which was
typical for 5 1/4 inch single density
drives.

Fortunately, a second technique for
writing data to diskette was devised
that allows 13
sectors per track.  This method
uses a "5 and 3" split of the
data bits, rather than the "4 and 4"
split just described.  Each byte
written to the disk contains five
valid bits rather than four, so 410
"disk" bytes are needed to store a
256 byte sector.  This density allows
the now well known 13 sectors per
track format used by DOS 3 through
DOS 3.2.1.  The "5 and 3" scheme
represented a hefty 33% increase over
comparable drives of the day.

Currently, of course, DOS 3.3
features 16 sectors per track and
provides a 23% increase in disk
storage over the 13 sector format.
This was made possible by a hardware
modification (the P6 PROM on the disk
controller card) which allowed a "6
and 2" split of the data.  The change
was to the logic of the "state
machine" in the P6 PROM, which now
allows one pair of consecutive zero
bits in a disk byte.
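
The sector counts quoted above
translate directly into capacity.
The sketch below is arithmetic only.
The figure of 35 tracks per diskette
is an assumption (it is not stated in
this section), and the names are
invented for illustration.

```python
BYTES_PER_SECTOR = 256
TRACKS_PER_DISK = 35  # assumption: not stated in this section


def disk_capacity(sectors_per_track):
    """Raw data bytes on one diskette for a given sector count."""
    return sectors_per_track * BYTES_PER_SECTOR * TRACKS_PER_DISK


for sectors in (10, 13, 16):
    print(sectors, "sectors:", disk_capacity(sectors) / 1024, "K")

# Gain of the 16 sector format over the 13 sector format, in percent.
gain_16_over_13 = (16 - 13) / 13 * 100
print("16 over 13:", round(gain_16_over_13), "%")
```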

These three different encoding
techniques
will now be covered in some detail.
The hardware for DOS 3.2.1 (and
earlier versions of DOS) imposed a
number of restrictions upon how data
could be stored and retrieved.  It
required that a disk byte have the
high bit set and, in addition, no two
consecutive bits could be zero.
The odd-even "4 and 4" technique meets
these requirements.  Each data byte
is represented as two bytes, one
containing the even data bits and the
other the odd data bits.  Figure 3.15
illustrates this transformation.  It
should be noted that the unused bits
are all set to one to guarantee
meeting the two requirements.
.sp1
*** INSERT FIGURE 3.15 HERE ***

No matter what value the original
data byte has, this technique
ensures that the high bit is set and
that there cannot be two consecutive
zero bits.  The "4 and 4" technique is
used to store the information
(volume, track, sector, checksum)
contained in the Address Field.  It
is quite easy to decode the data, since the
byte with the odd bits is simply
shifted left and logically ANDed with
the byte containing the even bits.
This is illustrated in Figure 3.16.
.sp1
*** INSERT FIGURE 3.16 HERE ***

It is important that the least
significant bit contain a 1 when the
odd-bits byte is left shifted.  The
entire
operation is carried out in the
RDADR subroutine at $B944 in DOS
(48K).
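
The odd-even split and its decoding
can be sketched in a few lines.  This
is an illustration of the technique
described above, not a transcription
of RDADR.  The function names are
hypothetical, and the exact bit
placement (unused bits filled with
the mask $AA) is assumed from the
description in the text.

```python
def encode_4_and_4(value):
    """Split one data byte into two disk bytes (odd bits, even bits)."""
    odd = (value >> 1) | 0xAA   # D7 D5 D3 D1 land in bits 6, 4, 2, 0
    even = value | 0xAA         # D6 D4 D2 D0 stay in bits 6, 4, 2, 0
    return odd, even


def decode_4_and_4(odd, even):
    """Shift the odd-bits byte left, bringing in a 1, then AND."""
    shifted = ((odd << 1) | 1) & 0xFF  # the low bit must be 1
    return shifted & even


odd, even = encode_4_and_4(0x3C)
print(hex(odd), hex(even), hex(decode_4_and_4(odd, even)))
```

If the bit shifted into the low end
were 0, bit 0 of every decoded byte
would be lost, which is why the text
insists on a 1 there.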

The major difficulty with the above
technique is that it takes up a lot of
room on the track.  To overcome this
deficiency the "5 and 3" encoding
technique was developed.  It is so named
because, instead of splitting the
bytes in half, as in the odd-even
technique, they are split five and three.  A
byte would have the form 000XXXXX,
where X is a valid data bit.  The
above byte could range in value from
$00 to $1F, a total of 32 different
values.  It so happens that there are
34 valid "disk" bytes, ranging
from $AA up to $FF, which meet the
two requirements (high bit set, no
consecutive zero bits).  Two bytes,
$D5 and $AA, were chosen as reserved
bytes, thus leaving an exact mapping
between five bit data bytes and eight bit
"disk" bytes.  The process of
converting eight bit data bytes to eight bit
"disk" bytes, then, is twofold.
An overview is diagrammed in Figure
3.17.
.sp1
*** INSERT FIGURE 3.17 HERE ***
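
The count of 34 valid "disk" bytes
can be checked by brute force.  The
sketch below simply tests every byte
against the two requirements given
above.  The names are made up for
illustration.

```python
def no_adjacent_zeros(byte):
    """True if the eight bits contain no two consecutive zeros."""
    return "00" not in format(byte, "08b")


# High bit set means the byte is $80 or greater.
candidates = [b for b in range(0x80, 0x100) if no_adjacent_zeros(b)]

RESERVED = {0xD5, 0xAA}
usable = [b for b in candidates if b not in RESERVED]

print(len(candidates), hex(min(candidates)), len(usable))
```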

First, the 256 bytes that will make
up a sector must be translated to
five bit bytes.  This is done by the
"prenibble" routine at $B800.  It is
a fairly involved process, requiring
a good deal of bit rearrangement.
Figure 3.18 shows the buffers before
and after prenibbilizing.  On the left
is a buffer of eight bit data bytes, as
passed to the RWTS subroutine
package by DOS.  Each byte in this
buffer is represented by a letter (A,
B, C, etc.) and each bit by a number
(7 through 0).  On the right side are
the results of the transformation.
The primary buffer contains five
distinct areas of five bit bytes (the
top three bits of the eight bit bytes
zero-filled) and the secondary buffer
contains three areas, graphically
illustrating the name "5 and 3".
.bp
*** INSERT FIGURE 3.18 HERE ***

A total of 410 bytes are needed to
store the original 256.  This can be
calculated by finding the total bits
of data (256 x 8 = 2048) and dividing
that by the number of bits per byte
(2048 / 5 = 409.6), then rounding up.
Two of the 2050 available bits are
not used.  Once this process is
completed, the data is further
transformed to make it valid "disk"
bytes, meeting the disk's
requirements.  This is much easier,
involving a one to one look-up in the
table given in Figure 3.19.
.sp1
*** INSERT FIGURE 3.19 HERE ***
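
The byte count worked out above can
be expressed as a small helper.  This
is a sketch of the arithmetic only.
The function name is hypothetical.

```python
import math


def disk_bytes_needed(data_bytes, bits_per_disk_byte):
    """Return (disk bytes required, bits left unused)."""
    total_bits = data_bytes * 8
    count = math.ceil(total_bits / bits_per_disk_byte)
    unused = count * bits_per_disk_byte - total_bits
    return count, unused


print(disk_bytes_needed(256, 5))
```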

The Data Field has a checksum much
like the one in the Address Field,
used to verify the integrity of the
data.  It also involves
exclusive-ORing the information, but,
because of the time constraints while
bytes are being read, it is
implemented differently.  Each value
is exclusive-ORed with the one before
it before being transformed by the
look-up table in Figure 3.19.  This
is illustrated in Figure 3.20.
.sp1
*** INSERT FIGURE 3.20 HERE ***

The reason for this transformation
can be better understood by examining
how the information is retrieved from
the disk.  The read routine must read
a byte, transform it, and store it --
all in under 32 cycles (the time
taken to write a byte) or the
information will be lost.  Because
the checksum computation itself
decodes the data, the
transformation shown in Figure 3.20
makes this time constraint much
easier to meet.  As the data is being
read from a sector the accumulator
contains the cumulative result of
all previous bytes, exclusive-ORed
together.  The value of the
accumulator after any exclusive-OR
operation is the actual data byte
for that point in the series.
This process is diagrammed in Figure 3.21.
.bp
*** INSERT FIGURE 3.21 HERE ***
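
The running exclusive-OR can be
sketched as follows.  This is a
minimal model of the idea only: the
look-up table step is left out, the
order in which DOS walks its buffers
is not reproduced, and the function
names are hypothetical.

```python
def chain_for_writing(values):
    """XOR each value with the one before it; append a checksum."""
    stream = []
    previous = 0
    for value in values:
        stream.append(value ^ previous)
        previous = value
    stream.append(previous)  # checksum: the last value on its own
    return stream


def read_back(stream):
    """Recover the values with a running XOR in one 'accumulator'."""
    acc = 0
    decoded = []
    for value in stream[:-1]:
        acc ^= value         # acc is now the actual data value
        decoded.append(acc)
    acc ^= stream[-1]        # folding in the checksum should give 0
    return decoded, acc == 0


stream = chain_for_writing([0x1F, 0x03, 0x10])
print(read_back(stream))
```

The reader does a single
exclusive-OR per byte, and the same
operation both decodes the data and
accumulates the checksum.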

The third encoding technique, currently
used by DOS 3.3, is similar to the "5
and 3".  It was made possible by a
change in the hardware which eased
the requirements for valid data
somewhat.  The high bit must still be
set, but now the byte may contain one
(and only one) pair of consecutive
zero bits.  This allows a greater
number of valid bytes and permits the
use of a "6 and 2" encoding technique.
A six bit byte would have the form
00XXXXXX and has values from $00 to
$3F for a total of 64 different
values.  With the new, relaxed
requirements for valid "disk" bytes
there are 69 different bytes ranging
in value from $96 up to $FF.  After
removing the two reserved bytes, $AA
and $D5, there are still 67 "disk" bytes
with only 64 needed.  An additional
requirement was introduced to force
the mapping to be one to one, namely,
that there must be at least two
adjacent bits set, excluding bit 7.
This produces exactly 64 valid "disk"
values.  The initial transformation
is done by the prenibble routine
(still located at $B800) and its
results are shown in Figure 3.22.
.sp1
*** INSERT FIGURE 3.22 (20) HERE ***
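
The final set of rules can be applied
by brute force to confirm that
exactly 64 "disk" values remain.  The
sketch below tests every byte with
the high bit set against the rules
stated above.  The names are invented
for illustration.

```python
def zero_pairs(byte):
    """Count adjacent bit positions that are both zero."""
    bits = format(byte, "08b")
    return sum(1 for i in range(7) if bits[i:i + 2] == "00")


def has_adjacent_ones(byte):
    """True if two adjacent bits are set, ignoring bit 7."""
    return "11" in format(byte & 0x7F, "07b")


RESERVED = {0xAA, 0xD5}

valid = [b for b in range(0x80, 0x100)
         if zero_pairs(b) <= 1
         and has_adjacent_ones(b)
         and b not in RESERVED]

print(len(valid), hex(min(valid)), hex(max(valid)))
```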

A total of 342 bytes are needed,
as shown by finding the total number
of bits (256 x 8 = 2048) and dividing
by the number of bits per byte
(2048 / 6 = 341.33), then rounding
up.  The transformation from
the six bit bytes to valid "disk"
bytes is again performed by a one to
one mapping, shown in Figure 3.23.
Once again, the stream of bytes
written to the diskette is a product
of exclusive-ORs, exactly as with the
"5 and 3" technique discussed earlier.
.sp1
*** INSERT FIGURE 3.23 (21) HERE ***
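
To see where the 342 bytes come from,
a "6 and 2" split can be sketched as
below.  This is not the prenibble
routine: the order in which the two
bit pieces are packed here is an
assumption chosen for simplicity and
does not reproduce the arrangement
in Figure 3.22.  The function names
are hypothetical.

```python
def split_6_and_2(data):
    """Split bytes into six bit values plus packed two bit pieces."""
    primary = [b >> 2 for b in data]          # top six bits of each byte
    secondary = []
    for start in range(0, len(data), 3):      # three pieces per six bit byte
        packed = 0
        for slot, b in enumerate(data[start:start + 3]):
            packed |= (b & 0x03) << (2 * slot)
        secondary.append(packed)
    return primary, secondary


def join_6_and_2(primary, secondary):
    """Reverse split_6_and_2."""
    data = []
    for i, high in enumerate(primary):
        low = (secondary[i // 3] >> (2 * (i % 3))) & 0x03
        data.append((high << 2) | low)
    return data


primary, secondary = split_6_and_2(list(range(256)))
print(len(primary), len(secondary), len(primary) + len(secondary))
```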
.sp1
SECTOR INTERLEAVING

Sector interleaving is the staggering
of sectors on a track to maximize
access speed.  There is usually a
delay between the time DOS reads or
writes a sector and the time it is
ready to read or write another.  This
delay depends upon the application
program using the disk and can vary
greatly.  If sectors were stored
on the track sequentially, in
ascending numerical order, it would
usually be necessary to wait an
entire revolution of the diskette
before the next sector could be
accessed, unless the application was
very quick indeed.  Rearranging the
sectors into a different order, or
"interleaving" them, can shorten
this wait.
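
The effect of spacing on waiting time
can be shown with a very simple
model.  It is a sketch under stated
assumptions: time is measured in
"sector times" (the time one sector
takes to pass the head), the delay is
a whole number of sector times, and
the function name and parameters are
hypothetical.

```python
def wait_after_sector(sectors_per_track, spacing, delay):
    """Sector times spent waiting for the next wanted sector.

    spacing: physical positions from one wanted sector to the next
             (1 means the sectors are stored sequentially)
    delay:   sector times that pass before the program is ready again
    """
    return (spacing - 1 - delay) % sectors_per_track


# Sequential layout, program needs one sector time: nearly a full turn.
print(wait_after_sector(16, 1, 1))
# Same program, next sector stored two positions ahead: no wait.
print(wait_after_sector(16, 2, 1))
```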
.bp
On DOS 3.2.1 and earlier versions,
the 13 sectors are physically
interleaved on the disk.  Since DOS
resides on the diskette in ascending
sequential order and files generally
are stored in descending sequential
order, no single interleaving scheme
works well for both booting and
sequentially accessing a file.

A different approach has been used in
DOS 3.3 in an attempt to maximize
performance.  The interleaving is now
done in software.  The 16 sectors are
stored in numerically ascending order
on the diskette (0, 1, 2, ... , 15)
and are not physically interleaved at
all.  A look-up table is used to
translate the physical sector number
into a "pseudo" or soft sector number
used by DOS.  For example, if the
sector number found on the disk were a
2, this is used as an offset into a
table where the number $0B is found.
Thus, DOS treats the physical sector
2 as sector 11 ($0B) for all intents
and purposes.  This presents no
problem if RWTS is used for disk
access, but would become a
consideration if access were made
without RWTS.
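
The look-up itself is a single table
index, as sketched below.  The table
contents are an assumption: only the
entry for 2 ($0B) is given in the
text, and the remaining values follow
the ordering commonly quoted for
DOS 3.3.  Figure 3.24 should be taken
as the reference.  The names are
hypothetical.

```python
# Assumed DOS 3.3 ordering; only entry 2 ($0B) comes from the text.
SOFT_SECTOR = [0x00, 0x0D, 0x0B, 0x09, 0x07, 0x05, 0x03, 0x01,
               0x0E, 0x0C, 0x0A, 0x08, 0x06, 0x04, 0x02, 0x0F]


def soft_sector(found_on_disk):
    """Use the sector number as an offset into the table."""
    return SOFT_SECTOR[found_on_disk]


print(hex(soft_sector(2)))
```

Any program that reads the disk
without RWTS would need an equivalent
table of its own.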

To eliminate the access differences
between booting and reading files,
another change has been made.  During
the boot process, DOS is loaded
backwards in descending sequential
order into memory, just as files are
accessed.  This means a single
interleaving scheme can minimize disk
access time for both.

It is interesting to point out that
Pascal, Fortran, and CP/M diskettes
also use software interleaving.
However, each uses a different
sector order.  A comparison of these
differences is presented in Figure
3.24.
.sp1
*** INSERT FIGURE 3.24 (22) HERE ***
.br
.nx ch4