.SP2
DATA FIELD ENCODING

Due to Apple's hardware, it is not possible to read all 256 possible byte values from a diskette. This is not a great problem, but it does require that the data written to the disk be encoded. Three different techniques have been used. The first one, which is currently used in Address Fields, involves writing a data byte as two disk bytes, one containing the odd bits and the other containing the even bits. It would thus require 512 "disk" bytes for each 256 byte sector of data. Had this technique been used for sector data, no more than 10 sectors would have fit on a track. This amounts to about 88K of data per diskette, or roughly 72K of space available to the user, which is typical for 5 1/4 inch single density drives.

Fortunately, a second technique for writing data to diskette was devised that allows 13 sectors per track. This new method involves a "5 and 3" split of the data bits, versus the "4 and 4" mentioned earlier. Each byte written to the disk contains five valid bits rather than four. This requires 410 "disk" bytes to store a 256 byte sector. This density allows the now well known 13 sectors per track format used by DOS 3 through DOS 3.2.1. The "5 and 3" scheme represented a hefty 33% increase over comparable drives of the day.

Currently, of course, DOS 3.3 features 16 sectors per track and provides a 23% increase in disk storage over the 13 sector format. This was made possible by a hardware modification (the P6 PROM on the disk controller card) which allowed a "6 and 2" split of the data. The change was to the logic of the "state machine" in the P6 PROM, which now allows two consecutive zero bits in data bytes.

These three different encoding techniques will now be covered in some detail. The hardware for DOS 3.2.1 (and earlier versions of DOS) imposed a number of restrictions upon how data could be stored and retrieved. It required that a disk byte have the high bit set and, in addition, that no two consecutive bits be zero.
The odd-even "4 and 4" technique meets these requirements. Each data byte is represented as two bytes, one containing the even data bits and the other the odd data bits. Figure 3.15 illustrates this transformation. It should be noted that the unused bits are all set to one to guarantee meeting the two requirements.
.sp1
*** INSERT FIGURE 3.15 HERE ***

No matter what value the original data byte has, this technique ensures that the high bit is set and that there cannot be two consecutive zero bits. The "4 and 4" technique is used to store the information (volume, track, sector, checksum) contained in the Address Field. It is quite easy to decode the data, since the byte with the odd bits is simply shifted left and logically ANDed with the byte containing the even bits. This is illustrated in Figure 3.16.
.sp1
*** INSERT FIGURE 3.16 HERE ***

It is important that the least significant bit contain a 1 when the odd-bits byte is left shifted; otherwise the AND would clear bit 0 of the result. The entire operation is carried out in the RDADR subroutine at $B944 in DOS (48K).

The major difficulty with the above technique is that it takes up a lot of room on the track. To overcome this deficiency the "5 and 3" encoding technique was developed. It is so named because, instead of splitting the bytes in half, as in the odd-even technique, they are split five and three. A byte would have the form 000XXXXX, where X is a valid data bit. The above byte could range in value from $00 to $1F, a total of 32 different values. It so happens that there are 34 valid "disk" bytes, ranging from $AA up to $FF, which meet the two requirements (high bit set, no consecutive zero bits). Two bytes, $D5 and $AA, were chosen as reserved bytes, thus leaving an exact mapping between five bit data bytes and eight bit "disk" bytes. The process of converting eight bit data bytes to eight bit "disk" bytes, then, is twofold. An overview is diagrammed in Figure 3.17.
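
The "4 and 4" transformation and the two requirements for a valid "disk" byte described above can be sketched in Python. This is only an illustration, not the RWTS code: the names encode_4and4, decode_4and4 and is_valid_disk_byte are invented here, and the bit layout assumes the odd-bits byte carries data bits 7, 5, 3, 1 and the even-bits byte carries bits 6, 4, 2, 0, with every unused position set to one.

```python
def encode_4and4(data_byte):
    """Split one data byte into (odd_bits_byte, even_bits_byte).

    Assumed layout: unused positions are forced to 1 by ORing with $AA.
    """
    odd = ((data_byte >> 1) | 0xAA) & 0xFF   # 1 D7 1 D5 1 D3 1 D1
    even = (data_byte | 0xAA) & 0xFF         # 1 D6 1 D4 1 D2 1 D0
    return odd, even


def decode_4and4(odd, even):
    """Shift the odd-bits byte left, bringing a 1 into bit 0, then AND."""
    shifted = ((odd << 1) | 1) & 0xFF        # D7 1 D5 1 D3 1 D1 1
    return shifted & even


def is_valid_disk_byte(value):
    """High bit set and no two consecutive zero bits (pre-DOS 3.3 rule)."""
    if not (value & 0x80):
        return False
    for bit in range(7):
        if ((value >> bit) & 0b11) == 0:     # bits 'bit' and 'bit + 1' both zero
            return False
    return True


# Every byte that satisfies the two requirements, then the same list
# without the two reserved bytes $D5 and $AA.
valid_bytes = [v for v in range(256) if is_valid_disk_byte(v)]
usable_bytes = [v for v in valid_bytes if v not in (0xAA, 0xD5)]
```

With these definitions, len(valid_bytes) is 34 and len(usable_bytes) is 32, matching the counts given in the text, and every byte produced by encode_4and4 passes is_valid_disk_byte.
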
.sp1
*** INSERT FIGURE 3.17 HERE ***

First, the 256 bytes that will make up a sector must be translated to five bit bytes. This is done by the "prenibble" routine at $B800. It is a fairly involved process requiring a good deal of bit rearrangement. Figure 3.18 shows the before and after of prenibbilizing. On the left is a buffer of eight bit data bytes, as passed to the RWTS subroutine package by DOS. Each byte in this buffer is represented by a letter (A, B, C, etc.) and each bit by a number (7 through 0). On the right side are the results of the transformation. The primary buffer contains five distinct areas of five bit bytes (the top three bits of the eight bit bytes zero-filled) and the secondary buffer contains three areas, graphically illustrating the name "5 and 3".
.bp
*** INSERT FIGURE 3.18 HERE ***

A total of 410 bytes are needed to store the original 256. This can be calculated by finding the total bits of data (256 x 8 = 2048) and dividing that by the number of bits per byte (2048 / 5 = 409.6), then rounding up to 410. Two bits of the 410 five bit bytes are not used.

Once this process is completed, the data is further transformed to make it valid "disk" bytes, meeting the disk's requirements. This is much easier, involving a one to one look-up in the table given in Figure 3.19.
.sp1
*** INSERT FIGURE 3.19 HERE ***

The Data Field has a checksum much like the one in the Address Field, used to verify the integrity of the data. It also involves exclusive-ORing the information but, because of the time constraints when reading bytes, it is implemented differently. The data is exclusive-ORed in pairs before being transformed by the look-up table in Figure 3.19. This can best be illustrated by Figure 3.20 on the following page.
.sp1
*** INSERT FIGURE 3.20 HERE ***

The reason for this transformation can be better understood by examining how the information is retrieved from the disk.
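
The pairwise exclusive-OR can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the RWTS routine: the names xor_encode and xor_decode are invented for the illustration, the values are taken in a plain list rather than in the order RWTS walks its primary and secondary buffers, and the look-up table step of Figure 3.19 is left out. The decode loop keeps a running exclusive-OR in an accumulator, which is the read-side behaviour taken up in the text that follows.

```python
def xor_encode(values):
    """Exclusive-OR each value with the one before it.

    The last value is appended unchanged and serves as the checksum.
    """
    stream = []
    previous = 0
    for value in values:
        stream.append(value ^ previous)
        previous = value
    stream.append(previous)          # trailing checksum value
    return stream


def xor_decode(stream):
    """Recover the original values with a running exclusive-OR.

    After each XOR the accumulator holds the actual data value.
    The final XOR with the checksum must leave the accumulator at zero.
    """
    accumulator = 0
    values = []
    for item in stream[:-1]:
        accumulator ^= item
        values.append(accumulator)
    accumulator ^= stream[-1]
    if accumulator != 0:
        raise ValueError("checksum mismatch")
    return values


# Four arbitrary five bit values, chosen only for the example.
sample = [0x1F, 0x00, 0x0A, 0x15]
encoded = xor_encode(sample)
decoded = xor_decode(encoded)
```

Because each step needs only one exclusive-OR, decoding and checksumming happen in the same operation, which is what makes the scheme attractive when every cycle counts.
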
The read routine must read a byte, transform it, and store it -- all in under 32 cycles (the time taken to write a byte) or the information will be lost. By using the checksum computation to decode data, the transformation shown in Figure 3.20 makes this time constraint much easier to meet. As the data is being read from a sector, the accumulator contains the cumulative result of all previous bytes, exclusive-ORed together. The value of the accumulator after any exclusive-OR operation is the actual data byte for that point in the series. This process is diagrammed in Figure 3.21.
.bp
*** INSERT FIGURE 3.21 HERE ***

The third encoding technique, currently used by DOS 3.3, is similar to the "5 and 3". It was made possible by a change in the hardware which eased the requirements for valid data somewhat. The high bit must still be set, but now the byte may contain one (and only one) pair of consecutive zero bits. This allows a greater number of valid bytes and permits the use of a "6 and 2" encoding technique. A six bit byte would have the form 00XXXXXX and has values from $00 to $3F for a total of 64 different values. With the new, relaxed requirements for valid "disk" bytes there are 69 different bytes ranging in value from $96 up to $FF. After removing the two reserved bytes, $AA and $D5, there are still 67 "disk" bytes with only 64 needed. An additional requirement was introduced to force the mapping to be one to one, namely, that there must be at least two adjacent bits set, excluding bit 7. This produces exactly 64 valid "disk" values. The initial transformation is done by the prenibble routine (still located at $B800) and its results are shown in Figure 3.22.
.sp1
*** INSERT FIGURE 3.22 (20) HERE ***

A total of 342 bytes are needed, shown by finding the total number of bits (256 x 8 = 2048) and dividing by the number of bits per byte (2048 / 6 = 341.33), then rounding up to 342.
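
As a rough check on the figures above, the final "6 and 2" rules can be applied to all 256 byte values in Python. This is a sketch only: the helper names (zero_pairs, has_adjacent_ones, is_valid_6and2, encoded_length) are invented here, and "one pair of consecutive zero bits" is interpreted as at most one position where two neighbouring bits are both zero, so that a run of three zeros is rejected.

```python
def zero_pairs(value):
    """Count positions where bit i and bit i + 1 are both zero."""
    return sum(1 for bit in range(7) if ((value >> bit) & 0b11) == 0)


def has_adjacent_ones(value):
    """True if two neighbouring bits among bits 6..0 are both set."""
    low7 = value & 0x7F
    return (low7 & (low7 >> 1)) != 0


def is_valid_6and2(value):
    """High bit set, at most one pair of zero bits, two adjacent one bits."""
    return (
        (value & 0x80) != 0
        and zero_pairs(value) <= 1
        and has_adjacent_ones(value)
    )


def encoded_length(bits_per_byte, sector_bytes=256):
    """Number of n-bit bytes needed to hold one sector, rounded up."""
    total_bits = sector_bytes * 8
    return (total_bits + bits_per_byte - 1) // bits_per_byte


# $AA and $D5 have no two adjacent one bits below bit 7, so the
# adjacency rule drops them without any special handling.
disk_bytes_6and2 = [v for v in range(256) if is_valid_6and2(v)]
```

Under these assumptions the list holds exactly 64 values, the lowest being $96, and encoded_length returns 410 for five bit bytes and 342 for six bit bytes.
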
The transformation from the six bit bytes to valid "disk" bytes is again performed by a one to one mapping, shown in Figure 3.23. Once again, the stream of bytes written to the diskette is a product of exclusive-ORs, exactly as with the "5 and 3" technique discussed earlier.
.sp1
*** INSERT FIGURE 3.23 (21) HERE ***
.sp1
SECTOR INTERLEAVING

Sector interleaving is the staggering of sectors on a track to maximize access speed. There is usually a delay between the time DOS reads or writes a sector and the time it is ready to read or write another. This delay depends upon the application program using the disk and can vary greatly. If sectors were stored on the track sequentially, in ascending numerical order, then unless the application was very quick indeed, it would usually be necessary to wait an entire revolution of the diskette before the next sector could be accessed. Rearranging the sectors into a different order, or "interleaving" them, can provide different access speeds.
.bp
On DOS 3.2.1 and earlier versions, the 13 sectors are physically interleaved on the disk. Since DOS resides on the diskette in ascending sequential order and files generally are stored in descending sequential order, no single interleaving scheme works well for both booting and sequentially accessing a file.

A different approach has been used in DOS 3.3 in an attempt to maximize performance. The interleaving is now done in software. The 16 sectors are stored in numerically ascending order on the diskette (0, 1, 2, ... , 15) and are not physically interleaved at all. A look-up table is used to translate the physical sector number into a "pseudo" or soft sector number used by DOS. For example, if the sector number found on the disk were a 2, this is used as an offset into a table where the number $0B is found. Thus, DOS treats the physical sector 2 as sector 11 ($0B) for all intents and purposes.
This presents no problem if RWTS is used for disk access, but would become a consideration if access were made without RWTS.

To eliminate the access differences between booting and reading files, another change has been made. During the boot process, DOS is loaded backwards into memory, in descending sequential order, just as files are accessed. This means one interleaving scheme can minimize disk access time.

It is interesting to point out that Pascal, Fortran, and CP/M diskettes also use software interleaving. However, each uses a different sector order. A comparison of these differences is presented in Figure 3.24.
.sp1
*** INSERT FIGURE 3.24 (22) HERE ***
.br
.nxch4