diff --git a/D1S1/CH3.1#064000.txt b/D1S1/CH3.1#064000.txt deleted file mode 100644 index 5fa8728..0000000 --- a/D1S1/CH3.1#064000.txt +++ /dev/null @@ -1,577 +0,0 @@ -.bp -.np -.ce -CHAPTER 3 - DISK II HARDWARE AND TRACK FORMATTING -.sp2 - -Apple Computer's excellent manual on -the Disk Operating System (DOS) -provides only very basic information -about how diskettes are formatted. -This chapter will explain in detail -how information is structured on a -diskette. The first section will -contain a brief introduction to the -hardware, and may be skipped by those -already familiar with the DOS manual. - -For system housekeeping, DOS divides -diskettes into tracks and sectors. -This is done during the -initialization process. A track is a -physically defined circular path -which is concentric with the hole in -the center of the diskette. Each -track is identified by its distance -from the center of the disk. Similar -to a phonograph stylus, the -read/write head of the disk drive may -be positioned over any given track. -The tracks are similar to the grooves -in a record, but they are not -connected in a spiral. Much like -playing a record, the diskette is -spun at a constant speed while the -data is read from or written to its -surface with the read/write head. -Apple formats its diskettes into 35 -tracks. They are numbered from 0 to -34, track 0 being the outermost track -and track 34 the innermost. Figure -3.1 illustrates the concept of -tracks, although they are invisible -to the eye on a real diskette. -.sp1 -*** INSERT FIGURE 3.1 HERE *** - -It should be pointed out, for the -sake of accuracy, that the disk arm -can position itself over 70 "phases". -To move the arm past one track to the -next, two phases of the stepper -motor, which moves the arm, must be -cycled. This implies that data might -be stored on 70 tracks, rather than -35. 
Unfortunately, the resolution of -the read/write head and the accuracy -of the stepper motor are such that -attempts to use these phantom -"half" tracks create so much -cross-talk that data is lost or -overwritten. Although the standard -DOS uses only even phases, some -protected disks use odd phases or -combinations of the two, provided -that no two tracks are closer than -two phases from one another. See -APPENDIX B for more information on -protection schemes. -.bp - -A sector is a subdivision of a track. -It is the smallest unit of -"updatable" data on the diskette. -DOS generally reads or writes data a -sector at a time. This is to avoid -using a large chunk of memory as a -buffer to read or write an entire -track. -Apple has used two different track -formats -to date. One divides the track into -13 sectors, the other, 16 sectors. The -sectoring does not use the index -hole, provided on most diskettes, to -locate the first sector of the track. -The implication is that -the software must be able -to locate any given track and sector -with no help from the hardware. -This scheme, known as "soft sectoring", -takes a little more space -for storage but allows flexibility, -as evidenced by the recent change -from 13 sectors to 16 sectors per -track. The following table -categorizes the amount of data stored -on a diskette under both 13 and 16 -sector formats. 
-.sp1 -.ne10 -.nf - DISK ORGANIZATION -.sp1 -TRACKS - All DOS versions................35 -.sp1 -SECTORS PER TRACK - DOS 3.2.1 and earlier...........13 - DOS 3.3.........................16 -.sp1 -SECTORS PER DISKETTE - DOS 3.2.1 and earlier..........455 - DOS 3.3........................560 -.sp1 -BYTES PER SECTOR - All DOS versions...............256 -.sp1 -BYTES PER DISKETTE - DOS 3.2.1 and earlier.......116480 - DOS 3.3.....................143360 -.sp1 -USABLE* SECTORS FOR DATA STORAGE - DOS 3.2.1 and earlier..........403 - DOS 3.3........................496 -.sp1 -USABLE* BYTES PER DISKETTE - DOS 3.2.1 and earlier.......103168 - DOS 3.3.....................126976 -.sp2 -* Excludes DOS, VTOC, and CATALOG -.bp -TRACK FORMATTING - -Up to this point we have broken down -the structure of data to the track -and sector level. To better -understand how -data is stored and retrieved, we will -start at the bottom and work up. - -As this manual is primarily concerned -with software, no attempt will be -made to deal with the specifics of -the hardware. For example, while in -fact data is stored as a continuous -stream of analog signals, we will -deal with discrete digital data, i.e. -a 0 or a 1. We recognize that the -hardware converts analog data to -digital data but how this is -accomplished is beyond the scope of -this manual. - -Data is recorded on the diskette -using frequency modulation as the -recording mode. Each data bit -recorded on the diskette has an -associated clock bit recorded with -it. Data written on and read back -from the diskette takes the form -shown in Figure 3.2. The data -pattern shown represents a binary -value of 101. -.sp1 -*** INSERT FIGURE 3.2 HERE *** - -As can be seen in Figure 3.3, the -clock bits and data bits (if present) -are interleaved. The presence of a -data bit between two clock bits -represents a binary 1, the absence of -a data bit between two clock bits -represents a binary 0. 
We will -define a "bit cell" as the period -between the leading edge of one clock -bit and the leading edge of the next -clock bit. -.sp1 -*** INSERT FIGURE 3.3 HERE *** - -A byte would consist of eight (8) -consecutive bit cells. The most -significant bit cell is usually -referred to as bit cell 7 and the -least significant bit cell would be -bit cell 0. When reference is made -to a specific data bit (i.e. data bit -5), it is with respect to the -corresponding bit cell (bit cell 5). -Data is written and read serially, -one bit at a time. Thus, during a -write operation, bit cell 7 of each -byte would be written first, with bit -cell 0 being written last. -Correspondingly, when data is being -read back from the diskette, bit cell -7 is read first and bit cell 0 is -read last. The diagram below -illustrates the relationship of the -bits within a byte. -.bp -*** INSERT FIGURE 3.4 HERE *** - -To graphically show how bits are -stored and retrieved, we must take -certain liberties. The diagrams are -a representation of what functionally -occurs within the disk drive. For -the purposes of our presentation, the -hardware interface -to the diskette will be -represented as an eight bit "data latch". -While the hardware involves -considerably more complication, from a software -standpoint it is reasonable to use -the data latch, as it accurately -embodies the function of data flow to -and from the diskette. -.sp1 -*** INSERT FIGURE 3.5 HERE *** - -Figure 3.5 shows the three bits, 101, -being read from the diskette data -stream into the data latch. Of -course another five bits would be -read to fill the latch. As can be -seen, the data is separated from the -clock bits. This task is done by the -hardware and is shown more for -accuracy than for its importance to -our discussion. - -Writing data can be depicted in much -the same way (see Figure 3.6). -The clock bits which -were separated from the data must now -be interleaved with the data as it is -written. 
It should be noted that, -while in write mode, zeros are being -brought into the data latch to -replace the data being written. It -is the task of the software to make -sure that the latch is loaded and -instructed to write in 32 -cycle intervals. If not, zero bits -will continue to be written every four -cycles, which is, in fact, exactly -how self-sync bytes are created. -Self-sync bytes will be covered in -detail shortly. -.sp1 -*** INSERT FIGURE 3.6 HERE *** - -A "field" is made up of a group of -consecutive -bytes. The number of bytes varies, -depending upon the nature of the -field. The two types of fields -present on a diskette are the Address -Field and the Data Field. They are -similar in that they both contain a -prologue, a data area, a checksum, and -an epilogue. Each field on a track is -separated from adjacent fields by a -number of bytes. These areas of -separation are called "gaps" and are -provided for two reasons. One, they -allow the updating of one field -without affecting adjacent fields (on -the Apple, only data fields are -updated). Secondly, they allow the -computer time -to decode the address field before -the corresponding data field can pass -beneath the read/write head. - -All gaps are primarily alike in -content, consisting of self-sync -hexadecimal FF's, and vary only in -the number of bytes they contain. -Figure 3.7 is a diagram of a portion -of a typical track, broken into its -major components. -.bp -*** INSERT FIGURE 3.7 HERE *** - -Self-sync or auto-sync bytes are -special bytes that make up the three -different types of gaps on a track. -They are so named because of their -ability to automatically bring the -hardware into synchronization with -data bytes on the disk. The -difficulty in doing this lies in the -fact that the hardware reads bits and -the data must be stored as eight bit -bytes. It has been mentioned that a -track is literally a continuous -stream of data bits. 
In fact, at the -bit level, there is no way to -determine where a byte starts or -ends, because each bit cell is -exactly the same, written in precise -intervals with its neighbors. When -the drive is instructed to read data, -it will start wherever it happens to -be on a particular track. That could -be anywhere among the 50,000 or so -bits on a track. Distinguishing -clock bits from data bits, -the hardware finds -the first bit cell with data in it -and proceeds to read the following -seven data -bits into the eight bit latch. In -effect, it assumes that it had -started at the beginning of a data -byte. Of course, in reality, the -odds of its having started at the -beginning of a byte are only one in -eight. -Pictured in Figure 3.8 is a small -portion of a track. The clock bits -have been stripped out and 0's and -1's have been used for clarity. -.sp1 -*** INSERT FIGURE 3.8 HERE *** - -There is no way from looking at the -data to tell what bytes are -represented, because we don't know -where to start. This is exactly the -problem that self-sync bytes -overcome. - -A self-sync byte is defined to be a -hexadecimal FF with a special -difference. It is, in fact, a 10 bit -byte rather than an eight bit byte. Its -two extra bits are zeros. Figure 3.9 -shows the difference between a normal -data hex FF that might be found -elsewhere on the disk and a self-sync -hex FF byte. -.bp -*** INSERT FIGURE 3.9 HERE *** - -A self-sync is generated by using a -40 cycle (micro-second) loop while -writing an FF. A bit is written -every four cycles, so two of the zero bits -brought into the data latch while the -FF was being written are also -written to the disk, making the 10 -bit byte. (DOS 3.2.1 and earlier versions use -a nine bit byte due to the hardware's -inability to always detect two -consecutive zero bits.) It can be -shown, using Figure 3.10, that five -self-sync bytes are sufficient to -guarantee that the hardware is -reading valid data. 
The reason for -this is that the hardware requires -the first bit of a byte to be a 1. -Pictured at the top of the figure is -a stream of five auto-sync bytes. Each -row below that demonstrates what the -hardware will read should it start -reading at any given bit in the -first byte. In each case, by the -time the five sync bytes have passed -beneath the read/write head, the -hardware will be "synched" to read the -data bytes that follow. As long as -the disk is left in read mode, it -will continue to correctly interpret -the data unless there is an error on -the track. -.sp1 -*** INSERT FIGURE 3.10 HERE *** - -We can now discuss the particular -portions of a track in detail. The -three gaps will be covered first. -Unlike some other disk formats, the -size of the three gap types will vary -from drive to drive and even from -track to track. During the -initialization process, DOS will -start with large gaps and keep making -them smaller until an entire track -can be written without overlapping -itself. A minimum of five self-sync -bytes must be maintained for -each gap type (as discussed earlier). -The result is fairly uniform gap -sizes within each particular track. - -Gap 1 is the first data written to a -track during initialization. Its -purpose is twofold. The gap -originally consists of 128 bytes of -self-sync, a large enough area to -insure that all portions of a track -will contain data. Since the speed -of a particular drive may vary, the -total length of the track in bytes is -uncertain, and the percentage -occupied by data is unknown. The -initialization process is set up, -however, so that even on drives of -differing speeds, the last data field -written will overlap Gap 1, providing -continuity over the entire physical -track. 
Care is taken to make sure -the remaining portion of Gap 1 is at least -as long as a typical Gap 3 -(in practice its length -is usually more than 40), -enabling it to serve -as a Gap 3 type for Address Field -number 0 (See Figure 3.7 for -clarity). -.bp - -Gap 2 appears after each Address -Field and before each Data Field. -Its length varies from five to ten bytes -on a normal drive. The primary -purpose of Gap 2 is to provide time -for the information in an Address -Field to be decoded by the computer -before a read or write takes place. -If the gap were too short, the -beginning of the Data Field might -spin past while DOS was still -determining if this was the -sector to be read. The 240-odd -cycles that six self-sync bytes provide -seem ample time to decode an address -field. When a Data Field is written -there is no guarantee that the write -will occur in exactly the same spot -each time. This is due to the fact -that the drive which is rewriting -the Data Field may not be the one -which originally INITed or wrote it. -Since the speed of the drives can -vary, it is possible that the write -could start in mid-byte. (See Figure -3.11) This is not -a problem as long as the difference -in positioning is not great. To -insure the integrity of Gap 2, when -writing a data field, five self-sync -bytes are written prior to writing -the Data Field itself. This serves -two purposes. Since relatively -little time is spent decoding an -address field, the five bytes help place -the Data Field near its original -position. Secondly, and more -importantly, the five self-sync bytes -are the minimum number required to -guarantee read-synchronization. It -is probable that, in writing a Data -Field, at least one sync byte will be -destroyed. This is because, just as -in reading bits on the track, the -write may not begin on a byte -boundary, thus altering an existing -byte. Figure 3.12 illustrates this. 
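The claim that five self-sync bytes are enough to guarantee synchronization can be checked with a small simulation. The Python sketch below is not DOS code. It assumes a simplified model of the read hardware (skip zero bits until a 1 is seen, then shift in eight bits), and the names sync_stream and read_bytes are invented for this illustration.

```python
# 10-bit self-sync byte: the eight 1 bits of an FF followed by two zero bits.
SYNC = [1] * 8 + [0] * 2

def to_bits(byte):
    """Return the eight bits of a byte, most significant bit first."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]

def sync_stream(sync_count, data, sync=SYNC):
    """Build a bit stream: sync_count sync bytes followed by the data bytes."""
    bits = list(sync) * sync_count
    for byte in data:
        bits += to_bits(byte)
    return bits

def read_bytes(bits, start):
    """Simplified reader: wait for a 1 bit, then take that bit and the
    next seven as one byte. Stops when fewer than eight bits remain."""
    out = []
    pos = start
    while True:
        while pos < len(bits) and bits[pos] == 0:
            pos += 1                      # zero bits between bytes are skipped
        if pos + 8 > len(bits):
            return out
        value = 0
        for bit in bits[pos:pos + 8]:
            value = (value << 1) | bit
        out.append(value)
        pos += 8

PROLOGUE = [0xD5, 0xAA, 0x96]             # address field prologue used as test data
stream = sync_stream(5, PROLOGUE)

# Start reading at every bit position inside the first sync byte.
for offset in range(10):
    tail = read_bytes(stream, offset)[-3:]
    print(offset, [format(b, "02X") for b in tail])
```

Under this model every starting offset ends with the three prologue bytes read correctly. Replacing the 10-bit sync bytes with ordinary eight-bit FF bytes (the sync argument of sync_stream) removes the extra zero bits, and a reader that starts in mid-byte then stays out of step.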
-.sp1 -*** INSERT FIGURE 3.11 HERE *** -.sp1 -*** INSERT FIGURE 3.12 HERE *** - -Gap 3 appears after each -Data Field and before each Address -Field. It is longer than Gap 2 and -generally ranges from 14 to 24 bytes -in length. It is quite similar in -purpose to Gap 2. Gap 3 allows the -additional time needed to manipulate -the data that has been read before -the next sector is to be read. The -length of Gap 3 is not as critical as -that of Gap 2. If the -following Address -Field is missed, DOS can always wait -for the next time it spins around -under the read/write head, at most -one revolution of the disk. Since -Address Fields are never rewritten, -there is no problem with this gap -providing synchronization, since only -the first part of the gap can be -overwritten or damaged. (See Figure -3.11 for clarity) -.bp - -An examination of the contents of the -two types of fields is in order. -The Address Field contains -the "address" or identifying -information about the Data Field -which follows it. The volume, track, -and sector number of any given sector -can be thought of as its "address", -much like a country, city, and street -number might identify a house. As -shown previously in Figure 3.7, there -are a number of components which make -up the Address Field. A more -detailed illustration is given in -Figure 3.13. -.sp1 -*** INSERT FIGURE 3.13 HERE *** - -The prologue consists of three bytes -which form a unique sequence, found -in no other component of the track. -This fact enables DOS to locate an -Address Field with almost no -possibility of error. The three -bytes are $D5, $AA, and $96. The $D5 -and $AA are reserved (never written -as data) thus insuring the uniqueness -of the prologue. The $96, following -this unique string, indicates that -the data following constitutes an -Address Field (as opposed to a Data -Field). The address information -follows next, consisting of the -volume, track, and sector number and -a checksum. 
This information is -absolutely essential for DOS to know -where it is positioned on a -particular diskette. The checksum is -computed by exclusive-ORing the first -three pieces of information, and is -used to verify its integrity. -Lastly follows the epilogue, which -contains the three bytes $DE, $AA and -$EB. Oddly, the $EB is always written -during initialization but is never -verified when an Address Field is -read. The epilogue bytes are -sometimes referred to as "bit-slip -marks", which provide added assurance -that the drive is still in sync with -the bytes on the disk. These bytes -are probably unnecessary, but do -provide a means of double checking. - -The other field type is the Data -Field. Much like the Address Field, -it consists of a prologue, data, -checksum, and an epilogue. (Refer to -Figure 3.14) The prologue is -different only in the third byte. -The bytes are $D5, $AA, and $AD, -which again form a unique sequence, -enabling DOS to locate the beginning -of the sector data. The data -consists of 342 bytes of encoded -data. The encoding scheme used will -be discussed in the next section. -The data is followed by a checksum -byte, used to verify the integrity of -the data just read. The epilogue -portion of the Data Field is -absolutely identical to the epilogue -in the Address Field and it serves -the same function. -.bp -.nx ch3.2 diff --git a/D1S1/CH3.2#064000.txt b/D1S1/CH3.2#064000.txt deleted file mode 100644 index 7adff08..0000000 --- a/D1S1/CH3.2#064000.txt +++ /dev/null @@ -1,333 +0,0 @@ -.sp2 -DATA FIELD ENCODING - -Due to Apple's hardware, it is not -possible to read all 256 possible -byte values from a diskette. -This is not a great problem, but it -does require that the data written to -the disk be encoded. Three different -techniques have been used. The first -one, which is currently used in -Address Fields, involves -writing a data byte as two disk -bytes, one containing the odd bits, -and the other containing the even -bits. 
It would thus require 512 -"disk" bytes for each 256 byte sector -of data. Had this technique been used -for sector data, no more than 10 -sectors would have fit on a track. -This amounts to about 88K of data per -diskette, or roughly 72K of space -available to the user; typical for 5 -1/4 single density drives. - -Fortunately, a second technique for -writing data to diskette was devised -that allows 13 -sectors per track. This new method -involved a "5 and 3" split of the -data bits, versus the "4 and 4" -mentioned earlier. Each byte written -to the disk contains five valid bits -rather than four. This requires 410 -"disk" bytes to store a 256 byte -sector. This latter density allows -the now well known 13 sectors per -track format used by DOS 3 through -DOS 3.2.1. The "5 and 3" scheme -represented a hefty 33% increase over -comparable drives of the day. - -Currently, of course, DOS 3.3 -features 16 sectors per track and -provides a 23% increase in disk -storage over the 13 sector format. -This was made possible by a hardware -modification (the P6 PROM on the disk -controller card) which allowed a "6 -and 2" split of the data. The change -was to the logic of the "state -machine" in the P6 PROM, now allowing -two consecutive zero bits in data -bytes. - -These three different encoding -techniques -will now be covered in some detail. -The hardware for DOS 3.2.1 (and -earlier versions of DOS) imposed a -number of restrictions upon how data -could be stored and retrieved. It -required that a disk byte have the -high bit set and, in addition, no two -consecutive bits could be zero. -The odd-even "4 and 4" technique meets -these requirements. Each data byte -is represented as two bytes, one -containing the even data bits and the -other the odd data bits. Figure 3.15 -illustrates this transformation. It -should be noted that the unused bits -are all set to one to guarantee -meeting the two requirements. 
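The odd-even split can be sketched in a few lines of Python. This is an illustration only, not the DOS routine: the function names are invented here, and the order in which the two disk bytes are returned (odd bits first) is an assumption. The address_field helper simply strings together the prologue, the four encoded values and the epilogue described earlier in this chapter.

```python
def encode_4and4(value):
    """Split one data byte into two disk bytes.
    Unused bit positions are forced to 1 by ORing with $AA."""
    odd = ((value >> 1) | 0xAA) & 0xFF    # 1 D7 1 D5 1 D3 1 D1
    even = (value | 0xAA) & 0xFF          # 1 D6 1 D4 1 D2 1 D0
    return odd, even

def decode_4and4(odd, even):
    """Shift the odd-bits byte left (bringing a 1 into bit 0) and AND it
    with the even-bits byte to recover the original data byte."""
    return ((odd << 1) | 1) & even & 0xFF

def is_valid_disk_byte(byte):
    """High bit set and no two consecutive zero bits."""
    return bool(byte & 0x80) and "00" not in format(byte, "08b")

def address_field(volume, track, sector):
    """Hypothetical helper: prologue, 4-and-4 encoded volume, track,
    sector and checksum, then epilogue."""
    checksum = volume ^ track ^ sector
    body = []
    for item in (volume, track, sector, checksum):
        body.extend(encode_4and4(item))
    return [0xD5, 0xAA, 0x96] + body + [0xDE, 0xAA, 0xEB]

if __name__ == "__main__":
    print([format(b, "02X") for b in address_field(254, 17, 5)])
```

Because every other bit of each disk byte is a forced 1, both requirements (high bit set, no consecutive zero bits) hold for all 256 data values, at the cost of doubling the space used.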
-.sp1 -*** INSERT FIGURE 3.15 HERE *** - -No matter what value the original -data byte has, this technique -insures that the high bit is set and -that there cannot be two consecutive -zero bits. The "4 and 4" technique is -used to store the information -(volume, track, sector, checksum) -contained in the Address Field. It -is quite easy to decode the data, since the -byte with the odd bits is simply -shifted left and logically ANDed with -the byte containing the even bits. -This is illustrated in Figure 3.16. -.sp1 -*** INSERT FIGURE 3.16 HERE *** - -It is important that the least -significant bit contain a 1 when the -odd-bits byte is left shifted. The -entire -operation is carried out in the -RDADR subroutine at $B944 in DOS -(48K). - -The major difficulty with the above -technique is that it takes up a lot of -room on the track. To overcome this -deficiency the "5 and 3" encoding -technique was developed. It is so named -because, instead of splitting the -bytes in half, as in the odd-even -technique, they are split five and three. A -byte would have the form 000XXXXX, -where X is a valid data bit. The -above byte could range in value from -$00 to $1F, a total of 32 different -values. It so happens that there are -34 valid "disk" bytes, ranging -from $AA up to $FF, which meet the -two requirements (high bit set, no -consecutive zero bits). Two bytes, -$D5 and $AA, were chosen as reserved -bytes, thus leaving an exact mapping -between five bit data bytes and eight bit -"disk" bytes. The process of -converting eight bit data bytes to eight bit -"disk" bytes, then, is twofold. -An overview is diagrammed in Figure -3.17. -.sp1 -*** INSERT FIGURE 3.17 HERE *** - -First, the 256 bytes that will make -up a sector must be translated to -five bit bytes. This is done by the -"prenibble" routine at $B800. It is -a fairly involved process, involving -a good deal of bit rearrangement. -Figure 3.18 shows the before and -after of prenibbilizing. 
On the left -is a buffer of eight bit data bytes, as -passed to the RWTS subroutine -package by DOS. Each byte in this -buffer is represented by a letter (A, -B, C, etc.) and each bit by a number -(7 through 0). On the right side are -the results of the transformation. -The primary buffer contains five -distinct areas of five bit bytes (the -top three bits of the eight bit bytes -zero-filled) and the secondary buffer -contains three areas, graphically -illustrating the name "5 and 3". -.bp -*** INSERT FIGURE 3.18 HERE *** - -A total of 410 bytes are needed to -store the original 256. This can be -calculated by finding the total bits -of data (256 x 8 = 2048) and dividing -that by the number of bits per byte -(2048 / 5 = 409.6). (two bits are -not used) Once this process is -completed, the data is further -transformed to make it valid "disk" -bytes, meeting the disk's -requirements. This is much easier, -involving a one to one look-up in the -table given in Figure 3.19. -.sp1 -*** INSERT FIGURE 3.19 HERE *** - -The Data Field has a checksum much -like the one in the Address Field, -used to verify the integrity of the -data. It also involves -exclusive-ORing the information, but, -due to time constraints during -reading bytes, it is implemented -differently. The data is -exclusive-ORed in pairs before being -transformed by the look-up table in -Figure 3.19. This can best be -illustrated by Figure 3.20 on the following page. -.sp1 -*** INSERT FIGURE 3.20 HERE *** - -The reason for this transformation -can be better understood by examining -how the information is retrieved from -the disk. The read routine must read -a byte, transform it, and store it -- -all in under 32 cycles (the time -taken to write a byte) or the -information will be lost. By using -the checksum computation to decode -data, the -transformation shown in Figure 3.20 -greatly facilitates the time -constraint. 
As the data is being -read from a sector the accumulator -contains the cumulative result of -all previous bytes, exclusive-ORed -together. The value of the -accumulator after any exclusive-OR -operation is the actual data byte -for that point in the series. -This process is diagrammed in Figure 3.21. -.bp -*** INSERT FIGURE 3.21 HERE *** - -The third encoding technique, currently -used by DOS 3.3, is similar to the "5 -and 3". It was made possible by a -change in the hardware which eased -the requirements for valid data -somewhat. The high bit must still be -set, but now the byte may contain one -(and only one) pair of consecutive -zero bits. This allows a greater -number of valid bytes and permits the -use of a "6 and 2" encoding technique. -A six bit byte would have the form -00XXXXXX and has values from $00 to -$3F for a total of 64 different -values. With the new, relaxed -requirements for valid "disk" bytes -there are 72 different bytes ranging -in value from $95 up to $FF. After -removing the two reserved bytes, $AA -and $D5, there are still 70 "disk" bytes -with only 64 needed. An additional -requirement was introduced to force -the mapping to be one to one, namely, -that there must be at least two -adjacent bits set, excluding bit 7. -This produces exactly 64 valid "disk" -values. The initial transformation -is done by the prenibble routine -(still located at $B800) and its -results are shown in Figure 3.22. -.sp1 -*** INSERT FIGURE 3.22 (20) HERE *** - -A total of 342 bytes are needed, -shown by finding the total number of -bits (256 x 8 = 2048) and dividing by -the number of bits per byte (2048 / 6 -= 341.33). The transformation from -the six bit bytes to valid "disk" bytes -is again performed by a one to one -mapping shown in Figure 3.23. -Once again, the stream of data bytes -written to the diskette is a product -of exclusive-ORs, exactly as with the -"5 and 3" technique discussed earlier. 
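The rules for valid "6 and 2" disk bytes and the exclusive-OR chaining can both be tried out in a short Python sketch. It is a minimal illustration under stated assumptions, not the RWTS code: the function names are invented, the ordering of the primary and secondary buffers is ignored, and the translate table is assumed to be the 64 valid bytes in ascending order, since Figure 3.23 is not reproduced here.

```python
import math

RESERVED = (0xD5, 0xAA)

def valid_6and2_bytes():
    """Bytes with the high bit set, at most one pair of consecutive zero
    bits, at least two adjacent 1 bits below bit 7, and not reserved."""
    valid = []
    for byte in range(0x80, 0x100):
        bits = format(byte, "08b")
        zero_pairs = sum(bits[i:i + 2] == "00" for i in range(7))
        adjacent_ones = "11" in bits[1:]          # bit 7 is excluded
        if zero_pairs <= 1 and adjacent_ones and byte not in RESERVED:
            valid.append(byte)
    return valid

def xor_chain_encode(values):
    """Each output is the previous value XOR the current one.
    The last value is appended unchanged and acts as the checksum."""
    out, previous = [], 0
    for value in values:
        out.append(previous ^ value)
        previous = value
    out.append(previous)
    return out

def xor_chain_decode(stream):
    """Running XOR in an accumulator yields each original value in turn.
    The final entry must cancel the accumulator to zero."""
    accumulator, values = 0, []
    for item in stream[:-1]:
        accumulator ^= item
        values.append(accumulator)
    if accumulator ^ stream[-1] != 0:
        raise ValueError("checksum mismatch")
    return values

def encode_data_field(six_bit_values):
    table = valid_6and2_bytes()                   # assumed ascending order
    return [table[v] for v in xor_chain_encode(six_bit_values)]

def decode_data_field(disk_bytes):
    reverse = {b: i for i, b in enumerate(valid_6and2_bytes())}
    return xor_chain_decode([reverse[b] for b in disk_bytes])

NIBBLE_COUNT = math.ceil(256 * 8 / 6)             # six bit bytes per sector

if __name__ == "__main__":
    print(len(valid_6and2_bytes()), NIBBLE_COUNT)
```

The decode loop mirrors the time-critical read routine: one table look-up and one exclusive-OR per byte both recover the data and accumulate the checksum, so no separate checksum pass is needed.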
-.sp1 -*** INSERT FIGURE 3.23 (21) HERE *** -.sp1 -SECTOR INTERLEAVING - -Sector interleaving is the staggering -of sectors on a track to maximize -access speed. There is usually a -delay between the time DOS reads or -writes a sector and the time it is -ready to read or write another. This -delay depends upon the application -program using the disk and can vary -greatly. If sectors were stored -on the track -sequentially, in ascending numerical -order, unless the -application was very quick indeed, -it would -usually be necessary to wait an -entire revolution of the diskette -before the next sector could be -accessed. Rearranging the sectors -into a different order or -"interleaving" them can provide -different access speeds. -.bp -On DOS 3.2.1 and earlier versions, -the 13 sectors are physically -interleaved on the disk. Since DOS -resides on the diskette in ascending -sequential order and files generally -are stored in descending sequential -order, no single interleaving scheme -works well for both booting and -sequentially accessing a file. - -A different approach has been used in -DOS 3.3 in an attempt to maximize -performance. The interleaving is now -done in software. The 16 sectors are -stored in numerically ascending order -on the diskette (0, 1, 2, ... , 15) -and are not physically interleaved at -all. A look-up table is used to -translate the physical sector number -into a "pseudo" or soft sector number -used by DOS. For example, if the -sector number found on the disk were a -2, this is used as an offset into a -table where the number $0B is found. -Thus, DOS treats the physical sector -2 as sector 11 ($0B) for all intents -and purposes. This presents no -problem if RWTS is used for disk -access, but would become a -consideration if access were made -without RWTS. - -To eliminate the access differences -between booting and reading files, -another change has been made. 
During -the boot process, DOS is loaded -backwards in descending sequential -order into memory, just as files are -accessed. This means one -interleaving scheme can minimize disk -access time. - -It is interesting to point out that -Pascal, Fortran, and CP/M diskettes -all use software interleaving also. -However, each uses a different -sector order. A comparison of these -differences is presented in Figure -3.24. -.sp1 -*** INSERT FIGURE 3.24 (22) HERE *** -.br -.nx ch4 diff --git a/ch03.txt b/ch03.txt index 2f61c41..6bc3311 100644 --- a/ch03.txt +++ b/ch03.txt @@ -567,4 +567,335 @@ absolutely identical to the epilogue in the Address Field and it serves the same function. -.nx ch3.2 +DATA FIELD ENCODING + +Due to Apple's hardware, it is not +possible to read all 256 possible +byte values from a diskette. +This is not a great problem, but it +does require that the data written to +the disk be encoded. Three different +techniques have been used. The first +one, which is currently used in +Address Fields, involves +writing a data byte as two disk +bytes, one containing the odd bits, +and the other containing the even +bits. It would thus require 512 +"disk" bytes for each 256 byte sector +of data. Had this technique been used +for sector data, no more than 10 +sectors would have fit on a track. +This amounts to about 88K of data per +diskette, or roughly 72K of space +available to the user; typical for 5 +1/4 single density drives. + +Fortunately, a second technique for +writing data to diskette was devised +that allows 13 +sectors per track. This new method +involved a "5 and 3" split of the +data bits, versus the "4 and 4" +mentioned earlier. Each byte written +to the disk contains five valid bits +rather than four. This requires 410 +"disk" bytes to store a 256 byte +sector. This latter density allows +the now well known 13 sectors per +track format used by DOS 3 through +DOS 3.2.1. 
The "5 and 3" scheme +represented a hefty 33% increase over +comparable drives of the day. + +Currently, of course, DOS 3.3 +features 16 sectors per track and +provides a 23% increase in disk +storage over the 13 sector format. +This was made possible by a hardware +modification (the P6 PROM on the disk +controller card) which allowed a "6 +and 2" split of the data. The change +was to the logic of the "state +machine" in the P6 PROM, now allowing +two consecutive zero bits in data +bytes. + +These three different encoding +techniques +will now be covered in some detail. +The hardware for DOS 3.2.1 (and +earlier versions of DOS) imposed a +number of restrictions upon how data +could be stored and retrieved. It +required that a disk byte have the +high bit set and, in addition, no two +consecutive bits could be zero. +The odd-even "4 and 4" technique meets +these requirements. Each data byte +is represented as two bytes, one +containing the even data bits and the +other the odd data bits. Figure 3.15 +illustrates this transformation. It +should be noted that the unused bits +are all set to one to guarantee +meeting the two requirements. + +*** INSERT FIGURE 3.15 HERE *** + +No matter what value the original +data byte has, this technique +insures that the high bit is set and +that there cannot be two consecutive +zero bits. The "4 and 4" technique is +used to store the information +(volume, track, sector, checksum) +contained in the Address Field. It +is quite easy to decode the data, since the +byte with the odd bits is simply +shifted left and logically ANDed with +the byte containing the even bits. +This is illustrated in Figure 3.16. + +*** INSERT FIGURE 3.16 HERE *** + +It is important that the least +significant bit contain a 1 when the +odd-bits byte is left shifted. The +entire +operation is carried out in the +RDADR subroutine at $B944 in DOS +(48K). + +The major difficulty with the above +technique is that it takes up a lot of +room on the track. 
To overcome this +deficiency the "5 and 3" encoding +technique was developed. It is so named +because, instead of splitting the +bytes in half, as in the odd-even +technique, they are split five and three. A +byte would have the form 000XXXXX, +where X is a valid data bit. The +above byte could range in value from +$00 to $1F, a total of 32 different +values. It so happens that there are +34 valid "disk" bytes, ranging +from $AA up to $FF, which meet the +two requirements (high bit set, no +consecutive zero bits). Two bytes, +$D5 and $AA, were chosen as reserved +bytes, thus leaving an exact mapping +between five bit data bytes and eight bit +"disk" bytes. The process of +converting eight bit data bytes to eight bit +"disk" bytes, then, is twofold. +An overview is diagrammed in Figure +3.17. + +*** INSERT FIGURE 3.17 HERE *** + +First, the 256 bytes that will make +up a sector must be translated to +five bit bytes. This is done by the +"prenibble" routine at $B800. It is +a fairly involved process, involving +a good deal of bit rearrangement. +Figure 3.18 shows the before and +after of prenibbilizing. On the left +is a buffer of eight bit data bytes, as +passed to the RWTS subroutine +package by DOS. Each byte in this +buffer is represented by a letter (A, +B, C, etc.) and each bit by a number +(7 through 0). On the right side are +the results of the transformation. +The primary buffer contains five +distinct areas of five bit bytes (the +top three bits of the eight bit bytes +zero-filled) and the secondary buffer +contains three areas, graphically +illustrating the name "5 and 3". + +*** INSERT FIGURE 3.18 HERE *** + +A total of 410 bytes are needed to +store the original 256. This can be +calculated by finding the total bits +of data (256 x 8 = 2048) and dividing +that by the number of bits per byte +(2048 / 5 = 409.6). 
(Two bits are +not used.) Once this process is +completed, the data is further +transformed into valid "disk" +bytes, meeting the disk's +requirements. This is much easier, +involving a one to one look-up in the +table given in Figure 3.19. + +*** INSERT FIGURE 3.19 HERE *** + +The Data Field has a checksum much +like the one in the Address Field, +used to verify the integrity of the +data. It also involves +exclusive-ORing the information, but, +due to time constraints while +reading bytes, it is implemented +differently. The data is +exclusive-ORed in pairs before being +transformed by the look-up table in +Figure 3.19. This can best be +illustrated by Figure 3.20 on the following page. + +*** INSERT FIGURE 3.20 HERE *** + +The reason for this transformation +can be better understood by examining +how the information is retrieved from +the disk. The read routine must read +a byte, transform it, and store it -- +all in under 32 cycles (the time +taken to write a byte) or the +information will be lost. By using +the checksum computation to decode +data, the +transformation shown in Figure 3.20 +makes this time constraint much +easier to meet. As the data is being +read from a sector the accumulator +contains the cumulative result of +all previous bytes, exclusive-ORed +together. The value of the +accumulator after any exclusive-OR +operation is the actual data byte +for that point in the series. +This process is diagrammed in Figure 3.21. + +*** INSERT FIGURE 3.21 HERE *** + +The third encoding technique, currently +used by DOS 3.3, is similar to the "5 +and 3". It was made possible by a +change in the hardware which eased +the requirements for valid data +somewhat. The high bit must still be +set, but now the byte may contain one +(and only one) pair of consecutive +zero bits. This allows a greater +number of valid bytes and permits the +use of a "6 and 2" encoding technique. 
+A six bit byte would have the form +00XXXXXX and has values from $00 to +$3F for a total of 64 different +values. With the new, relaxed +requirements for valid "disk" bytes +there are 69 different bytes ranging +in value from $96 up to $FF. After +removing the two reserved bytes, $AA +and $D5, there are still 67 "disk" bytes +with only 64 needed. An additional +requirement was introduced to force +the mapping to be one to one, namely, +that there must be at least two +adjacent bits set, excluding bit 7. +This produces exactly 64 valid "disk" +values. The initial transformation +is done by the prenibble routine +(still located at $B800) and its +results are shown in Figure 3.22. + +*** INSERT FIGURE 3.22 (20) HERE *** + +A total of 342 bytes are needed, +as shown by finding the total number of +bits (256 x 8 = 2048) and dividing by +the number of bits per byte (2048 / 6 += 341.33). The transformation from +the six bit bytes to valid "disk" bytes +is again performed by a one to one +mapping shown in Figure 3.23. +Once again, the stream of bytes +written to the diskette is a product +of exclusive-ORs, exactly as with the +"5 and 3" technique discussed earlier. + +*** INSERT FIGURE 3.23 (21) HERE *** + +SECTOR INTERLEAVING + +Sector interleaving is the staggering +of sectors on a track to maximize +access speed. There is usually a +delay between the time DOS reads or +writes a sector and the time it is +ready to read or write another. This +delay depends upon the application +program using the disk and can vary +greatly. If sectors were stored +on the track +sequentially, in ascending numerical +order, then unless the +application were very quick indeed, +it would +usually be necessary to wait an +entire revolution of the diskette +before the next sector could be +accessed. Rearranging the sectors +into a different order, or +"interleaving" them, can provide +different access speeds. 
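The effect of that delay can be made concrete with a small model. The Python sketch below is a simplification, not a description of DOS. It measures time in "sector-times" (the time one sector takes to pass under the head), assumes a fixed processing delay after each sector, and uses hypothetical helper names (make_layout, sector_times_to_read_track).

```python
def make_layout(sector_count, step):
    """Place logical sectors around the track, `step` positions apart.

    Returns a list where layout[position] is the logical sector stored there.
    If the target position is already taken, the next free one is used.
    """
    layout = [None] * sector_count
    position = 0
    for logical in range(sector_count):
        while layout[position] is not None:
            position = (position + 1) % sector_count
        layout[position] = logical
        position = (position + step) % sector_count
    return layout


def sector_times_to_read_track(layout, delay):
    """Time to read every sector in ascending logical order.

    `delay` is the (assumed constant) processing time after each sector,
    in whole sector-times. The head starts at position 0 at time 0.
    """
    count = len(layout)
    position_of = {logical: pos for pos, logical in enumerate(layout)}
    time = 0
    for logical in range(count):
        # Wait until the start of the wanted sector next passes the head.
        wait = (position_of[logical] - time) % count
        time += wait + 1      # rotate to the sector, then read it
        time += delay         # software is busy before the next request
    return time - delay       # no processing is counted after the last sector


if __name__ == "__main__":
    sequential = make_layout(16, 1)
    staggered = make_layout(16, 2)
    print(sector_times_to_read_track(sequential, delay=1))  # 256
    print(sector_times_to_read_track(staggered, delay=1))   # 32
```

In this model, with a delay of one sector-time, the sequential layout needs 256 sector-times (16 revolutions) to read 16 sectors in order, while the staggered layout needs 32 (two revolutions). With no delay at all, the sequential layout is the faster one, which is why the best order depends on the application.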
+ +On DOS 3.2.1 and earlier versions, +the 13 sectors are physically +interleaved on the disk. Since DOS +resides on the diskette in ascending +sequential order and files generally +are stored in descending sequential +order, no single interleaving scheme +works well for both booting and +sequentially accessing a file. + +A different approach has been used in +DOS 3.3 in an attempt to maximize +performance. The interleaving is now +done in software. The 16 sectors are +stored in numerically ascending order +on the diskette (0, 1, 2, ... , 15) +and are not physically interleaved at +all. A look-up table is used to +translate the physical sector number +into a "pseudo" or soft sector number +used by DOS. For example, if the +sector number found on the disk were a +2, this is used as an offset into a +table where the number $0B is found. +Thus, DOS treats the physical sector +2 as sector 11 ($0B) for all intents +and purposes. This presents no +problem if RWTS is used for disk +access, but would become a +consideration if access were made +without RWTS. + +To eliminate the access differences +between booting and reading files, +another change has been made. During +the boot process, DOS is loaded +backwards in descending sequential +order into memory, just as files are +accessed. This means one +interleaving scheme can minimize disk +access time. + +It is interesting to point out that +Pascal, Fortran, and CP/M diskettes +all use software interleaving also. +However, each uses a different +sector order. A comparison of these +differences is presented in Figure +3.24. + +*** INSERT FIGURE 3.24 (22) HERE *** + +.nx ch4