diff --git a/LICENSE b/LICENSE.txt similarity index 99% rename from LICENSE rename to LICENSE.txt index 8f71f43..57bc88a 100644 --- a/LICENSE +++ b/LICENSE.txt @@ -178,7 +178,7 @@ APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following - boilerplate notice, with the fields enclosed by brackets "{}" + boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a @@ -186,7 +186,7 @@ same "printed page" as the copyright notice for easier identification within third-party archives. - Copyright {yyyy} {name of copyright owner} + Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. diff --git a/README.md b/README.md index 9ff23a3..68a63c1 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,268 @@ -# fhpack -Testing... +fhpack - compression for Apple II hi-res images +=============================================== + +fhpack is a compression tool with a singular purpose: to compress +Apple II hi-res graphics images. + + +## Origins ## + +I've had an idea for a project involving hi-res graphics compression +for several years, but didn't do much about it. After learning +about [LZ4](http://lz4.org/), and seeing uncompressors written +for the [6502](http://pferrie.host22.com/misc/appleii.htm) and +[65816](http://www.brutaldeluxe.fr/products/crossdevtools/lz4/index.html), +I decided to see if I could apply LZ4 to hi-res images. + +A few hi-res compressors were written back in The Day, usually employing +run-length encoding, which is easy to write and fast to encode and decode. +In the spirit of LZ4, I decided to put together an asymmetric codec, +meaning compression is very very slow, but uncompression is very very fast. + +The result is a modified form of LZ4 that consistently beats LZ4-HC, +and generally comes close to (and occasionally beats) ShrinkIt's LZW/II. +The decoder is tiny and extremely fast, especially on the 65816 where +the bulk data copy instructions can be used. + + +## About the fhpack Tool ## + +The compressor has two modes, similar to LZ4's "fast" and "high". The +fast mode uses greedy parsing and is not particularly fast, while the +high-compression mode uses optimal parsing and takes 12 times as long. +Both employ simple brute-force algorithms, which we can get away with +because we're only compressing 8KB of data. The high-compression mode +does about 4% better on average -- not huge, but not negligible. + +Other compression programs, such as gzip, produce significantly smaller +output, but uncompression is much slower and requires more memory. + +The comments in [fhpack.cpp](fhpack.cpp) describe the data format. + +There is no implementation of the compression side for the 6502. +An implementation that uses greedy parsing is feasible, as the bulk of the +time is spent comparing 8-bit strings that are less than 256 bytes long, +and the 6502 series is pretty good at that. The optimal parser could +theoretically be done on a machine with 128KB of RAM, but would take a +very long time to run. + + +#### Screen Holes #### + +The hi-res screen has a curious interleaved structure that leaves "holes" +in memory -- parts of the frame buffer that don't affect what appears +on screen. 
The screen layout is divided into 128-byte sections, with
+120 bytes of visible data followed by an eight-byte "hole".  The holes
+tend to be filled with zeroes, though sometimes they may contain
+garbage or program state.
+
+fhpack can do one of three things with the screen holes:
+
+ 1. Preserve them.  This mode is enabled with the "-h" flag.  If you
+    want the uncompressed data to exactly match the original, you
+    must specify this flag.
+ 2. Fill them with zeroes.
+ 3. Fill them with a pattern that matches the data immediately before
+    or after the hole.
+
+In some cases #2 provides the best results, in others #3 wins.
+The difference is usually minimal, with outliers in the 70-90 byte range.
+On modern hardware fhpack runs very quickly, so when not in hole-preserve
+mode the tool compresses everything twice, and keeps whichever approach
+yielded the smallest output.
+
+
+## Apple II Code and Demos ##
+
+The 6502/65816 versions of the uncompressor (source and binaries), as
+well as two slideshow applications written in Applesoft and a number
+of sample files, are provided on the attached disk images.
+
+There are six disk images.  The first three hold the slide show demo:
+
+ * LZ4FHDemo.do (/LZ4FH, 140KB) - Source and object code for the
+   uncompression routines, plus a few test images and the Applesoft
+   "SLIDESHOW" program.
+ * UncompressedSlides.do (/SLIDESHOW, 140KB) - A set of 16 uncompressed
+   hi-res images.
+ * CompressedSlides.do (/SLIDESHOW, 140KB) - A set of 42 compressed
+   hi-res images.
+
+To view the demo, put the LZ4FHDemo image in slot 6 drive 1, and one
+of the slide disks in slot 6 drive 2.  Boot the disk and "-SLIDESHOW".
+Just hit return at the prompt to accept the default prefix.
+
+The slideshow program will scan the specified directory and identify files
+that appear to be compressed or uncompressed hi-res images.  It will
+then start a slide show, moving through them as quickly as possible.
+By swapping the compressed and uncompressed disks and restarting the
+program, you can compare the performance with and without compression.
+(For a 5.25" disk, it's generally faster to load a compressed image and
+uncompress it than it is to load an uncompressed image.)
+
+
+There is a second demo, called "HYPERSLIDE", which shows off the raw
+performance by eliminating the disk accesses.  A set of 15 images is loaded
+into memory -- overwriting BASIC.System -- and presented as a slide show
+as quickly as possible.  The demo and selected images are on this disk:
+
+ * HyperSlide.po (/HYPERSLIDE, 140KB)
+
+To run the demo, put the disk image in slot 6 drive 1, boot the disk,
+and "-HYPERSLIDE".  If you are running on a IIgs, you may want to try it
+with the 65816 uncompressor, which is much faster than the 6502 version.
+If you want to compute frame timings, you can set an iteration count,
+and the slide show will beep at the start and end.
+
+A larger set of images is available on a pair of 800KB disks.  One disk
+has the compressed form, the other the uncompressed form:
+
+ * UncompressedImages.po (/IMAGES, 800KB)
+ * CompressedImages.po (/IMAGES.LZ4H, 800KB)
+
+It's worth noting that the images on CompressedSlides.do take up about
+135KB of disk space, but total only about 104KB of data.  The rest of the
+space is used up by filesystem overhead.  Storing them in a ShrinkIt
+archive would be more efficient, but would also make them far more
+difficult to unpack.
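+
+As an aside, the interleaved structure mentioned under Screen Holes can be
+made concrete.  The sketch below is not part of fhpack -- it's a host-side
+illustration of the standard hi-res address math, with made-up function
+names -- but it shows where the holes come from:
+
+```cpp
+#include <cstdio>
+
+// Base address of hi-res row y (0-191) on page 1 ($2000-$3FFF).
+// Rows are interleaved rather than sequential, which is why a solid
+// block of color is not contiguous in memory.
+unsigned hiresRowAddr(unsigned y)
+{
+    return 0x2000 + (y & 7) * 0x400 + ((y >> 3) & 7) * 0x80 + (y >> 6) * 0x28;
+}
+
+// Each 128-byte group holds three 40-byte rows; the last 8 bytes of
+// every group form a "screen hole".
+bool isScreenHole(unsigned offset)      // offset within the 8KB buffer
+{
+    return (offset & 0x7f) >= 120;
+}
+
+int main()
+{
+    for (unsigned y = 0; y < 4; y++) {
+        printf("row %u starts at $%04X\n", y, hiresRowAddr(y));
+    }
+    printf("offset 120 is in a hole? %d\n", isScreenHole(120));
+    return 0;
+}
+```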
+ + +#### Decoder Performance #### + +Running under AppleWin with "authentic" disk access speed enabled, +a slide show of uncompressed images runs at about 1.7 seconds per +image (about 0.6fps). With compressed images the time varies, because +the size of the compressed image affects the amount of disk activity, +but it averages about 1.4 seconds per image (about 0.7 fps). + +Removing disk activity from the equation, HyperSlide improves that to +about 3.7 fps, with very little variation between files. The decode +time is dominated by byte copies, and we're always copying 8KB, so the +consistency is expected. + +HyperSlide still has some overhead from Applesoft BASIC. The "blitz +test", included on the LZ4FH demo disk, generates machine language +calls that uncompress the same image 100x, eliminating all overhead +(and simulating what HyperSlide could do if it weren't written in +BASIC). The speed improves to 5.6 fps. + +The most significant boost in speed comes from using the 65816 data +move instructions. With a 65816 implementation, still running at 1MHz, +HyperSlide hits 6 fps, and BLITZTEST tops 12 fps. + + +#### Code Notes #### + +The uncompressor takes as arguments the addresses of the compressed data +and the buffer to uncompress to. These are poked into memory locations +$02FC and $02FE. In the current implementation, the output buffer must +be $2000 or $4000 (the two hi-res pages). + +Packed images use the FOT ($08) file type, with an auxtype of $8066 +(0x66 is ASCII 'f'). + + +## Experimental Results ## + +I grabbed a set of about 70 images, most from games, a few from early +"contributed program" disks. The latter include what look like digitized +scans that don't compress especially well. + +All images were compressed with LZ4 r131 in high-compression mode (`lz4 +-9`), NuLib2 with LZW/II, and LZ4FH (`fhpack -9`). fhpack output has a +one-byte magic number, while LZ4-HC has 15 bytes of headers and footers, +so for a fair "raw data" comparison the numbers should be adjusted +appropriately. + +Most source images are 8192 bytes long, some are a few bytes shorter. 
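+
+(The percentages in the TOTAL row below are presumably the compressed
+totals relative to the combined size of the originals -- 81 images at
+roughly 8,192 bytes each, or about 664KB; e.g. 248,473 / 663,552 is
+roughly 37.4%.)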
+ +Image File | LZ4-HC | fhpack | LZW/II | +----------------------- | -----: | -----: | -----: | +contrib/BABY.JANE | 3664 | 3617 | 2851 | +contrib/CHARACTERS | 1874 | 1836 | 1614 | +contrib/CHURCHILL | 4759 | 4723 | 3749 | +contrib/DIP.CHIPS | 3840 | 3785 | 3048 | +contrib/DOLLAR | 3838 | 3790 | 3483 | +contrib/DOUBLE.BESSEL | 3066 | 3010 | 2566 | +contrib/GIRLS.BEST.FRND | 5967 | 5933 | 4659 | +contrib/HOPALONG | 3394 | 3331 | 2713 | +contrib/JOE.SENT.ME | 7569 | 7546 | 7702 | +contrib/LADY.BE.GOOD | 4119 | 4060 | 3292 | +contrib/MACROMETER | 4405 | 4354 | 3613 | +contrib/MUSIC | 1077 | 1030 | 929 | +contrib/RANDOM.LADY | 5754 | 5720 | 5426 | +contrib/ROCKY.RACCOON | 5864 | 5754 | 5598 | +contrib/SHAKESPEARE | 4933 | 4884 | 4286 | +contrib/SPIRALLELLO | 5722 | 5704 | 5345 | +contrib/SQUEEZE | 3209 | 3163 | 2533 | +contrib/TEQUILA | 5881 | 5828 | 5484 | +contrib/TEX | 4463 | 4413 | 3677 | +contrib/TIME.MACHINE | 3628 | 3578 | 3023 | +contrib/UNCLE.SAM | 3057 | 3012 | 2784 | +contrib/WORLD.MAP | 2792 | 2723 | 2429 | +games/ABM.TITLE | 1387 | 1353 | 1581 | +games/ARCHON.TITLE | 5607 | 5589 | 4991 | +games/AZTEC.TITLE | 6378 | 6362 | 6414 | +games/BAM.TITLE | 5380 | 5377 | 4902 | +games/BANDITS.TITLE | 1442 | 1388 | 1376 | +games/BARDS.TALE.1 | 3853 | 3815 | 3523 | +games/BILESTOAD | 2140 | 1559 | 2552 | +games/BORG.TITLE | 2327 | 2559 | 2009 | +games/CAPT.GOODNIGHT | 2839 | 2820 | 2726 | +games/CHOPLIFTER | 1088 | 1007 | 1070 | +games/CRISIS.MT.GAME | 4144 | 4085 | 4037 | +games/CRISIS.MT.TITLE | 2290 | 2246 | 2384 | +games/DAVID.MIDNIGHT | 3362 | 3322 | 3225 | +games/DEFENDER | 728 | 696 | 678 | +games/EAMON.TITLE | 3744 | 3665 | 3252 | +games/GALACTIC.EMPIRE | 2161 | 2087 | 1956 | +games/GERMANY.1985 | 2566 | 2460 | 2390 | +games/HARD.HAT.MAC | 1524 | 1457 | 1678 | +games/KADASH.DEMO | 1796 | 1736 | 2120 | +games/KADASH.TITLE | 5317 | 5294 | 5393 | +games/KARATE.TITLE | 4040 | 3845 | 3952 | +games/KARATEKA.FORT | 4050 | 3955 | 3452 | +games/KARATEKA.GAME | 948 | 904 | 1108 | +games/LODE.RUNNER | 1133 | 1102 | 1428 | +games/MARIO.BROS | 1472 | 1406 | 1372 | +games/MAZE.CRAZE | 2703 | 2659 | 2485 | +games/MICROWAVE.TITLE | 2812 | 2737 | 2434 | +games/NIGHT.FLIGHT | 1109 | 1024 | 1183 | +games/ODYSSEY.TITLE | 3994 | 3953 | 3752 | +games/OUTWORLD | 2222 | 2157 | 2296 | +games/PCS | 1897 | 1837 | 1861 | +games/PCS.TITLE | 1881 | 1841 | 1882 | +games/QUESTRON.DEMO | 2569 | 2518 | 2253 | +games/QUESTRON.TITLE | 1536 | 1499 | 1837 | +games/RASTER.BLASTER | 2687 | 2636 | 2553 | +games/RESCUE.RAIDERS | 5377 | 4961 | 4883 | +games/ROADWAR2K.TITLE | 2063 | 1983 | 2068 | +games/SPARE.CHANGE | 2058 | 2009 | 2268 | +games/STAR.MAZE | 1253 | 1208 | 1600 | +games/STARSHIP.CMDR | 1453 | 1427 | 1845 | +games/STELLAR.7 | 1629 | 1412 | 1845 | +games/SUNDOG.TITLE | 3250 | 3188 | 3270 | +games/SWASHBUCK.GAME | 4690 | 4608 | 4286 | +games/SWASHBUCK.TITLE | 5077 | 5035 | 5085 | +games/TRANQUILITY | 1409 | 1363 | 1273 | +games/ULT2.LORD.BRIT | 1529 | 1514 | 1592 | +games/ULTIMA2.TITLE | 2220 | 2176 | 2201 | +games/WASTELAND.TITLE | 3540 | 3078 | 3510 | +games/WAYOUT | 1691 | 1669 | 1864 | +games/WOLFEN.TITLE | 2638 | 2588 | 2610 | +games/ZAXXON | 1884 | 1862 | 1769 | +misc/CRAPS.TABLE | 2286 | 2266 | 2548 | +misc/GHOSTBUST.LOGO | 1829 | 1724 | 1594 | +misc/LINE.CHART | 1753 | 1655 | 1578 | +misc/MICKEY | 3369 | 3316 | 2945 | +misc/WHO.LOGO | 1138 | 1084 | 1218 | +test/allgreen | 63 | 137 | 215 | +test/allzero | 62 | 136 | 38 | +test/nomatch | 8211 | 7928 | 7414 | +TOTAL | 248473 | 242771 | 232201 | + | 37.4% | 
36.5% | 34.9% |
+
+Note: test/nomatch is not compressible by LZ4 encoding.  fhpack was able
+to compress it because it zeroed out the "screen holes".  When processed
+in hole-preservation mode, test/nomatch expands to 8292 bytes.
+
diff --git a/fhpack.cpp b/fhpack.cpp
new file mode 100644
index 0000000..41cdb30
--- /dev/null
+++ b/fhpack.cpp
@@ -0,0 +1,1028 @@
+/*
+ * fhpack, an Apple II hi-res picture compressor.
+ * By Andy McFadden
+ * Version 1.0, August 2015
+ *
+ * Copyright 2015 by faddenSoft.  All Rights Reserved.
+ * See the LICENSE.txt file for distribution terms (Apache 2.0).
+ *
+ * Under Linux, you can build it with just:
+ *  g++ -O2 fhpack.cpp -o fhpack
+ */
+// TODO: prompt before overwriting output file (add "-f" to force)
+
+/*
+Format summary:
+
+LZ4FH (FH is "fadden's hi-res") is similar to LZ4 (http://lz4.org) in
+that the output is byte-oriented and has two kinds of chunks: "string of
+literals" and "match".  The format has been modified to make it easier
+(and faster) to decode on a 6502.
+
+As with LZ4, the goal is to get reasonable compression ratios with an
+extremely fast decoder.  On a CPU with 8-bit index registers, there is
+a distinct advantage to keeping copy lengths under 256 bytes.  Since the
+goal is to compress hi-res graphics, runs of identical bytes tend to be
+fairly short anyway -- the interleaved nature means that solid blocks of
+color aren't necessarily contiguous in memory -- so the ability to encode
+runs of arbitrary length adds baggage with little benefit.
+
+Files should use file type $08 (FOT) with auxtype $8066 (vendor-specific,
+0x66 is 'f').
+
+The format is very similar to LZ4, with a few key differences.  It
+retains the idea of encoding the lengths of the next literal string
+and next match in a single byte (4 bits each), so it is most efficient
+when matches and literals alternate.
+
+ file:
+  1 byte : 0x66 - format magic number for version 1
+    Not strictly necessary, but gives a hint if the images end up on
+    a DOS 3.3 disk where there's no dedicated file type.
+  [ ...one or more chunks follow... ]
+
+ chunk:
+  1 byte : length of literal string (hi 4 bits) and match (lo 4 bits)
+    A literal-string len of zero indicates no literals (match follows
+    match).  A literal-string len of 15 indicates that the literal
+    string is at least 15 bytes long, and the next byte must be added
+    to it.  The match len is stored as (length - 4), allowing us to
+    represent a match of length 4 to 18 with 4 bits.  A match len of 15
+    indicates that an additional byte is needed.
+  1 byte : (optional) continuation of literal len, 0 - 240
+    Add 15 to get a literal length of 15 - 255.
+  N bytes: 0 - 255 literal values
+
+  1 byte : (optional) continuation of match len, 0 - 236 -or- 253/254
+    Add 15 to get 15-251.  Factoring in the minimum match length of 4
+    yields 19 - 255.  A value of 253 indicates no match (literals
+    follow literals).  This is generally very rare, and is actually
+    impossible if we overwrite the screen holes as that will guarantee
+    a match every 120 literals.  A value of 254 indicates end-of-data.
+  2 bytes: (if match) offset to match
+    The offset is from the start of the output buffer, *not* back
+    from the current position.  That way, if we're writing the output
+    to $2000, instead of doing a 16-bit subtraction we can just
+    ORA #$20 into the high byte.
+
+We could save a byte by limiting the match distance to 8 bits (and probably
+making it relative to the current position), but the interleaved layout of
+the hi-res screen tends to spread things apart.
It won't really improve
+our speed, which is what we're mostly concerned with.
+
+The use of an explicit end indicator means we don't have to constantly
+check to see if we've consumed enough input or produced enough output.
+Unlike LZ4, we need to support adjacent runs of literals, so we already
+need a special-case check on the match length.  It also means we can
+choose to trim the file to $1ff8, losing the final "hole", or retain the
+original file length.
+
+Note that, in LZ4, the match offset comes before any optional match
+length extensions, while in LZ4FH it comes after.  This allows the match
+offset to be omitted when there's no match.  (This was not useful in LZ4
+because literals-follow-literals doesn't occur.)
+
+Expansion of uncompressible data is possible, but minimal.  The worst
+case is a file with no matches.  We add three bytes of overhead for
+every 255 literals (4/4 byte, 1 for literal len extension, 1 for match
+len extension that holds the "no match" symbol).  Globally we add +1 for
+the magic number.  The "end-of-data" symbol replaces the "no match"
+symbol, so overall it's int(ceil(8192/255)) * 3 + 1 = 100 bytes.
+
+*/
+/*
+Implementation notes:
+
+The compression code uses an exhaustive brute-force search for matches.
+The "greedy" approach is very slow; the "optimal" approach is extremely
+slow.  It executes quickly on a modern machine, but would take a long
+time to run on an Apple II.  On the bright side, with "greedy" parsing
+it uses very little memory, and an optimized 6502/65816 implementation
+might run in a reasonable amount of time.
+
+Unrelated to the compression is the handling of the "screen holes".
+Of the hi-res screen's 8192 bytes, 512 are invisible.  We can teach the
+compression code to skip over them, but that will require additional
+code and will interrupt our literal/match strings every 120 bytes, so
+it's better to alter the contents of the holes so that they blend into
+the surrounding data and handle them as a match string.
+
+Sometimes filling holes with a nearby pattern is not a win.  This is
+particularly noticeable for the old digitized images in the "contrib"
+folder, which have widely varying pixel values near the edges.  It turns
+out we do slightly better by zeroing the holes out, which allows them
+to match previous holes.  Also, sometimes there are patterns in the
+file that happen to match eight zeroes followed by a splash of color.
+
+Generally speaking the difference in output size is a few dozen bytes,
+though in rare cases it can noticeably improve (-200) or cost (+50).
+We resolve this conundrum by compressing the file twice and using whichever
+works best.
+
+*/
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <string.h>
+#include <assert.h>
+#include <unistd.h>
+
+enum ProgramMode {
+    MODE_UNKNOWN, MODE_COMPRESS, MODE_UNCOMPRESS, MODE_TEST
+};
+
+#define MAX_SIZE 8192
+#define MIN_SIZE (MAX_SIZE - 8)     // without final screen hole
+#define MAX_EXPANSION 100           // ((MAX_SIZE/255)+1) * 3 + 1
+
+#define MIN_MATCH_LEN 4
+#define MAX_MATCH_LEN 255
+#define MAX_LITERAL_LEN 255
+#define INITIAL_LEN 15
+
+#define EMPTY_MATCH_TOKEN 253
+#define EOD_MATCH_TOKEN 254
+
+#define LZ4FH_MAGIC 0x66
+
+//#define DEBUG_MSGS
+#ifdef DEBUG_MSGS
+# define DBUG(x) printf x
+#else
+# define DBUG(x)
+#endif
+
+/*
+ * Print usage info.
+ */ +static void usage(const char* argv0) +{ + fprintf(stderr, "fhpack v1.0 by Andy McFadden -*- "); + fprintf(stderr, "Copyright 2015 by faddenSoft\n"); + fprintf(stderr, + "Source code available from https://github.com/fadden/fhpack\n\n"); + fprintf(stderr, "Usage:\n"); + fprintf(stderr, " fhpack {-c|-d} [-h] [-1|-9] infile outfile\n\n"); + fprintf(stderr, " fhpack {-t} [-h] [-1|-9] infile1 [infile2...] \n\n"); + fprintf(stderr, "Use -c to compress, -d to decompress, -t to test\n"); + fprintf(stderr, " -h: don't fill or remove hi-res screen holes\n"); + fprintf(stderr, " -9: high compression (default)\n"); + fprintf(stderr, " -1: fast compression\n"); + fprintf(stderr, "\n"); + fprintf(stderr, "Example: fhpack -c foo.pic foo.lz4fh\n"); +} + + +/* + * Zero out the "screen holes". + */ +void zeroHoles(uint8_t* inBuf) +{ + uint8_t* inPtr = inBuf + 120; + + while (inPtr < inBuf + MAX_SIZE) { + memset(inPtr, 0, 8); + inPtr += 128; + } +} + +/* + * Fill in the "screen holes" in the image. The hi-res page has + * three 40-byte chunks of visible data, followed by 8 bytes of unseen + * data (padding it to 128). + * + * Instead of simply zeroing them out, we want to examine the data that + * comes before and after, copying whichever seems best into the hole. + * If there's a repeating color pattern (2a 55 2a 55), the hole just + * becomes part of the string, and will be handled as part of a long match. + * + * We can match the bytes that appear before or after the hole. + * Ideally we'd use whichever yields the longest run. + * + * "inBuf" holds MAX_SIZE bytes. + */ +void fillHoles(uint8_t* inBuf) +{ + uint8_t* inPtr = inBuf + 120; + while (inPtr < inBuf + MAX_SIZE) { + // check to see if the bytes that follow are a better match + // ("greedy" parsing can be suboptimal) + uint8_t* checkp = inPtr + 8; + bool useAfter = false; + if (checkp < inBuf + MAX_SIZE) { + if (checkp[0] == checkp[2] && checkp[1] == checkp[3]) { + DBUG((" bytes-after looks good at +0x%04lx\n", + checkp - inBuf)); + useAfter = true; + } else { + DBUG((" bytes-before used at +0x%04lx\n", checkp - inBuf)); + } + } else { + DBUG((" bytes-before used at end +0x%04lx\n", checkp - inBuf)); + } + + // Do an 8-byte overlapping copy. We can overlap by 2 bytes + // or 4 bytes depending on whether we want a 16-bit or 32-bit + // repeating pattern. + if (useAfter) { + for (int i = 7; i >= 0; i--) { + inPtr[i] = inPtr[i + 2]; + } + } else { + for (int i = 0; i < 8; i++) { + inPtr[i] = inPtr[i - 2]; + } + } + + inPtr += 128; + } +} + +/* + * Computes the number of characters that match. Stops when it finds + * a mismatching byte, or "count" is reached. + */ +size_t getMatchLen(const uint8_t* str1, const uint8_t* str2, size_t count) +{ + size_t matchLen = 0; + while (count-- && *str1++ == *str2++) { + matchLen++; + } + return matchLen; +} + +/* + * Finds a match for the string at "matchPtr", in the buffer pointed + * to by "inBuf" with length "inLen". "matchPtr" must be inside "inBuf". + * + * We explicitly allow data to copy over itself, so a run of 200 0x00 + * bytes could be represented by a literal 0x00 followed immediately + * by a match of length 199. We do need to ensure that the initial + * literal(s) go out first, though, so we use "maxStartOffset" to + * restrict where matches may be found. + * + * Returns the length of the longest match found, with the match + * offset in "*pMatchOffset". 
+ */ +size_t findLongestMatch(const uint8_t* matchPtr, const uint8_t* inBuf, + size_t inLen, size_t* pMatchOffset) +{ + size_t maxStartOffset = matchPtr - inBuf; + size_t longest = 0; + size_t longestOffset = 0; + DBUG((" findLongestMatch: maxSt=%zd\n", maxStartOffset)); + + // Brute-force scan through the buffer. Start from the beginning, + // and continue up to the point we've generated until now. (We + // can't search the *entire* buffer right away because the decoder + // can only copy matches from previously-decoded data.) + for (size_t ii = 0; ii < maxStartOffset; ii++) { + // Limit the length of the match by the length of the buffer. + // We don't want the match code to go wandering off the end. + // The match source is always earlier than matchPtr, so we + // want to cap the length based on the distance from matchPtr + // to the end of the buffer. + size_t maxMatchLen = inLen - (matchPtr - inBuf); + if (maxMatchLen > MAX_MATCH_LEN) { + maxMatchLen = MAX_MATCH_LEN; + } + if (maxMatchLen < MIN_MATCH_LEN) { + // too close to end of buffer, no point continuing + break; + } + + //DBUG((" maxMatchLen is %zd\n", maxMatchLen)); + + size_t matchLen = getMatchLen(matchPtr, inBuf + ii, maxMatchLen); + if (matchLen > longest) { + longest = matchLen; + longestOffset = ii; + } + if (matchLen == maxMatchLen) { + // Not going to find a longer one -- any future matches + // will be the same length or shorter. + break; + } + } + + + *pMatchOffset = longestOffset; + return longest; +} + +/* + * Compress a buffer, from "inBuf" to "outBuf". + * + * The input buffer holds between MIN_SIZE and MAX_SIZE bytes (inclusive), + * depending on the length of the source material and whether or not + * we're attempting to preserve the screen holes. + * + * Returns the amount of data in "outBuf" on success, or 0 on failure. + */ +size_t compressBufferOptimally(uint8_t* outBuf, const uint8_t* inBuf, + size_t inLen) +{ + // Optimal parsing for data compression is a lot like computing the + // shortest distance between two points in a directed graph. For + // each location, there are two possible "paths": a literal at this + // point, which advances us one byte forward, or a match at this + // point, which takes us several bytes forward. + // + // We walk through the file backward. At each position, we compute + // whether or not a match exists, and then determine the length from + // the current position to the end depending on whether we handle + // the value as a literal or the start of a match. When we reach the + // start of the file, we generate output by walking forward, selecting + // the path based on whether a literal or match results in the best + // outcome. + struct OptNode { + size_t totalCost; // running total "best" length + size_t matchLength; // zero if no match or literal is best + size_t matchOffset; + + size_t literalLength; // running total of literal run length + }; + OptNode* optList = (OptNode*) calloc(1, (inLen+1) * sizeof(OptNode)); + + // + // Pass 1: determine optimal path + // + + for (unsigned int i = inLen - 1; i < inLen; i--) { + size_t costForMatch, costForLiteral; + + // First consider the "match" path. It doesn't matter what + // follows the match, as that has no local effect on the output + // length. 
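+        // (Cost accounting, per the format summary at the top of the
+        // file: a match chunk costs 3 output bytes -- the mixed-length
+        // byte plus a two-byte offset -- and a long match pays one more
+        // byte for its length extension.  That is what the "+ 3" and
+        // conditional "++" below charge for.)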
+ size_t matchOffset; + size_t longestMatch = findLongestMatch(inBuf + i, inBuf, inLen, + &matchOffset); + if (longestMatch < MIN_MATCH_LEN) { + // no match to consider; leave optList[] values at zero + costForMatch = MAX_SIZE * 2; // arbitrary large value + } else { + // 4-14 bytes, fits in mixed-len byte + optList[i].matchLength = longestMatch; + optList[i].matchOffset = matchOffset; + + // total is previous total + 3 for match + costForMatch = optList[i + longestMatch].totalCost + 3; + if (longestMatch >= INITIAL_LEN) { + costForMatch++; + } + } + + // Now consider the "literal" path. If the next node is a + // literal, we add on to the existing run. If it's a match, + // we're a length-1 literal. + if (i == inLen - 1) { + // special-case start (essentially a 1-byte file) + optList[i].literalLength = 1; + optList[i].totalCost = 2; + costForLiteral = 2; // mixed-len byte + literal + } else { + if (optList[i+1].matchLength != 0) { + // next is match + optList[i].literalLength = 1; + costForLiteral = 1; // literal; mixed-len byte in match + } else if (optList[i+1].literalLength == MAX_LITERAL_LEN) { + // next is max-length literal, start a new one + optList[i].literalLength = 1; + costForLiteral = 3; // mixed-len byte + literal + nomatch + } else { + // next is sub-max-length literal, join it + size_t newLiteralLen = optList[i+1].literalLength + 1; + optList[i].literalLength = newLiteralLen; + costForLiteral = 1; + + if (newLiteralLen == INITIAL_LEN) { + // just hit 15, now need the extension byte + costForLiteral++; + } + } + costForLiteral += optList[i + 1].totalCost; + } + + if (costForLiteral > costForMatch) { + // use the match + assert(longestMatch != 0); + optList[i].totalCost = costForMatch; + DBUG(("0x%04x use-mat [l=%zd m=%zd] (len=%zd off=0x%04zx) --> 0x%04zx\n", + i, costForLiteral, costForMatch, longestMatch, + matchOffset, optList[i].totalCost)); + } else { + // use the literal -- zero the matchLength as a flag + optList[i].matchLength = 0; + optList[i].totalCost = costForLiteral; + DBUG(("0x%04x use-lit [l=%zd m=%zd] (len=%zd) --> 0x%04zx\n", + i, costForLiteral, costForMatch, optList[i].literalLength, + optList[i].totalCost)); + } + } + + // add one for the magic number; does not include end-of-data marker + // (which will be +1 if the last thing is a literal, +2 if a match) + size_t predictedLength = optList[0].totalCost + 1; + DBUG(("predicted length is %zd\n", predictedLength)); + + + // + // Pass 2: generate output from optimal path + // + + //const uint8_t* inPtr = inBuf; + uint8_t* outPtr = outBuf; + + *outPtr++ = LZ4FH_MAGIC; + + const uint8_t* literalSrcPtr = NULL; + size_t numLiterals = 0; + + for (unsigned int i = 0; i < inLen; ) { + if (optList[i].matchLength == 0) { + // no match at this point, select literals + if (numLiterals != 0) { + // Previous entry was literals. Because we parsed it + // backwards, we can end up with 32 literals followed + // by 255 literals, rather than the other way around. 
+ DBUG((" output literal-literal (%zd)\n", numLiterals)); + if (numLiterals <= INITIAL_LEN) { + *outPtr++ = (numLiterals << 4) | 0x0f; + } else { + *outPtr++ = 0xff; + *outPtr++ = numLiterals - INITIAL_LEN; + } + memcpy(outPtr, literalSrcPtr, numLiterals); + outPtr += numLiterals; + *outPtr++ = EMPTY_MATCH_TOKEN; + } + numLiterals = optList[i].literalLength; + literalSrcPtr = inBuf + i; + + // advance to next node + i += numLiterals; + } else { + // found a match, output previous literals first + size_t longestMatch = optList[i].matchLength; + size_t matchOffset = optList[i].matchOffset; + size_t adjustedMatch = longestMatch - MIN_MATCH_LEN; + + // Start by emitting the 4/4 length byte. + uint8_t mixedLengths; + if (adjustedMatch <= INITIAL_LEN) { + mixedLengths = adjustedMatch; + } else { + mixedLengths = INITIAL_LEN; + } + if (numLiterals <= INITIAL_LEN) { + mixedLengths |= numLiterals << 4; + } else { + mixedLengths |= INITIAL_LEN << 4; + } + DBUG((" match len=%zd off=0x%04zx lits=%zd mix=0x%02x\n", + longestMatch, matchOffset, numLiterals, + mixedLengths)); + *outPtr++ = mixedLengths; + + // Output the literals, starting with the extended length. + if (numLiterals >= INITIAL_LEN) { + *outPtr++ = numLiterals - INITIAL_LEN; + } + memcpy(outPtr, literalSrcPtr, numLiterals); + outPtr += numLiterals; + numLiterals = 0; + literalSrcPtr = NULL; // debug/sanity check + + // Now output the match, starting with the extended length. + if (adjustedMatch >= INITIAL_LEN) { + *outPtr++ = adjustedMatch - INITIAL_LEN; + } + *outPtr++ = matchOffset & 0xff; + *outPtr++ = (matchOffset >> 8) & 0xff; + + i += longestMatch; + } + } + + // housekeeping check -- factor in end-of-data circumstances + predictedLength++; + if (numLiterals == 0) { + predictedLength++; + } + + // Dump any remaining literals, with the end-of-data indicator + // in the match len. + if (numLiterals <= INITIAL_LEN) { + *outPtr++ = (numLiterals << 4) | 0x0f; + } else { + *outPtr++ = 0xff; + *outPtr++ = numLiterals - INITIAL_LEN; + } + memcpy(outPtr, literalSrcPtr, numLiterals); + outPtr += numLiterals; + + *outPtr++ = EOD_MATCH_TOKEN; + + DBUG(("Predicted length %zd, actual %ld\n", + predictedLength, outPtr - outBuf)); + + free(optList); + return outPtr - outBuf; +} + +/* + * Compress a buffer, from "inBuf" to "outBuf". + * + * The input buffer holds between MIN_SIZE and MAX_SIZE bytes (inclusive), + * depending on the length of the source material and whether or not + * we're attempting to preserve the screen holes. + * + * Returns the amount of data in "outBuf" on success, or 0 on failure. + */ +size_t compressBufferGreedily(uint8_t* outBuf, const uint8_t* inBuf, + size_t inLen) +{ + const uint8_t* inPtr = inBuf; + uint8_t* outPtr = outBuf; + + const uint8_t* literalSrcPtr = NULL; + size_t numLiterals = 0; + + *outPtr++ = LZ4FH_MAGIC; + + // Basic strategy: walk forward, searching for a match. When we + // find one, output the literals then the match. + // + // If the literal would cause us to exceed the maximum literal + // length, output the previous literals with a "no match" indicator. + while (inPtr < inBuf + inLen) { + DBUG(("Loop: off 0x%08lx\n", inPtr - inBuf)); + + // sanity-check on MAX_EXPANSION value + assert(outPtr - outBuf < MAX_SIZE + MAX_EXPANSION); + + size_t matchOffset; + size_t longestMatch = findLongestMatch(inPtr, inBuf, inLen, + &matchOffset); + if (longestMatch < MIN_MATCH_LEN) { + // No good match found here, emit as literal. 
+ if (numLiterals == MAX_LITERAL_LEN) { + // We've maxed out the literal string length. Emit + // the previously literals with an empty match indicator. + DBUG((" max literals reached")); + *outPtr++ = 0xff; // literal-len=15, match-len=15 + *outPtr++ = MAX_LITERAL_LEN - INITIAL_LEN; // 240 + memcpy(outPtr, literalSrcPtr, numLiterals); + outPtr += numLiterals; + + // Emit empty match indicator. + *outPtr++ = EMPTY_MATCH_TOKEN; + + // Reset literal len, continue. + numLiterals = 0; + } + if (numLiterals == 0) { + // Start of run of literals. Save pointer to data. + literalSrcPtr = inPtr; + } + numLiterals++; + inPtr++; + } else { + // Good match found. + size_t adjustedMatch = longestMatch - MIN_MATCH_LEN; + + // Start by emitting the 4/4 length byte. + uint8_t mixedLengths; + if (adjustedMatch <= INITIAL_LEN) { + mixedLengths = adjustedMatch; + } else { + mixedLengths = INITIAL_LEN; + } + if (numLiterals <= INITIAL_LEN) { + mixedLengths |= numLiterals << 4; + } else { + mixedLengths |= INITIAL_LEN << 4; + } + DBUG((" match len=%zd off=0x%04zx lits=%zd mix=0x%02x\n", + longestMatch, matchOffset, numLiterals, + mixedLengths)); + *outPtr++ = mixedLengths; + + // Output the literals, starting with the extended length. + if (numLiterals >= INITIAL_LEN) { + *outPtr++ = numLiterals - INITIAL_LEN; + } + memcpy(outPtr, literalSrcPtr, numLiterals); + outPtr += numLiterals; + numLiterals = 0; + literalSrcPtr = NULL; // debug/sanity check + + // Now output the match, starting with the extended length. + if (adjustedMatch >= INITIAL_LEN) { + *outPtr++ = adjustedMatch - INITIAL_LEN; + } + *outPtr++ = matchOffset & 0xff; + *outPtr++ = (matchOffset >> 8) & 0xff; + inPtr += longestMatch; + } + } + + // Dump any remaining literals, with the end-of-data indicator + // in the match len. + if (numLiterals <= INITIAL_LEN) { + *outPtr++ = (numLiterals << 4) | 0x0f; + } else { + *outPtr++ = 0xff; + *outPtr++ = numLiterals - INITIAL_LEN; + } + memcpy(outPtr, literalSrcPtr, numLiterals); + outPtr += numLiterals; + + *outPtr++ = EOD_MATCH_TOKEN; + + return outPtr - outBuf; +} + +/* + * Uncompress from "inBuf" to "outBuf". + * + * Given valid data, "inLen" is not necessary. It can be used as an + * error check. + * + * Returns the uncompressed length on success, 0 on failure. 
+ */ +size_t uncompressBuffer(uint8_t* outBuf, const uint8_t* inBuf, size_t inLen) +{ + uint8_t* outPtr = outBuf; + const uint8_t* inPtr = inBuf; + + if (*inPtr++ != LZ4FH_MAGIC) { + fprintf(stderr, "Missing LZ4FH magic\n"); + return 0; + } + + while (true) { + uint8_t mixedLen = *inPtr++; + + int literalLen = mixedLen >> 4; + if (literalLen != 0) { + if (literalLen == INITIAL_LEN) { + literalLen += *inPtr++; + } + DBUG(("Literals: %d\n", literalLen)); + if ((outPtr - outBuf) + literalLen > (long) MAX_SIZE || + (inPtr - inBuf) + literalLen > (long) inLen) { + fprintf(stderr, "Buffer overrun\n"); + return 0; + } + memcpy(outPtr, inPtr, literalLen); + outPtr += literalLen; + inPtr += literalLen; + } else { + DBUG(("Literals: none\n")); + } + + int matchLen = mixedLen & 0x0f; + if (matchLen == INITIAL_LEN) { + uint8_t addon = *inPtr++; + if (addon == EMPTY_MATCH_TOKEN) { + DBUG(("Match: none\n")); + matchLen = - MIN_MATCH_LEN; + } else if (addon == EOD_MATCH_TOKEN) { + DBUG(("Hit end-of-data at 0x%04lx\n", outPtr - outBuf)); + break; // out of while + } else { + matchLen += addon; + } + } + + matchLen += MIN_MATCH_LEN; + if (matchLen != 0) { + int matchOffset = *inPtr++; + matchOffset |= (*inPtr++) << 8; + DBUG(("Match: %d at %d\n", matchLen, matchOffset)); + // Can't use memcpy() here, because we need to guarantee + // that the match is overlapping. + uint8_t* srcPtr = outBuf + matchOffset; + if ((outPtr - outBuf) + matchLen > MAX_SIZE || + (srcPtr - outBuf) + matchLen > MAX_SIZE) { + fprintf(stderr, "Buffer overrun\n"); + return 0; + } + while (matchLen-- != 0) { + *outPtr++ = *srcPtr++; + } + } + } + + if (inPtr - inBuf != (long) inLen) { + fprintf(stderr, "Warning: uncompress used only %ld of %zd bytes\n", + inPtr - inBuf, inLen); + } + + return outPtr - outBuf; +} + +/* + * Compress a file, from "inFileName" to "outFileName". + * + * Returns 0 on success. + */ +int compressFile(const char* outFileName, const char* inFileName, + bool doPreserveHoles, bool useGreedyParsing) +{ + int result = -1; + uint8_t inBuf1[MAX_SIZE]; + uint8_t inBuf2[MAX_SIZE]; + uint8_t verifyBuf[MAX_SIZE]; + uint8_t outBuf1[MAX_SIZE + MAX_EXPANSION]; + uint8_t outBuf2[MAX_SIZE + MAX_EXPANSION]; + uint8_t* outBuf = NULL; + uint8_t* inBuf = NULL; + size_t outSize, sourceLen, uncompressedLen; + FILE* outfp = NULL; + FILE* infp; + + infp = fopen(inFileName, "rb"); + if (infp == NULL) { + perror("Unable to open input file"); + return -1; + } + + if (outFileName != NULL) { + outfp = fopen(outFileName, "wb"); + if (outfp == NULL) { + perror("Unable to open output file"); + fclose(infp); + return -1; + } + } + + fseek(infp, 0, SEEK_END); + long fileLen = ftell(infp); + rewind(infp); + if (fileLen < MIN_SIZE || fileLen > MAX_SIZE) { + fprintf(stderr, "ERROR: input file is %ld bytes, must be %d - %d\n", + fileLen, MIN_SIZE, MAX_SIZE); + goto bail; + } + + // Read data into buffer. + if (fread(inBuf1, 1, fileLen, infp) != (size_t) fileLen) { + perror("Failed while reading data"); + goto bail; + } + + if (doPreserveHoles) { + // Don't modify the input. 
+ sourceLen = fileLen; // retain original file length + if (useGreedyParsing) { + outSize = compressBufferGreedily(outBuf1, inBuf1, sourceLen); + } else { + outSize = compressBufferOptimally(outBuf1, inBuf1, sourceLen); + } + inBuf = inBuf1; + outBuf = outBuf1; + } else { + sourceLen = MIN_SIZE; // always drop the last 8 bytes + memcpy(inBuf2, inBuf1, sourceLen); + + // try it twice, with zero-filled holes and content-filled holes + + size_t outSize1; + zeroHoles(inBuf1); + if (useGreedyParsing) { + outSize1 = compressBufferGreedily(outBuf1, inBuf1, sourceLen); + } else { + outSize1 = compressBufferOptimally(outBuf1, inBuf1, sourceLen); + } + + size_t outSize2; + fillHoles(inBuf2); + if (useGreedyParsing) { + outSize2 = compressBufferGreedily(outBuf2, inBuf2, sourceLen); + } else { + outSize2 = compressBufferOptimally(outBuf2, inBuf2, sourceLen); + } + + if (false) { // save hole-punched output for examination + FILE* foo = fopen("HOLES", "wb"); + fwrite(inBuf2, 1, MIN_SIZE, foo); + fclose(foo); + } + + if (outSize1 <= outSize2) { + printf(" using zeroed-out holes (%zd vs. %zd)\n", + outSize1, outSize2); + outSize = outSize1; + inBuf = inBuf1; + outBuf = outBuf1; + } else { + printf(" using filled-in holes (%zd vs. %zd)\n", + outSize2, outSize1); + outSize = outSize2; + inBuf = inBuf2; + outBuf = outBuf2; + } + } + + if (outSize == 0) { + fprintf(stderr, "Compression failed\n"); + goto bail; + } + DBUG(("*** outSize is %zd\n", outSize)); + + // uncompress the data we just compressed + memset(verifyBuf, 0xcc, sizeof(verifyBuf)); + uncompressedLen = uncompressBuffer(verifyBuf, outBuf, outSize); + if (uncompressedLen != sourceLen) { + fprintf(stderr, "ERROR: verify expanded %zd of expected %zd bytes\n", + uncompressedLen, sourceLen); + goto bail; + } + + // byte-for-byte comparison + for (size_t ii = 0; ii < sourceLen; ii++) { + if (inBuf[ii] != verifyBuf[ii]) { + fprintf(stderr, + "ERROR: expansion mismatch (byte %zd, 0x%02x 0x%02x)\n", + ii, inBuf[ii], verifyBuf[ii]); + goto bail; + } + } + DBUG(("Verification succeeded\n")); + + if (outfp != NULL) { + /* write the data */ + if (fwrite(outBuf, 1, outSize, outfp) != outSize) { + perror("Failed while writing data"); + goto bail; + } + } else { + // must be in test mode + printf(" success -- compressed len is %zd\n", outSize); + } + + result = 0; + +bail: + fclose(infp); + if (outfp != NULL) { + fclose(outfp); + } + if (result != 0 && outFileName != NULL) { + unlink(outFileName); + } + return result; +} + +/* + * Uncompress data from one file to another. + * + * Returns 0 on success. + */ +int uncompressFile(const char* outFileName, const char* inFileName) +{ + int result = -1; + uint8_t inBuf[MAX_SIZE + MAX_EXPANSION]; + uint8_t outBuf[MAX_SIZE]; + size_t outSize; + FILE* outfp = NULL; + FILE* infp; + + infp = fopen(inFileName, "rb"); + if (infp == NULL) { + perror("Unable to open input file"); + return -1; + } + + outfp = fopen(outFileName, "wb"); + if (outfp == NULL) { + perror("Unable to open output file"); + fclose(infp); + return -1; + } + + fseek(infp, 0, SEEK_END); + long fileLen = ftell(infp); + rewind(infp); + if (fileLen < 10 || fileLen > MAX_SIZE + MAX_EXPANSION) { + // 10 just ensures we have enough for magic number, chunk, eod + fprintf(stderr, "ERROR: input file is %ld bytes, must be < %d\n", + fileLen, MAX_SIZE + MAX_EXPANSION); + goto bail; + } + + // Read data into buffer. 
+ if (fread(inBuf, 1, fileLen, infp) != (size_t) fileLen) { + perror("Failed while reading data"); + goto bail; + } + + outSize = uncompressBuffer(outBuf, inBuf, fileLen); + if (outSize == 0) { + goto bail; + } + DBUG(("*** outSize is %zd\n", outSize)); + + /* write the data */ + if (fwrite(outBuf, 1, outSize, outfp) != outSize) { + perror("Failed while writing data"); + goto bail; + } + + result = 0; + +bail: + fclose(infp); + fclose(outfp); + if (result != 0) { + unlink(outFileName); + } + return result; +} + +/* + * Process args. + */ +int main(int argc, char* argv[]) +{ + ProgramMode mode = MODE_UNKNOWN; + bool doPreserveHoles = false; + bool useGreedyParsing = false; + bool wantUsage = false; + int opt; + + while ((opt = getopt(argc, argv, "19cdth")) != -1) { + switch (opt) { + case '1': + useGreedyParsing = true; + break; + case '9': + useGreedyParsing = false; + break; + case 'c': + if (mode == MODE_UNKNOWN) { + mode = MODE_COMPRESS; + } else { + wantUsage = true; + } + break; + case 'd': + if (mode == MODE_UNKNOWN) { + mode = MODE_UNCOMPRESS; + } else { + wantUsage = true; + } + break; + case 't': + if (mode == MODE_UNKNOWN) { + mode = MODE_TEST; + } else { + wantUsage = true; + } + break; + case 'h': + doPreserveHoles = true; + break; + default: + usage(argv[0]); + return 2; + } + } + + if (argc - optind < 1 || + (mode != MODE_TEST && argc - optind != 2)) + { + wantUsage = true; + } + + if (mode == MODE_UNKNOWN || wantUsage) { + usage(argv[0]); + return 2; + } + + const char* inFileName = argv[optind]; + const char* outFileName = argv[optind+1]; + + int result = 0; + if (mode == MODE_COMPRESS) { + printf("Compressing %s -> %s\n", inFileName, outFileName); + result = compressFile(outFileName, inFileName, doPreserveHoles, + useGreedyParsing); + } else if (mode == MODE_UNCOMPRESS) { + printf("Expanding %s -> %s\n", inFileName, outFileName); + result = uncompressFile(outFileName, inFileName); + } else { + while (optind < argc) { + printf("Testing %s\n", argv[optind]); + result |= compressFile(NULL, argv[optind], doPreserveHoles, + useGreedyParsing); + optind++; + } + } + + return (result != 0); +} + diff --git a/fhpack_disks.zip b/fhpack_disks.zip new file mode 100644 index 0000000..8dc322f Binary files /dev/null and b/fhpack_disks.zip differ diff --git a/make-test-pic.cpp b/make-test-pic.cpp new file mode 100644 index 0000000..eb23b75 --- /dev/null +++ b/make-test-pic.cpp @@ -0,0 +1,114 @@ +/* + * Generate some 8K images for fhpack testing. + * By Andy McFadden + * Version 1.0, August 2015 + * + * Copyright 2015 by faddenSoft. All Rights Reserved. + * See the LICENSE.txt file for distribution terms (Apache 2.0). 
+ */ +#include +#include +#include +#include + +const char* TEST_ALL_ZERO = "allzero#060000"; +const char* TEST_ALL_GREEN = "allgreen#060000"; +const char* TEST_NO_MATCH = "nomatch#060000"; + +int main() +{ + FILE* fp; + + if (access(TEST_ALL_ZERO, F_OK) == 0) { + printf("NOT overwriting %s\n", TEST_ALL_ZERO); + } else { + fp = fopen(TEST_ALL_ZERO, "w"); + for (int i = 0; i < 8192; i++) { + putc('\0', fp); + } + fclose(fp); + } + + if (access(TEST_ALL_GREEN, F_OK) == 0) { + printf("NOT overwriting %s\n", TEST_ALL_GREEN); + } else { + fp = fopen(TEST_ALL_GREEN, "w"); + for (int i = 0; i < 4096; i++) { + putc(0x2a, fp); + putc(0x55, fp); + } + fclose(fp); + } + + if (access(TEST_NO_MATCH, F_OK) == 0) { + printf("NOT overwriting %s\n", TEST_NO_MATCH); + } else { + fp = fopen(TEST_NO_MATCH, "w"); + for (int ic = 0; ic < 252; ic++) { + putc(ic, fp); + putc(ic+1, fp); + putc(ic+2, fp); + putc(ic+3, fp); + } + // 1008 + for (int ic = 0; ic < 252; ic++) { + putc(ic, fp); + putc(ic+2, fp); + putc(ic+1, fp); + putc(ic+3, fp); + } + // 2016 + for (int ic = 0; ic < 252; ic++) { + putc(ic, fp); + putc(ic+1, fp); + putc(ic+3, fp); + putc(ic+2, fp); + } + // 3024 + for (int ic = 0; ic < 252; ic++) { + putc(ic, fp); + putc(ic+3, fp); + putc(ic+2, fp); + putc(ic+1, fp); + } + // 4032 + for (int ic = 0; ic < 252; ic++) { + putc(ic, fp); + putc(ic+3, fp); + putc(ic+1, fp); + putc(ic+2, fp); + } + // 5040 + for (int ic = 0; ic < 252; ic++) { + putc(ic+1, fp); + putc(ic, fp); + putc(ic+2, fp); + putc(ic+3, fp); + } + // 6048 + for (int ic = 0; ic < 252; ic++) { + putc(ic+1, fp); + putc(ic+2, fp); + putc(ic, fp); + putc(ic+3, fp); + } + // 7056 + for (int ic = 0; ic < 252; ic++) { + putc(ic+1, fp); + putc(ic+2, fp); + putc(ic+3, fp); + putc(ic, fp); + } + // 8064 + for (int ic = 0; ic < 32; ic++) { + putc(ic+2, fp); + putc(ic+1, fp); + putc(ic+3, fp); + putc(ic, fp); + } + fclose(fp); + } + + return 0; +} +