Commit Graph

475 Commits

Author SHA1 Message Date
emmanuel-marty 6aa2dae4b3 Add context to libdivsufsort, don't allocate memory during compression 2019-04-07 00:01:22 +02:00
emmanuel-marty e24320b23b Save 1 byte in 8088 decompressor 2019-04-06 00:21:15 +02:00
emmanuel-marty 1353573af1 Small cleanup for end-of-data handling in decompression, check commands 2019-04-06 00:02:11 +02:00
emmanuel-marty a785010448 Revert token to O|LLL|MMMM; revert to always shifting the match offset by 1; set raw block end marker as a large zero-size match 2019-04-05 23:16:05 +02:00
emmanuel-marty 06e6a14871 Add optimization pass to reduce the number of command tokens in the compressed data blocks without changing the compression ratio 2019-04-05 16:32:11 +02:00
emmanuel-marty f05359b63d Don't write an unnecessary footer byte when emitting a raw block 2019-04-05 12:13:51 +02:00
emmanuel-marty 1ef1ad8111 Reorganize token byte for faster decoding on 8-bit CPUs, without affecting the compression ratio 2019-04-05 11:58:44 +02:00
Emmanuel Marty 33b62c004a Update format description 2019-04-05 10:46:24 +02:00
emmanuel-marty c7692cf688 Store 16-bit lengths and match offsets directly, to simplify decompression on 8-bit CPUs without affecting the compression ratio 2019-04-05 10:42:06 +02:00
emmanuel-marty bdc4e85948 Fix typos in format description 2019-04-05 09:28:28 +02:00
emmanuel-marty c86d38ba63 Reduce the number of literals required at the end of a compressed block 2019-04-05 09:28:16 +02:00
Emmanuel Marty bfaa3790d0 Update corpus compression stats for v0.2.0 2019-04-03 13:19:41 +02:00
emmanuel-marty 4f26bb086c Add LICENSE 2019-04-03 13:06:46 +02:00
emmanuel-marty 0744ec99de Unpack raw blocks in 8088 decompressor 2019-04-03 13:05:32 +02:00
emmanuel-marty 18fc4da994 Implement raw block mode 2019-04-03 13:05:10 +02:00
emmanuel-marty 11d1ff8cd7 Use 3-byte file header 2019-04-03 11:26:36 +02:00
emmanuel-marty 1f04705845 Fix degenerate case; use full 32 bits for suffix array intervals; make EOD parsable by a decompressor as a long 0 match offset as well; use more aggressive compression settings. 2019-04-03 10:16:12 +02:00
emmanuel-marty fcfdbe9745 Add autodocs to internal compressor functions 2019-04-02 15:03:21 +02:00
emmanuel-marty fa1ef05a31 Merge branch 'master' of https://github.com/emmanuel-marty/lzsa 2019-04-02 13:21:55 +02:00
emmanuel-marty 06396f5ba6 Save 2 bytes in 8088 decompressor 2019-04-02 13:21:45 +02:00
Emmanuel Marty 663e154429 Add compression ratio stats for well-known corpus files 2019-04-02 12:49:54 +02:00
emmanuel-marty 8b992bb33a Add autodocs to public functions in compressor and decompressor 2019-04-02 12:12:12 +02:00
Emmanuel Marty cd7517fb65 Fix typo in match offsets note 2019-04-01 21:02:08 +02:00
Emmanuel Marty fde853e095 Clarify the encoding of matches, fix some broken formatting. 2019-04-01 21:00:07 +02:00
marty-emmanuel e216b0c544 Initial checkin 2019-04-01 18:04:56 +02:00