mobygamer
30192238ea
Rewrite 8088 jumptable decompressor for maximum speed
...
This is a rewrite of LZSA1JMP.ASM to use a 256-element jumptable, which
allows the code to handle all of the hot paths (common cases) without
any branching. This not only reduces branches (which are very costly on
x86) to a bare minimum, but also grants us foreknowledge in a decode
path of what steps can be skipped.
The new code is 12.7% faster than the old code, and assembles to less
than 3K of object code and data.
2019-10-26 23:34:24 -05:00
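A minimal sketch of the 256-entry table idea described in the commit above, in Python rather than 8088 assembly. It is not taken from LZSA1JMP.ASM. It assumes the O|LLL|MMMM token layout named elsewhere in this log, and further assumes that LLL == 7 and MMMM == 15 mean "an extra length byte follows" and that O selects a 16-bit offset. The names Path, build_table and TABLE are hypothetical. In assembly each slot would hold the address of a specialized code path instead of a tuple.

```python
from typing import NamedTuple


class Path(NamedTuple):
    """Decisions made once per token value, at table-build time."""
    long_offset: bool        # assumed: O bit set means a 16-bit match offset
    literals: int            # LLL field, 0..7
    literals_extended: bool  # assumed: LLL == 7 means an extra length byte
    match_field: int         # MMMM field, 0..15
    match_extended: bool     # assumed: MMMM == 15 means an extra length byte


def build_table():
    """Precompute one entry for every possible token byte."""
    table = []
    for token in range(256):
        lit = (token >> 4) & 0x07
        mat = token & 0x0F
        table.append(Path(bool(token & 0x80), lit, lit == 7, mat, mat == 15))
    return table


TABLE = build_table()


def classify(token):
    """One indexed lookup replaces the per-field tests in the decode loop."""
    return TABLE[token]
```

Under these assumptions, every table entry whose two "extended" flags are both false is a hot path that already knows no extra length bytes need to be read.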
Emmanuel Marty
710d7e05d6
Fix comments
2019-07-14 10:13:16 +02:00
Emmanuel Marty
0b540431fc
Fix comments
2019-07-14 10:12:32 +02:00
Emmanuel Marty
981b1d5925
NASM versions of Jim Leonard's speed-optimized depackers
2019-07-14 10:11:16 +02:00
Emmanuel Marty
19e8bc0468
Merge pull request #17 from MobyGamer/decompressor/8086_speed
...
Time-efficient LZSA2 decompressor
2019-07-14 09:28:40 +02:00
mobygamer
c38b582e73
Time-efficient LZSA2 decompressor
...
This commit provides a time-efficient LZSA2 decompressor for
the 8088 (and higher) CPU. Decompression speed is roughly 50% faster
than ZX7 on the same hardware.
2019-07-13 22:35:28 -05:00
Emmanuel Marty
6445d9ff6f
Merge pull request #16 from peterferrie/master - thanks!
...
smaller
2019-07-12 10:34:58 +02:00
Peter Ferrie
6797fe6268
smaller
2019-07-11 17:20:38 -07:00
Emmanuel Marty
7195104c73
Merge pull request #15 from MobyGamer/decompressor/8086_speed
...
Submit 8086-optimized decompressor using jump tables
2019-07-12 00:59:41 +02:00
mobygamer
f123a6d9df
Submit 8086-optimized decompressor using jump tables
...
Because the 8086's BIU is more efficient and can read a word in
a single operation, we can make LZSA1 decompression faster on the 8086
by using a jump table to eliminate 2 comparisons.
2019-07-11 17:22:29 -05:00
Emmanuel Marty
4d7d90b893
Merge pull request #14 from MobyGamer/decompressor/8086_speed
...
Additional minor speedups
2019-07-11 09:01:14 +02:00
mobygamer
638c33b432
Additional minor speedups
2019-07-11 00:58:46 -05:00
Peter Ferrie
68339ad961
smaller
2019-07-09 18:44:37 -07:00
Peter Ferrie
15c9059adf
smaller
2019-07-09 18:30:29 -07:00
Peter Ferrie
e85cb8a5bb
smaller
2019-07-09 16:49:01 -07:00
Emmanuel Marty
467cd1970f
Merge pull request #9 from MobyGamer/decompressor/8086_speed
...
Additional 1% speedup from deferring work
2019-07-09 20:42:17 +02:00
mobygamer
9a180a59f9
Additional 1% speedup due to introspec suggestions
2019-07-09 13:05:10 -05:00
Emmanuel Marty
b7b3ca1907
Merge pull request #8 from MobyGamer/decompressor/8086_speed - thank you!
...
8088-speed-optimized LZSA1 decompression
2019-07-08 09:17:51 +02:00
mobygamer
b0c9d61aab
8088-speed-optimized LZSA1 decompression
...
This commit contributes LZSA1 raw block format decompression code,
optimized for speed for the 8088 and higher processors. With
appropriate compression options (-f1 -m5), decompression speed is
comparable to LZ4.
2019-07-07 20:25:50 -05:00
Peter Ferrie
111403c052
a little more
2019-07-07 12:05:02 -07:00
Emmanuel Marty
fe0dcf107f
Use db rather than .byte
2019-07-04 18:09:29 +02:00
Peter Ferrie
32b98f3085
a bit faster, a bit smaller
2019-07-03 19:38:27 -07:00
Emmanuel Marty
8d8f50a509
Don't use 80186 opcodes. Issue #3
2019-06-27 18:32:49 +02:00
Emmanuel Marty
79ed7bf91e
Further update LZSA2 format; avoid name conflicts
2019-06-08 13:35:03 +02:00
Emmanuel Marty
272f2e7a29
Update LZSA2 6502 and 8088 depackers
2019-06-07 23:22:34 +02:00
emmanuel-marty
d70830b525
Simplify nibble handling in LZSA2 8088 depacker
2019-05-11 11:42:00 +02:00
emmanuel-marty
8b7b4a2b4f
Check in LZSA2 implementation (ratio competitive with ZX7, faster decompression)
2019-05-09 16:51:29 +02:00
emmanuel-marty
2b9780bd65
Finalize lzsa1 compressed format, speed up and simplify decompression
2019-04-24 09:47:40 +02:00
emmanuel-marty
02592cfe3b
Fix typo in 8088 decompressor comments
2019-04-10 17:30:13 +02:00
emmanuel-marty
e24320b23b
Save 1 byte in 8088 decompressor
2019-04-06 00:21:15 +02:00
emmanuel-marty
a785010448
Revert token to O|LLL|MMMM; revert to always shifting the match offset by 1; set raw block end marker as a large zero-size match
2019-04-05 23:16:05 +02:00
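A small sketch of how a token byte laid out as O|LLL|MMMM could be split into its fields. Only the bit widths (1, 3 and 4 bits, most significant first) are read from the layout string in the commit above. The function names are hypothetical, and nothing here is taken from the actual depacker sources.

```python
def unpack_token(token):
    """Split one token byte into (o, lll, mmmm) by bit position only."""
    if not 0 <= token <= 0xFF:
        raise ValueError("token must fit in one byte")
    o = (token >> 7) & 0x1     # top bit
    lll = (token >> 4) & 0x7   # next three bits
    mmmm = token & 0xF         # low nibble
    return o, lll, mmmm


def pack_token(o, lll, mmmm):
    """Inverse of unpack_token, useful for checking the layout."""
    if not (0 <= o <= 1 and 0 <= lll <= 7 and 0 <= mmmm <= 15):
        raise ValueError("field out of range")
    return (o << 7) | (lll << 4) | mmmm
```

For example, unpack_token(0xA3) splits the bits 1|010|0011 into (1, 2, 3), and pack_token(1, 2, 3) gives back 0xA3.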
emmanuel-marty
1ef1ad8111
Reorganize token byte for faster decoding on 8-bit CPUs, without affecting the compression ratio
2019-04-05 11:58:44 +02:00
emmanuel-marty
c7692cf688
Store 16-bit lengths and match offsets directly, to simplify decompression on 8-bit CPUs without affecting the compression ratio
2019-04-05 10:42:06 +02:00
emmanuel-marty
0744ec99de
Unpack raw blocks in 8088 decompressor
2019-04-03 13:05:32 +02:00
emmanuel-marty
06396f5ba6
Save 2 bytes in 8088 decompressor
2019-04-02 13:21:45 +02:00
marty-emmanuel
e216b0c544
Initial checkin
2019-04-01 18:04:56 +02:00