84 Commits

Author SHA1 Message Date
mobygamer
30192238ea Rewrite 8088 jumptable decompressor for maximum speed
This is a rewrite of LZSA1JMP.ASM to use a 256-element jumptable, which
allows the code to handle all of the hot paths (common cases) without
any branching.  This not only reduces branches (which are very costly on
x86) to a bare minimum, but also grants us foreknowledge in a decode
path of what steps can be skipped.

The new code is 12.7% faster than the old code, and assembles to less
than 3K of object code and data.
2019-10-26 23:34:24 -05:00
introspec
d5d788946e
Added an option for unrolling long match copying
Usually useless and costing +57 bytes, this option can bring dramatic performance improvements on very compressible data dominated by long matches
2019-10-22 20:11:46 +01:00
introspec
495a12216f
-1 byte
Very slightly faster too
2019-10-11 00:23:43 +01:00
introspec
566e3a94e8
+0.2% speed
also, added an option to unroll LDIR for longer matches (which adds 38 bytes, but can be significantly faster for files with many long matches)
2019-10-10 22:50:23 +01:00
Emmanuel Marty
6a62f7d795
Update Z80 depackers changes history 2019-09-26 11:42:52 +02:00
Emmanuel Marty
681f78d1e8
Rename 2019-09-26 07:48:59 +02:00
Emmanuel Marty
8015ab8650
Rename 2019-09-26 07:48:44 +02:00
Emmanuel Marty
2f15298343
Rename 2019-09-26 07:48:33 +02:00
Emmanuel Marty
648a308d87
Rename 2019-09-26 07:48:19 +02:00
Emmanuel Marty
587a92f4ab
Rename Z80 depackers, add version history to LZSA1 2019-09-26 07:47:43 +02:00
Emmanuel Marty
7d9135c548
Update Z80 decompressors 2019-09-25 08:09:18 +02:00
Emmanuel Marty
9de7e930e9
Faster LZSA1 z80 decompression 2019-08-27 13:16:20 +02:00
uniabis
a807344343 2 bytes shorter 2019-08-22 12:55:55 +09:00
introspec
be30cae636
-1 byte
slightly slower, but this is the size-optimized branch
2019-08-06 12:36:27 +01:00
introspec
44bff39de3
New faster and shorter decompressors
This update is mostly about better integration of improvements by uniabis, with spke contributing several smaller size optimizations.
2019-08-01 15:07:14 +01:00
introspec
e7bb1faece
Merge branch 'master' into master 2019-07-31 23:24:30 +01:00
introspec
51ef92cdab
incorporated improvements by uniabis
also, slightly faster decompression for fast packer in backwards mode
2019-07-31 20:42:47 +01:00
uniabis
8d0528fddc hd64180 support
a bit faster, a bit smaller
2019-07-31 01:39:27 +09:00
Emmanuel Marty
0a04796b19
Fix for z80 LZSA2 fast backward depacker 2019-07-27 15:39:44 +02:00
introspec
ac3bf78273
fix a bug in the backward version of unlzsa2_fast_v1.asm
an INC HL slipped through
2019-07-27 14:14:54 +01:00
Emmanuel Marty
82edcb8bb5
Fix literal runs that are multiple of 256 bytes 2019-07-27 01:35:46 +02:00
Emmanuel Marty
ae4cc12aed
Use ACME syntax 2019-07-26 12:31:26 +02:00
Emmanuel Marty
fd70be918c
Merge pull request #19 from specke/master
Support for -b in Z80 decompressors
2019-07-24 20:09:48 +02:00
Emmanuel Marty
4835e4c26c
Support backward decompression 2019-07-24 20:08:23 +02:00
introspec
cca79e3e59
Delete unlzsa_small_v1.asm 2019-07-24 17:31:22 +01:00
introspec
607b26d337
Delete unlzsa_fast_v1.asm 2019-07-24 17:31:14 +01:00
introspec
fd61f403ad
LZSA1 decompressors with added support for -b. 2019-07-24 17:30:37 +01:00
introspec
fcfba056d2
Add files via upload
LZSA2 decompressors with support for -b option.
2019-07-24 17:28:39 +01:00
Emmanuel Marty
081a29a3db
Fix copying multiples of 256 bytes 2019-07-14 16:14:55 +02:00
Emmanuel Marty
710d7e05d6
Fix comments 2019-07-14 10:13:16 +02:00
Emmanuel Marty
0b540431fc
Fix comments 2019-07-14 10:12:32 +02:00
Emmanuel Marty
981b1d5925
NASM versions of Jim Leonard's speed-optimized depackers 2019-07-14 10:11:16 +02:00
Emmanuel Marty
19e8bc0468
Merge pull request #17 from MobyGamer/decompressor/8086_speed
Time-efficient LZSA2 decompressor
2019-07-14 09:28:40 +02:00
mobygamer
c38b582e73 Time-efficient LZSA2 decompressor
This commit provides a time-efficient LZSA2 decompressor for
the 8088 (and higher) CPU.  Decompression speed is roughly 50% faster
than ZX7 on the same hardware.
2019-07-13 22:35:28 -05:00
Emmanuel Marty
6445d9ff6f
Merge pull request #16 from peterferrie/master - thanks!
smaller
2019-07-12 10:34:58 +02:00
Peter Ferrie
6797fe6268 smaller 2019-07-11 17:20:38 -07:00
Emmanuel Marty
7195104c73
Merge pull request #15 from MobyGamer/decompressor/8086_speed
Submit 8086-optimized decompressor using jump tables
2019-07-12 00:59:41 +02:00
mobygamer
f123a6d9df Submit 8086-optimized decompressor using jump tables
Because the 8086's BIU is more efficient and can read a word in
a single operation, we can skew LZSA1 decompression faster on 8086
by using a jump table to eliminate 2 comparisons.
2019-07-11 17:22:29 -05:00
Emmanuel Marty
4d7d90b893
Merge pull request #14 from MobyGamer/decompressor/8086_speed
Additional minor speedups
2019-07-11 09:01:14 +02:00
mobygamer
638c33b432 Additional minor speedups 2019-07-11 00:58:46 -05:00
Peter Ferrie
3bfe60de44 use BYTE 2019-07-10 18:03:42 -07:00
Peter Ferrie
fb53706361 smaller 2019-07-10 18:02:27 -07:00
Emmanuel Marty
d3d62c3bf0
Use BYTE 2019-07-11 01:52:13 +02:00
Peter Ferrie
7861ef1552 smaller 2019-07-10 16:04:42 -07:00
Peter Ferrie
8e26fa9cac smaller 2019-07-10 10:57:41 -07:00
Peter Ferrie
a1841e5e5d smaller 2019-07-10 10:48:15 -07:00
Peter Ferrie
68339ad961 smaller 2019-07-09 18:44:37 -07:00
Peter Ferrie
15c9059adf smaller 2019-07-09 18:30:29 -07:00
Peter Ferrie
e85cb8a5bb smaller 2019-07-09 16:49:01 -07:00
Emmanuel Marty
467cd1970f
Merge pull request #9 from MobyGamer/decompressor/8086_speed
Additional 1% speedup from deferring work
2019-07-09 20:42:17 +02:00