Commit Graph

12 Commits

Author SHA1 Message Date
dgelessus
6adf8eb88d Fix mypy errors about byte strings as format string parameters 2019-12-30 01:48:33 +01:00
dgelessus
4e1cd05412 Fix miscellaneous mypy errors 2019-12-30 01:48:33 +01:00
dgelessus
f3b3de496e Change naming of compression types
The old names ("system" and "application" compression) were not really
accurate in all cases, so the compression types are now referred to by
their number.
2019-10-07 10:08:32 +02:00
dgelessus
8db1b22bdc Make the generic decompression API stream-based
The non-stream-based APIs still exist as before and are not deprecated,
they just act as thin wrappers around the stream-based API.

The main rsrcfork module doesn't use the stream-based APIs yet, because
it reads each resource's data all at once and not incrementally.
2019-10-02 16:28:40 +02:00
dgelessus
6559cbc337 Refactor .dcmp2 to be stream-based
This is a little more complex than with the other decompressors,
because .dcmp2 has to behave differently when at the byte before EOF.
Checking whether this is the case requires lookahead, which is not easy
to do with a plain IO stream.

Some buffered IO streams provide a peek method for lookahead, but
others don't (such as io.BytesIO). There is no standard way to wrap an
already buffered IO stream to add a peek method, so we need a custom
wrapper class and helper function for this purpose.
2019-10-02 10:26:03 +02:00
dgelessus
1e79dc3c50 Refactor .dcmp0 and .dcmp1 to be stream-based
The decompression code is more readable this way, because the
compressed data needs to be processed sequentially. It also allows
moving the length check and some debug logging into an outer generator.

This also allows incremental decompression, but this doesn't have any
practical advantage, because the compressed resource data is all read
at once (there is no API for opening resources as streams), and
resources are not very large anyway.
2019-10-01 21:26:41 +02:00
dgelessus
3a72bd3406 Remove leading underscores where they don't make much sense
The leading underscore is meant to distinguish private (for internal
use only) APIs from public (for external use) APIs. One can argue about
where the line between public and private should be, but if something
is used from other modules (as with read_variable_length_integer) it's
not really private IMHO.

In scripts (like __main__) it also doesn't make much sense to use
leading underscores, because the entire file is never meant to be used
by external code.
2019-10-01 21:26:41 +02:00
dgelessus
e0f73d3220 Fix more issues reported by mypy 2019-09-29 16:28:07 +02:00
dgelessus
97c459bca7 Change attribute type annotations to standard format
Previously, the types of instance attributes were annotated with the
first assignment of each attribute. The standard way to annotate
instance attributes is to do so at class level without assigning any
value.
2019-09-29 14:58:18 +02:00
dgelessus
0c942e26ec Fix hex number formatting in compressed header info reprs 2019-09-23 23:52:06 +02:00
dgelessus
9dbdf5b827 Move compressed header info constants/classes to .compress.common
This allows the constants/classes to be accessed from the individual
decompressor submodules.
2019-09-23 23:14:06 +02:00
dgelessus
8904f6e093 Refactor the rsrcfork.compress module into separate submodules
Each decompressor now has its own module, since they share almost no
code between each other.
2019-08-22 21:20:35 +02:00