Finally! At the moment the Disk class doesn't serve as much more than a
slightly improved Globals class. It holds every splitting of the source path
and filename that we use in legacy code, as well as a copy of the disk image
itself, which is kept just long enough to read the a2mg header.
The idea I have here is to begin building the module-based code in parallel.
Then I'll just modify the linear code to compare doing it the old way to doing
it the new. That'll let me verify that the new code does what the old should.
When it's all done, we can just modify main to use the new modular code and
look at splitting the modular code into a package with cppo as a runner. At
that point the code should begin being able to do things cppo cannot. We could
continue to extend cppo at that point, but my inclination is to maintain the
cppo runner as a compatibility layer and begin building a more modern image
tool. Essentially to begin building the CiderPress for Linux or the Java-free
AppleCommander.
The methods int.to_bytes() and int.from_bytes() are ... both messy and ugly,
contrary to the Zen of Python. Right now, we're just writing out single ints
and words, but eventually we'll be reading and writing whole structures, so it
makes sense to begin moving the codebase in that direction.
I've written only the functions I either use or see a use for immediately, save
for pack_u32be(), which I wrote for date conversions and then immediately
realized we don't want to use for that purpose yet. We can remove it if we
don't ultimately need it.
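
For illustration, here is a rough sketch of what small pack/unpack helpers along these lines might look like. It assumes they are thin wrappers over the standard struct module, which the text above does not confirm. Only the name pack_u32be comes from this commit; the signatures and unpack_u16le are hypothetical.

```python
import struct

def pack_u32be(buf, offset, value):
    """Write value into buf at offset as an unsigned 32-bit big-endian int.

    The (buf, offset, value) signature is an assumption for this sketch.
    """
    struct.pack_into('>L', buf, offset, value)

def unpack_u16le(buf, offset=0):
    """Read an unsigned 16-bit little-endian word from buf (hypothetical helper)."""
    return struct.unpack_from('<H', buf, offset)[0]

# The int methods do the same job for single values, just more verbosely:
word = int.from_bytes(b'\x34\x12', 'little')
raw = (0x1234).to_bytes(2, 'little')
```

The appeal of struct is that the same format strings extend naturally to whole structures later, where the int methods only ever handle one value at a time.
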
Replaced some of the commented out print lines with calls to a logger object.
These are currently always turned on, which we don't want, but they don't hurt
anything and are short.
Oh, and new logging system! The setup is a bit more verbose than it could be
because logging predates str.format and uses the str % tuple syntax that now
exists mostly to avoid breaking older code. You can override this, and I did.
It's not done yet because we might want to actually make some of the existing
print calls into log.info's. Y'know, just as soon as I set that up so ONLY
logging.INFO goes to stdout, unadorned, and everything higher than that goes to
stderr, depending on your logging level, with pretty formatting.
Yeah, logging can do all of that and chew bubblegum at the same time, I just
haven't set it up yet because I want to do it right.
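
A minimal sketch of the split described above, not the setup actually committed: INFO records go to stdout unadorned, WARNING and up go to stderr with a level prefix. The build_logger and OnlyInfo names are made up for this example, and the stream parameters exist only so it can be exercised without touching the real stdout/stderr.

```python
import logging
import sys

class OnlyInfo(logging.Filter):
    """Pass INFO records and nothing else (hypothetical helper)."""
    def filter(self, record):
        return record.levelno == logging.INFO

def build_logger(name, out=None, err=None):
    """Return a logger with a plain INFO handler and a formatted WARNING+ handler."""
    log = logging.getLogger(name)
    log.setLevel(logging.DEBUG)
    log.propagate = False

    # INFO only, message text with no decoration
    plain = logging.StreamHandler(sys.stdout if out is None else out)
    plain.addFilter(OnlyInfo())
    plain.setFormatter(logging.Formatter('{message}', style='{'))

    # WARNING and above, with the level name in front
    pretty = logging.StreamHandler(sys.stderr if err is None else err)
    pretty.setLevel(logging.WARNING)
    pretty.setFormatter(logging.Formatter('{levelname}: {message}', style='{'))

    log.addHandler(plain)
    log.addHandler(pretty)
    return log
```

Note that style='{' only changes the formatter's own format string. Arguments passed to log.info() and friends are still merged with % unless something else overrides that.
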
A little more unStudlyCapping of things. I'm going to have to start actually
creating classes soon which is going to bring back the capitals, but I've been
working to get rid of them so that it becomes less confusing when we get there.
I dunno if it's helped any.
I also added a few comments about our imports and checked that we actually used
everything we imported. No, we don't. But we maybe should switch what we are
using for what we aren't at some point?
Function now takes raw bytes containing two little-endian 16-bit words right
out of a disk image. It extracts timestamp components using bit shifting and
bitwise operators and asks datetime.datetime to give us a timestamp from the
result. Caller need not catch exceptions for this process anymore. Either it
works or you get None back.
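
Roughly, the approach looks like the sketch below. It assumes the standard ProDOS layout (date word: year in bits 15-9, month in bits 8-5, day in bits 4-0; time word: hour in bits 12-8, minute in bits 5-0) and the common convention that years 40-99 mean 19xx and 0-39 mean 20xx. The function name date_from_prodos is an assumption for this example.

```python
from datetime import datetime

def date_from_prodos(raw):
    """Turn four raw bytes (date word, then time word) into a datetime, or None."""
    if len(raw) != 4:
        return None
    date_word = int.from_bytes(raw[0:2], 'little')
    time_word = int.from_bytes(raw[2:4], 'little')

    year = date_word >> 9
    month = (date_word >> 5) & 0x0F
    day = date_word & 0x1F
    hour = (time_word >> 8) & 0x1F
    minute = time_word & 0x3F

    # Assumed century rule: 40-99 -> 19xx, everything else -> 20xx
    year += 1900 if year >= 40 else 2000

    try:
        return datetime(year, month, day, hour, minute)
    except ValueError:
        # e.g. an all-zero field: month 0 is not a valid date
        return None
```

Because the ValueError is swallowed here, the caller only has to check for None.
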
The full summary:
**ADDED**
- New dopo_swap swaps 140k images between DOS order and ProDOS order. This
function replaces code in the main program body which does the same thing.
- Add optional zfill parameter to to_bin. Considered this for to_hex, but
nowhere would that be used currently and I'd rather get rid of these lower
level support functions that mainly are there for Bashbyter (though they
see more direct use for now, since Bashbyter is gone.)
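
For reference, a sketch of how a DOS-order/ProDOS-order swap on a 140k image can work. It assumes the usual 35 tracks of 16 256-byte sectors and the usual mapping in which sectors 0 and 15 stay put and every other sector s trades places with 15-s. This is an illustration, not necessarily the committed body of dopo_swap.

```python
SECTOR = 256
SECTORS_PER_TRACK = 16
TRACKS = 35

def dopo_swap(image):
    """Return image with each track's sectors reordered DOS <-> ProDOS."""
    if len(image) != TRACKS * SECTORS_PER_TRACK * SECTOR:
        raise ValueError('expected a 143360-byte (140k) image')
    out = bytearray(len(image))
    for track in range(TRACKS):
        base = track * SECTORS_PER_TRACK * SECTOR
        for sector in range(SECTORS_PER_TRACK):
            # 0 and 15 map to themselves, everything else mirrors
            other = sector if sector in (0, 15) else 15 - sector
            src = base + sector * SECTOR
            dst = base + other * SECTOR
            out[dst:dst + SECTOR] = image[src:src + SECTOR]
    return bytes(out)
```

The mapping is its own inverse, so the same function converts in either direction.
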
**CHANGED**
- In getFileLength for DOS 3.3 T/other files, initialize prevTSpair to [0,0]
which, when combined with the removal of Bashbyter and vestiges of Python2
support, makes `The Correspondent 4.4.dsk` both -cat and extract. (Noted
previously, we currently do nothing about the control characters in the
filenames on this disk. They extract as non-printables and they don't show
up properly in -cat.)
- Replaced 4294967296 (2**32) with 1<<32 for use in turning negative integers
into unsigned 32 bit integers. It's possible to call int.to_bytes() in a
way that does this the way we want, and we ought to do that. The syntax is
longer than it needs to be though.
- Strip high bit of DOS 3.3 filenames more efficiently
- Replaced type("".encode().decode()) with str. That wasn't necessary, and
the fact that you might think otherwise is an example of why dropping
Python 2 is a very good idea.
- Use int.from_bytes() calls to replace reading several bytes by hand,
multiplying them by bit offsets, and adding them together.
- Made unixDateToADDate return four bytes instead of a hex-ustr because it
only had one caller which just converted the value to that format anyway.
- Misc slight changes like unStudlyCapping identifiers, saving a return value
rather than calling the function that creates it multiple times, tuple
assignment, and coding style
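
A few of the changes above, sketched in isolation. All function names here are made up for the example, and the bodies show one plausible way to do each thing rather than the exact code in the commit.

```python
def to_u32(n):
    """Fold a negative int onto its unsigned 32-bit equivalent."""
    return n + (1 << 32) if n < 0 else n

def to_u32_via_bytes(n):
    """Same result for signed 32-bit input, going through int.to_bytes()."""
    return int.from_bytes(n.to_bytes(4, 'big', signed=True), 'big')

def read_le_by_hand(data):
    """The old style: scale each byte by its place value and add."""
    return data[0] + data[1] * 256 + data[2] * 65536

def read_le(data):
    """The same three-byte little-endian read with int.from_bytes()."""
    return int.from_bytes(data[:3], 'little')

def strip_high_bit(raw):
    """Clear bit 7 of every byte, as for high-ASCII DOS 3.3 filenames."""
    return bytes(b & 0x7F for b in raw)
```

The to_bytes round trip is the "longer than it needs to be" syntax: it works, but n + (1 << 32) says the same thing in far fewer characters.
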
**REMOVED**
- slyce functions: {,bin_,dec_,hex_}slyce
- aToB conversions: binTo{Dec,Hex}, charTo{Dec,Hex}, decTo{Char,Hex},
hexTo{Bin,Char,Dec} (just use to_thing or do it better than that.)
- char I/O helpers: readchar{s,Dec,Hex}, writechar{s,sHex,Dec,Hex}
The old way involved a lot more sequence duplication. Now just turn the
bytes object into a mutable bytearray, iterate through the mask and
change what we need, then change it back.
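
The bytes to bytearray and back pattern looks something like this sketch. It assumes, purely for illustration, that the mask is a sequence of flags marking which characters should become lowercase; apply_case_mask is a hypothetical name.

```python
def apply_case_mask(name, mask):
    """Lowercase name[i] wherever mask[i] is true, returning new bytes."""
    buf = bytearray(name)              # mutable copy of the immutable bytes
    for i, lower in enumerate(mask[:len(buf)]):
        if lower:
            # bytes.lower() only touches ASCII letters
            buf[i] = bytes((buf[i],)).lower()[0]
    return bytes(buf)                  # back to immutable bytes
```

Only one copy is made on the way in and one on the way out, instead of building a new sequence for every changed character.
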
A few clumsy if statements got rewritten with truth tables to verify
that the somewhat simplified conditions still evaluated the same. Other
changes are mostly cosmetic.
Originally cppo was written as a shell script and was never intended to
be a library of functions to be used by anyone else. Single-file Python
modules are often written to be run as standalone programs either to do
what they do from the command line or for testing purposes.
Theoretically you do this if your code provides useful stuff for other
programs to use, and it's hard to argue that cppo does that yet, but it
is intended to do so in the future. Let's start working toward that.
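
The usual shape for a file that works both as a program and as an importable module is sketched below. The describe and main functions are placeholders for this example, not cppo's real interface.

```python
import sys

def describe(path):
    """Something other programs could import and call (placeholder)."""
    return 'would process ' + path

def main(argv):
    """Command-line entry point: handle each argument in turn."""
    for path in argv:
        print(describe(path))
    return 0

# Runs only when the file is executed directly, not when it is imported
if __name__ == '__main__':
    main(sys.argv[1:])
```

With the guard in place, importing the file defines the functions without running anything.
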
A few of my local copies of cppo have some/most of the code reformatted
in a more "pythonic" coding style. I still use hard tabs for
indentation because even us diehard console-using developers have
editors that can have whatever tabstop we want on a per-file basis, and
we have editorconfig. It's 2017, get with the times, even for a program
made for accessing files for a 1977 computer! ;) No functional changes
here, except that the if statement processing extensions now replaces
multiple conditionals with an `if x in tuple` construct.
Python's native sequence slicing syntax calls for start with optional
stop and step. This is sometimes exactly what you want, but especially
when parsing binary files, you're gonna want start/length instead. If
start is an expression, seq[start:start+length] gets messy.
In cppo, there's a function slyce that returns a sliced sequence using
start/length/step metrics, and this is used exclusively for slicing
sequences. Except sometimes you really want Python's start/stop...
I figure: Let's do it Python's way with the slicing syntax, but instead
of seq[start:start+length], you can use sli(): seq[sli(start,length)].
It's not currently used that way, but it now can be. :)
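
A minimal sketch of the idea, assuming sli() simply builds a slice object from an offset and a length. The default length and the optional step argument are guesses at a convenient signature.

```python
def sli(start, length=1, step=None):
    """Return a slice covering length items beginning at start."""
    return slice(start, start + length, step)

data = bytes(range(16))
three = data[sli(4, 3)]         # same as data[4:4+3]
evens = data[sli(0, 8, 2)]      # same as data[0:8:2]
```

Since sli() returns an ordinary slice object, it works with anything that supports slicing, and plain start:stop syntax is still there when that's what you actually want.
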
A holdover from DOS 8.3 filenames, files on Windows cannot end with a
dot. We append a - to such names on Windows platforms in all
operations, which should solve the problem, but we'd just duplicated
that code about a dozen times. No need, do it once and we can add
whatever filesystem rules for the host system we need to in one spot.
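
Something like the following sketch is what "do it once" amounts to. The name host_safe_name and the platform parameter are assumptions for this example; the parameter is there so the rule can be tried out on any host.

```python
import os

def host_safe_name(name, platform=None):
    """Adjust name so the host filesystem will accept it.

    platform defaults to os.name; 'nt' means Windows.
    """
    if platform is None:
        platform = os.name
    if platform == 'nt' and name.endswith('.'):
        # Windows cannot create a file whose name ends with a dot
        name += '-'
    return name
```

Any further host-specific rules would go in this one function rather than at every call site.
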
This one's missing a lot of the cleanups I've done to the others (it
isn't even Python 3), but it has the debug print statements and the
formatting is generally pretty good. I'll go through my local trees and
begin applying some fixes to this code in various repositories and we'll
see if we can't begin refactoring it completely.