Added hexdump function

What's wrong with b2a_hex() or hex()?  Well, hex() only converts integers.  And
while bytes.fromhex() ignores whitespace on input, b2a_hex() doesn't emit any,
which makes the output hard to read for anything longer than about 8 bytes.
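
To make the complaint concrete, here is a minimal sketch using only the
standard library.  The sample bytes and the variable names are made up for
illustration and are not part of the commit.

```python
from binascii import b2a_hex

# 16 arbitrary sample bytes, 'A' through 'P' (illustrative data only)
data = bytes(range(0x41, 0x51))

# hex() accepts an integer, not a bytes object, so this raises TypeError.
try:
    hex(data)
    hex_error = None
except TypeError as exc:
    hex_error = type(exc).__name__

# b2a_hex() returns one unbroken run of hex digits with no separators.
flat = b2a_hex(data).decode()

print(hex_error)
print(flat)
```

Even at 16 bytes the b2a_hex() result is a 32-character run of digits with
no visual grouping, which is the readability problem described above.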

In the basic case, it seems like you want a classic hexdump.  I chose the xxd
format:

xxxxxxxx: xxxx xxxx xxxx xxxx  xxxx xxxx xxxx xxxx |cccccccccccccccc|

Rather than hardcode all of the integers and strings (as I started doing), I
decided that I might as well use variables for these things if only for
readability.  And if they're locals, you might as well be able to override
them.

The knobs you have to play with are therefore these:

- wordsize=2, how many bytes are grouped together
- sep=' ', the spacing between words
- sep2='  ', the midpoint spacing

I suppose I could've made everything else configurable too, but YAGNI.
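
As a rough illustration of those knobs, the sketch below carries a condensed
copy of the function so that it runs on its own.  It assumes the sep2 default
of two spaces given in the list above; the sample data and the names sample
and high are made up for the example.

```python
from binascii import b2a_hex


def seqsplit(seq, num):
    """Yield successive slices of seq, each at most num items long."""
    for i in range(0, len(seq), num):
        yield seq[i:i + num]


def hexdump(buf, striphigh=False, wordsize=2, sep=' ', sep2='  '):
    """Condensed copy of the hexdump function added in this commit."""
    out = []
    # Width of the hex column, so short final lines still line up.
    hlen = 32 + len(sep2) + (16 // wordsize - 2) * len(sep)
    wordlen = wordsize * 2
    for i, vals in enumerate(seqsplit(buf, 16)):
        hexs = sep2.join(
            sep.join(seqsplit(b2a_hex(x).decode(), wordlen))
            for x in seqsplit(vals, 8)
        )
        if striphigh:
            vals = [x & 0x7f for x in vals]
        chars = ''.join(chr(x) if 0x20 <= x < 0x7f else '.' for x in vals)
        out.append('{i:07x}0: {hexs:{hlen}} |{chars}|'.format(
            i=i, hexs=hexs, hlen=hlen, chars=chars))
    return '\n'.join(out)


sample = b'Hello, Apple II!'            # 16 bytes of made-up sample data
high = bytes(x | 0x80 for x in sample)  # same text with the high bit set

print(hexdump(sample))                  # defaults: 2-byte words
print(hexdump(sample, wordsize=4))      # 4-byte words
print(hexdump(high))                    # high-bit text shows up as dots
print(hexdump(high, striphigh=True))    # high bit masked off before display
```

With the defaults, the first call prints:

00000000: 4865 6c6c 6f2c 2041  7070 6c65 2049 4921 |Hello, Apple II!|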
T. Joseph Carter 2017-07-02 15:45:28 -07:00
parent 490d7e8224
commit f3b5fe7dcd
1 changed files with 43 additions and 0 deletions

cppo

@@ -37,6 +37,7 @@ import subprocess
#import tempfile # not used, but should be for temp directory?
import logging
import struct
from typing import Sequence
from collections import namedtuple
from binascii import a2b_hex, b2a_hex
@@ -953,6 +954,48 @@ class Disk:
self.a2mg = unpack_2mg(self.image)
### UTIL
def seqsplit(seq: Sequence, num: int) -> Sequence:
    """split Sequence into smaller Sequences of size 'num'"""
    for i in range(0, len(seq), num):
        yield seq[i:i + num]
def hexdump(
        buf: bytes,
        striphigh: bool = False,
        wordsize: int = 2,
        sep: str = ' ',
        sep2: str = '  '
        ) -> str:
    """return a multi-line debugging hexdump of a bytes object

    Format is configurable but defaults to that of xxd:

    ########: #### #### #### ####  #### #### #### #### |................|

    wordsize is the number of bytes between separators
    sep is the separator between words
    sep2 is the midline separator
    striphigh considers 0xa0-0xfe to be printable ASCII (as on Apple II)
    """
    out = []
    # Width of the hex column: 32 hex digits, one midline separator, and a
    # word separator between the words of each 8-byte half (this assumes
    # wordsize divides 8).  Used to pad a short final line.
    hlen = 32 + len(sep2) + (16//wordsize-2) * len(sep)
    wordlen = wordsize * 2
    for i, vals in enumerate(seqsplit(buf, 16)):
        hexs = sep2.join([
            sep.join(seqsplit(b2a_hex(x).decode(), wordlen))
            for x in seqsplit(vals, 8)
            ])
        if striphigh:
            vals = [x & 0x7f for x in vals]
        chars = ''.join([
            chr(x) if x >= 0x20 and x < 0x7f else '.'
            for x in vals
            ])
        out.append('{i:07x}0: {hexs:{hlen}} |{chars}|'.format(**locals()))
    return '\n'.join(out)
### LOGGING
# *sigh* No clean/simple way to use str.format() type log strings without
# jumping through a few hoops