116 Commits

Author SHA1 Message Date
kris
cc6c92335d Implement a much more efficient mechanism for mapping an array between
(x, y) indexing and (page, offset) indexing.  This uses numpy to
construct a new array by indexing into the old one.

In benchmarking this is something like 100x faster.
2019-02-23 23:44:29 +00:00
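A minimal sketch of the index-array idea described in the commit above. It assumes the Apple II hi-res memory interleave and a 192x40 array of screen bytes. Every name in it is hypothetical and not taken from the repository.

```python
import numpy as np

ROWS, COLS = 192, 40  # assumed: hi-res screen rows and bytes per row


def build_index():
    """Return (page, offset) index arrays, one entry per (y, x) screen byte."""
    ys, xs = np.mgrid[0:ROWS, 0:COLS]
    # Assumed Apple II hi-res interleave, relative to the start of screen memory.
    addr = 0x400 * (ys % 8) + 0x80 * ((ys // 8) % 8) + 0x28 * (ys // 64) + xs
    return addr >> 8, addr & 0xFF


# Built once and reused for every frame, so each conversion is one indexing op.
PAGES, OFFSETS = build_index()


def xy_to_memory(xy):
    """Scatter a (192, 40) array into a (32, 256) page/offset array."""
    mem = np.zeros((32, 256), dtype=xy.dtype)
    mem[PAGES, OFFSETS] = xy
    return mem


def memory_to_xy(mem):
    """Gather a (32, 256) page/offset array back into (192, 40) order."""
    return mem[PAGES, OFFSETS]
```

Offsets that no (x, y) position maps to stay zero in the page/offset array. Under the assumed layout these are the screen holes.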
kris
4178c191db Update cycle timing from working ethernet player.
Add _START and _END addresses that are used by the byte stream to
vector the program counter to the next opcode in the stream.

Support equality testing of opcodes and add tests.

Add an ACK opcode for instructing the client to ACK the TCP stream.

Tick opcode now accepts a cycle argument, for experimenting with
audio support.
2019-02-23 23:38:14 +00:00
kris
e0ab30d074 Fix deprecation warning on newer numpy
Similarity metric should be a float
2019-02-23 23:33:18 +00:00
kris
e4174ed10b Extract out input video decoding into separate module.
Prototype a threaded version of the decoder, but this doesn't seem to be
necessary as decoding is not the bottleneck.

Opcode stream is now aware of frame cycle budget and will keep emitting
until budget runs out -- so no need for fullness estimate.
2019-02-23 23:32:07 +00:00
kris
dc671986a3 Send contents of out.bin file 2019-02-23 23:28:33 +00:00
kris
7deed24ac4 Rename opcode 2019-01-05 23:51:21 +00:00
kris
36fc34d26d Refactor the world 2019-01-05 23:31:56 +00:00
kris
84611ad5e3 WIP: modified version of echo server that reads in page-sized chunks 2019-01-03 23:25:58 +00:00
kris
c797852324 - Stop masking out unchanged bytes explicitly and compare the full
source vs target frame.  This allows us to accumulate runs across
unchanged bytes, if they happen to be the same content value.

- introduce an allowable bit error when building runs, i.e. trade
  some slight imprecision for much more efficient decoding.  This gives
  a slight (~2%) reduction in similarity on my test frames at 140 pixels
  but improves the 280 pixel similarity significantly (~7%)

- so make 280 pixels the default for now

- once the run is complete, compute the median value of each bit in
  the run and use that as the content byte.  I also tried the mean,
  which gave exactly the same output

- runs will sometimes now span the (0x7x) screen holes so for now just
  ignore invalid addresses in _write
2019-01-03 17:38:47 +00:00
kris
1c13352106 Implement RLE support, which is more efficient than byte-wise stores
for runs of N >= 4.

Also fix a bug in the decoder that was apparently allowing opcodes to
fall through.  Replace BVC with BRA (i.e. assume a 65C02) until I can
work out what is going on.
2019-01-03 14:51:57 +00:00
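The run-length threshold in the commit above can be sketched as follows. This is a hedged illustration only: the op tuples ("rle", "store") and the function name are made up for the example and do not reflect the actual opcode format. Only the N >= 4 cutoff comes from the commit message.

```python
from itertools import groupby

MIN_RUN = 4  # from the commit message: RLE pays off for runs of N >= 4


def encode(data, start=0):
    """Turn a byte sequence into hypothetical ("rle", ...) / ("store", ...) ops."""
    ops = []
    addr = start
    for value, group in groupby(data):
        n = len(list(group))
        if n >= MIN_RUN:
            # One run op covers n consecutive addresses holding the same value.
            ops.append(("rle", addr, value, n))
        else:
            # Short runs are cheaper as individual byte stores.
            ops.extend(("store", addr + i, value) for i in range(n))
        addr += n
    return ops
```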
kris
ab4b4f22fd Refactor opcode schedulers and implement one based on the ortools TSP
solver to minimize the cycle cost to visit all changes in our estimated
list.

This is fortunately a tractable (though slow) computation that does give
improvements on the previous heuristic at the level of ~6% better
throughput.

The resulting opcode schedule prefers to group by page and vary over
content, so implement a fast heuristic that does that.  This heuristic
scheduler is within 2% of the TSP solution.
2019-01-02 23:10:03 +00:00
kris
a8688a6a7e Memoize hamming_weight to optimize runtime 2019-01-02 22:25:16 +00:00
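One common way to memoize such a function in Python is functools.lru_cache. The sketch below assumes hamming_weight takes a non-negative integer and counts its set bits; the repository's actual signature may differ.

```python
import functools


@functools.lru_cache(maxsize=None)
def hamming_weight(n):
    """Number of set bits in a non-negative integer, cached per argument."""
    return bin(n).count("1")
```

For byte-sized inputs the cache holds at most 256 entries, so an unbounded cache is harmless here.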
kris
6de5f1797d Reimplement the opcode scheduler so that it is ~as fast as before. As a
bonus we now maintain much better tracking of our target frame rate.

Maintain a running estimate of the opcode scheduling overhead, i.e.
how many opcodes we end up scheduling for each content byte written.

Use this to select an estimated number of screen changes to fill the
cycle budget, ordered by hamming weight of the delta.  Group these
by content byte and then page as before.
2019-01-02 22:16:54 +00:00
kris
8e3f8c9f6d Implement a measure of similarity for two bool arrays and use it to measure
how close we are getting to encoding the target image.
2019-01-02 00:24:25 +00:00
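The commit above does not say how similarity is defined. One plausible definition, shown here purely as an assumption, is the fraction of positions at which the two bool arrays agree, returned as a Python float.

```python
import numpy as np


def similarity(a, b):
    """Fraction of matching elements between two equal-shaped bool arrays."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    # a == b is an elementwise bool array; its mean is the match fraction.
    return float(np.mean(a == b))
```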
kris
7c5e64fb6f Optimize for cycles/pixel by weighting each output byte by the hamming
weight of the xor of old and new frames, and switch to setting the
new byte directly instead of xor'ing, to improve efficiency of decoder.

Instead of iterating in a fixed order by target byte then page, at
each step compute the next change to make that would minimize
cycles/pixel, including switching page and/or content byte.

This is unfortunately much slower to encode currently but can hopefully
be optimized sufficiently.
2019-01-02 00:03:21 +00:00
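The weighting described in the commit above can be illustrated with a small numpy sketch. It assumes frames are uint8 arrays of screen bytes and only covers the weighting step, not the per-step cycle cost model. The names are hypothetical.

```python
import numpy as np

# Lookup table: number of set bits for every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def change_weights(old, new):
    """Per-byte count of pixels that differ between two uint8 frames."""
    return POPCOUNT[np.bitwise_xor(old, new)]


def changes_by_priority(old, new):
    """Flat indices of changed bytes, largest hamming weight first."""
    weights = change_weights(old, new).ravel().astype(int)
    changed = np.flatnonzero(weights)
    # Stable sort keeps address order among bytes with equal weight.
    order = np.argsort(-weights[changed], kind="stable")
    return changed[order]
```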
kris
0b78e2323a Initial working version of image encoder. This tries to optimize the
bytestream by prioritizing bytes to be XOR'ed that have the highest
hamming weight, i.e. will result in the largest number of pixel
transitions on the screen.

Not especially optimized yet (either runtime, or byte stream)
2019-01-01 21:50:01 +00:00