I noticed that after I implemented the SPI optimization of cycle
counting instead of polling on SPIF, the first "normal" SPI transaction
I tried would fail. This is because nothing was clearing the SPIF flag
anymore, and the normal SPI driver still looks at it. So it was thinking
that the latest transaction was already completed (it wasn't). Worked
around this by making sure we clear the flag in SPI_Assert. I'm not
concerned about performance impact here because the actual clean SPI
driver is not used in performance-bound situations.
Fixed an issue that identified the wrong pins as shorted to ground in
the electrical test functionality. Whoops!
If multiple read or write cycles are done in sequence, we'll no longer
needlessly update the data direction registers (which is a slow SPI
transaction). We can also skip updating the pullups on the AVR if
multiple read cycles occur in sequence.
This makes the code pretty easily portable to other architectures if someone
wants to make a more modern SIMM programmer. I also was pretty careful to split
responsibilities of the different components and give the existing components
better names. I'm pretty happy with the organization of the code now.
As part of this change I have also heavily optimized the code. In particular,
the read and write cycle routines are very important to the overall performance
of the programmer. In these routines I had to make some tradeoffs of code
performance versus prettiness, but the overall result is much faster
programming.
Some of these performance changes are the result of what I discovered when
I upgraded my AVR compiler. I discovered that it is smarter at looking at 32-bit
variables when I use a union instead of bitwise operations.
I also shaved off more CPU cycles by carefully making a few small tweaks. I
added a bypass for the "program only some chips" mask, because it was adding
unnecessary CPU cycles for a feature that is rarely used. I removed the
verification feature from the write routine, because we can always verify the
data after the write chunk is complete, which is more efficient. I also added
assumptions about the initial/final state of the CS/OE/WE pins, which allowed me
to remove more valuable CPU cycles from the read/write cycle routines.
There are also a few enormous performance optimizations I should have done a
long time ago:
1) The code was only handling one received byte per main loop iteration. Reading
every byte available cut nearly a minute off of the 8 MB programming time.
2) The code wasn't taking advantage of the faster programming command available
in the chips used on the 8 MB SIMM.
The end result of all of these optimizations is I have programming time of the
8 MB SIMM down to 3:31 (it used to be 8:43).
Another minor issue I fixed: the Micron SIMM chip identification wasn't working
properly. It was outputting the manufacturer ID again instead of the device ID.