The code that found a nearby data target for an instruction operand
was searching backward but not forward. We now take one step
forward, so that "LDA TABLE-1,Y" fills in automatically.
This altered the 2008-address-changes test, which had exactly this situation.
It didn't alter 2010-target-adjustment, but the existing tests were
insufficient and have been improved.
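
As a rough sketch of the idea (hypothetical helper, not the actual SourceGen code), the lookup now tries the exact address, then one byte back, then one byte forward:

    // Hypothetical sketch: find a labeled data item within one byte of
    // the operand's target so "LDA TABLE-1,Y" can be filled in.
    using System.Collections.Generic;

    static class NearbyTarget {
        // Returns the label to use and sets the operand adjustment
        // (+1, -1, or 0); returns null if nothing suitable exists.
        public static string Find(IDictionary<int, string> labels, int target,
                out int adjustment) {
            adjustment = 0;
            if (labels.TryGetValue(target, out string label)) {
                return label;                   // exact match
            }
            if (labels.TryGetValue(target - 1, out label)) {
                adjustment = 1;                 // e.g. TABLE+1 (the old backward step)
                return label;
            }
            if (labels.TryGetValue(target + 1, out label)) {
                adjustment = -1;                // e.g. TABLE-1 (the new forward step)
                return label;
            }
            return null;
        }
    }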
If we have a bug, or somebody edits the project file manually, we
can end up with a very wrong string, such as a null-terminated
string that isn't, or a DCI string that has a mix of high and low
ASCII from start to finish. We now check all incoming strings for
validity, and discard any that fail the test. The verification
code is shared with the extension script inline data formatter.
Also, added a comment to an F8-ROM symbol I stumbled over.
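
For illustration, the sort of check involved might look like this (assumed logic; the real tests may differ in detail):

    // Sketch of string sanity checks.  A null-terminated string must end
    // with its only $00; a DCI string must keep one high-bit polarity and
    // flip it only on the final byte.
    static class StringValidator {
        public static bool IsValidNullTerm(byte[] data) {
            if (data.Length == 0 || data[data.Length - 1] != 0x00) {
                return false;
            }
            for (int i = 0; i < data.Length - 1; i++) {
                if (data[i] == 0x00) return false;      // embedded terminator
            }
            return true;
        }

        public static bool IsValidDci(byte[] data) {
            if (data.Length < 2) return false;
            int firstHi = data[0] & 0x80;
            for (int i = 1; i < data.Length - 1; i++) {
                if ((data[i] & 0x80) != firstHi) return false;  // mixed polarity
            }
            return (data[data.Length - 1] & 0x80) != firstHi;
        }
    }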
Implement multi-byte project/platform symbols by filling out a table
of addresses. Each symbol is "painted" into the table, replacing
an existing entry if the new entry has higher priority. This allows
us to handle overlapping entries, giving boosted priority to platform
symbols that are defined in .sym65 files loaded later.
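
Conceptually the painting step looks something like this (simplified sketch; the real types and priority rules are more involved):

    using System.Collections.Generic;

    // Paint each multi-byte symbol into a per-address table.  A
    // higher-priority symbol overwrites earlier entries where they overlap.
    static class SymbolPainter {
        public static void Paint(Dictionary<int, (string Label, int Priority)> table,
                string label, int addr, int width, int priority) {
            for (int i = 0; i < width; i++) {
                if (!table.TryGetValue(addr + i, out var cur) ||
                        priority > cur.Priority) {
                    table[addr + i] = (label, priority);
                }
            }
        }
    }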
The bounds on project/platform symbols are now rigidly defined. If
the "nearby" feature is enabled, references to SYM-1 will be picked
up, but we won't go hunting for SYM+1 unless the symbol is at least
two bytes wide.
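
The lookup side, again as a hypothetical sketch: an address inside the symbol's declared width resolves to SYM, SYM+1, and so on, and with "nearby" enabled the address one below resolves to SYM-1:

    // Resolve an external address against a symbol of known width,
    // honoring the bounds described above.
    static class ExtSymbolResolver {
        public static string Resolve(string label, int symAddr, int symWidth,
                int addr, bool allowNearby) {
            if (addr >= symAddr && addr < symAddr + symWidth) {
                int adj = addr - symAddr;
                return adj == 0 ? label : label + "+" + adj;   // SYM+1 needs width >= 2
            }
            if (allowNearby && addr == symAddr - 1) {
                return label + "-1";                           // the SYM-1 case
            }
            return null;
        }
    }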
The cost of adding a symbol to the symbol table is about the same,
but we don't have a quick way to remove a symbol.
Previously, if two platform symbols had the same value, the symbol
with the alphabetically lowest label would win. Now, the symbol
defined in the most-recently-loaded file wins. (If you define two
symbols with the same value in the same file, it's still resolved
alphabetically.) This allows the user to pick the winner by
arranging the load order of the platform symbol files.
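
Expressed as a comparison (field names are hypothetical):

    // Tie-break for platform symbols that share a value: the symbol from
    // the most-recently-loaded file wins; within one file the
    // alphabetically lowest label still wins.  "Smaller" sorts as winner.
    static class PlatformSymbolTieBreak {
        public static int Compare((string Label, int FileLoadOrder) a,
                (string Label, int FileLoadOrder) b) {
            if (a.FileLoadOrder != b.FileLoadOrder) {
                return b.FileLoadOrder - a.FileLoadOrder;       // later file first
            }
            return string.CompareOrdinal(a.Label, b.Label);     // alphabetical
        }
    }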
Platform symbols now keep a reference to the file ident of the
symbol file that defined them, so we can show the symbol's source
in the Info panel.
These changes altered the behavior of test 2008-address-changes,
which includes some tests on external addresses that are close to
labeled internal addresses. The previous behavior essentially
treated user labels as being 3 bytes wide and extending outside the
file bounds, which was mildly convenient on occasion but felt a
little skanky. (We could do with a way to define external symbols
relative to internal symbols, for things like the source address of
code that gets relocated.)
Also, re-enabled some unit tests.
Also, added a bit of identifying stuff to CrashLog.txt.
Rearrange the UI elements, and convert the code-behind to a more
XAML-style form. The basic stuff works, but the old "shortcut"
system is still in the process of being replaced.
The code that checked to see if a data target was inside a data
operand wasn't going all the way back to the start of the file.
It was also failing to stop when it should, wasting time.
The anattrib validation method has code that avoids a false-positive
on certain complex embedded instruction arrangements. This was also
preventing it from seeing a transition from a data area to the
middle of an instruction (caused by issue #45).
Fixed a minor bug in GenerateLineList that would cause a blank line
to disappear under certain circumstances. Harmless, but odd.
Added a width property to DefSymbol.
Updated comments.
I didn't think it made sense, but I found something that used it,
so apparently it's a thing. This updates the operand editor to
let you choose PETSCII+DCI, and updates the assemblers to handle
it correctly (really just 64tass, since the others either don't
have a DCI directive or don't deal with PETSCII at all).
Changed the char-encoding sample from "bad dcI" to "pet dcI", and
updated the documentation.
Both dialogs got a couple extra radio buttons for selection of
single character operands. The data operand editor got a combo box
that lets you specify how it scans for viable strings.
Various string scanning methods were made more generic. This got a
little strange with auto-detection of low/high ASCII, but that was
mostly a matter of keeping the previous code around as a special
case.
Made C64 Screen Code DCI strings a thing that works.
The previous functions just grabbed 62 characters and slapped quotes
on the ends, but that doesn't work if we want to show strings with
embedded control characters. This change replaces the simple
formatter with the one used to generate assembly source code. This
increases the cost of refreshing the display list, so a cache will
need to be added in a future change.
Converters for C64 PETSCII and C64 Screen Code have been defined.
The results of changing the auto-scan encoding can now be viewed.
The string operand formatter was using a single delimiter, but for
the on-screen version we want open-quote and close-quote, and might
want to identify some encodings with a prefix. The formatter now
takes a class that defines the various parts. (It might be worth
replacing the delimiter patterns recently added for single-character
operands with this, so we don't have two mechanisms for very nearly
the same thing.)
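
The class is roughly of this shape (illustrative class and field names):

    // Holds the pieces the string/character formatter glues around the
    // converted text: an optional identifying prefix plus open/close quotes.
    class StringDelimiters {
        public string Prefix { get; }       // e.g. an encoding tag
        public string OpenDelim { get; }    // e.g. a typographic open quote
        public string CloseDelim { get; }   // e.g. a typographic close quote

        public StringDelimiters(string prefix, string open, string close) {
            Prefix = prefix;
            OpenDelim = open;
            CloseDelim = close;
        }

        public string Wrap(string converted) {
            return Prefix + OpenDelim + converted + CloseDelim;
        }
    }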
While working on this change I remembered why there were two kinds
of "reverse" in the old Merlin 32 string operand generator: what you
want for assembly code is different from what you want on screen.
The ReverseMode enum has been resurrected.
The code that searches for character strings in uncategorized data
now recognizes the C64 encodings when selected in the project
properties.
The new code avoids some redundant comparisons when runs of
printable characters are found. I suspect the new implementation
loses on overall performance because we're now calling through
delegates instead of testing characters directly, but I haven't
tested for that.
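
A simplified picture of the delegate-based test:

    using System;

    // The uncategorized-data scanner asks an encoding-specific delegate
    // whether each byte is a printable character, instead of hard-coding
    // an ASCII range test.
    static class StringScanner {
        public static int CountPrintableRun(byte[] data, int start,
                Func<byte, bool> isPrintable) {
            int count = 0;
            while (start + count < data.Length && isPrintable(data[start + count])) {
                count++;
            }
            return count;
        }
    }

    // e.g. low ASCII: b => b >= 0x20 && b < 0x7f; a PETSCII or screen-code
    // test is substituted when that encoding is selected in the project.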
The previous code output a character in single-quotes if it was
standard ASCII, double-quotes if high ASCII, or hex if it was neither
of those. If a flag was set, high ASCII would also be output as
hex.
The new system takes the character value and an encoding identifier.
The identifier selects the character converter and delimiter
pattern, and puts the two together to generate the operand.
While doing this I realized that I could trivially support high
ASCII character arguments in all assemblers by setting the delimiter
pattern to "'#' | $80".
In FormatDescriptor, I had previously renamed the "Ascii" sub-type
"LowAscii" so it wouldn't be confused, but I dislike filling the
project file with "LowAscii" when "Ascii" is more accurate and less
confusing. So I switched it back, and we now check the project
file version number when deciding what to do with an ASCII item.
The CharEncoding tests/converters were also renamed.
Moved the default delimiter patterns to the string table.
Widened the delimiter pattern input fields slightly. Added a read-
only TextBox with assorted non-typewriter quotes and things so
people have something to copy text from.
We've been treating ASCII strings and instruction/data operands as
ambiguous, resolving low vs. high when generating output for the
display or assembler. This change splits it into two separate
formats, simplifying output generation.
The UI will continue to treat low/high ASCII as a single thing,
selecting the format appropriately based on the data. There's no
reason to have two radio buttons that are never both enabled.
The data operand string functions need some additional work, but
that overlaps substantially with the upcoming PETSCII changes, so
for now all strings set by the data operand editor are low ASCII.
The file format has changed again, but since there hasn't been a
release since the previous change, I'm leaving the file format
at v2. Code has been added to resolve the ASCII mode when loading
a v1 project file.
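
Conceptually the v1 resolution is just a data check (sketch; the actual heuristic may differ):

    // Resolve an ambiguous v1 "Ascii" item when loading an old project
    // file: inspect the bytes and decide between low and high ASCII.
    static class V1AsciiResolver {
        public static bool IsHighAscii(byte[] fileData, int offset, int length) {
            int hiCount = 0;
            for (int i = 0; i < length; i++) {
                if ((fileData[offset + i] & 0x80) != 0) hiCount++;
            }
            return hiCount * 2 >= length;       // simple majority vote
        }
    }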
This removes some complexity from the assembly code generators.
We used to use type="String", with the sub-type indicating whether
the string was null-terminated, prefixed with a length, or whatever.
This didn't leave much room for specifying a character encoding,
which is orthogonal to the sub-type.
What we actually want is to have the type specify the string type,
and then have the sub-type determine the character encoding. These
sub-types can also be used with the Numeric type to specify the
encoding of character operands.
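
Schematically (the values shown are illustrative, not the complete set):

    // The type describes the string layout; the sub-type describes the
    // character encoding.  The same sub-types apply to Numeric for
    // single-character operands.
    enum FormatType {
        Numeric,
        StringGeneric,      // plain run of characters
        StringNullTerm,     // terminated by $00
        StringL8,           // preceded by an 8-bit length
        StringDci           // high bit flipped on the last byte
    }

    enum FormatSubType {
        None,
        Ascii,
        HighAscii
        // later changes add C64 PETSCII and screen-code sub-types
    }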
This change updates the enum definitions and the various bits of
code that use them, but does not add any code for working with
non-ASCII character encodings.
The project file version number was incremented to 2, since the new
FormatDescriptor serialization is mildly incompatible with the old.
(Won't explode, but it'll post a complaint and ignore the stuff
it doesn't recognize.)
While I was at it, I finished removing DciReverse. It's still part
of the 2005-string-types regression test, which currently fails
because the generated source doesn't match.
SourceGen creates "auto" labels when it finds a reference to an
address that doesn't have a label associated with it. The label for
address $1234 would be "L1234". This change allows the project to
specify alternative label naming conventions, annotating them with
information from the cross-reference data. For example, a subroutine
entry point (i.e. the target of a JSR) would be "S_1234". (The
underscore was added to avoid confusion when an annotation letter
is the same as a hex digit.)
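
For illustration only (the letters other than "S" are assumptions), the naming might look like:

    // Build an annotated auto label from the target address and the way
    // the address is referenced.  The underscore keeps the annotation
    // letter from reading as a leading hex digit.
    enum RefKind { Generic, SubCall, Branch, DataAccess }

    static class AutoLabel {
        public static string Make(int addr, RefKind kind) {
            string hex = addr.ToString("X4");
            switch (kind) {
                case RefKind.SubCall:    return "S_" + hex;   // JSR target
                case RefKind.Branch:     return "B_" + hex;   // assumed letter
                case RefKind.DataAccess: return "D_" + hex;   // assumed letter
                default:                 return "L" + hex;    // classic style
            }
        }
    }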
Also, tweaked the way the preferred clipboard line format is stored
in the settings file (was an integer, now an enumeration string).
In the cross-reference table we now indicate whether the reference
source is doing a read, write, read-modify-write, branch, subroutine
call, is just referencing the address, or is part of the data.
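
Roughly, each cross-reference entry now carries a type tag along these lines:

    // Reference types shown in the cross-reference table.
    enum XrefType {
        Read,               // e.g. LDA addr
        Write,              // e.g. STA addr
        ReadModifyWrite,    // e.g. INC addr
        Branch,             // e.g. BNE addr
        SubroutineCall,     // e.g. JSR addr
        AddressReference,   // e.g. LDA #<addr
        DataPart            // the address is part of a data item
    }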
If you double-click on the opcode of "JSR label", the code view
selection jumps to the label. This now works for partial operands,
e.g. "LDA #<label".
Some changes to the find-label-offset code affected the cc65 "is it
a forward reference to a direct-page label" logic. The regression
test now correctly identifies an instruction that refers to itself
as not being a forward reference.
When you edit the operand of an instruction that targets an in-file
address, you're given the opportunity to specify a shortcut that
applies the symbol to the instruction's target address in addition
to or instead of defining a weak symbol reference on the instruction
being edited.
This didn't work right for operands with adjustments, e.g. the store
instructions in self-modifying code. It put the label at the
unadjusted offset, which does nothing useful.
We now correctly back up to the start of the instruction or multi-
byte data area.
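
The fix boils down to stepping back to the first byte of the item before applying the label (hypothetical sketch; "isStartOfItem" stands in for the real per-offset attribute check):

    using System;

    // Walk backward from an offset that may fall mid-instruction or
    // mid-data until we reach the start of the enclosing item.
    static class ItemStart {
        public static int FindStart(int offset, Func<int, bool> isStartOfItem) {
            while (offset > 0 && !isStartOfItem(offset)) {
                offset--;
            }
            return offset;
        }
    }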
Allows specification of table data in various ways, for 16-bit and
24-bit addresses. Shows a preview so you can see if the addresses
look about right. Adds permanent labels at target offsets if none
are present. Optionally sets code hints.
Works beautifully on the A2-Amper-fdraw example, but needs some
additional testing, documentation, etc. Dialog is more complicated
than I would have liked, mostly because of 65816 support, but I
think it'll do.
(issue #10)