The code that searches for character strings in uncategorized data
now recognizes the C64 encodings when selected in the project
properties.
The new code avoids some redundant comparisons when runs of
printable characters are found. I suspect the new implementation
loses on overall performance because we're now calling through
delegates instead of testing characters directly, but I haven't
tested for that.
The previous code output a character in single-quotes if it was
standard ASCII, double-quotes if high ASCII, or hex if it was neither
of those. If a flag was set, high ASCII would also be output as
hex.
The new system takes the character value and an encoding identifier.
The identifier selects the character converter and delimiter
pattern, and puts the two together to generate the operand.
While doing this I realized that I could trivially support high
ASCII character arguments in all assemblers by setting the delimiter
pattern to "'#' | $80".
In FormatDescriptor, I had previously renamed the "Ascii" sub-type
"LowAscii" so it wouldn't be confused, but I dislike filling the
project file with "LowAscii" when "Ascii" is more accurate and less
confusing. So I switched it back, and we now check the project
file version number when deciding what to do with an ASCII item.
The CharEncoding tests/converters were also renamed.
Moved the default delimiter patterns to the string table.
Widened the delimiter pattern input fields slightly. Added a read-
only TextBox with assorted non-typewriter quotes and things so
people have something to copy text from.
We've been treating ASCII strings and instruction/data operands as
ambiguous, resolving low vs. high when generating output for the
display or assembler. This change splits it into two separate
formats, simplifying output generation.
The UI will continue to treat low/high ASCII as as single thing,
selecting the format appropriately based on the data. There's no
reason to have two radio buttons that are never both enabled.
The data operand string functions need some additional work, but
that overlaps substantially with the upcoming PETSCII changes, so
for now all strings set by the data operand editor are low ASCII.
The file format has changed again, but since there hasn't been a
release since the previous change, I'm leaving the file format
at v2. Code has been added to resolve the ASCII mode when loading
a v1 project file.
This removes some complexity from the assembly code generators.
SourceGen creates "auto" labels when it finds a reference to an
address that doesn't have a label associated with it. The label for
address $1234 would be "L1234". This change allows the project to
specify alternative label naming conventions, annotating them with
information from the cross-reference data. For example, a subroutine
entry point (i.e. the target of a JSR) would be "S_1234". (The
underscore was added to avoid confusion when an annotation letter
is the same as a hex digit.)
Also, tweaked the way the preferred clipboard line format is stored
in the settings file (was an integer, now an enumeration string).
In the cross-reference table we now indicate whether the reference
source is doing a read, write, read-modify-write, branch, subroutine
call, is just referencing the address, or is part of the data.
If you double-click on the opcode of "JSR label", the code view
selection jumps to the label. This now works for partial operands,
e.g. "LDA #<label".
Some changes to the find-label-offset code affected the cc65 "is it
a forward reference to a direct-page label" logic. The regression
test now correctly identifies an instruction that refers to itself
as not being a forward reference.
If PTR is defined as an external symbol, we were automatically
symbol-ifying PTR+1. Now we also symbolify PTR+2. This helps with
24-bit pointers on the 65816, and 16-bit "jump vectors", where the
address is preceded by a JMP opcode.
Removed the "AMPERV_" symbol I added to make the tutorial look
right.
If we set the length word to assemble at address zero, the rest of
the code will try to use it as a zero-page label, so don't do that.
Instead, we use the start address, creating an overlapping region.
Easy enough to edit if that's undesirable.
(issue #23)
First is always at zero, second is at the address. This puts an
ORG directive right at the start of the code, and avoids potentially
assembler-specific wrap-around behavior when the desired load
address is $0000 or $0001.
(issue #23)
If set, the first word of the file is used to set the load address.
The initial code entry hint is placed at offset +000002 instead of
the start of the file.
Set it to true for the C64 system definition.
(issue #23)
It's possible to have format descriptors on instructions that are
left over from when the bytes were treated as data. Single-byte
formats were being allowed on single-byte instructions, which
confused things later when the code tried to apply the format to
an instruction with no operand.