The previous code output a character in single-quotes if it was
standard ASCII, double-quotes if high ASCII, or hex if it was neither
of those. If a flag was set, high ASCII would also be output as
hex.
The new system takes the character value and an encoding identifier.
The identifier selects the character converter and delimiter
pattern, and puts the two together to generate the operand.
While doing this I realized that I could trivially support high
ASCII character arguments in all assemblers by setting the delimiter
pattern to "'#' | $80".
In FormatDescriptor, I had previously renamed the "Ascii" sub-type
"LowAscii" so it wouldn't be confused, but I dislike filling the
project file with "LowAscii" when "Ascii" is more accurate and less
confusing. So I switched it back, and we now check the project
file version number when deciding what to do with an ASCII item.
The CharEncoding tests/converters were also renamed.
Moved the default delimiter patterns to the string table.
Widened the delimiter pattern input fields slightly. Added a read-
only TextBox with assorted non-typewriter quotes and things so
people have something to copy text from.
We've been treating ASCII strings and instruction/data operands as
ambiguous, resolving low vs. high when generating output for the
display or assembler. This change splits it into two separate
formats, simplifying output generation.
The UI will continue to treat low/high ASCII as as single thing,
selecting the format appropriately based on the data. There's no
reason to have two radio buttons that are never both enabled.
The data operand string functions need some additional work, but
that overlaps substantially with the upcoming PETSCII changes, so
for now all strings set by the data operand editor are low ASCII.
The file format has changed again, but since there hasn't been a
release since the previous change, I'm leaving the file format
at v2. Code has been added to resolve the ASCII mode when loading
a v1 project file.
This removes some complexity from the assembly code generators.
DCI is handled with the ".shift" pseudo-op. The .null, .ptext,
and .shift operators all work correctly with escaped characters,
so we no longer redo those.
This generalizes the string pseudo-operand formatter, moving it into
the Asm65 library. The assembly source generators have been updated
to use it. This makes the individual generators simpler, and by
virtue of avoiding "test runs" should make them slightly faster.
This also introduces byte-to-character converters, though we're
currently still only supporting low/high ASCII.
Regression test output is unchanged.
We used to use type="String", with the sub-type indicating whether
the string was null-terminated, prefixed with a length, or whatever.
This didn't leave much room for specifying a character encoding,
which is orthogonal to the sub-type.
What we actually want is to have the type specify the string type,
and then have the sub-type determine the character encoding. These
sub-types can also be used with the Numeric type to specify the
encoding of character operands.
This change updates the enum definitions and the various bits of
code that use them, but does not add any code for working with
non-ASCII character encodings.
The project file version number was incremented to 2, since the new
FormatDescriptor serialization is mildly incompatible with the old.
(Won't explode, but it'll post a complaint and ignore the stuff
it doesn't recognize.)
While I was at it, I finished removing DciReverse. It's still part
of the 2005-string-types regression test, which currently fails
because the generated source doesn't match.
The 65816 definition makes it a two-byte instruction, like COP. On
the 6502 it acted like a two-byte instruction, but in practice very
few assemblers treat it that way. Very few humans, for that matter.
So it's now treated as a single byte instruction, with the following
byte encoded as a data value.
This worked, sort of. The problem is that SourceGen will revert to
hex output in certain situations, such as a broken symbolic
reference. There happens to be one in the ZIPPY example, and it's
on a relative branch.
The goal with the segment stuff is to allow cc65 to treat the
source as relocatable code. In that context, a relative branch to
an absolute address doesn't make any sense, so the assembler reports
a range error.
We don't currently have a mechanism that guarantees no references
are broken (and no affordance for finding them), so we can't make
this mode the default yet.
Instead, we continue to use the generic config, but generate the
correct set of lines as comments.
(issue #39)
The cc65 assembler runs in a single pass, which means forward
address references default to 16 bits. For zero-page references
we have to add an explicit width disambiguator. (This is an
unusual situation that only occurs if you have a zero-page .ORG
in the file after code that references it.)
With this change, 2014-label-dp passes, and no other regression
tests were affected.
(issue #40)
The 2014-label-dp test now passes. Prior regression tests are
unaffected.
Also, renamed an IGenerator interface to more accurately reflect
its role.
(issue #37)
To avoid confusing the assembler, expressions with a leading
parenthesis like "(foo & $ffff) + 1" are prefixed with a "0+". This
is not necessary if the operand begins with a '#'.
(issue #16)
We now insert parenthesis as needed. This can cause problems in
some situations, so we always prefix parenthetical expressions with
"0+", which looks goofy and is unnecessary for immediate operands.
But it does generate working source code.
Renamed the "simple" expression mode to "common", as it's not
particularly simple but is what you'd expect most assemblers to do.
(OTOH, life has been full of surprises.)
(issue #16)
Gave cc65 its own expression generator, as the precedence table seems
atypical if not unique. Configured 64tass to use the "simple"
expression mode.
Added some operations on a 32-bit constant to 2007-labels-and-symbols
to exercise the current worst-case expression (shift + AND + add).
Tweaked the Merlin expression generator to handle it.
(issue #16)
Most tests pass, but 2007-labels-and-symbols fails because the
expressions recognized by 64tass don't match up with either of the
other assemblers.
This is currently using a workaround for the local label syntax.
64tass uses '_' as the prefix, which is unfortunate since SourceGen
explicitly allowed underscores in labels. (So does 64tass for that
matter, but it treats labels specially when the '_' comes first.)
We will need to rename any non-local user labels that start with '_'.
(issue #16)