Split ".org" into ".arstart" and ".arend" (address range start/end).
Address range ends are now shown in the code list view, and the
pseudo-op can be edited in app settings. Address range starts are
now shown after notes and long comments, rather than before, which
brings the on-screen display in sync with generated code.
Reworked the address range editor UI to include the new features.
The implementation is fully broken.
More changes to the AddressMap API, putting the resolved region length
into a separate ActualLength field. Added FindRegion(). Renamed
some things.
Code generation changed slightly: the blank line before a region-end
line now comes after it, and ACME's "} ;!pseudopc" is now just "}".
This required minor updates to some of the regression test results.
AddressMap API reshuffle. Added "pre-label" to class and API. Split
AddressMapEntry into two parts to make it clear when FLOATING_LEN
has been resolved.
Updated display line list generator to use in-line linear map
traversal. Previous approach was to walk through the list of regions
in a second pass, inserting .ORG directives, but that was awkward
and is no longer needed.
This is the first step toward changing the address region map from a
linear list to a hierarchy. See issue #107 for the plan.
The AddressMap class has been rewritten to support the new approach.
The rest of the project has been updated to conform to the new API,
but feature-wise is unchanged. While the map class supports
nested regions with explicit lengths, the rest of the application
still assumes a series of non-overlapping regions with "floating"
lengths.
The Set Address dialog is currently non-functional.
All of the output for cc65 changed because generation of segment
comments has been removed. Some of the output for ACME changed as
well, because we no longer follow "* = addr" with a redundant
pseudopc statement. ACME and 65tass have similar approaches to
placing things in memory, and so now have similar implementations.
The data operand editor determines low vs. high ASCII formatting by
examining the first byte of string data. Unfortunately the test was
broken, and for strings with a 1- or 2-byte length, was testing the
length byte instead of the character data. This is now fixed.
This also changes the way empty strings are handled. Before, they
were allowed but not counted, so you couldn't create an empty string
by itself, but could do it if it were part of a larger group. This
was unnecessarily restrictive. Empty L1/L2/null-term strings are now
allowed.
This means that a buffer full of $00 can be formatted as a big pile
of empty strings, which seems a bit ridiculous but there's no good
reason to obstruct it.
(issue #110)
If you select a line that refers to another line, the target line's
address is highlighted. If you select multiple lines, the highlight
is removed. The code that removes the highlight was inadvertently
resetting the selection, making it impossible to shift-click a
range that ended with a highlighted line.
The bottom of the main window shows the total bytes in the project
along with how many are code, data, and junk. The junk figure wasn't
updating if you changed a data item to junk or vice-versa, because a
simple format change doesn't require reanalyzing the file.
To make the counter "live", we need to tell the updater to refresh the
values whenever a format changes to or from "junk".
If you select a note or long comment and delete it, the selection is
lost, because the selected item no longer exists. This is inconvenient
if you're working with the keyboard, because it moves the keyboard
position to the top of the file.
If the previous selection was non-empty, and the new selection is
empty, we want to select the line that would have appeared below the
item that was deleted. Making this work exactly right is a bunch of
work, but we can make it work mostly right by selecting the first line
associated with the offset of the previous selection.
It's useful to have an example of an extension script that handles
multiple types of things. It's also good to show that scripts can
handle data types other than strings, and can chase an address to
format data items elsewhere in the code.
This required updating the tutorial binary, adding the new script,
and updating the tutorial text and associated screen shots.
If a DCI string ended with a string delimiter or non-ASCII character
(e.g. a PETSCII char with no ASCII equivalent), the code generator
output the last byte as a hex value. This caused an error because it
was outputting the raw hex value, with the high bit already set, which
the assembler did not expect.
This change corrects the behavior for code generation and on-screen
display, and adds a few samples to the regression test suite.
(see issue #102)
On the 65816, if you say "JSR foo" from bank $12, but "foo" is an
address in bank 0, most assemblers will conclude that you're forming
a 16-bit argument with a 16-bit address and assemble happily. 64tass
halts with an error. Up until v1.55 or so, you could fake it out
by supplying a large offset.
This no longer works. The preferred way to say "no really I mean to
do this" is to append ",k" to the operand. We now do that as needed.
I didn't want to define a new ExpressionMode for 64tass just to
support an operand modifier that should probably never actually get
generated (you can't call across banks with JSR!), so this is
implemented with a quirk and an op flag.
64tass v1.56.2625 is now the default.
(issue #104)
This is another attempt to fix the ListView keyboard position
behavior. Basic problem: if you change something in the ListView,
the keyboard position is lost, and WPF doesn't expose a nice way to
save and restore it. It appears the way to set the position is by
calling Focus() on the specific item you want to have as the "current"
keyboard position, but you can only do that at certain times.
This attempt removes the grid-splitter resize hack, in favor of just
setting a "needs refocus" flag when we restore the selection set.
This causes Focus() to be called from the StatusChanged callback on
the next event with status="containers generated".
During testing I noticed some other odd behavior: if you used "goto"
to jump to an address, up/down arrows would change focus to a
different control (menu items, grid splitters, etc). The problem
there was that we were setting focus to the ListView control rather
than to a ListViewItem, so arrow keys were in control-traversal mode
rather than list-walk mode. That is also fixed.
(Issue #105)
The DCI string format uses character values where the high bit of the
last byte differs from the rest of the string. Usually all the high
bits are clear except on the last byte, but SourceGen generally allows
either polarity.
This gets a little uncertain with single-character strings, because
SourceGen can't auto-detect DCI very effectively. A series of bytes
with the high bit set could be a single high-ASCII string or a series
of single-byte DCI strings.
The motivation for allowing them is C64 PETSCII. While ASCII allows
"high ASCII" as an escape hatch, PETSCII doesn't have that option, so
there's no way to mark the data as a character or a string. We still
want to do a bit of screening, but if the user specifies a non-ASCII
character set and the selected bytes have their high bits set, we
want to just treat the whole set as 1-byte DCI.
Some minor adjustments were needed for a couple of validity checks
that expected longer strings.
This adds some short DCI strings in different character sets to the
char-encoding regression tests.
(for issue #102)
The code for formatting an address table allows you to specify that
code start tags should be placed on all targets. However, unnecessary
tags are undesirable, and it's not necessary to add a tag if the
target is already treated as executable code. So the implementation
tested to see if the target address was already an instruction.
The code was incorrectly testing for "is instruction", rather than "is
instruction start", which meant that if the table entry pointed at an
instruction embedded inside another instruction we would conclude that
the tag wasn't necessary, when in fact it was. Not only weren't we
getting a useful table entry, we were adding a symbolic reference to a
hidden label.
(issue #103)
When formatting one or more strings with the Edit Data Operand dialog,
the code must determine which options to present. If the selected
bytes appear to represent one or more null-terminated strings, that
option is enabled in the UI.
The "format recognizers" enforce some strict rules, e.g. null-
terminated strings must end in $00, and also try to confirm that the
data looks like a printable string. The algorithm rejects strings
with "illegal" characters in them. This is simpler on some systems
than others. For example, C64 PETSCII defines quite a few control
characters in ways that make them useful for embedding in printable
strings.
The "recognizers" are only used by the operand edit feature, not as
part of an automated string detector, so there's no real upside in
overriding the user's desire to form a string with arbitrary bytes.
This removes the quick rejection from the four recognizers (null-term,
len8, len16, dci). It does not alter the high-level code, which
still insists on a certain percentage of the string being printable;
that may be worth revisiting as well.
(issue #100)
64tass wants to place its output into a 64KB region of memory,
starting at the address "*" is set to, and continuing without
wrapping around the end of the bank. Some files aren't meant to be
handled that way, so we need to generate the output differently.
If the file's output fits nicely, it's considered "loadable", and
is generated in the usual way. If it doesn't, it's treated as
"streamable", and the initial "* = addr" directive is omitted
(leaving "*" at zero), and we go straight to ".logical" directives.
65816 code with an initial address outside bank 0 is treated as
"streamable" whether or not the contents fit nicely in the designated
64K area. This caused a minor change to a few of the 65816 tests.
A new test, 20240-large-overlay, exercises "streamable" by creating
a file with eight overlapping 8KB segments that load at $8000.
While the file as a whole fits in 64KB, it wouldn't if loaded at
the desired start address.
Also, updated the regression test harness to report assembler
failure independently of overall test failure. This makes it easier
to confirm that (say) ACME v0.96.4 still works with the code we
generate, even though it doesn't match the expected output (which
was generated for v0.97).
(problem was raised in issue #98)
The initial implementation was testing the byte value rather than
the converted value, so backslashes were getting through in high
ASCII strings. PETSCII and C64 screen codes don't really have a
backslash so it's not really an issue there.
The new implementation handles high ASCII correctly. The various
201n0-char-encoding-x regression tests have been updated to verify
this.
The code generator outputs an optional comment specifying which
version of which assembler the code was generated for. This was
handled inconsistently and, for the most part, incorrectly. We now
report the correct version.
Two things changed: (1) string literals can now hold backslash
escapes like "\n"; (2) MVN/MVP operands can now be prefixed with '#'.
The former was a breaking change because any string with "\" must
be changed to "\\". This is now handled by the string operand
formatter.
Also, improved test harness output. Show the assembler versions at
the end, and include assembler failure messages in the collected
output.
If you have multiple overlapping segments (say, four 8KB chunks
that all load at $8000), and you use the "goto" command to jump
to address $8100, it should try to jump to that address within
whichever segment you happen to be working (based on the current
line selection). If that address doesn't exist in the current
segment, it's okay to punt and jump to the first occurrence of that
address in the file.
The existing code was always jumping to the first instance.
(related to issue #98)
Code generated by one of the C compilers sets up the stack frame and
then maps the direct page on top of it. If the value at the top of
the stack is 16 bits, it will be referenced via address $ff. The
local variable editor was regarding this as illegal, because lvars are
currently only defined for direct page data, and the value doesn't
entirely fit there (unless you're doing an indirect JMP on an NMOS
6502, in which case it wraps around to $00... but let's ignore that).
The actual max width of a local variable is 257 because of the
possibility of a 16-bit access at $ff.
Older versions of SourceGen don't seem to have an issue when they
encounter this situation, as worrying about (start+width) is really
just an editor affectation. The access itself is still a direct-page
operation. You won't be able to edit the entry without reducing the
length, but otherwise everything works. I don't think there's a need
to bump the file version.
Added a compiled C implementation of strlen(). The most interesting
part about this is that it references a 16-bit value via direct-page
address $ff, which means you'd want a local variable with
address=$ff and width=2. The current UI prevents this.
We were using a very simple regex pattern for the label part, and
not performing additional validation checks later. This allowed
a symbol that started with a number (e.g. "4ALL") to get much farther
than it should have.
This change modifies the regex pattern to match only valid label
syntax.
Replaced the link at the top of the manual. Remove reference to
old tutorial doc. Added an obsolescence notice to the top of the
old tutorial. Updated tutorial message and link in README.
Also, fixed sidenav style.
If you changed the width of a column, and then clicked the "toggle
display of cycle counts" button in the toolbar, the column width
would revert. The problem appears unique to that toolbar button,
so for now the fix is localized there. The more general fix is to
ensure that column width changes don't get stomped, but that's a
larger change.
There's no need to use XHTML Transitional. The only change outside
the template was to use "id" for anchors instead of "name", as the
latter is deprecated.
The calculations were wrong for certain situations, generating
answers that were useless or that caused a false-positive overflow
error.
This adds a couple of simple regression tests, modeled after layout
of the Lode Runner sprite sheet (which worked fine before) and the
Empire II EWS3 font (which failed).
This also bumps up some of the arbitrary limits in the visualizer.
(issue #94)
The $Cxxx I/O locations are mapped into banks $E0/E1, and are usually
configured to appear in banks $00/01 as well. Direct access to
locations in banks $E0/E1 is common in 16-bit code, but we only had
definitions for $E0.
This adds a clone of definitions for $E1, and renames the symbols
to be _E0/_E1 instead of _GS.
This can also be solved with MULTI_MASK, but that will always use
$E0 as the base address, so references to $E1/Cxxx will have a large
adjustment added ("+$10000"), which is kind of ugly.
Note we still don't have definitions for $01/Cxxx. I'll add those
if I run into them in 16-bit code. (That might be a reasonable use
of MULTI_MASK; feels less ugly somehow.)