Memory-mapped I/O locations can have different behavior when read
vs. written. This is part 1 of a change to allow two different
symbols to represent the same address, based on I/O direction.
This also adds a set of address masks for systems like the Atari
2600 that map hardware addresses to multiple locations.
This change updates the data structures, .sym65 file reader,
project serialization, and DefSymbol editor.
We were failing to update properly when a label changed if the label
was one that a plugin cared about. The problem is that a label
add/remove operation skips the code analysis, and a label edit skips
everything but the display update. Plugins only run during the code
analysis pass, so changes weren't being reflected in the display
list until something caused it to refresh.
The solution is to ask the plugin if the label being changed is one
that it cares about. This allows the plugin to use the same
wildcard-match logic that it uses elsewhere.
For efficiency, and to reduce clutter in plugins that don't care
about symbols, a new interface class has been created to handle the
"here are the symbols" call and the "do you care about this label"
call.
The program in Examples/Scripts has been updated to show a very
simple single-call plugin and a slightly more complex multi-call
plugin.
Most of SourceGen uses standard WPF controls, which get their default
style from the system theme. The main disassembly list uses a
custom style, and always looks like the Windows default theme.
Some people greatly prefer white text on a black background, so we
now provide a way to get that. This also requires muting the colors
used for Notes, since those were chosen to contrast with black text.
This does not affect anything other than the ListView used for
code, because everything else can be set through the Windows
"personalization" interface. We might want to change the way the
Notes window looks though, to avoid having glowing bookmarks on
the side.
The last two tabs in the Edit App Settings dialog have "quick set"
buttons configure all fields for a particular assembler, or reset
them to default values. The previous UI was a little annoying,
because you had to pick something from the combo box and then hit
"set" to push the change. It was also confusing, because if you
came back later the combo box was just set to the first entry, not
the thing you picked last.
Now, picking an entry from the combo box immediately updates all
fields. The combo box selection is set to reflect the actual
contents (so if you set everything just right, the combo box will
change to a specific assembler). If nothing matches, a special
entry labeled "Custom" is selected.
Also, rearranged the tutorial sections in the manual so the
address table formatting comes last, and appears in the local TOC.
The Find box now has forward/backward radio buttons. Find Next
searches forward, and Find Previous searches backward, regardless
of the direction of the initial search.
The standard key sequence for "find previous" is Shift+F3. The WPF
ListView has some weird logic that does something like: if you hit
a key, and the selection changes, and the shift key was held down,
then you must have meant to select a range. So Shift+F3 often (but
not always) selects a range. I think this might be fixable if I can
figure out how ListView keeps track of the current keyboard
navigation position (which is not the same as the selection). For
now I'm working around the problem by using Ctrl+F3 to search.
Yay WPF.
Early data sheets listed BRK as one byte, but RTI after a BRK skips
the following byte, effectively making BRK a 2-byte instruction.
Sometimes, such as when diassembling Apple /// SOS code, it's handy
to treat it that way explicitly.
This change makes two-byte BRKs optional, controlled by a checkbox
in the project settings. In the system definitions it defaults to
true for Apple ///, false for all others.
ACME doesn't allow BRK to have an arg, and cc65 only allows it for
65816 code (?), so it's emitted as a hex blob for those assemblers.
Anyone wishing to target those assemblers should stick to 1-byte mode.
Extension scripts have to switch between formatting one byte of
inline data and formatting an instruction with a one-byte operand.
A helper function has been added to the plugin Util class.
To get some regression test coverage, 2022-extension-scripts has
been configured to use two-byte BRK.
Also, added/corrected some SOS constants.
See also issue #44.
The "add platform symbol file" and "add extension script" buttons
create a file dialog with the initial directory set to the
RuntimeData directory inside the SourceGen installation directory.
This is great if you're trying to add a file from the platform
definitions, but annoying if you're trying to add it from the
project directory.
It's really convenient to not have to hunt around though, so now
there are two buttons: one for platform, one for project. The
latter is disabled if the project is new and hasn't been saved yet.
If it's a known function, apply basic numeric formatting to the
various fields. Primarily of value for the pathname and buffer
parameters, which are formatted as addresses.
Also, enable horizontal scrolling in the generic show-text dialog.
The current AddressMap is now passed into the plugin manager, which
wraps it in an AddressTranslate object and passes that to the
plugins at Prepare() time. This allows plugins to convert addresses
to offsets, making it possible to format complex structures.
This breaks existing plugins.
If we have a bug, or somebody edits the project file manually, we
can end up with a very wrong string, such as a null-terminated
string that isn't, or a DCI string that has a mix of high and low
ASCII from start to finish. We now check all incoming strings for
validity, and discard any that fail the test. The verification
code is shared with the extension script inline data formatter.
Also, added a comment to an F8-ROM symbol I stumbled over.
Extension scripts (a/k/a "plugins") can now apply any data format
supported by FormatDescriptor to inline data. In particular, it can
now handle variable-length inline strings. The code analyzer
verifies the string structure (e.g. null-terminated strings have
exactly one null byte, at the very end).
Added PluginException to carry an exception back to the plugin code,
for occasions when they're doing something so wrong that we just
want to smack them.
Added test 2022-extension-scripts to exercise the feature.
We were providing platform symbols to plugins through the PlatSym
list, which allowed them to find constants and well-known addresses.
We now pass all project symbols and user labels in as well. The
name "PlatSym" is no longer accurate, so the class has been renamed.
Also, added a bunch of things to the problem list viewer, and
added some more info to the Info panel.
Also, added a minor test to 2011-hinting that does not affect the
output (which is the point).
Handle situation where a symbol wraps around a bank. Updated
2021-external-symbols for that, and to test the behavior when file
data and an external symbol overlap.
The bank-wrap test turned up a bug in Merlin 32. A workaround has
been added.
Updated documentation to explain widths.
The ability to give explicit widths to local variables worked out
pretty well, so we're going to try adding the same thing to project
and platform symbols.
The first step is to allow widths to be specified in platform files,
and set with the project symbol editor. The DefSymbol editor is
also used for local variables, so a bit of dancing is required.
For platform/project symbols the width is optional, and is totally
ignored for constants. (For variables, constants are used for the
StackRel args, so the width is meaningful and required.)
We also now show the symbol's type (address or constant) and width
in the listing. This gets really distracting when overused, so we
only show it when the width is explicitly set. The default width
is 1, which most things will be, so users can make an aesthetic
choice there. (The place where widths make very little sense is when
the symbol represents a code entry point, rather than a data item.)
The maximum width of a local variable is now 256, but it's not
allowed to overlap with other variables or run of the end of the
direct page. The maximum width of a platform/project symbol is
65536, with bank-wrap behavior TBD.
The local variable table editor now refers to stack-relative
constants as such, rather than simply "constant", to make it clear
that it's not just defining an 8-bit constant.
Widths have been added to a handful of Apple II platform defs.
Change + save + undo + change was being treated as non-dirty.
Added link to "export" feature to documentation TOC.
Added keyboard shortcut for high part in data operand editor.
Corrected various things in the tutorial.
Added a blank line after local variable tables. Otherwise they
just sort of blend in with the stuff around them.
Put prefixes before the DOS 3.3 platform symbols.
Added a BAS_HBASH entry. We were getting BAS_HBASL and MON_GBASH
paired up, which looks weird.
Apply a very light tint to the preview section of the Edit Long
Comment dialog, to hint that the window is read-only.
Having underlined blue text everywhere was too noisy. This changes
the CSS style for internal links to be plain black text that gets
blue and underliney when you hover the mouse over it.
Also, added the current date and time to the set of template
substitutions.
HTML output should have had double quotes around internal anchors.
(Chrome and Edge didn't complain, but the w3c validator wasn't
happy.)
Made the text areas in the load-time problem report dialogs
scrollable.
Updated the manual.
The analyzer sometimes runs into things that don't seem right, like
hidden labels or references to non-existent symbols, but has no way
to report them. This adds a problem viewer.
I'm not quite ready to turn this into a real feature, so for now it's
a free-floating window accessed from the debug menu.
Also, updated some documentation.
Most assemblers end local label scope when a global label is
encountered. cc65 takes this one step further by ending local label
scope when constants or variables are defined. So, if we have a
variable table with a nonzero number of entries, we want to create
a fake global label at that point to end the scope.
Merlin 32 won't let you write " LDA #',' ". For some reason the
comma causes an error. IGenerator now has a "tweak operand format"
interface that lets us fix that.
Split "edit local variable table" into "create" and "edit prior".
The motivation is to allow the user to make changes to the most
recently defined table without having to go search for it. Having
table creation be an explicit action, rather than something that
just happens if you edit a table that isn't there, feels reasonable.
Show table offset in LV table edit dialog, so if you really want
to go find it there's a (clumsy) way to do so.
Increased the maximum width of a variable from 4 to 8. (This is
entirely arbitrary.)
We weren't escaping '<', '>', and '&', which caused browsers to get
very confused. Browsers seem to prefer <PRE> to <CODE> for long
blocks of text, so switch to that.
Also, added support for putting long labels on their own lines in
the HTML output.
Also, fixed some unescaped angle brackets in the manual.
Also, tweaked the edit instruction operand a bit more.
I was using the plain names, but when you've got symbols like
READ and WAIT it's too easy to have a conflict and it's not plainly
obvious where something came from. Now all monitor symbols begin
with MON_, and Applesoft symbols begin with BAS_.
The Amper-fdraw example ended up with a few broken symbol refs,
because it was created before project/platform symbols followed the
"nearby" rules, and was explicitly naming LINNUM and AMPERV. I
switched the operands to default, and they now auto-format correctly.
I added a few more entries to Applesoft while I was at it.
A ".dd2 <address>" item would get linked to an internal label, but
references to external addresses weren't doing the appropriate
search through the platform/project symbol list.
This change altered the output of the 2019-local-variables test.
The previous behavior was restored by disabling "nearby" symbol
matching in the project properties.
Updated the "lookup symbol by address" function to ignore local
variables.
Also, minor updates to Applesoft and F8-ROM symbol tables.
It's possible to define multiple project symbols with the same
address. The way to resolve the ambiguity is to explicitly
reference the desired symbol from the operand. This was the
default behavior of the "create project symbol" shortcut in the
previous version.
It's rarely necessary, and it can get ugly if you rename a project
symbol, because we don't refactor operands in that case.
If you play games with code hints you can create a data operand that
overlaps with code. This causes problems (see issue #45). We now
check for that situation and ignore overlapping data descriptors.
Added a regression test to 2011-hinting.
Also, removed "include symbol table" from export dialog. You can
exclude the table by removing it from the template, which right
now you'd need to do anyway to get rid of the H2 header and other
framing. To make this work correctly as an option we'd need to
parse the "div" in the template file and strip the whole section,
or split the template into multiple parts that get included as
needed. Not worth doing the work until we're sure it matters.
Updated the manual, and changed tutorial #2 to use local variables
for pointers.
If the symbol text box isn't empty, use the string as the initial
value for the Label when creating a new project property.
Fixed a crash when editing a project property.
Implemented local variable editing. Operands that have a local
variable reference, or are eligible to have one, can now be edited
directly from the instruction operand edit dialog.
Also, updated the code list double-click handler so that, if you
double-click on the opcode of an instruction that uses a local
variable reference, the selection and view will jump to the place
where that variable was defined.
Also, tweaked the way the References window refers to references
to an address that didn't use a symbol at that address. Updated
the explanation in the manual, which was a bit confusing.
Also, fixed some odds and ends in the manual.
Also, fixed a nasty infinite recursion bug (issue #47).
The code that checked to see if a data target was inside a data
operand wasn't going all the way back to the start of the file.
It was also failing to stop when it should, wasting time.
The anattrib validation method has code that avoids a false-positive
on certain complex embedded instruction arrangements. This was also
preventing it from seeing a transition from a data area to the
middle of an instruction (caused by issue #45).
Previously, we used the default character encoding from the project
properties to determine how strings and character constants in the
entire source file should be encoded. Now we switch between
encodings as needed. The default character encoding is no longer
relevant.
High ASCII is now an actual encoding, rather than acting like ASCII
that sometimes doesn't work. Because we can do high ASCII character
operands with "| $80", we don't output a .enc to switch from ASCII
to high ASCII unless we need to generate a string. (If we're
already in high ASCII mode, the "| $80" isn't required but won't
hurt anything.)
We now do a scan up front to see if ASCII or high ASCII is needed,
and only output the .cdefs for the encodings that are actually used.
The only gap in the matrix is high ASCII DCI strings -- the ".shift"
pseudo-op rejects text if the string doesn't start with the high
bit clear.
I didn't think it made sense, but I found something that used it,
so apparently it's a thing. This updates the operand editor to
let you choose PETSCII+DCI, and updates the assemblers to handle
it correctly (really just 64tass, since the others either don't
have a DCI directive or don't deal with PETSCII at all).
Changed the char-encoding sample from "bad dcI" to "pet dcI", and
updated the documentation.
The documentation for 64tass says you're required to pass "--ascii"
when the source file is ASCII (as opposed to PETSCII). We were
ignoring this, but it turns out that everything works a bit better
if we don't.
So we now pass "--ascii" on the command line, and add a two-line
character encoding definition to every file that is generated with
ASCII as the default encoding. The sg_petscii and sg_screen
encodings go away, as PETSCII is now the default, and we can use the
built-in "screen" encoding.
The plugin objects are MarshalByRefObject stubs, which means they
don't actually implement the interfaces we're checking for. There's
some additional overhead to do the interface check. We can avoid
it by doing the interface queries during initialization, and just
checking some bit flags later on.
Also, in the extension script info window, show a list of
implemented interfaces.
During a discussion with the cc65 developers, I became convinced that
generating "MVN $01,$02" is wrong, and "MVN #$01,#$02" is correct.
64tass, cc65, and Merlin 32 all accept this syntax; only ACME does
not. Operands without a leading '#' should be treated as 24-bit
values, and have the bank byte extracted.
This change updates the on-screen display and assembled output to
include the '#'. The ACME generator uses a Quirk to suppress the
hash mark. (It doesn't currently accept values larger than 8 bits,
so there's no ambiguity.)
There's no easy way to make non-zero-bank 65816 code work, so I'm
punting and just generating a whole-file hex dump for those. This
renders tests 2007 and 2009 useless, so I'm hesitant to claim that
ACME support is fully functional.
Instead of providing no-op CheckJsr/CheckJsl, plugins now declare
which calls they support by defining interfaces on the plugin class.
I added a CheckBrk call for code like Apple /// SOS calls, which
use BRK as an OS call mechanism. The formatting doesn't work quite
right yet because I've been treating BRK as a two-byte instruction.
Hardly anything else does, and I think it's time I stopped (but not
in this commit).
Note: THIS BREAKS ALL PLUGINS that use the inline JSR/JSL feature,
which is pretty much all of them.
- Updated the tutorial to track changes to WPF, and to clarify
existing content.
- Fixed Ctrl+H Ctrl+C, which was getting masked by the Copy command
handler.
- Fixed initial selection of address in Set Address.
- MakeDist now copies CommonWPF.dll.
- Spent a bunch of time tracking down a null-pointer deref that only
happened when you didn't start with a config file. Fixed.
- The NPE was causing the program to exit without any sort of useful
diagnostic, so I added an uncaught exception handler that writes
the crash to a text file in the current directory.
- Added a trace listener definition to App.config that writes log
messages to a file, but it can't generally be enabled at runtime
because you can't write files from inside the sandbox. So it's
there but commented out.
- Made the initial size of the main window a little wider.
SourceGen creates "auto" labels when it finds a reference to an
address that doesn't have a label associated with it. The label for
address $1234 would be "L1234". This change allows the project to
specify alternative label naming conventions, annotating them with
information from the cross-reference data. For example, a subroutine
entry point (i.e. the target of a JSR) would be "S_1234". (The
underscore was added to avoid confusion when an annotation letter
is the same as a hex digit.)
Also, tweaked the way the preferred clipboard line format is stored
in the settings file (was an integer, now an enumeration string).
In the cross-reference table we now indicate whether the reference
source is doing a read, write, read-modify-write, branch, subroutine
call, is just referencing the address, or is part of the data.
In the data operand edit section, walk through selecting a single
byte vs. multiple bytes when you want to set a multi-byte format.
(inspired by issue #41)
The 2014-label-dp test now passes. Prior regression tests are
unaffected.
Also, renamed an IGenerator interface to more accurately reflect
its role.
(issue #37)
Before you could choose between Merlin-style and generic. Now
there's a combo box that lets you choose Merlin, cc65, or
"common", the latter being used for 64tass.
Gave cc65 its own expression generator, as the precedence table seems
atypical if not unique. Configured 64tass to use the "simple"
expression mode.
Added some operations on a 32-bit constant to 2007-labels-and-symbols
to exercise the current worst-case expression (shift + AND + add).
Tweaked the Merlin expression generator to handle it.
(issue #16)
Correctly handle pre-existing underscores and avoidance of
"reserved" labels.
Also, add more underscores to 2012-label-localizer to exercise
the code.
(issue #16)
Most tests pass, but 2007-labels-and-symbols fails because the
expressions recognized by 64tass don't match up with either of the
other assemblers.
This is currently using a workaround for the local label syntax.
64tass uses '_' as the prefix, which is unfortunate since SourceGen
explicitly allowed underscores in labels. (So does 64tass for that
matter, but it treats labels specially when the '_' comes first.)
We will need to rename any non-local user labels that start with '_'.
(issue #16)
We were using \u23e9, BLACK RIGHT-POINTING DOUBLE TRIANGLE, but
neither Win7 SP1 nor Linux was able to display the glyph. It also
gets all puffy in web browsers. We now use \u25bc, BLACK
DOWN-POINTING TRIANGLE, which seems to work everywhere. It also
feels more appropriate, because it appears next to the "containing"
opcode, with the embedded instruction appearing on the following
line.
Once upon a time, symbol files and extension scripts could only be
defined in the RuntimeData directory, so having the documentation
there made sense. Since both of these things can now be defined in
project directories, the documentation belongs in the manual.
(issue #27)
If PTR is defined as an external symbol, we were automatically
symbol-ifying PTR+1. Now we also symbolify PTR+2. This helps with
24-bit pointers on the 65816, and 16-bit "jump vectors", where the
address is preceded by a JMP opcode.
Removed the "AMPERV_" symbol I added to make the tutorial look
right.
The instruction operand editor and data operand editor are very
different, but there's no need to impose that distinction on the
user. They want to edit the operand either way. We now provide a
single "edit operand" menu item, and open the appropriate dialog
based on what they have selected.
This uses Ctrl+O as the keyboard shortcut, stealing it from
File > Open.
(issue #11)
Two changes:
(1) Code and data hints are now only applied to the first byte on
each selected line. This allows you to slap a code hint on a
string without lighting up the whole string. Inline-data hints
and hint removal work as before.
(2) Added a menu item (with Ctrl+D as shortcut) to toggle the state
of the uncategorized data analyzer. This makes it easy to turn
off the feature that put the code into a string in the first
place.
Don't enable OK unless at least one address is valid.
Don't apply code hints unless asked.
Rename a couple of things for clarity.
Add documentation to manual.
(issue #10)
If set, the first word of the file is used to set the load address.
The initial code entry hint is placed at offset +000002 instead of
the start of the file.
Set it to true for the C64 system definition.
(issue #23)