Split into 6502/65816 portions. The 6502 version is the original
with a few in-place substitutions (e.g. JMP for BRL). The 65816
version is only needed to exercise special handling of PEA/PER.
We have a single character-encoding test that is cloned 3x so we can
exercise the different values for the project's default character
set. It was a 65816 test because it tested 16-bit immediate char
operands, but that's a very small part of it.
The 65816-specific portion is now 20122-char-encoding. The rest is
now 201{2,3,4}0-char-encoding-X.
Tests 10022-embedded-instructions and 10032-flags-and-branches were
a mix of 6502 and 65816 code. The 6502 code has been separated into
its own file, so that the tests can be run on 8-bit-only assemblers.
We append an assembler identifier to generated code. For Merlin 32,
this was "_Merlin32". All of the other assemblers use a lower-case
string, which makes Merlin look a little weird, so it has been
changed to "_merlin32".
Windows filesystems are generally case-insensitive, so this is
unlikely to affect anything.
A few tweaks:
- Test now requires an ORG on offset +000002, not just a correct
address.
- Suppress on-screen display of the initial ORG directive when
a PRG file is detected. Subtle, but helpful.
- In new project setup, fix initial address for PRG projects that
load at $0000.
- In new project setup, add a "load address" comment to the first line.
Also, fix some out-of-date documentation.
(issue #90)
The 10042-data-recognition test has no 65816-specific content, so it
should be named 10040-data-recognition.
Also, remove header comment from 20102-label-dp.
C64 PRG files are pretty common. Their salient feature is that they
start with a 16-bit value that is used as the load address. The
value is commonly generated by the assembler itself, rather than
explicitly added to the source file.
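As a rough illustration of the header layout described above, this
Python sketch splits a PRG image into its load address and payload.
The function name and the sample bytes are made up for the example;
it is not SourceGen's detection logic.

```python
def split_prg(data: bytes):
    """Return (load_address, payload) for a C64 PRG image.

    The first two bytes hold the load address, low byte first.
    """
    if len(data) < 2:
        raise ValueError("too short to hold a PRG header")
    load_addr = data[0] | (data[1] << 8)
    return load_addr, data[2:]


# Hypothetical sample: header bytes $01 $08, then three payload bytes.
addr, payload = split_prg(bytes([0x01, 0x08, 0xA9, 0x00, 0x60]))
print(f"load=${addr:04x} payload={len(payload)} bytes")
```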
Not all assemblers know what a PRG file is, and some of them handle
it in ways that are difficult to guarantee in SourceGen. ACME adds
the 16-bit header when the output file name ends in ".prg", cc65
uses a modified config file, 64tass uses a different command-line
option, and Merlin 32 has no idea what they are.
This change adds PRG file detection and handling to the 64tass code
generator. Doing so required making a few changes to the gen/asm
interfaces, because we now need to have the generator pass additional
flags to the assembler, and sometimes we need code generation to
start somewhere other than offset zero. Overall the changes were
pretty minor.
The 20042-address-changes test needed a 6502-only variant. A new test
(20040-address-changes) has been added and given a PRG header. As
part of this change the 65816 variant was changed to use addresses
in bank 2, which uncovered a code generation bug that this change
also fixes.
The 64tass --long-address flag doesn't appear to be necessary for
files <= 65536 bytes long, so we no longer emit it for those.
(issue #90)
One of the most confusing things you can do is select a bunch of
lines and apply a code start tag (nee "code hint"). We now ask for
confirmation when applying start/stop hints to multiple lines.
(issue #89)
Variables, types, and comments have been updated to reflect the new
naming scheme.
The project file serialization code is untouched, because the data
is output as serialized enumerated values. Adding a string conversion
layer didn't seem worthwhile.
No changes in behavior.
(issue #89)
Before:
Hint As Code Entry Point
Hint As Data Start
Hint As Inline Data
Remove Hints
After:
Tag Address As Code Start Point
Tag Address As Code Stop Point
Tag Bytes As Inline Data
Remove Analyzer Tags
The goal is to reduce confusion. The old nomenclature was causing
problems because it's inaccurate -- they're directives, not hints --
and made it look like you need to mark data items explicitly. The
new action names emphasize the idea that you should be tagging a
single address for start/stop, not blanketing a region.
This change updates the user interface, manual, and tutorials, but
does not change how the items are referred to in code, and does not
change how the program works.
(issue #89)
Modified the asm source generators and on-screen display to show the
DP arg for BBR/BBS as hex. The instructions are otherwise treated
as relative branches, e.g. the DP arg doesn't get factored into the
cross-reference table.
ACME/cc65 put the bit number in the mnemonic, 64tass wants it to be
in the first argument, and Merlin32 wants nothing to do with any of
this because it's incompatible with the 65816.
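To make the two conventions concrete, here is a small Python sketch
that formats a BBR/BBS instruction either way. The function and the
style names are hypothetical, and the exact spacing and operand
syntax each assembler accepts is an assumption; only the opcode
layout ($xF, bit number in bits 4-6, BBS when bit 7 is set) is taken
as given.

```python
def format_bit_branch(opcode: int, dp: int, target: str, style: str) -> str:
    """Format a Rockwell BBR/BBS instruction.

    style "mnemonic": bit number is part of the mnemonic (ACME/cc65 style).
    style "operand":  bit number is the first argument (64tass style).
    """
    if opcode & 0x0F != 0x0F:
        raise ValueError("not a BBR/BBS opcode")
    name = "bbs" if opcode & 0x80 else "bbr"
    bit = (opcode >> 4) & 0x07
    if style == "mnemonic":
        return f"{name}{bit} ${dp:02x},{target}"
    if style == "operand":
        return f"{name} {bit},${dp:02x},{target}"
    raise ValueError("unknown style")


print(format_bit_branch(0x0F, 0x12, "L1000", "mnemonic"))
print(format_bit_branch(0xBF, 0x12, "L1000", "operand"))
```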
Added an "all ops" test for W65C02.
Created the "all ops" tests for W65C02. Filled in enough of the
necessary infrastructure to be able to create the project and
disassemble the file, though we're not yet handling the instructions
correctly.
We were claiming W65C02S, but it turns out that CPU has the Rockwell
extensions and the STP/WAI instructions. We need to change existing
references to be "WDC 65C02", and add a new CPU definition for the
actual W65C02S chip.
This adds the new CPU definition, the instruction definitions for
the Rockwell extensions, and updates the selectors in project properties
and the instruction chart tool.
This change shouldn't affect any existing projects. Still more to do
before W65C02 works though, mostly because the Rockwell instructions
introduced a new two-argument address mode that has to be handled in
various places.
Sometimes you just want to turn cycle counts on for a bit, and going
through app settings is tiresome. Now there's a toolbar checkbox
for it. The icon isn't ideal, but it'll do.
Renaming a user label doesn't cause a re-analysis, just a display
update, because nothing structural is changing. However, that's not
quite true when you have a reference to a non-existent label (e.g.
"LDA hoser"), and you rename a label to match (e.g. change "blah"
to "hoser"). The most obvious consequence was that the Message list,
which enumerates the broken symbolic references, was not being
updated.
We now identify broken references during the refactoring rename, and
change the reanalysis mode accordingly.
There is a deeper problem, where undoing the label rename does the
wrong thing with the previously-broken symbolic references (in the
earlier example, it "undoes" them to "blah" rather than back to
"hoser"). I added some notes about that, but it's harder to fix.
Also, clean up some code that was still treating ReanalysisScope as
if it were bit flags.
Consider:
LDA $00 loads a value from address $00
LDA $00,X might load from $00, or might not
LDA ($00),Y dereferences $00 as a 16-bit pointer
LDA ($00,X) dereferences a pointer, not necessarily from $00
When perusing the cross-reference list, it's useful to be able to
tell whether an instruction is accessing the location, using it as a
base address, or dereferencing it as a pointer. We now show "ptr" in
the list for pointer dereferences. (We already showed "idx" for
indexed accesses.)
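The distinction can be sketched by classifying the operand text.
This is a hypothetical Python helper, not the analyzer's code. How
the real list labels "($00,X)", which is both indexed and indirect,
is not stated above; the sketch simply treats any indirect operand
as "ptr".

```python
import re


def access_kind(operand: str) -> str:
    """Classify a 6502 operand string as "ptr", "idx", or "" (direct)."""
    text = operand.replace(" ", "").upper()
    if text.startswith("("):
        # Any indirect form; an assumption for the ($00,X) case.
        return "ptr"
    if re.search(r",[XY]$", text):
        return "idx"
    return ""


for op in ["$00", "$00,X", "($00),Y", "($00,X)"]:
    print(op, repr(access_kind(op)))
```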
When editing local variables, the data grid generally has the input
focus, so hitting Enter doesn't close the dialog. Rather than play
games with the focus, just take Ctrl+Enter as a shortcut to close
the dialog (same as notes and long comments).
(I found myself hitting Ctrl+Enter automatically, and being annoyed
when it didn't work, so I figured I'd make it official.)
The "show cycle counts in comments" setting is the only one that
affects both the on-screen display and generated source code. This
felt a little weird, so it's now two independent settings. This
also provided an opportunity to move it to the initial tab, so it's
easier to toggle on and off. Overall it feels less confusing to have
two settings for essentially the same thing, because code generation
is distinct from everything else.
The "spaces between bytes" setting was moved to the Display Format
tab, which seems a better fit.
Documentation and tutorial have been updated.
Also did a little bit of cleanup in EditAppSettings.
- Added SOS parameter block formatting.
- Normalized SOS call names to values in SOS Reference Manual.
- Added SOS call error code constants.
- (from robjustice) Added more to A3-IO.sym65.
Also, rearranged the ProDOS code slightly.
(issue #85)
Inline BRK instructions have a problem similar to the one fixed
for JSR/JSL back in 63d7a487, but the same fix won't work because
JSR/JSL are assumed "continue", while BRK is assumed "no-continue",
and must therefore set a no-no-continue flag. For now, we just
re-evaluate the BRK on every visit to the code.
A review of the previous fix revealed an opportunity to use the
NoContinueScript flag on subsequent visits to improve consistency.
We weren't altering the status flags after a BRK because of the
assumption that a BRK was a crash. For an inline BRK, such as a SOS
call, execution continues. We need to mark NVZC indeterminate or
we may incorrectly handle conditional branches that follow.
The BRK instruction now uses the same flag updater as JSR, since it's
effectively a subroutine call to unknown code. If execution doesn't
continue across the BRK then the flags don't matter.
Updated 20182-extension-scripts to exercise this.
Added explicit widths to the 6502 vectors.
Two changes to Apple II hi-res visualization:
(1) Allow the row stride to be any value >= 1. This is useful
when data is stored in column-major order, i.e. it's a two-byte-wide
shape, with all of the data for the first column stored before the
data for the second column. (Set the row stride to 1, and the
column stride to the bitmap height.)
(2) Modify the layout of grids (sprite sheets and fonts), so that
we're closer to square when the item count is low. Otherwise the
thumbnail just looks like a dashed line. (This one is strictly
cosmetic.)
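For item (1), the effect of the two strides can be shown with a
short Python sketch. The function name is hypothetical; it only
demonstrates the arithmetic, assuming the byte for a given column
and row is found by scaling each index by its stride.

```python
def byte_offset(col: int, row: int, col_stride: int, row_stride: int) -> int:
    """Offset of the byte at (col, row) from the start of the bitmap data."""
    return col * col_stride + row * row_stride


# Two-byte-wide, 8-row shape stored column-major:
# row stride 1, column stride equal to the bitmap height.
height = 8
for row in range(3):
    print([byte_offset(col, row, height, 1) for col in range(2)])
```

With a conventional row-major layout of the same shape, the column
stride would be 1 and the row stride 2.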
Added a bunch of Applesoft entry points, and updated the F8ROM
definitions.
Added a visualizer for Applesoft shape table shapes that are not part
of an actual shape table.
It's fairly common to want to Find something and then jump back to
where you were. The Find command now adds the current position to
the nav stack for the initial search. If you Find Next, the nav
stack is not altered.
This experimental feature applied platform symbols to the project,
setting labels where the platform symbol's address matched an internal
address. The feature now applies project symbols as well, and has
been renamed to "apply external symbols".
We now report the number of labels set.
First, make the per-segment comments and notes optional.
Second, add an "offset segment by $0100" feature that tries to shift
each segment forward 256 bytes. Doing so avoids potential ambiguity
with direct page locations.
The 20212-reloc-data test no longer has the per-segment comments.
The "smart" PLP handler tries to recover the flags from an earlier
PHP. The non-smart version just marks all the flags as indeterminate.
This doesn't work well on the 65816 in native mode, because having
the M/X flags in an indeterminate state is rarely what you want.
Code rarely uses PLP to reset the flags to a specific state, preferring
explicit SEP/REP. The analyzer is more likely to get the correct
answer by simply leaving the flags in their prior state.
A test case has been added to 20052-branches-and-banks, which now has
"smart PLP" disabled.
Long operands, such as strings and bulk data, can span multiple lines.
SourceGen wraps them at 64 characters, which is fine for assembly
output but occasionally annoying on screen: if the operand column is
wide enough to show the entire value, the comment column is pushed
pretty far to the right.
This change makes the width configurable, as 32/48/64 characters,
with a pop-up in app settings.
The assemblers are all wired to 64 characters, though we could make
this configurable as well with an assembler-specific setting.
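A minimal Python sketch of the idea, assuming simple fixed-width
chunking. The function name is hypothetical, and real operand
wrapping presumably has to account for delimiters and escapes, which
this ignores.

```python
def wrap_operand(text: str, width: int = 64) -> list:
    """Split a long operand into lines of at most `width` characters."""
    if width not in (32, 48, 64):
        raise ValueError("width must be 32, 48, or 64")
    return [text[i:i + width] for i in range(0, len(text), width)]


for line in wrap_operand("0123456789abcdef" * 6, 48):
    print(line)
```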
Some things have moved around a bit in app settings. The Asm Config
tab now comes last. Having it sandwiched in the middle of tabs that
altered the on-screen display didn't make much sense. The Display
Format is now explicitly for opcodes and operands, and is split into
two columns. The left column is managed by the "quick set" feature,
the right column is independent.
Another chapter in the never-ending AppDomain security saga.
If a computer goes to sleep while SourceGen is running with a project
open, life gets confusing when the system wakes up. The keep-alive
timer fires and a ping is sent to the remote AppDomain, successfully.
At the same time, the lease expires on the remote side, and the objects
are discarded (apparently without bothering to query the ILease object).
This failure mode is 100% repeatable.
Since we can't prevent sandbox objects from disappearing, we have to
detect and recover from the problem. Fortunately we don't keep any
necessary state on the plugin side, so we can just tear the whole
thing down and recreate it.
The various methods in ScriptManager now do a "health check" before
making calls into the plugin AppDomain. If the ping attempt fails,
the AppDomain is "rebooted" by destroying it and creating a new one,
reloading all plugins that were in there before. The plugin binaries
*should* still be in the PluginDllCache directory since the ping failure
was due to objects being discarded, not AppDomain shutdown, and Windows
doesn't let you mess with files that hold executable code.
A new "reboot security sandbox" option has been added to the DEBUG
menu to facilitate testing.
The PluginManager's Ping() method gets called more often, but not to
the extent that performance will be affected.
This change also adds a finalizer to DisasmProject, since we're relying
on it to shut down the ScriptManager, and it's relying on callers to
invoke its cleanup function. The finalizer throws an assertion if the
cleanup function doesn't get called.
(Issue #82)
Changed "Use Keep-Alive Hack" to "Disable Keep-Alive Hack" to emphasize
that it defaults to enabled. Added a menu item for "Disable Security
Sandbox". Added a warning to both that tells the user that they must
reopen the current project for the change to take effect.
Note that neither of these is persisted in app settings.
When formatting a table of 16-bit addresses in 65816 code, the bank
byte was always being set to zero. However, for "JMP (addr,X)", the
program bank is used. We now default to that behavior.
The choice can be overridden as before (select 24-bit addresses with a
constant value for the bank byte).
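As an illustration only, the following Python sketch decodes a table
of little-endian 16-bit entries and places each one in a caller-
supplied bank, which is the program bank for the "JMP (addr,X)"
case. The function name and the sample values are made up.

```python
def decode_address_table(data: bytes, bank: int) -> list:
    """Decode little-endian 16-bit entries into 24-bit addresses in `bank`."""
    if len(data) % 2 != 0:
        raise ValueError("table length must be even")
    return [(bank << 16) | data[i] | (data[i + 1] << 8)
            for i in range(0, len(data), 2)]


# Hypothetical table located in bank $02.
for addr in decode_address_table(bytes([0x00, 0x20, 0x34, 0x12]), 0x02):
    print(f"${addr:06x}")
```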
The old values were pretty optimistic in terms of the length of labels.
Short labels in all caps are very retro but sort of annoying to read,
so most disassemblies use longer ones. The new defaults are more
accommodating for the way labels are actually used.
The new name is more indicative of the purpose of the directory.
Updated the docs to point out that you can delete the contents any
time you want, so long as SourceGen isn't running at the time.
Also, change the default column widths for the exporter.
Don't allow comments to be set in the middle of an instruction or
multi-byte data item. The subsequent partial update confuses the
line list generator.
Change order of note/long-comment/comment to match display.
Expand SGEC to include long comments and notes. These are serialized
in JavaScript form.
SGEC now accepts addresses and relative position deltas. Exported
content uses addresses, and can be configured for deltas.
This is still an "experimental" feature, but it's getting expanded
a bit. The implementation now lives in its own class. An "export"
feature that generates SGEC data has been added. The file extension
has been changed from ".sgec" to ".txt" to make it simpler to edit
under Windows.
On an Apple IIgs, the memory-mapped I/O locations are actually in
bank $e0, shadow-copied to bank $00. This adds a copy of the
relevant definitions from Cxxx-IO.sym65, with the addresses in bank
$e0 and "_GS" appended to the labels.
This is now included by default for the Apple IIgs system definitions.
(I thought about just adding them to Cxxx-IO.sym65, but then they
pollute the namespace for 8-bit systems. Stripping them out at run
time got a little complicated because the platform symbols are only
loaded once, and we'd have to reload them every time the CPU definition
changed. Further, there are a few aliases provided as constants, and
constants are allowed to be 32 bits on all systems, so those can't be
stripped. Rather than inventing a new kind of definition I figured it was
just easier to have a second file. Maintenance shouldn't be too taxing,
as definitions for 40-year-old machines don't change all that often.)
(I also thought about trying to make the address mirroring stuff work
for me here, but that would result in accesses being made to the
canonical address with an offset of +$e00000, which looks awful.)
The Disk ][ I/O locations are generally accessed as an offset, using
something like "LDA $C08n,X". However, the range from $C080-C08F is
already used for the language card in slot 0. SourceGen doesn't have a
way to distinguish between indexed and direct accesses, and even if
it did there's no way to separate one peripheral card from another
without knowing the contents of the CPU register.
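The overlap is easy to see with the address arithmetic. In this
Python sketch (hypothetical function name), the instruction operand
is always $C080 plus a small offset, and the slot only shows up in
the X register as slot * 16.

```python
def slot_io_address(slot: int, offset: int) -> int:
    """Effective address of "LDA $C08n,X" with X = slot * 16."""
    if not 0 <= slot <= 7 or not 0 <= offset <= 0x0F:
        raise ValueError("slot must be 0-7 and offset 0-15")
    return 0xC080 + (slot << 4) + offset


# Same operand ($C08C), different hardware depending on X.
print(f"${slot_io_address(0, 0x0C):04x}")  # slot 0 range
print(f"${slot_io_address(6, 0x0C):04x}")  # slot 6, a common Disk ][ slot
```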
As a workaround, the Disk ][ definitions are now in a separate symbol
file. When loaded, the definitions replace the base slot 0 equates.
I figure Disk ][ accesses are more common than language card
manipulation, so I'm making it a default for new projects. Existing
projects that reference the Disk ][ symbols (which existed, but as
constants) will need to be updated to include the new .sym65.
When we have relocation data available, the code currently skips the
process of matching an address with a label for a PEA instruction when
the instruction in question doesn't have reloc data. This does a
great job of separating code that pushes parts of addresses from code
that pushes constants.
This change expands the behavior to exclude instructions with 16-bit
address operands that use the Data Bank Register, e.g. "LDA abs"
and "LDA abs,X". This is particularly useful for code that accesses
structured data using the operand as the structure offset, e.g.
"LDX addr" / "LDA $0000,X"
The 20212-reloc-data test has been updated to check the behavior.
Add 20222-data-bank to regression test suite. This exercises handling
of 16-bit operands with inter- and intra-bank references, and tests the
smartness in "smart PLB".
Also, update a couple of older tests that broke because the DBR is no
longer always the same as the PBR. This just required adding "B=K"
in a few places to restore the original output.
If code accesses the high/low parts of a 32-bit address value with
no label, it auto-generates labels for addr+2 and addr. The reloc
handler was replacing the unformatted bytes with a single multi-byte
format, hiding the label at addr+2.
The easy fix is to have the reloc data handler skip the entry. This
is less useful than other approaches, but much simpler.
Added a test to 20212-reloc-data.
Implemented "smart" PLB handling. If we see PHK/PLB, or 8-bit
LDA imm/PHA/PLB, we create a data bank change item. The feature
can be disabled with a project property.
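The two idioms can be sketched as a byte-pattern scan in Python. This
is a simplification with hypothetical names: it looks at raw bytes,
so it can be fooled by operand bytes that happen to match, and it
does not check that the accumulator is 8 bits wide. A real analyzer
would work from decoded instructions and tracked status flags.

```python
PHK, PLB, PHA, LDA_IMM = 0x4B, 0xAB, 0x48, 0xA9


def find_plb_banks(code: bytes, program_bank: int) -> list:
    """Return (offset_of_PLB, bank) pairs for PHK/PLB and LDA #imm/PHA/PLB."""
    found = []
    for i in range(len(code)):
        if code[i:i + 2] == bytes([PHK, PLB]):
            # PHK pushes the program bank, so B becomes K.
            found.append((i + 1, program_bank))
        elif (code[i] == LDA_IMM and i + 3 < len(code)
              and code[i + 2] == PHA and code[i + 3] == PLB):
            # 8-bit immediate load supplies the new bank value.
            found.append((i + 3, code[i + 1]))
    return found


sample = bytes([0x4B, 0xAB, 0xA9, 0x02, 0x48, 0xAB, 0x60])
print(find_plb_banks(sample, 0x01))
```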
Added a "fake" assembler pseudo-op for DBR changes. Display entries
in line list.
Added entry to double-click handler so that you can double-click on
a PLB instruction operand to open the data bank editor.
Changed basic data item from an "extended enum" to a class, so we can
keep track of where things come from (useful for the display list).
Finished edit dialog. Added serialization to project file.
On the 65816, 16-bit data access instructions (e.g. LDA abs) are
expanded to 24 bits by merging in the Data Bank Register (B). The
value of the register is difficult to determine via static analysis,
so we need a way to annotate the disassembly with the correct value.
Without this, the mapping of address to file offset will sometimes
be incorrect.
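The merge itself is simple; the hard part is knowing B. A Python
sketch with a hypothetical function name shows how the same 16-bit
operand lands in different places depending on the register value.

```python
def data_address(dbr: int, operand: int) -> int:
    """Form the 24-bit address for a 16-bit data operand such as LDA abs."""
    return ((dbr & 0xFF) << 16) | (operand & 0xFFFF)


# The operand $1234 resolves differently for B=$00 and B=$02.
print(f"${data_address(0x00, 0x1234):06x}")
print(f"${data_address(0x02, 0x1234):06x}")
```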
This change adds the basic data structures and "fixup" function, a
functional but incomplete editor, and source for a new test case.
The Visual Studio performance profiler showed the FormatDescriptor
equality test being called quite a lot. The test was vs. null, so
a simple change from "==" to "is" improved performance dramatically.
Fixing the underlying issue with a better data structure is still
important, but this provided a big boost with little effort.
The test wasn't correctly excluding instructions, so it was possible
to create a situation where a two-byte data item had an instruction
starting in the second byte.
We also weren't checking the length of the instruction to ensure that
it was wider than the reloc data. This could get weird for an
immediate constant when the M/X flags are wrong. When in doubt, don't
overwrite.
The decision of how to handle indeterminate M/X flag values is made in
StatusFlags. This provides consistent behavior throughout the app.
This was being done for M/X but not for E.
This change also renames the M/X tests, prefixing them with "Is" to
emphasize that they are boolean rather than tri-state.
There should be no change in behavior from this.
This test exercises the relocation data feature. The test file is
generated from a multi-segment OMF file that was hex-edited to have
specific attributes (see 20212-reloc-data-lnk.S for instructions).
The test also serves as a way to exercise the OMF converter.
Also, implement the Bank Relative flag.
The Absolute Indirect and Absolute Indirect Long addressing modes
(e.g. "JMP (addr)" and "JMP [addr]") are 16-bit values in bank 0.
The code analyzer was placing them in the program bank, which
meant the wrong symbol was being used.
Also, tweak some docs.
Works well for things like jump tables. Seeing a bunch of these
scattered in a chunk of data is a decent signal that it's actually
code.
In a bold move, we now exclude PEA operands from auto-label gen when
they don't have relocation data. This is very useful for things
like Int2Hex for which constants are typically pushed with PEA.
Reworked the "use reloc data" setting so it defaults to false and is
explicitly set to true when converting OMF. This provides a minor
optimization since we now check the boolean and skip doing a lookup
in an empty table.
Similar to the ProDOS 8 formatter, but slightly more complex due
to the variable-length parameter block layout.
Also, added Orca shell call numbers to the list of constants.
This was a relatively lightweight change to confirm the usefulness
of relocation data. The results were very positive.
The relatively superficial integration of the data into the data
analysis process causes some problems, e.g. the cross-reference table
entries show an offset because the code analyzer's computed operand
offset doesn't match the value of the label. The feature should be
considered experimental.
The feature can be enabled or disabled with a project property. The
results were sufficiently useful and non-annoying to make the setting
enabled by default.
Code generated for 64tass was incorrect for JSR/JMP to a location
outside the file bounds. A test added to 20052-branches-and-banks
revealed an issue with cc65 generation as well.
A "cooked" form of the relocation data is added to the project, for
use during data analysis.
Also, changed the data grids in the segment viewer to allow multi-
select, so users can copy & paste the contents.
We now put a code hint on the JML instruction in each jump table
entry. This is necessary to ensure that the target address is
recognized as code, since a dynamic segment won't otherwise be
referenced.
Also, fiddle with the note/comment formatting some more.
Two basic problems:
(1) cc65, being a one-pass assembler, can't tell if a forward-referenced
label is 16-bit or 24-bit. If the operand is potentially ambiguous,
such as "LDA label", we need to add an operand width disambiguator.
(The existing tests managed to only do backward references.)
(2) 64tass wants the labels on JMP/JSR absolute operands to have 24-bit
values that match the current program bank. This is the opposite of
cc65, which requires 16-bit values. We need to distinguish PBR vs.
DBR instructions (i.e. "LDA abs" vs. "JMP abs") and handle them
differently when formatting for "Common".
Merlin32 doesn't care, and ACME doesn't work at all, so neither of
those needed updating.
The 20052-branches-and-banks test was expanded to cover the problematic
cases.
The handful of 6502-based Atari coin-op systems were very different
from each other, so having a dedicated entry doesn't make sense.
Also, enable word-wrap in the New Project text box that holds the
system description.
The GS/OS loader initializes the calls with JSLs to a loader entry
point, and replaces them with JMLs to code in dynamic segments when
the segments are loaded. Since we have all the segments loaded at
once, we can just rewrite them to be JMLs immediately.
Changed bank-start comments to notes, added a summary to the top-of-file
comment.
Also, fixed a bug where the app settings dialog wasn't identifying
display settings as a preset for 64tass and cc65.
Generate multiple .ORG directives for segments that span multiple
banks. Some assemblers don't like it when things cross. This is
pretty rare (Cryllan Mission is an example).
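A rough Python sketch of the splitting step, assuming one directive
is wanted per 64KB bank touched. The function name is hypothetical
and the sample segment is invented.

```python
def split_at_banks(start: int, length: int) -> list:
    """Split [start, start+length) into (address, length) chunks per bank."""
    chunks = []
    addr, remaining = start, length
    while remaining > 0:
        room = 0x10000 - (addr & 0xFFFF)  # bytes left in the current bank
        size = min(room, remaining)
        chunks.append((addr, size))
        addr += size
        remaining -= size
    return chunks


# A $200-byte segment that straddles the bank $01/$02 boundary.
for addr, size in split_at_banks(0x01FF00, 0x200):
    print(f"org ${addr:06x}  ; {size} bytes")
```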
Conversion of OMF Load files to a data/project pair is generally
working. The 65816 source code generators need some work though.