1
0
mirror of https://github.com/fadden/6502bench.git synced 2024-06-12 23:29:32 +00:00
Commit Graph

42 Commits

Author SHA1 Message Date
Andy McFadden
4e5c34f457 Binary includes
This adds a new data format option, "binary include", that takes a
filename operand.  When assembly sources are generated, the section
of file is replaced with an appropriate pseudo-op, and binary files
are generated that hold the file contents.  This is a convenient way
to remove large binary blobs, such as music or sound samples, that
aren't useful to have in text form in the sources.

Partial pathnames are allowed, so you can output a sound blob to
"sounds/blather.bin".  For safety reasons, we don't allow the files
to be created above the project directory, and existing files will
only be overwritten if they have a matching length (so you don't
accidentally stomp on your project file).

The files are not currently shown in the GenAsm dialog, which lets
you see a preview of the generated sources.  The hex dump tool
can do this for the (presumably rare) situations where it's useful.

A new regression test, 20300-binary-include, has been added.  The
pseudo-op name can be overridden on-screen in the settings.

We don't currently do anything new for text/HTML exports.  It might
be useful to generate an optional appendix with a hex dump of the
excised sections.

(issue #144)
2024-05-31 14:22:39 -07:00
Andy McFadden
2f603b69c3 Fix format overwrite test
A SOS binary had situation were an instruction was being reformatted
as inline data by an extension script.  The code that was supposed
to detect and prevent this had a bug that caused it to only look at
the first byte.  The resulting confusion caused the program to crash.

(issue #150)
2023-05-10 10:07:15 -07:00
Andy McFadden
cb114be0f6 Add "uninitialized data" format type
This allows regions that hold variable storage to be marked as data
that is initialized by the program before it is used.  Previously
the choices were to treat it as bulk data (initialized) or junk
(totally unused), neither of which are correct.

This is functionally equivalent to "junk" as far as source code
generation is concerned (though it doesn't have to be).

For the code/data/junk counter, uninitialized data is counted as
junk, because it technically does not need to be part of the binary.
2021-10-13 15:05:07 -07:00
Andy McFadden
e6c5c7f8df ORG rework, part 6
Added support for non-addressable regions, which are useful for things
like file headers stripped out by the system loader, or chunks that
get loaded into non-addressable graphics RAM.  Regions are specified
with the "NA" address value.  The code list displays the address field
greyed out, starting from zero (which is kind of handy if you want to
know the relative offset within the region).

Putting labels in non-addressable regions doesn't make sense, but
symbol resolution is complicated enough that we really only have two
options: ignore the labels entirely, or allow them but warn of their
presence.  The problem isn't so much the label, which you could
legitimately want to access from an extension script, but rather the
references to them from code or data.  So we keep the label and add a
warning to the Messages list when we see a reference.

Moved NON_ADDR constants to Address class.  AddressMap now has a copy.
This is awkward because Asm65 and CommonUtil don't share.

Updated the asm code generators to understand NON_ADDR, and reworked
the API so that Merlin and cc65 output is correct for nested regions.

Address region changes are now noted in the anattribs array, which
makes certain operations faster than checking the address map.  It
also fixes a failure to recognize mid-instruction region changes in
the code analyzer.

Tweaked handling of synthetic regions, which are non-addressable areas
generated by the linear address map traversal to fill in any "holes".
The address region editor now treats attempts to edit them as
creation of a new region.
2021-09-30 21:11:26 -07:00
Andy McFadden
2fed19ac47 ORG rework, part 5
Updated project file format to save the new map entries.

Tweaked appearance of .arend directives to show the .arstart address
in the operand field.  This makes it easier to match them up on screen.
Also, add a synthetic comment on auto-generated .arstart entries.

Added .arstart/.arend to the things that respond to Jump to Operand
(Ctrl+J).  Selecting one jumps to the other end.  (Well, it jumps
to the code nearest the other, which will do for now.)

Added a menu item to display a text rendering of the address map.
Helpful when things get complicated.

Modified the linear map iterator to return .arend items with the offset
of the last byte in the region, rather than the first byte of the
following region.  While the "exclusive end" approach is pretty
common, it caused problems when updating the line list, because it
meant that the .arend directives were outside the range of offsets
being updated (and, for directives at the end of the file, outside
the file itself).  This was painful to deal with for partial updates.
Changing this required some relatively subtle changes and annoyed some
of the debug assertions, such as the one where all Line items have
offsets that match the start of a line, but it's the cleaner approach.
2021-09-27 18:13:06 -07:00
Andy McFadden
39b7b20144 ORG rework, part 1
This is the first step toward changing the address region map from a
linear list to a hierarchy.  See issue #107 for the plan.

The AddressMap class has been rewritten to support the new approach.
The rest of the project has been updated to conform to the new API,
but feature-wise is unchanged.  While the map class supports
nested regions with explicit lengths, the rest of the application
still assumes a series of non-overlapping regions with "floating"
lengths.

The Set Address dialog is currently non-functional.

All of the output for cc65 changed because generation of segment
comments has been removed.  Some of the output for ACME changed as
well, because we no longer follow "* = addr" with a redundant
pseudopc statement.  ACME and 65tass have similar approaches to
placing things in memory, and so now have similar implementations.
2021-09-16 17:02:19 -07:00
Andy McFadden
ae4c90d838 Warn about multi-line start/stop tags
One of the most confusing things you can do is select a bunch of
lines and apply a code start tag (nee "code hint").  We now ask for
confirmation when applying start/stop hints to multiple lines.

(issue #89)
2020-10-15 17:18:49 -07:00
Andy McFadden
49f4017410 Rename "hints" to "analyzer tags"
Variables, types, and comments have been updated to reflect the new
naming scheme.

The project file serialization code is untouched, because the data
is output as serialized enumerated values.  Adding a string conversion
layer didn't seem worthwhile.

No changes in behavior.

(issue #89)
2020-10-15 16:55:29 -07:00
Andy McFadden
728966f8d2 Add W65C02S support, part 4 (of 4)
Added 20233-rockwell unit test to exercise the new opcodes.  Nothing
too fancy.

Fixed branch offset computation.

(issue #87)
2020-10-11 18:43:00 -07:00
Andy McFadden
34ba47e71d Add W65C02S support, part 3
Modified the asm source generators and on-screen display to show the
DP arg for BBR/BBS as hex.  The instructions are otherwise treated
as relative branches, e.g. the DP arg doesn't get factored into the
cross-reference table.

ACME/cc65 put the bit number in the mnemonic, 64tass wants it to be
in the first argument, and Merlin32 wants nothing to do with any of
this because it's incompatible with the 65816.

Added an "all ops" test for W65C02.
2020-10-11 14:35:17 -07:00
Andy McFadden
2ec2917da5 Fix inline BRK no-no-continue flag
Inline BRK instructions have a problem similar to the one fixed
for JSR/JSL back in 63d7a487, but the same fix won't work because
JSR/JSL are assumed "continue", while BRK is assumed "no-continue",
and must therefore set a no-no-continue flag.  For now, we just
re-evaluate the BRK on every visit to the code.

A review of the previous fix revealed an opportunity to use the
NoContinueScript flag on subsequent visits to improve consistency.
2020-08-22 13:47:52 -07:00
Andy McFadden
2a65457e19 Default "smart PLP handling" to off
The feature is mildly broken and, frankly, unreliable on its best
day.  Default it to off for new projects.
2020-07-24 21:38:45 -07:00
Andy McFadden
b1340ec2ac Tweaks 2020-07-22 10:53:54 -07:00
Andy McFadden
288a857e47 Change PLP handling
The "smart" PLP handler tries to recover the flags from an earlier
PHP.  The non-smart version just marks all the flags as indeterminate.
This doesn't work well on the 65816 in native mode, because having
the M/X flags in an indeterminate state is rarely what you want.

Code rarely uses PLP to reset the flags to a specific state, preferring
explicit SEP/REP.  The analyzer is more likely to get the correct
answer by simply leaving the flags in their prior state.

A test case has been added to 20052-branches-and-banks, which now has
"smart PLP" disabled.
2020-07-20 11:54:00 -07:00
Andy McFadden
cc6ebaffc5 Update relocation data handling
When we have relocation data available, the code currently skips the
process of matching an address with a label for a PEA instruction when
the instruction in question doesn't have reloc data.  This does a
great job of separating code that pushes parts of addresses from code
that pushes constants.

This change expands the behavior to exclude instructions with 16-bit
address operands that use the Data Bank Register, e.g. "LDA abs"
and "LDA abs,X".  This is particularly useful for code that accesses
structured data using the operand as the structure offset, e.g.
"LDX addr" / "LDA $0000,X"

The 20212-reloc-data test has been updated to check the behavior.
2020-07-10 17:41:38 -07:00
Andy McFadden
da38bc0db8 Data Bank Register management, part 6 (of 6)
Add 20222-data-bank to regression test suite.  This exercises handling
of 16-bit operands with inter- and intra-bank references, and tests the
smartness in "smart PLB".

Also, update a couple of older tests that broke because the DBR is no
longer always the same as the PBR.  This just required adding "B=K"
in a few places to restore the original output.
2020-07-10 15:53:43 -07:00
Andy McFadden
2a2aadffec Data Bank Register management, part 5
Update documentation.  Add some information about OMF relocation
data as well.

Fix bug in B=K handling.
2020-07-10 13:29:36 -07:00
Andy McFadden
0929077fda Data Bank Register management, part 4
Implemented "smart" PLB handling.  If we see PHK/PLB, or 8-bit
LDA imm/PHA/PLB, we create a data bank change item.  The feature
can be disabled with a project property.
2020-07-09 19:42:31 -07:00
Andy McFadden
ee58d9e803 Data Bank Register management, part 3
Added a "fake" assembler pseudo-op for DBR changes.  Display entries
in line list.

Added entry to double-click handler so that you can double-click on
a PLB instruction operand to open the data bank editor.
2020-07-09 16:52:23 -07:00
Andy McFadden
973d162edb Data Bank Register management, part 2
Changed basic data item from an "extended enum" to a class, so we can
keep track of where things come from (useful for the display list).

Finished edit dialog.  Added serialization to project file.
2020-07-09 11:14:55 -07:00
Andy McFadden
18e6951f17 Add Data Bank Register management, part 1
On the 65816, 16-bit data access instructions (e.g. LDA abs) are
expanded to 24 bits by merging in the Data Bank Register (B).  The
value of the register is difficult to determine via static analysis,
so we need a way to annotate the disassembly with the correct value.
Without this, the mapping of address to file offset will sometimes
be incorrect.

This change adds the basic data structures and "fixup" function, a
functional but incomplete editor, and source for a new test case.
2020-07-08 17:56:27 -07:00
Andy McFadden
8d291ba21e Fix bank for AbsInd and AbsIndLong addressing
The Absolute Indirect and Absolute Indirect Long addressing modes
(e.g. "JMP (addr)" and "JMP [addr]") are 16-bit values in bank 0.
The code analyzer was placing them in the program bank, which
meant the wrong symbol was being used.

Also, tweak some docs.
2020-07-04 15:03:23 -07:00
Andy McFadden
d58b747571 Use relocation data to format instruction operands
This was a relatively lightweight change to confirm the usefulness
of relocation data.  The results were very positive.

The relatively superficial integration of the data into the data
analysis process causes some problems, e.g. the cross-reference table
entries show an offset because the code analyzer's computed operand
offset doesn't match the value of the label.  The feature should be
considered experimental

The feature can be enabled or disabled with a project property.  The
results were sufficiently useful and non-annoying to make the setting
enabled by default.
2020-07-03 17:58:41 -07:00
Andy McFadden
63d7a48705 Fix bug in inline JSR/JSL no-continue handling
JSR/JSL calls with inline data have the option of reporting that
they don't continue, which causes the code analyzer to treat them
as JMPs instead.  There was a bug that was causing the no-continue
flag to be lost in certain circumstances.

The code now explicitly records the plugin's response in an Anattrib
flag.  Test 2022-extension-scripts has been updated with a test case
that exercises this situation.
2020-05-08 17:41:26 -07:00
Andy McFadden
0bbb307d4e Correct handling of no-op .ORG statements
These were being overlooked because they didn't actually cause
anything to happen (a no-op .ORG sets the address to what it would
already have been).  The assembly source generator works in a way
that causes them to be skipped, so everybody was happy.

This seemed like the sort of thing that was likely to cause problems
down the road, however, so we now split regions correctly when a
no-op .ORG is encountered.  This affects the uncategorized data
analyzer and selection grouping.

This changed the behavior of the 2004-numeric-types test, which was
visibly weird in the UI but generated correct output.

Added the 2024-ui-edge-cases test to provide a place to exercise
edge cases when testing the UI by hand.  It has some value for the
automated regression test, so it's included there.

Also, changed the AddressMapEntry objects to be immutable.  This
is handy when passing lists of them around.
2020-02-28 14:49:18 -08:00
Andy McFadden
4bf1ab2799 Tweak visualizer interface
Report visualization generation errors through an explicit
IApplication interface, instead of pulling messages out of the
DebugLog stream.

Declare that GetVisGenDescrs() is only called when the plugin is in
the "prepared" state, so that plugins can taylor the set based on
the contents of the file.  (This could be used to set min/max on
the "offset" entries, but I want special handling for offsets, so
we might as well set it later.)
2019-12-05 10:29:00 -08:00
Andy McFadden
365864ccdf More progress on visualization
Implemented Apple II hi-res bitmap conversion.  Supports B&W and
color.  Uses essentially the same algorithm as CiderPress.

Experimented with displaying non-text items in ListView.  I assumed
it would work, since it's the sort of thing WPF is designed to do,
but it's always wise to approach with caution.  Visualization Sets
now show a 64x64 button as a placeholder for the eventual thumbnail.

Some things were being flaky, which turned out to be because I
wasn't Prepare()ing the plugins before using them from Edit
Visualization.  To make this a deterministic failure I added an
Unprepare() call that tells the plugin that we're all done.

NOTE: this breaks all existing plugins.
2019-11-30 18:02:03 -08:00
Andy McFadden
b6e571afc2 Correctly handle embedded instruction edge case
This began with a change to support "BRK <operand>" in cc65.  The
assembler only supports this for 65816 projects, so we detect that
and enable it when available.

While fiddling with some test code an assertion fired.  This
revealed a minor issue in the code analyzer: when overwriting inline
data with instructions, we weren't resetting the format descriptor.

The code that exercises it, which requires two-byte BRKs and an
inline BRK handler in an extension script, has been added to test
2022-extension-scripts.

The new regression test revealed a flaw in the 64tass code
generator's character encoding scanner that caused it to hang.
Fixed.
2019-10-19 17:28:45 -07:00
Andy McFadden
716dce5f28 Pass operand to extension script JSR/JSL handlers
Sort of silly to have every handler immediately pull the operand out
of the file data.  (This is arguably less efficient, since we now
have to serialize the argument across the AppDomain boundary, but
we should be okay spending a few extra nanoseconds here.)
2019-10-17 13:15:25 -07:00
Andy McFadden
dfd5bcab1b Optionally treat BRKs as two-byte instructions
Early data sheets listed BRK as one byte, but RTI after a BRK skips
the following byte, effectively making BRK a 2-byte instruction.
Sometimes, such as when diassembling Apple /// SOS code, it's handy
to treat it that way explicitly.

This change makes two-byte BRKs optional, controlled by a checkbox
in the project settings.  In the system definitions it defaults to
true for Apple ///, false for all others.

ACME doesn't allow BRK to have an arg, and cc65 only allows it for
65816 code (?), so it's emitted as a hex blob for those assemblers.
Anyone wishing to target those assemblers should stick to 1-byte mode.

Extension scripts have to switch between formatting one byte of
inline data and formatting an instruction with a one-byte operand.
A helper function has been added to the plugin Util class.

To get some regression test coverage, 2022-extension-scripts has
been configured to use two-byte BRK.

Also, added/corrected some SOS constants.

See also issue #44.
2019-10-09 14:55:56 -07:00
Andy McFadden
8c87ce3004 Check formatted string structure at load time
If we have a bug, or somebody edits the project file manually, we
can end up with a very wrong string, such as a null-terminated
string that isn't, or a DCI string that has a mix of high and low
ASCII from start to finish.  We now check all incoming strings for
validity, and discard any that fail the test.  The verification
code is shared with the extension script inline data formatter.

Also, added a comment to an F8-ROM symbol I stumbled over.
2019-10-06 17:07:07 -07:00
Andy McFadden
28eafef27c Expand the set of things SetInlineDataFormat accepts
Extension scripts (a/k/a "plugins") can now apply any data format
supported by FormatDescriptor to inline data.  In particular, it can
now handle variable-length inline strings.  The code analyzer
verifies the string structure (e.g. null-terminated strings have
exactly one null byte, at the very end).

Added PluginException to carry an exception back to the plugin code,
for occasions when they're doing something so wrong that we just
want to smack them.

Added test 2022-extension-scripts to exercise the feature.
2019-10-05 19:51:34 -07:00
Andy McFadden
1ddf4bed48 Fix code tracing bug
If you set things up just right, it's possible for flag status
changes to fail to get merged.

Added a regression test to 1003-flags-and-branches.

Also, tweaked the instruction operand editor to be a bit smoother
from the keyboard: added alt-key shortcuts, and put the focus on the
OK button after creating/editing a label so you can just hit the
return key twice.
2019-09-17 14:38:16 -07:00
Andy McFadden
431ad94d95 Make "smart" PLP handling optional
We try to be clever with PHP/PLP, but sometimes we get it wrong.  If
we get it wrong a lot, we want to turn it off.  Now we can.
2019-09-02 15:57:59 -07:00
Andy McFadden
15d26c9ebd Don't do plugin interface checks during code analysis
The plugin objects are MarshalByRefObject stubs, which means they
don't actually implement the interfaces we're checking for.  There's
some additional overhead to do the interface check.  We can avoid
it by doing the interface queries during initialization, and just
checking some bit flags later on.

Also, in the extension script info window, show a list of
implemented interfaces.
2019-08-10 17:16:39 -07:00
Andy McFadden
0d0854bda7 Change the way string formats are defined
We used to use type="String", with the sub-type indicating whether
the string was null-terminated, prefixed with a length, or whatever.
This didn't leave much room for specifying a character encoding,
which is orthogonal to the sub-type.

What we actually want is to have the type specify the string type,
and then have the sub-type determine the character encoding.  These
sub-types can also be used with the Numeric type to specify the
encoding of character operands.

This change updates the enum definitions and the various bits of
code that use them, but does not add any code for working with
non-ASCII character encodings.

The project file version number was incremented to 2, since the new
FormatDescriptor serialization is mildly incompatible with the old.
(Won't explode, but it'll post a complaint and ignore the stuff
it doesn't recognize.)

While I was at it, I finished removing DciReverse.  It's still part
of the 2005-string-types regression test, which currently fails
because the generated source doesn't match.
2019-08-07 16:19:13 -07:00
Andy McFadden
98914e9f80 Treat BRK as a 1-byte instruction
The 65816 definition makes it a two-byte instruction, like COP.  On
the 6502 it acted like a two-byte instruction, but in practice very
few assemblers treat it that way.  Very few humans, for that matter.
So it's now treated as a single byte instruction, with the following
byte encoded as a data value.
2019-08-02 17:21:50 -07:00
Andy McFadden
0616e4e4a4 Define interfaces for inline call handlers and BRK
Instead of providing no-op CheckJsr/CheckJsl, plugins now declare
which calls they support by defining interfaces on the plugin class.

I added a CheckBrk call for code like Apple /// SOS calls, which
use BRK as an OS call mechanism.  The formatting doesn't work quite
right yet because I've been treating BRK as a two-byte instruction.
Hardly anything else does, and I think it's time I stopped (but not
in this commit).

Note: THIS BREAKS ALL PLUGINS that use the inline JSR/JSL feature,
which is pretty much all of them.
2019-08-02 16:06:27 -07:00
Andy McFadden
c64f72d147 Move WPF code from SourceGenWPF to SourceGen 2019-07-20 13:28:37 -07:00
Andy McFadden
e3906e021b Move WinForms code to SourceGenWF 2019-07-20 13:02:54 -07:00
Andy McFadden
47b1363738 Add more detail to cross references
In the cross-reference table we now indicate whether the reference
source is doing a read, write, read-modify-write, branch, subroutine
call, is just referencing the address, or is part of the data.
2019-04-11 16:23:02 -07:00
Andy McFadden
2c6212404d Initial file commit 2018-09-28 10:05:11 -07:00