6502bench

mirror of https://github.com/fadden/6502bench.git synced 2024-05-31 22:41:37 +00:00

Author	SHA1	Message	Date
Andy McFadden	13dca8b78c	More tweaks to def sym editing If you edit an existing symbol, the "is the label unique" test will always false-positive match itself, so we have to explicitly handle that case. Dialogs like Edit Instruction Operand make things a bit more complicated because they don't flush results to the symbol table immediately, which means the symbol we pass into the Edit Def Symbol dialog to edit isn't necessarily the one we need to exclude from the label uniqueness test. The dialog was using the initial value as both "original" and "initial", which caused some issues. We now pass both values in. Also, removed some dead code identified by VS.	2022-03-02 08:13:46 -08:00
Andy McFadden	e6c5c7f8df	ORG rework, part 6 Added support for non-addressable regions, which are useful for things like file headers stripped out by the system loader, or chunks that get loaded into non-addressable graphics RAM. Regions are specified with the "NA" address value. The code list displays the address field greyed out, starting from zero (which is kind of handy if you want to know the relative offset within the region). Putting labels in non-addressable regions doesn't make sense, but symbol resolution is complicated enough that we really only have two options: ignore the labels entirely, or allow them but warn of their presence. The problem isn't so much the label, which you could legitimately want to access from an extension script, but rather the references to them from code or data. So we keep the label and add a warning to the Messages list when we see a reference. Moved NON_ADDR constants to Address class. AddressMap now has a copy. This is awkward because Asm65 and CommonUtil don't share. Updated the asm code generators to understand NON_ADDR, and reworked the API so that Merlin and cc65 output is correct for nested regions. Address region changes are now noted in the anattribs array, which makes certain operations faster than checking the address map. It also fixes a failure to recognize mid-instruction region changes in the code analyzer. Tweaked handling of synthetic regions, which are non-addressable areas generated by the linear address map traversal to fill in any "holes". The address region editor now treats attempts to edit them as creation of a new region.	2021-09-30 21:11:26 -07:00
Andy McFadden	39b7b20144	ORG rework, part 1 This is the first step toward changing the address region map from a linear list to a hierarchy. See issue #107 for the plan. The AddressMap class has been rewritten to support the new approach. The rest of the project has been updated to conform to the new API, but feature-wise is unchanged. While the map class supports nested regions with explicit lengths, the rest of the application still assumes a series of non-overlapping regions with "floating" lengths. The Set Address dialog is currently non-functional. All of the output for cc65 changed because generation of segment comments has been removed. Some of the output for ACME changed as well, because we no longer follow "* = addr" with a redundant pseudopc statement. ACME and 64tass have similar approaches to placing things in memory, and so now have similar implementations.	2021-09-16 17:02:19 -07:00
Andy McFadden	0dfa2326dd	Fix L1/L2 ASCII string editing The data operand editor determines low vs. high ASCII formatting by examining the first byte of string data. Unfortunately the test was broken, and for strings with a 1- or 2-byte length, was testing the length byte instead of the character data. This is now fixed. This also changes the way empty strings are handled. Before, they were allowed but not counted, so you couldn't create an empty string by itself, but could do it if it were part of a larger group. This was unnecessarily restrictive. Empty L1/L2/null-term strings are now allowed. This means that a buffer full of $00 can be formatted as a big pile of empty strings, which seems a bit ridiculous but there's no good reason to obstruct it. (issue #110)	2021-09-12 09:46:55 -07:00
Andy McFadden	ec2ad529c8	Remove a couple of faulty assertions One asserted unnecessarily, one should have been an if/then. Both were concerned with instruction operands being formatted with type "address".	2021-08-11 16:25:24 -07:00
Andy McFadden	3368182e14	Allow single-character DCI strings The DCI string format uses character values where the high bit of the last byte differs from the rest of the string. Usually all the high bits are clear except on the last byte, but SourceGen generally allows either polarity. This gets a little uncertain with single-character strings, because SourceGen can't auto-detect DCI very effectively. A series of bytes with the high bit set could be a single high-ASCII string or a series of single-byte DCI strings. The motivation for allowing them is C64 PETSCII. While ASCII allows "high ASCII" as an escape hatch, PETSCII doesn't have that option, so there's no way to mark the data as a character or a string. We still want to do a bit of screening, but if the user specifies a non-ASCII character set and the selected bytes have their high bits set, we want to just treat the whole set as 1-byte DCI. Some minor adjustments were needed for a couple of validity checks that expected longer strings. This adds some short DCI strings in different character sets to the char-encoding regression tests. (for issue #102)	2021-08-08 15:38:39 -07:00
Andy McFadden	d65ab59461	Don't reject strings with "invalid" characters When formatting one or more strings with the Edit Data Operand dialog, the code must determine which options to present. If the selected bytes appear to represent one or more null-terminated strings, that option is enabled in the UI. The "format recognizers" enforce some strict rules, e.g. null-terminated strings must end in $00, and also try to confirm that the data looks like a printable string. The algorithm rejects strings with "illegal" characters in them. This is simpler on some systems than others. For example, C64 PETSCII defines quite a few control characters in ways that make them useful for embedding in printable strings. The "recognizers" are only used by the operand edit feature, not as part of an automated string detector, so there's no real upside in overriding the user's desire to form a string with arbitrary bytes. This removes the quick rejection from the four recognizers (null-term, len8, len16, dci). It does not alter the high-level code, which still insists on a certain percentage of the string being printable; that may be worth revisiting as well. (issue #100)	2021-08-01 17:50:32 -07:00
Andy McFadden	cc6ebaffc5	Update relocation data handling When we have relocation data available, the code currently skips the process of matching an address with a label for a PEA instruction when the instruction in question doesn't have reloc data. This does a great job of separating code that pushes parts of addresses from code that pushes constants. This change expands the behavior to exclude instructions with 16-bit address operands that use the Data Bank Register, e.g. "LDA abs" and "LDA abs,X". This is particularly useful for code that accesses structured data using the operand as the structure offset, e.g. "LDX addr" / "LDA $0000,X" The 20212-reloc-data test has been updated to check the behavior.	2020-07-10 17:41:38 -07:00
Andy McFadden	6ce2cc0b58	Fix label-trampling bug in reloc data handler If code accesses the high/low parts of a 32-bit address value with no label, it auto-generates labels for addr+2 and addr. The reloc handler was replacing the unformatted bytes with a single multi-byte format, hiding the label at addr+2. The easy fix is to have the reloc data handler skip the entry. This is less useful than other approaches, but much simpler. Added a test to 20212-reloc-data.	2020-07-10 13:56:07 -07:00
Andy McFadden	44522dc2f2	Performance tweak The Visual Studio performance profiler showed the FormatDescriptor equality test being called quite a lot. The test was vs. null, so a simple change from "==" to "is" improved performance dramatically. Fixing the underlying issue with a better data structure is still important, but this provided a big boost with little effort.	2020-07-07 12:09:00 -07:00
Andy McFadden	f4fe3af050	Fix application of reloc info in data areas The test wasn't correctly excluding instructions, so it was possible to create a situation where a two-byte data item had an instruction starting in the second byte. We also weren't checking the length of the instruction to ensure that it was wider than the reloc data. This could get weird for an immediate constant when the M/X flags are wrong. When in doubt, don't overwrite.	2020-07-07 11:48:51 -07:00
Andy McFadden	4e70edc90c	Add 20212-reloc-data test This test exercises the relocation data feature. The test file is generated from a multi-segment OMF file that was hex-edited to have specific attributes (see 20212-reloc-data-lnk.S for instructions). The test also serves as a way to exercise the OMF converter. Also, implement the Bank Relative flag.	2020-07-05 17:17:44 -07:00
Andy McFadden	0fa77cba75	Apply relocation data to unformatted data Works well for things like jump tables. Seeing a bunch of these scattered in a chunk of data is a decent signal that it's actually code. In a bold move, we now exclude PEA operands from auto-label gen when they don't have relocation data. This is very useful for things like Int2Hex for which constants are typically pushed with PEA. Reworked the "use reloc data" setting so it defaults to false and is explicitly set to true when converting OMF. This provides a minor optimization since we now check the boolean and skip doing a lookup in an empty table.	2020-07-03 22:03:50 -07:00
Andy McFadden	d58b747571	Use relocation data to format instruction operands This was a relatively lightweight change to confirm the usefulness of relocation data. The results were very positive. The relatively superficial integration of the data into the data analysis process causes some problems, e.g. the cross-reference table entries show an offset because the code analyzer's computed operand offset doesn't match the value of the label. The feature should be considered experimental. The feature can be enabled or disabled with a project property. The results were sufficiently useful and non-annoying to make the setting enabled by default.	2020-07-03 17:58:41 -07:00
Andy McFadden	0bbb307d4e	Correct handling of no-op .ORG statements These were being overlooked because they didn't actually cause anything to happen (a no-op .ORG sets the address to what it would already have been). The assembly source generator works in a way that causes them to be skipped, so everybody was happy. This seemed like the sort of thing that was likely to cause problems down the road, however, so we now split regions correctly when a no-op .ORG is encountered. This affects the uncategorized data analyzer and selection grouping. This changed the behavior of the 2004-numeric-types test, which was visibly weird in the UI but generated correct output. Added the 2024-ui-edge-cases test to provide a place to exercise edge cases when testing the UI by hand. It has some value for the automated regression test, so it's included there. Also, changed the AddressMapEntry objects to be immutable. This is handy when passing lists of them around.	2020-02-28 14:49:18 -08:00
Andy McFadden	091955b9c2	Allow setting the start/end address for a block If you have a single line selected, Set Address adds a .ORG directive that changes the addresses of all following data, until the next .ORG directive is reached. Sometimes code will relocate part of itself, and it's useful to be able to set the address at the end of the block to what it would have been before the .ORG change. If you have multiple lines selected, we now add the second .ORG to the offset that follows the last selected line. Also, fixed a bug in the Symbol value updater that wasn't handling non-unique labels correctly.	2019-12-25 18:17:50 -08:00
Andy McFadden	1cdb31de32	Visualizer improvements Various changes: - Generally treat visualization sets like long comments and notes when it comes to defining data region boundaries. (We were doing this for selections; now we're also doing it for format-as-word and in the data analyzer when scanning for strings/fill.) - Clear the visualization cache when the address map is altered. This is necessary for visualizers that dereference addresses. - Read the Apple II screen image from a series of addresses rather than a series of offsets. This allows it to work when the image is contiguous in memory but split into chunks in the file. - Put 1 pixel of padding around the images in the main code list, so they don't blend into the background. - Remember the last visualizer used, so we can re-use it the next time the user selects "new". - Move min-size hack from Loaded to ContentRendered, as it apparently spoils CenterOwner placement.	2019-12-06 15:05:49 -08:00
Andy McFadden	51081c5db0	Tweak "nearby" label finder The code that found a nearby data target for an instruction operand was searching backward but not forward. We now take one step forward, so that "LDA TABLE-1,Y" fills in automatically. This altered 2008-address-changes, which had just this situation. It didn't alter 2010-target-adjustment, but the existing tests were insufficient and have been improved.	2019-10-29 18:12:22 -07:00
Andy McFadden	8c87ce3004	Check formatted string structure at load time If we have a bug, or somebody edits the project file manually, we can end up with a very wrong string, such as a null-terminated string that isn't, or a DCI string that has a mix of high and low ASCII from start to finish. We now check all incoming strings for validity, and discard any that fail the test. The verification code is shared with the extension script inline data formatter. Also, added a comment to an F8-ROM symbol I stumbled over.	2019-10-06 17:07:07 -07:00
Andy McFadden	0d9814d993	Allow explicit widths in project/platform symbols, part 3 Implement multi-byte project/platform symbols by filling out a table of addresses. Each symbol is "painted" into the table, replacing an existing entry if the new entry has higher priority. This allows us to handle overlapping entries, giving boosted priority to platform symbols that are defined in .sym65 files loaded later. The bounds on project/platform symbols are now rigidly defined. If the "nearby" feature is enabled, references to SYM-1 will be picked up, but we won't go hunting for SYM+1 unless the symbol is at least two bytes wide. The cost of adding a symbol to the symbol table is about the same, but we don't have a quick way to remove a symbol. Previously, if two platform symbols had the same value, the symbol with the alphabetically lowest label would win. Now, the symbol defined in the most-recently-loaded file wins. (If you define two symbols with the same value in the same file, it's still resolved alphabetically.) This allows the user to pick the winner by arranging the load order of the platform symbol files. Platform symbols now keep a reference to the file ident of the symbol file that defined them, so we can show the symbol's source in the Info panel. These changes altered the behavior of test 2008-address-changes, which includes some tests on external addresses that are close to labeled internal addresses. The previous behavior essentially treated user labels as being 3 bytes wide and extending outside the file bounds, which was mildly convenient on occasion but felt a little skanky. (We could do with a way to define external symbols relative to internal symbols, for things like the source address of code that gets relocated.) Also, re-enabled some unit tests. Also, added a bit of identifying stuff to CrashLog.txt.	2019-10-02 16:50:15 -07:00
Andy McFadden	4d9d5e2ecf	Instruction operand editor rework, part 3 Implemented editing of labels and project symbols. Also, cleaned up the local variable edit code.	2019-09-08 16:41:54 -07:00
Andy McFadden	2633720c82	Instruction operand editor rework, part 1 Rearrange the UI elements, and convert the code-behind to a more XAML-style form. The basic stuff works, but the old "shortcut" system is still in the process of being replaced.	2019-09-07 13:39:22 -07:00
Andy McFadden	ee6e5d7fb6	Fix a couple of obscure bugs The code that checked to see if a data target was inside a data operand wasn't going all the way back to the start of the file. It was also failing to stop when it should, wasting time. The anattrib validation method has code that avoids a false-positive on certain complex embedded instruction arrangements. This was also preventing it from seeing a transition from a data area to the middle of an instruction (caused by issue #45).	2019-09-04 17:48:55 -07:00
Andy McFadden	5dcdbe3f3a	Various tweaks Fixed a minor bug in GenerateLineList that would cause a blank line to disappear under certain circumstances. Harmless, but odd. Added a width property to DefSymbol. Updated comments.	2019-08-24 17:35:26 -07:00
Andy McFadden	38d3adbb08	PETSCII does DCI I didn't think it made sense, but I found something that used it, so apparently it's a thing. This updates the operand editor to let you choose PETSCII+DCI, and updates the assemblers to handle it correctly (really just 64tass, since the others either don't have a DCI directive or don't deal with PETSCII at all). Changed the char-encoding sample from "bad dcI" to "pet dcI", and updated the documentation.	2019-08-20 17:55:12 -07:00
Andy McFadden	7bbe5692bd	Add C64 encodings to instruction and data operand editors Both dialogs got a couple extra radio buttons for selection of single character operands. The data operand editor got a combo box that lets you specify how it scans for viable strings. Various string scanning methods were made more generic. This got a little strange with auto-detection of low/high ASCII, but that was mostly a matter of keeping the previous code around as a special case. Made C64 Screen Code DCI strings a thing that works.	2019-08-15 17:53:12 -07:00
Andy McFadden	5889f45737	Replace on-screen string operand formatting The previous functions just grabbed 62 characters and slapped quotes on the ends, but that doesn't work if we want to show strings with embedded control characters. This change replaces the simple formatter with the one used to generate assembly source code. This increases the cost of refreshing the display list, so a cache will need to be added in a future change. Converters for C64 PETSCII and C64 Screen Code have been defined. The results of changing the auto-scan encoding can now be viewed. The string operand formatter was using a single delimiter, but for the on-screen version we want open-quote and close-quote, and might want to identify some encodings with a prefix. The formatter now takes a class that defines the various parts. (It might be worth replacing the delimiter patterns recently added for single-character operands with this, so we don't have two mechanisms for very nearly the same thing.) While working on this change I remembered why there were two kinds of "reverse" in the old Merlin 32 string operand generator: what you want for assembly code is different from what you want on screen. The ReverseMode enum has been resurrected.	2019-08-13 17:52:58 -07:00
Andy McFadden	f3c28406a5	Add multiple encoding support to uncategorized data analyzer The code that searches for character strings in uncategorized data now recognizes the C64 encodings when selected in the project properties. The new code avoids some redundant comparisons when runs of printable characters are found. I suspect the new implementation loses on overall performance because we're now calling through delegates instead of testing characters directly, but I haven't tested for that.	2019-08-13 14:08:27 -07:00
Andy McFadden	f33cd7d8a6	Replace character operand output method The previous code output a character in single-quotes if it was standard ASCII, double-quotes if high ASCII, or hex if it was neither of those. If a flag was set, high ASCII would also be output as hex. The new system takes the character value and an encoding identifier. The identifier selects the character converter and delimiter pattern, and puts the two together to generate the operand. While doing this I realized that I could trivially support high ASCII character arguments in all assemblers by setting the delimiter pattern to "'#' | $80". In FormatDescriptor, I had previously renamed the "Ascii" sub-type "LowAscii" so it wouldn't be confused, but I dislike filling the project file with "LowAscii" when "Ascii" is more accurate and less confusing. So I switched it back, and we now check the project file version number when deciding what to do with an ASCII item. The CharEncoding tests/converters were also renamed. Moved the default delimiter patterns to the string table. Widened the delimiter pattern input fields slightly. Added a read-only TextBox with assorted non-typewriter quotes and things so people have something to copy text from.	2019-08-11 22:11:00 -07:00
Andy McFadden	975b62db6b	Treat low and high ASCII as two distinct formats We've been treating ASCII strings and instruction/data operands as ambiguous, resolving low vs. high when generating output for the display or assembler. This change splits it into two separate formats, simplifying output generation. The UI will continue to treat low/high ASCII as a single thing, selecting the format appropriately based on the data. There's no reason to have two radio buttons that are never both enabled. The data operand string functions need some additional work, but that overlaps substantially with the upcoming PETSCII changes, so for now all strings set by the data operand editor are low ASCII. The file format has changed again, but since there hasn't been a release since the previous change, I'm leaving the file format at v2. Code has been added to resolve the ASCII mode when loading a v1 project file. This removes some complexity from the assembly code generators.	2019-08-10 14:59:24 -07:00
Andy McFadden	0d0854bda7	Change the way string formats are defined We used to use type="String", with the sub-type indicating whether the string was null-terminated, prefixed with a length, or whatever. This didn't leave much room for specifying a character encoding, which is orthogonal to the sub-type. What we actually want is to have the type specify the string type, and then have the sub-type determine the character encoding. These sub-types can also be used with the Numeric type to specify the encoding of character operands. This change updates the enum definitions and the various bits of code that use them, but does not add any code for working with non-ASCII character encodings. The project file version number was incremented to 2, since the new FormatDescriptor serialization is mildly incompatible with the old. (Won't explode, but it'll post a complaint and ignore the stuff it doesn't recognize.) While I was at it, I finished removing DciReverse. It's still part of the 2005-string-types regression test, which currently fails because the generated source doesn't match.	2019-08-07 16:19:13 -07:00
Andy McFadden	c64f72d147	Move WPF code from SourceGenWPF to SourceGen	2019-07-20 13:28:37 -07:00
Andy McFadden	e3906e021b	Move WinForms code to SourceGenWF	2019-07-20 13:02:54 -07:00
Andy McFadden	823aa072fb	Update comments	2019-04-29 13:07:52 -07:00
Andy McFadden	8d0ce87ec7	Experiment on uncategorized data analysis Tried something to speed it up. Didn't help. Cleaned up the code a bit though.	2019-04-18 15:58:43 -07:00
Andy McFadden	97a372a884	Add selectable auto-label styles SourceGen creates "auto" labels when it finds a reference to an address that doesn't have a label associated with it. The label for address $1234 would be "L1234". This change allows the project to specify alternative label naming conventions, annotating them with information from the cross-reference data. For example, a subroutine entry point (i.e. the target of a JSR) would be "S_1234". (The underscore was added to avoid confusion when an annotation letter is the same as a hex digit.) Also, tweaked the way the preferred clipboard line format is stored in the settings file (was an integer, now an enumeration string).	2019-04-15 15:14:04 -07:00
Andy McFadden	47b1363738	Add more detail to cross references In the cross-reference table we now indicate whether the reference source is doing a read, write, read-modify-write, branch, subroutine call, is just referencing the address, or is part of the data.	2019-04-11 16:23:02 -07:00
Andy McFadden	2f74fce80b	Expand set of things that work with double-click on opcode If you double-click on the opcode of "JSR label", the code view selection jumps to the label. This now works for partial operands, e.g. "LDA #<label". Some changes to the find-label-offset code affected the cc65 "is it a forward reference to a direct-page label" logic. The regression test now correctly identifies an instruction that refers to itself as not being a forward reference.	2018-11-03 15:03:25 -07:00
Andy McFadden	b97f7ca3d8	Fix add-label shortcut for adjusted operands When you edit the operand of an instruction that targets an in-file address, you're given the opportunity to specify a shortcut that applies the symbol to the instruction's target address in addition to or instead of defining a weak symbol reference on the instruction being edited. This didn't work right for operands with adjustments, e.g. the store instructions in self-modifying code. It put the label at the unadjusted offset, which does nothing useful. We now correctly back up to the start of the instruction or multi-byte data area.	2018-10-11 16:48:55 -07:00
Andy McFadden	f4e4ac842d	First cut of split-address table formatter Allows specification of table data in various ways, for 16-bit and 24-bit addresses. Shows a preview so you can see if the addresses look about right. Adds permanent labels at target offsets if none are present. Optionally sets code hints. Works beautifully on the A2-Amper-fdraw example, but needs some additional testing, documentation, etc. Dialog is more complicated than I would have liked, mostly because of 65816 support, but I think it'll do. (issue #10)	2018-10-06 18:05:31 -07:00
Andy McFadden	2c6212404d	Initial file commit	2018-09-28 10:05:11 -07:00
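
Commit 3368182e14 describes the DCI string format as one where the high bit of the last byte differs from the rest of the string, with either polarity allowed. A minimal Python sketch of that rule follows. It is not SourceGen's code: the function names `is_dci_string` and `decode_dci` are hypothetical, and accepting any single byte as a one-character string is a simplifying assumption (the commit describes extra screening based on the selected character set).

```python
def is_dci_string(data: bytes) -> bool:
    """Return True if `data` could be a single DCI string (illustrative rule only)."""
    if len(data) == 0:
        return False
    if len(data) == 1:
        # Assumption: a lone byte is accepted; there is no "rest of the
        # string" to compare its high bit against.
        return True
    body_high = data[0] & 0x80
    # Every byte except the last must share the first byte's high bit.
    if any((b & 0x80) != body_high for b in data[:-1]):
        return False
    # The last byte must have the opposite high bit (either polarity works).
    return (data[-1] & 0x80) != body_high


def decode_dci(data: bytes) -> str:
    """Strip the high bits and decode as ASCII (assumes ASCII-based data)."""
    return bytes(b & 0x7F for b in data).decode("ascii")
```

For example, the bytes for "HELL" followed by `'O' | $80` pass the check, and so does the reverse polarity (high bits set on all but the last byte), while plain low-ASCII "HELLO" does not.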

41 Commits
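
Commit 0d9814d993 describes multi-byte platform symbols being "painted" into a table of addresses, with symbols from later-loaded files taking priority and ties within a file resolved alphabetically. The Python sketch below illustrates that idea under stated assumptions. The `Sym` type, its `load_order` field, and the `paint` and `wins` helpers are hypothetical names, not SourceGen's API, and a dictionary stands in for whatever table structure the project actually uses.

```python
from typing import Dict, Iterable, NamedTuple


class Sym(NamedTuple):
    label: str
    value: int       # first address covered by the symbol
    width: int       # number of bytes covered
    load_order: int  # hypothetical: larger means the defining file loaded later


def wins(new: Sym, old: Sym) -> bool:
    """True if `new` should replace `old` at an address both cover."""
    if new.load_order != old.load_order:
        return new.load_order > old.load_order  # later-loaded file wins
    return new.label < old.label                # same file: alphabetical


def paint(symbols: Iterable[Sym]) -> Dict[int, Sym]:
    """Paint each symbol over every address it covers, keeping the winner."""
    table: Dict[int, Sym] = {}
    for sym in symbols:
        for addr in range(sym.value, sym.value + sym.width):
            current = table.get(addr)
            if current is None or wins(sym, current):
                table[addr] = sym
    return table
```

With a two-byte symbol `A` at $1000 and a one-byte symbol `B` at $1001 from a later file, a lookup of $1000 resolves to `A` and $1001 resolves to `B`. One consequence the commit also notes is that removal is not cheap, since the overwritten entries are not retained.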