6502bench

mirror of https://github.com/fadden/6502bench.git synced 2024-11-04 15:05:03 +00:00

Author	SHA1	Message	Date
Andy McFadden	cb114be0f6	Add "uninitialized data" format type This allows regions that hold variable storage to be marked as data that is initialized by the program before it is used. Previously the choices were to treat it as bulk data (initialized) or junk (totally unused), neither of which is correct. This is functionally equivalent to "junk" as far as source code generation is concerned (though it doesn't have to be). For the code/data/junk counter, uninitialized data is counted as junk, because it technically does not need to be part of the binary.	2021-10-13 15:05:07 -07:00
Andy McFadden	0ac0686c7a	ORG rework, part 9 Modified "jump to" code to understand address range start/end lines. If there are multiple starts or ends at the same offset, we jump to the first one in the set, which is suboptimal but simpler to do. Simplified the API, embedding GoToMode in the Location object (which is where it really needs to be, to make fwd/back work right). Updated HTML export to grey out addresses in NON_ADDR sections. Changed default pseudo-op strings for address regions to ".addrs" and ".adrend", after trying a bunch of things that were worse. Added definitions for region-end pseudo-ops to Merlin32 and cc65 for display on screen. Added regression test 20260 for address region pre-labels. Fixed handling of leading underscores in platform/project symbols. These need to be escaped in 64tass output. Updated regression test 20170-external-symbols to check it.	2021-10-07 12:39:30 -07:00
Andy McFadden	5f472b60cf	ORG rework, part 3 Split ".org" into ".arstart" and ".arend" (address range start/end). Address range ends are now shown in the code list view, and the pseudo-op can be edited in app settings. Address range starts are now shown after notes and long comments, rather than before, which brings the on-screen display in sync with generated code. Reworked the address range editor UI to include the new features. The implementation is fully broken. More changes to the AddressMap API, putting the resolved region length into a separate ActualLength field. Added FindRegion(). Renamed some things. Code generation changed slightly: the blank line before a region-end line now comes after it, and ACME's "} ;!pseudopc" is now just "}". This required minor updates to some of the regression test results.	2021-09-22 15:28:11 -07:00
Andy McFadden	635084db9d	Fix DCI string edge case If a DCI string ended with a string delimiter or non-ASCII character (e.g. a PETSCII char with no ASCII equivalent), the code generator output the last byte as a hex value. This caused an error because it was outputting the raw hex value, with the high bit already set, which the assembler did not expect. This change corrects the behavior for code generation and on-screen display, and adds a few samples to the regression test suite. (see issue #102)	2021-08-10 14:08:39 -07:00
Andy McFadden	478afa542e	Fix 64tass code gen corner case On the 65816, if you say "JSR foo" from bank $12, but "foo" is an address in bank 0, most assemblers will conclude that you're forming a 16-bit argument with a 16-bit address and assemble happily. 64tass halts with an error. Up until v1.55 or so, you could fake it out by supplying a large offset. This no longer works. The preferred way to say "no really I mean to do this" is to append ",k" to the operand. We now do that as needed. I didn't want to define a new ExpressionMode for 64tass just to support an operand modifier that should probably never actually get generated (you can't call across banks with JSR!), so this is implemented with a quirk and an op flag. 64tass v1.56.2625 is now the default. (issue #104)	2021-08-09 14:11:15 -07:00
Andy McFadden	8c053c29f2	Update ACME generator for v0.97 Two things changed: (1) string literals can now hold backslash escapes like "\n"; (2) MVN/MVP operands can now be prefixed with '#'. The former was a breaking change because any string with "\" must be changed to "\\". This is now handled by the string operand formatter. Also, improved test harness output. Show the assembler versions at the end, and include assembler failure messages in the collected output.	2021-07-31 14:42:36 -07:00
Andy McFadden	cb6ceafd73	Make operand wrap length configurable Long operands, such as strings and bulk data, can span multiple lines. SourceGen wraps them at 64 characters, which is fine for assembly output but occasionally annoying on screen: if the operand column is wide enough to show the entire value, the comment column is pushed pretty far to the right. This change makes the width configurable, as 32/48/64 characters, with a pop-up in app settings. The assemblers are all wired to 64 characters, though we could make this configurable as well with an assembler-specific setting. Some things have moved around a bit in app settings. The Asm Config tab now comes last. Having it sandwiched in the middle of tabs that altered the on-screen display didn't make much sense. The Display Format is now explicitly for opcodes and operands, and is split into two columns. The left column is managed by the "quick set" feature, the right column is independent.	2020-07-19 18:39:27 -07:00
Andy McFadden	ee58d9e803	Data Bank Register management, part 3 Added a "fake" assembler pseudo-op for DBR changes. Display entries in line list. Added entry to double-click handler so that you can double-click on a PLB instruction operand to open the data bank editor.	2020-07-09 16:52:23 -07:00
Andy McFadden	6d7fdff6b5	Fix 65816 code generation issues Code generated for 64tass was incorrect for JSR/JMP to a location outside the file bounds. A test added to 20052-branches-and-banks revealed an issue with cc65 generation as well.	2020-07-03 14:02:38 -07:00
Andy McFadden	4d06bb24eb	Improve Common expression generation Removed unnecessary parentheses from Common-style expressions, which are used by 64tass and ACME.	2020-07-02 13:00:02 -07:00
Andy McFadden	fdd2bcf847	Fix some 65816 code generation issues Two basic problems: (1) cc65, being a one-pass assembler, can't tell if a forward-referenced label is 16-bit or 24-bit. If the operand is potentially ambiguous, such as "LDA label", we need to add an operand width disambiguator. (The existing tests managed to only do backward references.) (2) 64tass wants the labels on JMP/JSR absolute operands to have 24-bit values that match the current program bank. This is the opposite of cc65, which requires 16-bit values. We need to distinguish PBR vs. DBR instructions (i.e. "LDA abs" vs. "JMP abs") and handle them differently when formatting for "Common". Merlin32 doesn't care, and ACME doesn't work at all, so neither of those needed updating. The 20052-branches-and-banks test was expanded to cover the problematic cases.	2020-07-01 17:59:12 -07:00
Andy McFadden	071adb8e95	Two changes to "dense hex" bulk data formatting (1) Added an option to limit the number of bytes per line. This is handy for things like bitmaps, where you might want to put (say) 3 or 8 bytes per line to reflect the structure. (2) Added an application setting that determines whether the screen listing shows Merlin/ACME dense hex (20edfd) or 64tass/cc65 hex bytes ($20,$ed,$fd). Made the setting part of the assembler-driven display definitions. Updated 64tass+cc65 to use ".byte" as their dense hex pseudo-op, and to use the updated formatter code. No changes to regression test output. (Changes were requested in issue #42.) Also, added a resize gripper to the bottom-right corner of the main window. (These seem to have generally fallen out of favor, but I like having it there.)	2019-12-10 17:41:00 -08:00
Andy McFadden	d3670c48e8	Label rework, part 6 Correct handling of local variables. We now correctly uniquify them with regard to non-unique labels. Because local vars can effectively have global scope we mostly want to treat them as global, but they're uniquified relative to other globals very late in the process, so we can't just throw them in the symbol table and be done. Fortunately local variables exist in a separate namespace, so we just need to uniquify the variables relative to the post-localization symbol table. In other words, we take the symbol table, apply the label map, and rename any variable that clashes. This also fixes an older problem where we weren't masking the leading '_' on variable labels when generating 64tass output. The code list now makes non-unique labels obvious, but you can't tell the difference between unique global and unique local. What's more, the default type value in Edit Label is now adjusted to Global for unique locals that were auto-generated. To make it a bit easier to figure out what's what, the Info panel now has a "label type" line that reports the type. The 2023-non-unique-labels test had some additional tests added to exercise conflicts with local variables. The 2019-local-variables test output changed slightly because the de-duplicated variable naming convention was simplified.	2019-11-18 13:36:53 -08:00
Andy McFadden	68c324bbe8	Label rework, part 4 Update the symbol lookup in EditInstructionOperand, EditDataOperand, and GotoBox to correctly deal with non-unique labels. This is a little awkward because we're doing lookups by name on a non-unique symbol, and must resolve the ambiguity. In the case of an instruction operand that refers to an address this is pretty straightforward. For partial bytes (LDA #>:foo) or data directives (.DD1 :foo) we have to take a guess. We can probably make a more informed guess than we currently are, e.g. the LDA case could find the label that minimizes the adjustment, but I don't want to sink a lot of time into this until I'm sure it'll be useful. Data operands with multiple regions are something of a challenge, but I'm not sure specifying a single symbol for multiple locations is important. The "goto" box just finds the match that's closest to the selection. Unlike "find", it always grabs the closest, not the next one forward. (Not sure if this is useful or confusing.)	2019-11-16 16:44:08 -08:00
Andy McFadden	be65f280a3	Minor tweaks - Renamed "strip label prefix/suffix" to "omit label prefix/suffix". - Changed a Merlin operand workaround so it doesn't apply to code that is explicitly not in bank zero. - Changed {addr}/{const} annotations on project/platform symbol equates so they line up a little better on screen and in exported sources.	2019-11-15 16:24:07 -08:00
Andy McFadden	5dd7576529	Label rework, part 2 Continue development of non-unique labels. The actual labels are still unique, because we append a uniquifier tag, which gets added and removed behind the scenes. We're currently using the six-digit hex file offset because this is only used for internal address symbols. The label editor and most of the formatters have been updated. We can't yet assemble code that includes non-unique labels, but older stuff hasn't been broken. This removes the "disable label localization" property, since that's fundamentally incompatible with what we're doing, and adds a non-unique label prefix setting so you can put '@' or ':' in front of your should-be-local labels. Also, fixed a field name typo.	2019-11-12 17:44:51 -08:00
Andy McFadden	4d079c8d14	Label rework, part 1 This adds the concept of label annotations. The primary driver of the feature is the desire to note that sometimes you know what a thing is, but sometimes you're just taking an educated guess. Instead of writing "high_score_maybe", you can now write "high_score?", which is more compact and consistent. The annotations are stripped off when generating source code, making them similar to Notes. I also created a "Generated" annotation for the labels that are synthesized by the address table formatter, but don't modify the label for them, because there's not much need to remind the user that "T1234" was generated by algorithm. This also lays some of the groundwork for non-unique labels.	2019-11-08 21:02:15 -08:00
Andy McFadden	630f7f0f87	Improve the "info" panel Not a huge improvement, but things are slightly more organized, and there's a splash of color in the form of a border around the text describing the format of code and data lines. Added an "IsConstant" property to Symbol.	2019-10-22 21:27:49 -07:00
Andy McFadden	cd23580cc5	Add junk/align directives Sometimes there's a bunch of junk in the binary that isn't used for anything. Often it's there to make things line up at the start of a page boundary. This adds a ".junk" directive that tells the disassembler that it can safely disregard the contents of a region. If the region ends on a power-of-two boundary, an alignment value can be specified. The assembly source generators will output an alignment directive when possible, a .fill directive when appropriate, and a .dense directive when all else fails. Because we're required to regenerate the original data file, it's not always possible to avoid generating a hex dump.	2019-10-18 21:00:28 -07:00
Andy McFadden	37855c8f8e	Allow explicit widths in project/platform symbols, part 4 (of 4) Handle situation where a symbol wraps around a bank. Updated 2021-external-symbols for that, and to test the behavior when file data and an external symbol overlap. The bank-wrap test turned up a bug in Merlin 32. A workaround has been added. Updated documentation to explain widths.	2019-10-03 10:32:54 -07:00
Andy McFadden	2a41d70e04	Allow explicit widths in project/platform symbols, part 1 The ability to give explicit widths to local variables worked out pretty well, so we're going to try adding the same thing to project and platform symbols. The first step is to allow widths to be specified in platform files, and set with the project symbol editor. The DefSymbol editor is also used for local variables, so a bit of dancing is required. For platform/project symbols the width is optional, and is totally ignored for constants. (For variables, constants are used for the StackRel args, so the width is meaningful and required.) We also now show the symbol's type (address or constant) and width in the listing. This gets really distracting when overused, so we only show it when the width is explicitly set. The default width is 1, which most things will be, so users can make an aesthetic choice there. (The place where widths make very little sense is when the symbol represents a code entry point, rather than a data item.) The maximum width of a local variable is now 256, but it's not allowed to overlap with other variables or run off the end of the direct page. The maximum width of a platform/project symbol is 65536, with bank-wrap behavior TBD. The local variable table editor now refers to stack-relative constants as such, rather than simply "constant", to make it clear that it's not just defining an 8-bit constant. Widths have been added to a handful of Apple II platform defs.	2019-10-01 16:00:08 -07:00
Andy McFadden	2633720c82	Instruction operand editor rework, part 1 Rearrange the UI elements, and convert the code-behind to a more XAML-style form. The basic stuff works, but the old "shortcut" system is still in the process of being replaced.	2019-09-07 13:39:22 -07:00
Andy McFadden	c698048001	Handle variable labels that are duplicates of non-variables After thrashing around a bit, I had to choose between making the uniquifier more complicated, or making de-duplication a separate step. Since I don't really expect duplicates to be a thing, I went with the latter. Updated the regression test.	2019-08-31 21:54:20 -07:00
Andy McFadden	02c79db749	Add local variable uniquification For ACME and cc65, enable uniquification. This works with my basic tests, but there are a lot of potential edge cases.	2019-08-31 14:19:50 -07:00
Andy McFadden	6a2532588b	Local variables mostly work Variables are now handled properly end-to-end, except for label uniquification. So cc65 and ACME can't yet handle a file that redefines a local variable. This required a bunch of plumbing, but I think it came out okay.	2019-08-30 18:39:29 -07:00
Andy McFadden	e82339573f	Add VarDirective to PseudoOpNames Also, rearranged the pseudo-op app settings XAML to be a bit easier to maintain.	2019-08-29 12:14:47 -07:00
Andy McFadden	2fa5fdc237	Eliminate duplicate function	2019-08-21 15:29:00 -07:00
Andy McFadden	38d3adbb08	PETSCII does DCI I didn't think it made sense, but I found something that used it, so apparently it's a thing. This updates the operand editor to let you choose PETSCII+DCI, and updates the assemblers to handle it correctly (really just 64tass, since the others either don't have a DCI directive or don't deal with PETSCII at all). Changed the char-encoding sample from "bad dcI" to "pet dcI", and updated the documentation.	2019-08-20 17:55:12 -07:00
Andy McFadden	6251edb5ed	Fix PseudoOp Merge Missed this in the immutability change. Instead of merging new strings in, we create a new instance with the merged data.	2019-08-17 17:22:14 -07:00
Andy McFadden	f87ac20f32	Add a string operand cache String operands used to be simple -- each line had 62 characters plus two hard-coded non-ASCII delimiters -- but now we're mixing character and hex data, so we can't use simple math to tell where the lines will break. We want to render them and keep the result around until some dependency changes, e.g. different delimiters or a change to the pseudo-op table. Also, cleaned up LineListGen a little. It had some methods that were declared static because they were expected to be shared, but that never happened. Also, fixed a bug in GatherEntityCounts where multi-line items were being scanned multiple times.	2019-08-17 17:03:06 -07:00
Andy McFadden	4902b89cf8	Various improvements The PseudoOpNames class is increasingly being used in situations where mutability is undesirable. This change makes instances immutable, eliminating the Copy() method and adding a constructor that takes a Dictionary. The serialization code now operates on a Dictionary instead of the class properties, but the JSON encoding is identical, so this doesn't invalidate app settings file data. Added an equality test to PseudoOpNames. In LineListGen, don't reset the line list if the names haven't actually changed. Use a table lookup for C64 character conversions. I figure that should be faster than multiple conditionals on a modern x64 system. Fixed a 64tass generator issue where we tried to query project properties in a call that might not have a project available (specifically, getting FormatConfig values out of the generator for use in the "quick set" buttons for Display Format). Fixed a regression test harness issue where, if the assembler reported success but didn't actually generate output, an exception would be thrown that halted the tests. Increased the width of text entry fields on the Pseudo-Op tab of app settings. The previous 8-character limit wasn't wide enough to hold ACME's "!pseudopc". Also, use TrimEnd() to remove trailing spaces (leading spaces are still allowed). In the last couple of months, Win10 started stalling for a fraction of a second when executing assemblers. It doesn't do this every time; mostly it happens if it has been a while since the assembler was run. My guess is this has to do with changes to the built-in malware scanner. Whatever the case, we now change the mouse pointer to a wait cursor while updating the assembler version cache.	2019-08-17 11:30:42 -07:00
Andy McFadden	7bbe5692bd	Add C64 encodings to instruction and data operand editors Both dialogs got a couple extra radio buttons for selection of single character operands. The data operand editor got a combo box that lets you specify how it scans for viable strings. Various string scanning methods were made more generic. This got a little strange with auto-detection of low/high ASCII, but that was mostly a matter of keeping the previous code around as a special case. Made C64 Screen Code DCI strings a thing that works.	2019-08-15 17:53:12 -07:00
Andy McFadden	beb1024550	Define and use "delimiter sets" A delimiter definition is four strings (prefix, open, close, suffix) that are concatenated with the character or string data to form an operand. A delimiter set is a collection of delimiter definitions, with separate entries for each character encoding. This is a convenient way to configure Formatter objects, import and export data from the app settings file, and manage the UI needed to allow the user to customize how things look. The full set of options didn't fit on the first app settings tab, so there's now a separate tab just for specifying character and string delimiters. (This might be overkill, but there are various plausible scenarios that make use of it.) The delimiters for on-screen display of strings can now be configured.	2019-08-14 16:10:04 -07:00
Andy McFadden	5889f45737	Replace on-screen string operand formatting The previous functions just grabbed 62 characters and slapped quotes on the ends, but that doesn't work if we want to show strings with embedded control characters. This change replaces the simple formatter with the one used to generate assembly source code. This increases the cost of refreshing the display list, so a cache will need to be added in a future change. Converters for C64 PETSCII and C64 Screen Code have been defined. The results of changing the auto-scan encoding can now be viewed. The string operand formatter was using a single delimiter, but for the on-screen version we want open-quote and close-quote, and might want to identify some encodings with a prefix. The formatter now takes a class that defines the various parts. (It might be worth replacing the delimiter patterns recently added for single-character operands with this, so we don't have two mechanisms for very nearly the same thing.) While working on this change I remembered why there were two kinds of "reverse" in the old Merlin 32 string operand generator: what you want for assembly code is different from what you want on screen. The ReverseMode enum has been resurrected.	2019-08-13 17:52:58 -07:00
Andy McFadden	f33cd7d8a6	Replace character operand output method The previous code output a character in single-quotes if it was standard ASCII, double-quotes if high ASCII, or hex if it was neither of those. If a flag was set, high ASCII would also be output as hex. The new system takes the character value and an encoding identifier. The identifier selects the character converter and delimiter pattern, and puts the two together to generate the operand. While doing this I realized that I could trivially support high ASCII character arguments in all assemblers by setting the delimiter pattern to "'#' | $80". In FormatDescriptor, I had previously renamed the "Ascii" sub-type "LowAscii" so it wouldn't be confused, but I dislike filling the project file with "LowAscii" when "Ascii" is more accurate and less confusing. So I switched it back, and we now check the project file version number when deciding what to do with an ASCII item. The CharEncoding tests/converters were also renamed. Moved the default delimiter patterns to the string table. Widened the delimiter pattern input fields slightly. Added a read-only TextBox with assorted non-typewriter quotes and things so people have something to copy text from.	2019-08-11 22:11:00 -07:00
Andy McFadden	068b3a44c7	Remove "high" versions of string pseudo-ops High ASCII and other encodings will be noted in the operand field, not the opcode, so we no longer need these. This removes the six input fields from the Pseudo-Op tab of app settings. Values were stored as a serialized class in settings, which generally works correctly as far as forward/backward compatibility goes, so no worries there. This also adds four "delimiter pattern" fields to the Code View tab, allowing the user to customize how encoded strings are marked up for the code list. The values aren't actually used yet. Also, fixed an issue where changes to text fields on the Pseudo-Op tab weren't raising the dirty flag.	2019-08-11 16:44:22 -07:00
Andy McFadden	975b62db6b	Treat low and high ASCII as two distinct formats We've been treating ASCII strings and instruction/data operands as ambiguous, resolving low vs. high when generating output for the display or assembler. This change splits it into two separate formats, simplifying output generation. The UI will continue to treat low/high ASCII as a single thing, selecting the format appropriately based on the data. There's no reason to have two radio buttons that are never both enabled. The data operand string functions need some additional work, but that overlaps substantially with the upcoming PETSCII changes, so for now all strings set by the data operand editor are low ASCII. The file format has changed again, but since there hasn't been a release since the previous change, I'm leaving the file format at v2. Code has been added to resolve the ASCII mode when loading a v1 project file. This removes some complexity from the assembly code generators.	2019-08-10 14:59:24 -07:00
Andy McFadden	0d0854bda7	Change the way string formats are defined We used to use type="String", with the sub-type indicating whether the string was null-terminated, prefixed with a length, or whatever. This didn't leave much room for specifying a character encoding, which is orthogonal to the sub-type. What we actually want is to have the type specify the string type, and then have the sub-type determine the character encoding. These sub-types can also be used with the Numeric type to specify the encoding of character operands. This change updates the enum definitions and the various bits of code that use them, but does not add any code for working with non-ASCII character encodings. The project file version number was incremented to 2, since the new FormatDescriptor serialization is mildly incompatible with the old. (Won't explode, but it'll post a complaint and ignore the stuff it doesn't recognize.) While I was at it, I finished removing DciReverse. It's still part of the 2005-string-types regression test, which currently fails because the generated source doesn't match.	2019-08-07 16:19:13 -07:00
Andy McFadden	c64f72d147	Move WPF code from SourceGenWPF to SourceGen	2019-07-20 13:28:37 -07:00
Andy McFadden	e3906e021b	Move WinForms code to SourceGenWF	2019-07-20 13:02:54 -07:00
Andy McFadden	a8af7e8794	Improve the "common" expression formatter To avoid confusing the assembler, expressions with a leading parenthesis like "(foo & $ffff) + 1" are prefixed with a "0+". This is not necessary if the operand begins with a '#'. (issue #16)	2018-10-26 15:45:39 -07:00
Andy McFadden	975ae1eb28	Speculative fix for reported FormatDataOp crash This adds a null check on the dfd argument in FormatDataOp() to see if we can prevent a crash. The opcode/operand are presented as "!FAILED!" to make it obvious to the user that something has gone wrong. Hopefully this will allow capture of a project that exhibits the problem.	2018-10-26 15:19:27 -07:00
Andy McFadden	da91f86043	Get 64tass expressions working We now insert parentheses as needed. This can cause problems in some situations, so we always prefix parenthetical expressions with "0+", which looks goofy and is unnecessary for immediate operands. But it does generate working source code. Renamed the "simple" expression mode to "common", as it's not particularly simple but is what you'd expect most assemblers to do. (OTOH, life has been full of surprises.) (issue #16)	2018-10-24 14:57:09 -07:00
Andy McFadden	61914c8f79	Progress toward 64tass expression support Gave cc65 its own expression generator, as the precedence table seems atypical if not unique. Configured 64tass to use the "simple" expression mode. Added some operations on a 32-bit constant to 2007-labels-and-symbols to exercise the current worst-case expression (shift + AND + add). Tweaked the Merlin expression generator to handle it. (issue #16)	2018-10-24 13:17:03 -07:00
Andy McFadden	2c6212404d	Initial file commit	2018-09-28 10:05:11 -07:00

45 Commits