6502bench SourceGen: Properties & Settings

Back to index

Settings Overview

There are two kinds of settings: application settings, and project properties.

Application Settings

Application settings are stored in a file called "SourceGen-settings" in the SourceGen installation directory. If the file is missing or corrupted, default settings will be used. These settings are local to your system, and include everything from window sizes to whether or not you prefer hexadecimal values to be shown in upper case. None of them affect the way the project analyzes code and data, though they may affect the way generated assembly sources look.

The settings editor is divided into four tabs. Changes don't take effect until you hit Apply or OK.

Code View

These settings change the way the code looks on screen.

Click the Column Visibility buttons to hide columns. Click them again to restore the column to a width appropriate for the current font. A "hidden" column just has a width of zero, so with careful mouse positioning you can show and hide columns by dragging the column headers. The buttons may be more convenient though.

You can select a different font for the code list, and make it as large or as small as you want. Mono-space fonts like Courier or Consolas are recommended (and will be the only ones shown).

You can choose to display different parts of the display in upper or lower case, using the "all lower" and "all upper" buttons as a quick way to set all values. These settings are also used for generated assembly code, unless the assembler has specific case-sensitivity requirements. There is no setting for labels, which are always case-sensitive.

The Clipboard drop-down list lets you choose the format for text copied to the clipboard. The "Assembler Source" format includes the rightmost columns (label, opcode, operand, and comment), like assembly source code does. The "Disassembly" format adds the address and bytes on the left. Use the "All Columns" format to get all columns.

The "add spaces in bytes column" checkbox changes the format of the hex data in the code list "bytes" column from dense (20edfd) to spaced (20 ed fd). This also affects the format of clipboard copies and exports.

Text Delimiters

Character and string operands are shown surrounded by quotes, e.g. LDA #'*' or .STR "Hello, world!". It's handy to be able to tell at a glance how characters are encoded, so SourceGen allows you to set the delimiters independently for every supported character encoding.

String operands may contain a mixture of text and hexadecimal values. For example, in ASCII data, the control characters for linefeed and carriage return ($0a and $0d) are considered part of the string, but don't have a printable symbol. (Unicode defines some glpyhs, but they don't look very good at smaller font sizes.)

If one of the delimiter characters appears in the string itself, the character will be output as hex to avoid confusion. For this reason, it's generally wise to use delimiter characters that aren't part of the ASCII character set. The "Sample Characters" box holds some characters that you can copy and paste (with Ctrl+C / Ctrl+V) into the delimiter fields.

For character operands, the prefix and suffix are added to the start and end of the operand. For string operands, the prefix is added to the start of the first line, and suffixes aren't allowed.

These options change the way the code list looks on screen. They do not affect generated code, which must use the delimiter characters specified by the chosen assembler.

Asm Config

These settings configure cross-assemblers and modify assembly source generation in various ways.

To configure an assembler, select it in the pop-up menu. The fields will initially contain assembler-specific default values. All of the values in the Assembler Configuration box may be configured differently for each assembler.

The "executable" box holds the full path to the cross-assembler executable.

The "column widths" section allows you to specify the minimum width of the label, opcode, operand, and comment fields. If the width is less than 1, or isn't a valid number, 1 will be used. These are not hard stops: if the contents of a field are too wide, the contents of the next column will be pushed over. (The comment field width is not currently being used, but may be used to fold lines in the future.)

When "show cycle counts" is checked, every instruction line will have an end-of-line comment that indicates the number of cycles required for that instruction. This is shown in the code list and included in generated assembly output. If the cycle count can't be determined solely from a static analysis, e.g. an extra cycle is required if LDA (dp),Y crosses a page boundary, a '+' will be shown. In some cases the variability can be factored out if the state of certain status flags is known, e.g. 65C02 instructions that take longer in decimal mode won't be shown as variable if the analyzer can determine that D=0 or D=1.

If "put long labels on separate line" is checked, labels that are longer than the label column are placed on their own line. This looks a bit nicer because otherwise the opcode gets pushed out of alignment. (Some assemblers get bent out of shape if you split an equate directive, so those might stay on one line.)

If you enable "identify assembler in output", a comment will be added to the top of the generated assembly output that identifies the target assembler and version. It also shows the command-line options passed to the assembler. This can be very helpful if the source file is sent to other people, since it may not otherwise be obvious from the source file what the intended target assembler is, or what options are required to process the file correctly.

"Disable label localization" turns off the label localizer.

Display Format

These options change the way the code list looks on screen. They do not affect generated code.

The operand width disambiguator strings are used when the width of an instruction operand is unclear. You may specify values for all of them or none of them.

Different assemblers have different ways of forming expressions. Sometimes the rules allow expressions to be written simply, other times explicit grouping with parenthesis is required. Select whichever style you are most comfortable with.

If you would like your local variables to be shown with a prefix character, you can set one here.

The "quick set" buttons configure the fields on this tab to match the conventions of the specified assembler. Select your preferred assembler with the combox box, then click "set" to set the fields. (64tass and ACME use the "common" expression style, cc65 and Merlin 32 have their own unique styles.)

Pseudo-Op

These options change the way the code list looks on screen. Assembler directives and data pseudo-opcodes will use these values. This does not affect generated source code, which always matches the conventions of the target assembler.

Enter the string you want to use for the various data formats. If a field is left blank, a default value is used.

The "quick set" buttons configure the fields on this tab to match the conventions of the specified assembler. Select your preferred assembler with the combox box, then click "set" to set the fields.

Project Properties

Project properties are stored in the .dis65 project file. They specify which CPU to use, which extension scripts to load, and a variety of other things that directly impact how SourceGen processes the project. Because of the potential impact, all changes to the project properties are made through the undo/redo buffer.

The properties editor is divided into four tabs. Changes aren't pushed out to the main application until you close the dialog. Clicking Apply will capture the current changes, ensuring that they're applied even if you later hit Cancel, but the changes are not applied immediately.

All changes are subject to undo/redo.

General

The choice of CPU determines the set of available instructions, as well as cycle costs and register widths. There are many variations on the 6502, but from the perspective of a disassembler most can be treated as one of these three:

  1. MOS 6502. The original 8-bit instruction set.
  2. WDC W65C02S. Expanded the instruction set and smoothed some rough edges.
  3. WDC W65C816S. Expanded instruction set, 24-bit address space, and 16-bit registers.

The Rockwell R65C02, Hudson Soft HuC6280, and Commodore CSG 4510 / 65CE02 have instruction sets that expand on the 6502/65C02, but aren't compatible with the 65816. These are not yet supported by SourceGen.

If "enable undocumented instructions" is checked, some additional opcodes are recognized on the 6502 and 65C02. These instructions are not part of the chip specification, but most of them have consistent behavior and can be used. If the box is not checked, the instructions are treated as invalid and cause the code analyzer to assume that it has run into a data area. This option has no effect on the 65816.

The entry flags determine the initial value for the processor status flag register. Code that is unreachable internally (requiring a code entry point hint) will use this value. This is chiefly of value for 65816 code, where the initial value of the M/X/E flags is significant.

If "analyze uncategorized data" is checked, SourceGen will attempt to identify character strings and regions that are filled with a repeated value. If it's not checked, anything that isn't detected as code or explicitly formatted as data will be shown as individual byte values.

If "seek nearby targets" is checked, the analyzer will try to use nearby labels for data loads and stores, adjusting them to fit (e.g. LDA LABEL+1). If not enabled, labels are not applied unless they match exactly. Note that references into the middle of an instruction or formatted data area are always adjusted, regardless of how this is set. This setting has no effect on local variables, and only enables a 1-byte backward search on project/platform symbols.

If "smart PLP handling" is checked, the analyzer will try to use the processor status flags from a nearby PHP when a PLP is encountered. If not enabled, all flags are set to "indeterminate" following a PLP.

The "default text encoding" setting has two effects. First, it specifies which character encoding to use when searching for strings in uncategorized data. Second, if an assembler has a notion of preferred character encoding (e.g. you can default string operands to PETSCII), this setting will determine which encoding is preferred.

The "min chars for string detection" setting determines how many ASCII characters need to appear consecutively for the data analyzer to declare it a string. Shorter values are prone to false-positive identifications, longer values miss out on short strings. You can also set it to "none" to disable automatic string identification.

The auto-label style setting determines the format for labels that are generated automatically. By default the label will be the letter 'L' followed by the hexadecimal address, but the label can be annotated based on usage. For example, addresses that are the target of branch instructions can be labeled with the letter 'B'.

Project Symbols

You can add, edit, and delete individual symbols and constants. See the symbols section for an explanation of how project symbols work.

The Edit Symbol button opens the Edit Project Symbol dialog, which allows changing any part of a symbol definition. You're not allowed to create two symbols with the same label.

The Import button allows you to import symbols from another project. Only labels that have been tagged as global and exported will be imported. Existing symbols with identical labels will be replaced, so it's okay to run the importer multiple times. Labels that aren't found will not be removed, so you can safely import from multiple projects, but will need to manually delete any symbols that are no longer being exported.

Symbol Files

From here, you can add and remove platform symbol files, or change the order in which they are loaded. See the symbols section for an explanation of how platform symbols work, and the advanced topics section for a description of the file syntax.

Platform symbol files must live in the RuntimeData directory that comes with SourceGen, or in the directory where the project file lives. This is mostly to keep things manageable when projects are distributed to other people, but also acts as a minor security check, to prevent a wayward project from trying to open files it shouldn't.

In the list, files loaded from the RuntimeData directory will be prefixed with RT:. Files loaded from the project directory will be prefixed with PROJ:.

If a platform symbol file can't be found when the project is opened, you will receive a warning.

Extension Scripts

From here, you can add and remove extension script files. See the extension scripts section for details on how extension scripts work.

Extension script files must live in the RuntimeData directory that comes with SourceGen, or in the directory where the project file lives. This is mostly to keep things manageable when projects are distributed to other people, but also acts as a minor security check, to prevent a wayward project from trying to open files it shouldn't.

In the list, files loaded from the RuntimeData directory will be prefixed with RT:. Files loaded from the project directory will be prefixed with PROJ:.

If an extension script file can't be found when the project is opened, you will receive a warning.