Labels & Symbols

Suppose you want to call some code at address $1000. CPUs fundamentally deal with numeric values, so the instruction to call it is JSR $1000. Humans tend to work better with words, so associating a meaningful symbol with address $1000 can greatly improve the readability of the code: something like JSR DrawSprite is far more helpful for human readers. Further, once the code has been disassembled to source code, using symbols instead of fixed addresses makes it easier to alter the program or re-use the code.

When the target address of instructions like JSR and LDA falls within the scope of the data file, SourceGen classifies the reference as internal, and automatically adds a generic symbolic label (e.g. L1000). This can be edited if desired.
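
The internal/external distinction can be sketched in a few lines of Python. This is a simplified model, not SourceGen's actual implementation: the function name `auto_label` is hypothetical, and it assumes the file occupies a single contiguous address range.

```python
def auto_label(target, file_start, file_length):
    """Return a generic label such as 'L1000' for an internal target.

    Hypothetical helper: assumes the data file maps to one contiguous
    address range starting at file_start.
    """
    if file_start <= target < file_start + file_length:
        # Internal reference: the target lies inside the file.
        return f"L{target:04X}"
    # External reference: no label is generated automatically.
    return None


print(auto_label(0x1000, 0x1000, 0x2000))  # L1000
print(auto_label(0xC000, 0x1000, 0x2000))  # None
```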

On the line at address $2000, select Actions > Edit Label, or double-click on the label "L2000". Change the label to "MAIN", and hit Enter. The label changes on that line, and on the two lines that refer to address $2000. (If you're not sure which lines refer to address $2000, select line $2000 and check the list in the References window.)

Sometimes the target address falls outside the data file. Examples include calls to ROM routines, use of zero-page storage, and access to memory-mapped I/O locations. SourceGen classifies these as external, and does not generate a symbol. In an assembler source file, symbols for these would be expressed as equates (e.g. FOO = $8000), usually at the top of the file or in an "include file". SourceGen allows you to specify symbols for addresses and numeric constants within the project ("project symbols"), or in a symbol file that can be included in multiple projects ("platform symbols"). The SourceGen distribution includes platform symbol files with ROM addresses for several common systems.
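
As a rough illustration of what "expressed as equates" means, the following Python sketch turns a table of external symbols into equate lines of the form shown above. The function name, the `PTR` symbol, and the choice to sort by address are assumptions made for the example; real assemblers differ in their equate syntax.

```python
def format_equates(symbols):
    """Render {name: address} as equate lines, sorted by address.

    Hypothetical helper: zero-page addresses are shown with two hex
    digits, everything else with four.
    """
    lines = []
    for name, addr in sorted(symbols.items(), key=lambda item: item[1]):
        digits = 2 if addr < 0x100 else 4
        lines.append(f"{name} = ${addr:0{digits}X}")
    return lines


for line in format_equates({"FOO": 0x8000, "PTR": 0x06}):
    print(line)
```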

For an example, consider the code at address $2000, which is LDA $3000. We want to assign the symbol "INPUT" to address $3000, but we can't do that by editing a label because it's not inside the file bounds. We can open the project symbol editor from the project properties editor, or we can use a shortcut.

With the line at $2000 selected, use Actions > Edit Operand, or double-click on the value in the Operand column ("$3000"). This opens the Edit Instruction Operand dialog. In the bottom left, click Create Project Symbol. Set the Label field to "INPUT", and click OK, then OK in the operand editor.

The instruction at $2000 now uses the symbol "INPUT" as its operand. If you scroll to the top of the file, you will see a ".EQ" line for the symbol.


Numeric v. Symbolic

When SourceGen sees a reference to an address, such as the operand of an absolute JSR or LDA, it recognizes it as a numeric reference. You can edit the instruction's operand to use a symbol instead, changing to a symbolic reference. Sometimes the way these are handled can be confusing.

Let's use the branch statement at $2005 to illustrate the difference. It performs a branch to $2009, which was automatically assigned the label "L2009".

Edit the label at $2009 (double-click on "L2009" there), and change it to "IN_RANGE". Line $2005 changes to match. This works because SourceGen is auto-formatting line $2005's operand based on the label it finds when it chases the numeric reference to $2009. The Info window shows this as Format (auto): symbol "IN_RANGE".

Use Edit > Undo to revert the label change.

Edit the instruction operand at $2005 (double-click on "L2009" there). Change the format to Symbol, and type "IN_RANGE" in the symbol box. The preview shows BCC IN_RANGE (?), which hints at a problem. Click OK.

Some things changed, but not the things we wanted. Line $2005 now says BCC $2009, instead of BCC L2009, and the label at $2009 has disappeared entirely. What went wrong?

The problem is that we edited the operand to use a symbol that isn't defined anywhere. Because "IN_RANGE" isn't defined, the operand was given the default format, and displayed as a hex value. The numeric reference to $2009 was replaced by the symbol, and nothing else refers to that address, so SourceGen no longer had any reason to put an auto-generated label on line $2009, which is why that disappeared.
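
The behavior described above can be modeled with a small Python sketch. It is only an approximation of the idea, not SourceGen's code: `resolve` and its data layout are hypothetical, every target is assumed to be inside the file, and offset adjustment is ignored.

```python
def resolve(refs, user_labels):
    """Model how labels and operand text are derived.

    refs: {instruction_addr: (target_addr, symbol_name_or_None)}
    user_labels: {addr: name} for labels set by the user
    Returns (labels, shown), where shown maps each instruction to its
    displayed operand. Hypothetical model for illustration only.
    """
    labels = dict(user_labels)
    # Only numeric references cause an auto-generated label to appear.
    for target, symbol in refs.values():
        if symbol is None and target not in labels:
            labels[target] = f"L{target:04X}"
    by_name = {name: addr for addr, name in labels.items()}

    shown = {}
    for line, (target, symbol) in refs.items():
        if symbol is None:
            shown[line] = labels[target]      # auto-formatted from the label
        elif symbol in by_name:
            shown[line] = symbol              # explicit, defined symbol
        else:
            shown[line] = f"${target:04X}"    # undefined symbol: default hex
    return labels, shown


# Numeric reference: the target gets an auto label.
print(resolve({0x2005: (0x2009, None)}, {}))
# Symbolic reference to an undefined symbol: no label, hex operand.
print(resolve({0x2005: (0x2009, "IN_RANGE")}, {}))
```

In the second call nothing refers to $2009 numerically any more, so the model produces no label for it, which mirrors the disappearing "L2009".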

The missing symbol is called out in a message window that popped up at the bottom of the code list window. The message window only appears when there are messages to read. You can hide the window with the Hide button, and make it re-appear with the button in the bottom right of the main window that currently says 1 message.

We can resolve this issue by providing the desired symbol. As you did earlier, edit the label on line $2009 (double-click in the label column) and set it to "IN_RANGE". When you do, the operand on line $2005 is updated appropriately. If you select line $2005, the Info window shows the format as Format: symbol "IN_RANGE", indicating that the symbol was set explicitly rather than automatically.

Symbolic references always link to the symbol, even when the symbol doesn't match the numeric reference. To see this, remove the label from line $2009 by undoing that change with Edit > Undo, so the symbol is again undefined. Now set the label on the following line, $200A, to "IN_RANGE".

Line $2005 now says "BCC IN_RANGE-1". Earlier you set the operand to be a symbolic reference to "IN_RANGE", but the symbol doesn't quite match, so SourceGen automatically adjusted the operand by one byte to point to the correct address. Generally speaking, SourceGen will do its best to use the symbols that you tell it to, and will adjust the symbolic references so that the code assembles correctly.
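
The adjustment is simple arithmetic: the offset is the address the instruction actually targets minus the address where the symbol is defined. A minimal Python sketch, using a hypothetical `symbolic_operand` helper rather than anything from SourceGen itself:

```python
def symbolic_operand(symbol, symbol_addr, target):
    """Format an operand as symbol plus or minus an offset.

    Hypothetical helper: the offset is chosen so that
    symbol_addr + offset == target.
    """
    offset = target - symbol_addr
    if offset == 0:
        return symbol
    return f"{symbol}{offset:+d}"


# The branch targets $2009, but IN_RANGE is defined at $200A.
print(symbolic_operand("IN_RANGE", 0x200A, 0x2009))  # IN_RANGE-1
```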

Edit the label on line $200A, and change it to "NIFTY". Note how the reference on line $2005 also changed. This is an example of a "refactoring rename": when you changed the label, SourceGen automatically found everything that referred to it and updated it. If you edit the operand on line $2005, you can confirm that the symbol has changed.

(If you want to clean this up before continuing on to the next section, put the label back on line $2009.)


Non-Unique Label

Most assemblers have a notion of "local" labels, which go out of scope when a non-local (global) label is encountered. The actual definition of "local" is assembler-specific, but SourceGen allows you to create labels that serve the same purpose.

By default, newly-created labels have global scope and must be unique. You can change these attributes when you edit the label. Up near the top of the file, at address $1002, double-click on the label ("L1002"). Change the label to "LOOP" and click the "non-unique local" radio button. Click OK.

The label at line $1002 (and the operand on line $100B) should now be "@LOOP". By default, '@' is used to indicate non-unique labels, though you can change it to a different character in the application settings.

At address $2019, double-click to edit the label ("L2019"). If you type "MAIN" or "IS_OK" with Global selected you'll get an error, but if you type "@LOOP" it will be accepted. Note the "non-unique local" radio button is selected automatically if you start a label with '@' (or whatever character you have configured). Click OK.

You now have two lines with the same label. In some cases the assembly source generator may need to "promote" them to globals, or rename them to make them unique, depending on what your preferred assembler allows.
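
To give a feel for what renaming to make labels unique could involve, here is a Python sketch. It is not the scheme SourceGen uses: the `make_unique` function and the `_1`, `_2` suffix convention are assumptions chosen purely for illustration.

```python
def make_unique(labels, prefix="@"):
    """Turn non-unique local labels into unique global names.

    labels: list of (address, name) pairs in address order.
    Hypothetical scheme: strip the prefix, then append _1, _2, ...
    until the name does not clash with any other label.
    """
    # Reserve existing global names so they are never renamed.
    used = {name for _, name in labels if not name.startswith(prefix)}
    result = []
    for addr, name in labels:
        if not name.startswith(prefix):
            result.append((addr, name))
            continue
        base = name[len(prefix):]
        candidate, n = base, 0
        while candidate in used:
            n += 1
            candidate = f"{base}_{n}"
        used.add(candidate)
        result.append((addr, candidate))
    return result


print(make_unique([(0x1002, "@LOOP"), (0x2000, "MAIN"), (0x2019, "@LOOP")]))
```

A real generator would also have to rewrite every operand that refers to a renamed label, which this sketch leaves out.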