mirror of https://github.com/fadden/6502bench.git synced 2024-12-11 13:50:13 +00:00
6502bench/SourceGen/RuntimeData/Help/tutorials.html
Andy McFadden 5010fbae37 Various minor changes
- Freeze Note brushes, so HTML export doesn't blow up when it tries
  to access them.
- Add Ctrl+Shift+E as keyboard shortcut for File > Export.
- For code/data percentage, count inline data as data.
- Tweak code/data percentage text.
- Document Merlin32 '{' bug.
- Tweak tutorial text.
2020-03-30 16:50:52 -07:00


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="main.css" rel="stylesheet" type="text/css" />
<title>Tutorials - 6502bench SourceGen</title>
</head>
<body>
<div id="content">
<h1>6502bench SourceGen: Tutorials</h1>
<p><a href="index.html">Back to index</a></p>
<p>The tutorials introduce SourceGen and cover some of the basic
features. They skim lightly over some important concepts, like the
difference between numeric and symbolic references, so reading the
manual is recommended.</p>
<ul>
<li><a href="#basic-features">#1: Basic Features</a></li>
<li><a href="#advanced-features">#2: Advanced Features</a></li>
<li><a href="#address-tables">#3: Address Table Formatting</a></li>
<li><a href="#extension-scripts">#4: Extension Scripts</a></li>
<li><a href="#visualizations">#5: Visualizations</a></li>
</ul>
<h2><a name="basic-features">Tutorial #1: Basic Features</a></h2>
<p>Start by launching SourceGen. The initial screen has a large
center area with some buttons, and some mostly-empty windows on the sides.
The buttons are shortcuts for items in the File menu.</p>
<h3>Create the project</h3>
<p>Click the "Start new project" button.</p>
<p>The New Project window has three parts. The top-left window has a
tree of known platforms, arranged by manufacturer. The top-right window
provides some details on whichever platform is selected. The bottom
window will have some information about the data file, once we choose one.</p>
<p>Scroll down in the list, and select "Generic 6502". Then click
"Select File...", navigate to the SourceGen installation directory,
open the "Examples" folder, then open the "Tutorial" folder. Select the
file named "Tutorial1", and click "Open".</p>
<p>The filename now appears in the bottom window, along with an indication
of the file's size.</p>
<p>Click "OK" to create the project.</p>
<h3>Getting Around</h3>
<p>The first thing we'll do is save the project. Some features create or
load files from the directory where the project lives, so we want to
establish that.</p>
<p>Select File &gt; Save, which will bring up a standard save-file dialog.
Make sure you're still in the Examples/Tutorial folder. The default
project file name is "Tutorial1.dis65", which is what we want, so just
click "Save".</p>
<p>The display is divided into rows, one per line of disassembled code
or data. This is a standard Windows "list view", so you can select a row
by left-clicking anywhere in it. Use Ctrl+Click to toggle the selection
on individual lines, and Shift+Click to select a range of lines. You can
move the selection around with the up/down arrow keys and PgUp/PgDn. Scroll
the window with the mouse wheel or by dragging the scroll bar.</p>
<p>Each row is divided into nine columns. You can adjust the column
widths by clicking and dragging the column dividers in the header. The
columns on the right side of the screen are similar to what you'd find
in assembly source code: label, opcode, operand, comment. The columns
on the left are what you'd find in a disassembly (file offset, address,
raw bytes), plus some information about processor status flags and line
attributes that may or may not be useful to you. If you find any of
these distracting, collapse the column.</p>
<p>Click on the fourth line down, which has address 1002. The line has
a label, "L1002", and is performing an indexed load from L1017. Both
of these labels were automatically generated, and are named for the
address at which they appear. When you clicked on the line, a few
things happened:</p>
<ul>
<li>The line was highlighted in the system selection color (usually
blue).</li>
<li>Address 1017 and label L1017 were highlighted. When you select
a line with an operand that targets an in-file address, the target
address is highlighted.</li>
<li>An entry appeared in the References window. This tells you that the
only reference to L1002 is a branch from address $100B.</li>
<li>The Info window filled with a bunch of text that describes the
line format and some details about the LDA instruction.</li>
</ul>
<p>Click some other lines, such as address $100B and $1014. Note how the
highlights and contents of other windows change.</p>
<p>Click on L1002 again, then double-click on the opcode ("LDA"). The
selection jumps to L1017. When an operand references an in-file address,
double-clicking on the opcode will take you to it. (Double-clicking on
the operand itself opens a format editor; more on that later.)</p>
<p>With line L1017 selected, double-click on the line that appears in the
References window. Note the selection jumps to L1002. You can immediately
jump to any reference.</p>
<p>At the top of the Symbols window on the right side of the screen is a
row of buttons. Make sure "Auto" and "Addr" are selected. You should see
three labels in the window (L1002, L1014, L1017). Double-click on "L1014"
in the Symbols list. The selection jumps to the appropriate line.</p>
<p>Select Navigate &gt; Find. Type "hello", and hit Enter. The selection will
move to address $100E, which is a string that says "hello!". You can use
Navigate &gt; Find Next to try to find the next occurrence (there isn't one). You
can search for any text that appears in the rightmost columns (label, opcode,
operand, comment).</p>
<p>Select Navigate &gt; Go To. You can enter a label, address, or file offset.
Enter "100b" to set the selection to the line at address $100B.</p>
<p>Near the top-left of the SourceGen window is a set of toolbar icons.
Click the curly left-pointing arrow, and watch the selection move. Click
it again. Then click the curly right-arrow a couple of times. Whenever
you jump around in the file by using the Go To feature, or by double-clicking
on opcodes or lines in the side windows, the locations are added to a
navigation history. The arrows let you move forward and backward
through it.</p>
<h3>Editing</h3>
<p>Click the very first line of the file, which is a comment that says
something like "6502bench SourceGen vX.Y.Z". There are three ways to
open the comment editor:</p>
<ol>
<li>Select Actions &gt; Edit Long Comment from the menu bar.</li>
<li>Right-click, and select Edit Long Comment from the
pop-up menu. (This menu is exactly the same as the Actions menu.)</li>
<li>Double-click the comment.</li>
</ol>
<p>Most things in the code list will respond to a double-click.
Double-clicking on addresses, flags, labels, operands, and comments will
open editors for those things. Double-clicking on a value in the "bytes"
column will open a floating hex dump viewer. This is usually the most
convenient way to edit something: point and click.</p>
<p>Double-click the comment to open the editor. Type some words into the
upper window, and note that a formatted version appears in the bottom
window. Experiment with the maximum line width and "render in box"
settings to see what they do. You can hit Enter to create line breaks,
or let SourceGen wrap lines for you. When you're done, click "OK". (Or
hit Ctrl+Enter.)</p>
<p>When the dialog closes, you'll see your new comment in place at the
top of the file. If you typed enough words, your comment will span
multiple lines. You can select the comment by selecting any line in it.</p>
<p>Click on the comment, then shift-click on L1014. Right-click, and look
at the menu. Nearly all of the menu items are disabled. Most edit features
are only enabled when a single instance of a relevant item is selected, so
for example Edit Long Comment won't be enabled if you have an instruction
selected.</p>
<p>Let's add a note. Click on $100E (the line with "hello!"), then
select Actions &gt; Edit Note. Type a few words, pick a color, and click "OK"
(or hit Ctrl+Enter). Your note appears in the code, and also in the
window on the bottom left. Notes are like long comments, with three key
differences:</p>
<ol>
<li>You can't pick their line width, but you can pick their color.</li>
<li>They don't appear in generated assembly sources, making them
useful for leaving notes to yourself as you work.</li>
<li>They're listed in the Notes window. Double-clicking them jumps
the selection to the note, making them useful as bookmarks.</li>
</ol>
<p>It's time to do something with the code. If you look at what the code
does you'll see that it's copying several dozen bytes from $1017
to $2000, then jumping to $2000. It appears to be relocating the next
part of the code before
executing it. We want to let the disassembler know what's going on, so
select the line at address $1017 and then
Actions &gt; Set Address. (Or double-click the "1017" in the Addr column.)
In the Set Address dialog, type "2000", and hit Enter.</p>
<p>Note the way the code list has changed. When you changed the address,
the "JMP $2000" at L1014 found a home inside the bounds of the file, so
the code tracer was able to find the instructions there.</p>
<p>From the menu, select Edit &gt; Undo. Notice how everything reverts to
the way it was. Now, select Edit &gt; Redo. You can undo any change you
make to the project. (The undo history is <strong>not</strong> saved in
the project file, though, so when you exit the program the history is
lost.)</p>
<p>Notice that, while the address column has changed, the offset column
has not. File offsets never change, which is why they're shown here and
in the References and Notes windows. (They can, however, be distracting,
so you'll be forgiven if you reduce the offset column width to zero.)</p>
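<p>The relationship between offsets and addresses can be sketched in a few
lines of Python. This is only an illustration, not SourceGen's code: the
<code>ADDRESS_MAP</code> list and the <code>offset_to_address</code> helper
are hypothetical names, and the sketch assumes the file initially loads
at $1000, so that address $1017 sits at file offset $17.</p>
<pre>
```python
# Hypothetical address map for Tutorial1 after the Set Address step.
# Each entry is (file offset, address assigned at that offset).
ADDRESS_MAP = [(0x0000, 0x1000), (0x0017, 0x2000)]

def offset_to_address(offset, address_map=ADDRESS_MAP):
    """Map a file offset to an address using the last region that starts at or before it."""
    candidates = [entry for entry in address_map if offset >= entry[0]]
    region_offset, region_address = max(candidates)
    return region_address + (offset - region_offset)

print(hex(offset_to_address(0x0014)))  # the JMP, still in the first region
print(hex(offset_to_address(0x0017)))  # first byte of the relocated code
```
</pre>
<p>Changing an address only edits the map. The offsets themselves are
fixed by the file's contents, which is why they make stable references.</p>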
<p>On the line at address $2000, select Actions &gt; Edit Label, or
double-click on the label "L2000". Change the label to "MAIN", and hit
Enter. The label changes on that line, and on the two lines that refer
to address $2000. (If you're not sure which lines refer to address $2000,
select line $2000 and look at the References window.)</p>
<p>On that same line, select Actions &gt; Edit Comment. Type a short
comment, and hit Enter. Your comment appears in the "comment" column.</p>
<h3>Editing Instruction Operands</h3>
<p>Select the line with address $2003 ("CMP #$04"), then
Actions &gt; Edit Operand. This allows you to pick how you want the
operand to look. It's currently set to "Default", which for an 8-bit
immediate argument means it's shown as a hexadecimal value. Click
"Binary", then "OK". It now appears as a binary value.</p>
<p>The operand in the LDA instruction at line $2000 refers to an address
($3000) that isn't part of the file. We want to create an equate directive to
give it a name. With the line at $2000 selected, use Actions &gt; Edit Operand,
or double-click on "$3000". Select the "Symbol" radio button, then type
"INPUT" in the text box. Click "OK".</p>
<p>Disappointed? The instruction is unchanged. The problem is that we
updated the operand to reference a symbol that doesn't exist. This fact
is noted in a message that appeared at the bottom of the screen. Open the
operand editor again, but this time click on "Create Project Symbol" at
the bottom left. Enter "INPUT" in the Label field, and click "OK", then
click "OK" in the operand editor.</p>
<p>That's better. The instruction looks the way we wanted it to, and the
message at the bottom of the window disappeared. If you scroll up to the
top of the project, you'll see that there's now a ".EQ" line for
the symbol.</p>
<p>Operands that refer to in-file locations behave similarly. Select the
line two down, at address $2005, and Actions &gt; Edit Operand. Enter the
symbol "IS_OK". (Note you don't actually have to click Symbol first -- if
you just start typing as soon as the dialog opens, it'll select Symbol
for you automatically.) Click "OK".</p>
<p>As before, nothing appears to have happened, but if you were watching
carefully you would have noticed that the label at $2009 ("L2009") has
disappeared. This happened because the code at $2005 used to have a
<i>numeric</i> reference to $2009, and SourceGen automatically created a
label. However, you changed the code at $2005 to have a <i>symbolic</i>
reference to a symbol called "IS_OK", and there were no other numeric
references to $2009, so the auto-label was no longer
needed. Because IS_OK doesn't exist, the operand at $2005 is just formatted
as a hexadecimal value. (There's also now a message at the bottom of the
window telling us this.)</p>
<p>Let's fix this. Select the line at address $2009, then
Actions &gt; Edit Label. Enter "IS_OK", and hit Enter. (NOTE: labels are
case-sensitive, so it needs to match the operand at $2005 exactly.) You'll
see the new label appear, and the operand at line $2005 will use it.</p>
<!--<p>There's an easier way. Select Edit &gt; Undo twice, to get back to the
state where line $2005 says "BCC L2009", and line $2009 has the label
L2009. Now double-click on the "BCC" opcode (not operand) at address
$2005. This moves the selection to $2009. Double-click on the label field,
and enter "IS_OK". Hit "OK".</p>
<p>You should now see that both the operand at $2005 and the label at
$2009 have changed to IS_OK, accomplishing what we wanted to do in a
single step. The key difference is that we haven't explicitly set a
format for the BCC operand -- we just defined a label, and SourceGen
used it automatically.</p>-->
<p>There's another way to set a label that is simpler and more convenient.
Select Edit &gt; Undo twice, to get back to the state where line $2005
says "BCC L2009", and line $2009 has the label L2009.
Double-click on the operand on line $2005 ("L2009") to open the operand
editor, then in the bottom left panel click "Create Label". Type "IS_OK",
then click "OK". Make sure the operand format is still set to Default,
then click "OK".</p>
<p>This puts the label IS_OK at line $2009, and we can see the BCC
instruction has it as well. We were able to leave the BCC instruction
set to Default format because the numeric reference to $2009 was
automatically resolved to the IS_OK label. You could do the same thing
by editing the label on line $2009 directly as we did earlier, but
in many cases -- particularly when the operand's target address is far
off screen -- it's more convenient to work through the operand editor.</p>
<h3>Unique vs. Non-Unique Labels</h3>
<p>Most assemblers have a notion of "local" labels, which go out of
scope when a non-local (global) label is encountered. The actual
definition of "local" is assembler-specific, but SourceGen allows you
to create labels that serve the same purpose.</p>
<p>By default, newly-created labels have global scope and must be
unique. You can change these attributes when you edit the label. Up near the
top of the file, at address $1002, double-click on the label ("L1002").
Change the label to "LOOP" and click the "non-unique local" button.
Click OK.</p>
<p>The label at line $1002 (and the operand on line $100B) should now
be "@LOOP". By default, '@' is used to indicate non-unique labels,
though you can change it to a different character in the application
settings.</p>
<p>At address $2019, double-click to edit the label ("L2019"). If
you type "MAIN" or "IS_OK" with Global selected you'll get an error,
but if you type "@LOOP" it will be accepted. Note the "non-unique local"
button is selected automatically if you start a label with '@' (or
whatever character you have configured). Click OK.</p>
<p>You now have two lines with the same label. The assembly source
generator may "promote" them to globals or rename them if your chosen
assembler requires it.</p>
<h3>Editing Data Operands</h3>
<p>There's some string and numeric data down at the bottom of the file. The
final string appears to be multiple strings stuck together. (You may need
to increase the width of the Operand column to see the whole thing.) Notice
that the opcode for the very last line is '+', which means it's a
continuation of the previous line. Long data items can span multiple
lines, split every 64 characters (including delimiters), but they are
still single items: selecting any part selects the whole.</p>
<p>Select the last line in the file, then Actions &gt; Edit Operand. You'll
notice that this dialog is much different from the one you got when editing
the operand of an instruction. At the top it will say "65 bytes
selected". You can format this as a single 65-byte string, as 65 individual
items, or various things in between. For now, select "Single bytes", and
then on the right, select "ASCII (low or high) character". Click "OK".</p>
<p>Each character is now on its own line. The selection still spans the
same set of addresses.</p>
<p>Select address $203D on its own, then Actions &gt; Edit Label. Set the
label to "STR1". Move up a bit and select address $2030, then scroll to
the bottom and shift-click address $2070. Select Actions &gt; Edit Operand.
At the top it should now say, "65 bytes selected in 2 groups".
There are two groups because the presence of a label split the data into
two separate regions. From the "Character encoding" pop-up down in the
"String" section, make sure "Low or High ASCII" encoding is selected,
then select the "mixed character and non-character" string type and
click "OK".</p>
<p>We now have two ".STR" lines, one for "string zero ", and one with the
STR1 label and the rest of the string data. This is okay, but it's not
really what we want. The code at $200B appears to be loading a 16-bit
address from data at $2025, so we want to use that if we can.</p>
<p>Select Edit &gt; Undo three times. You should be back to the state where
there's a single ".STR" line at the bottom of the file, split across two
lines with a '+'.</p>
<p>Select the line at $2026. This is currently formatted as a string,
but that appears to be incorrect, so let's format it as individual bytes
instead. There's an easy way to do that: use Actions &gt; Toggle Single-Byte
Format (or hit Ctrl+B).</p>
<p>The data starting at $2025 appears to be 16-bit addresses that point
into the table of strings, so let's format them appropriately.</p>
<p>Double-click the operand column on line $2025 ("$30") to open
the operand data format editor. Because you only have one byte selected,
most of the options are disabled. This won't do what we want, so
click "Cancel".</p>
<p>Select the line at $2025, then shift-click the line at $202E. Right-click
and select Edit Operand. If you selected the correct set of bytes,
the top line in the dialog should now say, "10 bytes selected". Because
10 is a multiple of two, the 16-bit formats are enabled. It's not a multiple
of 3 or 4, so the 24-bit and 32-bit options are not enabled. Click the
"16-bit words, little-endian" radio button, then over to the right, click
the "Address" radio button. Click "OK".</p>
<p>We just told SourceGen that those 10 bytes are actually five 16-bit numeric
references. SourceGen determined that the addresses are contained in the
file, and created labels for each of them. Labels only work if they're
on their own line, so the long string was automatically split into five
separate ".STR" statements.</p>
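<p>If it helps to see what "16-bit words, little-endian" means, here is a
minimal Python sketch. The <code>decode_addresses</code> helper is a
hypothetical name, and most of the table bytes are placeholders; only
the leading $30 is taken from the tutorial text.</p>
<pre>
```python
# Placeholder table bytes; the real values live in Tutorial1 at $2025.
table = bytes([0x30, 0x20, 0x3D, 0x20, 0x4A, 0x20, 0x57, 0x20, 0x64, 0x20])

def decode_addresses(data):
    """Split data into 16-bit little-endian words (low byte first)."""
    if len(data) % 2 != 0:
        raise ValueError("length must be a multiple of two")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

print([f"${addr:04X}" for addr in decode_addresses(table)])
```
</pre>
<p>Ten bytes yield five addresses, which is why the dialog only offers the
16-bit formats when the selection length is a multiple of two.</p>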
<p>Use File &gt; Save (or hit Ctrl+S) to save your work.</p>
<h3>Generating Assembly Code</h3>
<p>You can generate assembly source code from the disassembled data.
Select File &gt; Assembler (or hit Ctrl+Shift+A) to open the source generation
and assembly dialog.</p>
<p>Pick your favorite assembler from the drop list at the top right,
then click "Generate". An assembly source file will be generated in the
directory where your project file lives, named after a combination of the
project name and the assembler name. A preview of the generated source
appears in the top window. (It's a "preview" because it has line numbers
added and is cut off after a certain limit.)</p>
<p>If you have a cross-assembler installed and configured, you can run
it by clicking "Run Assembler". The output from the assembler will appear
in the lower window, along with an indication of whether the assembled
file matches the original. (Barring bugs in SourceGen or the assembler,
it should always match exactly.)</p>
<p>Click "Close" to close the window.</p>
<h3>End of Part One</h3>
<p>At this point you know enough to work with a SourceGen project. Continue
on to the next tutorial to learn more.</p>
<hr/>
<h2><a name="advanced-features">Tutorial #2: Advanced Features</a></h2>
<p>This tutorial will walk you through some of the fancier things SourceGen
can do. We assume you've already finished the Basic Features tutorial.</p>
<p>Start a new project. Select "Generic 6502". For the data file, navigate
to the Examples directory, then from the Tutorial directory
select "Tutorial2".</p>
<p>The first thing you'll notice is that we immediately ran into a BRK,
which is a pretty reliable sign that we're not in a code section. The
generic profile puts a code entry point hint on the first byte, but that's
wrong here. This particular file begins with <code>00 20</code>, which
could be a load address (some C64 binaries look like this). So let's start
with that assumption.</p>
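<p>The load-address guess can be expressed as a short Python sketch. The
<code>split_load_address</code> helper is a hypothetical name, and the
bytes after <code>00 20</code> in the example are placeholders rather
than the actual Tutorial2 contents.</p>
<pre>
```python
def split_load_address(file_bytes):
    """Treat the first two bytes as a little-endian load address; return (address, payload)."""
    if 2 > len(file_bytes):
        raise ValueError("file too short to hold a load address")
    address = int.from_bytes(file_bytes[:2], "little")
    return address, file_bytes[2:]

address, payload = split_load_address(bytes([0x00, 0x20, 0xA9, 0x00]))
print(f"${address:04X}", len(payload))
```
</pre>
<p>The low byte comes first, so <code>00 20</code> reads as $2000, and
everything after those two bytes would be loaded starting there.</p>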
<p>Click on the first line of code at address $1000, and select
Actions &gt; Remove Hints. The $20 got absorbed into a string. The string
is making it hard to manipulate the next few bytes, so let's fix that by
selecting Edit &gt; Toggle Data Scan (Ctrl+D). This turns off the feature
that looks for strings and .FILL regions, so now each uncategorized byte is
on its own line.</p>
<p>You could select the first two lines and use Actions &gt; Edit Operand
to format them as a 16-bit little-endian hex value, but there's a shortcut:
select only the first line of code, then Actions &gt; Format As Word (Ctrl+W). It
automatically grabbed the following byte and combined them. Since we believe
$2000 is the load address for everything that follows, click on the line
with address $1002, select Actions &gt; Set Address, and enter "2000". With
that line still selected, use Actions &gt; Hint As Code Entry Point
(Ctrl+H then Ctrl+C) to identify it as code.</p>
<p>That looks better, but it's branching off the bottom of the screen
(unless you have a really tall screen or small fonts) because of all the
intervening data. Use Edit &gt; Toggle Data Scan to turn the string
finder back on.</p>
<p>There are four strings starting at address $2004, each of which is
followed by $00. These look like null-terminated strings, so let's make
it official. But first, let's do it wrong. Click on the line with
address $2004 to select it. Hold the shift key down, then double-click
on the operand field of the line with address $2031 (i.e. double-click on
the words "last string").</p>
<p>The Edit Data Operand dialog opens, but the null-terminated strings
option is not available. This is because we didn't include the null byte
on the last string. To be recognized as one of the "special" string types,
every selected string must match the expected pattern.</p>
<p>Cancel out of the dialog. Hold the shift key down, and double-click
on the operand on line $203c (<code>$00</code>).
You should see "Null-terminated strings (4)" as an available
option now (make sure the Character Encoding pop-up is set to
"Low or High ASCII"). Click on that, then click "OK". The strings are now
shown as .ZSTR operands.</p>
<p>It's wise to save your work periodically. Use File &gt; Save to create
a project file for Tutorial2.</p>
<h4>Pointers and Parts</h4>
<p>Let's move on to the code at $203d. It starts by storing a couple of
values into direct page address $02/03. This appears to be setting up a
pointer to $2063, which is a data area inside the file. So let's make it
official.</p>
<p>Select the line at address $2063, and use Actions &gt; Edit Label to
give it the label "XDATA?". The question mark on the end is there to
remind us that we're not entirely sure what this is. Now edit the
operand on line $203d, and set it to the symbol "XDATA", with the part
"low". The question mark isn't really part of the label, so you don't
need to type it here. Edit the operand on line $2041,
and set it to "XDATA" with the part "high". (The symbol text box
gets focus immediately, so you can start typing the symbol name as soon
as the dialog opens; you don't need to click around first.) If all
went well, the operands should now read <code>LDA #&lt;XDATA?</code>
and <code>LDA #&gt;XDATA?</code>.</p>
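<p>The "low" and "high" parts are simply the two bytes of the 16-bit
address. A rough Python sketch, using hypothetical helper names
<code>low_part</code> and <code>high_part</code>, shows how the two
immediate loads rebuild the pointer in $02/$03:</p>
<pre>
```python
XDATA = 0x2063  # address of the label from the tutorial

def low_part(address):
    """Low byte of a 16-bit address."""
    return address % 256

def high_part(address):
    """High byte of a 16-bit address."""
    return (address // 256) % 256

# Emulate the two stores into the zero-page pointer at $02/$03.
zero_page = {0x02: low_part(XDATA), 0x03: high_part(XDATA)}
pointer = zero_page[0x02] + 256 * zero_page[0x03]
print(f"${pointer:04X}")
```
</pre>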
<p>Let's give the pointer a name. Select line $203d, and use
Actions &gt; Create Local Variable Table to create an empty table.
Click "New Symbol" on the right side. Leave the Address button selected.
Set the Label field to "PTR1", the Value field to $02, and the width
to 2 (it's a 2-byte pointer). Click "OK" to create the entry, and then
"OK" to update the table.</p>
<p>There's now a ".var" statement (similar to a .equ) above line $203d,
and the stores to $02/$03 have changed to "PTR1" and "PTR1+1".</p>
<p>Double-click on the JSR on line $2045 to jump to L20A7. This just
loads a value from $3000 into the accumulator and returns, so not much
to see here. Hit the back-arrow in the toolbar to jump back to the JSR.</p>
<p>The next bit of code masks the accumulator so it holds a value between
0 and 3, then doubles it and uses it as an index into PTR1. We know PTR1
points to XDATA, which looks like it has some 16-bit addresses. The
values loaded are stored in two more zero-page locations, $04-05.</p>
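<p>As a rough model of that lookup, consider the Python sketch below. The
<code>fetch_pointer</code> helper is a hypothetical name, and the table
contents are placeholders, apart from $2004 and $2031, which the tutorial
mentions as string addresses.</p>
<pre>
```python
# Placeholder table: four 16-bit little-endian addresses.
table = bytes([0x04, 0x20, 0x10, 0x20, 0x1C, 0x20, 0x31, 0x20])

def fetch_pointer(table_bytes, value):
    """Mask value to 0-3, double it, and read the two-byte entry at that index."""
    index = (value % 4) * 2            # keep the low two bits, then double
    low = table_bytes[index]
    high = table_bytes[index + 1]
    return low + 256 * high            # the value that ends up in $04/$05

print(f"${fetch_pointer(table, 0x07):04X}")
```
</pre>
<p>The index is doubled because each table entry occupies two bytes.</p>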
<p>Let's make these a pointer as well. Double-click the operand on
line $204e ("$04"), and click "Create Local Variable". Set the Label
to "PTR2" and the width to 2. Click "OK" to create the symbol, then
"OK" to close the operand editor, which should still be set to Default --
we didn't actually edit the operand, we just used the operand edit
dialog as a convenient way to create a local variable table entry. All
accesses to $04/$05 now use PTR2, and there's a new entry in the local
variable table we created earlier.</p>
<p>The next bit of code copies bytes from PTR2 to $0400, stopping when it
hits a zero byte. Looks like this is copying null-terminated strings.
This confirms our idea that XDATA holds 16-bit addresses, so let's
format it. Select lines $2063 to $2066, and Actions &gt; Edit Operand.
It should say "8 bytes selected" at the top. Select "16-bit words,
little-endian", and then from the Display As box, select "Address".
Click "OK". XDATA should now be four <code>.dd2</code> 16-bit addresses.
If you scroll up, you'll see that the .ZSTR strings near the top now have
labels that match the operands in XDATA.</p>
<p>Now that we know what XDATA holds, let's rename it. Change the label
to STRADDR. The symbol parts in the operands at $203d and $2041 update
automatically.</p>
<p>Let's pause briefly to look at the cycle-count feature. Use
Edit &gt; Settings to open the app settings panel, then select the
Asm Config tab. Click the "Show cycle counts" checkbox, then click "OK".</p>
<p>Every line with an instruction now has a cycle count on it. The cycle
counts are adjusted for everything SourceGen can figure out. For example,
the BEQ on line $205a shows "2+" cycles, meaning that it takes at least two
cycles but might take more. That's because conditional branches take an
extra cycle if the branch is taken. The BNE on line $2061 shows 3 cycles,
because we know that the branch is always taken and doesn't cross a page
boundary. (If you want to see why it's always taken,
look at the value of the 'Z' flag in the "flags" column. Lower-case 'z'
means the zero-flag is clear. The flag became known at the
<code>ORA #$80</code> line, whose result can never be zero.)</p>
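<p>The branch timing rules can be sketched in Python. The function names
are hypothetical, the "2+" label for an undetermined branch is an
assumption modeled on the display described above, and the target address
in the example is a placeholder.</p>
<pre>
```python
def branch_cycles(branch_address, target, taken):
    """Cycle count for a 6502 conditional branch when the outcome is known."""
    if not taken:
        return 2
    next_instruction = branch_address + 2   # branches are two bytes long
    crosses_page = (next_instruction // 256) != (target // 256)
    return 4 if crosses_page else 3

def cycle_label(branch_address, target, taken=None):
    """Return '2+' when the outcome is unknown, else the exact count."""
    if taken is None:
        return "2+"
    return str(branch_cycles(branch_address, target, taken))

print(cycle_label(0x205A, 0x2063))        # outcome unknown
print(cycle_label(0x2061, 0x2050, True))  # always taken, same page
```
</pre>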
<p>The cycle-count comments are included in assembled output as well. If
you add an end-of-line comment, it appears after the cycle count.</p>
<p>Hit Ctrl+S to save your project. Make that a habit.</p>
<h4>Inline Data</h4>
<p>Consider the code at address $206B. It's a JSR followed by some
ASCII text, then a $00 byte, and then what might be code. Double-click
on the JSR opcode to jump to $20AB to see the function. It pulls the
call address off the stack, and uses it as a pointer. When it encounters
a zero byte, it breaks out of the loop, pushes the adjusted pointer
value back onto the stack, and returns.</p>
<p>This is an example of "inline data", where a function uses the return
address to get a pointer to data. The return address is adjusted to
point past the inline data before returning (technically, it points at
the very last byte of the inline data, because RTS jumps to address + 1).</p>
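<p>The address arithmetic is easier to follow with a small model. In this
Python sketch the <code>read_inline</code> helper is a hypothetical name
and the text bytes are placeholders; only the addresses ($206B for the
JSR, $2077 for the zero byte) come from the tutorial.</p>
<pre>
```python
# Placeholder memory image: nine text bytes at $206E-$2076, then $00 at $2077.
memory = {0x206E + i: b for i, b in enumerate(b"some text\x00")}

def read_inline(return_address, memory):
    """Model the callee: read bytes after the JSR until $00."""
    pointer = return_address + 1          # JSR pushes the address of its own last byte
    text = bytearray()
    while memory[pointer] != 0:
        text.append(memory[pointer])
        pointer += 1
    return text.decode("ascii"), pointer  # pointer now sits on the $00 byte

jsr_address = 0x206B
text, adjusted = read_inline(jsr_address + 2, memory)
resume = adjusted + 1                     # RTS pulls the address and adds one
print(text, f"${adjusted:04X}", f"${resume:04X}")
```
</pre>
<p>The adjusted return address is $2077, the last byte of the inline data,
so execution resumes at $2078.</p>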
<p>To format the data, we first need to tell SourceGen that there's data
in line with the code. Select the line at address $206E, then
shift-click the line at address $2077. Use Actions &gt; Hint as Inline Data.</p>
<p>The data turns to single-byte values, and we now see the code
continuing at address $2078. We can format the data as a string by
using Actions &gt; Edit Operand, setting the Character Encoding to "Low or
High ASCII", and choosing "null-terminated strings".</p>
<p>That's pretty straightforward, but this could quickly become tedious if
there were a lot of these. SourceGen allows you to define scripts to
automate common formatting tasks. This is covered in a later tutorial.</p>
<h4>Odds &amp; Ends</h4>
<p>The rest of the code isn't really intended to do anything useful. It
just exists to illustrate some odd situations.</p>
<p>Look at the code starting at $2078. It ends with a BRK at $2081, which
as noted earlier is a bad sign. If you look two lines above the BRK,
you'll see that it's loading the accumulator with zero, then doing a BNE,
which should never be taken (note the cycle count for the BNE is 2). The
trick is in the two lines before that, which use self-modifying code to
change the LDA immediate operand from $00 to $ff. The BNE is actually
a branch-always.</p>
<p>We can fix this by correcting the status flags. Select line $207F,
and then Actions &gt; Override Status Flags. This lets us specify what
the flags should be before the instruction is executed. For each flag,
we can override the default behavior and specify that the flag is
clear (0), set (1), or indeterminate (could be 0 or 1). In this case,
we know that the self-modified code will be loading a non-zero value, so
in the "Z" column click on the button in the "Zero" row. Click "OK". The
BNE is now an always-taken branch, and the code list rearranges itself
appropriately (and the cycle count is now 3).</p>
<p>Continuing on, the code at $2086 touches a few consecutive locations. Edit
the label on line $2081, setting it to "STUFF". Notice how the references
to $2081 through $2084 have changed from auto-generated labels to
references to STUFF. For some projects this may be undesirable. Use
Edit &gt; Project Properties, then in the Analysis Parameters box
un-check "Seek nearby targets", and click "OK". You'll notice that the
references to $2081 and later have switched back to auto labels. If
you scroll up, you'll see that the references to PTR1+1 and PTR2+1 were
not affected, because local variables use explicit widths rather
than the "nearby" logic.</p>
<p>The nearby-target behavior is generally desirable, because it lets you
avoid explicitly labeling every part of a multi-byte data item. For now,
use Edit &gt; Undo to switch it back on.</p>
<p>The code at $2092 looks a bit strange. <code>LDX</code>, then a
<code>BIT</code> with a weird symbol, then another <code>LDX</code>. If
you look at the "bytes" column, you'll notice that the three-byte
<code>BIT</code> instruction has only one byte on its line. The
trick here is that the <code>LDX #$01</code> is embedded inside the
<code>BIT</code> instruction. When the code runs through here, X is set
to $00, then the <code>BIT</code> instruction sets some flags, then the
<code>STA</code> runs. Several lines down there's a <code>BNE</code>
to $2095, which is in the middle of the <code>BIT</code> instruction.
It loads X with $01, then also continues to the <code>STA</code>.</p>
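<p>To see why the same bytes decode two different ways, here is a small
Python sketch. It is only an illustration, not how SourceGen works internally.
The opcode values ($A2, $2C, $8D) are the standard 6502 encodings, but the
<code>STA</code> operand ($0300) and the names <code>CODE</code>,
<code>OPCODES</code>, and <code>decode</code> are made up for this example.</p>
<pre>
```python
# Hypothetical byte sequence; only the LDX/BIT overlap comes from the tutorial.
# The STA operand ($0300) is made up for illustration.
CODE = bytes([0xA2, 0x00,         # LDX #$00
              0x2C,               # BIT abs (its operand is the next two bytes)
              0xA2, 0x01,         # LDX #$01, hidden inside the BIT operand
              0x8D, 0x00, 0x03])  # STA $0300

# opcode: (mnemonic prefix, total length in bytes) for the opcodes used here
OPCODES = {0xA2: ("LDX #", 2), 0x2C: ("BIT ", 3), 0x8D: ("STA ", 3)}

def decode(code, start):
    """Decode linearly from 'start' until the bytes run out."""
    out = []
    pos = start
    while len(code) > pos:
        name, length = OPCODES[code[pos]]
        operand = code[pos + 1:pos + length]
        value = int.from_bytes(operand, "little")
        out.append("%s$%0*X" % (name, 2 * len(operand), value))
        pos += length
    return out

print(decode(CODE, 0))  # falling through from the top
print(decode(CODE, 3))  # entering via the branch into the BIT operand
```
</pre>
<p>Starting at offset 0, the second <code>LDX</code> is swallowed as the
operand of <code>BIT</code>. Starting at offset 3, it is executed as an
instruction in its own right, and both paths meet at the
<code>STA</code>.</p>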
<p>Embedded instructions are unusual but not unheard-of. (This trick is
used extensively in Microsoft BASICs, such as Applesoft.) When you see the
extra symbol in the opcode field, you need to look closely at what's going
on.</p>
<hr/>
<h2><a name="address-tables">Tutorial #3: Address Table Formatting</a></h2>
<p><i>This tutorial covers one specific feature.</i></p>
<p>Start a new project. Select the Apple //e platform, click Select File
and navigate to the Examples directory. In A2-Amper-fdraw, select
<code>AMPERFDRAW#061d60</code> (ignore the existing .dis65 file). Click
"OK" to create the project.</p>
<p>Not a lot to see here -- just half a dozen lines of loads and stores.
This particular program interfaces with Applesoft BASIC, so we can make it
a bit more meaningful by loading an additional platform
symbol file. Select Edit &gt; Project Properties, then the Symbol Files
tab. Click Add Symbol Files from Runtime. The file browser starts in
the RuntimeData directory. Open the "Apple" folder, then select
<code>Applesoft.sym65</code>, and click "Open". Click "OK" to close
the project properties window.</p>
<p>The <code>STA</code> instructions now reference <code>BAS_AMPERV</code>,
which is noted as a code vector. We can see the code setting up a jump
(opcode $4c) to $1d70. As it happens, the start address of the code
is $1d60 -- the last four digits of the filename -- so let's make that
change. Double-click the initial .ORG statement, and change it from
$2000 to $1d60. We can now see that $1d70 starts right after this
initial chunk of code.</p>
<p>Select the line with address $1d70, then Actions &gt; Hint As Code
Entry Point.
More code appears, but not much -- if you scroll down you'll see that most
of the file is still data. The code at $1d70 searches through a table at
$1d88 for a match with the contents of the accumulator. If it finds a match,
it loads bytes from tables at $1da6 and $1d97, pushes them on the stack,
and then JMPs away. This code is pushing a return address onto the stack.
When the code at <code>BAS_CHRGET</code> returns, it'll return to that
address. Because of a quirk of the 6502 architecture, the address pushed
must be the desired address minus one.</p>
<p>The first byte in the first address table at $1d97 (which has the auto-label
L1D97) is $b4. The first byte in the second table is $1d. So the first
address we want is $1db4 + 1 = $1db5.</p>
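<p>The arithmetic can be sketched in a few lines of Python. The function
name <code>rts_target</code> is hypothetical; the only facts used are that
the pushed bytes form a 16-bit address and that <code>RTS</code> resumes at
that address plus one.</p>
<pre>
```python
def rts_target(low, high):
    """Address that RTS jumps to after 'high' and then 'low' are pushed."""
    pushed = high * 256 + low        # address as it sits on the stack
    return (pushed + 1) % 0x10000    # RTS adds one after pulling it

print(hex(rts_target(0xB4, 0x1D)))
```
</pre>
<p>For the first table entry this prints <code>0x1db5</code>, matching the
address worked out above.</p>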
<p>Select the line at $1db5, and use Actions &gt; Hint As Code Entry Point.
More code appears, but again it's only a few lines. Let's dress this one
up a bit. Set a label on the code at $1db5 called "FUNC". At $1d97, edit
the data item (double-click on "$b4"), click "Single bytes", then type "FUNC"
(note the text field gets focus immediately, and the radio button
automatically switches to "symbolic reference" when you start typing).
Click "OK". The operand at $1d97 should now say <code>&lt;FUNC-1</code>.
Repeat the process at $1da6, this time clicking the "High" part radio button
below the symbol entry text box,
to make the operand there say <code>&gt;FUNC</code>. (If it says
<code>&lt;FUNC-152</code>, you forgot to select the High part.)</p>
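<p>If the adjustments look arbitrary, it may help to evaluate the operand
expressions by hand. The sketch below assumes the adjustment is applied to
the symbol's address before the low or high byte is taken; the helper names
<code>low_part</code> and <code>high_part</code> are made up for this
example.</p>
<pre>
```python
FUNC = 0x1DB5  # address of the label created above

def low_part(symbol, adjust=0):
    """Low byte of (symbol + adjust)."""
    return (symbol + adjust) % 256

def high_part(symbol, adjust=0):
    """High byte of (symbol + adjust)."""
    return ((symbol + adjust) % 0x10000) // 256

print(hex(low_part(FUNC, -1)))    # byte stored at $1d97
print(hex(high_part(FUNC)))       # byte stored at $1da6
print(hex(low_part(FUNC, -152)))  # same byte, expressed as a low part
```
</pre>
<p>The last line shows why the wrong radio button still produces a valid
operand: $1db5 minus 152 is $1d1d, whose low byte happens to be $1d. The
expression is correct, just not meaningful.</p>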
<p>We've now changed the first entry in the table to symbolic references.
You could repeat these steps for the remaining items, but there's a faster
way. Click on the line at address $1d97, then shift-click the line at
address $1da9 (which should be <code>.FILL 12,$1e</code>). Select
Actions &gt; Format Address Table.</p>
<p>Contrary to first impressions, this imposing dialog does not allow you
to launch objects into orbit. There are a variety of common ways to
structure an address table, all of which are handled here. You can
configure the various parameters and see the effects as you make
each change.</p>
<p>The message at the top should indicate that there are 30 bytes
selected. In Address Characteristics, click the "Parts are split across
sub-tables" checkbox and the "adjusted for RTS/RTL"
checkbox. As soon as you do, the first line of the Generated Addresses
list should show the symbol "FUNC". The rest of the addresses will look like
<code>(+) T1DD0</code>. The "(+)" means that a label was not found at
that location, so a label will be generated automatically.</p>
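<p>The two checkboxes correspond to a simple decoding rule, sketched here in
Python. This is an illustration of the table layout, not SourceGen's own
code. The function name is hypothetical, and the sample data is a made-up
two-entry table: the first pair ($b4, $1d) comes from the tutorial, the
second pair was chosen so that it lands on $1dd0.</p>
<pre>
```python
def decode_split_table(table, rts_adjust=True):
    """Pair low bytes (first half) with high bytes (second half)."""
    count = len(table) // 2
    lows, highs = table[:count], table[count:]
    addrs = []
    for low, high in zip(lows, highs):
        addr = high * 256 + low
        if rts_adjust:
            addr = (addr + 1) % 0x10000  # stored value is target minus one
        addrs.append(addr)
    return addrs

# Two low bytes followed by two high bytes (illustrative data).
sample = bytes([0xB4, 0xCF, 0x1D, 0x1D])
print([hex(a) for a in decode_split_table(sample)])
```
</pre>
<p>This prints <code>['0x1db5', '0x1dd0']</code>. The real table has 15
entries, so the same rule turns the 30 selected bytes into 15 addresses.</p>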
<p>Down near the bottom, check the "add code entry hint if needed" checkbox.
Because we saw the table contents being pushed onto the stack for RTS,
we know that they're all code entry points.</p>
<p>Click "OK". The table of address bytes at $1d97 should now all be
references to symbols -- 15 low parts followed by 15 high parts. If you
scroll down, you should see nothing but instructions until you get to the
last dozen bytes at the end of the file. (If this isn't the case, use
Edit &gt; Undo, then work through the steps again.)</p>
<p>The formatter did the same steps you went through earlier -- set a
label, apply the label to the low and high bytes in the table, add a
code entry point hint -- but did several of them at once.</p>
<p>We don't want to save this project, so select File &gt; Close. When
SourceGen asks for confirmation, click Discard &amp; Continue.</p>
<hr/>
<h2><a name="extension-scripts">Tutorial #4: Extension Scripts</a></h2>
<p><i>This tutorial covers one specific feature.</i></p>
<p>Some repetitive formatting tasks can be handled with automatic scripts.
This is especially useful for inline data, which can confuse the code
analyzer.</p>
<p>An earlier tutorial demonstrated how to manually mark bytes as
inline data. We're going to do it a faster way. For this tutorial,
start a new project with "Generic 6502", and in the SourceGen
Examples/Tutorial directory select "Tutorial4".</p>
<p>We'll need to load scripts from the project directory, so we have to
save the project. Select File &gt; Save and use the default name ("Tutorial4.dis65").</p>
<p>Take a look at the disassembly listing. The file starts with a JSR
followed by a string that begins with a small number. This appears to be
a string with a leading length byte. We want to load a script that
can handle that, so use Edit &gt; Project Properties, select the
Extension Scripts tab, and click "Add Scripts from Project". The file
browser opens in the project directory. Select the file
"InlineL1String.cs", click Open, then OK.</p>
<p>Nothing happened. If you look at the script with an editor (and you
know some C#), you'll see that it's looking for a JSR to a function called
"PrintInlineL1String". So let's give it one.</p>
<p>Double-click the JSR operand ("L1026"), click "Create Label", and
enter "PrintInlineL1String". Remember that labels are case-sensitive;
you must enter it exactly as shown. Hit "OK" to accept the label, and "OK"
to close the operand editor. If all went well, address $1003 should now be
an L1 string "How long?", and address $100D should be another JSR.</p>
<p>The next JSR appears to be followed by a null-terminated string, so
we'll need something that handles that. Go back into Project Properties
and add the script "InlineNullTermString.cs".</p>
<p>This script is slightly different, in that it handles any JSR to a label
that starts with "PrintInlineNullString". So let's give it a couple of
those.</p>
<p>Double-click the operand on line $100D ("L1027"), click Create Label,
and set the label to "PrintInlineNullStringOne". Hit "OK" twice. That
formatted the first one and got us to the next JSR. Repeat the process
on line $1019 ("L1028"), setting the label to "PrintInlineNullStringTwo".</p>
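<p>The two inline formats handled above are easy to describe in code. The
scripts themselves are written in C#; the Python below is only a sketch of
the parsing idea, with hypothetical function names, and it assumes the
strings are plain ASCII. The null-terminated sample text is made up.</p>
<pre>
```python
def read_l1_string(data, offset):
    """String with a one-byte length prefix. Returns (text, bytes used)."""
    length = data[offset]
    text = data[offset + 1:offset + 1 + length].decode("ascii")
    return text, 1 + length

def read_null_string(data, offset):
    """String ending in a zero byte. Returns (text, bytes used)."""
    end = data.index(0, offset)
    return data[offset:end].decode("ascii"), end - offset + 1

print(read_l1_string(b"\x09How long?", 0))
print(read_null_string(b"Example\x00", 0))
```
</pre>
<p>The L1 string uses 10 bytes in total, which is why the string at $1003
is followed by the next instruction at $100D. Knowing the number of bytes
used is what lets the analyzer resume disassembly at the right place.</p>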
<p>The entire project is now nicely formatted. In a real project the
"Print Inline" locations would be actual print functions, not just RTS
instructions. There would likely be multiple JSRs to the print function,
so labeling a single function entry point could format dozens of inline
strings and clean up the disassembly automatically. The reason for
allowing wildcard names is that some functions may have multiple
entry points or chain through different locations.</p>
<p>Extension scripts can make your life much easier, but they do require
some programming experience. See the
<a href="advanced.html#extension-scripts">manual</a> for more details.</p>
<hr/>
<h2><a name="visualizations">Tutorial #5: Visualizations</a></h2>
<p><i>This tutorial covers one specific feature.</i></p>
<p>Many programs contain a significant amount of graphical data. This is
especially true for games, where the space used for bitmaps is often
larger than the space required for the code. When disassembling a program
it can be very helpful to be able to see the contents of the data
regions in graphical form.</p>
<p>Start a new project with "Generic 6502", and in the SourceGen
Examples/Tutorial directory select "Tutorial5". We'll need to load an
extension script from
the project directory, so immediately save the project, using the
default name ("Tutorial5.dis65").</p>
<p>Normally a project will give you some sort of hint as to the data
format, e.g. the graphics might be a platform-specific sprite. For
non-standard formats you can glean dimensions from the drawing code. For
the purposes of this tutorial we're just using a simple monochrome bitmap
format, with 8 pixels per byte, and we know that our images are for
a Tic-Tac-Toe game. The 'X' and the 'O' are 8x8, the game board is 40x40.
The bitmaps are sprites with transparency, so pixels are either solid
or transparent.</p>
<p>The first thing we need to do is load an extension script that can
decode this format. The RuntimeData directory has a few, but for this
tutorial we're using a custom one. Select Edit &gt; Project Properties,
select the Extension Scripts tab, and click "Add Scripts from Project".
Double-click on "VisTutorial5.cs", then click "OK".</p>
<p>The addresses of the three bitmaps are helpfully identified by the
load instructions at the top of the file. Select the line at
address $100A, then Actions &gt; Create/Edit Visualization Set. In
the window that opens, click "New Visualization".</p>
<p>We're going to ignore most of what's going on and just focus on the
list of parameters at the bottom. The file offset indicates where in
the file the bitmap starts; note this is an offset, not an address
(that way, if you change the address, your visualizations don't break).
This is followed by the bitmap's width in bytes, and the bitmap's height.
Because we have 8 pixels per byte, we're currently showing an 8x1 image.
We'll come back to row stride.</p>
<p>We happen to know (by playing the game and/or reading the fictitious
drawing code) that the image is 8x8, so change the value in the height
field to 8. As soon as you do, the preview window shows a big blue 'X'.
(The 'X' is 7x7; the last row/column of pixels are transparent so adjacent
images don't blend into each other.)</p>
<p>Let's try doing it wrong. Add a 0 to make the height 80. You can see
some additional bitmap data. Add another 0 to make it 800. Now you get
a big red X, and the "Height" parameter is shown in red. That's because
the maximum value for the height is 512, as shown by "[1,512]" on the
right.</p>
<p>Change it back to 8, and hit "OK". Hit "OK" in the Edit Visualization
Set window as well. You should now see the blue 'X' in the code listing
above line $100A.</p>
<p>Repeat the process at line $1012: select the line, create a visualization
set, create a new visualization. The height will default to 8 because
that's what you used last time. Click "OK" in both dialogs to close them.</p>
<p>Repeat the process at line $101A, but this time the image is 40x40
rather than 8x8. Set the width to 5, and the height to 40. This makes
a mess.</p>
<p>In this case, the bitmap data is 5 bytes wide, but the data is stored
as 8 bytes per row. This is known as the "stride" or "pitch" of the row.
To tell the visualizer to skip the last 3 bytes on each row, set the
"Row stride (bytes)" field to 8. Now we have a proper Tic-Tac-Toe grid.
Note that it fills the preview window just as the 'X' and 'O' did, even
though it's 5x as large. The preview window scales everything up. Hit
"OK" twice to create the visualization.</p>
<p>Let's format the bitmap data. Select line $101A, then shift-click the
last line in the file ($1159). Actions &gt; Edit Operand. Select
"densely-packed bytes", and click "OK". This is perhaps a little too
dense. Open the operand editor again, but this time select the
densely-packed bytes sub-option "...with a limit", and set the limit
to 8 bytes per line. Instead of one very dense statement spread across
a few lines, you get one line of code per row of bitmap. If you prefer
to see individual bytes, you can use Edit &gt; Settings, select the
Display Format tab, and check "use comma-separated format for bulk data".
This can make it a bit easier to read.</p>
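<p>The width, height, and row stride parameters used for the grid can be
sketched in Python as follows. This is not the code in VisTutorial5.cs; the
function names are hypothetical, and two details are assumptions: a stride
of 0 is treated as "same as the width", and the high bit of each byte is
drawn as the leftmost pixel.</p>
<pre>
```python
def bitmap_rows(data, offset, width, height, stride=0):
    """Return 'height' rows of 'width' bytes each, honoring the row stride."""
    if stride == 0:
        stride = width               # rows are packed back to back
    rows = []
    for y in range(height):
        start = offset + y * stride  # skip any padding after each row
        rows.append(data[start:start + width])
    return rows

def row_to_text(row):
    """Render one row, 8 pixels per byte, high bit first."""
    bits = "".join(format(b, "08b") for b in row)
    return bits.replace("0", ".").replace("1", "#")

data = bytes(range(16))              # placeholder data, not a real bitmap
print(bitmap_rows(data, 0, 1, 2, 8))
print(row_to_text(b"\x81"))
```
</pre>
<p>With a width of 5 and a stride of 8, each row starts 8 bytes after the
previous one and the last 3 bytes are ignored. Forty such rows span 320
bytes, so data starting at $101A ends at $1159.</p>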
<h4>Bitmap Animations</h4>
<p>Some bitmaps represent individual frames in an animated sequence.
You can convert those as well. Double-click on the blue 'X' to open
the visualization set editor, then click "New Bitmap Animation". This
opens the Bitmap Animation Editor.</p>
<p>Let's try it with our Tic-Tac-Toe board pieces. From the list on the
left, select the blue 'X' and click "Add", then click the 'O' and click
"Add". Below the list, set the frame delay to 500. Near the bottom,
click "Start / Stop". This causes the animation to play in a loop. You
can use the controls to add and remove items, change their order, and change
the animation speed. You can add the grid to the animation set, but the
preview scales the bitmaps up to full size, so it may not look the way
you expect.</p>
<p>Hit "OK" to save the animation, then "OK" to update the visualization set.
The code list now shows two entries on the line: the first is the 'X'
bitmap, the second is the animation, which is shown as the initial frame
with a blue triangle superimposed. (If you go back into the editor and
reverse the order of the frames, the list will show the 'O' instead.)
You can have as many bitmaps and animations on a line as you want.</p>
<p>If you have a lot of bitmaps it can be helpful to give them meaningful
names, so that they're easy to identify and sort together in the list.
The "tag" field at the top of the editor windows lets you give things
names. Tags must be unique.</p>
<h4>Other Notes</h4>
<p>The visualization editor is intended to be very dynamic, showing the
results of parameter changes immediately. This can be helpful if you're
not exactly sure what the size or format of a bitmap is. Just keep
tweaking values until it looks right.</p>
<p>Visualization generators are defined by extension scripts. If you're
disassembling a program with a totally custom way of storing graphics,
you can write a totally custom visualizer and distribute it with the
project. Because the file offset is a parameter, you're not limited to
placing visualizations at the start of the graphic data -- you can put
them on any code or data line.</p>
<p>Visualizations have no effect on assembly source code generation,
but they do appear in code exported to HTML. Bitmaps are converted to GIF
images, and animations become animated GIFs.</p>
<p>You can also create animated visualizations of wireframe objects,
but that's not covered in this tutorial.</p>
<hr/>
<h2>End of Tutorials</h2>
<p>That's it for the tutorials. Significantly more detail on
all aspects of SourceGen can be found in the manual.</p>
<p>While you can do some fancy things, nothing you do will alter the
data file. The assembled output will always match the original. So
don't be afraid to play around.</p>
<p>If you want to work on something large over a long period, save your
progress by putting the .dis65 project into a source code control system
like git. Project files are stored in a text format that, while not meant
to be human-readable, should yield reasonable diffs.</p>
</div>
<div id="footer">
<p><a href="index.html">Back to index</a></p>
</div>
</body>
<!-- Copyright 2018 faddenSoft -->
</html>