Revise documentation

2025-04-18 12:40:28 +00:00 · 2018-10-03 18:03:04 -07:00 · 2018-10-03 18:03:04 -07:00 · 8aba1c4fba
commit 8aba1c4fba
parent a8f26a048b
11 changed files with 392 additions and 203 deletions
--- a/SourceGen/RuntimeData/Help/advanced.html
+++ b/SourceGen/RuntimeData/Help/advanced.html
@ -28,8 +28,8 @@ as project symbols into the other projects.</p>
 symbol-import step in every interested project.  This step must be
 repeated whenever the labels are updated.</p>
 <p>A different but related problem is typified by arcade ROM sets,
-where files are split apart because each file must be flashed to a
-separate chip.  All files are expected to be present in memory at
+where files are split apart because each file must be burned into a
+separate PROM.  All files are expected to be present in memory at
 once, so there's no reason to treat them as separate projects. Currently,
 the best way to deal with this is to concatenate the files into a single
 file, and operate on that.</p>
@ -60,7 +60,7 @@ L1103_0  LDA     #$22
 <p>Both sections start at $1100, and have branches to $1103.  The branch
 in the first section resolves to the label in the first version of
 that address chunk, while the branch in the second section resolves to
-the label in the second chunk.  When branches are outside the current
+the label in the second chunk.  When branches originate outside the current
 address chunk, the first chunk that includes that address is used, as
 it is with the <code>JMP $1000</code> at the start of the file.</p>

@ -96,7 +96,7 @@ not help you debug 6502 projects.</p>
    multi-line comment (long comment, note).  Useful for confirming that
    the width limitation is being obeyed.  These are added exactly
    as shown, without comment delimiters, into generated assembly output,
-    which doesn't work out well.</li>
+    which doesn't work out well if you run the assembler.</li>
  <li>Use Keep-Alive Hack.  If set, a "ping" is sent to the extension
    script sandbox every 60 seconds.  This seems to be required to avoid
    an infrequently-encountered Windows bug.  (See code for notes and
--- a/SourceGen/RuntimeData/Help/analysis.html
+++ b/SourceGen/RuntimeData/Help/analysis.html
@ -43,7 +43,7 @@ method in <code>DisasmProject.cs</code>):</p>
    The Anattrib array tracks most of the state from here on.  If we're
    doing a partial re-analysis, this step will just clone a copy of the
    Anattrib array that was made at this point in a previous run.  (This
-    step is described in more detail <a href="code-analysis">below</a>.)</li>
+    step is described in more detail below.)</li>
  <li>Apply user-specified labels to Anattribs.</li>
  <li>Apply user-specified format descriptors.  These are the instruction
    and data operand formats.</li>
@ -51,14 +51,14 @@ method in <code>DisasmProject.cs</code>):</p>
    data, and connects instruction and data operands to target offsets.
    The "nearby label" stuff is handled here.  All of the results are
    stored in the Anattribs array.  (This step is described in more
-    detail <a href="data-analysis">below</a>.)</li>
+    detail below.)</li>
  <li>Remove hidden labels from the symbol table.  These are user-specified
    labels that have been placed on offsets that are in the middle of an
    instruction or multi-byte data item.  They can't be referenced, so we
    want to pull them out of the symbol table.  (Remember, symbolic
    operands use "weak references", so a missing symbol just means the
    operand is shown as a hex value.)</li>
-  <li>Resolve references to platform and project external symbols>
+  <li>Resolve references to platform and project external symbols.
    This sets the operand symbol in Anattrib, and adds the symbol to
    the list that is displayed in .EQ directives.</li>
  <li>Generate cross-reference lists.  This is done for file data and
@ -71,6 +71,103 @@ by walking through the annotated file data.  Most of the actual strings
 aren't rendered until they're needed.</p>


+<h3><a name="auto-format">Automatic Formatting</a></h3>
+
+<p>Every offset in the file is marked as an instruction byte, data byte, or
+inline data byte.  Some offsets are also marked as the start of an instruction
+or data area.  The start offsets may have a format descriptor associated
+with them.</p>
+<p>Format descriptors have a format (like "numeric" or "string"), a
+sub-format (like "hexadecimal" or "null-terminated"), and a length.  For
+an instruction operand the length is redundant, but for a data operand it
+determines the width of the numeric value or length of the string.  For
+this reason, instructions do not need a format descriptor, but all
+data items do.</p>
+<p>Symbolic references are format descriptors with a symbol attached.
+The symbol reference also specifies low/high/bank.</p>
+<p>Every offset marked as a start point gets its own line in the on-screen
+display list.  Embedded instructions are identified internally by
+looking for instruction-start offsets inside instructions.</p>
+
+<p>The Anattrib array holds the post-analysis state for every offset,
+including comments and formatting, but any changes you make in the
+editors are applied to the data structures that are saved in the project
+file.  After a change is made, a full or partial re-analysis is done to
+fill out the Anattribs.</p>
+<p>Consider a simple example:</p>
+<pre>
+         .ORG    $1000
+         JMP     L1003
+L1003    NOP
+</pre>
+
+<p>We haven't formatted anything yet.  The data analyzer sees that the
+JMP operand is inside the file, and has no label, so it creates an
+auto-label at offset +000003 and a format descriptor with a symbolic
+operand reference to "L1003" at +000000.</p>
+<p>Now we edit the label, changing L1003 to "FOO".  This goes into the
+project's "user label" list.  The analyzer is
+run, and applies the new "user label" to the Anattrib array.  The
+data analyzer finds the numeric reference in the JMP operand, and finds
+a label at the target address, so it creates a symbolic operand reference
+to "FOO".  When the display list is generated, the symbol "FOO" appears
+in both places.</p>
+<p>Even though the JMP operand changed from "L1003" to "FOO", the only
+change actually written to the project file is the label edit.  The
+contents of the Anattrib array are disposable, so it can be used to
+add labels and "fix up" numeric references.  Generated labels and
+format descriptors are never added to the project file.</p>
+
+<p>If the JMP operand were edited, a format descriptor would be added
+to the user-specified descriptor list.  During the analysis pass it would
+be added to the Anattrib array at offset +000000.</p>
+
+
+<h3><a name="undo-redo">Interaction With Undo/Redo</a></h3>
+
+<p>The analysis pass always considers the current state of the user
+data structures.  Whether you're adding a label or removing one, the
+code runs through the same set of steps.  The advantage of this approach
+is that doing a thing, undoing a thing, and redoing a thing
+are all handled the same way.</p>
+<p>None of the editors modify the project data structures directly.  All
+changes are added to a change set, which is processed by a single function.
+The change sets are kept in the undo/redo buffer indefinitely.  After
+the changes are made, the Anattrib array and other data structures are
+regenerated.</p>
+
+<p>Data format editing can create some tricky situations.  For example,
+suppose you have 8 bytes that have been formatted as two 32-bit words:</p>
+
+<pre>
+1000: 68690074           .dd4    $74006968
+1004: 65737400           .dd4    $00747365
+</pre>
+
+<p>You realize these are null-terminated strings, select both words, and
+reformat them:</p>
+
+<pre>
+1000: 686900             .zstr   "hi"
+1003: 74657374+          .zstr   "test"
+</pre>
+
+<p>Seems simple enough.  Under the hood, SourceGen created three changes:</p>
+<ol>
+  <li>At offset +000000, replace the current format descriptor (4-byte
+    numeric) with a 3-byte null-terminated string descriptor.</li>
+  <li>At offset +000003, add a new 5-byte null-terminated string
+    descriptor.</li>
+  <li>At offset +000004, remove the 4-byte numeric descriptor.</li>
+</ol>
+
+<p>Each entry in the change set has "before" and "after" states for the
+format descriptor at a specific offset.  Only the state for the affected
+offsets is included -- the program doesn't take a complete state snapshot
+(even with the RAM on a modern system that would add up quickly).  When
+undoing a change, before and after are simply reversed.</p>
+
+
 <h2><a name="code-analysis">Code Analysis</a></h2>

 <p>The code tracer walks through the instructions, examining them to
@ -81,8 +178,9 @@ for every instruction:</p>
    Examples: <code>LDA</code>, <code>STA</code>, <code>AND</code>,
    <code>NOP</code>.
  <li>Don't continue.  The next instruction to be executed can't be
-    determined from the file data, unless you're disassembling the
-    system ROM.  Examples: <code>RTS</code>, <code>BRK</code>.
+    determined from the file data (unless you're disassembling the
+    system ROM around the BRK vector).
+    Examples: <code>RTS</code>, <code>BRK</code>.
  <li>Branch always.  The operand specifies the next instruction address.
    Examples: <code>JMP</code>, <code>BRA</code>, <code>BRL</code>.
  <li>Branch sometimes.  Execution may continue at the operand address,
@ -96,8 +194,8 @@ for every instruction:</p>
 </ol>

 <p>Branch targets are added to a list.  When the current run of instructions
-is exhausted (i.e. a "don't continue" instruction is reached), the next
-target is pulled off of the list.</p>
+is exhausted (i.e. a "don't continue" or "branch always" instruction is
+reached), the next target is pulled off of the list.</p>

 <p>The state of the processor status flags is recorded for every
 instruction.  When execution proceeds to the next instruction or branches
@ -116,18 +214,19 @@ of status flags, the analyzer stops pursuing that path.</p>
 when examining 65816 code, but it's possible for the status flag values
 to be indeterminate.  In such a situation, short registers are assumed.
 Similarly, if the carry flag is unknown when an <code>XCE</code> is
-performed, we assume a transition to emulation mode.</p>
+performed, we assume a transition to emulation mode (E=1).</p>

-<p>There are three ways to set a definite value in a status flags:</p>
+<p>There are three ways in which code can set a flag to a definite value:</p>
 <ol>
-  <li>By specific instructions, like <code>SEC</code> or
+  <li>By explicit instructions, like <code>SEC</code> or
    <code>CLD</code>.</li>
-  <li>By immediate instructions.  <code>LDA #$00</code> sets Z=1 and N=0.
-    <code>ORA #$80</code> sets Z=0 and N=1.</li>
+  <li>By immediate-operand instructions.  <code>LDA #$00</code> sets Z=1
+    and N=0.  <code>ORA #$80</code> sets Z=0 and N=1.</li>
  <li>By inference.  For example, if we see a <code>BCC</code> instruction,
    we know that the carry will be clear at the branch target address, and
    set at the following instruction.  The instruction doesn't affect the
-    value of the flag, but we know what the value is at either address.</li>
+    value of the flag, but we know what the value will be at both
+    addresses.</li>
 </ol>
 <p>Self-modifying code can spoil any of these, possibly requiring a
 status flag override to get correct disassembly.</p>
@ -145,7 +244,7 @@ code does <code>CLC</code>/<code>PHP</code>, followed a bit later by the
 flag around.  Flagging the carry bit as indeterminate with a status flag
 override on the instruction following the PLP fixes things.)</p>

-<p>Some other things that the code analyzer can't handle:</p>
+<p>Some other things that the code analyzer can't recognize automatically:</p>
 <ul>
  <li>Jumping indirectly through an address outside the file, e.g.
    storing an address in zero-page memory and jumping through it.
@ -163,6 +262,26 @@ that it's equal to the program bank register ("K").  Handling this
 correctly will require improvements to the user interface.</p>


+<h3><a name="extension-scripts">Extension Scripts</a></h3>
+
+<p>Extension scripts can mark data that follows a JSR or JSL as inline
+data, or change the format of nearby data or instructions.  The first
+time a JSR/JSL instruction is encountered, all loaded extension scripts
+are offered a chance to act.</p>
+
+<p>The first script that applies a format wins.  Attempts to re-format
+instructions or data will fail.  This rule ensures that anything explicitly
+formatted by the user will not be overridden by a script.</p>
+
+<p>If code jumps into a region that is marked as inline data, the
+branch will be ignored.  If an extension script tries to flag bytes
+as inline data that have already been executed, the script will be
+ignored.  This can lead to a race condition in the analyzer if
+an extension script is doing the wrong thing.  (The race doesn't exist
+with inline data hints specified by the user, because those are applied
+before code analysis starts.)</p>
+
+
 <h2><a name="data-analysis">Data Analysis</a></h2>
 <p>The data analyzer performs two tasks.  It matches operands with
 offsets, and it analyzes uncategorized data.  Either or both of
@ -171,17 +290,17 @@ these can be disabled from the

 <p>The data target analyzer examines every instruction and data operand
 to see if it's referring to an offset within the data file.  If the
-target is within the file, and has a label, a weak symbolic reference
-to that label is added to the Anattrib array.  If the target doesn't
-have a label, the analyzer will either use a nearby label, or generate
-a unique label and use that.</p>
+target is within the file, and has a label, a format descriptor with a
+weak symbolic reference to that label is added to the Anattrib array.  If
+the target doesn't have a label, the analyzer will either use a nearby
+label, or generate a unique label and use that.</p>
 <p>While most of the "nearby label" logic can be disabled, targets that
 land in the middle of an instruction are always adjusted backward to
 the instruction start.  This is necessary because labels are only visible
 if they're associated with the first (opcode) byte of an instruction.</p>

 <p>The uncategorized data analyzer tries to find ASCII strings and
-opportunities to use the ".FILL" instruction.  It breaks the file into
+opportunities to use the ".FILL" operation.  It breaks the file into
 pieces, where contiguous regions hold nothing but data, are not split
 across a ".ORG" directive, are not interrupted by data, and do not
 contain anything that the user has chosen to format.  Each region is
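
The change-set behavior described in the new "Interaction With Undo/Redo" section of analysis.html can be sketched roughly as follows. This is only an illustration in Python, not SourceGen's actual code: the Change class, the apply_changes function, and the (format, length) tuple used as a descriptor are all hypothetical names chosen for the example. It replays the three changes from the .dd4 to .zstr example and then undoes them by swapping "before" and "after".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a format descriptor: (format name, length in bytes).
Descriptor = Tuple[str, int]


@dataclass(frozen=True)
class Change:
    """One change-set entry: descriptor state at a single offset."""
    offset: int
    before: Optional[Descriptor]  # None means "no descriptor at this offset"
    after: Optional[Descriptor]


def apply_changes(formats: Dict[int, Descriptor],
                  changes: List[Change],
                  undo: bool = False) -> None:
    """Apply a change set in place; undo swaps before/after and runs in reverse."""
    ordered = list(reversed(changes)) if undo else changes
    for ch in ordered:
        old, new = (ch.after, ch.before) if undo else (ch.before, ch.after)
        if formats.get(ch.offset) != old:
            raise ValueError(f"unexpected state at +{ch.offset:06x}")
        if new is None:
            formats.pop(ch.offset, None)
        else:
            formats[ch.offset] = new


# Two 32-bit words at +000000 and +000004, as in the example.
formats: Dict[int, Descriptor] = {0: ("numeric", 4), 4: ("numeric", 4)}
change_set = [
    Change(0, ("numeric", 4), ("zstr", 3)),  # replace descriptor at +000000
    Change(3, None, ("zstr", 5)),            # add descriptor at +000003
    Change(4, ("numeric", 4), None),         # remove descriptor at +000004
]

apply_changes(formats, change_set)
after_edit = dict(formats)

apply_changes(formats, change_set, undo=True)
after_undo = dict(formats)
```

Because each entry only records the affected offset, the undo buffer stays small, and redo is just another forward application of the same list.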
--- a/SourceGen/RuntimeData/Help/codegen.html
+++ b/SourceGen/RuntimeData/Help/codegen.html
@ -15,8 +15,8 @@

 <p>SourceGen can generate an assembly source file that, when fed into
 the target assembler, will recreate the original data file exactly.
-Every assembler is different, so code must be written especially for
-each.<p>
+Every assembler is different, so support must be added to SourceGen
+for each.</p>
 <p>The generation / assembly dialog can be opened with File &gt; Assemble.</p>


@ -37,7 +37,7 @@ assembler.  This is most easily understood with an example.</p>
 <code>54 02 01</code>, with the arguments reversed.  cc65 v2.17 doesn't
 do that; this is a bug that was fixed in a later version.  So if you're
 generating code for v2.17, you want to create source code with the
-arguments the other way around.</p>
+arguments the wrong way around.</p>
 <p>Having version-dependent source code is a bad idea, so SourceGen
 just outputs raw hex bytes for MVN/MVP instructions.  This yields the
 correct code for all versions of the assembler, but is ugly and
@ -56,7 +56,7 @@ intermediaries ("file.o") or metadata ("_FileInformation.txt").  Some
 generators may produce multiple source files, perhaps a link script or
 symbol definition header to go with the assembly source.  To avoid
 spreading files across the filesystem, SourceGen does all of its work
-in the same directory where the project lives.  So before you can generate
+in the same directory where the project lives.  Before you can generate
 code, you have to have given your project a name by saving it.</p>

 <p>The Generate and Assemble dialog has a drop-down list near the top
@ -98,12 +98,12 @@ command-line output will be displayed, with stdout and stderr separated.
 provides.)</p>

 <p>The output will show the assembler's exit code, which will be zero
-on success (note: sometimes they lie.)  If it did, SourceGen will then
-compare the assembler's output to the original file, and report any
-differences.</p>
+on success (note: sometimes they lie.)  If it appeared to succeed,
+SourceGen will then compare the assembler's output to the original file,
+and report any differences.</p>
 <p>Failures here may be due to bugs in the cross-assembler or in
 SourceGen.  However, SourceGen can generally work around assembler bugs,
-so any failure here is an opportunity for improvement.</p>
+so any failure is an opportunity for improvement.</p>

 </div>
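
The verification step described in codegen.html (compare the assembler's output to the original file and report differences) amounts to a byte-for-byte comparison. A minimal sketch in Python, assuming both files have already been read into memory; the function name first_difference is made up for this example and is not taken from SourceGen:

```python
from typing import Optional


def first_difference(original: bytes, assembled: bytes) -> Optional[int]:
    """Return the offset of the first mismatched byte, or None if identical.

    If one input is a prefix of the other, the mismatch is reported at the
    length of the shorter input.
    """
    for offset, (a, b) in enumerate(zip(original, assembled)):
        if a != b:
            return offset
    if len(original) != len(assembled):
        return min(len(original), len(assembled))
    return None


# The MVN example: an assembler that swaps the operands produces 54 01 02
# where the original file holds 54 02 01.
mvn_mismatch = first_difference(bytes([0x54, 0x02, 0x01]),
                                bytes([0x54, 0x01, 0x02]))
```

A real implementation would likely report every differing range rather than stopping at the first one, but the first offset is usually enough to locate a code generation problem.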

--- a/SourceGen/RuntimeData/Help/editors.html
+++ b/SourceGen/RuntimeData/Help/editors.html
@ -16,13 +16,13 @@

 <h2><a name="address">Edit Address</a></h2>
 <p>This adds a target address directive (".ORG") to the current offset.
-If you leave the field blank, the directive will be removed.</p>
+If you leave the text field blank, the directive will be removed.</p>
 <p>Addresses are always interpreted as hexadecimal.  You can prefix
-it with a '$', but that's not necessary.</p>
-<p>24-bit addresses may be written with a bank separator, e.g. "12/3456"
+them with a '$', but that's not required.
+24-bit addresses may be written with a bank separator, e.g. "12/3456"
 would resolve to address $123456.</p>

-<p>There will always be an address directive at the start of the list.
+<p>There will always be an address directive at the start of the file.
 Attempts to remove it will be ignored.</p>


@ -34,14 +34,15 @@ that instruction.  You can override the value of individual flags.</p>
 <p>The 65816 emulation bit, which is not part of the processor status
 register, may also be set in the editor.</p>
 <p>The M, X, and E flags will not be editable unless your CPU configuration
-is set to a 16-bit CPU.</p>
+is set to 65816.</p>


 <h2><a name="label">Edit Label</a></h2>
 <p>Sets or clears a label at the selected offset.  The label must have
-the proper form, and not have the same name as another symbol.</p>
+the proper form, and not have the same name as another symbol.  If
+you edit an auto-generated label you will be required to change the name.</p>
 <p>The label may be marked as local, global, or global and exported.
-Local labels may be generated in the assembler output in a
+Local labels may be modified by the assembly code generator to have a more
 convenient form, such as a local loop identifier.  Global labels are
 always output as-is.  Exported labels are added to a table that may
 be imported by other projects.</p>
@ -51,16 +52,17 @@ be imported by other projects.</p>
 <p>Operands can be displayed in a variety of numeric formats, or as a
 symbol.  The ASCII character format is only available for operands
 whose value falls into the range of low- or high-ASCII characters.</p>
-<p>Symbols may be used in their entirety, or offset by a byte or two.
+<p>Symbols may be used in their entirety, or shifted and masked.
 The low / high / bank selector determines which byte is used as the
 low byte.  For 16-bit operands, this acts as a shift rather than a byte
-select.</p>
+select.  If the symbol is wider than the operand field, a mask will be
+applied automatically.</p>

 <p>A few shortcuts are provided when specifying a symbol.  As noted in
 the introductory sections, operand symbols are weak references.  If the
 symbol hasn't been defined as a label yet, the operand will be formatted
 as hex, which is probably not what you want.</p>
-<p>The default behavior is to just set the operand's symbol.</p>
+<p>The default behavior is just to set the operand's symbol.</p>
 <p>For operands that target an offset inside the file, if the target
 address does not yet have a label, and the symbol doesn't exist, you may
 set the symbol as the label on the target address as well.  You can do
@ -84,24 +86,35 @@ future release.)</p>

 <h2><a name="data">Edit Data Format</a></h2>
 <p>This dialog offers a variety of choices, and can be used to apply a
-format to a range of offsets.  If the range crosses a visual boundary,
+format to a range of offsets.  You must select all of the bytes you want
+to format.  For example, to format two bytes as a 16-bit word, you must
+select both bytes in the editor.  (If you click on the first item, then
+Shift+double-click on the operand field of the last item, you can do
+this very quickly.)  The selection does not need to be contiguous: you
+can use Control+click to select scattered items.</p>
+<p>If the range is discontiguous, or crosses a visual boundary
 such as a change in address, a user-specified label, or a long comment
-or note, the region will be split.  The top of the dialog indicates how
-many bytes have been selected, and how many regions they have been
-divided into.</p>
+or note, the selection will be split into smaller regions.  A message at the
+top of the dialog indicates how many bytes have been selected, and how
+many regions they have been divided into.</p>
 <p>(End-of-line comments do <i>not</i> split a region, and will
 disappear if they end up inside a multi-byte data item.)</p>

 <p>The "Simple Data" items behave the same as their equivalents in the
 Edit Operand dialog.  However, because the width is not determined by
-an instruction opcode, you will need to specify how wide each item is,
-and the byte order.</p>
-<p>Suppose you find a table of 16-bit addresses in the code.  Click on
+an instruction opcode, and multiple items can be selected, you will need
+to specify how wide each item is and what its byte order is.  For data
+you also have the option of setting the format to "Address", which marks
+the selected bytes as a numeric reference.</p>
+
+<p>Consider a simple example: suppose you find a table of 16-bit
+addresses in the code.  Click on
 the first byte, shift-click the last byte, then select the Edit Data menu
 item.  The number of bytes selected should be even.  Select
-"16-bit words, little-endian", then to the right "Address".  When you
-click OK, the selected data will be formatted as a series of 16-bit
-address values.</p>
+"16-bit words, little-endian", then over to the right click on
+"Address".  When you click OK, the selected data will be formatted as a
+series of 16-bit address values.  If the addresses can be resolved inside
+the data file, each address will be assigned a label.</p>

 <p>The "Bulk Data" items can represent large chunks of data compactly.
 The "fill" option is only available if all selected bytes have the
@ -161,8 +174,8 @@ want to limit the overall length if you're hoping to create 80-column
 output.  Some retro assemblers may have hard line length limitations,
 which could result in the comment being truncated in generated sources.</p>
 <p>A semicolon (';') is placed at the start of the line.  If an assembler
-has different conventions, a different character may be used.  You don't
-need to include a delimiter in the comment field.</p>
+has different conventions, a different delimiter character may be used.  You
+don't need to include a semicolon in the comment field.</p>

 <p>Comments on platform symbols are read from the platform symbol file, and
 cannot be edited from within SourceGen.  Comments on project symbols are
@ -176,11 +189,11 @@ will be word-wrapped at a line width of your choosing.  They're always
 drawn with a fixed-width font, so you can create ASCII-art diagrams.
 Comment delimiters are added automatically at the start of each line.</p>
 <p>For a true retro look you can "box" the comment with asterisks.  You
-can create a fill-width row of asterisks by putting a '*' on a line by
+can create a full-width row of asterisks by putting a '*' on a line by
 itself.  (Assembly source generators are allowed to use a character
 other than '*' for the output, e.g. they might use a full set of
 box outline characters, though that's somewhat against the spirit of
-the thing.)</p>
+the thing.  Regardless, a solo '*' results in a line.)</p>
 <p>The bottom window will update automatically as you type, showing what
 the output is expected to look like.  The actual assembler source output
 will depend on features of the target assembler, such as comment
@ -226,7 +239,7 @@ the same way when used in a .EQ directive.</p>
 the .EQ directive.</p>
 <p>Symbols marked as "address" will be applied automatically when an
 operand references an address outside the scope of the data file.  Symbols
-marked as "constant" will not, though you can still specify it manually.</p>
+marked as "constant" will not, though you can still specify them manually.</p>

 </div>
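
The Edit Address input rules in editors.html (always hexadecimal, optional '$' prefix, optional bank separator) can be illustrated with a small parser. This is a sketch in Python under those stated rules only; parse_address is a hypothetical name, and the range checks are an assumption about what a 24-bit address field would reasonably accept.

```python
def parse_address(text: str) -> int:
    """Parse an address string as hex, allowing '$' and a 'bank/addr' form."""
    s = text.strip()
    if s.startswith("$"):
        s = s[1:]
    if "/" in s:
        bank_str, _, addr_str = s.partition("/")
        bank = int(bank_str, 16)
        addr = int(addr_str, 16)
        # Assumed limits: 8-bit bank, 16-bit address within the bank.
        if not (0 <= bank <= 0xFF and 0 <= addr <= 0xFFFF):
            raise ValueError(f"address out of range: {text!r}")
        return (bank << 16) | addr
    value = int(s, 16)
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError(f"address out of range: {text!r}")
    return value


banked = parse_address("12/3456")
prefixed = parse_address("$1000")
bare = parse_address("1000")
```

Here "12/3456" yields $123456, matching the example in the help text, and "$1000" and "1000" are treated identically.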

--- a/SourceGen/RuntimeData/Help/end-notes.html
+++ b/SourceGen/RuntimeData/Help/end-notes.html
@ -19,10 +19,10 @@ school in the late 1980s, I read Don Lancaster's
 <i>Enhancing Your Apple II, Vol. 1</i> (available for download
 <a href="https://www.tinaja.com/ebksamp1.shtml">here</a>).  This
 included a very detailed methodology for disassembling 6502 software.
-I decided to give it a try, so I dumped a monitor listing of the
-operating system from an SSI game ("RDOS") to paper with my Epson
-RX-80 -- tractor feed paper was helpful for this sort of thing -- and
-set to work.</p>
+I wanted to give it a try, so I generated a monitor listing of an
+operating system (called "RDOS") that SSI used on their games, and
+printed it out on my Epson RX-80 -- tractor feed paper was helpful for
+this sort of thing -- then set to work.</p>

 <p>Lancaster's methodology involved highlighting different types of
 instructions with different colors, making notes, and adding labels.
@ -44,14 +44,17 @@ like a modern IDE, because I didn't just want it to translate machine code
 into readable form.  I wanted it to help me with the process of
 understanding the code, by providing cross-reference tables and symbol
 lists and giving me a place to scribble notes to myself while I worked.
-Especially the note-scribbling.</p>
+I especially wanted the note-scribbling, because learning how something
+works is usually an iterative process, where the function of a chunk of
+code gradually reveals itself over time.</p>

 <p>In 2002, while writing the 6502/65816 disassembler for CiderPress, I
 ran into the same problems I had with the original Apple II monitor: it
 blundered through data sections and got lost briefly when a new code
-section started.  This made it annoying to use for even small binaries.  I
+section started.  You had to pick long or short registers for the entire
+disassembly, which made 65816 code something of a disaster.  I
 jotted down some notes on what I thought the core features of a good
-6502 diassembler should be, then went back to work on other features.  It
+6502 disassembler should be, then moved on to work on other features.  It
 was another 15 years before I picked up the idea again.</p>

 <p>More recently, I disassembled some code by dumping it to a text
--- a/SourceGen/RuntimeData/Help/index.html
+++ b/SourceGen/RuntimeData/Help/index.html
@ -54,6 +54,7 @@ and 65816 code.  The official web site is
      <li><a href="mainwin.html#info">Info Window</a></li>
      <li><a href="mainwin.html#navigation">Navigation</a></li>
      <li><a href="mainwin.html#hints">Adding and Removing Hints</a></li>
+      <li><a href="mainwin.html#toggle-format">Quick Format Toggle</a></li>
      <li><a href="mainwin.html#clipboard">Copying to Clipboard</a></li>
    </ul>
  </ul>
@ -108,11 +109,6 @@ and 65816 code.  The official web site is
    <li><a href="tools.html#ascii-chart">ASCII Chart</a></li>
  </ul>

-  <li><a href="tutorials.html">Tutorials</a></li>
-  <ul>
-    <li><a href="tutorials.html#basic-features">Basic Features</a></li>
-  </ul>
-
  <li><a href="advanced.html">Advanced Topics</a></li>
  <ul>
    <li><a href="advanced.html#multi-bin">Working With Multiple Binaries</a></li>
@ -123,12 +119,26 @@ and 65816 code.  The official web site is
  <li><a href="analysis.html">Appendix: Instruction and Data Analysis</a></li>
  <ul>
    <li><a href="analysis.html#analysis-process">Analysis Process</a></li>
+    <ul>
+      <li><a href="analysis.html#auto-format">Automatic Formatting</a></li>
+      <li><a href="analysis.html#undo-redo">Interaction With Undo/Redo</a></li>
+    </ul>
    <li><a href="analysis.html#code-analysis">Code Analysis</a></li>
+    <ul>
+      <li><a href="analysis.html#extension-scripts">Extension Scripts</a></li>
+    </ul>
    <li><a href="analysis.html#data-analysis">Data Analysis</a></li>
  </ul>

  <li><a href="end-notes.html">End Notes</a></li>

+  <br/>
+
+  <li><a href="tutorials.html">Tutorials</a></li>
+  <ul>
+    <li><a href="tutorials.html#basic-features">Basic Features</a></li>
+  </ul>
+
 </ul>


--- a/SourceGen/RuntimeData/Help/intro.html
+++ b/SourceGen/RuntimeData/Help/intro.html
@ -28,7 +28,7 @@ navigate the code while trying to figure out what it does.  A
 disassembler should help you understand the code, not just dump the
 instructions to a text file.</p>
 <p>The computer I built in 2014 has a 4GHz CPU and 8GB of RAM.
-We should put that to good use.</p>
+I figured we should put that kind of power to good use.</p>

 <p>The second purpose is to facilitate sharing and collaboration.  Most
 disassemblers generate output for a specific assembler, or in a way that's
@ -49,12 +49,13 @@ capabilities within SourceGen are sufficiently flexible.  If you need to
 generate assembly source and tweak it a bunch to express the intent of
 the original code, then passing a SourceGen project around won't work.
 This sort of thing is a bit outside the bounds of what a typical
-disassembler does, so it remains to be seen whether this succeeds at
-what it's trying to do, and also whether what it's trying to do is actually
-something that people want.</p>
+disassembler does, so it remains to be seen whether SourceGen succeeds at
+what it's trying to do, and also whether what it's trying to do is
+something that people actually want.</p>

-<p>You can get started by watching the demo video and playing with the
-tutorials.</p>
+<p>You can get started by watching the
+<a href="https://youtu.be/dalISyBPQq8">demo video</a> and playing with the
+<a href="tutorials.html">tutorials</a>.</p>


 <h2><a name="fundamental-concepts">Fundamental Concepts</a></h2>
@ -63,7 +64,7 @@ tutorials.</p>
 rest of the documentation assumes you've read and understood this.  It will
 be helpful if you already understand something about the 6502 instruction
 set and assembly-language programming, but disassembling other programs is
-actually a pretty good way to learn assembly.</p>
+actually a pretty good way to learn how to code in assembly.</p>

 <h2><a name="begin">About 6502 Code</a></h2>

@ -71,21 +72,24 @@ actually a pretty good way to learn assembly.</p>
 the 6502 CPU or any of its derivatives, including but not limited to
 the 65C02 and 65816".  So let's talk about 6502 code.</p>

+<p>Code usually arrives in a big binary blob.  Some of it will be
+instructions, some of it will be data, some will be empty space used
+for variable storage.  Part of the challenge of disassembly is
+identifying which parts of the file contain which.</p>
+
 <p>Much of the code you'll find for the 6502 was written by humans,
 rather than generated by a compiler, which means it won't conform to a
-specific set of conventions.  However, most programmers will use
-subroutines, and will often intersperse code with bits of data storage
-for variables.  The variable data storage is referred to as a "stash".
+standard set of conventions.  However, most programmers will use
+subroutines, which can be identified and analyzed in isolation.  Subroutines
+are often interspersed with variable storage, referred to as a "stash".
 Variables may be single-byte or multi-byte, the latter typically
 in little-endian byte order.</p>

-<p>Data that is principally read-only can take many forms.  Among the
-more common forms are graphics and ASCII string data.  The former is
-generally difficult to recognize automatically, but strings can often be
-identified.  Address tables, which are a collection of addresses to
-other things, are also fairly common.  When used as jump tables, they
-might actually refer to the address before the actual instruction, because
-of the way the RTS (Return to Subroutine) instruction works.</p>
+<p>Much of the data in a typical program is read-only, often in the
+form of graphics or ASCII string data.  Graphics can be difficult
+to recognize automatically, but strings can be identified with a
+reasonable degree of confidence.  Address tables, which are a collection
+of addresses to other things, are also fairly common.</p>

 <p>A simple disassembler would start at the top of the file and just
 start converting bytes to instructions.  Unfortunately there's no reliable
@ -127,14 +131,17 @@ by the program bank register and the data bank register, respectively.
 The disassembler can't generally know the contents of the data bank
 register, which makes life a bit more interesting.</p>

-<p>The 6502 has an 8-bit processor status register with a bunch of flags
-in it.  One use of certain flags is to determine whether a
-conditional branch is taken or not.
-Two flags that are only present on the 65816 (M and X) are especially
-interesting, because they determine whether the accumulator and index
-registers are 8 or 16 bits wide.  This determines the width of immediate-mode
-instructions, so if you don't know what's in the processor status
-register it's hard to correctly disassemble the instruction stream.</p>
+<p>The 6502 has an 8-bit processor status register ("P") with a bunch of flags
+in it.  Some of the flags determine whether a conditional branch is taken
+or not, which is important because some branches appear to be conditional
+but actually are always or never taken in practice.  The disassembler needs
+to be able to figure this out so that it doesn't try to disassemble the
+bytes that follow an always-taken branch.
+A more significant concern is the M and X flags found on the 65802/65816,
+which determine the width of the registers and of immediate load
+instructions.  If you don't know what state the flags are in, you can't
+know whether <code>LDA #value</code> is two bytes or three, and the
+disassembly of the instruction stream will come out wrong.</p>
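A minimal sketch of the width problem described above, not SourceGen's actual implementation. The function name and the use of `None` for "unknown" are assumptions made for illustration. It covers only three immediate loads and assumes 65816 native mode, where a flag value of 1 selects 8-bit registers and 0 selects 16-bit registers.

```python
# Immediate-mode loads whose operand width follows the M flag (accumulator)
# or the X flag (index registers).  Only three opcodes are listed for brevity.
ACC_IMMEDIATE = {0xA9}          # LDA #value
INDEX_IMMEDIATE = {0xA2, 0xA0}  # LDX #value, LDY #value


def immediate_length(opcode, m_flag, x_flag):
    """Return the instruction length in bytes (opcode plus operand).

    m_flag and x_flag are 1 (8-bit), 0 (16-bit), or None when the
    state of the flag is not known.  Returns None in the unknown case,
    because the length cannot be determined.
    """
    if opcode in ACC_IMMEDIATE:
        flag = m_flag
    elif opcode in INDEX_IMMEDIATE:
        flag = x_flag
    else:
        raise ValueError("not an immediate load handled by this sketch")
    if flag is None:
        return None
    return 2 if flag == 1 else 3
```

When the result is `None`, every instruction after that point could be decoded starting from the wrong byte, which is why the flag state has to be tracked.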


 <h2><a name="sgintro">How SourceGen Works</a></h2>
@ -145,9 +152,9 @@ only its effect on the flow of execution matters.

 <p>The code tracing has to start somewhere, so SourceGen uses "code entry
 point hints" to identify places where execution may begin.  By default,
-one is placed at the start of the file.  From there, the tracing process
+a hint is placed at the start of the file.  From there, the tracing process
 walks through the code, pursuing all branches.  In many cases, if you
-mark all code entry points, SourceGen will automatically find all
+mark all external entry points, SourceGen will automatically find all
 executable code and separate it from variable storage and data areas.</p>

 <p>As noted earlier, tracking the processor status flags can make the
@ -155,7 +162,7 @@ analysis more accurate.  Identifying situations where a branch instruction
 is always or never taken avoids mis-categorizing a data region as code.
 On the 65816, it's absolutely crucial to track the M/X flags, since those
 affect the width of instructions.  SourceGen tracks the value of the
-processor flags at every instruction, blending sets together when
+processor flags at every instruction, blending sets of flags together when
 multiple paths of execution converge.</p>

 <p>Once instructions and data have been separated, the instruction operands
@ -172,23 +179,16 @@ by an equate directive.</p>
 <h3><a name="scripts">Extension Scripts</a></h3>

 <p>Extension scripts are C# source files that are compiled and
-executed by SourceGen.  They can be added to a project from the RuntimeData
-directory or the directory the project file lives in.</p>
-<p>In v1.0, scripts are only called to examine JSR/JSL instructions.
-They can format nearby bytes as inline data, or apply symbols to
-operands.</p>
-
-<p>If code jumps into a region that is marked as inline data, the
-branch will be ignored.  If an extension script tries to flag bytes
-as inline data that have already been executed, the script will be
-ignored.  This can lead to a race condition in the analyzer if
-an extension script is doing the wrong thing.  (The race doesn't exist
-with inline data hints specified by the user, because those are applied
-before code analysis starts.)</p>
+executed by SourceGen.  They can be added to a project from SourceGen's
+runtime data directory, or can live in the directory next to the project
+file.</p>
+<p>In the current implementation, scripts are only called to examine
+JSR/JSL instructions.  They can format nearby bytes as inline data, or
+apply symbols to operands.</p>

 <p>To reduce the chances of a script causing problems, all scripts are
 executed in a sandbox with severely restricted access.  Notably, nothing
-in the script can access files, except to read those in the PluginDll
+in the sandbox can access files, except to read files from the PluginDll
 directory.</p>
 <p>The PluginDll directory lives next to the SourceGen executable, and
 contains all of the compiled script DLLs, as well as two pre-built
@ -199,10 +199,9 @@ is launched, but may be manually deleted without harm.</p>

 <h3><a name="hints">Analyzer Hints</a></h3>

-<p>Sometimes SourceGen can't automatically find the start or end of a
-code area.  Maybe there's inline data after a JSR that didn't get
-recognized by an extension scripts.  These situations can be resolved
-by adding an appropriate hint.</p>
+<p>Sometimes SourceGen can't automatically find the start or end of an
+instruction stream, or gets confused by inline data.  These situations
+can be resolved by adding an appropriate hint.</p>

 <p><b>Code entry point hints</b> tell the analyzer to add the offset
 to the list of instruction start points.  Suppose you've got a code
@ -247,9 +246,9 @@ end up with this:</p>
 <pre>
         .ORG    $1000
         JMP     L1009
-         JMP &#9193;   L10ef
-         BPL &#9193;   L1053
-         JMP &#9193;   L1230
+         JMP &#9193;  L10ef
+         BPL &#9193;  L1053
+         JMP &#9193;  L1230
         BMI     L101b
 L1009    CLC
 </pre>
@ -276,7 +275,7 @@ would actually be better solved by setting a status flag override on
 the BNE that sets Z=0, so the code tracer will know it's a branch-always
 and do the right thing.)  It's only necessary to place a hint on the
 very first (opcode) byte.  Placing a data hint in the middle of what
-SourceGen believes is an instruction will have no effect.</p>
+SourceGen believes to be an instruction will have no effect.</p>

 <p><b>Inline data hints</b> identify bytes as being part of the
 instruction stream, but not instructions.  A simple example of this
@ -285,11 +284,13 @@ is the ProDOS 8 call interface on the Apple II, which looks like this:</p>
         JSR     $bf00
         .DD1    $function
         .DD2    $address
+         BCS     BAD
 </pre>

-<p>The three bytes following a JSR to $bf00 should be skipped over by
-the code analyzer.  In this case, all three bytes must be hinted.</p>
-<p>If code jumps into a region that is marked as inline data, the
+<p>The three bytes following the <code>JSR $bf00</code> should be hinted
+as inline data, so that the code analyzer skips them and continues the
+analysis at the <code>BCS</code>.</p>
+<p>If code branches into a region that is marked as inline data, the
 branch will be ignored.</p>


@ -303,9 +304,9 @@ of the work being disassembled.  (This will vary by region.  Also, note
 that the mere act of disassembling a piece of software may be illegal in
 some cases.)</p>

-<p>To avoid mix-ups, the data file's length and CRC are stored in the
-project file.  SourceGen will refuse to open a project if the data file's
-length and CRC don't match.</p>
+<p>To avoid mix-ups where the wrong data file is used, the file's length
+and CRC are stored in the project file.  SourceGen will refuse to open a
+project if the data file's length and CRC don't match.</p>

 <p>Most of the data in the project file is associated with a file offset.
 When you create a comment, you aren't associating it with line 53, you're
@ -317,14 +318,20 @@ convention, file offsets are always shown as a six-digit hexadecimal value
 with a leading '+', e.g. "+0012ab".  This makes it easy to distinguish
 between an address and an offset.</p>
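The display convention can be sketched in a few lines. The helper names here are hypothetical, and the four-digit `$` form for addresses is an assumption based on how 16-bit addresses are written elsewhere in this documentation.

```python
def format_offset(offset):
    """Format a file offset as six hex digits with a leading '+'."""
    if not 0 <= offset <= 0xFFFFFF:
        raise ValueError("offset does not fit in six hex digits")
    return "+{:06x}".format(offset)


def format_address(addr):
    """Format a 16-bit address with a leading '$' (assumed convention)."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address does not fit in 16 bits")
    return "${:04x}".format(addr)
```

With these helpers, offset 0x12ab is shown as `+0012ab`, while address 0x12ab is shown as `$12ab`, so the two cannot be confused.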

+<p>Instruction and data operands can be formatted in various ways.  The
+formatting choice is associated with the first offset of the item.  For
+instructions the number of bytes in the operand is determined by the opcode
+(and, on the 65816, the M/X status flags).  For data items the length
+can be a single byte or an entire file.  Operand formats are not allowed
+to overlap.</p>
+
 <p>When an instruction or data operand references an address, we call
 it a <b>numeric reference</b>.  When the target address has a label, and
 the operand uses that symbol, we call that a <b>symbolic reference</b>.
 SourceGen tries to establish symbolic references whenever possible,
 so that the generated assembly source doesn't refer to hard-coded
-locations within the program.</p>
-<p>Data operands can also be numeric references.  From the Edit Data
-dialog, select the "Address" format.</p>
+locations within the program.  Labels are generated automatically for
+the targets of numeric references.</p>

 <p>As your understanding of the disassembled code develops, you will want
 to add comments explaining it.  SourceGen projects have three kinds of
@ -339,32 +346,38 @@ comments:</p>
    are a way for you to leave notes to yourself, perhaps "don't forget
    to figure this out" or "this is the cool part".
 </ol>
-<p>Each offset can have one of each.</p>
+<p>Every file offset can have one of each.</p>

 <p>Labels and comments may disappear if you associate them with a file
 offset that is in the middle of a multi-byte instruction or data item.
 For example, suppose you put a long comment at offset +000010, and then
 mark a 50-byte region starting at offset +000008 as an ASCII string.  The
 comment won't be deleted, but won't be displayed either.  The same thing
-happens to labels.</p>
+can happen to labels.  SourceGen will try to prevent this from happening
+by splitting formatted data into sub-regions at label boundaries.</p>


 <h2><a name="about-symbols">All About Symbols</a></h2>

-<p>A symbol has two parts, a label and a value.  The value may be an
-address or a numeric constant.  Symbols can be defined in different ways,
-and applied in different ways.</p>
+<p>A symbol has two parts, a label and a value.  The label is a short
+ASCII string; the value may be an 8-to-24-bit address or a numeric
+constant.  Symbols can be defined in different ways, and applied in
+different ways.</p>

-<p>The label format is restricted:</p>
+<p>The label syntax is restricted to a format that should be compatible
+with most assemblers:</p>
 <ul>
  <li>2-32 characters long.
  <li>Starts with a letter or underscore.
  <li>Composed of ASCII letters, numbers, and the underscore.
 </ul>
+<p>Label comparisons are case-sensitive, as is customary for programming
+languages.</p>
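The three label rules listed above fit in a single regular expression. This is an illustrative check only; the names `LABEL_RE` and `is_valid_label` are hypothetical and do not come from SourceGen.

```python
import re

# First character: letter or underscore.  Remaining 1 to 31 characters:
# ASCII letters, digits, or underscore.  Total length is therefore 2 to 32.
LABEL_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{1,31}")


def is_valid_label(text):
    """Return True if text satisfies the label rules listed above."""
    return LABEL_RE.fullmatch(text) is not None
```

Because comparisons are case-sensitive, `IS_OK` and `is_ok` both pass this check but would be treated as two different labels.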

-<p><b>Platform symbols</b> are defined in platform symbol files, which
-have a ".sym65" filename extension.  Several come with SourceGen and
-live in the <code>RuntimeData</code> directory.  You can also create your
+<p><b>Platform symbols</b> are defined in platform symbol files.  These
+are named with a ".sym65" extension, and have a fairly straightforward
+name/value syntax.  Several files for popular platforms come with SourceGen
+and live in the <code>RuntimeData</code> directory.  You can also create your
 own, but they have to live in the same directory as the project file.</p>

 <p>Platform symbols can be addresses or constants.  If an instruction
@ -384,7 +397,7 @@ creating two symbols with the same name.  If two symbols have the same
 value, the one whose label comes first alphabetically is used.</p>

 <p>Project symbols always have precedence over platform symbols, allowing
-you to redefine symbols within a project.  (You can "block" a platform
+you to redefine symbols within a project.  (You can "hide" a platform
 symbol by creating a project symbol with the same name and an unused
 value, such as $ffffffff.)</p>

@ -400,8 +413,8 @@ instructions or data offsets that are the target of operands.  They're
 formed by appending the hexadecimal address to the letter "L", with
 additional characters added if some other symbol has already defined
 that label.  Auto labels are only added where they are needed.  Because
-auto labels may be redefined at any time, the editor will try to prevent
-you from using them in operands.</p>
+auto labels may be redefined or disappear, the editor will try to prevent
+you from referring to them when editing operands.</p>

 <p>Operands may use parts of symbols.  For example, if you have a label
 <code>MYSTRING</code>, you can write:</p>
@ -414,7 +427,7 @@ MYSTRING .STR    "hello"
 </pre>

 <p>The format editor allows you to choose which part of the symbol's
-value to use.  If the value doesn't match exactly, and adjustment will
+value to use.  If the value doesn't match exactly, an adjustment will
 be applied.</p>

 <h3><a name="weak-refs">Weak References</a></h3>
@ -451,9 +464,9 @@ results are probably not what you want:</p>
 </pre>

 <p>This happened because you added a weak reference to "FOO" in the operand,
-but the label doesn't exist.  The operand is formatted as hex.  This also
-means that there's no longer a need for an auto label on the NOP instruction,
-so SourceGen removed that as well.</p>
+but the label doesn't exist.  The operand is formatted as hex.  Because
+there's no longer a reference to L1003, SourceGen removed the auto-label
+as well.</p>

 <p>If you set the label "FOO" on the NOP instruction, you'll see what you
 probably wanted:</p>
@ -518,7 +531,9 @@ and jumps to it with the RTS instruction.  However, RTS requires the
 address of the byte before the target instruction, so we actually push
 $1006.</p>
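The off-by-one adjustment can be checked with a short sketch. The function names are hypothetical. It relies on standard 6502 behavior: RTS pulls the low byte, then the high byte, and adds one to the result, so the high byte has to be pushed first.

```python
def rts_push_bytes(target):
    """Return (high, low) bytes to push, in push order, so RTS lands on target."""
    adjusted = (target - 1) & 0xFFFF   # RTS adds 1 to the pulled address
    return adjusted >> 8, adjusted & 0xFF


def rts_destination(high, low):
    """Return the address RTS transfers control to for the pushed bytes."""
    return (((high << 8) | low) + 1) & 0xFFFF
```

For a target of $1007 the pushed bytes are $10 and $06, which matches the $1006 value mentioned above.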

-<p>After adding a code hint at $1007, the project looks like this:</p>
+<p>The disassembler won't know that offset $1007 is code because nothing
+appears to reference it.  After adding a code hint at $1007, the project
+looks like this:</p>
 <pre>
         LDA     #$10
         PHA
--- a/SourceGen/RuntimeData/Help/mainwin.html
+++ b/SourceGen/RuntimeData/Help/mainwin.html
@ -31,7 +31,7 @@ incomplete.  The maximum size for a data file is currently 1 MiB.</p>

 <p>The first time you save the project (with File &gt; Save), you will be
 prompted for the project name.  It's best to use the data file's name
-with ".dis65" added.  This will be configured automatically.  The data
+with ".dis65" added, so this will be set as the default.  The data
 file's name is not stored in the project file, so if you pick a different
 name, or save the project in a different directory, you will have to
 select the data file manually whenever you open the project.</p>
@ -58,7 +58,7 @@ to cancel the loading of the project.</p>
 <p>The locations of the last few projects you've worked with are saved
 in the application settings.  You can access them from
 File &gt; Recent Projects.  If no project is open, links to the two
-most-recently opened projects will be available.</p>
+most-recently-opened projects will be available.</p>


 <h2><a name="working">Working With a Project</a></h2>
@ -70,7 +70,7 @@ most-recently opened projects will be available.</p>
  <li>Top left: cross-reference list.
  <li>Bottom left: notes list.
  <li>Top right: symbols list.
-  <li>Bottom right: line info.
+  <li>Bottom right: info on selected line.
 </ol>

 <p>Most of the action takes place in the center code list.</p>
@ -94,10 +94,12 @@ assembler directive.</p>
    correspond to the instruction or data.  To see the full dump of
    a longer item, such as an ASCII string, double-click on the field
    to open the
-    <a href="tools.html#hexdump">Hex Dump Viewer</a>.  (Note this is
-    a floating window, so you can keep it open while you work.)</li>
+    <a href="tools.html#hexdump">Hex Dump Viewer</a>.  This is
+    a floating window, so you can keep it open while you work.
+    Double-clicking in the bytes column in other rows will update
+    the window position and selection.</li>
  <li><b>Flags</b>.  This shows the state of the status flags as they
-    were before the instruction was executed.  Double-click on this
+    are before the instruction is executed.  Double-click on this
    field to open the
    <a href="editors.html#flags">Edit Status Flag Override</a> dialog.</li>
  <li><b>Attributes</b>.  Some instructions and data items have
@ -115,8 +117,8 @@ assembler directive.</p>
    If an instruction is embedded inside this one, a &#9193; symbol
    will appear.
    If you double-click this field for an instruction or data item
-    whose operand refers to an address in the file, the view will jump to
-    that location.</li>
+    whose operand refers to an address in the file, the selection will
+    jump to that location.</li>
  <li><b>Operand</b>.  The instruction or data operand.  Data operands
    may span a large number of bytes.  Double-click on this field to
    open the
@ -177,7 +179,7 @@ enabled will depend on what you have selected in the main window.</p>
    when a single equate directive, generated from a project symbol, is
    selected.</li>

-  <li><a href="#hinting">Hinting</a> (Hint As Code Entry Point, Hint As
+  <li><a href="#hints">Hinting</a> (Hint As Code Entry Point, Hint As
    Data Start, Hint As Inline Data, Remove Hints).  Enabled when one or more
    code and data lines are selected.  Remove Hints is only enabled when
    at least one line has hints.</li>
@ -187,7 +189,8 @@ enabled will depend on what you have selected in the main window.</p>
  <li>Delete Note / Long Comment.  Deletes the selected note or long
    comment.  Enabled when a single note or long comment is selected.</li>
  <li><a href="tools.html#hexdump">Show Hex Dump</a>.  Opens the hex dump
-    viewer with the current selection highlighted.  Always enabled.</li>
+    viewer, with the current selection highlighted.  Always enabled.  If
+    nothing is selected, the viewer will open at the top of the file.</li>
 </ul>


@ -199,8 +202,8 @@ change with Edit &gt; Redo, Ctrl+Y, or Ctrl+Shift+Z.</p>
 are added to the undo/redo buffer.  This has no fixed size limit, so no
 matter how much you change, you can always undo back to the point where
 the project was opened.</p>
-<p>The undo buffer is not saved as part of the project, so closing and
-reopening the project resets the buffer.</p>
+<p>The undo history is not saved as part of the project.  Closing a project
+clears the buffer.</p>


 <h3><a name="references">References Window</a></h3>
@ -264,7 +267,9 @@ Use Edit &gt; Find Next to find the next match.</p>

 <p>Use Edit &gt; Go To to jump to an offset, address, or label.  Remember
 that offsets and addresses are always hexadecimal, and offsets start
-with a '+'.</p>
+with a '+'.  If you have a label that is also a valid hexadecimal
+address, like "FEED", the label takes precedence.  To jump to the address
+write "$FEED" instead.</p>
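The precedence rule can be illustrated with a small resolver. This is a sketch under stated assumptions, not the code SourceGen uses: the function name, the `labels` dictionary (label to address), and the returned tuples are all hypothetical.

```python
def resolve_goto(text, labels):
    """Classify Go To input as an offset, label, or address.

    labels maps case-sensitive label strings to addresses.
    Raises ValueError if the text is not a label and not valid hex.
    """
    text = text.strip()
    if text.startswith("+"):
        return ("offset", int(text[1:], 16))
    if text in labels:
        # A label wins even if it also parses as hexadecimal.
        return ("label", labels[text])
    if text.startswith("$"):
        text = text[1:]
    return ("address", int(text, 16))
```

With a label named "FEED" defined, entering `FEED` resolves to the label, while `$FEED` is treated as the address.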

 <p>When you jump around, by double-clicking on an opcode or an entry
 in one of the side windows, the currently-selected line is added to
@ -291,6 +296,17 @@ entirely from the
 <a href="settings.html#project-properties">project properties</a> editor.


+<h3><a name="toggle-format">Quick Format Toggle</a></h3>
+
+<p>The "Toggle Single-Byte Format" feature provides a quick way to
+change a range of bytes to single bytes
+or back to their default format.  It's equivalent to opening the Edit
+Data Format dialog and selecting "Single bytes" displayed as hex, or
+selecting "Default".</p>
+<p>This can be handy if the default format for a range of bytes is a
+string, but you want to see it as bytes or set a label in the middle.</p>
+
+
 <h3><a name="clipboard">Copying to Clipboard</a></h3>

 <p>When you use Edit &gt; Copy, all lines selected in the code list are
@ -298,14 +314,16 @@ copied to the system clipboard.  This can be a convenient way to post
 code snippets into forum postings or documentation.  The text is
 copied from the data shown on screen, so your chosen capitalization
 and pseudo-ops will appear in the copy.</p>
-<p>A copy of all of the fields is also written to the clipboard, in
-CSV format.  If you open a program like Excel, you can use Paste Special
-to put the data into individual cells.</p>
+<p>Long comments are included, but notes are not.</p>
+<p>By default, the label, opcode, operand, and comment fields are included.
+From the
+<a href="settings.html#app-settings">app settings</a> dialog you can select
+a different format, "Disassembly", which also includes the address and byte
+columns.</p>

-<p>By default, the label, opcode, operand, and comment fields are included
-in the text form.  From the
-<a href="settings.html#app-settings">app settings</a> you can select
-a different format that also includes the address and byte columns.</p>
+<p>A copy of all of the fields is also written to the clipboard in CSV
+format.  If you have a spreadsheet like Excel, you can use Paste Special
+to put the data into individual cells.</p>

 </div>

--- a/SourceGen/RuntimeData/Help/settings.html
+++ b/SourceGen/RuntimeData/Help/settings.html
@ -21,15 +21,15 @@ project properties.</p>
 <p>Application settings are stored in a file called "SourceGen-settings"
 in the SourceGen installation directory.  If the file is missing or
 corrupted, some default settings will be used.  These settings are local
-to your system, and include everything from window sizes to whether you
-prefer hexadecimal values to be shown in upper case.  None of them
+to your system, and include everything from window sizes to whether or not
+you prefer hexadecimal values to be shown in upper case.  None of them
 affect the way the project analyzes code and data, though they may affect
 the way generated assembly sources look.</p>

 <p>Project properties are stored in each individual .dis65 project file.
 They specify which CPU to use, which extension scripts to load, and a
 variety of other things that directly impact how SourceGen processes
-the project.  Because of the way it impacts the project, all changes to
+the project.  Because of the potential impact, all changes to
 the project properties are made through the undo/redo buffer.</p>


@ -50,7 +50,7 @@ hide columns from the code list.  The buttons may be more convenient
 though.</p>

 <p>You can select a different font for the code list.  Make it as large
-or small as you want.  Monospace fonts like Courier or Consolas are
+or small as you want.  Mono-space fonts like Courier or Consolas are
 recommended.</p>

 <p>You can choose to display different parts of the display in upper or
@ -147,8 +147,8 @@ you later hit Cancel, but the changes are not applied immediately.</p>

 <p>The choice of CPU determines the set of available instructions, as
 well as cycle costs and register widths.  There are many variations
-on the 6502, but from the perspective of a disassembler only three
-matter:
+on the 6502, but from the perspective of a disassembler most can be
+treated as one of these three:
 <ol>
  <li>MOS 6502.  The original 8-bit instruction set.</li>
  <li>WDC W65C02S.  Expanded the instruction set and smoothed
@ -156,9 +156,9 @@ matter:
  <li>WDC W65C816S.  Expanded instruction set, 24-bit address space,
    and 16-bit registers.</li>
 </ol>
-<p>The Rockwell R65C02 features an expanded instruction set that is
-compatible with the WDC 65C02 but incompatible with the 65816.  It's
-not currently supported by SourceGen.</p>
+<p>The Rockwell R65C02, Hudson Soft HuC6280, and Commodore CSG 4510 / 65CE02
+have instruction sets that expand on the 6502/65C02, but aren't compatible
+with the 65816.  These are not yet supported by SourceGen.</p>

 <p>If "enable undocumented instructions" is checked, some additional
 opcodes are recognized on the 6502 and 65C02.  These instructions are
@ -198,14 +198,18 @@ create two symbols with the same label.</p>
 <p>The Import button allows you to import symbols from another project.
 Only labels that have been tagged as global and exported will be imported.
 Existing symbols with identical labels will be replaced, so it's okay to
-run the importer multiple times.</p>
+run the importer multiple times.  Labels that aren't found will not be
+removed, so you can safely import from multiple projects, but will need
+to manually delete any symbols that are no longer being exported.</p>


 <h3><a name="projprop-symfiles">Symbol Files</a></h3>
 <p>From here, you can add and remove platform symbol files, or change
 the order in which they are loaded.
 See the <a href="intro.html#about-symbols">symbols</a> section for an
-explanation of how platform symbols work.</p>
+explanation of how platform symbols work.
+See "README.md" in the RuntimeData directory for a description of the
+file syntax.</p>

 <p>Platform symbol files must live in the RuntimeData directory that comes
 with SourceGen, or in the directory where the project file lives.  This
@ -222,7 +226,9 @@ you will receive a warning.</p>
 <h3><a name="projprop-extscripts">Extension Scripts</a></h3>
 <p>From here, you can add and remove extension script files.
 See the <a href="intro.html#scripts">extension scripts</a> section for
-an explanation of how extension scripts work.</p>
+an overview of how extension scripts work.
+There's a more detailed document in the RuntimeData directory
+("ExtensionScripts.md").</p>


 <p>Extension script files must live in the RuntimeData directory that comes
--- a/SourceGen/RuntimeData/Help/tools.html
+++ b/SourceGen/RuntimeData/Help/tools.html
@ -46,7 +46,7 @@ pasting in some situations.</p>

 <p>If "always on top" is checked, the window will stay above all other
 windows that don't also declare that they should always be on top.  By
-default this box is checked for the project dump, and not checked for
+default this box is checked when displaying project data, and not checked for
 external files.</p>


--- a/SourceGen/RuntimeData/Help/tutorials.html
+++ b/SourceGen/RuntimeData/Help/tutorials.html
@ -70,15 +70,18 @@ these distracting, collapse the column.</p>
 <p>Click on the fourth line down, which has address 1002.  The line has
 a label, "L1002", and is performing an indexed load from L1017.  Both
 of these labels were automatically generated, and are named for the
-address they appear.  When you clicked on the line, a few things happened:</p>
+address at which they appear.  When you clicked on the line, a few
+things happened:</p>
 <ul>
-  <li>The line was highlighted in the system selection color.</li>
+  <li>The line was highlighted in the system selection color (usually
+    blue).</li>
  <li>Address 1017 and label L1017 were highlighted.  When a line
-    with an in-file operand is selected, the target address is higlighted.</li>
-  <li>An entry appeared in the References window.  This notes that the only
-    reference to L1002 is a branch from address $100B.</li>
+    with an in-file operand is selected, the target address is
+    highlighted.</li>
+  <li>An entry appeared in the References window.  This tells you that the
+    only reference to L1002 is a branch from address $100B.</li>
  <li>The Info window filled with a bunch of text that describes the
-    line and the LDA instruction.</li>
+    line format and some details about the LDA instruction.</li>
 </ul>

 <p>Click some other lines, such as address $100B and $1014.  Note how the
@ -91,17 +94,17 @@ the operand itself opens a format editor; more on that later.)</p>
 References window.  Note the selection jumps to L1002.  You can immediately
 jump to any reference.</p>
 <p>At the top of the Symbols window on the right side of the screen is a
-row of buttons.  Make sure "Auto" is highlighted.  You should see three
+row of buttons.  Make sure "Auto" is selected.  You should see three
 labels in the window (L1002, L1014, L1017).  Double-click on L1014.  The
 selection jumps to the appropriate line.</p>

 <p>Select Edit &gt; Find.  Type "hello", and hit Enter.  The selection will
 move to address $100E, which is a string that says "hello!".  You can use
 Edit &gt; Find Next to try to find the next occurrence (there isn't one).  You
-can search for text that appears in the rightmost columns (label, opcode,
+can search for any text that appears in the rightmost columns (label, opcode,
 operand, comment).</p>
 <p>Select Edit &gt; Go To.  You can enter a label, address, or file offset.
-Enter "100b" to jump the selection to $100B.</p>
+Enter "100b" to set the selection to $100B.</p>

 <p>Near the top-left of the SourceGen window is a set of toolbar icons.
 Click the left-arrow, and watch the selection move.  Click it again.  Then
@ -118,21 +121,21 @@ something like "6502bench SourceGen vX.Y.Z".  There are three ways to
 open the comment editor:</p>
 <ol>
  <li>Select Actions &gt; Edit Long Comment from the menu bar.</li>
-  <li>Right click, and select Actions &gt; Edit Long Comment from the
-    pop-up menu.  (The menus area exactly the same.)</li>
+  <li>Right click, and select Edit Long Comment from the
+    pop-up menu.  (This menu is exactly the same as the Actions menu.)</li>
  <li>Double-click the comment.</li>
 </ol>
 <p>Most things in the code list will respond to a double-click.
 Double-clicking on addresses, flags, labels, operands, and comments will
 open editors for those things.  Double-clicking on a value in the "bytes"
 column will open a floating hex dump viewer.  This is usually the most
-convenient way to edit something.</p>
+convenient way to edit something: point and click.</p>
 <p>Double-click the comment to open the editor.  Type some words into the
 upper window, and note that a formatted version appears in the bottom
 window.  Experiment with the maximum line width and "render in box"
 settings to see what they do.  You can hit Enter to create line breaks,
 or let SourceGen wrap lines for you.  When you're done, click OK.  (Or
-hit Ctrl-Enter.</p>
+hit Ctrl+Enter.)</p>
 <p>When the dialog closes, you'll see your new comment in place at the
 top of the file.  If you typed enough words, your comment will span
 multiple lines.  You can select the comment by selecting any line in it.</p>
@ -151,15 +154,17 @@ differences:</p>
 <ol>
  <li>You can't pick their line width, but you can pick their color.</li>
  <li>They don't appear in generated assembly sources, making them
-    useful for leaving notes to yourself.</li>
+    useful for leaving notes to yourself as you work.</li>
  <li>They're listed in the Notes window.  Double-clicking them jumps
    the selection to the note, making them useful as bookmarks.</li>
 </ol>

-<p>It's time to do something with the code.  It's copying the instructions
-from $1017 to $2000, then jumping to $2000, so it looks like it's
-relocating the code before executing it.  We want to do the same thing
-to our disassembled code, so select the line at address $1017 and then
+<p>It's time to do something with the code.  If you look at what the code
+does, you'll see that it's copying several dozen bytes from $1017
+to $2000, then jumping to $2000.  It appears to be relocating the next
+part of the code before
+executing it.  We want to let the disassembler know what's going on, so
+select the line at address $1017, then select
 Edit &gt; Edit Address.  (Or double-click the "1017" in the addr column.)
 In the Edit Address dialog, type "2000", and hit Enter.</p>
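
The effect of that address change can be pictured as a small lookup table from file offsets to addresses. The following is only an illustrative sketch, not SourceGen's implementation: the names `ADDRESS_MAP` and `offset_to_address` are hypothetical, and it assumes the file loads at $1000, so that the line at $1017 sits at file offset $17.

```python
from bisect import bisect_right

# Hypothetical address map: (file offset, address) pairs, sorted by offset.
# Assumption: the file loads at $1000, so address $1017 is file offset $17.
ADDRESS_MAP = [(0x0000, 0x1000), (0x0017, 0x2000)]


def offset_to_address(offset, address_map=ADDRESS_MAP):
    """Return the address assigned to a file offset under the given map."""
    starts = [start for start, _ in address_map]
    # Find the last map entry whose starting offset is <= offset.
    index = bisect_right(starts, offset) - 1
    if index < 0:
        raise ValueError("offset precedes the first address map entry")
    start, base = address_map[index]
    return base + (offset - start)
```

Under these assumptions, every byte from offset $17 onward is treated as if it lived at $2000 and up, which is why the relocated code gets labels like "L2000" once the address is changed.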

@ -178,8 +183,8 @@ so you'll be forgiven if you reduce the offset column width to zero.)</p>
 <p>On the line at address $2000, select Actions &gt; Edit Label, or
 double-click on the label "L2000".  Change the label to "MAIN", and hit
 Enter.  The label changes on that line, and on the two lines that refer
-to address $2000.  (If you're not sure what refers to line $2000, check
-the References window.)</p>
+to address $2000.  (If you're not sure what refers to line $2000, select
+it and check the References window.)</p>
 <p>On that same line, select Actions &gt; Edit Comment.  Type a short
 comment, and hit Enter.  Your comment appears in the "comment" column.</p>

@ -215,12 +220,12 @@ Actions &gt; Edit Label.  Enter "IS_OK", and hit Enter.  (NOTE: labels are
 case-sensitive, so it needs to match the operand at $2005 exactly.)  You'll
 see the new label appear, and the operand at line $2005 will use it.</p>
 <p>There's an easier way.  Use Edit &gt; Undo twice, to get back to the place
-where line $2005 is using "L2009" as it's operand.  Select that line and
+where line $2005 is using "L2009" as its operand.  Select that line and
 Actions &gt; Edit Operand.  Enter "IS_OK", then select "Create label at target
 address instead".  Hit "OK".</p>
 <p>You should now see that both the operand at $2005 and the label at
 $2009 have changed to IS_OK, accomplishing what we wanted to do in a
-single step.  (There's actually a sutble difference compared to the two-step
+single step.  (There's actually a subtle difference compared to the two-step
 process: the operand at $2005 is still a numeric reference.  It was
 automatically changed to match IS_OK in the same way that the references
 to MAIN were when we renamed "L2000" earlier.  If you actually do want the
@ -248,7 +253,7 @@ label to "STR1".  Move up a bit and select address $2030, then scroll to
 the bottom and shift-click address $2070.  Select Actions &gt; Edit Data
 Format.  At the top it should now say, "65 bytes selected in 2 groups".
 There are two groups because the presence of a label split the data into
-two separate regions.  Selected "mixed ASCII and non-ASCII", then click
+two separate regions.  Select "mixed ASCII and non-ASCII", then click
 "OK".</p>
 <p>We now have two ".STR" lines: one for "string zero  ", and one with the
 STR1 label and the rest of the string data.  This is okay, but it's not
@ -260,8 +265,8 @@ a single ".STR" line at the bottom, split across two lines with a '+'.</p>
 but that appears to be incorrect, so let's format it as individual bytes
 instead.  There's an easy way to do that: use Actions &gt; Toggle Single-Byte
 Format (or hit Ctrl+B).</p>
-<p>The data starting at $2025 appears to be 16-bit addresses into the
-table of strings, so let's format them appropriately.</p>
+<p>The data starting at $2025 appears to be 16-bit addresses that point
+into the table of strings, so let's format them appropriately.</p>
 <p>Select the line at $2025, then shift-click the line at $202E.  Select
 Actions &gt; Edit Data Format.  If you selected the correct set of bytes,
 the top should say, "10 bytes selected".  Click the
@ -277,7 +282,7 @@ on their own line, so each string is now in a separate ".STR" statement.

 <h3>Generating Assembly Code</h3>

-<p>You can generate asssembly source code from the disassembled data.
+<p>You can generate assembly source code from the disassembled data.
 Select File &gt; Assembler (or hit Ctrl+Shift+A) to open the generation
 and assembly dialog.</p>
 <p>Pick your favorite assembler from the drop list at the top right,