mirror of https://github.com/fadden/6502bench.git, synced 2024-11-18 15:06:07 +00:00
Various doc fixes
parent 330b4a238a
commit 4aee3af089
@ -13,7 +13,11 @@
|
|||||||
<h1>6502bench SourceGen: Instruction and Data Analysis</h1>
|
<h1>6502bench SourceGen: Instruction and Data Analysis</h1>
|
||||||
<p><a href="index.html">Back to index</a></p>
|
<p><a href="index.html">Back to index</a></p>
|
||||||
|
|
||||||
|
<p><i>This section discusses the internal workings of SourceGen. It is
|
||||||
|
not necessary to understand this to use the program.</i></p>
|
||||||
|
|
||||||
<h2><a name="analysis-process">Analysis Process</a></h2>
|
<h2><a name="analysis-process">Analysis Process</a></h2>
|
||||||
|
|
||||||
<p>Analysis of the file data is a complex multi-step process. Some
|
<p>Analysis of the file data is a complex multi-step process. Some
|
||||||
changes to the project, such as adding a code entry point hint or
|
changes to the project, such as adding a code entry point hint or
|
||||||
changing the CPU selection, require a full re-analysis of instructions
|
changing the CPU selection, require a full re-analysis of instructions
|
||||||
@ -42,16 +46,16 @@ method in <code>DisasmProject.cs</code>):</p>
|
|||||||
attributes, or "anattribs", with one entry per byte in the file.
|
attributes, or "anattribs", with one entry per byte in the file.
|
||||||
The Anattrib array tracks most of the state from here on. If we're
|
The Anattrib array tracks most of the state from here on. If we're
|
||||||
doing a partial re-analysis, this step will just clone a copy of the
|
doing a partial re-analysis, this step will just clone a copy of the
|
||||||
Anattrib array that was made at this point in a previous run. (This
|
Anattrib array that was made at this point in a previous run. (The
|
||||||
step is described in more detail below.)</li>
|
code analysis pass is described in more detail below.)</li>
|
||||||
<li>Apply user-specified labels to Anattribs.</li>
|
<li>Apply user-specified labels to Anattribs.</li>
|
||||||
<li>Apply user-specified format descriptors. These are the instruction
|
<li>Apply user-specified format descriptors. These are the instruction
|
||||||
and data operand formats.</li>
|
and data operand formats.</li>
|
||||||
<li>Run the data analyzer. This looks for patterns in uncategorized
|
<li>Run the data analyzer. This looks for patterns in uncategorized
|
||||||
data, and connects instruction and data operands to target offsets.
|
data, and connects instruction and data operands to target offsets.
|
||||||
The "nearby label" stuff is handled here. All of the results are
|
The "nearby label" stuff is handled here. All of the results are
|
||||||
stored in the Anattribs array. (This step is described in more
|
stored in the Anattribs array. (The data analysis pass is described in
|
||||||
detail below.)</li>
|
more detail below.)</li>
|
||||||
<li>Remove hidden labels from the symbol table. These are user-specified
|
<li>Remove hidden labels from the symbol table. These are user-specified
|
||||||
labels that have been placed on offsets that are in the middle of an
|
labels that have been placed on offsets that are in the middle of an
|
||||||
instruction or multi-byte data item. They can't be referenced, so we
|
instruction or multi-byte data item. They can't be referenced, so we
|
||||||
@ -105,12 +109,12 @@ fill out the Anattribs.</p>
|
|||||||
L1003 NOP
|
L1003 NOP
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
<p>We haven't formatted anything yet. The data analyzer sees that the
|
<p>We haven't explicitly formatted anything yet. The data analyzer sees
|
||||||
JMP operand is inside the file, and has no label, so it creates an
|
that the JMP operand is inside the file, and has no label, so it creates an
|
||||||
auto-label at offset +000003 and a format descriptor with a symbolic
|
auto-label at offset +000003 and a format descriptor with a symbolic
|
||||||
operand reference to "L1003" at +000000.</p>
|
operand reference to "L1003" at +000000.</p>
|
||||||
<p>Now we edit the label, changing L1003 to "FOO". This goes into the
|
<p>Suppose we now edit the label, changing L1003 to "FOO". This goes into
|
||||||
project's "user label" list. The analyzer is
|
the project's "user label" list. The analyzer is
|
||||||
run, and applies the new "user label" to the Anattrib array. The
|
run, and applies the new "user label" to the Anattrib array. The
|
||||||
data analyzer finds the numeric reference in the JMP operand, and finds
|
data analyzer finds the numeric reference in the JMP operand, and finds
|
||||||
a label at the target address, so it creates a symbolic operand reference
|
a label at the target address, so it creates a symbolic operand reference
|
||||||
@ -119,8 +123,9 @@ in both places.</p>
|
|||||||
<p>Even though the JMP operand changed from "L1003" to "FOO", the only
|
<p>Even though the JMP operand changed from "L1003" to "FOO", the only
|
||||||
change actually written to the project file is the label edit. The
|
change actually written to the project file is the label edit. The
|
||||||
contents of the Anattrib array are disposable, so it can be used to
|
contents of the Anattrib array are disposable, so it can be used to
|
||||||
add labels and "fix up" numeric references. Generated labels and
|
hold auto-generated labels and "fix up" numeric references. Labels and
|
||||||
format descriptors are never added to the project file.</p>
|
format descriptors generated by SourceGen are never added to the
|
||||||
|
project file.</p>
|
||||||
|
|
||||||
<p>If the JMP operand were edited, a format descriptor would be added
|
<p>If the JMP operand were edited, a format descriptor would be added
|
||||||
to the user-specified descriptor list. During the analysis pass it would
|
to the user-specified descriptor list. During the analysis pass it would
|
||||||
@ -167,9 +172,10 @@ reformat them:</p>
|
|||||||
|
|
||||||
<p>Each entry in the change set has "before" and "after" states for the
|
<p>Each entry in the change set has "before" and "after" states for the
|
||||||
format descriptor at a specific offset. Only the state for the affected
|
format descriptor at a specific offset. Only the state for the affected
|
||||||
offsets is included -- the program doesn't take a complete state snapshot
|
offsets is included -- the program doesn't record the state of the full
|
||||||
(even with the RAM on a modern system that would add up quickly). When
|
project after each change (even with the RAM on a modern system that would
|
||||||
undoing a change, before and after are simply reversed.</p>
|
add up quickly). When undoing a change, before and after are simply
|
||||||
|
reversed.</p>
|
||||||
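<p>The before/after idea can be sketched in a few lines of Python. This is
only an illustration, not SourceGen's actual code (which is written in C#):
the names <code>FormatChange</code>, <code>apply_change_set</code>, and
<code>undo_change_set</code> are hypothetical, and format descriptors are
reduced to plain strings keyed by file offset.</p>

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FormatChange:
    """One change-set entry (hypothetical): a descriptor's state at one offset."""
    offset: int
    before: Optional[str]  # None means "no descriptor at this offset"
    after: Optional[str]

    def inverted(self) -> "FormatChange":
        # Undo is the same change with before and after swapped.
        return FormatChange(self.offset, self.after, self.before)


def apply_change_set(formats: Dict[int, str], change_set: List[FormatChange]) -> None:
    """Move each affected offset to its 'after' state; other offsets are untouched."""
    for change in change_set:
        if change.after is None:
            formats.pop(change.offset, None)
        else:
            formats[change.offset] = change.after


def undo_change_set(formats: Dict[int, str], change_set: List[FormatChange]) -> None:
    """Undo by applying the inverted entries in reverse order."""
    apply_change_set(formats, [c.inverted() for c in reversed(change_set)])
```

<p>Because each entry carries only the state of the offsets it touches,
the memory cost of an undo history in this sketch grows with the number of
edits rather than with the size of the project.</p>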
|
|
||||||
|
|
||||||
<h2><a name="code-analysis">Code Analysis</a></h2>
|
<h2><a name="code-analysis">Code Analysis</a></h2>
|
||||||
@ -222,9 +228,9 @@ performed, we assume a transition to emulation mode (E=1).</p>
|
|||||||
|
|
||||||
<p>There are three ways in which code can set a flag to a definite value:</p>
|
<p>There are three ways in which code can set a flag to a definite value:</p>
|
||||||
<ol>
|
<ol>
|
||||||
<li>By explicit instructions, like <code>SEC</code> or
|
<li>With explicit instructions, like <code>SEC</code> or
|
||||||
<code>CLD</code>.</li>
|
<code>CLD</code>.</li>
|
||||||
<li>By immediate-operand instructions. <code>LDA #$00</code> sets Z=1
|
<li>With immediate-operand instructions. <code>LDA #$00</code> sets Z=1
|
||||||
and N=0. <code>ORA #$80</code> sets Z=0 and N=1.</li>
|
and N=0. <code>ORA #$80</code> sets Z=0 and N=1.</li>
|
||||||
<li>By inference. For example, if we see a <code>BCC</code> instruction,
|
<li>By inference. For example, if we see a <code>BCC</code> instruction,
|
||||||
we know that the carry will be clear at the branch target address, and
|
we know that the carry will be clear at the branch target address, and
|
||||||
@ -274,8 +280,9 @@ time a JSR/JSL instruction is encountered, all loaded extension scripts
|
|||||||
are offered a chance to act.</p>
|
are offered a chance to act.</p>
|
||||||
|
|
||||||
<p>The first script that applies a format wins. Attempts to re-format
|
<p>The first script that applies a format wins. Attempts to re-format
|
||||||
instructions or data will fail. This rule ensure that anything explicitly
|
instructions or data that has already been formatted will fail. This rule
|
||||||
formatted by the user will not be overridden by a script.</p>
|
ensures that anything explicitly formatted by the user will not be
|
||||||
|
overridden by a script.</p>
|
||||||
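<p>A minimal sketch of the "first format wins" rule, in Python. It assumes
(this is not confirmed by the text above) that user-specified formats are
already present in the table before any script runs, and it models the
table as a dictionary from offset to descriptor. The function name
<code>try_apply_format</code> is hypothetical.</p>

```python
from typing import Dict


def try_apply_format(formats: Dict[int, str], offset: int, length: int,
                     descriptor: str) -> bool:
    """Format `length` bytes at `offset` unless any of them is already formatted.

    Returns True on success. Returns False, leaving the table unchanged,
    if any byte in the span already has a descriptor.
    """
    span = range(offset, offset + length)
    if any(o in formats for o in span):
        return False  # an earlier format (user or script) wins
    for o in span:
        formats[o] = descriptor
    return True
```

<p>With this rule the outcome depends only on the order in which formats
are applied: whichever caller reaches an unformatted span first keeps it,
and later attempts on the same bytes are rejected.</p>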
|
|
||||||
<p>If code jumps into a region that is marked as inline data, the
|
<p>If code jumps into a region that is marked as inline data, the
|
||||||
branch will be ignored. If an extension script tries to flag bytes
|
branch will be ignored. If an extension script tries to flag bytes
|
||||||
|
@ -82,8 +82,8 @@ rather than generated by a compiler, which means it won't conform to a
|
|||||||
standard set of conventions. However, most programmers will use
|
standard set of conventions. However, most programmers will use
|
||||||
subroutines, which can be identified and analyzed in isolation. Subroutines
|
subroutines, which can be identified and analyzed in isolation. Subroutines
|
||||||
are often interspersed with variable storage, referred to as a "stash".
|
are often interspersed with variable storage, referred to as a "stash".
|
||||||
Variables may be single-byte or multi-byte, the latter typically
|
Variables and constants may be single-byte or multi-byte, the latter
|
||||||
in little-endian byte order.</p>
|
typically in little-endian byte order.</p>
|
||||||
|
|
||||||
<p>Much of the data in a typical program is read-only, often in the
|
<p>Much of the data in a typical program is read-only, often in the
|
||||||
form of graphics or ASCII string data. Graphics can be difficult
|
form of graphics or ASCII string data. Graphics can be difficult
|
||||||
@ -100,14 +100,15 @@ data ends and code resumes: 6502 instructions are variable-length, so if
|
|||||||
the last byte of the data area appears to be a three-byte instruction,
|
the last byte of the data area appears to be a three-byte instruction,
|
||||||
the first two bytes of the next instruction area will be gobbled up.</p>
|
the first two bytes of the next instruction area will be gobbled up.</p>
|
||||||
|
|
||||||
<p>Some programmers will use a trick where they "embed" an instruction
|
<p>To make things even more difficult (sometimes deliberately), programmers
|
||||||
|
will sometimes use a trick where they "embed" an instruction
|
||||||
inside another instruction. This allows code to branch to two different
|
inside another instruction. This allows code to branch to two different
|
||||||
entry points, one of which will set a flag or load a register, and then
|
entry points, one of which will set a flag or load a register, and then
|
||||||
continue on to common code.</p>
|
continue on to common code.</p>
|
||||||
|
|
||||||
<p>Another trick is to embed "inline data" after a JSR or JSL instruction.
|
<p>Another trick is to embed "inline data" after a JSR or JSL instruction.
|
||||||
The caller pulls the calling address off the stack, uses it to access
|
The called subroutine pulls the caller's address off the stack, uses it to
|
||||||
the parameters, then pushes the address back on after modifying it to
|
access the parameters, then pushes the address back on after modifying it to
|
||||||
point to an address past the inline data. This can be very confusing
|
point to an address past the inline data. This can be very confusing
|
||||||
for the disassembler, which will try to interpret the inline data as
|
for the disassembler, which will try to interpret the inline data as
|
||||||
instructions.</p>
|
instructions.</p>
|
||||||
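<p>The return-address manipulation described above can be simulated in
Python. This is a simplified model, not code from SourceGen: the stack is
a Python list, and the helper names <code>jsr</code>, <code>rts</code>, and
<code>inline_param_subroutine</code> are hypothetical. It relies on the
6502 convention that JSR pushes the address of its own last byte and RTS
adds one to the address it pulls.</p>

```python
def jsr(stack: list, jsr_addr: int) -> None:
    """Model JSR: push the address of the last byte of the 3-byte instruction."""
    ret = (jsr_addr + 2) & 0xFFFF
    stack.append(ret >> 8)     # high byte is pushed first
    stack.append(ret & 0xFF)   # then the low byte


def rts(stack: list) -> int:
    """Model RTS: pull the address, add one, and resume execution there."""
    lo = stack.pop()
    hi = stack.pop()
    return (((hi << 8) | lo) + 1) & 0xFFFF


def inline_param_subroutine(memory: bytearray, stack: list, param_len: int) -> bytes:
    """Model a callee that reads `param_len` inline bytes after the caller's JSR."""
    lo = stack.pop()
    hi = stack.pop()
    ret = (hi << 8) | lo
    # The inline data starts one byte past the pulled address.
    params = bytes(memory[ret + 1:ret + 1 + param_len])
    # Push the address back, adjusted so that RTS lands past the inline data.
    ret = (ret + param_len) & 0xFFFF
    stack.append(ret >> 8)
    stack.append(ret & 0xFF)
    return params
```

<p>For a JSR at $1000 followed by two inline bytes, the callee in this
model reads the bytes at $1003-$1004 and execution resumes at $1005. A
disassembler that follows the JSR without knowing about this convention
would instead try to decode $1003 as an instruction.</p>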
@ -293,7 +294,7 @@ is the ProDOS 8 call interface on the Apple II, which looks like this:</p>
|
|||||||
|
|
||||||
<p>The three bytes following the <code>JSR $bf00</code> should be hinted
|
<p>The three bytes following the <code>JSR $bf00</code> should be hinted
|
||||||
as inline data, so that the code analyzer skips them and continues the
|
as inline data, so that the code analyzer skips them and continues the
|
||||||
analysis at the <code>BCS</code>. Because you need to hint *every* byte
|
analysis at the <code>BCS</code>. Because you need to hint <i>every</i> byte
|
||||||
of inline data, all bytes in a selected line will receive hints.</p>
|
of inline data, all bytes in a selected line will receive hints.</p>
|
||||||
<p>If code branches into a region that is marked as inline data, the
|
<p>If code branches into a region that is marked as inline data, the
|
||||||
branch will be ignored.</p>
|
branch will be ignored.</p>
|
||||||
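<p>The effect of inline-data hints on the code analyzer can be illustrated
with a toy linear tracer in Python. It is a sketch under heavy
simplifications, not SourceGen's algorithm: it knows only three opcodes
(JSR absolute $20, BCS $B0, RTS $60), does not follow branches, and the
names <code>LENGTHS</code> and <code>trace</code> are hypothetical.</p>

```python
# Instruction lengths for the only opcodes this toy tracer understands.
LENGTHS = {0x20: 3, 0xB0: 2, 0x60: 1}  # JSR abs, BCS rel, RTS


def trace(data: bytes, start: int, inline_hints: set) -> list:
    """Return the offsets of instruction starts found by walking forward.

    Bytes whose offsets appear in `inline_hints` are skipped rather than
    decoded. Tracing stops at RTS, at an unknown opcode, or at end of data.
    """
    starts = []
    offset = start
    while offset < len(data):
        if offset in inline_hints:
            offset += 1  # hinted as inline data: never treated as an opcode
            continue
        opcode = data[offset]
        if opcode not in LENGTHS:
            break
        starts.append(offset)
        if opcode == 0x60:  # RTS ends this code path
            break
        offset += LENGTHS[opcode]
    return starts
```

<p>Given a <code>JSR $bf00</code> followed by three parameter bytes (the
values used in the test data are placeholders), a <code>BCS</code>, and an
<code>RTS</code>, hinting offsets 3 through 5 lets the tracer find
instructions at offsets 0, 6, and 8. Without the hints this toy version
simply stops at offset 3 because its opcode table is tiny; a real
disassembler would go on to decode the parameter bytes as
instructions.</p>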
@ -660,7 +661,7 @@ use the shortest instruction possible.</p>
|
|||||||
way. Some use opcode suffixes, others use operand prefixes, some
|
way. Some use opcode suffixes, others use operand prefixes, some
|
||||||
allow both. You can configure how they appear in the
|
allow both. You can configure how they appear in the
|
||||||
<a href="settings.html#app-settings">application settings</a>.</p>
|
<a href="settings.html#app-settings">application settings</a>.</p>
|
||||||
<p>SourcGen will only add width disambiguators to opcodes or operands when
|
<p>SourceGen will only add width disambiguators to opcodes or operands when
|
||||||
they are needed, with one exception: the opcode suffix for long
|
they are needed, with one exception: the opcode suffix for long
|
||||||
(24-bit address) operations is always applied. This is done because some
|
(24-bit address) operations is always applied. This is done because some
|
||||||
assemblers require it, insisting on "LDAL" rather than "LDA" for an
|
assemblers require it, insisting on "LDAL" rather than "LDA" for an
|
||||||
|
@ -45,7 +45,7 @@ limitations under the License.
|
|||||||
<TextBlock FontSize="24"
|
<TextBlock FontSize="24"
|
||||||
Text="{Binding ProgramVersionString, StringFormat={}Version {0},
|
Text="{Binding ProgramVersionString, StringFormat={}Version {0},
|
||||||
FallbackValue=Version X.Y.Z-alpha1}"/>
|
FallbackValue=Version X.Y.Z-alpha1}"/>
|
||||||
<TextBlock Text="Copyright 2018 faddenSoft" Margin="0,30,0,0"/>
|
<TextBlock Text="Copyright 2019 faddenSoft" Margin="0,30,0,0"/>
|
||||||
<TextBlock Text="Created by Andy McFadden"/>
|
<TextBlock Text="Created by Andy McFadden"/>
|
||||||
</StackPanel>
|
</StackPanel>
|
||||||
|
|
||||||
|