From a37143e9fce50c1a3e3257cc57235e68a222155f Mon Sep 17 00:00:00 2001 From: Andy McFadden Date: Mon, 7 Jun 2021 17:14:16 -0700 Subject: [PATCH] Execute scripts This change applies the substitution scripts on the HTML files, replacing away the jQuery load() calls with the actual file contents, and setting the correct URLs to the prev/next buttons. --- docs/sgtutorial/about-disasm.html | 369 ++++++------ docs/sgtutorial/address-tables.html | 582 ++++++++++--------- docs/sgtutorial/advanced-topics.html | 208 ++++--- docs/sgtutorial/digging-deeper.html | 382 +++++++------ docs/sgtutorial/editing-data.html | 456 ++++++++------- docs/sgtutorial/extension-scripts.html | 430 +++++++------- docs/sgtutorial/generating-code.html | 310 +++++----- docs/sgtutorial/index.html | 220 +++++--- docs/sgtutorial/inline-data.html | 320 ++++++----- docs/sgtutorial/labels-symbols.html | 752 +++++++++++++------------ docs/sgtutorial/local-variables.html | 588 ++++++++++--------- docs/sgtutorial/moving-around.html | 450 ++++++++------- docs/sgtutorial/odds-ends.html | 442 ++++++++------- docs/sgtutorial/sidenav-incl.html | 52 +- docs/sgtutorial/simple-edits.html | 456 ++++++++------- docs/sgtutorial/string-formatting.html | 290 ++++++---- docs/sgtutorial/using-sourcegen.html | 396 +++++++------ docs/sgtutorial/visualizations.html | 733 +++++++++++++----------- 18 files changed, 4125 insertions(+), 3311 deletions(-) diff --git a/docs/sgtutorial/about-disasm.html b/docs/sgtutorial/about-disasm.html index fd4cb1f..961f02c 100644 --- a/docs/sgtutorial/about-disasm.html +++ b/docs/sgtutorial/about-disasm.html @@ -1,161 +1,208 @@ - - - - - - - - - - - About Disassembly - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

About Disassembly

- -
-
-

Well-written assembly-language source code has meaningful - comments and labels, so that humans can read and understand it. - For example:

-
-          .org  $2000
-          sec                         ;set carry
-          ror   A                     ;shift into high bit
-          bmi   CopyData              ;branch always
-
-          .asciiz "first string"
-          .asciiz "another string"
-          .asciiz "string the third"
-          .asciiz "last string"
-
-CopyData  lda   #<addrs               ;get pointer into
-          sta   ptr                   ; address table
-          lda   #>addrs
-          sta   ptr+1
-
- -

Computers operate at a much lower level, so a piece of software - called an assembler is used to convert the source code to - object code that the CPU can execute. - Object code looks more like this:

-
-38 6a 30 39 66 69 72 73 74 20 73 74 72 69 6e 67
-00 61 6e 6f 74 68 65 72 20 73 74 72 69 6e 67 00
-73 74 72 69 6e 67 20 74 68 65 20 74 68 69 72 64
-00 6c 61 73 74 20 73 74 72 69 6e 67 00 a9 63 85
-02 a9 20 85 03
-
- -

This arrangement works perfectly well until somebody needs to - modify the software and nobody can find the original sources. - Disassembly is the act of taking a raw hex - dump and converting it to source code.

-
-
- -
-
- t0-bad-disasm -
-
-

Disassembling a blob of data can be tricky. A simple - disassembler can format instructions, but can't generally tell - the difference between instructions and data. Many 6502 programs - intermix code and data freely, so simply dumping everything as - an instruction stream can result in sections with nonsensical output.

-
-
- -
-
-

One way to separate code from data is to try to execute all - possible data paths. There are a number of reasons why it's difficult - or impossible to do this perfectly, but you can get pretty good - results by identifying execution entry points and just walking through - the code. When a conditional branch is encountered, both paths are - traversed. When all code has been traced, every byte that hasn't - been visited is either - data used by the program, or dead space not used by anything.

- -

The process can be improved by keeping track of the flags in the - 6502 status register. For example, in the code fragment shown - earlier, a BMI conditional branch instruction is used. - A simple tracing algorithm would both follow the branch and fall - through to the following instruction. However, the code that precedes - the BMI ensures that the branch is always taken, so a - clever disassembler would only trace that path.

- -

(The situation is worse on the 65816, because the length of - certain instructions is determined by the values of the processor - status flags.)

- -

Once the instructions and data are separated and formatted - nicely, it's still up to a human to figure out what it all means. - Comments and meaningful labels are needed to make sense of it. - These should be added to the disassembly listing.

-
-
- -
-
- t0-sourcegen -
-
-

SourceGen performs the instruction tracing, and makes it easy - to format operands and add labels and comments. - When the disassembled code is ready, SourceGen can generate source code - for a variety of modern cross-assemblers, and produce HTML listings - with embedded graphic visualizations.

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + About Disassembly - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

About Disassembly

+ +
+
+

Well-written assembly-language source code has meaningful + comments and labels, so that humans can read and understand it. + For example:

+
+          .org  $2000
+          sec                         ;set carry
+          ror   A                     ;shift into high bit
+          bmi   CopyData              ;branch always
+
+          .asciiz "first string"
+          .asciiz "another string"
+          .asciiz "string the third"
+          .asciiz "last string"
+
+CopyData  lda   #<addrs               ;get pointer into
+          sta   ptr                   ; address table
+          lda   #>addrs
+          sta   ptr+1
+
+ +

Computers operate at a much lower level, so a piece of software + called an assembler is used to convert the source code to + object code that the CPU can execute. + Object code looks more like this:

+
+38 6a 30 39 66 69 72 73 74 20 73 74 72 69 6e 67
+00 61 6e 6f 74 68 65 72 20 73 74 72 69 6e 67 00
+73 74 72 69 6e 67 20 74 68 65 20 74 68 69 72 64
+00 6c 61 73 74 20 73 74 72 69 6e 67 00 a9 63 85
+02 a9 20 85 03
+
+ +

This arrangement works perfectly well until somebody needs to + modify the software and nobody can find the original sources. + Disassembly is the act of taking a raw hex + dump and converting it to source code.
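
As a rough illustration of that conversion, here is a minimal Python sketch that formats the first three instructions of the hex dump shown earlier. The `OPCODES` table and the `disassemble` helper are hypothetical names made up for this example; they cover only the opcodes in the fragment and are not how SourceGen does it.

```python
# Toy opcode table: opcode -> (mnemonic, addressing mode, length in bytes).
# Only the opcodes that appear in the example fragment are listed.
OPCODES = {
    0x38: ("sec", "implied", 1),
    0x6A: ("ror", "accumulator", 1),
    0x30: ("bmi", "relative", 2),
    0xA9: ("lda", "immediate", 2),
    0x85: ("sta", "zeropage", 2),
}


def disassemble(data, org, count):
    """Format `count` consecutive instructions, starting at data[0]."""
    lines = []
    offset = 0
    for _ in range(count):
        mnemonic, mode, length = OPCODES[data[offset]]
        addr = org + offset
        if mode == "relative":
            # Branch operands are signed 8-bit displacements, measured
            # from the address of the instruction that follows.
            disp = data[offset + 1]
            if disp >= 0x80:
                disp -= 0x100
            operand = f"${(addr + length + disp) & 0xFFFF:04x}"
        elif mode == "immediate":
            operand = f"#${data[offset + 1]:02x}"
        elif mode == "zeropage":
            operand = f"${data[offset + 1]:02x}"
        elif mode == "accumulator":
            operand = "A"
        else:
            operand = ""
        lines.append(f"{addr:04x}: {mnemonic} {operand}".rstrip())
        offset += length
    return lines


# First four bytes of the object code, loaded at $2000.
listing = disassemble(bytes.fromhex("386a3039"), 0x2000, 3)
for line in listing:
    print(line)
```

The branch operand comes out as $203d, which is where `CopyData` lands once the four strings and their terminating zero bytes are counted.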

+
+
+ +
+
+ t0-bad-disasm +
+
+

Disassembling a blob of data can be tricky. A simple + disassembler can format instructions, but can't generally tell + the difference between instructions and data. Many 6502 programs + intermix code and data freely, so simply dumping everything as + an instruction stream can result in sections with nonsensical output.

+
+
+ +
+
+

One way to separate code from data is to try to execute all + possible data paths. There are a number of reasons why it's difficult + or impossible to do this perfectly, but you can get pretty good + results by identifying execution entry points and just walking through + the code. When a conditional branch is encountered, both paths are + traversed. When all code has been traced, every byte that hasn't + been visited is either + data used by the program, or dead space not used by anything.
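
The walk described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not SourceGen's analyzer: the `OPCODES` table is a hypothetical six-entry subset, an opcode missing from it simply ends the current path, and `trace` is a name invented for the example.

```python
HEX_DUMP = """
38 6a 30 39 66 69 72 73 74 20 73 74 72 69 6e 67
00 61 6e 6f 74 68 65 72 20 73 74 72 69 6e 67 00
73 74 72 69 6e 67 20 74 68 65 20 74 68 69 72 64
00 6c 61 73 74 20 73 74 72 69 6e 67 00 a9 63 85
02 a9 20 85 03
"""
DATA = bytes.fromhex("".join(HEX_DUMP.split()))

# Toy table: opcode -> (length, kind). Anything else ends the path.
OPCODES = {
    0x38: (1, "normal"),  # SEC
    0x6A: (1, "normal"),  # ROR A
    0x66: (2, "normal"),  # ROR zp
    0x30: (2, "branch"),  # BMI rel
    0xA9: (2, "normal"),  # LDA #imm
    0x85: (2, "normal"),  # STA zp
}


def trace(data, entry_offsets):
    """Return the set of file offsets reached by walking the code."""
    code = set()
    pending = list(entry_offsets)
    while pending:
        offset = pending.pop()
        while 0 <= offset < len(data) and offset not in code:
            entry = OPCODES.get(data[offset])
            if entry is None:
                break  # unknown opcode: stop following this path
            length, kind = entry
            if offset + length > len(data):
                break  # instruction would run off the end of the file
            code.update(range(offset, offset + length))
            if kind == "branch":
                disp = data[offset + 1]
                if disp >= 0x80:
                    disp -= 0x100
                # Conditional branch: queue the target, then keep
                # walking the fall-through path as well.
                pending.append(offset + length + disp)
            offset += length
    return code


code_offsets = trace(DATA, [0])
data_offsets = sorted(set(range(len(DATA))) - code_offsets)
print(sorted(code_offsets))
print(data_offsets[0], data_offsets[-1])
```

Note that offsets 4 and 5 end up marked as code: the walk falls through the `bmi` and decodes the "fi" of "first string" as `ror $69` before it hits a byte the table doesn't know.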

+ +

The process can be improved by keeping track of the flags in the + 6502 status register. For example, in the code fragment shown + earlier, a BMI conditional branch instruction is used. + A simple tracing algorithm would both follow the branch and fall + through to the following instruction. However, the code that precedes + the BMI ensures that the branch is always taken, so a + clever disassembler would only trace that path.
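
Here is one way the flag tracking might look, as a small Python sketch. It models only the carry and negative flags and only the instructions in the fragment; `flags_after` and `branch_paths` are hypothetical helpers, and a real analyzer has to track far more than this.

```python
UNKNOWN = None  # a flag is True, False, or UNKNOWN


def flags_after(instructions):
    """Track the C and N flags across a straight-line run of code."""
    flags = {"C": UNKNOWN, "N": UNKNOWN}
    for ins in instructions:
        if ins == "sec":
            flags["C"] = True
        elif ins == "clc":
            flags["C"] = False
        elif ins == "ror A":
            # The old carry rotates into bit 7, which becomes N.
            # The new carry is the old bit 0, which we don't know.
            flags["N"] = flags["C"]
            flags["C"] = UNKNOWN
        else:
            # Anything this sketch doesn't model: assume nothing.
            flags = {"C": UNKNOWN, "N": UNKNOWN}
    return flags


def branch_paths(mnemonic, flags):
    """Return the set of paths worth tracing for a BMI or BPL."""
    n = flags["N"]
    if n is UNKNOWN:
        return {"taken", "fallthrough"}
    taken = n if mnemonic == "bmi" else not n
    return {"taken"} if taken else {"fallthrough"}


with_sec = branch_paths("bmi", flags_after(["sec", "ror A"]))
without_sec = branch_paths("bmi", flags_after(["ror A"]))
print(with_sec, without_sec)
```

With the `sec` in place only the taken path is returned, so the string bytes after the `bmi` are never decoded as instructions. Without it, both paths have to be traced.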

+ +

(The situation is worse on the 65816, because the length of + certain instructions is determined by the values of the processor + status flags.)

+ +

Once the instructions and data are separated and formatted + nicely, it's still up to a human to figure out what it all means. + Comments and meaningful labels are needed to make sense of it. + These should be added to the disassembly listing.

+
+
+ +
+
+ t0-sourcegen +
+
+

SourceGen performs the instruction tracing, and makes it easy + to format operands and add labels and comments. + When the disassembled code is ready, SourceGen can generate source code + for a variety of modern cross-assemblers, and produce HTML listings + with embedded graphic visualizations.

+
+
+ + +
+ +
+ Next » +
+ + + + + diff --git a/docs/sgtutorial/address-tables.html b/docs/sgtutorial/address-tables.html index 9242384..ba76b71 100644 --- a/docs/sgtutorial/address-tables.html +++ b/docs/sgtutorial/address-tables.html @@ -1,267 +1,315 @@ - - - - - - - - - - - Address Tables - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Address Tables

- -
-
-

Code often contains tables of addresses to code or data. - Formatting them one at a time can be tedious, so SourceGen - provides a faster way. For this tutorial we'll start by labeling - and tagging a single entry by hand, then do the rest in one shot.

- -

Start a new project. Select the Apple //e platform, click - Select File and navigate to the 6502bench Examples directory. - In the "A2-Amper-fdraw" directory, select the file "AMPERFDRAW#061d60" - (just ignore the existing .dis65 file). - Click OK to create the project.

-
-
- -
-
- t3-initial -
-
-

Not a lot to see here -- just half a dozen lines of loads and stores, - then nothing but data. - This particular program interfaces with Applesoft BASIC, so we can make it - a bit more meaningful by loading an additional platform - symbol file.

-
-
- -
-
- t3-a2-props -
-
-

Select Edit > Project Properties, then the - Symbol Files tab. Click Add Symbol Files from Runtime. - The file browser starts in the "RuntimeData" directory. - Open the "Apple" folder, then select "Applesoft.sym65", - and click Open. Click OK to close - the project properties window.

-
-
- -
-
- t3-amperv -
-
-

The STA instructions now reference BAS_AMPERV, - which is noted as a code vector. We can see the code setting up a jump - (opcode $4C) to $1D70.

-
-
- -
-
- t3-1d70 -
-
-

As it happens, the start address of the code - is $1D60 -- the last four digits of the filename -- so let's make that - change. Double-click the initial .ORG statement, - and change it from $2000 to $1D60. We can now see that $1D70 starts - right after this initial chunk of code.

-
-
- -
-
- t3-1d70-code -
-
-

Select the line with address $1D70, then - Actions > Tag Address As Code Start Point. - More code appears, but not much -- if you scroll down you'll see that most - of the file is still data.

- -

The code at $1D70 searches through a table at - $1D88 for a match with the contents of the accumulator. If it finds a match, - it loads bytes from tables at $1DA6 and $1D97, pushes them on the stack, - and then JMPs away. This code is pushing a return address onto the stack. - When the code at BAS_CHRGET returns, it'll return to that - address. Because of a quirk of the 6502 architecture, the address pushed - must be the desired address minus one.

-
-
- -
-
- t3-1d97 -
-
-

The first byte in the first address table at $1D97 (which - has the auto-label L1D97) is $B4. - The first byte in the second table is $1D. So the first - address we want is $1DB4 + 1 = $1DB5.

-
-
- -
-
- t3-1d97-edit.png -
-
-

Select the line at $1DB5, and use - Actions > Tag Address As Code Start Point. - More code appears, but again it's only a few lines. Let's dress this one - up a bit. Set a label on the code at $1DB5 called "FUNC". - Then, at $1D97, edit the data item (double-click on "$B4"), - click Single bytes, then type "FUNC" - (note the text field gets focus immediately, and the radio button - automatically switches to symbolic reference when you start typing). - Click OK.

-
-
- -
-
- t3-1d97-post.png -
-
-

The operand at $1D97 should now say <FUNC-1. - Repeat the process at $1DA6, this time clicking the High - part radio button below the symbol entry text box, - to make the operand there say >FUNC. (If it says - <FUNC-152, you forgot to select the high part.)

- -

We've now changed the first entry in the address table to a - symbolic reference, which helps someone reading the code to - understand what is being referenced. You could repeat these - steps (tag as code, set label, change address bytes to symbols) - for the remaining items, but there's an easier way.

-
-
- -
-
- t3-format-dialog -
-
-

Click on the line at address $1D97, then shift-click the line at - address $1DA9 (which should be .FILL 12,$1e). Select - Actions > Format Address Table.

- -

Contrary to first impressions, this imposing dialog does not allow you - to launch objects into orbit. There are a variety of common ways to - structure an address table, all of which are handled here. You can - configure the various parameters and see the effects as you make - each change.

-
-
- -
-
- t3-format-cfg -
-
-

The message at the top should indicate that there are 30 bytes - selected. In Address Characteristics, click the - Parts are split across sub-tables checkbox and the - Adjusted for RTS/RTL checkbox. - As soon as you do, the first line of the Generated Addresses - list should show the symbol "FUNC". - The rest of the addresses will look like - "(+) T1DD0". The "(+)" means that a label was not found at - that location, so a new global label will be generated automatically.

- -

Down near the bottom, check the - Tag targets as code start points checkbox. - Because we saw the table contents being pushed onto the stack for - RTS, we know that they're all code entry points.

-

Click OK.

-
-
- -
-
- t3-format-done -
-
-

The table of address bytes at $1D97 should now all be - references to symbols -- 15 low parts followed by 15 high parts. If you - scroll down, you should see nothing but instructions until you get to the - last dozen bytes at the end of the file. (If this isn't the case, use - Edit > Undo, then work through the steps again.)

-
-
- -
-
-

The formatter did the same series of actions you went through earlier, - - but applied them to multiple locations in one shot. The next step in - the disassembly process would be to rename the "Tnnnn" labels to - something more meaningful.

- -

We don't want to save this project, so select - File > Close. When SourceGen asks for confirmation, - click Discard & Continue.

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Address Tables - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Address Tables

+ +
+
+

Code often contains tables of addresses to code or data. + Formatting them one at a time can be tedious, so SourceGen + provides a faster way. For this tutorial we'll start by labeling + and tagging a single entry by hand, then do the rest in one shot.

+ +

Start a new project. Select the Apple //e platform, click + Select File and navigate to the 6502bench Examples directory. + In the "A2-Amper-fdraw" directory, select the file "AMPERFDRAW#061d60" + (just ignore the existing .dis65 file). + Click OK to create the project.

+
+
+ +
+
+ t3-initial +
+
+

Not a lot to see here -- just half a dozen lines of loads and stores, + then nothing but data. + This particular program interfaces with Applesoft BASIC, so we can make it + a bit more meaningful by loading an additional platform + symbol file.

+
+
+ +
+
+ t3-a2-props +
+
+

Select Edit > Project Properties, then the + Symbol Files tab. Click Add Symbol Files from Runtime. + The file browser starts in the "RuntimeData" directory. + Open the "Apple" folder, then select "Applesoft.sym65", + and click Open. Click OK to close + the project properties window.

+
+
+ +
+
+ t3-amperv +
+
+

The STA instructions now reference BAS_AMPERV, + which is noted as a code vector. We can see the code setting up a jump + (opcode $4C) to $1D70.

+
+
+ +
+
+ t3-1d70 +
+
+

As it happens, the start address of the code + is $1D60 -- the last four digits of the filename -- so let's make that + change. Double-click the initial .ORG statement, + and change it from $2000 to $1D60. We can now see that $1D70 starts + right after this initial chunk of code.

+
+
+ +
+
+ t3-1d70-code +
+
+

Select the line with address $1D70, then + Actions > Tag Address As Code Start Point. + More code appears, but not much -- if you scroll down you'll see that most + of the file is still data.

+ +

The code at $1D70 searches through a table at + $1D88 for a match with the contents of the accumulator. If it finds a match, + it loads bytes from tables at $1DA6 and $1D97, pushes them on the stack, + and then JMPs away. This code is pushing a return address onto the stack. + When the code at BAS_CHRGET returns, it'll return to that + address. Because of a quirk of the 6502 architecture, the address pushed + must be the desired address minus one.

+
+
+ +
+
+ t3-1d97 +
+
+

The first byte in the first address table at $1D97 (which + has the auto-label L1D97) is $B4. + The first byte in the second table is $1D. So the first + address we want is $1DB4 + 1 = $1DB5.
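
The arithmetic is easy to check in Python. `rts_target` is a hypothetical helper written for this example; the two byte values are the ones quoted above.

```python
def rts_target(low, high):
    """Combine a split low/high pair and undo the minus-one adjustment.

    RTS pulls the address from the stack and adds one before resuming,
    so the value pushed is the desired address minus one.
    """
    pushed = (high << 8) | low
    return (pushed + 1) & 0xFFFF


first_target = rts_target(0xB4, 0x1D)
print(f"${first_target:04X}")
```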

+
+
+ +
+
+ t3-1d97-edit.png +
+
+

Select the line at $1DB5, and use + Actions > Tag Address As Code Start Point. + More code appears, but again it's only a few lines. Let's dress this one + up a bit. Set a label on the code at $1DB5 called "FUNC". + Then, at $1D97, edit the data item (double-click on "$B4"), + click Single bytes, then type "FUNC" + (note the text field gets focus immediately, and the radio button + automatically switches to symbolic reference when you start typing). + Click OK.

+
+
+ +
+
+ t3-1d97-post.png +
+
+

The operand at $1D97 should now say <FUNC-1. + Repeat the process at $1DA6, this time clicking the High + part radio button below the symbol entry text box, + to make the operand there say >FUNC. (If it says + <FUNC-152, you forgot to select the high part.)
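
The odd-looking "-152" falls out of simple byte arithmetic. The sketch below is an assumption about how the numbers relate, not a description of SourceGen's internals: `operand_text` is a hypothetical helper that picks the low or high part of the symbol and reports whatever adjustment is needed to reach the byte stored in the table.

```python
FUNC = 0x1DB5  # the address we labeled "FUNC"


def operand_text(byte_value, symbol, value, part):
    """Express byte_value as the low or high part of a symbol,
    plus whatever adjustment is left over."""
    if part == "low":
        selected, prefix = value & 0xFF, "<"
    else:
        selected, prefix = (value >> 8) & 0xFF, ">"
    adjustment = byte_value - selected
    if adjustment == 0:
        return f"{prefix}{symbol}"
    return f"{prefix}{symbol}{adjustment:+d}"


print(operand_text(0xB4, "FUNC", FUNC, "low"))   # byte at $1D97
print(operand_text(0x1D, "FUNC", FUNC, "high"))  # byte at $1DA6
print(operand_text(0x1D, "FUNC", FUNC, "low"))   # wrong part selected
```

The low byte of FUNC is $B5, and $1D is 152 less than that, which is why picking the wrong part shows an adjustment of -152.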

+ +

We've now changed the first entry in the address table to a + symbolic reference, which helps someone reading the code to + understand what is being referenced. You could repeat these + steps (tag as code, set label, change address bytes to symbols) + for the remaining items, but there's an easier way.

+
+
+ +
+
+ t3-format-dialog +
+
+

Click on the line at address $1D97, then shift-click the line at + address $1DA9 (which should be .FILL 12,$1e). Select + Actions > Format Address Table.

+ +

Contrary to first impressions, this imposing dialog does not allow you + to launch objects into orbit. There are a variety of common ways to + structure an address table, all of which are handled here. You can + configure the various parameters and see the effects as you make + each change.

+
+
+ +
+
+ t3-format-cfg +
+
+

The message at the top should indicate that there are 30 bytes + selected. In Address Characteristics, click the + Parts are split across sub-tables checkbox and the + Adjusted for RTS/RTL checkbox. + As soon as you do, the first line of the Generated Addresses + list should show the symbol "FUNC". + The rest of the addresses will look like + "(+) T1DD0". The "(+)" means that a label was not found at + that location, so a new global label will be generated automatically.
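
To see what those two checkboxes mean, here is a Python sketch of decoding a split, RTS-adjusted table. `decode_split_table` is a hypothetical helper, and the four-byte excerpt is built from the two targets named in this tutorial (FUNC at $1DB5 and T1DD0); the real table has 15 entries, and the position of T1DD0 within it is an assumption.

```python
def decode_split_table(table, adjust=1):
    """Decode a table stored as all low bytes, then all high bytes.

    `adjust` undoes the minus-one applied to addresses that are
    pushed on the stack for RTS.
    """
    if len(table) % 2:
        raise ValueError("a split table needs an even number of bytes")
    count = len(table) // 2
    lows, highs = table[:count], table[count:]
    return [(((hi << 8) | lo) + adjust) & 0xFFFF
            for lo, hi in zip(lows, highs)]


# Two-entry excerpt: low bytes first, then the matching high bytes.
excerpt = bytes([0xB4, 0xCF, 0x1D, 0x1D])
targets = decode_split_table(excerpt)
labels = [f"T{t:04X}" for t in targets]
print(labels)
```

A 30-byte selection decodes the same way into 15 addresses, which matches the "15 low parts followed by 15 high parts" layout.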

+ +

Down near the bottom, check the + Tag targets as code start points checkbox. + Because we saw the table contents being pushed onto the stack for + RTS, we know that they're all code entry points.

+

Click OK.

+
+
+ +
+
+ t3-format-done +
+
+

The table of address bytes at $1D97 should now all be + references to symbols -- 15 low parts followed by 15 high parts. If you + scroll down, you should see nothing but instructions until you get to the + last dozen bytes at the end of the file. (If this isn't the case, use + Edit > Undo, then work through the steps again.)

+
+
+ +
+
+

The formatter did the same series of actions you went through earlier, + + but applied them to multiple locations in one shot. The next step in + the disassembly process would be to rename the "Tnnnn" labels to + something more meaningful.

+ +

We don't want to save this project, so select + File > Close. When SourceGen asks for confirmation, + click Discard & Continue.

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/advanced-topics.html b/docs/sgtutorial/advanced-topics.html index 536d4dc..ba14d00 100644 --- a/docs/sgtutorial/advanced-topics.html +++ b/docs/sgtutorial/advanced-topics.html @@ -1,80 +1,128 @@ - - - - - - - - - - - Advanced Topics - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Advanced Topics

- -

This section has tutorials for three subjects:

- - - -

These features are not essential, but they can make your life easier. -All of the tutorials assume you are already familiar with SourceGen.

- -

 

- -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Advanced Topics - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Advanced Topics

+ +

This section has tutorials for three subjects:

+ + + +

These features are not essential, but they can make your life easier. +All of the tutorials assume you are already familiar with SourceGen.

+ +

 

+ +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/digging-deeper.html b/docs/sgtutorial/digging-deeper.html index bc621aa..12b93d1 100644 --- a/docs/sgtutorial/digging-deeper.html +++ b/docs/sgtutorial/digging-deeper.html @@ -1,167 +1,215 @@ - - - - - - - - - - - Digging Deeper - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Digging Deeper

- -
-
-

This tutorial will walk you through some of the fancier things SourceGen - can do. We assume you've already finished the basic features tutorial, - and know how to create projects and move around in them.

-

Start a new project. Select Generic 6502. For the - data file, navigate to the Examples directory, then from the Tutorial - directory select "Tutorial2".

-
-
- -
-
- t2-tutorial-top -
-
-

Looking at the code list, the first thing you'll notice is that we - immediately ran into a - BRK, which is a pretty reliable sign that we're not in - a code section. This particular file begins with 00 20, which - could be a load address (e.g. some C64 binaries look like this). So let's start - with that assumption.

-
-
- -
-
-

As discussed in the introductory material, SourceGen separates code - from data by tracing all possible execution paths from declared entry - points. The generic profiles mark the first byte of the file as an entry - point, but that's wrong here. We want to change the entry point to - be after the 16-bit load address, at offset +000002.

-
-
- -
-
- t2-1000-edit1 -
-
-

Click on the first line of code at address $1000, and select - Actions > Remove Analyzer Tags - (Ctrl+H Ctrl+R). - This removes the "code entry point" tag. - Unfortunately the $20 is still auto-detected as being part of a string - directive.

-
-
- -
-
- t2-1000-edit2 -
-
-

The string is making it hard to manipulate the next few bytes, - so let's fix that by selecting Edit > Toggle Data Scan - (Ctrl+D). This turns off the feature that - automatically generates string and .FILL directives, - so now each uncategorized byte is on its own line.

-
-
- -
-
- t2-1000-fmt-word -
-
-

You could select the first two lines and use - Actions > Edit Operand to format them as a 16-bit - little-endian hex value, but there's a shortcut: select the first - line with data (address $1000), then Actions > Format As Word - (Ctrl+W). - It automatically grabbed the following byte and combined them.

-
-
- -
-
- t2-1000-setcode -
-
-

Since we believe $2000 is the load address for everything that follows, - click on the line with address $1002, select - Actions > Set Address, and enter "2000". With that line - still selected, use Actions > Tag Address As Code Start Point - (Ctrl+H Ctrl+C) to - tell the analyzer to start looking for code there.

-
-
- -
-
- t2-1000-ready -
-
-

That looks better, but the branch destination ($203D) is off the bottom of the - screen (unless you have a really tall screen or small fonts) because of - all the intervening data. Use Edit > Toggle Data Scan - (Ctrl+D) - to turn the string-finder back on. Now it's easier to read.

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Digging Deeper - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Digging Deeper

+ +
+
+

This tutorial will walk you through some of the fancier things SourceGen + can do. We assume you've already finished the basic features tutorial, + and know how to create projects and move around in them.

+

Start a new project. Select Generic 6502. For the + data file, navigate to the Examples directory, then from the Tutorial + directory select "Tutorial2".

+
+
+ +
+
+ t2-tutorial-top +
+
+

Looking at the code list, the first thing you'll notice is that we + immediately ran into a + BRK, which is a pretty reliable sign that we're not in + a code section. This particular file begins with 00 20, which + could be a load address (e.g. some C64 binaries look like this). So let's start + with that assumption.

+
+
+ +
+
+

As discussed in the introductory material, SourceGen separates code + from data by tracing all possible execution paths from declared entry + points. The generic profiles mark the first byte of the file as an entry + point, but that's wrong here. We want to change the entry point to + be after the 16-bit load address, at offset +000002.

+
+
+ +
+
+ t2-1000-edit1 +
+
+

Click on the first line of code at address $1000, and select + Actions > Remove Analyzer Tags + (Ctrl+H Ctrl+R). + This removes the "code entry point" tag. + Unfortunately the $20 is still auto-detected as being part of a string + directive.

+
+
+ +
+
+ t2-1000-edit2 +
+
+

The string is making it hard to manipulate the next few bytes, + so let's fix that by selecting Edit > Toggle Data Scan + (Ctrl+D). This turns off the feature that + automatically generates string and .FILL directives, + so now each uncategorized byte is on its own line.

+
+
+ +
+
+ t2-1000-fmt-word +
+
+

You could select the first two lines and use + Actions > Edit Operand to format them as a 16-bit + little-endian hex value, but there's a shortcut: select the first + line with data (address $1000), then Actions > Format As Word + (Ctrl+W). + It automatically grabbed the following byte and combined them.
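
If you want to convince yourself about the byte order, a couple of lines of Python will do it. The `address_of` helper is hypothetical, and it assumes the guess made earlier: a two-byte load address followed by code that loads at that address.

```python
# The first two bytes of the file, in file order.
header = bytes([0x00, 0x20])

# Little-endian: the low byte comes first.
load_address = int.from_bytes(header, "little")


def address_of(offset, load_address, header_size=2):
    """Map a file offset to a memory address, skipping the header."""
    return load_address + (offset - header_size)


print(f"${load_address:04X}")
print(f"${address_of(2, load_address):04X}")
```

So offset +000002, the first byte after the header, is the byte that ends up at $2000.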

+
+
+ +
+
+ t2-1000-setcode +
+
+

Since we believe $2000 is the load address for everything that follows, + click on the line with address $1002, select + Actions > Set Address, and enter "2000". With that line + still selected, use Actions > Tag Address As Code Start Point + (Ctrl+H Ctrl+C) to + tell the analyzer to start looking for code there.

+
+
+ +
+
+ t2-1000-ready +
+
+

That looks better, but the branch destination ($203D) is off the bottom of the + screen (unless you have a really tall screen or small fonts) because of + all the intervening data. Use Edit > Toggle Data Scan + (Ctrl+D) + to turn the string-finder back on. Now it's easier to read.

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/editing-data.html b/docs/sgtutorial/editing-data.html index c10ee06..f4ca981 100644 --- a/docs/sgtutorial/editing-data.html +++ b/docs/sgtutorial/editing-data.html @@ -1,204 +1,252 @@ - - - - - - - - - - - Editing Data - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Editing Data

- -
-
- t1-data-stringblob -
-
-

There's some string and numeric data down at the bottom of the file. The - final string appears to be multiple strings stuck together. (You may need - to increase the width of the Operand column to see the whole thing.) Notice - that the opcode for the very last line is '+', which means the operand - is a continuation of the previous line. Long data items can span multiple - lines, split every 64 characters (including delimiters), but they are - still single items: selecting any part selects the whole.

-
-
- -
-
- t1-data-editdlg-1 -
-
-

Select the last line in the file, then Actions > Edit Operand. - You'll notice that this dialog is much different from the one you got when editing - the operand of an instruction. At the top it will say "65 bytes - selected". You can format this as a single 65-byte string, as 65 individual - items, or various things in between. For now, select Single bytes, - and then on the right, select ASCII (low or high) character. - Click OK.

-
-
- -
-
- t1-data-edited-1 -
-
-

Each character is now on its own line. The selection still spans the - same set of addresses.

-
-
- -
-
- t1-data-editdlg-2 -
-
-

Select address $203D on its own, then Actions > Edit Label. Set the - label to "STR1". Move up a bit and select address $2030, then scroll to - the bottom and shift-click address $2070. Select - Actions > Edit Operand. - At the top it should now say, "65 bytes selected in 2 groups". - There are two groups because the presence of a label split the data into - two separate regions. From the Character encoding pop-up down - in the "String" section, make sure Low or High ASCII encoding - is selected, then select the Mixed character and non-character - string type and click OK.

-
-
- -
-
- t1-data-edited-2 -
-
-

We now have two .STR lines, one for "string zero ", - and one with the "STR1" label and the rest of the string data. - This is okay, - but it's not really what we want. The code at $200B appears to be loading - a 16-bit address from data at $2025, so we want to use that if we can.

-

Select Edit > Undo three times. You should be back to the - state where there's a single .STR line at the bottom of - the file, split across two lines with a '+'.

-
-
- -
-
- t1-data-string-bytes -
-
-

Select the line at $2026. This is currently formatted as a string, - but that appears to be incorrect, so let's format it as individual bytes - instead. There's an easy way to do that: use - Actions > Toggle Single-Byte Format - (or hit Ctrl+B).

-
-
- -
-
-

The data starting at $2025 appears to be 16-bit little-endian - addresses that point into the table of strings, so let's format - them appropriately.

- -
-
- -
-
- t1-data-editdlg-3 -
-
-

Select the line at $2025, then shift-click the line at $202E. Right-click - and select Edit Operand. If you selected the correct set of bytes, - the top line in the dialog should now say, "10 bytes selected". - Because 10 is a multiple of two, the 16-bit formats are enabled. It's not a multiple - of 3 or 4, so the 24-bit and 32-bit options are not enabled. Click the - 16-bit words, little-endian radio button, then over to the right, - click the Address radio button. Click OK.

-
-
- -
-
- t1-data-edited-3 -
-
-

We just told SourceGen that those 10 bytes are actually five 16-bit numeric - references. SourceGen determined that the addresses are contained in the - file, and generated labels for each of them. Labels only work if they're - on their own line, so the long string was automatically split into five - separate .STR statements.

-
-
- -
-
-

Note we didn't explicitly format the string data. We formatted - the addresses at $2025, which placed labels at the start of the strings, - but the strings themselves were automatically detected and formatted - by SourceGen. By default, SourceGen looks for ASCII strings, but this - can be changed in the project properties. You can even disable the - string auto-detection entirely if you want.

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Editing Data - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Editing Data

+ +
+
+ t1-data-stringblob +
+
+

There's some string and numeric data down at the bottom of the file. The + final string appears to be multiple strings stuck together. (You may need + to increase the width of the Operand column to see the whole thing.) Notice + that the opcode for the very last line is '+', which means the operand + is a continuation of the previous line. Long data items can span multiple + lines, split every 64 characters (including delimiters), but they are + still single items: selecting any part selects the whole.

+
+
+ +
+
+ t1-data-editdlg-1 +
+
+

Select the last line in the file, then Actions > Edit Operand. + You'll notice that this dialog is much different from the one you got when editing + the operand of an instruction. At the top it will say "65 bytes + selected". You can format this as a single 65-byte string, as 65 individual + items, or various things in between. For now, select Single bytes, + and then on the right, select ASCII (low or high) character. + Click OK.

+
+
+ +
+
+ t1-data-edited-1 +
+
+

Each character is now on its own line. The selection still spans the + same set of addresses.

+
+
+ +
+
+ t1-data-editdlg-2 +
+
+

Select address $203D on its own, then Actions > Edit Label. Set the + label to "STR1". Move up a bit and select address $2030, then scroll to + the bottom and shift-click address $2070. Select + Actions > Edit Operand. + At the top it should now say, "65 bytes selected in 2 groups". + There are two groups because the presence of a label split the data into + two separate regions. From the Character encoding pop-up down + in the "String" section, make sure Low or High ASCII encoding + is selected, then select the Mixed character and non-character + string type and click OK.

+
+
+ +
+
+ t1-data-edited-2 +
+
+

We now have two .STR lines, one for "string zero ", + and one with the "STR1" label and the rest of the string data. + This is okay, + but it's not really what we want. The code at $200B appears to be loading + a 16-bit address from data at $2025, so we want to use that if we can.

+

Select Edit > Undo three times. You should be back to the + state where there's a single .STR line at the bottom of + the file, split across two lines with a '+'.

+
+
+ +
+
+ t1-data-string-bytes +
+
+

Select the line at $2026. This is currently formatted as a string, + but that appears to be incorrect, so let's format it as individual bytes + instead. There's an easy way to do that: use + Actions > Toggle Single-Byte Format + (or hit Ctrl+B).

+
+
+ +
+
+

The data starting at $2025 appears to be 16-bit little-endian + addresses that point into the table of strings, so let's format + them appropriately.

+ +
+
+ +
+
+ t1-data-editdlg-3 +
+
+

Select the line at $2025, then shift-click the line at $202E. Right-click + and select Edit Operand. If you selected the correct set of bytes, + the top line in the dialog should now say, "10 bytes selected". + Because 10 is a multiple of two, the 16-bit formats are enabled. It's not a multiple + of 3 or 4, so the 24-bit and 32-bit options are not enabled. Click the + 16-bit words, little-endian radio button, then over to the right, + click the Address radio button. Click OK.

+
+
+ +
+
+ t1-data-edited-3 +
+
+

We just told SourceGen that those 10 bytes are actually five 16-bit numeric + references. SourceGen determined that the addresses are contained in the + file, and generated labels for each of them. Labels only work if they're + on their own line, so the long string was automatically split into five + separate .STR statements.
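
To make the "16-bit words, little-endian" idea concrete, here is a minimal Python sketch that decodes a 10-byte table into five addresses. It is only an illustration: the byte values and the `decode_words` helper are invented for this example and are not taken from the tutorial file or from SourceGen itself.

```python
def decode_words(data, width=2):
    """Split raw bytes into little-endian integers of `width` bytes each."""
    if len(data) % width != 0:
        # Mirrors the dialog: a format is only offered when the selection
        # length is a multiple of the item width.
        raise ValueError("selection is not a multiple of %d bytes" % width)
    return [int.from_bytes(data[i:i + width], "little")
            for i in range(0, len(data), width)]

# Hypothetical table contents: five addresses, low byte first.
table = bytes([0x30, 0x20, 0x3D, 0x20, 0x4A, 0x20, 0x57, 0x20, 0x64, 0x20])

addresses = decode_words(table)               # 16-bit format: 10 % 2 == 0
labels = ["L%04X" % addr for addr in addresses]
print(labels)
```

Calling `decode_words(table, 3)` or `decode_words(table, 4)` raises `ValueError`, which is the same reason the 24-bit and 32-bit options are unavailable for a 10-byte selection.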

+
+
+ +
+
+

Note we didn't explicitly format the string data. We formatted + the addresses at $2025, which placed labels at the start of the strings, + but the strings themselves were automatically detected and formatted + by SourceGen. By default, SourceGen looks for ASCII strings, but this + can be changed in the project properties. You can even disable the + string auto-detection entirely if you want.

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/extension-scripts.html b/docs/sgtutorial/extension-scripts.html index 2b6f433..f8d9f86 100644 --- a/docs/sgtutorial/extension-scripts.html +++ b/docs/sgtutorial/extension-scripts.html @@ -1,191 +1,239 @@ - - - - - - - - - - - Extension Scripts - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Extension Scripts

- -
-
-

Some repetitive formatting tasks can be handled with automatic scripts. - This is especially useful for inline data, which can confuse the code - analyzer.

-

An earlier tutorial demonstrated how to manually mark bytes as - inline data. We're going to do it a faster way. For this tutorial, - start a new project with the Generic 6502 profile, and - in the SourceGen Examples/Tutorial directory select "Tutorial4".

-

We'll need to load scripts from the project directory, so we have to - save the project. File > Save, - use the default name ("Tutorial4.dis65").

-
-
- -
-
- t4-add-inlinel1 -
-
-

Take a look at the disassembly listing. The file starts with a - JSR followed by a string that begins with a small number. - This appears to be a string with a leading length byte. We want to load - a script that can handle that, so use - Edit > Project Properties, select the - Extension Scripts tab, and click - Add Scripts from Project. The file - browser opens in the project directory. Select the file - "InlineL1String.cs", click Open, then OK.

-
-
- -
-
- t4-inlinel1-src -
-
-

Nothing happened. If you look at the script with an editor (and you - know some C#), you'll see that it's looking for a JSR to a - function called PrintInlineL1String. So let's give it one.

-
-
- -
-
-

Double-click the JSR opcode on line $1000 - to jump to address $1026. The only thing there is an RTS. - It's supposed to be a routine that prints a string with a leading length - byte, but for the sake of keeping the example code short it's just a - place-holder. Use the curly toolbar arrow - (or Alt+LeftArrow) to jump back to $1026.

-
-
- -
-
- t4-inlinel1-edit -
-
-

This time, double-click the JSR operand - ("L1026") to edit the operand. - Click Create Label, and enter PrintInlineL1String. - Remember that labels are case-sensitive; - you must enter it exactly as shown. Hit OK to accept the label, - and OK to close the operand editor.

-
-
- -
-
- t4-inlinel1-done -
-
-

If all went well, address $1003 - should now be an L1 string "How long?", and address $100D - should be another JSR.

-
-
- -
-
-

The next JSR appears to be followed by a null-terminated string, so - we'll need something that handles that.

-
-
- -
-
- t4-inlinenull-src -
-
-

Go back into Project Properties - and add the script called "InlineNullTermString.cs" from the project directory. - This script is slightly different, in that it handles any JSR to a label - that starts with PrintInlineNullString. So let's give it a couple of - those.

-
-
- -
-
- t4-inlinenull-done -
-
-

Double-click the operand on line $100D ("L1027"), - click Create Label, - and set the label to "PrintInlineNullStringOne". - Hit OK twice. That formatted the first one and got us - to the next JSR. Repeat the process on line $1019 - ("L1028"), setting the label to "PrintInlineNullStringTwo".

-
-
- -
-
-

The entire project is now nicely formatted. In a real project the - "Print Inline" locations would be actual print functions, not just RTS - instructions. There would likely be multiple JSRs to the print function, - so labeling a single function entry point could format dozens of inline - strings and clean up the disassembly automatically. The reason for - allowing wildcard names is that some functions may have multiple - entry points or chain through different locations.

- -

Extension scripts can make your life much easier, but they do require - some programming experience. See the SourceGen manual for more details.

-
-
- -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Extension Scripts - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Extension Scripts

+ +
+
+

Some repetitive formatting tasks can be handled with automatic scripts. + This is especially useful for inline data, which can confuse the code + analyzer.

+

An earlier tutorial demonstrated how to manually mark bytes as + inline data. We're going to do it a faster way. For this tutorial, + start a new project with the Generic 6502 profile, and + in the SourceGen Examples/Tutorial directory select "Tutorial4".

+

We'll need to load scripts from the project directory, so we have to + save the project. File > Save, + use the default name ("Tutorial4.dis65").

+
+
+ +
+
+ t4-add-inlinel1 +
+
+

Take a look at the disassembly listing. The file starts with a + JSR followed by a string that begins with a small number. + This appears to be a string with a leading length byte. We want to load + a script that can handle that, so use + Edit > Project Properties, select the + Extension Scripts tab, and click + Add Scripts from Project. The file + browser opens in the project directory. Select the file + "InlineL1String.cs", click Open, then OK.

+
+
+ +
+
+ t4-inlinel1-src +
+
+

Nothing happened. If you look at the script with an editor (and you + know some C#), you'll see that it's looking for a JSR to a + function called PrintInlineL1String. So let's give it one.

+
+
+ +
+
+

Double-click the JSR opcode on line $1000 + to jump to address $1026. The only thing there is an RTS. + It's supposed to be a routine that prints a string with a leading length + byte, but for the sake of keeping the example code short it's just a + place-holder. Use the curly toolbar arrow + (or Alt+LeftArrow) to jump back to $1000.

+
+
+ +
+
+ t4-inlinel1-edit +
+
+

This time, double-click the JSR operand + ("L1026") to edit the operand. + Click Create Label, and enter PrintInlineL1String. + Remember that labels are case-sensitive; + you must enter it exactly as shown. Hit OK to accept the label, + and OK to close the operand editor.

+
+
+ +
+
+ t4-inlinel1-done +
+
+

If all went well, address $1003 + should now be an L1 string "How long?", and address $100D + should be another JSR.
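
The addresses above follow from simple arithmetic, which the Python sketch below walks through. It is not the actual extension script (that one is written in C#); the `parse_l1_string` helper is hypothetical, and the byte values are reconstructed from the description here, assuming the string is stored as plain ASCII.

```python
def parse_l1_string(memory, base_addr, addr):
    """Return (text, next_addr) for a length-prefixed string at `addr`."""
    offset = addr - base_addr
    length = memory[offset]                       # leading length byte
    text = memory[offset + 1:offset + 1 + length].decode("ascii")
    return text, addr + 1 + length                # skip length byte + chars

BASE = 0x1000
# JSR $1026 occupies $1000-$1002, so the inline data begins at $1003.
memory = bytes([0x20, 0x26, 0x10]) + bytes([9]) + b"How long?"

text, next_addr = parse_l1_string(memory, BASE, 0x1003)
print(text, "%04X" % next_addr)
```

One length byte plus nine characters is ten bytes, which is why the next instruction shows up at $100D.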

+
+
+ +
+
+

The next JSR appears to be followed by a null-terminated string, so + we'll need something that handles that.

+
+
+ +
+
+ t4-inlinenull-src +
+
+

Go back into Project Properties + and add the script called "InlineNullTermString.cs" from the project directory. + This script is slightly different, in that it handles any JSR to a label + that starts with PrintInlineNullString. So let's give it a couple of + those.

+
+
+ +
+
+ t4-inlinenull-done +
+
+

Double-click the operand on line $100D ("L1027"), + click Create Label, + and set the label to "PrintInlineNullStringOne". + Hit OK twice. That formatted the first one and got us + to the next JSR. Repeat the process on line $1019 + ("L1028"), setting the label to "PrintInlineNullStringTwo".
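
The difference between the two scripts comes down to how the JSR target's label is matched. The Python sketch below models that, under the assumption that the first script does an exact comparison and the second a prefix comparison, as described above. The function names and the sample string are made up for illustration; the real scripts are C# and use SourceGen's plugin interfaces.

```python
EXACT_LABEL = "PrintInlineL1String"
PREFIX_LABEL = "PrintInlineNullString"

def classify_jsr_target(label):
    """Decide which inline-data format, if any, follows a JSR to `label`."""
    if label == EXACT_LABEL:                 # case-sensitive, exact match
        return "l1"
    if label.startswith(PREFIX_LABEL):       # any label with this prefix
        return "null-terminated"
    return None                              # ordinary JSR, no inline data

def null_terminated_length(data, offset):
    """Number of bytes in the string at `offset`, including the $00."""
    end = data.index(0, offset)              # raises ValueError if no $00
    return end - offset + 1

for name in ("PrintInlineNullStringOne", "PrintInlineNullStringTwo",
             "PrintInlineL1String", "printinlinel1string", "L1027"):
    print(name, classify_jsr_target(name))

# Hypothetical inline data: eight characters plus the terminator.
sample = b"Test one\x00\x60"
print(null_terminated_length(sample, 0))
```

Prefix matching is what lets one script cover both `PrintInlineNullStringOne` and `PrintInlineNullStringTwo` without listing each entry point.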

+
+
+ +
+
+

The entire project is now nicely formatted. In a real project the + "Print Inline" locations would be actual print functions, not just RTS + instructions. There would likely be multiple JSRs to the print function, + so labeling a single function entry point could format dozens of inline + strings and clean up the disassembly automatically. The reason for + allowing wildcard names is that some functions may have multiple + entry points or chain through different locations.

+ +

Extension scripts can make your life much easier, but they do require + some programming experience. See the SourceGen manual for more details.

+
+
+ +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/generating-code.html b/docs/sgtutorial/generating-code.html index bcb376e..b50b006 100644 --- a/docs/sgtutorial/generating-code.html +++ b/docs/sgtutorial/generating-code.html @@ -1,131 +1,179 @@ - - - - - - - - - - - Generating Code - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Generating Code

- -
-
- t1-asmgen-dlg -
-
-

You can generate assembly source code from the disassembled data. - Select File > Assemble (or hit Ctrl+Shift+A) - to open the source generation and assembly dialog.

-
-
- -
-
- t1-asmgen-preview -
-
-

Pick your favorite assembler from the drop list at the top right, - then click Generate. An assembly source file will be generated in the - directory where your project files lives, named after a combination of the - project name and the assembler name. A preview of the assembled code - appears in the top window. (It's called a "preview" because it has line numbers - added and is cut off after a certain limit.)

-
-
- -
-
- t1-asmgen-asmout -
-
-

If you have a cross-assembler installed and configured, you can run - it by clicking Run Assembler. The output from the assembler will appear - in the lower window, along with an indication of whether the assembled - file matches the original. (Unless there are bugs in SourceGen or the assembler, - it should always match exactly.)

- -

Click Close to close the window.

-
-
- -
-
-

If you want to output source code for display on a web site or - posting in an online forum, you have a couple of options. The easiest - way to do this is to select the lines and use Edit > Copy - to copy them to the system clipboard, and then simply paste it elsewhere. - The format will match what's on the screen, and will not be tailored to - any specific assembler. The set of columns included in the copy can be - configured in the application settings editor, e.g. you can limit it to - just the columns typically found in source code.

-
-
- -
-
- t1-export-dlg -
-
-

To export some or all of the project as text or HTML, use - File > Export (Ctrl+Shift+E). - This is an easy way to share a disassembly listing with people who - don't have access to SourceGen. The feature is primarily useful in - situations where you want to show the disassembly data (Address - and Bytes), or want to embed visualizations (explained later).

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Generating Code - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Generating Code

+ +
+
+ t1-asmgen-dlg +
+
+

You can generate assembly source code from the disassembled data. + Select File > Assemble (or hit Ctrl+Shift+A) + to open the source generation and assembly dialog.

+
+
+ +
+
+ t1-asmgen-preview +
+
+

Pick your favorite assembler from the drop list at the top right, + then click Generate. An assembly source file will be generated in the + directory where your project file lives, named after a combination of the + project name and the assembler name. A preview of the assembled code + appears in the top window. (It's called a "preview" because it has line numbers + added and is cut off after a certain limit.)

+
+
+ +
+
+ t1-asmgen-asmout +
+
+

If you have a cross-assembler installed and configured, you can run + it by clicking Run Assembler. The output from the assembler will appear + in the lower window, along with an indication of whether the assembled + file matches the original. (Unless there are bugs in SourceGen or the assembler, + it should always match exactly.)

+ +

Click Close to close the window.

+
+
+ +
+
+

If you want to output source code for display on a web site or + posting in an online forum, you have a couple of options. The easiest + way to do this is to select the lines and use Edit > Copy + to copy them to the system clipboard, and then simply paste it elsewhere. + The format will match what's on the screen, and will not be tailored to + any specific assembler. The set of columns included in the copy can be + configured in the application settings editor, e.g. you can limit it to + just the columns typically found in source code.

+
+
+ +
+
+ t1-export-dlg +
+
+

To export some or all of the project as text or HTML, use + File > Export (Ctrl+Shift+E). + This is an easy way to share a disassembly listing with people who + don't have access to SourceGen. The feature is primarily useful in + situations where you want to show the disassembly data (Address + and Bytes), or want to embed visualizations (explained later).

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/index.html b/docs/sgtutorial/index.html index cb9993d..bdece7d 100644 --- a/docs/sgtutorial/index.html +++ b/docs/sgtutorial/index.html @@ -1,86 +1,134 @@ - - - - - - - - - - - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

SourceGen Tutorial

- -

This tutorial demonstrates many of the features of SourceGen. It will not -teach you 6502 assembly language, nor reveal any tricks for working with a -specific system.

- -

The tutorial is divided into four broad sections:

- - -

The tutorials work best if you follow along.

- -

The 6502bench software distribution comes with a full manual. Launch -SourceGen and hit F1 to access it. - -

- -
- - - - - + + + + + + + + + + + SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

SourceGen Tutorial

+ +

This tutorial demonstrates many of the features of SourceGen. It will not +teach you 6502 assembly language, nor reveal any tricks for working with a +specific system.

+ +

The tutorial is divided into four broad sections:

+ + +

The tutorials work best if you follow along.

+ +

The 6502bench software distribution comes with a full manual. Launch +SourceGen and hit F1 to access it. + +

+ +
+ + + + + diff --git a/docs/sgtutorial/inline-data.html b/docs/sgtutorial/inline-data.html index 0f1b763..95dd722 100644 --- a/docs/sgtutorial/inline-data.html +++ b/docs/sgtutorial/inline-data.html @@ -1,136 +1,184 @@ - - - - - - - - - - - Inline Data - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Inline Data

- -
-
- t2-206b -
-
-

Consider the code at address $206B. It's a JSR followed by some - ASCII text, then a $00 byte, and then what might be code.

-
-
- -
-
- t2-20ab-1 -
- ... -
- t2-20ab-2 -
-
-

Double-click on the JSR opcode - to jump to $20AB to see the function. It pulls the - call address off the stack, and uses it as a pointer. When it encounters - a zero byte, it breaks out of the loop, pushes the adjusted pointer - value back onto the stack, and returns.

-
-
- -
-
-

This is an example of "inline data", where a function uses the return - address to get a pointer to data. The return address is adjusted to - point past the inline data before returning (technically, it points at - the very last byte of the inline data, because - RTS jumps to address + 1).

-
-
- -
-
- t2-inline-tag -
-
-

To format the data, we first need to tell SourceGen that there's data - in line with the code. Select the line at address $206E, then - shift-click the line at address $2077. Use - Actions > Tag Bytes As Inline Data - (Ctrl+HCtrl+I).

-
-
- -
-
- t2-inline-after -
-
-

The data turns to single-byte values, and we now see the code - continuing at address $2078. We can format the data as a string by - using Actions > Edit Operand, - setting the Character Encoding to Low or High ASCII, - and selecting null-terminated strings.

-
-
- -
-
-

That's pretty straightforward, but this could quickly become tedious if - there were a lot of these. SourceGen allows you to define scripts to - automate common formatting tasks. This is covered in the "Extension - Scripts" tutorial.

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Inline Data - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Inline Data

+ +
+
+ t2-206b +
+
+

Consider the code at address $206B. It's a JSR followed by some + ASCII text, then a $00 byte, and then what might be code.

+
+
+ +
+
+ t2-20ab-1 +
+ ... +
+ t2-20ab-2 +
+
+

Double-click on the JSR opcode + to jump to $20AB to see the function. It pulls the + call address off the stack, and uses it as a pointer. When it encounters + a zero byte, it breaks out of the loop, pushes the adjusted pointer + value back onto the stack, and returns.

+
+
+ +
+
+

This is an example of "inline data", where a function uses the return + address to get a pointer to data. The return address is adjusted to + point past the inline data before returning (technically, it points at + the very last byte of the inline data, because + RTS jumps to address + 1).
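
The return-address bookkeeping is easier to follow with numbers. The Python sketch below models the behavior described above: JSR pushes the address of its own last byte, the routine scans forward to the $00, and RTS resumes at the adjusted address plus one. The `inline_return` helper and the placeholder text are assumptions made for this example; the real routine at $20AB is 6502 code and is only summarized here.

```python
def inline_return(jsr_addr, memory, base_addr):
    """Model a JSR to a print-inline routine.

    Returns (text, adjusted_return_addr, resume_addr).
    """
    ret = jsr_addr + 2          # JSR pushes the address of its last byte
    ptr = ret + 1               # first byte of the inline data
    chars = []
    while memory[ptr - base_addr] != 0:
        chars.append(chr(memory[ptr - base_addr]))
        ptr += 1
    # ptr now sits on the $00 terminator, the last byte of the inline data.
    adjusted_ret = ptr
    return "".join(chars), adjusted_ret, adjusted_ret + 1   # RTS adds 1

BASE = 0x206B
# JSR $20AB, nine placeholder characters, the $00 terminator, then an RTS.
memory = bytes([0x20, 0xAB, 0x20]) + b"123456789" + bytes([0x00, 0x60])

text, adjusted, resume = inline_return(0x206B, memory, BASE)
print(text, "%04X" % adjusted, "%04X" % resume)
```

With the JSR at $206B, the data occupies $206E through $2077, the adjusted return address is $2077, and execution resumes at $2078.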

+
+
+ +
+
+ t2-inline-tag +
+
+

To format the data, we first need to tell SourceGen that there's data + in line with the code. Select the line at address $206E, then + shift-click the line at address $2077. Use + Actions > Tag Bytes As Inline Data + (Ctrl+H, Ctrl+I).

+
+
+ +
+
+ t2-inline-after +
+
+

The data turns to single-byte values, and we now see the code + continuing at address $2078. We can format the data as a string by + using Actions > Edit Operand, + setting the Character Encoding to Low or High ASCII, + and selecting null-terminated strings.

+
+
+ +
+
+

That's pretty straightforward, but this could quickly become tedious if + there were a lot of these. SourceGen allows you to define scripts to + automate common formatting tasks. This is covered in the "Extension + Scripts" tutorial.

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/labels-symbols.html b/docs/sgtutorial/labels-symbols.html index 150ba78..80eecae 100644 --- a/docs/sgtutorial/labels-symbols.html +++ b/docs/sgtutorial/labels-symbols.html @@ -1,352 +1,400 @@ - - - - - - - - - - - Labels & Symbols - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Labels & Symbols

- -
-
-

Suppose you want to call some code at address $1000. CPUs - fundamentally deal with numeric values, so the machine code to - call it would be JSR $1000. Humans tend to work better with - words, so associating a meaningful symbol with address $1000 - can greatly improve the readability of the code: something like - JSR DrawSprite is far more helpful for human readers. - Further, once the code has been disassembled to source code, using symbols - instead of fixed addresses makes it easier to alter the program or re-use - the code.

- -

When the target address of instructions like JSR and - LDA falls within the scope of the data file, SourceGen classifies - the reference as internal, and automatically adds a generic - symbolic label (e.g. L1000). This can be edited if desired.

-
-
- -
-
- t1-edit-label -
-
-

On the line at address $2000, select - Actions > Edit Label, or double-click on the label - "L2000". Change the label to "MAIN", and hit - Enter. The label changes on that line, - and on the two lines that refer to address $2000. - (If you're not sure which lines refer to address $2000, - select line $2000 and check the list in the References window.)

-
-
- -
-
-

Sometimes the target address falls outside the data file. Examples - include calls to ROM routines, use of zero-page storage, and access to - memory-mapped I/O locations. SourceGen classifies these as external, - and does not generate a symbol. In an assembler source file, symbols - for these would be expressed as equates (e.g. FOO = $8000), - usually at the top of the file or in an "include file". SourceGen - allows you to specify symbols for addresses and numeric constants within - the project ("project symbols"), or in a symbol file that can be - included in multiple projects ("platform symbols"). The SourceGen - distribution includes platform symbol files with ROM addresses for - several common systems.

-
-
- -
-
- t1-pre-sym-2000 -
-
-

For an example, consider the code at address $2000, which is - LDA $3000. We want to assign the symbol "INPUT" to address - $3000, but we can't do that by editing a label because it's not inside - the file bounds. We can open the project symbol editor from the project - properties editor, or we can use a shortcut.

-
-
- -
-
- t1-edit-sym-2000 -
-
-

With the line at $2000 selected, use Actions > Edit Operand, - or double-click on the value in the Operand column - ("$3000"). This opens the - Edit Instruction Operand dialog. In the bottom left, click - Create Project Symbol. Set the Label field to - "INPUT", and - click OK, then OK in the operand editor.

-
-
- -
-
- t1-edit-2000-done -
-
-

The instruction at $2000 now uses the symbol "INPUT" - as its operand. If you scroll to the top of the file, you will see a - ".EQ" line for the symbol.

-
-
- -
- -

Numeric v. Symbolic

- -
-
-

When SourceGen sees a reference to an address, such as the operand of an - absolute JSR or LDA, it recognizes it - as a numeric reference. You can edit the instruction's operand - to use a symbol instead, changing to a symbolic reference. - Sometimes the way these are handled can be confusing.

-
-
- -
-
- t1-sym-2005-before -
-
-

Let's use the branch statement at $2005 to illustrate the difference. - It performs a branch to $2009, which was automatically assigned the - label "L2009".

-
-
- -
-
- t1-sym-2005-labeled -
-
-

Edit the label at $2009 (double-click on "L2009" there), - and change it to "IN_RANGE". Line $2005 changes to match. - This works because SourceGen - is auto-formatting line $2005's operand based on the label it finds when it - chases the numeric reference to $2009. - The Info window shows this as Format (auto): symbol "IN_RANGE".

-

Use Edit > Undo to revert the label change.

-
-
- -
-
- t1-sym-2005-edit -
-
-

Edit the instruction operand at $2005 (double-click on - "L2009" there). Change the format to Symbol, - and type "IN_RANGE" in the symbol box. - The preview shows BCC IN_RANGE (?), which hints at a - problem. Click OK.

-
-
- -
-
- t1-sym-2005-nosym -
-
-

Some things changed, but not the things we wanted. Line $2005 now - says BCC $2009, instead of BCC L2009, and the - label at $2009 has disappeared entirely. What went wrong?

-
-
- -
-
-

The problem is that we edited the operand to use a symbol that isn't - defined anywhere. Because "IN_RANGE" isn't defined, the operand was - given the default format, and displayed as a hex value. - The numeric reference to $2009 was replaced by the symbol, and nothing - else refers to that address, - so SourceGen no longer had any reason to put an auto-generated label - on line $2009, which is why that disappeared.

-
-
- -
-
- t1-sym-2005-msg-window -
-
-

The missing symbol is called out in a message window that popped up - at the bottom of the code list window. The message window only appears - when there are messages to read. You can hide the window with the - Hide button, and make it re-appear with the button in the - bottom right of the main window that currently says 1 message.

-
-
- -
-
- t1-sym-2005-explicit -
-
-

We can resolve this issue by providing the desired symbol. As you - did earlier, edit the label on line $2009 (double-click in the label column) - and set it to "IN_RANGE". When you do, the operand on line $2005 - is updated appropriately. - If you select line $2005, the Info window shows the format as - Format: symbol "IN_RANGE", indicating that the symbol - was set explicitly rather than automatically.

-
-
- -
-
- t1-sym-2005-adjust -
-
-

Symbolic references always link to the symbol, even when the symbol - doesn't match the numeric reference. To see this, remove the label from - line $2009 by undoing that change with Edit > Undo, - so the symbol is again undefined. Now set the label on the following line, - $200A, to "IN_RANGE".

-
-
- -
-
-

Line $2005 now says "BCC IN_RANGE-1". Earlier you set - the operand to be a symbolic reference to "IN_RANGE", but the symbol - doesn't quite match, so SourceGen automatically adjusted the operand by - one byte to point to the correct address. Generally speaking, SourceGen - will do its best to use the symbols that you tell it to, and will adjust the - symbolic references so that the code assembles correctly.

-
-
- -
-
-

Edit the label on line $200A, and change it to "NIFTY". - Note how the reference on line $2005 also changed. This is an example - of a "refactoring rename": when you changed the label, SourceGen - automatically found everything that referred to it and updated it. - If you edit the operand on line $2005, you can confirm that the - symbol has changed.

- -

(If you want to clean this up before continuing on to the next - section, put the label back on line $2009.)

-
-
- -
- -

Non-Unique Label

- -
-
-

Most assemblers have a notion of "local" labels, which go out of - scope when a non-local (global) label is encountered. The actual - definition of "local" is assembler-specific, but SourceGen allows you - to create labels that serve the same purpose.

-
-
- -
-
- t1-local-loop-edit -
-
-

By default, newly-created labels have global scope and must be - unique. You can change these attributes when you edit the label. Up near the - top of the file, at address $1002, double-click on the label ("L1002"). - Change the label to "LOOP" and click the "non-unique local" - radio button. - Click OK.

-
-
- -
-
- t1-local-loop1 -
-
-

The label at line $1002 (and the operand on line $100B) should now - be "@LOOP". By default, '@' is used to indicate non-unique labels, - though you can change it to a different character in the application - settings.

-
-
- -
-
- t1-local-loop2 -
-
-

At address $2019, double-click to edit the label ("L2019"). If - you type "MAIN" or "IS_OK" with Global selected you'll - get an error, but if you type "@LOOP" it will be accepted. Note - the "non-unique local" radio - button is selected automatically if you start a label with '@' (or - whatever character you have configured). Click OK.

-

You now have two lines with the same label. In some cases the - assembly source generator need to may "promote" them to globals, or - rename them to make them unique, depending on what your preferred assembler - allows.

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Labels & Symbols - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Labels & Symbols

+ +
+
+

Suppose you want to call some code at address $1000. CPUs + fundamentally deal with numeric values, so the machine code to + call it would be JSR $1000. Humans tend to work better with + words, so associating a meaningful symbol with address $1000 + can greatly improve the readability of the code: something like + JSR DrawSprite is far more helpful for human readers. + Further, once the code has been disassembled to source code, using symbols + instead of fixed addresses makes it easier to alter the program or re-use + the code.

+ +

When the target address of instructions like JSR and + LDA falls within the scope of the data file, SourceGen classifies + the reference as internal, and automatically adds a generic + symbolic label (e.g. L1000). This can be edited if desired.

+
+
+ +
+
+ t1-edit-label +
+
+

On the line at address $2000, select + Actions > Edit Label, or double-click on the label + "L2000". Change the label to "MAIN", and hit + Enter. The label changes on that line, + and on the two lines that refer to address $2000. + (If you're not sure which lines refer to address $2000, + select line $2000 and check the list in the References window.)

+
+
+ +
+
+

Sometimes the target address falls outside the data file. Examples + include calls to ROM routines, use of zero-page storage, and access to + memory-mapped I/O locations. SourceGen classifies these as external, + and does not generate a symbol. In an assembler source file, symbols + for these would be expressed as equates (e.g. FOO = $8000), + usually at the top of the file or in an "include file". SourceGen + allows you to specify symbols for addresses and numeric constants within + the project ("project symbols"), or in a symbol file that can be + included in multiple projects ("platform symbols"). The SourceGen + distribution includes platform symbol files with ROM addresses for + several common systems.

+
+
+ +
+
+ t1-pre-sym-2000 +
+
+

For an example, consider the code at address $2000, which is + LDA $3000. We want to assign the symbol "INPUT" to address + $3000, but we can't do that by editing a label because it's not inside + the file bounds. We can open the project symbol editor from the project + properties editor, or we can use a shortcut.

+
+
+ +
+
+ t1-edit-sym-2000 +
+
+

With the line at $2000 selected, use Actions > Edit Operand, + or double-click on the value in the Operand column + ("$3000"). This opens the + Edit Instruction Operand dialog. In the bottom left, click + Create Project Symbol. Set the Label field to + "INPUT", and + click OK, then OK in the operand editor.

+
+
+ +
+
+ t1-edit-2000-done +
+
+

The instruction at $2000 now uses the symbol "INPUT" + as its operand. If you scroll to the top of the file, you will see a + ".EQ" line for the symbol.

+
+
+ +
+ +

Numeric v. Symbolic

+ +
+
+

When SourceGen sees a reference to an address, such as the operand of an + absolute JSR or LDA, it recognizes it + as a numeric reference. You can edit the instruction's operand + to use a symbol instead, changing to a symbolic reference. + Sometimes the way these are handled can be confusing.

+
+
+ +
+
+ t1-sym-2005-before +
+
+

Let's use the branch statement at $2005 to illustrate the difference. + It performs a branch to $2009, which was automatically assigned the + label "L2009".

+
+
+ +
+
+ t1-sym-2005-labeled +
+
+

Edit the label at $2009 (double-click on "L2009" there), + and change it to "IN_RANGE". Line $2005 changes to match. + This works because SourceGen + is auto-formatting line $2005's operand based on the label it finds when it + chases the numeric reference to $2009. + The Info window shows this as Format (auto): symbol "IN_RANGE".

+

Use Edit > Undo to revert the label change.

+
+
+ +
+
+ t1-sym-2005-edit +
+
+

Edit the instruction operand at $2005 (double-click on + "L2009" there). Change the format to Symbol, + and type "IN_RANGE" in the symbol box. + The preview shows BCC IN_RANGE (?), which hints at a + problem. Click OK.

+
+
+ +
+
+ t1-sym-2005-nosym +
+
+

Some things changed, but not the things we wanted. Line $2005 now + says BCC $2009, instead of BCC L2009, and the + label at $2009 has disappeared entirely. What went wrong?

+
+
+ +
+
+

The problem is that we edited the operand to use a symbol that isn't + defined anywhere. Because "IN_RANGE" isn't defined, the operand was + given the default format, and displayed as a hex value. + The numeric reference to $2009 was replaced by the symbol, and nothing + else refers to that address, + so SourceGen no longer had any reason to put an auto-generated label + on line $2009, which is why that disappeared.

+
+
+ +
+
+ t1-sym-2005-msg-window +
+
+

The missing symbol is called out in a message window that popped up + at the bottom of the code list window. The message window only appears + when there are messages to read. You can hide the window with the + Hide button, and make it re-appear with the button in the + bottom right of the main window that currently says 1 message.

+
+
+ +
+
+ t1-sym-2005-explicit +
+
+

We can resolve this issue by providing the desired symbol. As you + did earlier, edit the label on line $2009 (double-click in the label column) + and set it to "IN_RANGE". When you do, the operand on line $2005 + is updated appropriately. + If you select line $2005, the Info window shows the format as + Format: symbol "IN_RANGE", indicating that the symbol + was set explicitly rather than automatically.

+
+
+ +
+
+ t1-sym-2005-adjust +
+
+

Symbolic references always link to the symbol, even when the symbol + doesn't match the numeric reference. To see this, remove the label from + line $2009 by undoing that change with Edit > Undo, + so the symbol is again undefined. Now set the label on the following line, + $200A, to "IN_RANGE".

+
+
+ +
+
+

Line $2005 now says "BCC IN_RANGE-1". Earlier you set + the operand to be a symbolic reference to "IN_RANGE", but the symbol + doesn't quite match, so SourceGen automatically adjusted the operand by + one byte to point to the correct address. Generally speaking, SourceGen + will do its best to use the symbols that you tell it to, and will adjust the + symbolic references so that the code assembles correctly.
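
The adjustment described above is just the difference between the address the instruction targets and the address the symbol labels. A minimal sketch of that arithmetic follows; the function name and output format are illustrative assumptions, not SourceGen's actual implementation.

```python
def format_symbolic_operand(symbol: str, symbol_addr: int, target_addr: int) -> str:
    """Render an operand as the symbol plus whatever offset reaches the target.

    Hypothetical helper: it only models the arithmetic from the tutorial text.
    """
    adjust = target_addr - symbol_addr
    if adjust == 0:
        return symbol
    sign = "+" if adjust > 0 else "-"
    return f"{symbol}{sign}{abs(adjust)}"


# The branch at $2005 targets $2009, but IN_RANGE now labels $200A.
print("BCC " + format_symbolic_operand("IN_RANGE", 0x200A, 0x2009))

# With the label back on $2009, no adjustment is needed.
print("BCC " + format_symbolic_operand("IN_RANGE", 0x2009, 0x2009))
```

The same subtraction explains why the code still assembles correctly: the assembler evaluates IN_RANGE-1 back to $2009.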

+
+
+ +
+
+

Edit the label on line $200A, and change it to "NIFTY". + Note how the reference on line $2005 also changed. This is an example + of a "refactoring rename": when you changed the label, SourceGen + automatically found everything that referred to it and updated it. + If you edit the operand on line $2005, you can confirm that the + symbol has changed.

+ +

(If you want to clean this up before continuing on to the next + section, put the label back on line $2009.)

+
+
+ +
+ +

Non-Unique Label

+ +
+
+

Most assemblers have a notion of "local" labels, which go out of + scope when a non-local (global) label is encountered. The actual + definition of "local" is assembler-specific, but SourceGen allows you + to create labels that serve the same purpose.

+
+
+ +
+
+ t1-local-loop-edit +
+
+

By default, newly-created labels have global scope and must be + unique. You can change these attributes when you edit the label. Up near the + top of the file, at address $1002, double-click on the label ("L1002"). + Change the label to "LOOP" and click the "non-unique local" + radio button. + Click OK.

+
+
+ +
+
+ t1-local-loop1 +
+
+

The label at line $1002 (and the operand on line $100B) should now + be "@LOOP". By default, '@' is used to indicate non-unique labels, + though you can change it to a different character in the application + settings.

+
+
+ +
+
+ t1-local-loop2 +
+
+

At address $2019, double-click to edit the label ("L2019"). If + you type "MAIN" or "IS_OK" with Global selected you'll + get an error, but if you type "@LOOP" it will be accepted. Note + the "non-unique local" radio + button is selected automatically if you start a label with '@' (or + whatever character you have configured). Click OK.

+

You now have two lines with the same label. In some cases the + assembly source generator may need to "promote" them to globals, or + rename them to make them unique, depending on what your preferred assembler + allows.

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/local-variables.html b/docs/sgtutorial/local-variables.html index c20a11e..2c97bff 100644 --- a/docs/sgtutorial/local-variables.html +++ b/docs/sgtutorial/local-variables.html @@ -1,270 +1,318 @@ - - - - - - - - - - - Local Variables - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Local Variables

- -
-
- t2-203d-start -
-
-

Let's move on to the code at $203D. It starts by storing a couple of - values into direct page address $02/03. This appears to be setting up a - pointer to $2063, which is a data area inside the file. So let's make it - official.

-
-
- -
-
- t2-203d-edit1 -
-
-

Select the line at address $2063, and use - Actions > Edit Label to - give it the label "XDATA?". The question mark on the end is there to - remind us that we're not entirely sure what this is. Now edit the - operand on line $203D, and set it to the symbol "XDATA", - with the part "low". The question mark isn't really part of the label, - so you don't need to type it here.

-
-
- -
-
- t2-203d-after-edit2 -
-
-

Edit the operand on line $2041, - and set it to "XDATA" with the part "high". (The symbol text box - gets focus immediately, so you can start typing the symbol name as soon - as the dialog opens; you don't need to click around first.) If all - went well, the operands should now read LDA #<XDATA? - and LDA #>XDATA?.

-
-
- -
-
- t2-create-ptr-entry -
-
-

Let's give the pointer a name. Select line $203D, and use - Actions > Create Local Variable Table - to create an empty table. Click New Symbol on the right side. - Leave the Address button selected. Set the Label field to "PTR1", - the Value field to "$02", and the width to "2" (it's a 2-byte pointer). - Click "OK" to create the entry, and then - "OK" to update the table.

-
-
- -
-
- t2-after-ptr-entry -
-
-

There's now a .VAR statement - (similar to a .EQU) above line $203D, - and the stores to $02/$03 have changed to - "PTR1" and "PTR1+1".

-
-
- -
-
- t2-20a7 -
-
-

Double-click on the JSR opcode on line $2045 to jump to - L20A7. - The code here just loads a value from $3000 into the accumulator - and returns, so not much to see here. Hit the back-arrow in the - toolbar to jump back to the JSR.

-
-
- -
-
- t2-2048 -
-
-

The next bit of code masks the accumulator so it holds a value between - 0 and 3, then doubles it and uses it as an index into PTR1. - We know PTR1 points to XDATA, - which looks like it has some 16-bit addresses. The - values loaded are stored in two more zero-page locations, $04-05. - Let's make these a pointer as well.

-
-
- -
-
- t2-204e-lv -
-
-

Double-click the operand on line $204E ("$04"), - and click Create Local Variable. Set the Label - to "PTR2" and the width to 2. Click OK - to create the symbol, then OK - to close the operand editor, which should still be set to Default format -- - we didn't actually edit the operand, we just used the operand edit - dialog as a convenient way to create a local variable table entry. All - accesses to $04/$05 now use PTR2, and there's a new entry in the local - variable table we created earlier.

-
-
- -
-
- t2-2055 -
-
-

The next section of code, at $2055, copies bytes from PTR2 - to $0400, stopping when it hits a zero byte. - It looks like this is copying null-terminated strings.

-
-
- -
-
- t2-fmt-xdata -
-
-

This confirms our idea that XDATA holds 16-bit addresses, - so let's format it. Select lines $2063 to $2066, and use - Actions > Edit Operand. - The editor window should say "8 bytes selected" at the top. - Click the 16-bit words, little-endian radio button, - and then in the Display As box, click Address. - Click OK.

-
-
- -
-
- t2-fmt-xdata-done -
-
-

The values at XDATA should now be four - .DD2 16-bit addresses. - If you scroll up, you'll see that the .ZSTR strings - near the top now have labels that match the operands in XDATA.

-
-
- -
-
-

Now that we know what XDATA holds, let's rename it. - Change the label to STRADDR. The symbol parts in the - operands at $203D and $2041 update automatically.

-
-
- -
- -
-
- t2-enable-counts -
-
-

Let's take a quick look at the cycle-count feature. Use - Edit > Settings to open the app settings panel. - In the Miscellaneous group on the right side, click the - Show cycle counts for instructions checkbox, then click - OK. (There's also a toolbar button for this.)

-
-
- -
-
- t2-show-cycle-counts -
-
-

Every line with an instruction now has a cycle count on it. The cycle - counts are adjusted for everything SourceGen can figure out. For example, - the BEQ on line $205A shows "2+" cycles, meaning that - it takes at least two cycles but might take more. That's because - conditional branches take an - extra cycle if the branch is taken. The BNE on line - $2061 shows 3 cycles, because we know that the branch is always - taken and doesn't cross a page boundary.

-
-
- -
-
-

(If you want to see why it's always taken, - look at the value of the 'Z' flag in the "flags" column, which indicates - the state of the flags before the instruction on that line is executed. - Lower-case 'z' means the zero-flag is clear (0), upper-case 'Z' means it's - set (1). The analyzer determined that the flag was clear for instructions - following the BEQ because we're on the branch-not-taken path. - The following instruction, ORA #$80, cleared the 'Z' flag and - set the 'N' flag, so a BMI would also be an always-taken branch.)

- -

The cycle-count comments can be added to generated source code as well.

-

If you add an end-of-line comment, it appears after the cycle count. - (Try it.)

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Local Variables - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Local Variables

+ +
+
+ t2-203d-start +
+
+

Let's move on to the code at $203D. It starts by storing a couple of + values into direct page address $02/03. This appears to be setting up a + pointer to $2063, which is a data area inside the file. So let's make it + official.

+
+
+ +
+
+ t2-203d-edit1 +
+
+

Select the line at address $2063, and use + Actions > Edit Label to + give it the label "XDATA?". The question mark on the end is there to + remind us that we're not entirely sure what this is. Now edit the + operand on line $203D, and set it to the symbol "XDATA", + with the part "low". The question mark isn't really part of the label, + so you don't need to type it here.

+
+
+ +
+
+ t2-203d-after-edit2 +
+
+

Edit the operand on line $2041, + and set it to "XDATA" with the part "high". (The symbol text box + gets focus immediately, so you can start typing the symbol name as soon + as the dialog opens; you don't need to click around first.) If all + went well, the operands should now read LDA #<XDATA? + and LDA #>XDATA?.
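
The "low" and "high" parts are the two bytes of the 16-bit address. As a rough illustration, assuming XDATA sits at $2063 as described earlier, the split can be computed like this (the helper names are made up for the example):

```python
def low_byte(addr: int) -> int:
    """Low 8 bits of a 16-bit address, the '<' part of a symbol."""
    return addr & 0xFF


def high_byte(addr: int) -> int:
    """High 8 bits of a 16-bit address, the '>' part of a symbol."""
    return (addr >> 8) & 0xFF


XDATA = 0x2063  # address of the data area, taken from the tutorial text

print(f"LDA #<XDATA? loads ${low_byte(XDATA):02X}")
print(f"LDA #>XDATA? loads ${high_byte(XDATA):02X}")
```

Storing those two values at $02 and $03 gives a little-endian pointer to $2063.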

+
+
+ +
+
+ t2-create-ptr-entry +
+
+

Let's give the pointer a name. Select line $203D, and use + Actions > Create Local Variable Table + to create an empty table. Click New Symbol on the right side. + Leave the Address button selected. Set the Label field to "PTR1", + the Value field to "$02", and the width to "2" (it's a 2-byte pointer). + Click "OK" to create the entry, and then + "OK" to update the table.

+
+
+ +
+
+ t2-after-ptr-entry +
+
+

There's now a .VAR statement + (similar to a .EQU) above line $203D, + and the stores to $02/$03 have changed to + "PTR1" and "PTR1+1".

+
+
+ +
+
+ t2-20a7 +
+
+

Double-click on the JSR opcode on line $2045 to jump to + L20A7. + The code here just loads a value from $3000 into the accumulator + and returns, so not much to see here. Hit the back-arrow in the + toolbar to jump back to the JSR.

+
+
+ +
+
+ t2-2048 +
+
+

The next bit of code masks the accumulator so it holds a value between + 0 and 3, then doubles it and uses it as an index into PTR1. + We know PTR1 points to XDATA, + which looks like it has some 16-bit addresses. The + values loaded are stored in two more zero-page locations, $04-05. + Let's make these a pointer as well.

+
+
+ +
+
+ t2-204e-lv +
+
+

Double-click the operand on line $204E ("$04"), + and click Create Local Variable. Set the Label + to "PTR2" and the width to 2. Click OK + to create the symbol, then OK + to close the operand editor, which should still be set to Default format -- + we didn't actually edit the operand, we just used the operand edit + dialog as a convenient way to create a local variable table entry. All + accesses to $04/$05 now use PTR2, and there's a new entry in the local + variable table we created earlier.

+
+
+ +
+
+ t2-2055 +
+
+

The next section of code, at $2055, copies bytes from PTR2 + to $0400, stopping when it hits a zero byte. + It looks like this is copying null-terminated strings.

+
+
+ +
+
+ t2-fmt-xdata +
+
+

This confirms our idea that XDATA holds 16-bit addresses, + so let's format it. Select lines $2063 to $2066, and use + Actions > Edit Operand. + The editor window should say "8 bytes selected" at the top. + Click the 16-bit words, little-endian radio button, + and then in the Display As box, click Address. + Click OK.

+
+
+ +
+
+ t2-fmt-xdata-done +
+
+

The values at XDATA should now be four + .DD2 16-bit addresses. + If you scroll up, you'll see that the .ZSTR strings + near the top now have labels that match the operands in XDATA.
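
To see what "16-bit words, little-endian" means for the eight selected bytes, here is a small sketch that groups them into four addresses. The byte values are hypothetical placeholders, not the actual contents of the tutorial file.

```python
import struct

# Hypothetical stand-in for the 8 bytes at XDATA; the real values differ.
raw = bytes([0x10, 0x20, 0x18, 0x20, 0x22, 0x20, 0x2C, 0x20])

# "<4H" = four unsigned 16-bit words, least-significant byte first.
words = struct.unpack("<4H", raw)

for word in words:
    print(f".DD2 ${word:04X}")
```

Each pair of bytes becomes one word, with the first byte supplying the low half of the address.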

+
+
+ +
+
+

Now that we know what XDATA holds, let's rename it. + Change the label to STRADDR. The symbol parts in the + operands at $203D and $2041 update automatically.

+
+
+ +
+ +
+
+ t2-enable-counts +
+
+

Let's take a quick look at the cycle-count feature. Use + Edit > Settings to open the app settings panel. + In the Miscellaneous group on the right side, click the + Show cycle counts for instructions checkbox, then click + OK. (There's also a toolbar button for this.)

+
+
+ +
+
+ t2-show-cycle-counts +
+
+

Every line with an instruction now has a cycle count on it. The cycle + counts are adjusted for everything SourceGen can figure out. For example, + the BEQ on line $205A shows "2+" cycles, meaning that + it takes at least two cycles but might take more. That's because + conditional branches take an + extra cycle if the branch is taken. The BNE on line + $2061 shows 3 cycles, because we know that the branch is always + taken and doesn't cross a page boundary.

+
+
+ +
+
+

(If you want to see why it's always taken, + look at the value of the 'Z' flag in the "flags" column, which indicates + the state of the flags before the instruction on that line is executed. + Lower-case 'z' means the zero-flag is clear (0), upper-case 'Z' means it's + set (1). The analyzer determined that the flag was clear for instructions + following the BEQ because we're on the branch-not-taken path. + The following instruction, ORA #$80, cleared the 'Z' flag and + set the 'N' flag, so a BMI would also be an always-taken branch.)

+ +

The cycle-count comments can be added to generated source code as well.

+

If you add an end-of-line comment, it appears after the cycle count. + (Try it.)

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/moving-around.html b/docs/sgtutorial/moving-around.html index 4611ab8..c460807 100644 --- a/docs/sgtutorial/moving-around.html +++ b/docs/sgtutorial/moving-around.html @@ -1,201 +1,249 @@ - - - - - - - - - - - Moving Around - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Moving Around

- -
-
- t1-fullscreen -
-
-

The display is divided into five main areas:

-
    -
  1. Code listing. Disassembled code and data is shown here.
  2. -
  3. References. Shows a list of the places in the file that - reference the currently selected line.
  4. -
  5. Notes. A list of notes in the project. Useful as bookmarks.
  6. -
  7. Symbols. All known symbols. The buttons at the top allow you - to filter out symbol types that you're not interested in.
  8. -
  9. Info. Information about the selected line. For code, this - will have a summary of the instruction.
  10. -
-
-
- -
-
-

The center code list window is divided into rows, one per line of disassembled - code or data. This is a standard "list view" control, so you can select a - row by left-clicking anywhere in it. Use Ctrl+click - to toggle the selection on individual lines, and - Shift+click to select a range of lines. You can move - the selection around with the up/down arrow keys and - PgUp/PgDn. Scroll the window - with the mouse wheel, or by dragging the scroll bar thumb.

-

Each row is divided into nine columns. You can adjust the column - widths by clicking and dragging the column dividers in the header.

-
-
- -
-
- t1-all-columns -
-
-

The columns on the right side of the window are similar to - what you'd find in assembly source code: label, opcode, operand, - comment. The columns on the left are what you'd find in a disassembly - (file offset, address, raw bytes), plus some information about - processor status flags and line attributes that may or may not be - useful to you. If you find any of these distracting, collapse the - column. (Many of the screen shots captured here will omit the - "Attr" column for the sake of compactness.)

-
-
- -
-
- t1-click-1002 -
-
-

Click on the fourth line down, which has address 1002. The - line has a label, "L1002", and is performing an indexed load from - L1017. Both of these labels were automatically generated, and are - named for the address at which they appear. When you clicked on - the line, a few things happened:

-
    -
  • The line was highlighted in the system selection color (usually - blue).
  • -
  • Address 1017 and label L1017 were highlighted. When you select - a line with an operand that targets an in-file address, the target - address is highlighted.
  • -
  • An entry appeared in the References window. This tells you that the - only reference to L1002 is a branch from address $100B.
  • -
  • The Info window filled with a bunch of text that describes the - line format and some details about the LDA instruction.
  • -
-

Click some other lines, such as address $100B and $1014. Note how the - highlights and contents of other windows change.

-
-
- -
-
- t1-click-1017 -
-
-

Click on L1002 again, then double-click on the opcode ("LDA"). The - selection jumps to L1017. When an operand references an in-file address, - double-clicking on the opcode will take you to it. (Double-clicking on - the operand itself opens a format editor; more on that later.)

-

With line L1017 selected, double-click on the line that appears in the - References window. Note the selection jumps to L1002. You can immediately - jump to any reference.

-
-
- -
-
- t1-symbol-filters -
-
-

At the top of the Symbols window on the right side of the screen is a - row of buttons. Make sure "Auto" and "Addr" are selected. You should see - three labels in the window (L1002, L1014, L1017). Double-click on "L1014" - in the Symbols list. The selection jumps to the appropriate line.

-
-
- -
-
-

Select Navigate > Find. Type "hello", and hit Enter. The selection will - move to address $100E, which is a string that says "hello!". You can use - Navigate > Find Next to try to find the next occurrence (there isn't one). You - can search for any text that appears in the rightmost columns (label, opcode, - operand, comment).

-
-
- -
-
- t1-goto-box -
-
-

Select Navigate > Go To. You can enter a label, address, or file offset. - Enter "100b" to set the selection to the line at address $100B.

-
-
- -
-
- t1-toolbar -
-
-

Near the top-left of the SourceGen window is a set of toolbar icons. - Click the curly left-pointing arrow, and watch the selection move. Click - it again. Then click the curly right-arrow a couple of times. Whenever - you jump around in the file by using the Go To feature, or by double-clicking - on opcodes or lines in the side windows, the locations are added to a - navigation history. The arrows let you move forward and backward - through it.

-
-
- -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Moving Around - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Moving Around

+ +
+
+ t1-fullscreen +
+
+

The display is divided into five main areas:

+
    +
  1. Code listing. Disassembled code and data is shown here.
  2. +
  3. References. Shows a list of the places in the file that + reference the currently selected line.
  4. +
  5. Notes. A list of notes in the project. Useful as bookmarks.
  6. +
  7. Symbols. All known symbols. The buttons at the top allow you + to filter out symbol types that you're not interested in.
  8. +
  9. Info. Information about the selected line. For code, this + will have a summary of the instruction.
  10. +
+
+
+ +
+
+

The center code list window is divided into rows, one per line of disassembled + code or data. This is a standard "list view" control, so you can select a + row by left-clicking anywhere in it. Use Ctrl+click + to toggle the selection on individual lines, and + Shift+click to select a range of lines. You can move + the selection around with the up/down arrow keys and + PgUp/PgDn. Scroll the window + with the mouse wheel, or by dragging the scroll bar thumb.

+

Each row is divided into nine columns. You can adjust the column + widths by clicking and dragging the column dividers in the header.

+
+
+ +
+
+ t1-all-columns +
+
+

The columns on the right side of the window are similar to + what you'd find in assembly source code: label, opcode, operand, + comment. The columns on the left are what you'd find in a disassembly + (file offset, address, raw bytes), plus some information about + processor status flags and line attributes that may or may not be + useful to you. If you find any of these distracting, collapse the + column. (Many of the screen shots captured here will omit the + "Attr" column for the sake of compactness.)

+
+
+ +
+
+ t1-click-1002 +
+
+

Click on the fourth line down, which has address 1002. The + line has a label, "L1002", and is performing an indexed load from + L1017. Both of these labels were automatically generated, and are + named for the address at which they appear. When you clicked on + the line, a few things happened:

+
    +
  • The line was highlighted in the system selection color (usually + blue).
  • +
  • Address 1017 and label L1017 were highlighted. When you select + a line with an operand that targets an in-file address, the target + address is highlighted.
  • +
  • An entry appeared in the References window. This tells you that the + only reference to L1002 is a branch from address $100B.
  • +
  • The Info window filled with a bunch of text that describes the + line format and some details about the LDA instruction.
  • +
+

Click some other lines, such as address $100B and $1014. Note how the + highlights and contents of other windows change.

+
+
+ +
+
+ t1-click-1017 +
+
+

Click on L1002 again, then double-click on the opcode ("LDA"). The + selection jumps to L1017. When an operand references an in-file address, + double-clicking on the opcode will take you to it. (Double-clicking on + the operand itself opens a format editor; more on that later.)

+

With line L1017 selected, double-click on the line that appears in the + References window. Note the selection jumps to L1002. You can immediately + jump to any reference.

+
+
+ +
+
+ t1-symbol-filters +
+
+

At the top of the Symbols window on the right side of the screen is a + row of buttons. Make sure "Auto" and "Addr" are selected. You should see + three labels in the window (L1002, L1014, L1017). Double-click on "L1014" + in the Symbols list. The selection jumps to the appropriate line.

+
+
+ +
+
+

Select Navigate > Find. Type "hello", and hit Enter. The selection will + move to address $100E, which is a string that says "hello!". You can use + Navigate > Find Next to try to find the next occurrence (there isn't one). You + can search for any text that appears in the rightmost columns (label, opcode, + operand, comment).

+
+
+ +
+
+ t1-goto-box +
+
+

Select Navigate > Go To. You can enter a label, address, or file offset. + Enter "100b" to set the selection to the line at address $100B.

+
+
+ +
+
+ t1-toolbar +
+
+

Near the top-left of the SourceGen window is a set of toolbar icons. + Click the curly left-pointing arrow, and watch the selection move. Click + it again. Then click the curly right-arrow a couple of times. Whenever + you jump around in the file by using the Go To feature, or by double-clicking + on opcodes or lines in the side windows, the locations are added to a + navigation history. The arrows let you move forward and backward + through it.

+
+
+ +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/odds-ends.html b/docs/sgtutorial/odds-ends.html index 3bd99e4..c0843db 100644 --- a/docs/sgtutorial/odds-ends.html +++ b/docs/sgtutorial/odds-ends.html @@ -1,197 +1,245 @@ - - - - - - - - - - - Odds & Ends - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Odds & Ends

- -
-
-

The rest of the code isn't really intended to do anything useful. It - just exists to illustrate some odd situations.

-
-
- -
-
- t2-2078 -
-
-

Look at the code starting at $2078. It ends with a BRK - at $2081, which as noted earlier is a bad sign. If you look two lines - above the BRK, you'll see that it's loading the accumulator - with zero, then doing a BNE, which should never be - taken (note the cycle count for the BNE is 2). - The trick is in the two lines before that, which use self-modifying code to - change the LDA immediate operand from $00 to $ff. - The BNE is actually a branch-always.

-
-
- -
-
- t2-override-status.png -
-
-

We can fix this by correcting the status flags. Select line $207F, - and then Actions > Override Status Flags. This lets us specify what - the flags should be before the instruction is executed. For each flag, - we can override the default behavior and specify that the flag is - clear (0), set (1), or indeterminate (could be 0 or 1). In this case, - we know that the self-modified code will be loading a non-zero value, so - in the "Z" column click on the button in the "Zero" row. - Click "OK".

-
-
- -
-
- t2-2078-done -
-
-

The BNE is now an always-taken branch, and the code - list rearranges itself appropriately (and the cycle count is now 3).

-
-
- -
-
- t2-2086 -
-
-

Continuing on, the code at $2086 touches a few consecutive locations - that have auto-generated labels.

-
-
- -
-
- t2-2081-stuff -
-
-

Edit the label on line $2081, setting it to STUFF. - Notice how the references to $2081 through $2084 have changed from - auto-generated labels to references to STUFF.

-
-
- -
-
- t2-seek-nearby -
-
-

For some projects this may be undesirable. Use - Edit > Project Properties, then in the - Analysis Parameters box un-check - Seek nearby targets, and click OK.

-
-
- -
-
-

You'll notice that the references to $2081 and later have switched - back to auto labels. If you scroll up, you'll see that the references to - PTR1+1 and PTR2+1 were - not affected, because local variables use explicit widths rather - than the "nearby" logic.

-

The nearby-target behavior is generally desirable, because it lets you - avoid explicitly labeling every part of a multi-byte data item. For now, - use Edit > Undo to switch it back on. - (Changes to project properties are added to the undo/redo buffer - just like any other change to the project.)

-
-
- -
-
- t2-2092 -
-
-

The code at $2092 looks a bit strange. LDX, then a - BIT with a weird symbol, then another LDX. If - you look at the "bytes" column, you'll notice that the three-byte - BIT instruction has only one byte on its line.

-
-
- -
-
-

The trick here is that the LDX #$01 is embedded inside the - BIT instruction. When the code runs through here, X is set - to $00, then the BIT instruction sets some flags, then the - STA runs. Several lines down there's a BNE - to $2095, which is in the middle of the BIT instruction. - It loads X with $01, then also continues to the STA.

-

Embedded instructions are unusual but not unheard-of. (This trick is - used extensively in Microsoft BASICs, such as Applesoft.) When you see the - extra symbol in the opcode field, you need to look closely at what's going - on.

-
-
- -
- -
-
-

This is the end of the basic tutorial (congratulations!). - The next sections explore some advanced topics.

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Odds & Ends - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Odds & Ends

+ +
+
+

The rest of the code isn't really intended to do anything useful. It + just exists to illustrate some odd situations.

+
+
+ +
+
+ t2-2078 +
+
+

Look at the code starting at $2078. It ends with a BRK + at $2081, which as noted earlier is a bad sign. If you look two lines + above the BRK, you'll see that it's loading the accumulator + with zero, then doing a BNE, which should never be + taken (note the cycle count for the BNE is 2). + The trick is in the two lines before that, which use self-modifying code to + change the LDA immediate operand from $00 to $ff. + The BNE is actually a branch-always.

+
+
+ +
+
+ t2-override-status.png +
+
+

We can fix this by correcting the status flags. Select line $207F, + and then Actions > Override Status Flags. This lets us specify what + the flags should be before the instruction is executed. For each flag, + we can override the default behavior and specify that the flag is + clear (0), set (1), or indeterminate (could be 0 or 1). In this case, + we know that the self-modified code will be loading a non-zero value, so + in the "Z" column click on the button in the "Zero" row. + Click "OK".

+
+
+ +
+
+ t2-2078-done +
+
+

The BNE is now an always-taken branch, and the code + list rearranges itself appropriately (and the cycle count is now 3).

+
+
+ +
+
+ t2-2086 +
+
+

Continuing on, the code at $2086 touches a few consecutive locations + that have auto-generated labels.

+
+
+ +
+
+ t2-2081-stuff +
+
+

Edit the label on line $2081, setting it to STUFF. + Notice how the references to $2081 through $2084 have changed from + auto-generated labels to references to STUFF.

+
+
+ +
+
+ t2-seek-nearby +
+
+

For some projects this may be undesirable. Use + Edit > Project Properties, then in the + Analysis Parameters box un-check + Seek nearby targets, and click OK.

+
+
+ +
+
+

You'll notice that the references to $2081 and later have switched + back to auto labels. If you scroll up, you'll see that the references to + PTR1+1 and PTR2+1 were + not affected, because local variables use explicit widths rather + than the "nearby" logic.

+

The nearby-target behavior is generally desirable, because it lets you + avoid explicitly labeling every part of a multi-byte data item. For now, + use Edit > Undo to switch it back on. + (Changes to project properties are added to the undo/redo buffer + just like any other change to the project.)

+
+
+ +
+
+ t2-2092 +
+
+

The code at $2092 looks a bit strange. LDX, then a + BIT with a weird symbol, then another LDX. If + you look at the "bytes" column, you'll notice that the three-byte + BIT instruction has only one byte on its line.

+
+
+ +
+
+

The trick here is that the LDX #$01 is embedded inside the + BIT instruction. When the code runs through here, X is set + to $00, then the BIT instruction sets some flags, then the + STA runs. Several lines down there's a BNE + to $2095, which is in the middle of the BIT instruction. + It loads X with $01, then also continues to the STA.
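To see why the same bytes read differently from two entry points, here is a minimal Python sketch of a decoder. The byte sequence is a reconstruction for illustration, assuming the code starts at $2092; the STA target ($3000) is made up, and the opcode table covers only the three instructions involved.

```python
# Minimal opcode table: opcode byte -> (mnemonic, instruction length in bytes).
OPCODES = {
    0xA2: ("LDX #imm", 2),
    0x2C: ("BIT abs", 3),
    0x8D: ("STA abs", 3),
}


def decode(code, base, start, count):
    """Decode `count` instructions beginning at address `start`."""
    out = []
    addr = start
    for _ in range(count):
        pos = addr - base
        name, length = OPCODES[code[pos]]
        operand = code[pos + 1 : pos + length]
        out.append((addr, name, operand.hex()))
        addr += length
    return out


BASE = 0x2092
# A2 00 / 2C A2 01 / 8D 00 30  (hypothetical bytes; the STA operand is invented)
CODE = bytes([0xA2, 0x00, 0x2C, 0xA2, 0x01, 0x8D, 0x00, 0x30])

# Falling through from the top: LDX #$00, BIT $01A2, STA.
fall_through = decode(CODE, BASE, 0x2092, 3)
# Entering at $2095 via the branch: LDX #$01, STA.
branch_entry = decode(CODE, BASE, 0x2095, 2)
```

Both paths meet at the STA, but X holds $00 on one path and $01 on the other. The BIT exists only to swallow the second LDX.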

+

Embedded instructions are unusual but not unheard-of. (This trick is + used extensively in Microsoft BASICs, such as Applesoft.) When you see the + extra symbol in the opcode field, you need to look closely at what's going + on.

+
+
+ +
+ +
+
+

This is the end of the basic tutorial (congratulations!). + The next sections explore some advanced topics.

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/sidenav-incl.html b/docs/sgtutorial/sidenav-incl.html index 23691f7..a30f227 100644 --- a/docs/sgtutorial/sidenav-incl.html +++ b/docs/sgtutorial/sidenav-incl.html @@ -1,26 +1,26 @@ - - + + diff --git a/docs/sgtutorial/simple-edits.html b/docs/sgtutorial/simple-edits.html index e0d9f36..458b4dd 100644 --- a/docs/sgtutorial/simple-edits.html +++ b/docs/sgtutorial/simple-edits.html @@ -1,204 +1,252 @@ - - - - - - - - - - - Simple Edits - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Simple Edits

- -
-
-

Click the very first line of the file, which is a comment that says - something like "6502bench SourceGen vX.Y.Z". There are three ways to - open the comment editor:

-
    -
  1. Select Actions > Edit Long Comment from the menu bar.
  2. -
  3. Right click, and select Edit Long Comment from the - pop-up menu. (This menu is exactly the same as the Actions menu.)
  4. -
  5. Double-click the comment.
  6. -
-

Most things in the code list will respond to a double-click. - Double-clicking on addresses, flags, labels, operands, and comments will - open editors for those things. Double-clicking on a value in the "bytes" - column will open a floating hex dump viewer. This is usually the most - convenient way to edit something: point and click.

-
-
- -
-
- t1-edit-long-comment -
-
-

Double-click the comment to open the editor. Type some words into the - upper window, and note that a formatted version appears in the bottom - window. Experiment with the maximum line width and "render in box" - settings to see what they do. You can hit Enter to create line breaks, - or let SourceGen wrap lines for you. When you're done, click OK. - (Or hit Ctrl+Enter.)

-

When the dialog closes, you'll see your new comment in place at the - top of the file. If you typed enough words, your comment will span - multiple lines. You can select the comment by selecting any line in it.

-
-
- -
- -
-

Click on the comment, then shift-click on L1014. Right-click, and look - at the menu. Nearly all of the menu items are disabled. Most edit features - are only enabled when a single instance of a relevant item is selected, so - for example Edit Long Comment won't be enabled if you have an - instruction selected.

- -
-
- -
-
- t1-edit-note -
-
-

Let's add a note. Click on $100E (the line with "hello!"), then - select Actions > Edit Note. Type a few words, pick a color, - and click OK (or hit Ctrl+Enter). - Your note appears in the code, and also in the - window on the bottom left. Notes are like long comments, with three key - differences:

-
    -
  1. You can't pick their line width, but you can pick their color.
  2. -
  3. They don't appear in generated assembly sources, making them - useful for leaving notes to yourself as you work.
  4. -
  5. They're listed in the Notes window. Double-clicking them jumps - the selection to the note, making them useful as bookmarks.
  6. -
- -
-
- -
-
- t1-set-addr-1017 -
-
-

It's time to do something with the code. If you look at what the code - does you'll see that it's copying several dozen bytes from $1017 - to $2000, then jumping to $2000. It appears to be relocating the next - part of the code before - executing it. We want to let the disassembler know what's going on, so - select the line at address $1017 and then - Actions > Set Address. (Or double-click on - "1017" in the Addr column.) - In the Set Address dialog, type "2000", and hit Enter.

-
-
- -
-
- t1-addr-chg-1017 -
-
-

Note the way the code list has changed. When you changed the address, - the JMP $2000 at address $1014 found a home inside - the bounds of the file, so - the code tracer was able to find the instructions there.

-
-
- -
-
-

From the menu, select Edit > Undo. Notice how - everything reverts to the way it was. Now, select - Edit > Redo to restore the changes. You can undo any change you - make to the project. (The undo history is not saved in - the project file, though, so when you exit the program the history is - lost.)

-

As you make alterations to the addresses, notice that, while the - Address column changes, the Offset column does not. - File offsets never change, which is why they're shown here and - in the References and Notes windows. (They can, however, be distracting, - so you'll be forgiven if you reduce the offset column width to zero.)

-
-
- -
-
- t1-simple-instr-edit -
-
-

Select the line with address $2003 ("CMP #$04"), then - Actions > Edit Operand. This allows you to pick how you want the - operand to look. It's currently set to "Default", which for an 8-bit - immediate argument means it's shown as a hexadecimal value. Click - "Binary", then "OK". It now appears as a binary value.

-
-
- -
-
- t1-2003-done -
-
-

On that same line, select Actions > Edit Comment. Type a short - comment, and hit Enter. Your comment appears - in the "comment" column.

-
-
- -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Simple Edits - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Simple Edits

+ +
+
+

Click the very first line of the file, which is a comment that says + something like "6502bench SourceGen vX.Y.Z". There are three ways to + open the comment editor:

+
    +
  1. Select Actions > Edit Long Comment from the menu bar.
  2. +
  3. Right click, and select Edit Long Comment from the + pop-up menu. (This menu is exactly the same as the Actions menu.)
  4. +
  5. Double-click the comment.
  6. +
+

Most things in the code list will respond to a double-click. + Double-clicking on addresses, flags, labels, operands, and comments will + open editors for those things. Double-clicking on a value in the "bytes" + column will open a floating hex dump viewer. This is usually the most + convenient way to edit something: point and click.

+
+
+ +
+
+ t1-edit-long-comment +
+
+

Double-click the comment to open the editor. Type some words into the + upper window, and note that a formatted version appears in the bottom + window. Experiment with the maximum line width and "render in box" + settings to see what they do. You can hit Enter to create line breaks, + or let SourceGen wrap lines for you. When you're done, click OK. + (Or hit Ctrl+Enter.)

+

When the dialog closes, you'll see your new comment in place at the + top of the file. If you typed enough words, your comment will span + multiple lines. You can select the comment by selecting any line in it.

+
+
+ +
+ +
+

Click on the comment, then shift-click on L1014. Right-click, and look + at the menu. Nearly all of the menu items are disabled. Most edit features + are only enabled when a single instance of a relevant item is selected, so + for example Edit Long Comment won't be enabled if you have an + instruction selected.

+ +
+
+ +
+
+ t1-edit-note +
+
+

Let's add a note. Click on $100E (the line with "hello!"), then + select Actions > Edit Note. Type a few words, pick a color, + and click OK (or hit Ctrl+Enter). + Your note appears in the code, and also in the + window on the bottom left. Notes are like long comments, with three key + differences:

+
    +
  1. You can't pick their line width, but you can pick their color.
  2. +
  3. They don't appear in generated assembly sources, making them + useful for leaving notes to yourself as you work.
  4. +
  5. They're listed in the Notes window. Double-clicking them jumps + the selection to the note, making them useful as bookmarks.
  6. +
+ +
+
+ +
+
+ t1-set-addr-1017 +
+
+

It's time to do something with the code. If you look at what the code + does you'll see that it's copying several dozen bytes from $1017 + to $2000, then jumping to $2000. It appears to be relocating the next + part of the code before + executing it. We want to let the disassembler know what's going on, so + select the line at address $1017 and then + Actions > Set Address. (Or double-click on + "1017" in the Addr column.) + In the Set Address dialog, type "2000", and hit Enter.

+
+
+ +
+
+ t1-addr-chg-1017 +
+
+

Note the way the code list has changed. When you changed the address, + the JMP $2000 at address $1014 found a home inside + the bounds of the file, so + the code tracer was able to find the instructions there.

+
+
+ +
+
+

From the menu, select Edit > Undo. Notice how + everything reverts to the way it was. Now, select + Edit > Redo to restore the changes. You can undo any change you + make to the project. (The undo history is not saved in + the project file, though, so when you exit the program the history is + lost.)

+

As you make alterations to the addresses, notice that, while the + Address column changes, the Offset column does not. + File offsets never change, which is why they're shown here and + in the References and Notes windows. (They can, however, be distracting, + so you'll be forgiven if you reduce the offset column width to zero.)
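The relationship between the two columns can be sketched as a small Python model. This is not SourceGen's implementation; the `AddressMap` class is hypothetical, and it assumes the Tutorial1 file loads at $1000, so that the line at $1017 sits at file offset $17.

```python
import bisect


class AddressMap:
    """Maps fixed file offsets to addresses via a sorted list of (offset, address) regions."""

    def __init__(self, load_address):
        self.regions = [(0, load_address)]

    def set_address(self, offset, address):
        # Replace any region already starting at this offset, then keep the list sorted.
        self.regions = [r for r in self.regions if r[0] != offset]
        bisect.insort(self.regions, (offset, address))

    def address_of(self, offset):
        # Find the last region that starts at or before the offset.
        starts = [r[0] for r in self.regions]
        start, addr = self.regions[bisect.bisect_right(starts, offset) - 1]
        return addr + (offset - start)


amap = AddressMap(0x1000)
before = amap.address_of(0x17)   # 0x1017 before the edit
amap.set_address(0x17, 0x2000)   # the "Set Address" step from the tutorial
after = amap.address_of(0x17)    # the same offset now maps to 0x2000
```

The offset passed in never changes; only the address computed from it does, which is why offsets make stable keys for notes and references.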

+
+
+ +
+
+ t1-simple-instr-edit +
+
+

Select the line with address $2003 ("CMP #$04"), then + Actions > Edit Operand. This allows you to pick how you want the + operand to look. It's currently set to "Default", which for an 8-bit + immediate argument means it's shown as a hexadecimal value. Click + "Binary", then "OK". It now appears as a binary value.

+
+
+ +
+
+ t1-2003-done +
+
+

On that same line, select Actions > Edit Comment. Type a short + comment, and hit Enter. Your comment appears + in the "comment" column.

+
+
+ +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/string-formatting.html b/docs/sgtutorial/string-formatting.html index 0d3e315..ee5b74b 100644 --- a/docs/sgtutorial/string-formatting.html +++ b/docs/sgtutorial/string-formatting.html @@ -1,121 +1,169 @@ - - - - - - - - - - - String Formatting - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

String Formatting

- -
-
-

Programs can encode strings, such as human-readable text or - filenames, in a variety of ways. Assemblers generally support one - or more of these. SourceGen allows you to choose from a number of - different formats, and automatically generates appropriate assembler - directives.

-

The most popular formats are null-terminated (string data followed - by $00), length-delimited (first byte or two holds the string length), - and dextral character inverted (the high bit on the last byte is - flipped). Sometimes strings are stored in reverse, so the output - routine can decrement a register to zero.

-
-
- -
-
- t2-str-null-term-start -
-
-

Looking at the Tutorial2 code, there are four strings starting - at address $2004, each of which is followed by $00. These look like - null-terminated strings, so let's make it official.

-
-
- -
-
- t2-str-null-term-bad -
-
-

First, let's do it wrong. Click on the line with - address $2004 to select it. Hold the shift key down, then double-click - on the operand field of the line with address $2031 (i.e. double-click on - the words "last string").

-

The Edit Data Operand dialog opens, but the null-terminated strings - option is not available. This is because we didn't include the null byte - on the last string. To be recognized as one of the "special" string types, - every selected string must match the expected pattern.

-
-
- -
-
- t2-str-null-term-good -
-
-

Cancel out of the dialog. Hold the shift key down, and double-click - on the operand on line $203C ($00). - With all 57 bytes selected, - you should now see "Null-terminated strings (4)" as an available - option (make sure the Character Encoding pop-up is set to - "Low or High ASCII"). Click on that, then click OK. - The strings are now shown as .ZSTR operands.

-
-
- - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + String Formatting - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

String Formatting

+ +
+
+

Programs can encode strings, such as human-readable text or + filenames, in a variety of ways. Assemblers generally support one + or more of these. SourceGen allows you to choose from a number of + different formats, and automatically generates appropriate assembler + directives.

+

The most popular formats are null-terminated (string data followed + by $00), length-delimited (first byte or two holds the string length), + and dextral character inverted (the high bit on the last byte is + flipped). Sometimes strings are stored in reverse, so the output + routine can decrement a register to zero.
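As a rough illustration of the three formats just described, the following Python sketch decodes each one from raw bytes. The function names are invented for this example, the length-delimited decoder assumes a one-byte length, and the DCI decoder assumes strings of at least two characters so that the flipped high bit on the last byte can be detected.

```python
def decode_null_terminated(data):
    """String data followed by a $00 byte."""
    end = data.index(0x00)
    return data[:end].decode("ascii")


def decode_length_prefixed(data):
    """First byte holds the string length (one-byte length assumed)."""
    length = data[0]
    return data[1 : 1 + length].decode("ascii")


def decode_dci(data):
    """The high bit of the last byte is flipped relative to the other bytes."""
    first_high = data[0] & 0x80
    chars = []
    for b in data:
        chars.append(chr(b & 0x7F))
        if (b & 0x80) != first_high:
            break  # high bit changed: this was the final character
    return "".join(chars)
```

For example, `decode_dci(bytes([0x48, 0x49, 0xA1]))` stops at the third byte, because that is where the high bit changes.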

+
+
+ +
+
+ t2-str-null-term-start +
+
+

Looking at the Tutorial2 code, there are four strings starting + at address $2004, each of which is followed by $00. These look like + null-terminated strings, so let's make it official.

+
+
+ +
+
+ t2-str-null-term-bad +
+
+

First, let's do it wrong. Click on the line with + address $2004 to select it. Hold the shift key down, then double-click + on the operand field of the line with address $2031 (i.e. double-click on + the words "last string").

+

The Edit Data Operand dialog opens, but the null-terminated strings + option is not available. This is because we didn't include the null byte + on the last string. To be recognized as one of the "special" string types, + every selected string must match the expected pattern.

+
+
+ +
+
+ t2-str-null-term-good +
+
+

Cancel out of the dialog. Hold the shift key down, and double-click + on the operand on line $203C ($00). + With all 57 bytes selected, + you should now see "Null-terminated strings (4)" as an available + option (make sure the Character Encoding pop-up is set to + "Low or High ASCII"). Click on that, then click OK. + The strings are now shown as .ZSTR operands.
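The "every selected string must match" rule can be pictured with a short Python check. This is a simplified sketch, not SourceGen's actual test: the function name is hypothetical, and it treats every $00 byte as a terminator without considering character encoding.

```python
def count_null_terminated(selection):
    """Return the number of strings if the selection is a run of
    null-terminated strings, or None if the last string lacks its $00."""
    if not selection or selection[-1] != 0x00:
        return None
    return selection.count(0x00)
```

Leaving the final $00 out of the selection, as in the "wrong" attempt earlier, makes the check fail for the whole range rather than for just the last string.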

+
+
+ + +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/using-sourcegen.html b/docs/sgtutorial/using-sourcegen.html index 7d3b61e..ecb897d 100644 --- a/docs/sgtutorial/using-sourcegen.html +++ b/docs/sgtutorial/using-sourcegen.html @@ -1,174 +1,222 @@ - - - - - - - - - - - Using SourceGen - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Using SourceGen

- -
-
-

This first section covers the basics of working with SourceGen: how to - move around, make edits, generate code, and so on. - SourceGen has some unusual features, so it's worth reading through this - even if you've used other disassemblers.

- -

You can't do anything useful until you open an existing project or - create a new one, so we'll start there.

-
-
- -
-
-

A SourceGen project is always associated with a data file, which - holds part or all of the program being disassembled. - For simplicity, the project is given the same name as the data file, with - .dis65 on the end. - No part of the data file is included in the project file, so you need - to keep both files in the same place. - If the program you're disassembling was split into more than one data - file, you'll need a separate project file for each.

-
-
- -
-
- t1-fresh-install -
-
-

To start a new project, launch SourceGen, and click on the - "Start New Project" button on - the initial screen, or use File > New. This opens the "New Project" - window, which lets you specify the target system and data file.

-
-
- -
-
- t1-new-project -
-
-

Choosing a target system, such as Apple //e or Commodore 64, will - create a project configured with the appropriate CPU and options. - If nothing in the list matches the file you want to work on, - there are "generic" entries for each - of the primary CPU varieties (6502, 65C02, W65C02, and 65816). If - you're unsure, just take your best guess. It's easy to change things after the - project has been started.

-

The area on the right side of the window has a list of the files, scripts, - and optional features that will be enabled for the - selected system. The various items here will be explained in more - detail later on.

-
-
- -
-
- t1-new-tutorial1 -
-
-

For this tutorial, we're going to use "Generic 6502", - near the bottom of the list.

-

The other thing we need to do here is select the data file to be - disassembled. Click Select File, navigate to the Examples - directory in the SourceGen installation directory, open Tutorial, - and select Tutorial1. -

Click OK to create the project.

-
-
- -
-
-

The first thing you should do after creating a new project is save it. - Some features create or load files from the directory where the project - file lives, so we want to establish that. Use File > Save - or Ctrl+S to save it, with the default name - (Tutorial1.dis65), in the directory where the data file lives.

-

(It's okay to create the project in the installation directory. You - don't need to work off of a copy of the data file; SourceGen doesn't modify - it, so you don't have to worry about trashing the example data.)

-
-
- -
-
- t1-settings -
-
-

The disassembly display can be tailored to your personal - preferences. Use Edit > Settings to open the - settings editor. You can change fonts, upper/lower case, text - delimiters, line wrapping, pseudo-op names, and more. There - are "quick set" buttons on some screens that allow you to make the - output resemble various popular assemblers.

-
-
- -
-
-

All app settings are local to your system, and do not affect - the project in any way. If somebody else opens the same project, - they may see entirely different pseudo-ops and upper-case choices, - based on their own personal preferences. - (The settings that affect projects are accessed through a - different screen, via Edit > Project Properties.)

- -

For now, you can leave everything set to default values.

-
-
- -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Using SourceGen - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Using SourceGen

+ +
+
+

This first section covers the basics of working with SourceGen: how to + move around, make edits, generate code, and so on. + SourceGen has some unusual features, so it's worth reading through this + even if you've used other disassemblers.

+ +

You can't do anything useful until you open an existing project or + create a new one, so we'll start there.

+
+
+ +
+
+

A SourceGen project is always associated with a data file, which + holds part or all of the program being disassembled. + For simplicity, the project is given the same name as the data file, with + .dis65 on the end. + No part of the data file is included in the project file, so you need + to keep both files in the same place. + If the program you're disassembling was split into more than one data + file, you'll need a separate project file for each.

+
+
+ +
+
+ t1-fresh-install +
+
+

To start a new project, launch SourceGen, and click on the + "Start New Project" button on + the initial screen, or use File > New. This opens the "New Project" + window, which lets you specify the target system and data file.

+
+
+ +
+
+ t1-new-project +
+
+

Choosing a target system, such as Apple //e or Commodore 64, will + create a project configured with the appropriate CPU and options. + If nothing in the list matches the file you want to work on, + there are "generic" entries for each + of the primary CPU varieties (6502, 65C02, W65C02, and 65816). If + you're unsure, just take your best guess. It's easy to change things after the + project has been started.

+

The area on the right side of the window has a list of the files, scripts, + and optional features that will be enabled for the + selected system. The various items here will be explained in more + detail later on.

+
+
+ +
+
+ t1-new-tutorial1 +
+
+

For this tutorial, we're going to use "Generic 6502", + near the bottom of the list.

+

The other thing we need to do here is select the data file to be + disassembled. Click Select File, navigate to the Examples + directory in the SourceGen installation directory, open Tutorial, + and select Tutorial1. +

Click OK to create the project.

+
+
+ +
+
+

The first thing you should do after creating a new project is save it. + Some features create or load files from the directory where the project + file lives, so we want to establish that. Use File > Save + or Ctrl+S to save it, with the default name + (Tutorial1.dis65), in the directory where the data file lives.

+

(It's okay to create the project in the installation directory. You + don't need to work off of a copy of the data file; SourceGen doesn't modify + it, so you don't have to worry about trashing the example data.)

+
+
+ +
+
+ t1-settings +
+
+

The disassembly display can be tailored to your personal + preferences. Use Edit > Settings to open the + settings editor. You can change fonts, upper/lower case, text + delimiters, line wrapping, pseudo-op names, and more. There + are "quick set" buttons on some screens that allow you to make the + output resemble various popular assemblers.

+
+
+ +
+
+

All app settings are local to your system, and do not affect + the project in any way. If somebody else opens the same project, + they may see entirely different pseudo-ops and upper-case choices, + based on their own personal preferences. + (The settings that affect projects are accessed through a + different screen, via Edit > Project Properties.)

+ +

For now, you can leave everything set to default values.

+
+
+ +
+ +
+ « Previous + Next » +
+ + + + + diff --git a/docs/sgtutorial/visualizations.html b/docs/sgtutorial/visualizations.html index c85362d..2b97a30 100644 --- a/docs/sgtutorial/visualizations.html +++ b/docs/sgtutorial/visualizations.html @@ -1,343 +1,390 @@ - - - - - - - - - - - Visualizations - SourceGen Tutorial - - - - -
- - - -
- -
- - - -
- -
- - - -
- -
- -

Visualizations

- -
-
-

Many programs contain a significant amount of graphical data. This is - especially true for games, where the space used for bitmaps is often - larger than the space required for the code. When disassembling a program - it can be very helpful to be able to see the contents of the data - regions in graphical form.

- -

Start a new project with the Generic 6502 profile, - and from the SourceGen Tutorial directory select "Tutorial5". - We'll need to load an extension script from - the project directory, so immediately save the project, using the - default name ("Tutorial5.dis65").

- -

Normally a project will give you some sort of hint as to the data - format, e.g. the graphics might be a platform-specific sprite. For - non-standard formats you can glean dimensions from the drawing code. For - the purposes of this tutorial we're just using a simple monochrome bitmap - format, with 8 pixels per byte, and we already know that our images are for - a Tic-Tac-Toe game. The 'X' and the 'O' are 8x8, and the game board is 40x40. - The bitmaps are sprites with transparency, so pixels are either solid - or transparent.

-
-
- -
-
- t5-add-vis -
-
-

The first thing we need to do is load an extension script that can - decode this format. The SourceGen "RuntimeData" directory has a few, - but for this tutorial we're using a custom one. Select - Edit > Project Properties, select the - Extension Scripts tab, and click - Add Scripts from Project. - Double-click on "VisTutorial5.cs", - then click OK.

-
-
- -
-
- t5-new-vis -
-
-

The addresses of the three bitmaps are helpfully identified by the - load instructions at the top of the file. Select the line at - address $100A, then - Actions > Create/Edit Visualization Set. In - the window that opens, click New Visualization.

-
-
- -
-
-

We're going to ignore most of what's going on and just focus on the - list of parameters at the bottom. The file offset indicates where in - the file the bitmap starts; note this is an offset, not an address - (that way, if you change the address, your visualizations don't break). - This is followed by the bitmap's width in bytes, and the bitmap's height. - Because we have 8 pixels per byte, we're currently showing an 8x1 image. - We'll come back to row stride.

-
-
- -
-
- t5-set-height-8 -
-
-

We happen to know (by playing the game and/or reading the fictitious - drawing code) that the image is 8x8, so change the value in the - Height - field to 8. As soon as you do, the preview window shows a big blue 'X'. - (The 'X' is 7x7; the last row/column of pixels are transparent so adjacent - images don't bump into each other.)

-
-
- - -
-
- t5-set-height-80 -
-
-

Let's try doing it wrong. Add a '0' in the Height - field to make the - height 80. You can see some additional bitmap data.

-
-
- -
-
- t5-set-height-800 -
-
-

Add another 0 to make it 800. Now you get - a big red X, and the Height parameter is shown in red. - That's because the maximum value for the height is 512, as shown - by "[1,512]" on the right.

-
-
- -
-
- t5-addvis1 -
-
-

Change it back to 8, and hit OK. - Hit OK in the Edit Visualization Set - window as well. You should now see the blue 'X' in the code listing - above line $100A.

-
-
- -
-
- t5-addvis2 -
-
-

Repeat the process at line $1012: select the line, create a visualization - set, create a new visualization. The height will default to 8 because - that's what you used last time, so you shouldn't have to - make any changes to the initial values. - Click OK in both dialogs to close them.

-
-
- -
-
- t5-101a-mess -
-
-

Repeat the process at line $101A, but this time the image is 40x40 - rather than 8x8. Set the width to 5, and the height to 40. This makes - a mess.

-
-
- -
-
- t5-101a-good -
-
-

In this case, the bitmap data is 5 bytes wide, but the data is stored - as 8 bytes per row. This is known as the "stride" or "pitch" of the row. - To tell the visualizer to skip the last 3 bytes on each row, set the - Row stride (bytes) field to 8. - Now we have a proper Tic-Tac-Toe grid. - Note that it fills the preview window just as the 'X' and 'O' did, even - though it's 5x as large. The preview window scales everything up. Hit - OK twice to create the visualization.

-
-
- -
-
- t5-fmt-dense -
-
-

Let's format the bitmap data. Select line $101A, then shift-click the - last line in the file ($1159). Use Actions > Edit Operand, select - Densely-packed bytes, and click OK. - This is perhaps a little too - dense. Open the operand editor again, but this time select the - densely-packed bytes sub-option ...with a limit, and set the limit - to 8 bytes per line. Instead of one very dense statement spread across - a few lines, you get one line of source code per row of bitmap.

-
-
- -
-
-

To change whether or not commas appear between bytes in the operand, - open Edit > Settings, select the - Display Format tab, and check - Use comma-separated format for bulk data. - This trades off compactness for ease of reading.

-
-
- -
- -

Bitmap Animations

- -
-
- t5-bitmap-anim-editor -
-
-

Some bitmaps represent individual frames in an animated sequence. - You can convert those as well. Double-click on the blue 'X' to open - the visualization set editor, then click "New Bitmap Animation". This - opens the Bitmap Animation Editor.

-
-
- -
-
- t5-xo-anim -
-
-

Let's try it with our Tic-Tac-Toe board pieces. From the list - on the left, select the blue 'X' and click Add, then - click the 'O' and click Add. Below the list, set the - frame delay to 500 msec. Near the bottom, click - Start / Stop. This causes the animation to play in a loop.

-
-
- -
-
-

You can use the controls to add and remove items, change their order, and change - the animation speed. You can add the grid bitmap to the animation set, but the - preview scales the bitmaps up to full size, so it may not look the way - you expect.

-

Hit OK to save the animation, then - OK to update the visualization set.

-
-
- -
-
- t5-list-xanim -
-
-

The code list now shows two entries in the line: the first is the 'X' - bitmap, the second is the animation, which is shown as the initial frame - with a blue triangle superimposed. (If you go back into the editor and - reverse the order of the frames, the list will show the 'O' instead.) - You can have as many bitmaps and animations on a line as you want.

-
-
- -
-
-

If you have a lot of bitmaps it can be helpful to give them meaningful - names, so that they're easy to identify and sort together in the list. - The Tag field at the top of the editor windows lets you - give things names. Tags must be unique.

-
-
- -
- -

Other Notes

- -
-
-

The visualization editor is intended to be very dynamic, showing the - results of parameter changes immediately. This can be helpful if you're - not exactly sure what the size or format of a bitmap is. Just keep - tweaking values until it looks right.

- -

Visualization generators are defined by extension scripts. If you're - disassembling a program with a totally custom way of storing graphics, - you can write a totally custom visualizer and distribute it with the - project. Because the file offset is a parameter, you're not limited to - placing visualizations at the start of the graphic data -- you can put - them on any code or data line.

- -

Visualizations have no effect on assembly source code generation, - but they do appear in code exported to HTML. Bitmaps are converted to GIF - images, and animations become animated GIFs.

-
-
- -
-
- t5-wireframe-sample -
-
-

You can also create animated visualizations of wireframe objects - (vector graphics, 3D shapes), but that's not covered in this tutorial.

-
-
- - - -
- -
- « Previous - Next » -
- - - - - + + + + + + + + + + + Visualizations - SourceGen Tutorial + + + + +
+ + +
+ 6502bench +
+ +
+ +
+ + + + + +
+ +
+ + + + +
+ +
+ +

Visualizations

+ +
+
+

Many programs contain a significant amount of graphical data. This is + especially true for games, where the space used for bitmaps is often + larger than the space required for the code. When disassembling a program + it can be very helpful to be able to see the contents of the data + regions in graphical form.

+ +

Start a new project with the Generic 6502 profile, + and from the SourceGen Tutorial directory select "Tutorial5". + We'll need to load an extension script from + the project directory, so immediately save the project, using the + default name ("Tutorial5.dis65").

+ +

Normally a project will give you some sort of hint as to the data + format, e.g. the graphics might be a platform-specific sprite. For + non-standard formats you can glean dimensions from the drawing code. For + the purposes of this tutorial we're just using a simple monochrome bitmap + format, with 8 pixels per byte, and we already know that our images are for + a Tic-Tac-Toe game. The 'X' and the 'O' are 8x8, and the game board is 40x40. + The bitmaps are sprites with transparency, so pixels are either solid + or transparent.

+
+
+ +
+
+ t5-add-vis +
+
+

The first thing we need to do is load an extension script that can + decode this format. The SourceGen "RuntimeData" directory has a few, + but for this tutorial we're using a custom one. Select + Edit > Project Properties, select the + Extension Scripts tab, and click + Add Scripts from Project. + Double-click on "VisTutorial5.cs", + then click OK.

+
+
+ +
+
+ t5-new-vis +
+
+

The addresses of the three bitmaps are helpfully identified by the + load instructions at the top of the file. Select the line at + address $100A, then + Actions > Create/Edit Visualization Set. In + the window that opens, click New Visualization.

+
+
+ +
+
+

We're going to ignore most of what's going on and just focus on the + list of parameters at the bottom. The file offset indicates where in + the file the bitmap starts; note this is an offset, not an address + (that way, if you change the address, your visualizations don't break). + This is followed by the bitmap's width in bytes, and the bitmap's height. + Because we have 8 pixels per byte, we're currently showing an 8x1 image. + We'll come back to row stride.
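To illustrate why the parameter is a file offset rather than an address, here is a minimal sketch. It assumes the whole file is loaded as one contiguous region at a hypothetical `load_address` of $1000; the function name and that address are illustrative only and are not part of SourceGen.

```python
def address_to_offset(address: int, load_address: int) -> int:
    """Convert a memory address to a file offset.

    Assumes the file is a single contiguous region loaded at
    load_address (an assumption made for this sketch).
    """
    offset = address - load_address
    if offset < 0:
        raise ValueError("address is below the load address")
    return offset


# Hypothetical: with the file loaded at $1000, address $100A is offset $0A.
offset = address_to_offset(0x100A, 0x1000)

# If the load address is later changed to $2000, the same byte moves to
# address $200A, but its file offset stays the same.
assert address_to_offset(0x200A, 0x2000) == offset
```

Because the visualization stores the offset, changing the address of the region does not change which bytes are drawn.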

+
+
+ +
+
+ t5-set-height-8 +
+
+

We happen to know (by playing the game and/or reading the fictitious + drawing code) that the image is 8x8, so change the value in the + Height + field to 8. As soon as you do, the preview window shows a big blue 'X'. + (The 'X' is 7x7; the last row and column of pixels are transparent so adjacent + images don't bump into each other.)

+
+
+ + +
+
+ t5-set-height-80 +
+
+

Let's try doing it wrong. Add a '0' in the Height + field to make the + height 80. You can see some additional bitmap data.

+
+
+ +
+
+ t5-set-height-800 +
+
+

Add another 0 to make it 800. Now you get + a big red X, and the Height parameter is shown in red. + That's because the maximum value for the height is 512, as shown + by "[1,512]" on the right.

+
+
+ +
+
+ t5-addvis1 +
+
+

Change it back to 8, and hit OK. + Hit OK in the Edit Visualization Set + window as well. You should now see the blue 'X' in the code listing + above line $100A.

+
+
+ +
+
+ t5-addvis2 +
+
+

Repeat the process at line $1012: select the line, create a visualization + set, create a new visualization. The height will default to 8 because + that's what you used last time, so you shouldn't have to + make any changes to the initial values. + Click OK in both dialogs to close them.

+
+
+ +
+
+ t5-101a-mess +
+
+

Repeat the process at line $101A, but this time the image is 40x40 + rather than 8x8. Set the width to 5, and the height to 40. This makes + a mess.

+
+
+ +
+
+ t5-101a-good +
+
+

In this case, the bitmap data is 5 bytes wide, but the data is stored + as 8 bytes per row. This is known as the "stride" or "pitch" of the row. + To tell the visualizer to skip the last 3 bytes on each row, set the + Row stride (bytes) field to 8. + Now we have a proper Tic-Tac-Toe grid. + Note that it fills the preview window just as the 'X' and 'O' did, even + though it's 5x as large. The preview window scales everything up. Hit + OK twice to create the visualization.
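The effect of the row stride can be sketched roughly as follows. This is not the code in VisTutorial5.cs: the function name, the choice of the most significant bit as the leftmost pixel, and the convention that a stride of 0 means "tightly packed" are all assumptions made for illustration.

```python
def decode_bitmap(data: bytes, offset: int, width_bytes: int,
                  height: int, row_stride: int = 0) -> list:
    """Decode a 1-bit-per-pixel bitmap into rows of text.

    '#' is a solid pixel and '.' is a transparent one.
    Assumptions for this sketch: the high bit of each byte is the
    leftmost pixel, and row_stride == 0 means rows are tightly packed.
    """
    stride = row_stride if row_stride else width_bytes
    if stride < width_bytes:
        raise ValueError("row stride is smaller than the row width")

    rows = []
    for y in range(height):
        # Each row starts 'stride' bytes after the previous one; only the
        # first 'width_bytes' of those bytes hold pixels, the rest are skipped.
        start = offset + y * stride
        row_bytes = data[start:start + width_bytes]
        if len(row_bytes) < width_bytes:
            raise ValueError("bitmap runs off the end of the file")
        rows.append("".join(
            "#" if byte & (0x80 >> bit) else "."
            for byte in row_bytes
            for bit in range(8)))
    return rows


# Two rows, 2 bytes wide, stored 3 bytes per row (the third byte is padding).
sample = bytes([0xFF, 0x00, 0xAA,
                0x01, 0x80, 0xAA])
for line in decode_bitmap(sample, 0, 2, 2, row_stride=3):
    print(line)
```

Calling it with the stride left at 0 on the same data pulls the padding bytes into the image, which is the kind of mess seen in the previous step.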

+
+
+ +
+
+ t5-fmt-dense +
+
+

Let's format the bitmap data. Select line $101A, then shift-click the + last line in the file ($1159). Choose Actions > Edit Operand, select + Densely-packed bytes, and click OK. + This is perhaps a little too + dense. Open the operand editor again, but this time select the + densely-packed bytes sub-option ...with a limit, and set the limit + to 8 bytes per line. Instead of one very dense statement spread across + a few lines, you get one line of source code per row of bitmap.
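The grouping performed by the per-line limit can be pictured with a short sketch. The helper name and the bare hex output are assumptions for illustration; the actual directive and operand syntax depend on the assembler that SourceGen is targeting.

```python
def dense_hex_lines(data: bytes, bytes_per_line: int) -> list:
    """Split data into hex strings holding at most bytes_per_line bytes each."""
    if bytes_per_line < 1:
        raise ValueError("bytes_per_line must be at least 1")
    return [data[i:i + bytes_per_line].hex().upper()
            for i in range(0, len(data), bytes_per_line)]


# With a limit of 8, a bitmap stored 8 bytes per row yields one line per row.
for line in dense_hex_lines(bytes(range(16)), 8):
    print(line)
```

With the 40-row grid stored at 8 bytes per row, a limit of 8 gives 40 lines, one for each row of the bitmap.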

+
+
+ +
+
+

To change whether or not commas appear between bytes in the operand, + open Edit > Settings, select the + Display Format tab, and check + Use comma-separated format for bulk data. + This trades off compactness for ease of reading.

+
+
+ +
+ +

Bitmap Animations

+ +
+
+ t5-bitmap-anim-editor +
+
+

Some bitmaps represent individual frames in an animated sequence. + You can convert those as well. Double-click on the blue 'X' to open + the visualization set editor, then click "New Bitmap Animation". This + opens the Bitmap Animation Editor.

+
+
+ +
+
+ t5-xo-anim +
+
+

Let's try it with our Tic-Tac-Toe board pieces. From the list + on the left, select the blue 'X' and click Add, then + click the 'O' and click Add. Below the list, set the + frame delay to 500 msec. Near the bottom, click + Start / Stop. This causes the animation to play in a loop.

+
+
+ +
+
+

You can use the controls to add and remove items, change their order, and change + the animation speed. You can add the grid bitmap to the animation set, but the + preview scales the bitmaps up to full size, so it may not look the way + you expect.

+

Hit OK to save the animation, then + OK to update the visualization set.

+
+
+ +
+
+ t5-list-xanim +
+
+

The code listing now shows two entries on the line: the first is the 'X' + bitmap, the second is the animation, which is shown as the initial frame + with a blue triangle superimposed. (If you go back into the editor and + reverse the order of the frames, the listing will show the 'O' instead.) + You can have as many bitmaps and animations on a line as you want.

+
+
+ +
+
+

If you have a lot of bitmaps it can be helpful to give them meaningful + names, so that they're easy to identify and sort together in the list. + The Tag field at the top of the editor windows lets you + give things names. Tags must be unique.

+
+
+ +
+ +

Other Notes

+ +
+
+

The visualization editor is intended to be very dynamic, showing the + results of parameter changes immediately. This can be helpful if you're + not exactly sure what the size or format of a bitmap is. Just keep + tweaking values until it looks right.

+ +

Visualization generators are defined by extension scripts. If you're + disassembling a program with a totally custom way of storing graphics, + you can write a totally custom visualizer and distribute it with the + project. Because the file offset is a parameter, you're not limited to + placing visualizations at the start of the graphic data -- you can put + them on any code or data line.

+ +

Visualizations have no effect on assembly source code generation, + but they do appear in code exported to HTML. Bitmaps are converted to GIF + images, and animations become animated GIFs.

+
+
+ +
+
+ t5-wireframe-sample +
+
+

You can also create animated visualizations of wireframe objects + (vector graphics, 3D shapes), but that's not covered in this tutorial.

+
+
+ + + +
+ +
+ « Previous +
+ + + + +