# How the Debugger Works
## COP Vector
The ORCA compilers and debuggers use an invasive debug mechanism that depends on the compilers inserting `COP` instructions in the code stream. When the Apple IIGS executes a `COP` instruction it calls a `COP` handler; the debuggers insert themselves in the list of programs that are called when a `COP` instruction is encountered.
Several `COP` instructions are used. There are separate `COP` instructions for executing a line of source code, breaking, stepping past a line, creating symbol tables, entering and leaving subroutines, and for passing messages to the debugger. The various `COP` instructions are summarized in table A-1, and explained in detail below.
### Table A-1: Debugger COP Instructions
| COP | Purpose |
|-----|---------|
| 00 | Indicates a new source code line. |
| 01 | Indicates a hard-coded break point. |
| 02 | Indicates a memory protection point. |
| 03 | Used when a new subroutine starts. |
| 04 | Marks the end of a subroutine. |
| 05 | Creates a symbol table. |
| 06 | Switches the source file. |
| 07 | Sends a message to the debugger. |
| 08 | Creates a global symbol table. |
## COP 00
`COP 00` indicates that a new source line has been reached and that the debugger must take appropriate action, such as updating the source listing position and variables window.
The `COP` instruction is followed by the line number. In assembly language, this would look like:
cop $00
dc i'16'
## COP 01
`COP 01`, like `COP 00`, marks the start of an executable line of source code. The difference is that `COP 01` also indicates that the user has marked the line as a hard-coded break point, so the debugger should break at the line.
The `COP` instruction is followed by the line number. In assembly language, this would look like:
cop $01
dc i'16'
## COP 02
`COP 02`, like `COP 00`, marks the start of an executable line of source code. The difference is that `COP 02` marks a protected line, indicating that the debugger should not take the normal action of updating the debugger display. The only reason for putting `COP 02` instructions in the code is to give the debugger a chance to override the memory protection status of a line. For example, the ORCA/Debugger allows manual break points to override these hard-coded memory protection points.
The `COP` instruction is followed by the line number. In assembly language, this would look like:
cop $02
dc i'16'
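`COP 00`, `COP 01` and `COP 02` share one layout, so a single decoder can handle all three. The sketch below is a hypothetical Python helper (`decode_line_marker` is not part of any ORCA tool). It assumes the standard 65816 encoding of `COP` (opcode `$02` followed by the signature byte) and treats the line number as an unsigned little-endian 16-bit value.

```python
import struct

COP_OPCODE = 0x02  # 65816 opcode byte for the COP instruction

# Labels are illustrative names for the three line-marker signatures.
LINE_MARKERS = {0x00: "line", 0x01: "breakpoint", 0x02: "protected"}


def decode_line_marker(code: bytes, offset: int = 0):
    """Decode opcode, signature byte and 16-bit line number at `offset`."""
    opcode, signature, line = struct.unpack_from("<BBH", code, offset)
    if opcode != COP_OPCODE:
        raise ValueError("not a COP instruction")
    if signature not in LINE_MARKERS:
        raise ValueError(f"COP {signature:02X} is not a line marker")
    return LINE_MARKERS[signature], line


# cop $01 / dc i'16' assembles to the bytes 02 01 10 00.
print(decode_line_marker(bytes([0x02, 0x01, 0x10, 0x00])))  # ('breakpoint', 16)
```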
## COP 03
This instruction is used right after a subroutine is called, and marks entry into the subroutine. The `COP` instruction is followed by the four-byte address of the subroutine name, which is stored with a length-byte prefix (P-string format). In assembly language, this would look like:
cop $03
dc a4'name'
...
name dc i1'15',c'Subroutine Name'
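To illustrate how the operand is resolved, here is a minimal sketch that follows the four-byte pointer after a `COP 03` and reads the P-string it points to. The flat `memory` buffer and the helper names `read_pstring` and `subroutine_name` are assumptions made for this example; a real debugger would read from the IIGS address space instead.

```python
import struct


def read_pstring(memory: bytes, address: int) -> str:
    """Read a length-prefixed (Pascal-style) string starting at `address`."""
    length = memory[address]
    return memory[address + 1 : address + 1 + length].decode("ascii")


def subroutine_name(memory: bytes, cop_address: int) -> str:
    """Return the name referenced by the COP 03 located at `cop_address`."""
    opcode, signature, name_address = struct.unpack_from("<BBI", memory, cop_address)
    if (opcode, signature) != (0x02, 0x03):
        raise ValueError("expected COP 03")
    return read_pstring(memory, name_address)


# Tiny memory image: COP 03 at address 0, two filler bytes, name at address 8.
name = b"Subroutine Name"
memory = (
    bytes([0x02, 0x03])
    + struct.pack("<I", 8)
    + b"\x00\x00"
    + bytes([len(name)])
    + name
)
print(subroutine_name(memory, 0))  # Subroutine Name
```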
## COP 04
This instruction marks the end of a subroutine. It should appear right after the last executable line in the subroutine, but before the code that wipes out the stack frame and returns to the caller.
Debuggers will remove any symbol tables that have been created since the last `COP 03` instruction.
Every `COP 04` instruction must match exactly one `COP 03` instruction. If the debugger encounters a `COP 03` and never finds a `COP 04`, or encounters a `COP 04` without first hitting a `COP 03`, it could crash or corrupt memory.
There is no operand for this instruction. In assembly language, it looks like this:
cop $04
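The pairing rule can be pictured as a stack of frames. The following sketch is a hypothetical model, not the ORCA/Debugger implementation: it replays a sequence of `(cop_number, payload)` events, keeps one frame per active subroutine, discards a frame's symbol tables on `COP 04`, and reports mismatches as errors. Treating a `COP 05` outside any subroutine as an error is an assumption of this example.

```python
def replay(events):
    """Replay (cop_number, payload) events; return names of unclosed subroutines."""
    frames = []  # one entry per COP 03 that has not yet seen its COP 04
    for cop, payload in events:
        if cop == 0x03:
            frames.append({"name": payload, "tables": []})
        elif cop == 0x05:
            if not frames:
                # Assumption: symbol tables always belong to a subroutine.
                raise ValueError("COP 05 with no active subroutine")
            frames[-1]["tables"].append(payload)
        elif cop == 0x04:
            if not frames:
                raise ValueError("COP 04 without a matching COP 03")
            frames.pop()  # drops the tables created since the matching COP 03
    return [frame["name"] for frame in frames]


balanced = [(0x03, "main"), (0x05, "locals"), (0x03, "helper"), (0x04, None), (0x04, None)]
print(replay(balanced))          # []
print(replay([(0x03, "main")]))  # ['main']
```

A non-empty result means some `COP 03` never reached its `COP 04`, which is the situation the text warns about.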
## COP 05
`COP 05` provides access to a subroutine's symbol table. It can be used after a `COP 03` or `COP 06`, but must be used before any `COP 00`, `COP 01`, or `COP 02`. The debugger's symbol table is organized as shown in Figure A-1.
### Figure A-1: Debugger Symbol Table Format
$00 Displacement to the end of the table
--- repeat for each variable
| $02 Pointer to the next variable name. The name is stored in P-string format.
| $06 Pointer to the variable's value. If the variable is an array, then this points to the first element.
| $0a Address flag; 0 -> direct page, 1 -> long address
| $0b Format of value; see table A-2
| $0c Number of subscripts; 0 if not array
| --- repeat for each array dimension
| | $0e Minimum subscript value
| | $12 Maximum subscript value
| | $16 Size of each element
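The layout in Figure A-1 can be walked with a short parser. This is a sketch under several assumptions that the figure does not state: field widths are inferred from the offsets (the last field of each dimension is taken to be four bytes), all values are little-endian, subscript bounds are signed, and the leading displacement counts the bytes that follow the two-byte length word. It handles only flat entries; format codes 11-14 are followed by extra records and are rejected here. All names are hypothetical.

```python
import struct
from typing import NamedTuple


class Dimension(NamedTuple):
    minimum: int
    maximum: int
    element_size: int


class Variable(NamedTuple):
    name_ptr: int
    value_ptr: int
    long_address: bool
    format_code: int
    dimensions: tuple


def parse_symbol_table(data: bytes):
    """Parse the table that follows a COP 05 (flat entries only)."""
    (length,) = struct.unpack_from("<H", data, 0)
    pos, end = 2, 2 + length  # ASSUMPTION: displacement is relative to offset $02
    variables = []
    while pos < end:
        name_ptr, value_ptr, flag, fmt, ndims = struct.unpack_from("<IIBBH", data, pos)
        pos += 12
        if fmt in (11, 12, 13, 14):
            raise NotImplementedError("pointer/structure records are not handled")
        dims = []
        for _ in range(ndims):
            dims.append(Dimension(*struct.unpack_from("<iiI", data, pos)))
            pos += 12
        variables.append(Variable(name_ptr, value_ptr, flag == 1, fmt, tuple(dims)))
    return variables


# One scalar 2-byte integer on the direct page, one array [1..10] of 2-byte integers.
scalar = struct.pack("<IIBBH", 0x1000, 0x0008, 0, 1, 0)
array = struct.pack("<IIBBH", 0x1010, 0x020000, 1, 1, 1) + struct.pack("<iiI", 1, 10, 2)
table = struct.pack("<H", len(scalar) + len(array)) + scalar + array
for variable in parse_symbol_table(table):
    print(variable)
```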
The following table shows the format codes used to describe how a variable's current value is stored:
### Table A-2: Debugger Symbol Table Format Codes
| Value | Format |
|-------|--------|
| 0 | 1-byte integer |
| 1 | 2-byte integer |
| 2 | 4-byte integer |
| 3 | single-precision real |
| 4 | double-precision real |
| 5 | extended-precision real |
| 6 | C-style string |
| 7 | Pascal-style string |
| 8 | character |
| 9 | boolean |
| 10 | 8-byte SANE COMP |
| 11 | Pointer |
| 12 | Structure |
| 13 | Reference to previously defined structure |
| 14 | Object Pascal object |
One-byte integers default to unsigned, while two-byte and four-byte integers default to signed format. `OR`ing the format code with `$40` reverses this default, giving signed one-byte integers or unsigned four-byte integers. (The signed flag is not supported by PRIZM 1.1.3.)
A pointer to a scalar type (1-10) is indicated by `OR`ing the value's format code with `$80`.
A pointer to a non-scalar type (11-14) is indicated by a type 11 pointer record, followed by the record(s) for the underlying type.
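The flag bits can be separated from the base format code with simple masking. The sketch below is illustrative only: `describe_format` is a hypothetical helper, and masking the base code with `$3F` assumes the low six bits carry the value from Table A-2, which the text implies but does not state.

```python
BASE_FORMATS = {
    0: "1-byte integer", 1: "2-byte integer", 2: "4-byte integer",
    3: "single-precision real", 4: "double-precision real",
    5: "extended-precision real", 6: "C-style string",
    7: "Pascal-style string", 8: "character", 9: "boolean",
    10: "8-byte SANE COMP", 11: "pointer", 12: "structure",
    13: "structure reference", 14: "Object Pascal object",
}
SIGN_FLIP = 0x40  # reverses the default signedness of an integer
POINTER = 0x80    # marks a pointer to a scalar type


def describe_format(code: int) -> str:
    """Turn a format byte into a readable description."""
    base = code & 0x3F  # ASSUMPTION: base code lives in the low six bits
    name = BASE_FORMATS[base]
    if base in (0, 1, 2):
        default_signed = base != 0  # 1-byte defaults to unsigned, others to signed
        signed = default_signed != bool(code & SIGN_FLIP)
        name = ("signed " if signed else "unsigned ") + name
    if code & POINTER:
        name = "pointer to " + name
    return name


print(describe_format(0x40))  # signed 1-byte integer
print(describe_format(0x82))  # pointer to signed 4-byte integer
```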
Structures, records, and objects are indicated by a type 12 or 14 record, followed by one or more field records. Field records are similar to variable fields, with a different interpretation of the value and address flag fields.
--- repeat for each field
| $00 Pointer to the next field name. The name is stored in P-string format.
| $04 Pointer to the field offset.
| $08 EOF flag; 0 -> last field, 1 -> more fields present
| $09 Format of value; see table A-2
| $0a Number of subscripts; 0 if not array
| --- repeat for each array dimension
| | $0c Minimum subscript value
| | $10 Maximum subscript value
| | $14 Size of each element
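Field records are terminated by the EOF flag rather than by a length, so a reader loops until it sees a flag of 0. The sketch below makes the same assumptions as the offsets suggest (little-endian values, four-byte dimension fields, signed subscript bounds) and does not descend into nested structures. `parse_fields` and the dictionary keys are hypothetical names.

```python
import struct


def parse_fields(data: bytes, pos: int = 0):
    """Read field records starting at `pos`; return (fields, next_position)."""
    fields = []
    while True:
        name_ptr, offset, more, fmt, ndims = struct.unpack_from("<IIBBH", data, pos)
        pos += 12
        # Each dimension is (minimum, maximum, element size), 12 bytes in total.
        dims = [struct.unpack_from("<iiI", data, pos + 12 * i) for i in range(ndims)]
        pos += 12 * ndims
        fields.append(
            {"name_ptr": name_ptr, "offset": offset, "format": fmt, "dims": dims}
        )
        if more == 0:  # EOF flag: 0 marks the last field
            break
    return fields, pos


# Two scalar fields: a 2-byte integer at offset 0 and a 4-byte integer at offset 2.
first = struct.pack("<IIBBH", 0x2000, 0, 1, 1, 0)
last = struct.pack("<IIBBH", 0x2008, 2, 0, 2, 0)
fields, end = parse_fields(first + last + b"\xff\xff")
print(len(fields), end)  # 2 24
```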
After a record has been defined in the symbol table, it may be referred to with a type 13 reference. The subscript field is used as an offset within the symbol table of the actual definition. This allows for recursive structures (such as linked lists) and also acts as a form of compression that reduces symbol table size.
The symbol table follows right after the `COP 05` instruction.
## COP 06
`COP 06` is used at the start of all subroutines, right after the `COP 03` that marks the start of the subroutine. (You can put the `COP 06` before or after any `COP 05`, so long as it comes before any `COP 00`, `COP 01` or `COP 02` instructions.) This instruction flags the source file for the subroutine, giving the debugger a chance to switch to the correct source file if it is not already being displayed. You can also embed other `COP 06` instructions inside the subroutine if the subroutine spans several source files.
The `COP 06` instruction is followed by the four-byte address of the full path name of the source file. The path name is given as a P-string. The ORCA/Debugger supports path names up to 255 characters long, and allows either `/` or `:` characters as separators. Here's what the instruction might look like in assembly language:
cop $06
dc a4'name'
...
name dc i1'23',c'/hd/programs/source.pas'
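The path constraints are easy to check when generating debug code. This sketch uses hypothetical helpers (`encode_source_path`, `path_components`) to build the P-string for a source path, enforce the 255-character limit, and split a path on either separator. Restricting the path to ASCII is an assumption of the example.

```python
import re


def encode_source_path(path: str) -> bytes:
    """Return the path as a P-string: one length byte followed by the characters."""
    raw = path.encode("ascii")  # ASSUMPTION: path names are plain ASCII
    if len(raw) > 255:
        raise ValueError("path name longer than 255 characters")
    return bytes([len(raw)]) + raw


def path_components(path: str):
    """Split a path on '/' or ':' and drop empty components."""
    return [part for part in re.split(r"[/:]", path) if part]


encoded = encode_source_path("/hd/programs/source.pas")
print(encoded[0])                                   # 23
print(path_components(":hd:programs:source.pas"))  # ['hd', 'programs', 'source.pas']
```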
## COP 07
`COP 07` is used to send messages to the debugger. The first four bytes following the `COP 07` have a fixed format, but the remaining bytes vary from message to message.
The two bytes right after the `COP 07` instruction are the total length of the debugger message, in bytes. This will always be at least 4. The next two bytes are the message number. The message number can be followed by more bytes.
Three messages are currently defined and supported by ORCA/Debugger. None uses any optional fields, so the length word should be four for all three of these messages.
Message 0 tells the debugger to start patching all debugger `COP` instructions with `JMP` instructions. This is the message sent by the `DebugFast` utility. This message must be sent before a program starts to execute; sending it after a program with debug code starts, but before it finishes, can cause memory corruption or crashes.
Message 1 tells the debugger to stop patching `COP` instructions, reversing the effect of message 0. The `DebugNoFast` utility sends this message.
Message 2 tells the debugger to treat the next `COP 00` as if it were a `COP 01`. The `DebugBreak` utility sends this message.
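As an illustration of the message layout, the sketch below assembles the bytes for a `COP 07` message. It assumes the standard 65816 `COP` opcode (`$02`) and little-endian words; the constant names and `debugger_message` are hypothetical, and the `payload` parameter exists only to show how the length word would grow for a message with extra bytes.

```python
import struct

# Illustrative names for the three message numbers described above.
MSG_START_PATCHING = 0  # sent by DebugFast
MSG_STOP_PATCHING = 1   # sent by DebugNoFast
MSG_BREAK_NEXT = 2      # sent by DebugBreak


def debugger_message(number: int, payload: bytes = b"") -> bytes:
    """Build COP 07, the total length word, the message number and any payload."""
    length = 4 + len(payload)  # length word + message number + optional bytes
    return bytes([0x02, 0x07]) + struct.pack("<HH", length, number) + payload


print(debugger_message(MSG_BREAK_NEXT).hex(" "))  # 02 07 04 00 02 00
```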
## COP 08
`COP 08` provides access to a global symbol table. The format is identical to `COP 05`.