How the Debugger Works
COP Vector
The ORCA compilers and debuggers use an invasive debug mechanism that depends on the compilers inserting COP instructions in the code stream. When the Apple IIGS executes a COP instruction it calls a COP handler; the debuggers insert themselves in the list of programs that are called when a COP instruction is encountered.
Several COP instructions are used. There are separate COP instructions for executing a line of source code, breaking, stepping past a line, creating symbol tables, entering and leaving subroutines, and for passing messages to the debugger. The various COP instructions are summarized in table A-1, and explained in detail below.
Table A-1: Debugger COP Instructions
00 Indicates a new source code line.
01 Indicates a hard-coded break point.
02 Indicates a memory protection point.
03 Used when a new subroutine starts.
04 Marks the end of a subroutine.
05 Creates a symbol table.
06 Switches the source file.
07 Sends a message to the debugger.
08 Creates a global symbol table.
COP 00
COP 00 indicates that a new source line has been reached and that the debugger must take appropriate action, such as updating the source listing position and variables window.
The COP instruction is followed by the line number. In assembly language, this would look like:
cop $00
dc i'16'
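Seen from the debugger's side, the handler has to read back the bytes that follow the COP. The Python sketch below is illustrative only. It assumes that the 65816 COP opcode ($02) is followed by the COP number as its operand byte, and that the line number is the two-byte little-endian word that dc i'16' assembles to. The function name decode_line_cop is hypothetical. The same operand layout is described for COP 01 and COP 02.

```python
import struct

COP_OPCODE = 0x02  # 65816 opcode for the COP instruction


def decode_line_cop(code: bytes, pc: int = 0):
    """Decode a COP 00/01/02 record at offset pc.

    Returns (cop_number, line_number, next_pc). Assumes a two-byte,
    little-endian line number right after the two-byte COP instruction.
    """
    if code[pc] != COP_OPCODE:
        raise ValueError("not a COP instruction")
    cop_number = code[pc + 1]
    if cop_number not in (0x00, 0x01, 0x02):
        raise ValueError("this COP number does not carry a line number")
    (line_number,) = struct.unpack_from("<H", code, pc + 2)
    return cop_number, line_number, pc + 4


# cop $00 / dc i'16' under the assumptions above
print(decode_line_cop(bytes([0x02, 0x00, 0x10, 0x00])))
```

The returned next_pc is where execution would resume once the line-number word has been skipped.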
COP 01
COP 01, like COP 00, marks the start of an executable line of source code. The difference is that COP 01 also indicates that the user has marked the line as a hard-coded break point, so the debugger should break at the line.
The COP instruction is followed by the line number. In assembly language, this would look like:
cop $01
dc i'16'
COP 02
COP 02, like COP 00, marks the start of an executable line of source code. The difference is that COP 02 marks a protected line, indicating that the debugger should not take the normal action of updating the debugger display. The only reason for putting COP 02 instructions in the code is to give the debugger a chance to override the memory protection status of a line. For example, the ORCA/Debugger allows manual break points to override these hard-coded memory protection points.
The COP instruction is followed by the line number. In assembly language, this would look like:
cop $02
dc i'16'
COP 03
This instruction is used right after a subroutine is called, and marks entry into the subroutine. The COP instruction is followed by the four-byte address of the subroutine name, stored with a length-byte prefix (P-string format):
cop $03
dc a4'name'
...
name dc i1'15',c'Subroutine Name'
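To show how a debugger might follow that pointer, here is a small Python sketch that reads a P-string out of a toy memory image. It is an illustration under assumptions, not the ORCA/Debugger's code: the helper names read_pstring and decode_cop03 are hypothetical, the address is taken to be little-endian, and the unused high byte of the four-byte address is masked off.

```python
import struct


def read_pstring(memory: bytes, addr: int) -> str:
    """Read a length-prefixed (P-string) name stored at addr."""
    length = memory[addr]
    return memory[addr + 1 : addr + 1 + length].decode("ascii")


def decode_cop03(memory: bytes, pc: int) -> str:
    """Return the subroutine name referenced by the COP 03 at pc."""
    if memory[pc] != 0x02 or memory[pc + 1] != 0x03:
        raise ValueError("not a COP 03")
    # Four-byte little-endian address; only 24 bits are kept (assumption).
    (addr,) = struct.unpack_from("<I", memory, pc + 2)
    return read_pstring(memory, addr & 0xFFFFFF)


# Toy memory image: the COP at offset 0, the name at offset 16.
name = b"Subroutine Name"
memory = bytearray(64)
memory[0:6] = bytes([0x02, 0x03]) + struct.pack("<I", 16)
memory[16 : 17 + len(name)] = bytes([len(name)]) + name
print(decode_cop03(bytes(memory), 0))
```

Because the length is a single byte, a name stored this way can be at most 255 characters long.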
COP 04
This instruction marks the end of a subroutine. It should appear right after the last executable line in the subroutine, but before the code that wipes out the stack frame and returns to the caller.
Debuggers will remove any symbol tables that have been created since the last COP 03 instruction.
Every COP 04 instruction must match exactly one COP 03 instruction. If the debugger encounters a COP 03 and never finds a COP 04, or encounters a COP 04 without first hitting a COP 03, it could crash or corrupt memory.
There is no operand for this instruction. In assembly language, it looks like this:
cop $04
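The pairing rule can be pictured as a stack. The following Python sketch is one possible bookkeeping scheme, not a description of how any ORCA debugger is implemented: COP 03 pushes a frame, COP 05 attaches a symbol table to the current frame, and COP 04 pops the frame along with its tables. The class ScopeTracker and its method names are hypothetical, and raising an error on an unmatched COP 04 is this sketch's choice.

```python
class ScopeTracker:
    """Tracks open subroutines and the symbol tables created inside them."""

    def __init__(self):
        self.frames = []  # one entry per COP 03 not yet closed by a COP 04

    def enter(self, name):  # COP 03
        self.frames.append({"name": name, "tables": []})

    def add_symbols(self, table):  # COP 05
        if not self.frames:
            raise RuntimeError("COP 05 seen before any COP 03")
        self.frames[-1]["tables"].append(table)

    def leave(self):  # COP 04
        if not self.frames:
            raise RuntimeError("COP 04 without a matching COP 03")
        # Popping the frame discards every table added since its COP 03.
        return self.frames.pop()["name"]

    def live_tables(self):
        return [t for frame in self.frames for t in frame["tables"]]


tracker = ScopeTracker()
tracker.enter("main")
tracker.add_symbols("main symbols")
tracker.enter("helper")
tracker.add_symbols("helper symbols")
tracker.leave()
print(tracker.live_tables())
```

A missing COP 04 leaves a stale frame on the stack, and an extra one pops a frame that belongs to the caller, which is the kind of mismatch the text warns about.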
COP 05
COP 05 provides access to a subroutine’s symbol table. It can be used after a COP 03 or COP 06 instruction, but must be used before any COP 00, COP 01, or COP 02 instruction. The debugger’s symbol table is organized as shown in Figure A-1.
Figure A-1: Debugger Symbol Table Format
$00 Displacement to the end of the table
--- repeat for each variable
| $02 Pointer to the next variable name. The name is stored in P-string format.
| $06 Pointer to the variable's value. If the variable is an array, then this points to the first element.
| $0a Address flag; 0 -> direct page, 1 -> long address
| $0b Format of value; see table A-2
| $0c Number of subscripts; 0 if not array
| --- repeat for each array dimension
| | $0e Minimum subscript value
| | $12 Maximum subscript value
| | $16 Size of each element
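As a reading aid for Figure A-1, the Python sketch below walks the variable entries of a table. Several details are assumptions rather than statements from the text: all fields are little-endian, subscript bounds are signed four-byte values, the element size is four bytes wide, and the displacement at $00 is measured from the start of the table. The function parse_variables is hypothetical, and it does not handle the structure and pointer records (types 11 to 14) that carry extra records.

```python
import struct

# name pointer, value pointer, address flag, format code, subscript count
ENTRY = struct.Struct("<IIBBH")
# minimum subscript, maximum subscript, element size
DIMENSION = struct.Struct("<iiI")


def parse_variables(table: bytes, end: int):
    """Parse the variable entries stored in table[2:end]."""
    entries = []
    pos = 2  # skip the two-byte displacement word at $00
    while pos < end:
        name_ptr, value_ptr, long_addr, fmt, n_dims = ENTRY.unpack_from(table, pos)
        pos += ENTRY.size
        dims = []
        for _ in range(n_dims):
            dims.append(DIMENSION.unpack_from(table, pos))
            pos += DIMENSION.size
        entries.append({
            "name_ptr": name_ptr,
            "value_ptr": value_ptr,
            "long_address": bool(long_addr),
            "format": fmt,
            "dims": dims,
        })
    return entries


# A scalar two-byte integer on the direct page, then a 1..10 array of
# four-byte integers at a long address (made-up pointers).
body = (ENTRY.pack(0x1000, 0x0008, 0, 1, 0)
        + ENTRY.pack(0x1010, 0x020000, 1, 2, 1)
        + DIMENSION.pack(1, 10, 4))
table = struct.pack("<H", 2 + len(body)) + body
(end,) = struct.unpack_from("<H", table, 0)  # assumed relative to $00
for entry in parse_variables(table, end):
    print(entry)
```

Each entry occupies 12 bytes plus 12 bytes per array dimension, which matches the offsets in the figure ($02 through $0d, then $0e, $12, $16).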
The following table shows the format used to store the variable’s current value:
Table A-2: Debugger Symbol Table Format Codes
Value Format
0 1-byte integer
1 2-byte integer
2 4-byte integer
3 single-precision real
4 double-precision real
5 extended-precision real
6 C-style string
7 Pascal-style string
8 character
9 boolean
10 8-byte SANE COMP
11 Pointer
12 Structure
13 Reference to previously defined structure
14 Object Pascal object
One-byte integers default to unsigned, while two-byte and four-byte integers default to signed format. ORing the format code with $40 reverses this default, giving signed one-byte integers or unsigned four-byte integers. (The signed flag is not supported by PRIZM 1.1.3.)
A pointer to a scalar type (1-10) is indicated by ORing the value’s format code with $80.
A pointer to a non-scalar type (11-14) is indicated by a type 11 pointer record, followed by the record(s) for the underlying type.
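The flag bits can be summarized with a short decoder. This Python sketch is only an illustration of the rules above. The name describe_format is hypothetical, and it assumes that $40 flips the signedness of two-byte integers in the same way as one-byte and four-byte integers, which the text implies but does not state outright.

```python
BASE_FORMATS = {
    0: "1-byte integer", 1: "2-byte integer", 2: "4-byte integer",
    3: "single-precision real", 4: "double-precision real",
    5: "extended-precision real", 6: "C-style string",
    7: "Pascal-style string", 8: "character", 9: "boolean",
    10: "8-byte SANE COMP", 11: "pointer", 12: "structure",
    13: "reference to previously defined structure",
    14: "Object Pascal object",
}
SIGN_FLAG = 0x40     # reverses the default signedness of an integer
POINTER_FLAG = 0x80  # pointer to a scalar type


def describe_format(code: int) -> str:
    base = code & 0x3F
    text = BASE_FORMATS[base]
    if base in (0, 1, 2):
        signed = base != 0  # default: 1-byte unsigned, 2- and 4-byte signed
        if code & SIGN_FLAG:
            signed = not signed
        text = ("signed " if signed else "unsigned ") + text
    if code & POINTER_FLAG:
        text = "pointer to " + text
    return text


for code in (0x00, 0x40, 0x02, 0x42, 0x81, 0x03):
    print(f"${code:02x}: {describe_format(code)}")
```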
Structures, records, and objects are indicated by a type 12 or 14 record, followed by one or more field records. Field records are similar to variable fields, with a different interpretation of the value and address flag fields.
--- repeat for each field
| $00 Pointer to the next field name. The name is stored in P-string format.
| $04 Pointer to the field offset.
| $08 EOF flag; 0 -> last field, 1 -> more fields present
| $09 Format of value; see table A-2
| $0a Number of subscripts; 0 if not array
| --- repeat for each array dimension
| | $0c Minimum subscript value
| | $10 Maximum subscript value
| | $14 Size of each element
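The field list has no count; it ends at the first record whose flag byte is 0. A minimal Python sketch of that loop follows, with the same caveats as any reading of the figure: little-endian fields, signed four-byte subscript bounds, and a four-byte element size are assumptions, and parse_fields is a hypothetical name. For a type 13 field the sketch treats the subscript word as the offset of the earlier definition rather than as a dimension count. Nested structure fields are not followed.

```python
import struct

# name pointer, offset field, more-fields flag, format code, subscript word
FIELD = struct.Struct("<IIBBH")
DIMENSION = struct.Struct("<iiI")  # minimum, maximum, element size


def parse_fields(table: bytes, pos: int):
    """Parse field records starting at pos; return (fields, next_pos)."""
    fields = []
    while True:
        name_ptr, offset, more, fmt, word = FIELD.unpack_from(table, pos)
        pos += FIELD.size
        reference, dims = None, []
        if (fmt & 0x3F) == 13:
            # Type 13: the subscript word locates an earlier definition.
            reference = word
        else:
            for _ in range(word):
                dims.append(DIMENSION.unpack_from(table, pos))
                pos += DIMENSION.size
        fields.append({
            "name_ptr": name_ptr,
            "offset": offset,
            "format": fmt,
            "dims": dims,
            "reference": reference,
        })
        if more == 0:  # 0 marks the last field
            return fields, pos


# Two made-up fields: a two-byte integer, then a type 13 reference.
data = FIELD.pack(0x2000, 0, 1, 1, 0) + FIELD.pack(0x2008, 2, 0, 13, 2)
fields, next_pos = parse_fields(data, 0)
print(len(fields), next_pos)
```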
After a record has been defined in the symbol table, it may be referred to with a type 13 reference. The subscript field is used as an offset within the symbol table of the actual definition. This allows for recursive structures (such as linked lists) and also acts as a form of compression to reduce symbol table size.
The symbol table follows right after the COP 05 instruction.
COP 06
COP 06 is used at the start of all subroutines, right after the COP 03 that marks the start of the subroutine. (You can put the COP 06 before or after any COP 05, so long as it comes before any COP 00, COP 01 or COP 02 instructions.) This instruction flags the source file for the subroutine, giving the debugger a chance to switch to the correct source file if it is not already being displayed. You can also embed other COP 06 instructions inside the subroutine if the subroutine spans several source files.
The COP 06 instruction is followed by the four-byte address of the full path name of the source file. The path name is given as a P-string. The ORCA/Debugger supports path names up to 255 characters long, and allows either / or : characters as separators. Here’s what the instruction might look like in assembly language:
cop $06
dc a4'name'
...
name dc i1'23',c'/hd/programs/source.pas'
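Since either separator is allowed, a tool that consumes these path names has to accept both. The Python sketch below shows one way to split a P-string path into its components. The function name split_source_path is hypothetical, and ASCII encoding is an assumption.

```python
import re


def split_source_path(pstring: bytes):
    """Split a P-string path name on either '/' or ':' separators."""
    length = pstring[0]  # single length byte, so at most 255 characters
    path = pstring[1 : 1 + length].decode("ascii")
    return [part for part in re.split(r"[/:]", path) if part]


print(split_source_path(bytes([23]) + b"/hd/programs/source.pas"))
print(split_source_path(bytes([23]) + b":hd:programs:source.pas"))
```

Both calls yield the same three components: the volume name, the directory, and the file name.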
COP 07
COP 07 is used to send messages to the debugger. The first four bytes following the COP 07 have a fixed format, but the remaining bytes vary from message to message.
The two bytes right after the COP 07 instruction are the total length of the debugger message, in bytes. This will always be at least 4. The next two bytes are the message number. The message number can be followed by more bytes.
Three messages are currently defined and supported by ORCA/Debugger. None uses any optional fields, so the length word should be four for all three of these messages.
Message 0 tells the debugger to start patching all debugger COP instructions with JMP instructions. This is the message sent by the DebugFast utility. This message must be sent before a program starts to execute; sending it after a program with debug code starts, but before it finishes, can cause memory corruption or crashes.
Message 1 tells the debugger to stop patching COP instructions, reversing the effect of message 0. The DebugNoFast utility sends this message.
Message 2 tells the debugger to treat the next COP 00 as if it were a COP 01. The DebugBreak utility sends this message.
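The message layout can be sketched as a pair of encode and decode helpers. This Python example is illustrative and rests on assumptions: the COP opcode byte is $02, both words are little-endian, and the length counts from the length word itself (so a message with no extra bytes has length 4, as the text requires). The names build_message and parse_message are hypothetical.

```python
import struct

MESSAGES = {
    0: "start patching COP instructions with JMP",
    1: "stop patching COP instructions",
    2: "treat the next COP 00 as a COP 01",
}


def build_message(number: int, payload: bytes = b"") -> bytes:
    """Return the bytes for a COP 07 followed by its message header."""
    length = 4 + len(payload)  # length word + message number + extra bytes
    return bytes([0x02, 0x07]) + struct.pack("<HH", length, number) + payload


def parse_message(code: bytes, pc: int = 0):
    """Decode the COP 07 at pc; return (message_number, payload)."""
    if code[pc] != 0x02 or code[pc + 1] != 0x07:
        raise ValueError("not a COP 07")
    length, number = struct.unpack_from("<HH", code, pc + 2)
    if length < 4:
        raise ValueError("message length must be at least 4")
    payload = code[pc + 6 : pc + 2 + length]
    return number, payload


raw = build_message(2)
number, payload = parse_message(raw)
print(raw.hex(" "), "->", MESSAGES[number])
```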
COP 08
COP 08 provides access to a global symbol table. The format is identical to COP 05.