Create Debugger.md

from ORCA/Debugger appendix A, including mistakes.
2025-03-06 06:29:57 +00:00 · 2018-02-10 00:46:46 -05:00 · 2018-02-10 00:46:46 -05:00 · e620357dc6
commit e620357dc6
parent 5685009791
1 changed files with 140 additions and 0 deletions
--- a/Debugger.md
+++ b/Debugger.md
@@ -0,0 +1,140 @@
+
+
+# How the Debugger Works
+
+## COP Vector
+
+The ORCA compilers and debuggers use an invasive debug mechanism that depends on the compilers inserting `COP` instructions in the code stream.  When the Apple IIGS executes a `COP` instruction, it calls a `COP` handler; the debuggers insert themselves in the list of programs that are called when a `COP` instruction is encountered.
+
+Several `COP` instructions are used.  There are separate `COP` instructions for executing a line of source code, breaking, stepping past a line, creating symbol tables, entering and leaving subroutines, and for passing messages to the debugger.  The various `COP` instructions are summarized in table A-1, and explained in detail below.
+
+### Table A-1:  Debugger COP Instructions
+
+    00	Indicates a new source code line.
+    01	Indicates a hard-coded break point.
+    02	Indicates a memory protection point.
+    03	Used when a new subroutine starts.
+    04	Marks the end of a subroutine.
+    05	Creates a symbol table.
+    06	Switches the source file.
+    07	Sends a message to the debugger.
+
+## COP 00
+
+`COP 00` indicates that a new source line has been reached and that the debugger must take appropriate action, such as updating the source listing position and variables window.
+
+The `COP` instruction is followed by the line number.  In assembly language, this would look like:
+
+    	cop	$00
+    	dc	i'16'
+
+## COP 01
+
+`COP 01`, like `COP 00`, marks the start of an executable line of source code.  The difference is that `COP 01` also indicates that the user has marked the line as a hard-coded break point, so the debugger should break at the line.
+
+The `COP` instruction is followed by the line number.  In assembly language, this would look like:
+
+    	cop	$01
+    	dc	i'16'
+
+## COP 02
+
+`COP 02`, like `COP 00`, marks the start of an executable line of source code.  The difference is that `COP 02` marks a protected line, indicating that the debugger should not take the normal action of updating the debugger display.  The only reason for putting `COP 02` instructions in the code is to give the debugger a chance to override the memory protection status of a line.  For example, the ORCA/Debugger allows manual break points to override these hard-coded memory protection points.
+
+The `COP` instruction is followed by the line number.  In assembly language, this would look like:
+
+    	cop	$02
+    	dc	i'16'
+
+## COP 03
+
+This instruction is used right after a subroutine is called, and marks entry into the subroutine.  The `COP` instruction is followed by the four-byte address of the subroutine name, coded as a null-terminated string (C-string).  In assembly language, this would look like:
+
+    	cop	$03
+    	dc	a4'name'
+    
+    	...
+    
+    name	dc	c'Subroutine Name',i1'0'
+
+## COP 04
+
+This instruction marks the end of a subroutine.  It should appear right after the last executable line in the subroutine, but before the code that wipes out the stack frame and returns to the caller.
+
+Debuggers will remove any symbol tables that have been created since the last `COP 03` instruction.
+
+Every `COP 04` instruction must match exactly one `COP 03` instruction.  If the debugger encounters a `COP 03` and never finds a `COP 04`, or encounters a `COP 04` without first hitting a `COP 03`, it could crash or corrupt memory.
+
+There is no operand for this instruction.  In assembly language, it looks like this:
+
+    	cop	$04
+
+
+## COP 05
+
+`COP 05` provides access to a subroutine’s symbol table.  It can be used after a `COP 03` or `COP 06`, but must be used before any `COP 00`, `COP 01`, or `COP 02`.  The debugger’s symbol table is organized as shown in Figure A-1.
+
+### Figure A-1:  Debugger Symbol Table Format
+
+    $00	Displacement to the end of the table
+    $02	Pointer to the next variable name. The name is stored in Pascal string format.
+    $06	Pointer to the variable's value. If the variable is an array, then this points to the first element.
+    $0a	Address flag;  0 -> direct page, 1 -> long address
+    $0b	Format of value; see table A-2
+    $0c	Number of subscripts; 0 if not array
+    $0e	Minimum subscript value
+    $12	Maximum subscript value
+    $16	Size of each element
+
+
+The following table shows the format used to store the variable’s current value:
+
+### Table A-2:  Debugger Symbol Table Format Codes
+
+    	Value	Format
+    	0	1-byte integer
+    	1	2-byte integer
+    	2	4-byte integer
+    	3	single-precision real
+    	4	double-precision real
+    	5	extended-precision real
+    	6	C-style string
+    	7	Pascal-style string
+    	8	character
+    	9	boolean
+
+
+The format code indicating a pointer to any of these types of values is obtained by `OR`ing the value’s format code with `$80`.
+
+One-byte integers default to unsigned, while two-byte and four-byte integers default to signed format.  `OR`ing the format code with `$40` reverses this default, giving signed one-byte integers or unsigned two-byte and four-byte integers.  (The signed flag is not supported by PRIZM 1.1.3.)
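+
+As a worked example, combining these flags with the codes from table A-2 gives format bytes such as:
+
+    	$81	pointer to a 2-byte integer ($01 OR $80)
+    	$40	signed 1-byte integer ($00 OR $40)
+    	$42	unsigned 4-byte integer ($02 OR $40)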
+
+The symbol table follows right after the `COP 05` instruction.
+
+
+## COP 06
+
+`COP 06` is used at the start of all subroutines, right after the `COP 03` that marks the start of the subroutine.  (You can put the `COP 06` before or after any `COP 05`, so long as it comes before any `COP 00`, `COP 01` or `COP 02` instructions.)  This instruction flags the source file for the subroutine, giving the debugger a chance to switch to the correct source file if it is not already being displayed.  You can also embed other `COP 06` instructions inside the subroutine if the subroutine spans several source files.
+
+The `COP 06` instruction is followed by the four-byte address of the full path name of the source file.  The path name is given as a C-string.  The ORCA/Debugger supports path names up to 255 characters long, and allows either `/` or `:` characters as separators.  Here’s what the instruction might look like in assembly language:
+
+    	cop	$06
+    	dc	a4'name'
+    
+    	...
+    
+    name	dc	c'/hd/programs/source.pas',i1'0'
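+
+Putting the ordering rules from the sections above together, the debug code for a single subroutine might be laid out roughly as follows.  This is only a sketch:  the labels `procname` and `filename`, the strings they point to, and the line number are made up for illustration, and the optional `COP 05` symbol table is left out.
+
+    ; entry into the subroutine, followed by the address of its name
+    	cop	$03
+    	dc	a4'procname'
+    ; source file for the subroutine, before any COP 00, 01 or 02
+    	cop	$06
+    	dc	a4'filename'
+    ; first executable source line (line 10 here)
+    	cop	$00
+    	dc	i'10'
+    
+    	...
+    
+    ; end of the subroutine, matching the COP 03 above
+    	cop	$04
+    
+    	...
+    
+    procname	dc	c'MyProc',i1'0'
+    filename	dc	c'/hd/programs/source.pas',i1'0'
+
+The code that removes the stack frame and returns to the caller would follow the `COP 04`.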
+
+
+## COP 07
+
+`COP 07` is used to send messages to the debugger.  The first four bytes following the `COP 07` have a fixed format, but the remaining bytes vary from message to message.
+
+The two bytes right after the `COP 07` instruction are the total length of the debugger message, in bytes.  This will always be at least 4.  The next two bytes are the message number.  The message number can be followed by more bytes.
+
+Three messages are currently defined and supported by ORCA/Debugger.  None uses any optional fields, so the length word should be four for all three of these messages.
+
+Message 0 tells the debugger to start patching all debugger `COP` instructions with `JMP` instructions.  This is the message sent by the `DebugFast` utility.  This message must be sent before a program starts to execute.  Sending it after a program with debug code starts, but before it finishes, can cause memory corruption or crashes.
+
+Message 1 tells the debugger to stop patching `COP` instructions, reversing the effect of message 0.  The `DebugNoFast` utility sends this message.
+
+Message 2 tells the debugger to treat the next `COP 00` as if it were a `COP 01`.  The `DebugBreak` utility sends this message.
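+
+As an illustration, a hand-coded message 2 might look like this in assembly language.  The sketch assumes the length word counts itself plus the message number, which is consistent with the length of four given above:
+
+    ; send a message to the debugger
+    	cop	$07
+    ; total length of the message, in bytes
+    	dc	i'4'
+    ; message number 2: treat the next COP 00 as a COP 01
+    	dc	i'2'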