merge in updates from the ORCA Debugger disk.

2025-02-01 16:31:10 +00:00 · 2018-03-25 00:33:31 -04:00 · 2018-03-25 00:33:31 -04:00 · b9e6e4dd27
commit b9e6e4dd27
parent 7ef2ee2fcc
1 changed files with 63 additions and 34 deletions
--- a/Debugger.md
+++ b/Debugger.md
@@ -78,8 +78,8 @@ There is no operand for this instruction.  In assembly language, it looks like t

    $00 Displacement to the end of the table
    --- repeat for each variable
-    | $02 Pointer to the next variable name. The name is stored in P-string format.
-    | $06 Pointer to the variable's value. If the variable is an array, then this points to the first element.
+    | $02 Pointer to the variable name. The name is stored in P-string format.
+    | $06 Pointer to the variable's address. If the variable is an array, then this points to the first element.
    | $0a Address flag;  0 -> direct page, 1 -> long address
    | $0b Format of value; see table A-2
    | $0c Number of subscripts; 0 if not array
@@ -88,51 +88,68 @@ There is no operand for this instruction.  In assembly language, it looks like t
    | | $12 Maximum subscript value
    | | $16 Size of each element

+The symbol table follows right after the `COP 05` instruction.
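
As an illustration of the entry layout listed above, here is a minimal Python sketch that decodes one symbol table entry from raw bytes. It is not taken from any ORCA tool. The names `ENTRY`, `DIMENSION`, and `parse_entry` are hypothetical, and the sketch assumes little-endian fields, a 2-byte subscript count, and three 4-byte values per array dimension (minimum subscript, maximum subscript, element size), with the subscripts read as signed.

```python
import struct

# Assumed field widths, inferred from the offsets in the layout above:
# name pointer (4), address (4), address flag (1), format (1), subscript count (2).
ENTRY = struct.Struct("<IIBBH")
# Assumed per-dimension layout: minimum subscript, maximum subscript, element size.
DIMENSION = struct.Struct("<iiI")


def parse_entry(table, offset):
    """Decode the entry starting at `offset`; return (entry, offset of next entry)."""
    name_ptr, address, addr_flag, fmt, n_subs = ENTRY.unpack_from(table, offset)
    offset += ENTRY.size
    dimensions = []
    for _ in range(n_subs):
        dimensions.append(DIMENSION.unpack_from(table, offset))
        offset += DIMENSION.size
    entry = {
        "name_ptr": name_ptr,
        "address": address,
        "long_address": addr_flag == 1,  # 0 -> direct page, 1 -> long address
        "format": fmt,
        "dimensions": dimensions,
    }
    return entry, offset
```

The first entry starts at offset 2, right after the 2-byte displacement word at `$00`, so a caller would begin with `parse_entry(table, 2)` and keep feeding the returned offset back in.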

 The following table shows the format used to store the variable’s current value:

 ### Table A-2:  Debugger Symbol Table Format Codes

-    Value	Format
-    0   1-byte integer
-    1   2-byte integer
-    2   4-byte integer
-    3   single-precision real
-    4   double-precision real
-    5   extended-precision real
-    6   C-style string
-    7   Pascal-style string
-    8   character
-    9   boolean
-    10  8-byte SANE COMP
-    11  Pointer
-    12  Structure
-    13  Reference to previously defined structure
-    14  Object Pascal object
+    Value   Format
+    0       1-byte integer
+    1       2-byte integer
+    2       4-byte integer
+    3       single-precision real
+    4       double-precision real
+    5       extended-precision real
+    6       C-style string
+    7       Pascal-style string
+    8       character
+    9       boolean
+    10      SANE COMP number
+    11      pointer
+    12      structure, union or record
+    13      derived type
+    14      object


 One-byte integers default to unsigned, while two-byte and four-byte integers default to signed format.  `OR`ing the format code with `$40` reverses this default, giving signed one-byte integers or unsigned four-byte integers.  (The signed flag is not supported by PRIZM 1.1.3.)

-A pointer to a scalar type (1-10) is indicated by `OR`ing the value’s format code with `$80`.
+A pointer to a scalar type (1-10) is indicated by `OR`ing the value’s format code with `$80`. For example, `$82` would be a pointer to a 4-byte integer.
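
The two flag bits described above can be sketched as a small decoder. This is only an illustration: `decode_format`, `POINTER_FLAG`, and `SIGN_FLAG` are hypothetical names, and masking off both flag bits to recover the base code is an assumption about how a reader of the table might proceed.

```python
POINTER_FLAG = 0x80  # set when the entry is a pointer to the base type
SIGN_FLAG = 0x40     # reverses the default signedness of an integer type

FORMAT_NAMES = {
    0: "1-byte integer", 1: "2-byte integer", 2: "4-byte integer",
    3: "single-precision real", 4: "double-precision real",
    5: "extended-precision real", 6: "C-style string",
    7: "Pascal-style string", 8: "character", 9: "boolean",
    10: "SANE COMP number", 11: "pointer",
    12: "structure, union or record", 13: "derived type", 14: "object",
}


def decode_format(code):
    """Split a format byte into base code, pointer flag, and signedness."""
    base = code & 0xFF & ~(POINTER_FLAG | SIGN_FLAG)
    flipped = bool(code & SIGN_FLAG)
    if base == 0:
        signed = flipped          # one-byte integers default to unsigned
    elif base in (1, 2):
        signed = not flipped      # two- and four-byte integers default to signed
    else:
        signed = None             # signedness does not apply to other types
    return {
        "base": base,
        "name": FORMAT_NAMES.get(base, "unknown"),
        "pointer": bool(code & POINTER_FLAG),
        "signed": signed,
    }
```

With this sketch, `decode_format(0x82)` reports a pointer to a signed 4-byte integer, matching the `$82` example in the text.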

-A pointer to a non-scalar type (11-14) is indicated by a type 11 pointer record, followed by the record(s) for the underlying type.

-Structures, records, and objects are indicated by a type 12 or 14 record, followed by one ore more field records. Field records are similar to variable fields, with a different interpretation of the value and address flag fields.
+### Value 11, Pointer

-    --- repeat for each field
-    | $00 Pointer to the next field name. The name is stored in P-string format.
-    | $04 Pointer to the field offset.
-    | $08 EOF flag;  0 -> last field, 1 -> more fields present
-    | $09 Format of value; see table A-2
-    | $0a Number of subscripts; 0 if not array
-    | --- repeat for each array dimension
-    | | $0c Minimum subscript value
-    | | $10 Maximum subscript value
-    | | $14 Size of each element
+The pointer type is intended for use when a pointer to a pointer is needed.  The pointer symbol table entry is followed by a second symbol table entry that describes the value being pointed to.  For example, to describe a pointer to a pointer to a 4-byte integer, a compiler would generate two 12-byte symbol table entries.  The first would contain all of the normal address information, but have a type value of `$0B` (11).  The following entry would have a type value of `$82`.  In this second entry, the name and address fields are unneeded, and should be set to 0.  Only the format field is actually used.

-After a record has been defined in the symbol table, it may be referred to with a type 13 reference.  The subscript field is used as an offset within the symbol table of the actual definition. This allows for recursive structures (such as linked lists) and also acts as a form of compression to reduces symbol table size.
+While the reason for creating this type value is to allow pointers to pointers, the type will work for pointers to any other symbol table entry.  `OR`ing the type with `$80` is still preferred, though, since it saves a symbol table entry.
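
The pointer-to-pointer example above can be sketched as the bytes a compiler might emit. The helper name `pointer_to_pointer_to_long` is hypothetical, the fields are assumed to be little-endian, and the address flag of the second entry is set to 0 only as an assumption, since the text specifies just its name and address fields.

```python
import struct

# Assumed 12-byte entry: name pointer, address, address flag, format, subscript count.
ENTRY = struct.Struct("<IIBBH")

POINTER = 0x0B                 # type 11: pointer, described by the following entry
POINTER_TO_LONG = 0x80 | 0x02  # $82: pointer to a 4-byte integer


def pointer_to_pointer_to_long(name_ptr, address, long_address=True):
    """Build the two entries for a pointer to a pointer to a 4-byte integer."""
    first = ENTRY.pack(name_ptr, address, 1 if long_address else 0, POINTER, 0)
    # Only the format field of the second entry is used; the rest is zeroed.
    second = ENTRY.pack(0, 0, 0, POINTER_TO_LONG, 0)
    return first + second
```

The result is 24 bytes long: the format byte of the first entry holds `$0B` and the format byte of the second holds `$82`.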
+
+### Value 12, Structure, Union or Record
+
+This type is followed by a series of symbol table entries describing the fields within the structure, union or record (hereafter referred to as a record).  The field entries are coded exactly like normal symbol table entries, with these exceptions:
+
+1. The address field is a displacement to the field within the record, not the actual address.
+
+2. The one-byte address flag, normally used to tell if the address is a direct page displacement or absolute address, is now used as a flag indicating if there are more entries in the record.  If the byte is 0, the symbol table entry is the last one in the record.  If the byte is 1, the symbol table entry is followed by another field.
+
+There are no restrictions on the type of symbol table entries that can appear as a field.  Specifically, records can contain other records, pointers, or even pointers to other records.
+
+Variant records (C unions) are supported by the pragmatic approach of allowing fields to overlap.  The debugger is perfectly willing to display each and every variant at the same time.  This can lead to some very strange results when variant records are used for their intended purpose of overlapping radically different types of data, but it is also a very useful feature for the other common use of variant records: treating the same binary data as two different kinds of data.  Because the debugger allows all of the fields to be displayed, even if they overlap, a programmer can actually see all of the data formats at the same time.
+
+For an array of records, the array size and subscript information follows the symbol, and the field entries follow the array subscript information.
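
To illustrate how the "more fields" flag terminates a record, here is a small sketch that works on already-decoded entries. `Entry` and `collect_fields` are hypothetical names, and the sketch deliberately handles only the simplest case: a record with at least one field, whose fields are scalar, non-array entries. Nested records and arrays of records are not handled.

```python
from typing import NamedTuple

RECORD = 12  # format code for structure, union or record


class Entry(NamedTuple):
    name: str
    address: int  # for a field: displacement within the record
    flag: int     # for a field: 1 -> more fields follow, 0 -> last field
    fmt: int


def collect_fields(entries, index):
    """Return (fields, next_index) for the record entry at entries[index]."""
    if entries[index].fmt != RECORD:
        raise ValueError("entry is not a record")
    fields = []
    i = index + 1
    while True:
        field = entries[i]
        fields.append((field.name, field.address, field.fmt))
        i += 1
        if field.flag == 0:  # last field of this record
            break
    return fields, i
```

For a record `p1` followed by two single-precision fields `x` and `y`, the function returns both fields with their displacements and the index of the entry that follows the record.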
+
+### Value 13, Derived Type
+
+A derived type is a space-saver.  In a derived type, the subscript field is a displacement past the first symbol table entry.  The debugger uses the type for the symbol table entry at the given displacement.
+
+For example, assume there are three variables in the symbol table, p1, p2 and p3.  Each is a record containing two real values.  It's perfectly legal to create three separate symbol table entries, each with a record.  To save space, though, the best choice is to create a symbol table entry for p1 in the normal way.  Then, for p2 and p3, use a derived type and substitute the displacement of p1 in the symbol table for the subscript count, rather than duplicating the entire record declaration.
+
+Derived types can be used for any type in the symbol table, but for efficiency, they should only be used when the type referred to is one of the multi-entry types (11, pointer; 12, struct; or an array).
+
+
+### Value 14, Object
+
+Internally, an object is a pointer to a record.  When the user types entries in the debugger, though, accessing an object looks and works just like accessing a record.  The debugger will also allow you to type the name of the object itself, while it will not allow you to type the name of a record; in that case, the debugger prints the actual pointer value for the object.

-The symbol table follows right after the `COP 05` instruction.

 ## COP 06

@@ -164,4 +181,16 @@ Message 2 tells the debugger to treat the next `COP 00` as if it were a `COP 01`

 ## COP 08

-`COP 08` provides access to a global symbol table. The format is identical to `COP 05`. 
+This coprocessor instruction is used to enter global symbols in a top-level symbol table.  For the purpose of debugging, a global symbol is any symbol the compiler writer feels should be available to the programmer for the duration of the debugging session.  For example, in Modula-2, this would be any symbol defined at the top level in a module.
+
+When a debugger encounters the first `COP 08` instruction, it will create a new stack frame above all current stack frames, placing all of the symbols from the symbol table in that stack frame.  Unlike symbols entered with the `COP 05` instruction, these symbols will survive a return from the subroutine.  In fact, they will remain available until the program stops executing.
+
+Multiple `COP 08` instructions can be used.  When the debugger encounters a subsequent `COP 08` instruction, any symbols in the symbol table are added to the symbols currently displayed in the top-level table.
+
+While multiple `COP 08` instructions can be used, duplicate symbol tables must not be entered.  The compiler is responsible for ensuring that, once symbols from a unit or module have been entered, they are not entered into the debugger's symbol table a second time, even if the subroutine that actually contained the `COP 08` instruction is called again.
+
+PRIZM has no way to resolve multiple symbols with the same name.  For example, if `COP 08` instructions from two different units each enter a symbol with the same name, there is no way to see both of these values, and there isn't even a good way to determine which of the symbols the debugger will actually show when the user examines the symbol.  In general, it is expected that this issue will simply be pointed out to the user.  If the user wants to see both values, one of the names will have to be changed.
+
+In ORCA/Debugger, all of the symbols are displayed, even if there are two symbols with the same name.
+
+While there is no direct prohibition against entering some global variables with `COP 05` and some with `COP 08`, debugger displays will be a lot cleaner if all global variables are entered using `COP 08`, and all local variables are entered using `COP 05`.