start writing docs about structs and pointers, update syntax files with ^^

2025-11-01 06:16:15 +00:00 · 2025-06-18 18:38:53 +02:00
parent bd72eaad4c
commit c2bf9024f8
14 changed files with 45 additions and 67 deletions
--- a/.idea/libraries/eclipse_lsp4j.xml
+++ b/.idea/libraries/eclipse_lsp4j.xml
@@ -4,8 +4,8 @@
    <CLASSES>
      <root url="jar://$MAVEN_REPOSITORY$/org/eclipse/lsp4j/org.eclipse.lsp4j/0.24.0/org.eclipse.lsp4j-0.24.0.jar!/" />
      <root url="jar://$MAVEN_REPOSITORY$/org/eclipse/lsp4j/org.eclipse.lsp4j.jsonrpc/0.24.0/org.eclipse.lsp4j.jsonrpc-0.24.0.jar!/" />
-      <root url="jar://$MAVEN_REPOSITORY$/com/google/code/gson/gson/2.12.1/gson-2.12.1.jar!/" />
-      <root url="jar://$MAVEN_REPOSITORY$/com/google/errorprone/error_prone_annotations/2.36.0/error_prone_annotations-2.36.0.jar!/" />
+      <root url="jar://$MAVEN_REPOSITORY$/com/google/code/gson/gson/2.13.1/gson-2.13.1.jar!/" />
+      <root url="jar://$MAVEN_REPOSITORY$/com/google/errorprone/error_prone_annotations/2.38.0/error_prone_annotations-2.38.0.jar!/" />
    </CLASSES>
    <JAVADOC />
    <SOURCES />
--- a/README.md
+++ b/README.md
@@ -61,6 +61,7 @@ What does Prog8 provide?
 - code often is smaller and faster than equivalent C code compiled with CC65 or even LLVM-MOS
 - modularity, symbol scoping, subroutines. No need for forward declarations.
 - various data types other than just bytes (16-bit words, floats, strings)
+- Structs and typed pointers
 - floating point math is supported on certain targets
 - access to most Kernal ROM routines as external subroutine definitions you can call normally
 - tight control over Zeropage usage
--- a/docs/source/comparing.rst
+++ b/docs/source/comparing.rst
@@ -46,6 +46,7 @@ Data types
 - maximum storage size for arrays is 256 bytes (512 for split word arrays) , the maximum number of elements in the array depends on the size of a single element value.
  you can use larger "arrays" via pointer indexing, see below at Pointers.  One way of obtaining a piece of memory to store
  such an "array" is by using  ``memory()`` builtin function.
+- there is limited support for structs and typed pointers, see below at "Pointers and Structs".


 Variables
@@ -76,10 +77,13 @@ Subroutines
  With only a little bit of code it is possible to implement a simple cooperative multitasking system that runs multiple tasks simultaneously. See the "multitasking" example,
  which uses the "coroutines" library.  Each task is a subroutine and it simply has its state stored in the statically allocated variables so it can resume after yielding, without doing anything special.

-Pointers
--------
- There is no specific pointer datatype.
-  However, variables of the ``uword`` datatype can be used as a pointer to one of the possible 65536 memory locations,
+Pointers and Structs
+--------------------
+
+Legacy 'untyped' pointers:
+
+- In Prog8 versions before 12.0 there was no support for typed pointers, only 'untyped' ones:
+  Variables of the ``uword`` datatype can be used as a pointer to one of the possible 65536 memory locations,
  so the value it points to is always a single byte. This is similar to ``uint8_t*`` from C.
  You have to deal with the uword manually if the object it points to is something different.
 - Note that there is the ``peekw`` builtin function that *does* allow you to directly obtain the *word* value at the given memory location.
@@ -88,6 +92,12 @@ Pointers
 - Pointers don't have to be a variable, you can immediately access the value of a given memory location using ``@($d020)`` for instance.
  Reading is done by assigning it to a variable, writing is done by just assigning the new value to it.

+Typed pointers and structs:
+
+- Since version 12, Prog8 supports struct types and typed pointers.
+- A struct groups one or more fields that together make up the struct type.
+- Typed pointers are just that: a pointer to a specific type (which can be a simple type such as float, or a struct type).
+

 Foreign function interface (external/ROM calls)
 -----------------------------------------------
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -96,6 +96,7 @@ Features
  and inline assembly to have full control when every register, cycle or byte matters
 - Variables are all allocated statically, no memory allocation overhead
 - Variable data types include signed and unsigned bytes and words, arrays, strings.
+- Structs and typed pointers
 - Tight control over Zeropage usage
 - Programs can be restarted after exiting (i.e. run them multiple times without having to reload everything), due to automatic variable (re)initializations.
 - Programs can be configured to execute in ROM
@@ -222,6 +223,7 @@ Look in the `syntax-files <https://github.com/irmen/prog8/tree/master/syntax-fil
    compiling.rst
    programming.rst
    variables.rst
+    structpointers.rst
    binlibrary.rst
    libraries.rst
    targetsystem.rst
--- a/docs/source/programming.rst
+++ b/docs/source/programming.rst
@@ -1061,7 +1061,7 @@ Reusing *virtual registers* R0-R15 for parameters

 Normally, every subroutine parameter will get its own local variable in the subroutine where the argument value
 will be stored when the subroutine is called. In certain situations, this may lead to many variables being allocated.
-You *can* instruct the compiler to not allocate a new variable, but instead to reuse one of the *virtual registers* R0-R15
+You *can* tell the compiler to not allocate a new variable, but instead to reuse one of the *virtual registers* R0-R15
 (accessible in the code as ``cx16.r0`` - ``cx16.r15``)  for the parameter. This is done by adding a ``@Rx`` tag
 to the parameter. This can only be done for booleans, byte, and word types.
 Note: the R0-R15 *virtual registers* are described in more detail below for the Assembly subroutines.
--- a/docs/source/structpointers.rst
+++ b/docs/source/structpointers.rst
@@ -0,0 +1,7 @@
+.. _pointers:
+
+====================
+Structs and Pointers
+====================
+
+Work in progress.
--- a/docs/source/technical.rst
+++ b/docs/source/technical.rst
@@ -134,7 +134,8 @@ Calling a subroutine requires three steps:

 #. preparing the arguments (if any) and passing them to the routine.
   Numeric types are passed by value (bytes, words, booleans, floats),
-   but array types and strings are passed by reference which means as ``uword`` being a pointer to their address in memory.
+   but array types are passed by reference, which means as a ``uword`` pointer to their address in memory.
+   Strings are passed as a pointer to a byte: ``^^ubyte``.
 #. calling the subroutine
 #. preparing the return value (if any) and returning that from the call.

--- a/docs/source/todo.rst
+++ b/docs/source/todo.rst
@@ -58,11 +58,10 @@ STRUCTS and TYPED POINTERS
 - DONE: allow  a.b.ptr[i].value  (equiv to a.b.ptr[i]^^.value)  expressions  (assignment target doesn't parse yet, see below)
 - DONE: check passing arrays to typed ptr sub-parameters.  NOTE: word array can only be a @nosplit array if the parameter type is ^^word, because the words need to be sequential in memory there
 - DONE: allow str assign to ^^ubyte without cast (take address)
+- write docs in structpointers.rst
 - fix support for (expression) array index dereferencing "barray[2]^^"   where barray is ^^bool[10]
 - fix support for (assigntarget) array index dereferencing "barray[2]^^"   where barray is ^^bool[10]
 - fix support for (assigntarget) array index dereferencing "array[2].value"   where array is struct pointers
- add unit tests for expected AST elements for all syntaxes dealing with pointers, dereference(chain), derefs, and indexing (both as value and assigntargets)
- add unit tests for all changes (pointers and structs)
 - try to fix parse error  l1^^.s[0] = 4242   (equivalent to l1.s[0]=4242 , which does parse correctly)
 - try to make sizeof(^^type) parse correctly (or maybe replace it immediately with sys.SIZEOF_POINTER)
 - add ?. null-propagation operator (for expression and assignment)?
@@ -71,8 +70,6 @@ STRUCTS and TYPED POINTERS
 - 6502 asm symbol name prefixing should work for dereferences too.
 - really fixing the pointer dereferencing issues (cursed hybrid between IdentifierReference, PtrDereferece and PtrIndexedDereference) may require getting rid of scoped identifiers altogether and treat '.' as a "scope or pointer following operator"
 - (later, nasty parser problem:) support chaining pointer dereference on function calls that return a pointer.  (type checking now fails on stuff like func().field and func().next.field)
- update syntax highlighting files
- write docs


 Future Things and Ideas
--- a/docs/source/variables.rst
+++ b/docs/source/variables.rst
@@ -5,6 +5,7 @@ Variables and Values
 ====================

 Because this is such a big subject, variables and values have their own chapter.
+Structs and pointers have their own separate chapter as well: :ref:`pointers`.


 Variables
@@ -349,7 +350,7 @@ a dynamic location in memory: currently this is equivalent to directly referenci
 memory at the given index. In contrast to a real array variable, the index value can be the size of a word.
 Unlike array variables, negative indexing for pointer variables does *not* mean it will be counting from the end, because the size of the buffer is unknown.
 Instead, it simply addresses memory that lies *before* the pointer variable.
-See also :ref:`pointervars`
+See also :ref:`pointervars` and the dedicated chapter :ref:`pointers`.

 **LSB/MSB split word and str arrays:**

@@ -572,13 +573,14 @@ without defining a memory-mapped location, you can do so by enclosing the addres
    @($d020) = 0      ; set the c64 screen border to black ("poke 53280,0")
    @(vic+$20) = 6    ; you can also use expressions to 'calculate' the address

-This is the official syntax to 'dereference a pointer' as it is often named in other languages.
 You can actually also use the array indexing notation for this. It will be silently converted into
 the direct memory access expression as explained above. Note that unlike regular arrays,
 the index is not limited to an ubyte value. You can use a full uword to index a pointer variable like this::

    pointervar[999] = 0     ; set memory byte to zero at location pointervar + 999.

+More information about *typed pointers* can be found in the chapter :ref:`pointers`.
+

 Converting/Casting types into other types
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
--- a/examples/test.p8
+++ b/examples/test.p8
@@ -1,56 +1,12 @@
-%import textio
-%import strings
-
-; Animal guessing game where the computer gets smarter every time.
-; Note: this program can be compiled for multiple target systems.
-
 main {

-    str userinput = "x"*80
-
    sub start() {
-        db.init()
-        txt.print_uw(db.first)
-        txt.nl()
-        cx16.r0 = db.first
+        struct Node {
+            ^^uword s
+        }

-        cx16.r1 = db.first.negative
-        cx16.r0 = db.first.negative.animal
-        txt.print_uw(db.first.negative)
-        txt.nl()
-        txt.print(db.first.negative.animal)
-        txt.nl()
-        txt.print(db.first.positive.animal)
-        txt.nl()
-    }
-}
-
-db {
-    struct Node {
-        str question
-        str animal
-        ^^Node negative
-        ^^Node positive
-    }
-
-    ^^Node first
-
-    sub init() {
-        first = Node("does it swim", 0, 0, 0)
-        ^^Node eagle = Node(0, "eagle", 0, 0)
-        ^^Node dolpin = Node(0, "dolpin", 0, 0)
-        first.negative = eagle
-        first.positive = dolpin
-    }
-}
-
-arena {
-    ; extremely trivial arena allocator (that never frees)
-    uword buffer = memory("arena", 10000, 0)
-    uword next = buffer
-
-    sub alloc(ubyte size) -> uword {
-        defer next += size
-        return next
+        ^^Node l1
+
+        l1^^.s[0] = 4242
    }
 }
--- a/syntax-files/NotepadPlusPlus/Prog8.xml
+++ b/syntax-files/NotepadPlusPlus/Prog8.xml
@@ -13,7 +13,7 @@
            <Keywords name="Numbers, suffix1"></Keywords>
            <Keywords name="Numbers, suffix2"></Keywords>
            <Keywords name="Numbers, range"></Keywords>
-            <Keywords name="Operators1">@ &amp; &amp;&lt; &amp;&gt; | ^ ~ &gt;&gt; &lt;&lt; += -= *= = **= &amp;= |= ^= &lt;&lt;= &gt;&gt;= -&gt; = == != &lt; &gt; &lt;= &gt;= , + ++ - -- * ** / ( ) [ ]</Keywords>
+            <Keywords name="Operators1">@ &amp; &amp;&lt; &amp;&gt; | ^ ~ &gt;&gt; &lt;&lt; += -= *= = **= &amp;= |= ^= &lt;&lt;= &gt;&gt;= -&gt; = == != &lt; &gt; &lt;= &gt;= , + ++ - -- * ** ^^ / ( ) [ ]</Keywords>
            <Keywords name="Operators2"></Keywords>
            <Keywords name="Folders in code1, open">{ {{</Keywords>
            <Keywords name="Folders in code1, middle"></Keywords>
--- a/syntax-files/NotepadPlusPlus/Prog8dark.xml
+++ b/syntax-files/NotepadPlusPlus/Prog8dark.xml
@@ -13,7 +13,7 @@
            <Keywords name="Numbers, suffix1"></Keywords>
            <Keywords name="Numbers, suffix2"></Keywords>
            <Keywords name="Numbers, range"></Keywords>
-            <Keywords name="Operators1">@ &amp; &amp;&lt; &amp;&gt; | ^ ~ &gt;&gt; &lt;&lt; += -= *= = **= &amp;= |= ^= &lt;&lt;= &gt;&gt;= -&gt; = == != &lt; &gt; &lt;= &gt;= , + ++ - -- * ** / ( ) [ ]</Keywords>
+            <Keywords name="Operators1">@ &amp; &amp;&lt; &amp;&gt; | ^ ~ &gt;&gt; &lt;&lt; += -= *= = **= &amp;= |= ^= &lt;&lt;= &gt;&gt;= -&gt; = == != &lt; &gt; &lt;= &gt;= , + ++ - -- * ** ^^ / ( ) [ ]</Keywords>
            <Keywords name="Operators2"></Keywords>
            <Keywords name="Folders in code1, open">{ {{</Keywords>
            <Keywords name="Folders in code1, middle"></Keywords>
--- a/syntax-files/SublimeText/Prog8.sublime-syntax
+++ b/syntax-files/SublimeText/Prog8.sublime-syntax
@@ -151,6 +151,8 @@ contexts:
  storage:
    - match: (\b(ubyte|byte|word|uword|long|float|str|struct)\b)
      scope: storage.type.prog8
+    - match: (\^\^)
+      scope: storage.modifier.prog8
    - match: (\b(const)\b)
      scope: storage.modifier.prog8
  support:
--- a/syntax-files/Vim/prog8.vim
+++ b/syntax-files/Vim/prog8.vim
@@ -38,7 +38,7 @@ syn match prog8Directive "\(^\|\s\)%\(zpreserved\|zpallowed\|address\|encoding\|
 syn match prog8Directive "\(^\|\s\)%\(align\|asmbinary\|asminclude\|breakpoint\)\>"
 syn match prog8Directive "\(^\|\s\)%\(asm\|ir\)\>"

-syn match prog8Type "\<\%(u\?byte\|u\?word\|float\|str\|bool\|long\)\>"
+syn match prog8Type "\<\%(u\?byte\|u\?word\|float\|str\|bool\|long\)\>\|\^\^"
 syn region prog8ArrayType matchgroup=prog8Type
            \ start="\<\%(u\?byte\|u\?word\|float\|str\|bool\)\[" end="\]"
            \ transparent