Added explanation of memory architecture.

Bobbi Webber-Manners 2018-05-01 13:47:19 -04:00 committed by GitHub
parent d0b65a8451
commit 8c15345efd
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23


@ -716,17 +716,50 @@ The call stack is used for all memory allocation within the virtual machine, as
### VM Memory Organization
The call stack grows down from top of memory. The evaluation stack is fixed in low memory (32 bytes). In an optimized virtual machine implementation, this would be placed in zero page.
cc65 places the VM executable code and the static evaluation stack (32 bytes) in low memory. In an optimized virtual machine implementation, the evaluation stack would be placed in zero page.
## Compiler Internals
Virtual machine addresses correspond to physical machine addresses on 6502 systems.
### Relationship of Compiler and Interpreter
Under Linux, the virtual machine uses a 64K byte array as workspace, and addresses point into this space.
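As a rough illustration of the Linux arrangement, the workspace can be pictured as a plain byte array indexed by 16-bit VM addresses. This is a minimal sketch, not code from the EightBall sources: the names `vm_mem`, `vm_read_byte`, `vm_write_byte`, `vm_read_word` and `vm_write_word` are hypothetical, and the little-endian word layout is an assumption made to match the 6502.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the text only states that a 64K byte array is used. */
#define VM_MEM_SIZE 65536UL

static uint8_t vm_mem[VM_MEM_SIZE];

/* Any 16-bit VM address is a valid index into the 64K workspace. */
static uint8_t vm_read_byte(uint16_t addr)
{
    return vm_mem[addr];
}

static void vm_write_byte(uint16_t addr, uint8_t value)
{
    vm_mem[addr] = value;
}

/* Assumption: words are stored low byte first, as on the 6502. */
static uint16_t vm_read_word(uint16_t addr)
{
    assert(addr < 0xFFFF); /* the high byte must stay inside the workspace */
    return (uint16_t)(vm_mem[addr] | (vm_mem[addr + 1] << 8));
}

static void vm_write_word(uint16_t addr, uint16_t value)
{
    assert(addr < 0xFFFF); /* the high byte must stay inside the workspace */
    vm_mem[addr] = (uint8_t)(value & 0xFF);
    vm_mem[addr + 1] = (uint8_t)(value >> 8);
}
```

On a 6502 target no such array is needed, because a VM address can be used directly as a machine address.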
...
The call stack grows down from top of memory.
The bytecode is loaded at the start of memory. This location differs depending on the platform:
- Apple II - 0x5000
- Commodore 64 - 0x3000
- Commodore VIC20 - 0x4000
These addresses are chosen to allow space for the EightBall VM executable, which loads below them. Once again, these values can be tuned by inspecting the map files generated by cc65.
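The platform-specific load addresses listed above can be captured in a small lookup function. This is only a sketch: the `enum platform` type and the `bytecode_load_addr` function are hypothetical names introduced for illustration, and the real sources may well select the address at build time instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical platform tags, used only for this illustration. */
enum platform { PLATFORM_APPLE2, PLATFORM_C64, PLATFORM_VIC20 };

/* Return the address at which bytecode is loaded on each platform. */
static uint16_t bytecode_load_addr(enum platform p)
{
    switch (p) {
    case PLATFORM_APPLE2: return 0x5000;
    case PLATFORM_C64:    return 0x3000;
    case PLATFORM_VIC20:  return 0x4000;
    }
    assert(!"unknown platform");
    return 0;
}
```

If the VM executable grows past one of these addresses, the corresponding value has to be raised after checking the cc65 map file.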
## Interpreter / Compiler Internals
### Relationship of Interpreter / Compiler
EightBall was first implemented as an interpreted language, although the language design was always intended to permit compilation. The bytecode compiler and virtual machine were added in v0.5 in April 2018.
To keep the code as small as possible, the compiler reuses the interpreter's data structures, but uses them in a different way.
### Interpreter Memory Organization
cc65 places the executable code of the EightBall line editor / interpreter / compiler in low memory.
The source code of the program is stored as plain ASCII (or PETSCII on Commodore systems) text in a buffer immediately above the EightBall executable code. (This is HEAP2 in `eightball.c`.) Note that the lower bound of this buffer has to be adjusted by hand in `eightball.c` whenever the code changes size. The size of the code segments generated by cc65 can be determined by inspecting the map file created by the compiler.
Global and local variables are allocated from the highest available memory address down. (This is HEAP1 in `eightball.c`.) For each variable a small `var_t` header is stored, consisting of the first four characters of the name and a byte which records whether it is a `byte` or `word` variable and also the number of dimensions. If the number of dimensions is zero, the variable is a scalar; otherwise it is an array of the specified number of elements. The `var_t` header also includes a two byte pointer to the next entry, allowing the entries to be assembled into a linked list. Following the `var_t` header the actual variable data is stored:
- One byte for a `byte` scalar
- Two bytes for a `word` scalar
- `sz` bytes for a `byte[sz]` array
- `2*sz` bytes for a `word[sz]` array
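The header and payload sizes described above can be sketched in C as follows. This is an illustration under stated assumptions rather than the actual `var_t` definition: `struct var_hdr`, `TYPE_BYTE`, `TYPE_WORD` and `payload_size` are hypothetical names, the type and element count are kept in separate fields for readability (the text says a single byte records both), and a native pointer stands in for the two byte pointer used on the 6502.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes, not taken from eightball.c. */
enum { TYPE_BYTE = 1, TYPE_WORD = 2 };

/* Stand-in for var_t; the field layout is an assumption. */
struct var_hdr {
    char name[4];          /* first four characters of the variable name */
    uint8_t type;          /* TYPE_BYTE or TYPE_WORD */
    uint16_t count;        /* 0 for a scalar, else the number of elements */
    struct var_hdr *next;  /* next entry in the linked list */
};

/* Number of data bytes stored immediately after the header. */
static size_t payload_size(uint8_t type, uint16_t count)
{
    size_t elem = (type == TYPE_WORD) ? 2 : 1;

    assert(type == TYPE_BYTE || type == TYPE_WORD);
    return elem * (count == 0 ? 1 : count);
}
```

For example, `payload_size(TYPE_WORD, 10)` gives the `2*sz` bytes needed for a `word[10]` array, while a count of zero gives the one or two bytes of a scalar.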
When entering a subroutine, a special `var_t` entry is made for a `word` variable using the otherwise illegal name `"----"` to mark the stack frame. The value of this variable is actually a pointer into the call stack which is used to unwind the stack when a subroutine exits.
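One way to picture the frame marker is as a sentinel in the linked list of variables. The sketch below assumes that, on subroutine exit, every entry down to and including the nearest `"----"` marker is discarded. `struct entry`, `is_frame_marker` and `pop_frame` are hypothetical names, and the sketch leaves out the call stack pointer held in the marker's value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a var_t entry (hypothetical layout). */
struct entry {
    char name[4];        /* four name characters, not NUL-terminated */
    struct entry *next;  /* next older entry in the list */
};

/* A frame marker uses the otherwise illegal name "----". */
static int is_frame_marker(const struct entry *e)
{
    return memcmp(e->name, "----", 4) == 0;
}

/* Skip the locals of the current frame and its marker, returning the
 * entry that was newest before the subroutine was entered. */
static struct entry *pop_frame(struct entry *head)
{
    while (head != NULL && !is_frame_marker(head)) {
        head = head->next;
    }
    assert(head != NULL); /* an exit must match an earlier entry */
    return head->next;
}
```

Because `"----"` is not a legal variable name, a marker can never be confused with a user variable during this walk.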
### Compiler Memory Organization
...
The compiler shares most of its infrastructure with the interpreter. The source code of the program is still stored in HEAP2.
The main difference is that instead of storing global and local variables in HEAP1, the compiler uses the `var_t` data structures to keep track of the variables during compilation only - they serve as temporary symbol tables so the compiler can keep track of the addresses of all the variables in scope. Instead of the payload described above, the entries created by the compiler contain the address of the variable in the virtual machine's address space.
The compiled bytecode is written to the beginning of HEAP1, starting from the lowest address and working up. Since no actual data is stored in HEAP1 when compiling (only `var_t` headers and addresses), the hope is that there will be enough space for the compiled code without it colliding with the symbol tables!
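The two-ended use of HEAP1 during compilation can be sketched as a pair of cursors that move towards each other. This is a hypothetical illustration: `struct heap1`, `heap1_init`, `emit_bytes` and `alloc_symbol` are invented names, and the text does not say whether `eightball.c` performs an explicit collision check like the one shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the shared region. */
struct heap1 {
    uint16_t code_top;    /* next free byte for bytecode (grows up) */
    uint16_t sym_bottom;  /* lowest byte used by symbol entries (grows down) */
};

/* lo is the first usable address, hi is one past the last usable address. */
static void heap1_init(struct heap1 *h, uint16_t lo, uint16_t hi)
{
    assert(lo <= hi);
    h->code_top = lo;
    h->sym_bottom = hi;
}

/* Reserve n bytes for bytecode; returns 0 if that would hit the symbols. */
static int emit_bytes(struct heap1 *h, uint16_t n)
{
    if ((uint16_t)(h->sym_bottom - h->code_top) < n) {
        return 0;
    }
    h->code_top = (uint16_t)(h->code_top + n);
    return 1;
}

/* Reserve n bytes for a symbol entry; returns 0 if that would hit the code. */
static int alloc_symbol(struct heap1 *h, uint16_t n)
{
    if ((uint16_t)(h->sym_bottom - h->code_top) < n) {
        return 0;
    }
    h->sym_bottom = (uint16_t)(h->sym_bottom - n);
    return 1;
}
```

The region is exhausted when `code_top` equals `sym_bottom`, so a check of this kind would turn a silent overwrite into a reportable out-of-memory condition.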
# Code Examples