Update README.md

David Schmenk 2014-04-29 19:56:10 -07:00
parent 1c96bcf3c8
commit 9196c0e71a


@ -1,15 +1,15 @@
###PLASMA
#PLASMA
##Introduction
PLASMA is a combination of virtual machine and assembler/compiler matched closely to the 6502 architecture. It is an attempt to satisfy a few challenges surrounding code size, efficient execution, small runtime and fast just-in-time compilation. By architecting a unique bytecode that maps nearly one-to-one to the higher level representation, the compiler/assembler can be very simple and execute quickly on the Apple II for a self-hosted environment. A modular approach provides for incremental development and code reuse. Different projects have led to the architecture of PLASMA, most notably Apple Pascal, FORTH, and my own Java VM for the 6502, VM02. Each has tried to map a generic VM to the 6502 with varying levels of success. Apple Pascal, based on UCSD Pascal and its p-code interpreter, was a very powerful system and ran fast enough on the Apple II to be interactive but didn't win any speed contests. FORTH was the poster child for efficiency and obtuse syntax. Commonly referred to as a write-only language, it was difficult to come up to speed on as a developer, especially when using others' code. My own project in creating a Java VM for the Apple II uncovered the folly of shoehorning a large system into something never intended to run 32 bit applications.
#Low Level Implementation
##Low Level Implementation
Both the Pascal and Java VMs used a bytecode to hide the underlying CPU architecture and offer platform agnostic application execution. The application and tool chains were easily moved from platform to platform by simply writing a bytecode interpreter and small runtime to translate the higher level constructs to the underlying hardware. The performance of the system was dependent on the actual hardware and efficiency of the interpreter. Just-in-time compilation wasn't really an option on small, 8 bit systems. FORTH, on the other hand, was usually implemented as a threaded interpreter. A threaded interpreter will use the address of functions to call as the code stream instead of a bytecode, eliminating one level of indirection with a slight increase in code size. The threaded approach can be made faster at the expense of another slight increase in size by inserting an actual Jump SubRoutine opcode before each address, thus removing the interpreter's inner loop altogether.
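The difference between the two dispatch styles can be sketched in a few lines of Python. This is only an illustration, not PLASMA's or FORTH's actual interpreter: the three routines and the `OPCODE_TABLE` name are hypothetical. The bytecode loop looks each opcode up in a table before calling it, while the threaded loop holds the routine references directly and skips that lookup. On the 6502 each of those references would be a two byte address rather than a one byte opcode, which is where the size increase comes from.

```python
# Hypothetical stack routines, used only to contrast the two dispatch styles.
def op_two(stack):
    stack.append(2)

def op_three(stack):
    stack.append(3)

def op_add(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

# Bytecode: each byte is an index into a table of routines.
OPCODE_TABLE = [op_two, op_three, op_add]

def run_bytecode(code):
    stack = []
    for opcode in code:
        OPCODE_TABLE[opcode](stack)  # extra level of indirection
    return stack

# Threaded code: the code stream is the list of routines itself.
def run_threaded(code):
    stack = []
    for routine in code:
        routine(stack)  # no table lookup
    return stack

bytecode = bytes([0, 1, 2])
threaded = [op_two, op_three, op_add]
```

Both `run_bytecode(bytecode)` and `run_threaded(threaded)` leave the same single value on the stack; only the shape of the code stream differs.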
All three systems were implemented using stack architecture. Pascal and Java were meant to be compiled high level languages, using a stack machine as a simple compilation target. FORTH was meant to be written directly as a stack oriented language, similar to RPN on HP calculators. The 6502 is a challenging target due to its unusual architecture so writing a bytecode interpreter for Pascal and Java results in some inefficiencies and limitations. FORTH's inner interpreter loop on the 6502 tends to be less efficient than on most other CPUs. Another difference is how each system creates and manipulates its stack. Pascal and Java use the 6502 hardware stack for all stack operations. Unfortunately the 6502 stack is hard-limited to 256 bytes. However, in normal usage this isn't too much of a problem as the compilers don't put undue pressure on the stack size by keeping most values in global or local variables. FORTH creates a small stack using a portion of the 6502's zero page, a 256 byte area of low memory that can be accessed with only a byte address and indexed using either of the X or Y registers. With zero page, the X register can be used as an indexed, indirect address and the Y register can be used as an indirect, indexed address.
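The two zero page addressing modes mentioned above differ in when the index is applied. A minimal Python model of the effective address calculation follows; the function names and the sample memory contents are made up for illustration. With indexed indirect, X is added to the zero page address before the pointer is read. With indirect indexed, the pointer is read first and Y is added to the result.

```python
def read_pointer(memory, zp_addr):
    """Read a little-endian 16 bit pointer from zero page, wrapping within the page."""
    lo = memory[zp_addr & 0xFF]
    hi = memory[(zp_addr + 1) & 0xFF]
    return lo | (hi << 8)

def indexed_indirect(memory, zp_addr, x):
    """(zp,X): add X to the zero page address, then fetch the pointer stored there."""
    return read_pointer(memory, zp_addr + x)

def indirect_indexed(memory, zp_addr, y):
    """(zp),Y: fetch the pointer at the zero page address, then add Y to it."""
    return (read_pointer(memory, zp_addr) + y) & 0xFFFF

# Sample memory image: the pointer $2000 is stored at $24/$25.
memory = bytearray(0x10000)
memory[0x24] = 0x00
memory[0x25] = 0x20
```

Here `indexed_indirect(memory, 0x20, 4)` selects the pointer at `$24` and yields `$2000`, while `indirect_indexed(memory, 0x24, 0x10)` reads the same pointer and then offsets it to `$2010`.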
#A New Approach
##A New Approach
PLASMA takes an approach that uses the best of all the above implementations to create a unique, powerful and efficient platform for developing new applications on the Apple II. One goal was to create a very small VM runtime, bytecode interpreter, and module loader that could adjust the code size vs. performance optimizations to allow for interpreted code, threaded code, or efficiently compiled native code. The decision was made early on to implement a stack based architecture duplicating the approach taken by FORTH. Space in the zero page would be assigned to a 16 bit, 32 element evaluation stack, indexed by the X register. The stack is purposely not split between low and high values so as to allow reading and writing addresses stored directly on the stack. The trade-off is that the stack pointer has to be incremented and decremented by two for every push/pop operation. A simple compiler was written so that higher level constructs could be used and global/local variables would hold values instead of using clever stack manipulation. Function/procedure frames would allow for local variables, but with a limitation - the frame could be no larger than 256 bytes. By enforcing this limitation, the function frame could easily be accessed through a frame pointer value in zero page, indexed by the Y register. The call stack uses the 6502's hardware stack resulting in the same 256 byte limitation imposed by the hardware. However, this limitation could be lifted by extending the call sequence to save and restore the return address in the function frame. This was not done initially for performance reasons and simplicity of implementation. One of the goals of PLASMA was to allow for intermixing of functions implemented as bytecode, threaded code, or native code. Taking a page from the FORTH playbook, a function call is implemented as a native subroutine call to an address.
If the function is in bytecode, the first thing it does is call back into the interpreter to execute the following bytecode. Functions can be selectively expanded as bytecode, threaded code, or natively compiled, all at load time. Threaded code expands to about 3X the size of bytecode with about 3X the performance. Native code is about 5X-10X the size with a significant improvement in performance. The native code compiler uses a strategy of caching the Top-Of-Stack value in the Y and A registers of the 6502. The compiler also tracks the TOS pointer and adjusts the address accordingly to avoid actual manipulation of the X register. The X register in effect becomes the frame pointer for the evaluation stack. Function call parameters are pushed onto the evaluation stack in the order they are written. The first operation inside of the function call is to pull the parameters off the evaluation stack and put them in local frame storage. Function callers and callees must agree on the number of parameters to avoid stack underflow/overflow. All functions return a value on the evaluation stack whether or not it is used. Lastly, PLASMA is not a typed language. Just like assembly, any value can represent a character, integer, or address. It's the programmer's job to know the type. Only bytes and words are known to PLASMA. Bytes are unsigned 8 bit quantities, words are signed 16 bit quantities. All stack operations involve 16 bits of precision.
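The evaluation stack layout described above can be modeled roughly as follows. This is a sketch under stated assumptions, not the actual VM: the base address `ESTK_BASE`, the downward growth direction, and all names are hypothetical. What it does take from the description is the 32 element, 16 bit stack in zero page, the low and high bytes kept adjacent rather than split, the index moving by two per push/pop, and the byte/word interpretation of untyped values.

```python
ESTK_BASE = 0xC0       # hypothetical zero page location of the stack
ESTK_SIZE = 32 * 2     # 32 elements, 2 bytes each

class EvalStack:
    def __init__(self):
        self.zero_page = bytearray(256)
        self.x = ESTK_SIZE          # assumption: empty stack, grows downward

    def push(self, value):
        if self.x == 0:
            raise OverflowError("evaluation stack overflow")
        self.x -= 2                 # index moves by two per element
        value &= 0xFFFF
        # Low and high bytes sit next to each other (stack is not split).
        self.zero_page[ESTK_BASE + self.x] = value & 0xFF
        self.zero_page[ESTK_BASE + self.x + 1] = value >> 8

    def pop(self):
        if self.x == ESTK_SIZE:
            raise IndexError("evaluation stack underflow")
        lo = self.zero_page[ESTK_BASE + self.x]
        hi = self.zero_page[ESTK_BASE + self.x + 1]
        self.x += 2
        return lo | (hi << 8)

def as_word(value):
    """Interpret a 16 bit stack value as a signed word."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def as_byte(value):
    """Interpret a stack value as an unsigned byte."""
    return value & 0xFF
```

Because each entry is a contiguous little-endian pair, an address left on the stack can be used directly as a zero page pointer, which is the reason given above for not splitting the stack.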
@ -99,7 +99,7 @@ Hexadecimal constants are preceded with a $ to identify them as such.
$C030 ; Speaker address
```
#Constants, Variables and Functions
###Constants, Variables and Functions
The source code of a PLASMA module first defines constants, variables and data. Constants must be initialized with a value. Variables can have sizes associated with them to declare storage space. Data can be declared with or without a variable name associated with it. Arrays, tables, strings and any predeclared data can be created and accessed in multiple ways.
@ -192,36 +192,36 @@ Expressions are algebraic. Data is free-form, but all operations on the evaluat
More complex expressions can be built up using algebraic unary and binary operations.
|OP | Unary Operation
| OP | Unary Operation
|--------------------------
|^ | byte pointer
|* | word pointer
|@ | address of
|- | negate
|# | bitwise complement
|! | logical NOT
| ^ | byte pointer
| * | word pointer
| @ | address of
| - | negate
| # | bitwise complement
| ! | logical NOT
|OP | Binary Operation
| OP | Binary Operation
|---------------------------
|* | multiply
|/ | divide
|% | modulo
|+ | add
|- | subtract
|<< | shift left
|>> | shift right
|& | bitwise AND
|| | bitwise OR
|^ | bitwise XOR
|== | equals
|<> | not equal
|>= | greater than or equal
|> | greater than
|<= | less than or equal
|< | less than
|OR | logical OR
|AND | logical AND
| * | multiply
| / | divide
| % | modulo
| + | add
| - | subtract
| << | shift left
| >> | shift right
| & | bitwise AND
| | | bitwise OR
| ^ | bitwise XOR
| == | equals
| <> | not equal
| >= | greater than or equal
| > | greater than
| <= | less than or equal
| < | less than
| OR | logical OR
| AND | logical AND
Statements are built up from expressions and control flow keywords. Simplicity of syntax took precedence over flexibility and complexity. The simplest statement is the basic assignment using =.
@ -309,7 +309,7 @@ Lastly, the repeat/until statement will continue looping as long as the until ex
until txtbuf == 0 or numlines == maxlines
```
#Runtime
###Runtime
PLASMA includes a very minimal runtime that nevertheless provides a great deal of functionality to the system. Two system calls are provided to access native 6502 routines (usually in ROM) and ProDOS.
@ -350,7 +350,7 @@ memset(val16, addr, len) will fill memory with a 16 bit value. memcpy(srcaddr,
memcpy(strptr + ofst + 1, scrnptr, numchars)
```
#Implementation Details
###Implementation Details
The original design concept was to create an efficient, flexible, and expressive environment for building applications directly on the Apple II. Choosing a stack based architecture was easy after much experience with other stack based implementations. It also makes the compiler simple to implement. The first take on the stack architecture was to make it a very strict stack architecture in that everything had to be on the stack. The only opcode with operands was the CONSTANT opcode. This allowed for a very small bytecode interpreter and a very easy compile target. After that, an opcode with operands was added only when it would greatly improve performance, native code generation or code size. The opcode table grew slowly over time but still retains a small runtime interpreter with good native code density.
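The trade-off behind adding opcodes with operands can be illustrated with a toy interpreter. Everything here is hypothetical: the opcode numbers, the `LOAD` and `LOAD_LOCAL` names, and the one byte operand width are assumptions for the sketch, not PLASMA's real opcode table. In the strict form only `CONSTANT` carries an operand, so reading a local takes two opcodes. A fused opcode with its own operand does the same work in one dispatch and fewer bytes.

```python
# Hypothetical opcode numbers, not PLASMA's actual encoding.
CONSTANT, LOAD, ADD, LOAD_LOCAL = range(4)

def run(code, frame):
    """Execute code against a frame; return (result, number of dispatches)."""
    stack = []
    pc = 0
    dispatches = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        dispatches += 1
        if op == CONSTANT:            # operand: one byte constant (assumed width)
            stack.append(code[pc])
            pc += 1
        elif op == LOAD:              # strict form: frame offset comes from the stack
            stack.append(frame[stack.pop()])
        elif op == LOAD_LOCAL:        # fused form: frame offset is an operand
            stack.append(frame[code[pc]])
            pc += 1
        elif op == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop(), dispatches

frame = [10, 32]
strict = bytes([CONSTANT, 0, LOAD, CONSTANT, 1, LOAD, ADD])
fused = bytes([LOAD_LOCAL, 0, LOAD_LOCAL, 1, ADD])
```

Both sequences add the two locals, but the strict one is 7 bytes and 5 dispatches while the fused one is 5 bytes and 3 dispatches, at the cost of one more case in the interpreter.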