diff --git a/HISTORY.markdown b/HISTORY.markdown index 316e86a..fefecd8 100644 --- a/HISTORY.markdown +++ b/HISTORY.markdown @@ -1,6 +1,16 @@ History of SixtyPical ===================== +0.8 +--- + +* Explicit word literals prefixed with `word` token. +* Can `copy` literals into user-defined destinations. +* Fixed bug where loop variable wasn't being checked at end of `repeat` loop. +* `buffer` and `pointer` types. +* `copy ^` syntax to load the addr of a buffer into a pointer. +* `copy []+y` syntax to read and write values to and from memory through a pointer. + 0.7 --- diff --git a/README.markdown b/README.markdown index 7f58c06..c356b91 100644 --- a/README.markdown +++ b/README.markdown @@ -24,14 +24,14 @@ programs to 6502 machine code. It is a **work in progress**, currently at the **proof-of-concept** stage. -The current released version of SixtyPical is 0.7. +The current development version of SixtyPical is 0.8. Documentation ------------- -* Design Goals — coming soon. +* [Design Goals](doc/Design%20Goals.md) * [SixtyPical specification](doc/SixtyPical.md) -* [SixtyPical history](HISTORY.md) +* [SixtyPical revision history](HISTORY.md) * [Literate test suite for SixtyPical syntax](tests/SixtyPical%20Syntax.md) * [Literate test suite for SixtyPical execution](tests/SixtyPical%20Execution.md) * [Literate test suite for SixtyPical analysis](tests/SixtyPical%20Analysis.md) @@ -41,15 +41,31 @@ Documentation TODO ---- -* `word table` type. -* `vector table` type. -* zero-page memory locations. -* indirect addressing. -* `low` and `high` address operators (turn `word` type into `byte`.) Possibly. -* save registers on stack or in memory (this preserves them = not trashed) +### Add 16 bit values. -At some point... +I guess this means making `add` a bit more like `copy`. +And then: add to pointer. (Not necessarily range-checked yet though.) + +And then write a little demo "game" where you can move a block around the screen with +the joystick. 
+ +### `word table` and `vector table` types + +### `low` and `high` address operators + +To turn `word` type into `byte`. + +### save registers on stack + +This preserves them, so semantically, they can be used even though they +are trashed inside the block. + +### And at some point... + +* `copy x, [ptr] + y` +* Maybe even `copy [ptra] + y, [ptrb] + y`, which can be compiled to indirect LDA then indirect STA! +* Check that the buffer being read or written to through a pointer appears in the appropriate inputs or outputs set. * initialized `byte table` memory locations * always analyze before executing or compiling, unless told not to * `trash` instruction. diff --git a/doc/Design Goals.md b/doc/Design Goals.md new file mode 100644 index 0000000..874a44c --- /dev/null +++ b/doc/Design Goals.md @@ -0,0 +1,52 @@ +Design Goals for SixtyPical +=========================== + +(draft) + +The intent of SixtyPical is to have a very low-level language that +benefits from abstract interpretation. + +"Very low-level" means, on a comparable level of abstraction as +assembly language. + +In the original vision for SixtyPical, SixtyPical instructions mapped +nearly 1:1 to 6502 instructions. However, many times when programming +in 6502 you're using idioms (e.g. adding a 16-bit constant to a 16-bit +value stored in 2 bytes) and it's just massively easier to analyze such +actions when they are represented by a single instruction. + +So SixtyPical instructions are similar to, inspired by, and have +analogous restrictions as 6502 instructions, but in many ways, they +are more abstract. For example, `copy`. + +The intent is that programming in SixtyPical is a lot like programming +in 6502 assembler, but it's harder to make a stupid error that you have +to spend a lot of time debugging. + +The intent is not to make it absolutely impossible to make such errors, +just harder. + +### Some Background ### + +The ideas in SixtyPical came from a couple of places. 
+ +One major impetus was when I was working on [Shelta][], trying to cram +all that code for that compiler into 512 bytes. This involved looking +at the x86 registers and thinking hard about which ones were preserved +when (and which ones weren't) and making the best use of that. And +while doing that, one thing that came to mind was: I Bet The Assembler +Could Track This. + +Another influence was around 2007 when "Typed Assembly Language" (and +"Proof Carrying Code") were all the rage. I haven't heard about them +in a while, so I guess they turned out to be research fads? But for a +while there, it was all Necula, Necula, Necula. Anyway, I remember at +the time looking into TAL and expecting to find something that matched +the impression I had pre-formulated about what a "Typed Assembly" +might be like. And finding that it didn't match my vision very well. + +I don't actually remember what TAL seemed like to me at the time, but +what I had in mind was more like SixtyPical. + +(I'll also write something about abstract interpretation here at some +point, hopefully.) diff --git a/doc/SixtyPical.md b/doc/SixtyPical.md index a118a73..e85c778 100644 --- a/doc/SixtyPical.md +++ b/doc/SixtyPical.md @@ -1,7 +1,7 @@ SixtyPical ========== -This document describes the SixtyPical programming language version 0.7, +This document describes the SixtyPical programming language version 0.8, both its execution aspect and its static analysis aspect (even though these are, technically speaking, separate concepts.) @@ -14,20 +14,26 @@ the language. 
Types ----- -There are five TYPES in SixtyPical: +There are six *primitive types* in SixtyPical: * bit (2 possible values) * byte (256 possible values) -* byte table (256 entries, each holding a byte) +* word (65536 possible values) * routine (code stored somewhere in memory, read-only) * vector (address of a routine) +* pointer (address of a byte in a buffer) + +There are also two *type constructors*: + +* X table (256 entries, each holding a value of type X, where X is `byte`) +* buffer[N] (N entries; each entry is a byte; N is a power of 2, ≤ 64K) Memory locations ---------------- -A primary concept in SixtyPical is the MEMORY LOCATION. At any given point -in time during execution, each memory location is either UNINITIALIZED or -INITIALIZED. At any given point in the program text, too, each memory +A primary concept in SixtyPical is the *memory location*. At any given point +in time during execution, each memory location is either *uninitialized* or +*initialized*. At any given point in the program text, too, each memory location is either uninitialized or initialized. Where-ever it is one or the other during execution, it is the same in the corresponding place in the program text; thus, it is a static property. @@ -64,17 +70,27 @@ They come in bit and byte types. There are two bit constants, off on -and two-hundred and fifty-six byte constants, +two hundred and fifty-six byte constants, 0 1 ... 255 +and sixty-five thousand five hundred and thirty-six word constants, + + word 0 + word 1 + ... + word 65535 + +Note that if a word constant is between 256 and 65535, the leading `word` +token can be omitted. + ### User-defined ### There may be any number of user-defined memory locations. They are defined -by giving the type, which must be `byte`, `byte table`, or `vector`, and the +by giving the type (which may be any type except `bit` and `routine`) and the name. byte pos @@ -88,10 +104,10 @@ case, an explicit address in memory cannot be given. 
byte pos : 0 -A user-defined vector memory location is decorated with READS and WRITES lists -like a routine (see below), and it may only hold addresses of routines which -are compatible. (Meaning, the routine's inputs (resp. outputs, trashes) -must be a subset of the vector's inputs (resp. outputs, trashes.)) +A user-defined vector memory location is decorated with `inputs`, `outputs` +and `trashes` lists like a routine (see below), and it may only hold addresses +of routines which are compatible. (Meaning, the routine's inputs (resp. outputs, +trashes) must be a subset of the vector's inputs (resp. outputs, trashes.)) vector actor_logic inputs a, score @@ -99,13 +115,54 @@ must be a subset of the vector's inputs (resp. outputs, trashes.)) trashes y @ $c000 +Note that in the code of a routine, if a memory location is named by a +user-defined symbol, it is an address in memory, and can be read and written. +But if it is named by a literal integer, either decimal or hexadecimal, it +is a constant and can only be read (and when read always yields that constant +value.) So, for instance, to read the value at `screen` above, in the code, +you would need to reference the symbol `screen`; attempting to read 1024 +would not work. + +This is actually useful, at least at this point, as you can rely on the fact +that literal integers in the code are always immediate values. (But this +may change at some point.) + +### Buffers and Pointers ### + +Roughly speaking, a `buffer` is a table that can be longer than 256 bytes, +and a `pointer` is an address within a buffer. + +A `pointer` is implemented as a zero-page memory location, and accessing the +buffer pointed to is implemented with "indirect indexed" addressing, as in + + LDA ($02), Y + STA ($02), Y + +There are extended modes of `copy` for using these types of memory location. 
+See `copy` below, but here is some illustrative example code: + + copy ^buf, ptr // this is the only way to initialize a pointer + add ptr, 4 // ok, but only if it does not exceed buffer's size + ld y, 0 // you must set this to something yourself + copy [ptr] + y, byt // read memory through pointer, into byte + copy 100, [ptr] + y // write memory through pointer (still trashes a) + +where `ptr` is a user-defined storage location of `pointer` type, and the +`+ y` part is mandatory. + Routines -------- -Every routine must list all the memory locations it READS from, i.e. its -INPUTS, and all the memory locations it WRITES to, whether they are OUTPUTS -or merely TRASHED. Every memory location that is not written to by the -routine (or any routines that the routine calls) is PRESERVED by the routine. +Every routine must list all the memory locations it *reads from*, which we +call its `inputs`, and all the memory locations it *writes to*. The latter +we divide into two groups: its `outputs` which it intentionally initializes, +and its `trashes`, which it does not care about, and leaves uninitialized. +For example, if it uses a register to temporarily store an intermediate +value used in a multiplication, that register has no meaning outside of +the multiplication, and is one of the routine's `trashes`. + +It is common to say that the `trashes` are the memory locations that are +*not preserved* by the routine. routine foo inputs a, score @@ -114,6 +171,9 @@ routine (or any routines that the routine calls) is PRESERVED by the routine. ... } +The union of the `outputs` and `trashes` is sometimes collectively called +"the WRITES" of the routine, for historical reasons and as shorthand. + Routines may call only routines previously defined in the program source. Thus, directly recursive routines are not allowed. (However, routines may also call routines via vectors, which are dynamically assigned. 
In this @@ -122,16 +182,16 @@ case, there is, for the time being, no check for recursive calls.) For a SixtyPical program to be run, there must be one routine called `main`. This routine is executed when the program is run. -The memory locations given given as inputs are considered to be initialized +The memory locations given as inputs to a routine are considered to be initialized at the beginning of the routine. Various instructions cause memory locations to be initialized after they are executed. Calling a routine which trashes some memory locations causes those memory locations to be uninitialized after that routine is called. At the end of a routine, all memory locations listed -as outputs must be initialised. +as outputs must be initialized. -A routine can also be declared as "external", in which case its body need -not be defined but an absolute address must be given for where the routine -is located in memory. +A literal word can be given instead of the body of the routine. This word is the +absolute address of an "external" routine located in memory but not defined by +the SixtyPical program. routine chrout inputs a @@ -141,6 +201,13 @@ is located in memory. Instructions ------------ +Instructions are inspired by, and in many cases closely resemble, the 6502 +instruction set. However, in many cases they do not map 1:1 to 6502 instructions. +If a SixtyPical instruction cannot be translated validly to one or more 6502 +instructions while retaining all the stated constraints, that's a static error +in a SixtyPical program, and technically any implementation of SixtyPical, even +an interpreter, should flag it up. + ### ld ### ld , [+ ] @@ -148,13 +215,12 @@ Instructions Reads from src and writes to dest. * It is illegal if dest is not a register. -* It is illegal if dest does not occur in the WRITES lists of the current - routine. +* It is illegal if dest does not occur in the WRITES of the current routine. 
* It is illegal if src is not of same type as dest (i.e., is not a byte.) * It is illegal if src is uninitialized. After execution, dest is considered initialized. The flags `z` and `n` may be -changed by this instruction; they must be named in the WRITES lists, and they +changed by this instruction; they must be named in the WRITES, and they are considered initialized after it has executed. If and only if src is a byte table, the index-memory-location must be given. @@ -169,8 +235,7 @@ underlying opcodes. Reads from src and writes to dest. * It is illegal if dest is a register or if dest is read-only. -* It is illegal if dest does not occur in the WRITES lists of the current - routine. +* It is illegal if dest does not occur in the WRITES of the current routine. * It is illegal if src is not of same type as dest. * It is illegal if src is uninitialized. @@ -179,6 +244,49 @@ changed by this instruction (unless of course dest is a flag.) If and only if dest is a byte table, the index-memory-location must be given. +### copy ### + + copy , + +Reads from src and writes to dest. Differs from `st` in that it is able to +copy more general types of data (for example, vectors,) and it trashes the +`z` and `n` flags and the `a` register. + +* It is illegal if dest is read-only. +* It is illegal if dest does not occur in the WRITES of the current routine. +* It is illegal if src is not of same type as dest. +* It is illegal if src is uninitialized. + +After execution, dest is considered initialized, and `z`, `n`, and +`a` are considered uninitialized. + +There are two extra modes that this instruction can be used in. The first is +to load an address into a pointer: + + copy ^, + +This copies the address of src into dest. In this case, src must be +of type buffer, and dest must be of type pointer. src will not be +considered a memory location that is read, since it is only its address +that is being retrieved. + +The second is to read or write indirectly through a pointer. 
+ + copy [] + y, + copy , [] + y + +In both of these, the memory location in the `[]+y` syntax must be +a pointer. + +The first copies the contents of memory at the pointer (offset by the `y` +register) into a byte memory location. + +The second copies a literal byte, or a byte memory location, into +the contents of memory at the pointer (offset by the `y` register). + +In addition to the constraints above, `y` must be initialized before +this mode is used. + ### add dest, src ### add , @@ -187,10 +295,9 @@ Adds the contents of src to dest and stores the result in dest. * It is illegal if src OR dest OR c is uninitialized. * It is illegal if dest is read-only. -* It is illegal if dest does not occur in the WRITES lists - of the current routine. +* It is illegal if dest does not occur in the WRITES of the current routine. -Affects n, z, c, and v flags, requiring that they be in the WRITES lists, +Affects n, z, c, and v flags, requiring that they be in the WRITES, and initializing them afterwards. dest and src continue to be initialized afterwards. @@ -203,10 +310,9 @@ Increments the value in dest. Does not honour carry. * It is illegal if dest is uninitialized. * It is illegal if dest is read-only. -* It is illegal if dest does not occur in the WRITES lists - of the current routine. +* It is illegal if dest does not occur in the WRITES of the current routine. -Affects n and z flags, requiring that they be in the WRITES lists, +Affects n and z flags, requiring that they be in the WRITES, and initializing them afterwards. ### sub ### @@ -217,10 +323,9 @@ Subtracts the contents of src from dest and stores the result in dest. * It is illegal if src OR dest OR c is uninitialized. * It is illegal if dest is read-only. -* It is illegal if dest does not occur in the WRITES lists - of the current routine. +* It is illegal if dest does not occur in the WRITES of the current routine. 
-Affects n, z, c, and v flags, requiring that they be in the WRITES lists, +Affects n, z, c, and v flags, requiring that they be in the WRITES, and initializing them afterwards. dest and src continue to be initialized afterwards. @@ -233,10 +338,9 @@ Decrements the value in dest. Does not honour carry. * It is illegal if dest is uninitialized. * It is illegal if dest is read-only. -* It is illegal if dest does not occur in the WRITES lists - of the current routine. +* It is illegal if dest does not occur in the WRITES of the current routine. -Affects n and z flags, requiring that they be in the WRITES lists, +Affects n and z flags, requiring that they be in the WRITES, and initializing them afterwards. ### cmp ### @@ -248,7 +352,7 @@ does not store the result anywhere, only sets the resulting flags. * It is illegal if src OR dest is uninitialized. -Affects n, z, and c flags, requiring that they be in the WRITES lists, +Affects n, z, and c flags, requiring that they be in the WRITES, and initializing them afterwards. ### and, or, xor ### @@ -262,10 +366,9 @@ the result in dest. * It is illegal if src OR dest OR is uninitialized. * It is illegal if dest is read-only. -* It is illegal if dest does not occur in the WRITES lists - of the current routine. +* It is illegal if dest does not occur in the WRITES of the current routine. -Affects n and z flags, requiring that they be in the WRITES lists of the +Affects n and z flags, requiring that they be in the WRITES of the current routine, and sets them as initialized afterwards. dest and src continue to be initialized afterwards. @@ -284,10 +387,9 @@ and `c` becomes the bit that was shifted off the right. * It is illegal if dest is a register besides `a`. * It is illegal if dest is read-only. * It is illegal if dest OR c is uninitialized. -* It is illegal if dest does not occur in the WRITES lists - of the current routine. +* It is illegal if dest does not occur in the WRITES of the current routine. 
-Affects the c flag, requiring that it be in the WRITES lists of the +Affects the c flag, requiring that it be in the WRITES of the current routine, and it continues to be initialized afterwards. ### call ### @@ -299,17 +401,15 @@ defined routine, or a vector location which contains the address of a routine which will be called indirectly. Execution will be transferred back to the current routine, when execution of the executable is finished. -Just before the call, - -* It is illegal if any of the memory locations in the target executable's - READS list is uninitialized. +* It is illegal if any of the memory locations listed in the called routine's + `inputs` are uninitialized immediately before the call. Just after the call, -* All memory locations listed as TRASHED in the called routine's WRITES - list are considered uninitialized. -* All memory locations listed as TRASHED in the called routine's OUTPUTS - list are considered initialized. +* All memory locations listed in the called routine's `trashes` are considered + to now be uninitialized. +* All memory locations listed in the called routine's `outputs` are considered + to now be initialized. ### goto ### @@ -325,13 +425,13 @@ must be the final instruction in the routine. Just before the goto, -* It is illegal if any of the memory locations in the target executable's - READS list is uninitialized. +* It is illegal if any of the memory locations in the target routine's + `inputs` list is uninitialized. In addition, -* The target executable's WRITES lists must not include any locations - that are not already included in the current routine's WRITES lists. +* The target executable's WRITES must not include any locations + that are not already included in the current routine's WRITES. ### if ### @@ -350,6 +450,8 @@ it is treated like an empty block. * It is illegal if any location initialized at the end of the true-branch is not initialized at the end of the false-branch, and vice versa. 
+The sense of the test can be inverted with `not`. + ### repeat ### repeat { @@ -372,24 +474,21 @@ To simulate a "while" loop, use an `if` internal to the block, like } } until z -"until" is optional, but if omitted, must be replaced with "forever". +"until" is optional, but if omitted, must be replaced with "forever": -### copy ### + repeat { + cmp y, 25 + if z { + } + } forever - copy , +The sense of the test can be inverted with `not`. -Reads from src and writes to dest. Differs from `st` in that is able to -copy more general types of data (for example, vectors,) and it trashes the -`z` and `n` flags and the `a` register. - -* It is illegal if dest is read-only. -* It is illegal if dest does not occur in the WRITES lists of the current - routine. -* It is illegal if src is not of same type as dest. -* It is illegal if src is uninitialized. - -After execution, dest is considered initialized, and `z` and `n`, and -`a` are considered uninitialized. + repeat { + cmp y, 25 + if z { + } + } until not z Grammar ------- diff --git a/eg/buffer.60p b/eg/buffer.60p new file mode 100644 index 0000000..ce4dfbe --- /dev/null +++ b/eg/buffer.60p @@ -0,0 +1,15 @@ +buffer[2048] buf +pointer ptr @ 254 +byte foo + +routine main + inputs buf + outputs buf, y, foo + trashes a, z, n, ptr +{ + ld y, 0 + copy ^buf, ptr + copy 123, [ptr] + y + copy [ptr] + y, foo + copy foo, [ptr] + y +} diff --git a/eg/joystick.60p b/eg/joystick.60p new file mode 100644 index 0000000..8783bd7 --- /dev/null +++ b/eg/joystick.60p @@ -0,0 +1,49 @@ +word screen @ 1024 +byte joy2 @ $dc00 + +word delta + +routine read_stick + inputs joy2 + outputs delta + trashes a, x, z, n +{ + ld x, joy2 + ld a, x + and a, 1 // up + if z { + copy $ffd8, delta // -40 + } else { + ld a, x + and a, 2 // down + if z { + copy word 40, delta + } else { + ld a, x + and a, 4 // left + if z { + copy $ffff, delta // -1 + } else { + ld a, x + and a, 8 // right + if z { + copy word 1, delta + } else { + copy word 0, delta + } + } + } 
+ } +} + +routine main + inputs joy2 + outputs delta + trashes a, x, z, n, screen +{ + repeat { + call read_stick + copy delta, screen + ld a, 1 + } until z +} diff --git a/src/sixtypical/analyzer.py b/src/sixtypical/analyzer.py index 5ac2aad..4a2b522 100644 --- a/src/sixtypical/analyzer.py +++ b/src/sixtypical/analyzer.py @@ -2,10 +2,9 @@ from sixtypical.ast import Program, Routine, Block, Instr from sixtypical.model import ( - TYPE_BYTE, TYPE_BYTE_TABLE, - VectorType, ExecutableType, - ConstantRef, LocationRef, - REG_A, FLAG_Z, FLAG_N, FLAG_V, FLAG_C + TYPE_BYTE, TYPE_BYTE_TABLE, BufferType, PointerType, VectorType, ExecutableType, + ConstantRef, LocationRef, IndirectRef, AddressRef, + REG_A, REG_Y, FLAG_Z, FLAG_N, FLAG_V, FLAG_C ) @@ -114,11 +113,13 @@ class Context(object): def assert_meaningful(self, *refs, **kwargs): exception_class = kwargs.get('exception_class', UnmeaningfulReadError) for ref in refs: - if isinstance(ref, ConstantRef) or ref in self.routines: + if ref.is_constant() or ref in self.routines: pass elif isinstance(ref, LocationRef): if ref not in self._meaningful: message = '%s in %s' % (ref.name, self.routine.name) + if kwargs.get('message'): + message += ' (%s)' % kwargs['message'] raise exception_class(message) else: raise NotImplementedError(ref) @@ -128,6 +129,8 @@ class Context(object): for ref in refs: if ref not in self._writeable: message = '%s in %s' % (ref.name, self.routine.name) + if kwargs.get('message'): + message += ' (%s)' % kwargs['message'] raise exception_class(message) def set_touched(self, *refs): @@ -270,45 +273,79 @@ class Analyzer(object): # probably not; if it wasn't meaningful in the first place, it # doesn't really matter if you modified it or not, coming out. 
for ref in context1.each_meaningful(): - context2.assert_meaningful(ref, exception_class=InconsistentInitializationError) + context2.assert_meaningful( + ref, exception_class=InconsistentInitializationError, message='initialized in block 1 but not in block 2' + ) for ref in context2.each_meaningful(): - context1.assert_meaningful(ref, exception_class=InconsistentInitializationError) + context1.assert_meaningful( + ref, exception_class=InconsistentInitializationError, message='initialized in block 2 but not in block 1' + ) context.set_from(context1) elif opcode == 'repeat': # it will always be executed at least once, so analyze it having # been executed the first time. self.analyze_block(instr.block, context) - + context.assert_meaningful(src) + # now analyze it having been executed a second time, with the context # of it having already been executed. self.analyze_block(instr.block, context) - - # NB I *think* that's enough... but it might not be? + context.assert_meaningful(src) + elif opcode == 'copy': - # check that their types are basically compatible - if src.type == dest.type: - pass - elif isinstance(src.type, ExecutableType) and \ - isinstance(dest.type, VectorType): - pass + # 1. 
check that their types are compatible + + if isinstance(src, AddressRef) and isinstance(dest, LocationRef): + if isinstance(src.ref.type, BufferType) and isinstance(dest.type, PointerType): + pass + else: + raise TypeMismatchError((src, dest)) + elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef): + if src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): + pass + else: + raise TypeMismatchError((src, dest)) + elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef): + if isinstance(src.ref.type, PointerType) and dest.type == TYPE_BYTE: + pass + else: + raise TypeMismatchError((src, dest)) + elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, LocationRef): + if src.type == dest.type: + pass + elif isinstance(src.type, ExecutableType) and \ + isinstance(dest.type, VectorType): + # if dealing with routines and vectors, + # check that they're not incompatible + if not (src.type.inputs <= dest.type.inputs): + raise IncompatibleConstraintsError(src.type.inputs - dest.type.inputs) + if not (src.type.outputs <= dest.type.outputs): + raise IncompatibleConstraintsError(src.type.outputs - dest.type.outputs) + if not (src.type.trashes <= dest.type.trashes): + raise IncompatibleConstraintsError(src.type.trashes - dest.type.trashes) + else: + raise TypeMismatchError((src, dest)) else: raise TypeMismatchError((src, dest)) - - # if dealing with routines and vectors, - # check that they're not incompatible - if isinstance(src.type, ExecutableType) and \ - isinstance(dest.type, VectorType): - if not (src.type.inputs <= dest.type.inputs): - raise IncompatibleConstraintsError(src.type.inputs - dest.type.inputs) - if not (src.type.outputs <= dest.type.outputs): - raise IncompatibleConstraintsError(src.type.outputs - dest.type.outputs) - if not (src.type.trashes <= dest.type.trashes): - raise IncompatibleConstraintsError(src.type.trashes - dest.type.trashes) - - context.assert_meaningful(src) - 
context.set_written(dest) + + # 2. check that the context is meaningful + + if isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef): + context.assert_meaningful(src, REG_Y) + # TODO this will need to be more sophisticated. it's the thing ref points to that is written, not ref itself. + context.set_written(dest.ref) + elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef): + context.assert_meaningful(src.ref, REG_Y) + # TODO this will need to be more sophisticated. the thing ref points to is touched, as well. + context.set_touched(src.ref) + context.set_written(dest) + else: + context.assert_meaningful(src) + context.set_written(dest) + context.set_touched(REG_A, FLAG_Z, FLAG_N) context.set_unmeaningful(REG_A, FLAG_Z, FLAG_N) + elif opcode == 'with-sei': self.analyze_block(instr.block, context) elif opcode == 'goto': diff --git a/src/sixtypical/compiler.py b/src/sixtypical/compiler.py index fdaee3b..7f94c2b 100644 --- a/src/sixtypical/compiler.py +++ b/src/sixtypical/compiler.py @@ -2,14 +2,13 @@ from sixtypical.ast import Program, Routine, Block, Instr from sixtypical.model import ( - ConstantRef, - TYPE_BIT, TYPE_BYTE, TYPE_WORD, - RoutineType, VectorType, + ConstantRef, LocationRef, IndirectRef, AddressRef, + TYPE_BIT, TYPE_BYTE, TYPE_WORD, BufferType, PointerType, RoutineType, VectorType, REG_A, REG_X, REG_Y, FLAG_C ) from sixtypical.emitter import Byte, Label, Offset, LowAddressByte, HighAddressByte from sixtypical.gen6502 import ( - Immediate, Absolute, AbsoluteX, AbsoluteY, Indirect, Relative, + Immediate, Absolute, AbsoluteX, AbsoluteY, ZeroPage, Indirect, IndirectY, Relative, LDA, LDX, LDY, STA, STX, STY, TAX, TAY, TXA, TYA, CLC, SEC, ADC, SBC, ROL, ROR, @@ -275,18 +274,63 @@ class Compiler(object): self.compile_block(instr.block) self.emitter.emit(CLI()) elif opcode == 'copy': - if src.type == TYPE_BYTE and dest.type == TYPE_BYTE: - src_label = self.labels[src.name] + if isinstance(src, (LocationRef, ConstantRef)) and 
isinstance(dest, IndirectRef): + if src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): + if isinstance(src, ConstantRef): + dest_label = self.labels[dest.ref.name] + self.emitter.emit(LDA(Immediate(Byte(src.value)))) + self.emitter.emit(STA(IndirectY(dest_label))) + elif isinstance(src, LocationRef): + src_label = self.labels[src.name] + dest_label = self.labels[dest.ref.name] + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(IndirectY(dest_label))) + else: + raise NotImplementedError((src, dest)) + else: + raise NotImplementedError((src, dest)) + elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef): + if dest.type == TYPE_BYTE and isinstance(src.ref.type, PointerType): + src_label = self.labels[src.ref.name] + dest_label = self.labels[dest.name] + self.emitter.emit(LDA(IndirectY(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) + else: + raise NotImplementedError((src, dest)) + elif isinstance(src, AddressRef) and isinstance(dest, LocationRef) and \ + isinstance(src.ref.type, BufferType) and isinstance(dest.type, PointerType): + src_label = self.labels[src.ref.name] dest_label = self.labels[dest.name] - self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) + self.emitter.emit(STA(ZeroPage(dest_label))) + self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) + self.emitter.emit(STA(ZeroPage(Offset(dest_label, 1)))) + elif not isinstance(src, (ConstantRef, LocationRef)) or not isinstance(dest, LocationRef): + raise NotImplementedError((src, dest)) + elif src.type == TYPE_BYTE and dest.type == TYPE_BYTE: + if isinstance(src, ConstantRef): + raise NotImplementedError + else: + src_label = self.labels[src.name] + dest_label = self.labels[dest.name] + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) elif src.type == TYPE_WORD and dest.type == TYPE_WORD: - 
src_label = self.labels[src.name] - dest_label = self.labels[dest.name] - self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) - self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) + if isinstance(src, ConstantRef): + dest_label = self.labels[dest.name] + hi = (src.value >> 8) & 255 + lo = src.value & 255 + self.emitter.emit(LDA(Immediate(Byte(hi)))) + self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(Immediate(Byte(lo)))) + self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) + else: + src_label = self.labels[src.name] + dest_label = self.labels[dest.name] + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) + self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) elif isinstance(src.type, VectorType) and isinstance(dest.type, VectorType): src_label = self.labels[src.name] dest_label = self.labels[dest.name] diff --git a/src/sixtypical/emitter.py b/src/sixtypical/emitter.py index 5ae2225..82531d9 100644 --- a/src/sixtypical/emitter.py +++ b/src/sixtypical/emitter.py @@ -61,6 +61,10 @@ class Label(Emittable): assert self.addr is not None, "unresolved label: %s" % self.name return Byte(self.addr - (addr + 2)).serialize() + def serialize_as_zero_page(self, offset=0): + assert self.addr is not None, "unresolved label: %s" % self.name + return Byte(self.addr + offset).serialize() + def __repr__(self): addrs = ', addr=%r' % self.addr if self.addr is not None else '' return "%s(%r%s)" % (self.__class__.__name__, self.name, addrs) @@ -78,6 +82,9 @@ class Offset(Emittable): def serialize(self, addr=None): return self.label.serialize(offset=self.offset) + def serialize_as_zero_page(self, offset=0): + return self.label.serialize_as_zero_page(offset=self.offset) + def __repr__(self): return "%s(%r, %r)" % (self.__class__.__name__, self.label, self.offset) diff 
--git a/src/sixtypical/evaluator.py b/src/sixtypical/evaluator.py index f50e6a7..65feb88 100644 --- a/src/sixtypical/evaluator.py +++ b/src/sixtypical/evaluator.py @@ -2,7 +2,7 @@ from sixtypical.ast import Program, Routine, Block, Instr from sixtypical.model import ( - ConstantRef, LocationRef, PartRef, + ConstantRef, LocationRef, PartRef, IndirectRef, REG_A, REG_X, REG_Y, FLAG_Z, FLAG_N, FLAG_V, FLAG_C ) @@ -191,6 +191,12 @@ class Evaluator(object): while context.get(src) == 0: self.eval_block(instr.block, context) elif opcode == 'copy': + if isinstance(src, IndirectRef): + raise NotImplementedError("this doesn't actually work") + src = src.ref + if isinstance(dest, IndirectRef): + raise NotImplementedError("this doesn't actually work") + dest = dest.ref context.set(dest, context.get(src)) # these are trashed; so could be anything really context.set(REG_A, 0) diff --git a/src/sixtypical/gen6502.py b/src/sixtypical/gen6502.py index e0be7a4..8659ab6 100644 --- a/src/sixtypical/gen6502.py +++ b/src/sixtypical/gen6502.py @@ -58,6 +58,18 @@ class AbsoluteY(Absolute): pass +class ZeroPage(AddressingMode): + def __init__(self, value): + assert isinstance(value, (Label, Offset)) + self.value = value + + def size(self): + return 1 + + def serialize(self, addr=None): + return self.value.serialize_as_zero_page() + + class Indirect(AddressingMode): def __init__(self, value): assert isinstance(value, Label) @@ -70,6 +82,10 @@ class Indirect(AddressingMode): return self.value.serialize() +class IndirectY(ZeroPage): + pass + + class Relative(AddressingMode): def __init__(self, value): assert isinstance(value, Label) @@ -244,6 +260,8 @@ class LDA(Instruction): Absolute: 0xad, AbsoluteX: 0xbd, AbsoluteY: 0xb9, + IndirectY: 0xb1, + ZeroPage: 0xa5, } @@ -320,6 +338,8 @@ class STA(Instruction): Absolute: 0x8d, AbsoluteX: 0x9d, AbsoluteY: 0x99, + IndirectY: 0x91, + ZeroPage: 0x85, } diff --git a/src/sixtypical/model.py b/src/sixtypical/model.py index e509e95..449285b 100644 --- 
a/src/sixtypical/model.py +++ b/src/sixtypical/model.py @@ -58,6 +58,17 @@ class VectorType(ExecutableType): super(VectorType, self).__init__('vector', **kwargs) +class BufferType(Type): + def __init__(self, size): + self.size = size + self.name = 'buffer[%s]' % self.size + + +class PointerType(Type): + def __init__(self): + self.name = 'pointer' + + class Ref(object): def is_constant(self): """read-only means that the program cannot change the value @@ -76,7 +87,7 @@ class LocationRef(Ref): # but because we store the type in here and we want to treat # these objects as immutable, we compare the types, too. # Not sure if very wise. - return isinstance(other, LocationRef) and ( + return isinstance(other, self.__class__) and ( other.name == self.name and other.type == self.type ) @@ -90,6 +101,48 @@ class LocationRef(Ref): return isinstance(self.type, RoutineType) +class IndirectRef(Ref): + def __init__(self, ref): + self.ref = ref + + def __eq__(self, other): + return isinstance(other, self.__class__) and self.ref == other.ref + + def __hash__(self): + return hash(self.__class__.__name__) ^ hash(self.ref) + + def __repr__(self): + return '%s(%r)' % (self.__class__.__name__, self.ref) + + @property + def name(self): + return '[{}]+y'.format(self.ref.name) + + def is_constant(self): + return False + + +class AddressRef(Ref): + def __init__(self, ref): + self.ref = ref + + def __eq__(self, other): + return isinstance(other, self.__class__) and self.ref == other.ref + + def __hash__(self): + return hash(self.__class__.__name__) ^ hash(self.ref) + + def __repr__(self): + return '%s(%r)' % (self.__class__.__name__, self.ref) + + @property + def name(self): + return '^{}'.format(self.ref.name) + + def is_constant(self): + return True + + class PartRef(Ref): """For 'low byte of' location and 'high byte of' location modifiers.
diff --git a/src/sixtypical/parser.py b/src/sixtypical/parser.py index a2b3b9f..872070d 100644 --- a/src/sixtypical/parser.py +++ b/src/sixtypical/parser.py @@ -3,8 +3,8 @@ from sixtypical.ast import Program, Defn, Routine, Block, Instr from sixtypical.model import ( TYPE_BIT, TYPE_BYTE, TYPE_BYTE_TABLE, TYPE_WORD, TYPE_WORD_TABLE, - RoutineType, VectorType, ExecutableType, - LocationRef, ConstantRef + RoutineType, VectorType, ExecutableType, BufferType, PointerType, + LocationRef, ConstantRef, IndirectRef, AddressRef, ) from sixtypical.scanner import Scanner @@ -32,7 +32,7 @@ class Parser(object): def program(self): defns = [] routines = [] - while self.scanner.on('byte', 'word', 'vector'): + while self.scanner.on('byte', 'word', 'vector', 'buffer', 'pointer'): defn = self.defn() name = defn.name if name in self.symbols: @@ -50,25 +50,15 @@ class Parser(object): return Program(defns=defns, routines=routines) def defn(self): - type = None - if self.scanner.consume('byte'): - type = TYPE_BYTE - if self.scanner.consume('table'): - type = TYPE_BYTE_TABLE - elif self.scanner.consume('word'): - type = TYPE_WORD - if self.scanner.consume('table'): - type = TYPE_WORD_TABLE - else: - self.scanner.expect('vector') - type = 'vector' # will be resolved to a Type below + type_ = self.defn_type() + self.scanner.check_type('identifier') name = self.scanner.token self.scanner.scan() (inputs, outputs, trashes) = self.constraints() - if type == 'vector': - type = VectorType(inputs=inputs, outputs=outputs, trashes=trashes) + if type_ == 'vector': + type_ = VectorType(inputs=inputs, outputs=outputs, trashes=trashes) elif inputs or outputs or trashes: raise SyntaxError("Cannot apply constraints to non-vector type") @@ -87,10 +77,32 @@ class Parser(object): if initial is not None and addr is not None: raise SyntaxError("Definition cannot have both initial value and explicit address") - location = LocationRef(type, name) + location = LocationRef(type_, name) return Defn(name=name, 
addr=addr, initial=initial, location=location) + def defn_type(self): + if self.scanner.consume('byte'): + if self.scanner.consume('table'): + return TYPE_BYTE_TABLE + return TYPE_BYTE + elif self.scanner.consume('word'): + if self.scanner.consume('table'): + return TYPE_WORD_TABLE + return TYPE_WORD + elif self.scanner.consume('vector'): + return 'vector' # will be resolved to a Type by caller + elif self.scanner.consume('buffer'): + self.scanner.expect('[') + self.scanner.check_type('integer literal') + size = int(self.scanner.token) + self.scanner.scan() + self.scanner.expect(']') + return BufferType(size) + else: + self.scanner.expect('pointer') + return PointerType() + def constraints(self): inputs = set() outputs = set() @@ -138,7 +150,13 @@ class Parser(object): self.scanner.scan() return loc elif self.scanner.on_type('integer literal'): - loc = ConstantRef(TYPE_BYTE, int(self.scanner.token)) + value = int(self.scanner.token) + type_ = TYPE_WORD if value > 255 else TYPE_BYTE + loc = ConstantRef(type_, value) + self.scanner.scan() + return loc + elif self.scanner.consume('word'): + loc = ConstantRef(TYPE_WORD, int(self.scanner.token)) self.scanner.scan() return loc else: @@ -146,6 +164,19 @@ class Parser(object): self.scanner.scan() return loc + def indlocexpr(self): + if self.scanner.consume('['): + loc = self.locexpr() + self.scanner.expect(']') + self.scanner.expect('+') + self.scanner.expect('y') + return IndirectRef(loc) + elif self.scanner.consume('^'): + loc = self.locexpr() + return AddressRef(loc) + else: + return self.locexpr() + def block(self): instrs = [] self.scanner.expect('{') @@ -216,9 +247,9 @@ class Parser(object): elif self.scanner.token in ("copy",): opcode = self.scanner.token self.scanner.scan() - src = self.locexpr() + src = self.indlocexpr() self.scanner.expect(',') - dest = self.locexpr() + dest = self.indlocexpr() return Instr(opcode=opcode, dest=dest, src=src) elif self.scanner.consume("with"): self.scanner.expect("interrupts") 
diff --git a/src/sixtypical/scanner.py b/src/sixtypical/scanner.py index c538c3e..fd48b3b 100644 --- a/src/sixtypical/scanner.py +++ b/src/sixtypical/scanner.py @@ -29,7 +29,7 @@ class Scanner(object): self.token = None self.type = 'EOF' return - if self.scan_pattern(r'\,|\@|\+|\:|\<|\>|\{|\}', 'operator'): + if self.scan_pattern(r'\,|\@|\+|\:|\<|\>|\{|\}|\[|\]|\^', 'operator'): return if self.scan_pattern(r'\d+', 'integer literal'): return diff --git a/tests/SixtyPical Analysis.md b/tests/SixtyPical Analysis.md index 7eaff5a..c07db8a 100644 --- a/tests/SixtyPical Analysis.md +++ b/tests/SixtyPical Analysis.md @@ -1017,6 +1017,23 @@ initialized at the start. | } ? UnmeaningfulReadError: y in main +And if you trash the test expression (i.e. `z` in the below) inside the loop, +this is an error too. + + | word one : 0 + | word two : 0 + | + | routine main + | inputs one, two + | outputs two + | trashes a, z, n + | { + | repeat { + | copy one, two + | } until z + | } + ? UnmeaningfulReadError: z in main + ### copy ### Can't `copy` from a memory location that isn't initialized. @@ -1155,6 +1172,90 @@ Can't `copy` from a `word` to a `byte`. | } ? TypeMismatchError +### copy[] ### + +Buffers and pointers. + +Note that `^buf` is a constant value, so it by itself does not require `buf` to be +listed in any input/output sets. + +However, if the code reads from it through a pointer, it *should* be in `inputs`. + +Likewise, if the code writes to it through a pointer, it *should* be in `outputs`. + +Of course, unless you write to *all* the bytes in a buffer, some of those bytes +might not be meaningful. So how meaningful is this check? + +This is an open problem. + +For now, convention says: if it is being read, list it in `inputs`, and if it is +being modified, list it in both `inputs` and `outputs`. + +Write literal through a pointer. 
+ + | buffer[2048] buf + | pointer ptr + | + | routine main + | inputs buf + | outputs y, buf + | trashes a, z, n, ptr + | { + | ld y, 0 + | copy ^buf, ptr + | copy 123, [ptr] + y + | } + = ok + +It does use `y`. + + | buffer[2048] buf + | pointer ptr + | + | routine main + | inputs buf + | outputs buf + | trashes a, z, n, ptr + | { + | copy ^buf, ptr + | copy 123, [ptr] + y + | } + ? UnmeaningfulReadError + +Write stored value through a pointer. + + | buffer[2048] buf + | pointer ptr + | byte foo + | + | routine main + | inputs foo, buf + | outputs y, buf + | trashes a, z, n, ptr + | { + | ld y, 0 + | copy ^buf, ptr + | copy foo, [ptr] + y + | } + = ok + +Read through a pointer. + + | buffer[2048] buf + | pointer ptr + | byte foo + | + | routine main + | inputs buf + | outputs foo + | trashes a, y, z, n, ptr + | { + | ld y, 0 + | copy ^buf, ptr + | copy [ptr] + y, foo + | } + = ok + ### routines ### Routines are constants. You need not, and in fact cannot, specify a constant diff --git a/tests/SixtyPical Compilation.md b/tests/SixtyPical Compilation.md index 949b73d..64e2d77 100644 --- a/tests/SixtyPical Compilation.md +++ b/tests/SixtyPical Compilation.md @@ -1,15 +1,15 @@ -Sixtypical Compilation +SixtyPical Compilation ====================== This is a test suite, written in [Falderal][] format, for compiling -Sixtypical to 6502 machine code. +SixtyPical to 6502 machine code. [Falderal]: http://catseye.tc/node/Falderal - -> Functionality "Compile Sixtypical program" is implemented by + -> Functionality "Compile SixtyPical program" is implemented by -> shell command "bin/sixtypical --compile %(test-body-file) | fa-bin-to-hex" - -> Tests for functionality "Compile Sixtypical program" + -> Tests for functionality "Compile SixtyPical program" Null program. @@ -267,6 +267,18 @@ Copy word to word. | } = 00c0ad0fc08d0dc0ad10c08d0ec060 +Copy literal word to word. 
+ + | word bar + | + | routine main + | outputs bar + | trashes a, n, z + | { + | copy 65535, bar + | } + = 00c0a9ff8d0bc0a9ff8d0cc060 + Copy vector to vector. | vector bar @@ -281,7 +293,7 @@ Copy vector to vector. | } = 00c0ad0fc08d0dc0ad10c08d0ec060 -Copy instruction inside an `interrupts off` block. +Copy routine to vector, inside an `interrupts off` block. | vector bar | @@ -329,3 +341,70 @@ goto. | goto bar | } = 00c0a0c84c06c060a2c860 + +### Buffers and Pointers + +Load address into pointer. + + | buffer[2048] buf + | pointer ptr @ 254 + | + | routine main + | inputs buf + | outputs buf, y + | trashes a, z, n, ptr + | { + | ld y, 0 + | copy ^buf, ptr + | } + = 00c0a000a90b85fea9c085ff60 + +Write literal through a pointer. + + | buffer[2048] buf + | pointer ptr @ 254 + | + | routine main + | inputs buf + | outputs buf, y + | trashes a, z, n, ptr + | { + | ld y, 0 + | copy ^buf, ptr + | copy 123, [ptr] + y + | } + = 00c0a000a90f85fea9c085ffa97b91fe60 + +Write stored value through a pointer. + + | buffer[2048] buf + | pointer ptr @ 254 + | byte foo + | + | routine main + | inputs foo, buf + | outputs y, buf + | trashes a, z, n, ptr + | { + | ld y, 0 + | copy ^buf, ptr + | copy foo, [ptr] + y + | } + = 00c0a000a91085fea9c085ffad12c091fe60 + +Read through a pointer. + + | buffer[2048] buf + | pointer ptr @ 254 + | byte foo + | + | routine main + | inputs buf + | outputs y, foo + | trashes a, z, n, ptr + | { + | ld y, 0 + | copy ^buf, ptr + | copy [ptr] + y, foo + | } + = 00c0a000a91085fea9c085ffb1fe8d12c060 diff --git a/tests/SixtyPical Execution.md b/tests/SixtyPical Execution.md index 696a8dc..50e3d7e 100644 --- a/tests/SixtyPical Execution.md +++ b/tests/SixtyPical Execution.md @@ -1,4 +1,4 @@ -Sixtypical Execution +SixtyPical Execution ==================== This is a test suite, written in [Falderal][] format, for the dynamic @@ -6,10 +6,10 @@ execution behaviour of the Sixtypical language, disgregarding static analysis. 
[Falderal]: http://catseye.tc/node/Falderal - -> Functionality "Execute Sixtypical program" is implemented by + -> Functionality "Execute SixtyPical program" is implemented by -> shell command "bin/sixtypical --execute %(test-body-file)" - -> Tests for functionality "Execute Sixtypical program" + -> Tests for functionality "Execute SixtyPical program" Rudimentary program. @@ -435,6 +435,22 @@ Copy word to word. = y: 0 = z: 0 +Copy literal word to word. + + | word bar + | + | routine main { + | copy word 2000, bar + | } + = a: 0 + = bar: 2000 + = c: 0 + = n: 0 + = v: 0 + = x: 0 + = y: 0 + = z: 0 + Indirect call. | vector foo outputs x trashes z, n diff --git a/tests/SixtyPical Syntax.md b/tests/SixtyPical Syntax.md index ad68ae7..98356af 100644 --- a/tests/SixtyPical Syntax.md +++ b/tests/SixtyPical Syntax.md @@ -1,5 +1,5 @@ -Sixtypical Execution -==================== +SixtyPical Syntax +================= This is a test suite, written in [Falderal][] format, for the syntax of the Sixtypical language, disgregarding execution, static analysis, etc. @@ -9,10 +9,10 @@ but not necessarily sensible programs. [Falderal]: http://catseye.tc/node/Falderal - -> Functionality "Check syntax of Sixtypical program" is implemented by + -> Functionality "Check syntax of SixtyPical program" is implemented by -> shell command "bin/sixtypical %(test-body-file) && echo ok" - -> Tests for functionality "Check syntax of Sixtypical program" + -> Tests for functionality "Check syntax of SixtyPical program" Rudimentary program. @@ -123,6 +123,19 @@ Repeat with not | } = ok +User-defined memory addresses of different types. + + | byte byt + | word wor + | vector vec + | byte table tab + | buffer[2048] buf + | pointer ptr + | + | routine main { + | } + = ok + Explicit memory address. | byte screen @ 1024 @@ -308,3 +321,16 @@ goto. | goto foo | } ? SyntaxError + +Buffers and pointers. 
+ + | buffer[2048] buf + | pointer ptr + | byte foo + | + | routine main { + | copy ^buf, ptr + | copy 123, [ptr] + y + | copy [ptr] + y, foo + | } + = ok
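Review note: the buffer and pointer tests above can be cross-checked by hand. The sketch below is not part of the SixtyPical code base. It is a hypothetical helper (`assemble_pointer_write` is an invented name) that hand-assembles the sequence `ld y, 0` / `copy ^buf, ptr` / `copy 123, [ptr] + y` using the opcodes this change adds for `ZeroPage` (STA `0x85`) and `IndirectY` (STA `0x91`), plus the standard 6502 opcodes for LDY immediate, LDA immediate and RTS. It assumes the pointer is stored low byte first, as the expected output of the compilation tests suggests.

```python
def assemble_pointer_write(buf_addr, ptr_zp, value):
    """Hand-assemble: ld y, 0 / copy ^buf, ptr / copy value, [ptr] + y / rts.

    buf_addr: 16-bit address of the buffer (assumed already resolved).
    ptr_zp:   zero-page address of the pointer's low byte.
    value:    byte literal to write through the pointer.
    """
    lo = buf_addr & 0xFF
    hi = (buf_addr >> 8) & 0xFF
    code = bytearray()
    code += bytes([0xA0, 0x00])        # LDY #0
    code += bytes([0xA9, lo])          # LDA #<buf
    code += bytes([0x85, ptr_zp])      # STA ptr      (zero page)
    code += bytes([0xA9, hi])          # LDA #>buf
    code += bytes([0x85, ptr_zp + 1])  # STA ptr+1    (zero page)
    code += bytes([0xA9, value])       # LDA #value
    code += bytes([0x91, ptr_zp])      # STA (ptr),Y  (indirect indexed)
    code += bytes([0x60])              # RTS
    return bytes(code)


# With buf at $C00F and ptr at $FE, as in the "Write literal through a
# pointer" compilation test; the test output also carries a leading
# 00c0 load address, which this sketch does not emit.
print(assemble_pointer_write(0xC00F, 0xFE, 123).hex())
```

The printed hex string should match the body of that compilation test once the two-byte load address is stripped, which gives a quick way to sanity-check new expected outputs before adding them to the suite.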