From c66b33918125e71a308268497c93a23c5f2e869b Mon Sep 17 00:00:00 2001
From: Cat's Eye Technologies
Date: Fri, 4 Apr 2014 15:32:47 +0100
Subject: [PATCH] Spiffy up the README, move meaty stuff into docs.

---
 README.markdown       | 131 +++++++++++-------------------------------
 doc/Checking.markdown | 102 +++++++++++++++++++++++++++++++-
 2 files changed, 134 insertions(+), 99 deletions(-)

diff --git a/README.markdown b/README.markdown
index 7d3ed32..8339974 100644
--- a/README.markdown
+++ b/README.markdown
@@ -50,7 +50,10 @@ certain ways. For example, these are illegal:
 ### Abstract Interpretation ###
 
 SixtyPical tries to prevent the program from using data that has no meaning.
-For example, the following:
+
+The instructions of a routine are analyzed using abstract interpretation.
+One thing we specifically do is determine which registers and memory locations
+are *not* affected by the routine. For example, the following:
 
     routine do_it {
         lda #0
@@ -63,114 +66,46 @@ For example, the following:
 * the A register is declared to be a meaningful output of `update_score`
 * `update_score` was determined to not change the value of the A register
 
-The first must be done with an explicit declaration on `update_score` (NYI).
-The second will be done using abstract interpretation of the code of
-`update_score` (needs to be implemented again, now, and better).
+The first case must be done with an explicit declaration on `update_score`.
+The second case will be inferred using abstract interpretation of the code
+of `update_score`.
 
 ### Structured Programming ###
 
-You get an `if` and a `repeat` and instructions like `sei` work like `with`
-where they are followed by a block and the `cli` instruction is implicitly
-(and unavoidably) added at the end.
-
-For more information, see the docs (which are written in the form of a
-Falderal literate test suite.)
-
-Concepts
---------
-
-### Routines ###
+SixtyPical eschews labels for code and instead organizes code into _blocks_.
 
 Instead of the assembly-language subroutine, SixtyPical provides the _routine_
-as the abstraction for a reusable sequence of code.
+as the abstraction for a reusable sequence of code. A routine may be called,
+or may be included inline, by another routine. The body of a routine is a
+block.
 
-A routine may be called, or may be included inline, by another routine.
+Along with routines, you get `if`, `repeat`, and `with` constructs which take
+blocks. The `with` construct takes an instruction like `sei` and implicitly
+(and unavoidably) inserts the corresponding `cli` at the end of the block.
 
-There is one top-level routine called `main` which represents the entire
-program.
+For More Information
+--------------------
 
-The instructions of a routine are analyzed using abstract interpretation.
-One thing we specifically do is determine which registers and memory locations
-are *not* affected by the routine.
+For more information, see the docs (which are written in the form of
+Falderal literate test suites. If you have Falderal installed, you can run
+the tests with `./test.sh`.)
 
-If a register is not affected by a routine, then a caller of that routine may
-assume that the value in that register is retained.
+Ideas
+-----
 
-Of course, a routine may intentionally affect a register or memory location,
-as an output. It must declare this. We're not there yet.
-
-### Addresses ###
-
-The body of a routine may not refer to an address literally. It must use
-a symbol that was declared previously.
-
-An address may be declared with `reserve`, which is like `.data` or `.bss`
-in an assembler. This is an address into the program's data. It is global
-to all routines.
-
-An address may be declared with `locate`, which is like `.alias` in an
-assembler, with the understanding that the value will be treated "like an
-address." This is generally an address into the operating system or hardware
-(e.g. kernal routine, I/O port, etc.)
-
-Not there. yet:
-
-> Inside a routine, an address may be declared with `temporary`. This is like
-> `static` in C, except the value at that address is not guaranteed to be
-> retained between invokations of the routine. Such addresses may only be used
-> within the routine where they are declared. If analysis indicates that two
-> temporary addresses are never used simultaneously, they may be merged
-> to the same address.
-
-An address knows what kind of data is stored at the address:
+These aren't implemented yet:
 
-* `byte`: an 8-bit byte. not part of a word. not to be used as an address.
-  (could be an index though.)
-* `word`: a 16-bit word. not to be used as an address.
-* `vector`: a 16-bit address of a routine. Only a handful of operations
-  are supported on vectors:
-
-  * copying the contents of one vector to another
-  * copying the address of a routine into a vector
-  * jumping indirectly to a vector (i.e. to the code at the address
-    contained in the vector (and this can only happen at the end of a
-    routine (NYI))
-  * `jsr`'ing indirectly to a vector (which is done with a fun
-    generated trick (NYI))
-
-* `byte table`: a series of `byte`s contiguous in memory starting from the
-  address. This is the only kind of address that can be used in
-  indexed addressing.
+* Abstract interpretation must extend to `if`, `repeat`, and `with`
+  blocks. The two incoming contexts must be merged, and any storage
+  locations updated differently or poisoned in either context will be
+  considered poisoned in the result context.
 
-### Blocks ###
-
-Each routine is a block. It may be composed of inner blocks, if those
-inner blocks are attached to certain instructions.
-
-SixtyPical does not have instructions that map literally to the 6502 branch
-instructions. Instead, it has an `if` construct, with two blocks (for the
-"then" and `else` parts), and the branch instructions map to conditions for
-this construct.
-
-Similarly, there is a `repeat` construct. The same branch instructions can
-be used in the condition to this construct. In this case, they branch back
-to the top of the `repeat` loop.
-
-The abstract states of the machine at each of the different block exits are
-merged during analysis. If any register or memory location is treated
-inconsistently (e.g. updated in one branch of the test, but not the other,)
-that register cannot subsequently be used without a declaration to the effect
-that we know what's going on. (This is all a bit fuzzy right now.)
-
-There is also no `rts` instruction. It is included at the end of a routine,
-but only when the routine is used as a subroutine. Also, if the routine
-ends by `jsr`ing another routine, it reserves the right to do a tail-call
-or even a fallthrough.
-
-There are also _with_ instructions, which are associated with three opcodes
-that have natural symmetrical opcodes: `pha`, `php`, and `sei`. These
-instructions take a block. The natural symmetrical opcode is inserted at
-the end of the block.
+* Inside a routine, an address may be declared with `temporary`. This is like
+  `static` in C, except the value at that address is not guaranteed to be
+  retained between invocations of the routine. Such addresses may only be used
+  within the routine where they are declared. If analysis indicates that two
+  temporary addresses are never used simultaneously, they may be merged
+  to the same address.
 
 TODO
 ----
@@ -178,9 +113,9 @@ TODO
 * Initial values for reserved, incl. tables
 * give length for tables, must be there for reserved, if no init val
 * Character tables ("strings" to everybody else)
-* Work out the analyses again and document them
 * Addressing modes — indexed mode on more instructions
 * `jsr (vector)`
 * `jmp routine`
 * insist on EOL after each instruction. need spacesWOEOL production
 * asl .a
+* `outputs` on externals
diff --git a/doc/Checking.markdown b/doc/Checking.markdown
index a6e2949..1774263 100644
--- a/doc/Checking.markdown
+++ b/doc/Checking.markdown
@@ -11,6 +11,9 @@ Checking SixtyPical Programs
     -> Functionality "Check SixtyPical program" is implemented by
     -> shell command "bin/sixtypical check %(test-file)"
 
+Some Basic Syntax
+-----------------
+
 `main` must be present.
 
     | routine main {
@@ -45,7 +48,52 @@ A comment may appear after each declaration.
     | }
     = True
 
-A program may `reserve` and `assign`.
+Addresses
+---------
+
+An address may be declared with `reserve`, which is like `.data` or `.bss`
+in an assembler. This is an address into the program's data. It is global
+to all routines.
+
+    | reserve byte lives
+    | routine main {
+    |     lda #3
+    |     sta lives
+    | }
+    | routine died {
+    |     dec lives
+    | }
+    = True
+
+An address may be declared with `assign`, which is like `.alias` in an
+assembler, with the understanding that the value will be treated "like an
+address." This is generally an address into the operating system or hardware
+(e.g. kernal routine, I/O port, etc.)
+
+    | assign byte screen $0400
+    | routine main {
+    |     lda #0
+    |     sta screen
+    | }
+    = True
+
+The body of a routine may not refer to an address literally. It must use
+a symbol that was declared previously with `reserve` or `assign`.
+
+    | routine main {
+    |     lda #0
+    |     sta $0400
+    | }
+    ? unexpected "$"
+
+    | assign byte screen $0400
+    | routine main {
+    |     lda #0
+    |     sta screen
+    | }
+    = True
+
+Test for many combinations of `reserve` and `assign`.
 
     | reserve byte lives
     | assign byte gdcol 647
@@ -214,3 +262,55 @@ We cannot absolute access a vector.
     |     lda screen
     | }
     ? incompatible types 'Vector' and 'Byte'
+
+### Addresses ###
+
+An address knows what kind of data is stored at the address:
+
+* `byte`: an 8-bit byte. not part of a word. not to be used as an address.
+  (could be an index though.)
+* `word`: a 16-bit word. not to be used as an address.
+* `vector`: a 16-bit address of a routine. Only a handful of operations
+  are supported on vectors:
+
+  * copying the contents of one vector to another
+  * copying the address of a routine into a vector
+  * jumping indirectly to a vector (i.e. to the code at the address
+    contained in the vector; this can only happen at the end of a
+    routine (NYI))
+  * `jsr`'ing indirectly to a vector (which is done with a fun
+    generated trick (NYI))
+
+* `byte table`: a series of `byte`s contiguous in memory starting from the
+  address. This is the only kind of address that can be used in
+  indexed addressing.
+
+### Blocks ###
+
+Each routine is a block. It may be composed of inner blocks, if those
+inner blocks are attached to certain instructions.
+
+SixtyPical does not have instructions that map literally to the 6502 branch
+instructions. Instead, it has an `if` construct, with two blocks (for the
+"then" and `else` parts), and the branch instructions map to conditions for
+this construct.
+
+Similarly, there is a `repeat` construct. The same branch instructions can
+be used in the condition to this construct. In this case, they branch back
+to the top of the `repeat` loop.
+
+The abstract states of the machine at each of the different block exits are
+merged during analysis. If any register or memory location is treated
+inconsistently (e.g. updated in one branch of the test, but not the other,)
+that register cannot subsequently be used without a declaration to the effect
+that we know what's going on. (This is all a bit fuzzy right now.)
+
+There is also no `rts` instruction. It is included at the end of a routine,
+but only when the routine is used as a subroutine. Also, if the routine
+ends by `jsr`ing another routine, it reserves the right to do a tail-call
+or even a fallthrough.
+
+There are also _with_ instructions, which are associated with three opcodes
+that have natural symmetrical opcodes: `pha`, `php`, and `sei`. These
+instructions take a block. The natural symmetrical opcode is inserted at
+the end of the block.
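
Note on the patch (not part of it): the merge rule described in the "Blocks" text, where the abstract states at the different block exits are merged and any location treated inconsistently becomes unusable, can be illustrated with a small Python model. This is only a sketch under stated assumptions: the names `POISONED` and `merge_contexts`, and the representation of a context as a dict from location name to abstract value, are hypothetical and are not taken from the SixtyPical implementation.

```python
# Hypothetical model: a context maps a storage location name (register or
# memory symbol) to an abstract value. POISONED marks a location whose
# contents can no longer be relied upon.
POISONED = "poisoned"


def merge_contexts(left, right):
    """Merge the abstract contexts leaving two branches of a block.

    A location keeps its value only if both branches agree on it.
    Anything poisoned in either branch, or valued differently in the
    two branches, is poisoned in the result. A location missing from
    one context is conservatively treated as poisoned (an assumption
    of this sketch).
    """
    merged = {}
    for location in left.keys() | right.keys():
        a = left.get(location, POISONED)
        b = right.get(location, POISONED)
        merged[location] = a if a == b else POISONED
    return merged


# The "then" branch updates A; the "else" branch leaves A and X alone.
then_ctx = {"A": "updated by then-branch", "X": "unchanged"}
else_ctx = {"A": "unchanged", "X": "unchanged"}
result = merge_contexts(then_ctx, else_ctx)
```

In this model `result["A"]` is poisoned because the two branches disagree, while `result["X"]` keeps its value. A real analysis would presumably also track declared outputs, which this sketch leaves out.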