mirror of https://github.com/catseye/SixtyPical.git synced 2024-11-25 23:49:17 +00:00

Spiffy up the README, move meaty stuff into docs.

This commit is contained in:
Cat's Eye Technologies 2014-04-04 15:32:47 +01:00
parent af05d77d2d
commit c66b339181
2 changed files with 134 additions and 99 deletions

View File

@ -50,7 +50,10 @@ certain ways. For example, these are illegal:
### Abstract Interpretation ###
SixtyPical tries to prevent the program from using data that has no meaning.
For example, the following:
The instructions of a routine are analyzed using abstract interpretation.
One thing we specifically do is determine which registers and memory locations
are *not* affected by the routine. For example, the following:
routine do_it {
lda #0
@ -63,114 +66,46 @@ For example, the following:
* the A register is declared to be a meaningful output of `update_score`
* `update_score` was determined to not change the value of the A register
The first must be done with an explicit declaration on `update_score` (NYI).
The second will be done using abstract interpretation of the code of
`update_score` (needs to be implemented again, now, and better).
The first case must be done with an explicit declaration on `update_score`.
The second case will be inferred using abstract interpretation of the code
of `update_score`.
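As a rough illustration of the second case, the sketch below computes which registers a routine leaves untouched from a flat list of opcodes. It is not SixtyPical's analyzer: the `WRITES` table and the `preserved_registers` function are hypothetical names made up for this example, the table is deliberately incomplete, and any opcode missing from it (including `jsr`) is conservatively assumed to clobber every register.

```python
# Hypothetical sketch, not SixtyPical's implementation.
REGISTERS = frozenset({"a", "x", "y"})

# Registers written by a handful of 6502 opcodes (deliberately incomplete).
WRITES = {
    "lda": {"a"}, "ldx": {"x"}, "ldy": {"y"},
    "tax": {"x"}, "tay": {"y"}, "txa": {"a"}, "tya": {"a"},
    "inx": {"x"}, "iny": {"y"}, "dex": {"x"}, "dey": {"y"},
    # Stores and memory increments write memory, not registers.
    "sta": set(), "stx": set(), "sty": set(),
    "inc": set(), "dec": set(),
}


def preserved_registers(opcodes):
    """Return the registers that no opcode in the routine writes.

    Opcodes missing from WRITES are assumed to write every register,
    so the answer errs on the side of "not preserved".
    """
    written = set()
    for op in opcodes:
        written |= WRITES.get(op, REGISTERS)
    return set(REGISTERS) - written
```

Under these assumptions, a routine consisting of `inc` followed by `ldx` preserves A and Y, so a caller could keep relying on the value it loaded into A before the call.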
### Structured Programming ###
You get an `if` and a `repeat` and instructions like `sei` work like `with`
where they are followed by a block and the `cli` instruction is implicitly
(and unavoidably) added at the end.
For more information, see the docs (which are written in the form of a
Falderal literate test suite.)
Concepts
--------
### Routines ###
SixtyPical eschews labels for code and instead organizes code into _blocks_.
Instead of the assembly-language subroutine, SixtyPical provides the _routine_
as the abstraction for a reusable sequence of code.
as the abstraction for a reusable sequence of code. A routine may be called,
or may be included inline, by another routine. The body of a routine is a
block.
A routine may be called, or may be included inline, by another routine.
Along with routines, you get `if`, `repeat`, and `with` constructs which take
blocks. The `with` construct takes an instruction like `sei` and implicitly
(and unavoidably) inserts the corresponding `cli` at the end of the block.
There is one top-level routine called `main` which represents the entire
program.
For More Information
--------------------
The instructions of a routine are analyzed using abstract interpretation.
One thing we specifically do is determine which registers and memory locations
are *not* affected by the routine.
For more information, see the docs, which are written in the form of
Falderal literate test suites. If you have Falderal installed, you can run
the tests with `./test.sh`.
If a register is not affected by a routine, then a caller of that routine may
assume that the value in that register is retained.
Ideas
-----
Of course, a routine may intentionally affect a register or memory location,
as an output. It must declare this. We're not there yet.
### Addresses ###
The body of a routine may not refer to an address literally. It must use
a symbol that was declared previously.
An address may be declared with `reserve`, which is like `.data` or `.bss`
in an assembler. This is an address into the program's data. It is global
to all routines.
An address may be declared with `locate`, which is like `.alias` in an
assembler, with the understanding that the value will be treated "like an
address." This is generally an address into the operating system or hardware
(e.g. kernal routine, I/O port, etc.)
Not there yet:
> Inside a routine, an address may be declared with `temporary`. This is like
> `static` in C, except the value at that address is not guaranteed to be
> retained between invocations of the routine. Such addresses may only be used
> within the routine where they are declared. If analysis indicates that two
> temporary addresses are never used simultaneously, they may be merged
> to the same address.
An address knows what kind of data is stored at the address:
These aren't implemented yet:
* `byte`: an 8-bit byte. not part of a word. not to be used as an address.
(could be an index though.)
* `word`: a 16-bit word. not to be used as an address.
* `vector`: a 16-bit address of a routine. Only a handful of operations
are supported on vectors:
* copying the contents of one vector to another
* copying the address of a routine into a vector
* jumping indirectly to a vector (i.e. to the code at the address
contained in the vector); this can only happen at the end of a
routine (NYI)
* `jsr`'ing indirectly to a vector (which is done with a fun
generated trick (NYI))
* `byte table`: a series of `byte`s contiguous in memory starting from the
address. This is the only kind of address that can be used in
indexed addressing.
* Abstract interpretation must extend to `if`, `repeat`, and `with`
blocks. The two incoming contexts must be merged, and any storage
location that was updated differently, or poisoned in either context, will
be considered poisoned in the result context.
### Blocks ###
Each routine is a block. It may be composed of inner blocks, if those
inner blocks are attached to certain instructions.
SixtyPical does not have instructions that map literally to the 6502 branch
instructions. Instead, it has an `if` construct, with two blocks (for the
"then" and `else` parts), and the branch instructions map to conditions for
this construct.
Similarly, there is a `repeat` construct. The same branch instructions can
be used in the condition to this construct. In this case, they branch back
to the top of the `repeat` loop.
The abstract states of the machine at each of the different block exits are
merged during analysis. If any register or memory location is treated
inconsistently (e.g. updated in one branch of the test, but not the other),
that register cannot subsequently be used without a declaration to the effect
that we know what's going on. (This is all a bit fuzzy right now.)
There is also no `rts` instruction. It is included at the end of a routine,
but only when the routine is used as a subroutine. Also, if the routine
ends by `jsr`ing another routine, it reserves the right to do a tail-call
or even a fallthrough.
There are also _with_ instructions, which are associated with three opcodes
that have natural symmetrical opcodes: `pha`, `php`, and `sei`. These
instructions take a block. The natural symmetrical opcode is inserted at
the end of the block.
* Inside a routine, an address may be declared with `temporary`. This is like
`static` in C, except the value at that address is not guaranteed to be
retained between invocations of the routine. Such addresses may only be used
within the routine where they are declared. If analysis indicates that two
temporary addresses are never used simultaneously, they may be merged
to the same address.
TODO
----
@ -178,9 +113,9 @@ TODO
* Initial values for reserved, incl. tables
* give length for tables, must be there for reserved, if no init val
* Character tables ("strings" to everybody else)
* Work out the analyses again and document them
* Addressing modes — indexed mode on more instructions
* `jsr (vector)`
* `jmp routine`
* insist on EOL after each instruction. need spacesWOEOL production
* asl .a
* `outputs` on externals

View File

@ -11,6 +11,9 @@ Checking SixtyPical Programs
-> Functionality "Check SixtyPical program" is implemented by
-> shell command "bin/sixtypical check %(test-file)"
Some Basic Syntax
-----------------
`main` must be present.
| routine main {
@ -45,7 +48,52 @@ A comment may appear after each declaration.
| }
= True
A program may `reserve` and `assign`.
Addresses
---------
An address may be declared with `reserve`, which is like `.data` or `.bss`
in an assembler. This is an address into the program's data. It is global
to all routines.
| reserve byte lives
| routine main {
| lda #3
| sta lives
| }
| routine died {
| dec lives
| }
= True
An address may be declared with `assign`, which is like `.alias` in an
assembler, with the understanding that the value will be treated "like an
address." This is generally an address into the operating system or hardware
(e.g. kernal routine, I/O port, etc.)
| assign byte screen $0400
| routine main {
| lda #0
| sta screen
| }
= True
The body of a routine may not refer to an address literally. It must use
a symbol that was declared previously with `reserve` or `assign`.
| routine main {
| lda #0
| sta $0400
| }
? unexpected "$"
| assign byte screen $0400
| routine main {
| lda #0
| sta screen
| }
= True
Test for many combinations of `reserve` and `assign`.
| reserve byte lives
| assign byte gdcol 647
@ -214,3 +262,55 @@ We cannot absolute access a vector.
| lda screen
| }
? incompatible types 'Vector' and 'Byte'
### Addresses ###
An address knows what kind of data is stored at the address:
* `byte`: an 8-bit byte. not part of a word. not to be used as an address.
(could be an index though.)
* `word`: a 16-bit word. not to be used as an address.
* `vector`: a 16-bit address of a routine. Only a handful of operations
are supported on vectors:
* copying the contents of one vector to another
* copying the address of a routine into a vector
* jumping indirectly to a vector (i.e. to the code at the address
contained in the vector); this can only happen at the end of a
routine (NYI)
* `jsr`'ing indirectly to a vector (which is done with a fun
generated trick (NYI))
* `byte table`: a series of `byte`s contiguous in memory starting from the
address. This is the only kind of address that can be used in
indexed addressing.
### Blocks ###
Each routine is a block. It may be composed of inner blocks, if those
inner blocks are attached to certain instructions.
SixtyPical does not have instructions that map literally to the 6502 branch
instructions. Instead, it has an `if` construct, with two blocks (for the
"then" and `else` parts), and the branch instructions map to conditions for
this construct.
Similarly, there is a `repeat` construct. The same branch instructions can
be used in the condition to this construct. In this case, they branch back
to the top of the `repeat` loop.
The abstract states of the machine at each of the different block exits are
merged during analysis. If any register or memory location is treated
inconsistently (e.g. updated in one branch of the test, but not the other),
that register cannot subsequently be used without a declaration to the effect
that we know what's going on. (This is all a bit fuzzy right now.)
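Since the merge rule is still fuzzy, here is one possible reading of it as a Python sketch. Everything in it is an assumption made for illustration: the `Status` enum, the `merge_contexts` function, and the simplification that a location is either unchanged, updated, or poisoned on a given path (so two paths that both update a location count as consistent, even though their values might differ).

```python
# Hypothetical sketch of merging abstract states at block exits.
from enum import Enum


class Status(Enum):
    UNCHANGED = "unchanged"  # not written on this path
    UPDATED = "updated"      # written on this path
    POISONED = "poisoned"    # may not be used without a declaration


def merge_contexts(a, b):
    """Merge the abstract states of two block exits.

    A register or memory location keeps its status only when both
    paths agree on it; any disagreement poisons it. Locations that a
    path does not mention are treated as UNCHANGED on that path.
    """
    merged = {}
    for loc in set(a) | set(b):
        sa = a.get(loc, Status.UNCHANGED)
        sb = b.get(loc, Status.UNCHANGED)
        merged[loc] = sa if sa == sb else Status.POISONED
    return merged
```

For example, if the "then" block of an `if` updates A and the `else` block does not, the merged state marks A as poisoned, while a location that neither block touches stays unchanged.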
There is also no `rts` instruction. It is included at the end of a routine,
but only when the routine is used as a subroutine. Also, if the routine
ends by `jsr`ing another routine, it reserves the right to do a tail-call
or even a fallthrough.
There are also _with_ instructions, which are associated with three opcodes
that have natural symmetrical opcodes: `pha`, `php`, and `sei`. These
instructions take a block. The natural symmetrical opcode is inserted at
the end of the block.
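The pairing described above can be sketched as a small expansion step. This is only an illustration, not SixtyPical's code generator: `CLOSING_OPCODE` and `expand_with` are hypothetical names, and the block body is modelled as a plain list of opcode strings. The pairs themselves are the standard 6502 ones: `pha`/`pla`, `php`/`plp`, and `sei`/`cli`.

```python
# Hypothetical sketch of how a `with` block could be flattened.

# Each opening opcode and the opcode that undoes it.
CLOSING_OPCODE = {"pha": "pla", "php": "plp", "sei": "cli"}


def expand_with(opcode, body):
    """Return the block's instructions with the closing opcode appended.

    Raises ValueError for an opcode that has no symmetrical partner,
    since such an instruction cannot take a block.
    """
    if opcode not in CLOSING_OPCODE:
        raise ValueError(f"{opcode!r} cannot take a block")
    return [opcode, *body, CLOSING_OPCODE[opcode]]
```

Because the closing opcode is appended by the expansion rather than written by the programmer, it cannot be forgotten or misplaced, which is the point of making these instructions take a block.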