mirror of https://github.com/catseye/SixtyPical.git synced 2024-11-22 01:32:13 +00:00
2018-04-19 10:35:21 +01:00

SixtyPical
==========

_Version 0.16. Work-in-progress, everything is subject to change._

**SixtyPical** is a 6502-like programming language with advanced
static analysis.

"6502-like" means that it has restrictions similar to those of programming
in 6502 assembly (e.g. the programmer must choose the registers that
values will be stored in) and is concomitantly easy for a compiler to
translate to 6502 machine language code.

"Advanced static analysis" includes _abstract interpretation_, where we
go through the program step by step, tracking not just the changes that
happen during a _specific_ execution of the program, but _sets_ of changes
that could _possibly_ happen in any run of the program. This lets us
determine that certain things can never happen, which we can then formulate
as safety checks.

In practice, this means it catches things like

* you forgot to clear carry before adding something to the accumulator
* a subroutine that you call trashes a register you thought was preserved
* you tried to read or write a byte beyond the end of a byte array
* you tried to write the address of something that was not a routine, to
  a jump vector

and suchlike. It also provides some convenient operations based on
machine-language programming idioms, such as

* copying values from one register to another (via a third register when
  there are no underlying instructions that directly support it); this
  includes 16-bit values, which are copied in two steps
* explicit tail calls
* indirect subroutine calls

The reference implementation can analyze and compile SixtyPical programs to
6502 machine code.
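
As a rough illustration of the abstract interpretation idea described above, here is a minimal Python sketch. It is not SixtyPical's analyzer: the tuple-based instruction format, the `analyze` function, and the rule that an `if` keeps only the locations that are meaningful on both branches are all assumptions made for this example. It tracks the set of locations known to hold meaningful values and rejects an `add` when the carry flag might not have been set up.

```python
def analyze(program, meaningful=()):
    """Return the set of locations that are meaningful after `program`.

    Raises ValueError if an instruction reads a location that might not
    hold a meaningful value on some path through the program.
    (Toy model; the instruction tuples are hypothetical.)
    """
    meaningful = set(meaningful)

    def require(loc):
        # Integer operands are literals and are always meaningful.
        if isinstance(loc, str) and loc not in meaningful:
            raise ValueError(f"{loc} may not be meaningful here")

    for instr in program:
        op = instr[0]
        if op == "ld":                  # ("ld", dest, src)
            require(instr[2])
            meaningful.add(instr[1])
        elif op == "clc":               # ("clc",) gives carry a known value
            meaningful.add("c")
        elif op == "add":               # ("add", dest, src) reads dest, src, carry
            for loc in (instr[1], instr[2], "c"):
                require(loc)
            meaningful.update({instr[1], "c"})
        elif op == "trash":             # ("trash", loc)
            meaningful.discard(instr[1])
        elif op == "if":                # ("if", then_block, else_block)
            after_then = analyze(instr[1], meaningful)
            after_else = analyze(instr[2], meaningful)
            # Keep only what is meaningful whichever branch ran.
            meaningful = after_then & after_else
        else:
            raise ValueError(f"unknown instruction: {op}")
    return meaningful


ok = [("ld", "a", 1), ("clc",), ("add", "a", 2)]
print(sorted(analyze(ok)))  # ['a', 'c']
```

Dropping the `("clc",)` step, or clearing carry in only one branch of an `if`, makes `analyze` raise, because on at least one possible path the carry flag has no known value when `add` reads it.
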

Quick Start
-----------

If you have the [VICE][] emulator installed, from this directory, you can run

    ./loadngo.sh c64 eg/c64/hearts.60p

and it will compile the [hearts.60p source code](eg/c64/hearts.60p) and
automatically start it in the `x64` emulator, and you should see:

![Screenshot of result of running hearts.60p](https://raw.github.com/catseye/SixtyPical/master/images/hearts.png)

You can try the `loadngo.sh` script on other sources in the `eg` directory
tree, which contains more extensive examples, including an entire
game(-like program); see [eg/README.md](eg/README.md) for a listing.

[VICE]: http://vice-emu.sourceforge.net/

Documentation
-------------

* [Design Goals](doc/Design%20Goals.md)
* [SixtyPical specification](doc/SixtyPical.md)
* [SixtyPical revision history](HISTORY.md)
* [Literate test suite for SixtyPical syntax](tests/SixtyPical%20Syntax.md)
* [Literate test suite for SixtyPical analysis](tests/SixtyPical%20Analysis.md)
* [Literate test suite for SixtyPical compilation](tests/SixtyPical%20Compilation.md)
* [Literate test suite for SixtyPical fallthru optimization](tests/SixtyPical%20Fallthru.md)
* [6502 Opcodes used/not used in SixtyPical](doc/6502%20Opcodes.md)

TODO
----

### `low` and `high` address operators

To turn a `word` type into a `byte`.

Trying to remember if we have a compelling case for this or not. The best I can think
of is for implementing 16-bit `cmp` in an efficient way. Maybe we should see if we
can get by with 16-bit `cmp` instead though.

The problem is that once a byte is extracted, putting it back into a word is awkward.
The address operators have to modify a destination in a special way. That is, when
you say `st a, >word`, you are updating `word` to be `word & $ff | a << 8`, or something like that.
Is that consistent with `st`? Well, probably it is, but we have to explain it.
It might make more sense, then, for it to be "part of the operation" instead of "part of
the reference"; something like `st.hi x, word`; `st.lo y, word`. Dunno.
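
To make the arithmetic behind `st a, >word` concrete, here is a small Python sketch of what storing into the high or low byte of a 16-bit word would compute. The function names `st_hi` and `st_lo` just echo the tentative `st.hi` / `st.lo` spelling above; they are illustrative and not part of SixtyPical.

```python
def st_hi(word, value):
    """Return `word` with its high byte replaced by the byte `value`."""
    return (word & 0x00FF) | ((value & 0xFF) << 8)


def st_lo(word, value):
    """Return `word` with its low byte replaced by the byte `value`."""
    return (word & 0xFF00) | (value & 0xFF)


print(hex(st_hi(0x1234, 0xAB)))  # 0xab34
print(hex(st_lo(0x1234, 0xAB)))  # 0x12ab
```

Either way the store is a read-modify-write of the whole word, which is the part that would need explaining if it were spelled as a plain `st`.
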

### Save values

A `save` block preserves the named values, so that, semantically, they can be used later even though they
are trashed (or otherwise alternately used) inside the block.

Inside the block, we set them as writeable (but not meaningful). When the block
exits, we restore whatever status they had.

This act will trash `a`, both in the block, and outside it, unless the value being
saved is `a`. One idiom would be something like

    save a { save var {
        ...
    } }

which would save both `a` and `var`. Maybe abbreviate this to

    save a, var {
        ...
    }

This can use the stack. But it need not use the stack.
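
The status bookkeeping described above could be sketched as follows. This is an assumption-laden toy in Python, not the analyzer's real data structures: the `Context` class, its `meaningful` and `writeable` sets, and the `save` context manager are all hypothetical names chosen for the example.

```python
from contextlib import contextmanager


class Context:
    """Toy analysis context: which locations are meaningful / writeable."""

    def __init__(self, meaningful=(), writeable=()):
        self.meaningful = set(meaningful)
        self.writeable = set(writeable)


def _set_member(collection, item, present):
    # Add or remove `item` so that membership equals `present`.
    if present:
        collection.add(item)
    else:
        collection.discard(item)


@contextmanager
def save(ctx, *locations):
    # Remember the status each saved location had on entry.
    saved = {loc: (loc in ctx.meaningful, loc in ctx.writeable)
             for loc in locations}
    for loc in locations:
        ctx.writeable.add(loc)        # writeable inside the block...
        ctx.meaningful.discard(loc)   # ...but not meaningful
    if "a" not in locations:
        ctx.meaningful.discard("a")   # `a` is used to do the saving
    try:
        yield ctx
    finally:
        for loc, (was_meaningful, was_writeable) in saved.items():
            _set_member(ctx.meaningful, loc, was_meaningful)
            _set_member(ctx.writeable, loc, was_writeable)
        if "a" not in locations:
            ctx.meaningful.discard("a")   # `a` is used to do the restoring
```

With this model, `with save(ctx, "a", "var"):` behaves like the nested form: `a` and `var` come back with their original status, whereas `with save(ctx, "var"):` leaves `a` trashed afterwards.
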

### Make all symbols forward-referencable

Basically, don't do symbol-table lookups when parsing, but do have a more formal
"symbol resolution" (linking) phase right after parsing.

### Associate each pointer with the buffer it points into

Check that the buffer being read or written to through a pointer appears in the appropriate
inputs or outputs set.

In the analysis, when we obtain a pointer, we need to record, in context, what buffer
that pointer came from.

When we write through that pointer, we need to set that buffer as written.

When we read through the pointer, we need to check that the buffer is readable.
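
The three steps above can be sketched in Python as follows. The `PointerContext` class and its method names are hypothetical, and treating a buffer as "readable" when it is an input or has already been written is an assumption made for this example rather than a documented rule.

```python
class PointerContext:
    """Toy analysis context that remembers which buffer each pointer came from."""

    def __init__(self, inputs=(), outputs=()):
        self.inputs = set(inputs)      # buffers the routine may read
        self.outputs = set(outputs)    # buffers the routine may write
        self.origin = {}               # pointer name -> buffer name
        self.written = set()           # buffers written so far

    def point_into(self, pointer, buffer):
        # Record, in the context, what buffer the pointer came from.
        self.origin[pointer] = buffer

    def _buffer_of(self, pointer):
        if pointer not in self.origin:
            raise ValueError(f"{pointer} does not point into a known buffer")
        return self.origin[pointer]

    def write_through(self, pointer):
        buffer = self._buffer_of(pointer)
        if buffer not in self.outputs:
            raise ValueError(f"{buffer} is not an output of this routine")
        self.written.add(buffer)       # mark the buffer as written

    def read_through(self, pointer):
        buffer = self._buffer_of(pointer)
        if buffer not in self.inputs and buffer not in self.written:
            raise ValueError(f"{buffer} is not readable here")
```
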

### Table overlays

They are uninitialized, but the twist is, the address is a buffer that is
an input to and/or output of the routine. So, they are defined (insofar
as the buffer is defined.)

They are therefore a "view" of a section of a buffer.

This is slightly dangerous since it does permit aliases: the buffer and the
table refer to the same memory.

Although, if they are `static`, you could say, in the routine in which they
are `static`, as soon as you've established one, you can no longer use the
buffer; and the ones you establish must be disjoint.

(That seems to be the most compelling case for restricting them to `static`.)

An alternative would be `static` pointers, which are currently not possible because
pointers must be zero-page, thus `@`, thus uninitialized.

### Question "consistent initialization"

Question the value of the "consistent initialization" principle for `if` statement analysis.

Part of this is the trashes at the end; I think the set of trashes
after the `if` should be the union of the trashes in each of the branches; this would obviate the
need to `trash` values explicitly, but if you tried to access them afterwards, it would still
be an error.
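
The proposed rule is small enough to state as code. The sketch below is illustrative Python only; `trashes_after_if` and `check_read` are hypothetical helpers, not functions in the reference implementation.

```python
def trashes_after_if(trashed_before, then_trashes, else_trashes):
    """Locations considered trashed once the `if` has finished."""
    return set(trashed_before) | set(then_trashes) | set(else_trashes)


def check_read(location, trashed):
    """Reading a trashed location is still an error after the `if`."""
    if location in trashed:
        raise ValueError(f"{location} was trashed on at least one branch")


# One branch trashes `a`, the other trashes `x`: both count afterwards.
trashed = trashes_after_if(set(), {"a"}, {"x"})
check_read("y", trashed)   # fine, `y` was untouched on both branches
```
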

### Tail-call optimization

More generally, define a block as having zero or one `goto`s at the end (and `goto`s cannot
appear elsewhere).

If a block ends in a `call`, can that be converted to end in a `goto`? Why not? I think it can.
The constraints should iron out the same both ways.

And, once we have this, why do we need `goto` to be in tail position, strictly?
As long as the routine has a consistent type context every place it exits, that should be fine.
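
As a sketch of the conversion being considered, the Python below rewrites a block's final `call` into a `goto` and checks the "zero or one `goto`, only at the end" rule. The list-of-tuples block representation and both function names are assumptions for illustration. On the 6502 this corresponds to the familiar replacement of a `JSR` immediately followed by `RTS` with a single `JMP`.

```python
def convert_tail_call(block):
    """Return a copy of `block` whose final ("call", r) becomes ("goto", r)."""
    if block and block[-1][0] == "call":
        return block[:-1] + [("goto", block[-1][1])]
    return list(block)


def gotos_well_placed(block):
    """True if no `goto` appears anywhere except as the last instruction."""
    return all(instr[0] != "goto" for instr in block[:-1])


block = [("ld", "a", 0), ("call", "setup"), ("call", "draw")]
print(convert_tail_call(block))
# [('ld', 'a', 0), ('call', 'setup'), ('goto', 'draw')]
```

Only the last `call` is touched; earlier calls still need to return into the block.
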

### "Include" directives

Search a list of include paths for the named file, and use includes to make libraries of routines.

One such library routine might be an `interrupt routine` type for various architectures.
Since "the supervisor" has stored values on the stack, we should be able to trash them
with impunity, in such a routine.
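
The include-path search mentioned at the start of this section could be as simple as the following Python sketch. The `find_include` function and the idea of returning the first match in search-list order are assumptions; nothing here is an existing SixtyPical option or API.

```python
from pathlib import Path


def find_include(name, search_paths):
    """Return the first existing file called `name` in `search_paths`, else None.

    Directories are tried in order, so earlier entries shadow later ones.
    """
    for directory in search_paths:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

A hypothetical call such as `find_include("interrupt.60p", ["lib", "eg/lib"])` would then pick `lib/interrupt.60p` if it exists and fall back to `eg/lib/interrupt.60p` otherwise.
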