mirror of https://github.com/catseye/SixtyPical.git synced 2025-01-07 12:29:52 +00:00
A 6502-oriented low-level programming language supporting advanced static analysis

SixtyPical

SixtyPical is a very low-level programming language, similar to 6502 assembly, with static analysis through type-checking and abstract interpretation.

It is a work in progress, currently at the proof-of-concept stage.

It is expected that a common use case for SixtyPical would be retroprogramming for the Commodore 64 and other 6502-based computers such as the VIC-20.

Many SixtyPical instructions map precisely to 6502 opcodes. However, SixtyPical is not an assembly language: the programmer does not have total control over the layout of code and data in memory. Some 6502 opcodes have no SixtyPical equivalent, while some have an equivalent that acts in a slightly different (but intuitively related) way. And some commands are unique to SixtyPical.

sixtypical is the reference implementation of SixtyPical. It is written in Haskell. It can currently parse and check a SixtyPical program, and can emit an Ophis assembler listing for it.

This distribution will soon be placed under an open-source license.

Quick Start

If you have ghc, Ophis, and VICE 2.4 installed, clone this repo, cd into it, and run

./loadngo.sh eg/demo.60p

The Big Idea(s)

Typed Addresses

SixtyPical distinguishes several kinds of addresses: those that hold a byte, those that hold a word (in low-byte-high-byte sequence), those that are the beginning of a table of bytes, and vectors (those that hold a word pointer to a machine-language routine). It prevents the program from accessing an address in ways that do not fit its kind. For example, these are illegal:

reserve byte lives
reserve word score
routine do_it {
    lda score        ; no! can't treat word as if it were a byte
    lda lives, x     ; no! can't treat a byte as if it were a table
}
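
As a rough illustration of the kind of check involved, here is a minimal Python sketch. It is not how sixtypical (which is written in Haskell) does it; the names `Kind`, `SYMBOLS` and `check_lda` are made up for this example, and the rule that plain `lda` wants a byte while indexed `lda` wants a byte table is an assumption drawn only from the two illegal lines above.

```python
from enum import Enum


class Kind(Enum):
    BYTE = "byte"
    WORD = "word"
    BYTE_TABLE = "byte table"
    VECTOR = "vector"


# Hypothetical symbol table mirroring the two `reserve` declarations above.
SYMBOLS = {"lives": Kind.BYTE, "score": Kind.WORD}


def check_lda(name, indexed=False, symbols=SYMBOLS):
    """Return None if the access is allowed, else an error message.

    Assumption for this sketch: `lda name` needs a byte address and
    `lda name, x` needs the start of a byte table.
    """
    kind = symbols[name]
    required = Kind.BYTE_TABLE if indexed else Kind.BYTE
    if kind is not required:
        return f"lda needs a {required.value} address, but {name} is a {kind.value}"
    return None
```

With this sketch, `check_lda("score")` and `check_lda("lives", indexed=True)` both report an error, while `check_lda("lives")` is accepted.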

Abstract Interpretation

SixtyPical tries to prevent the program from using data that has no meaning.

The instructions of a routine are analyzed using abstract interpretation. One thing we specifically do is determine which registers and memory locations are not affected by the routine. For example, the following:

routine do_it {
    lda #0
    jsr update_score
    sta vic_border_colour    ; uh... what do we know about reg A here?
}

...is illegal unless one of the following is true:

  • the A register is declared to be a meaningful output of update_score
  • update_score was determined to not change the value of the A register

The first case requires an explicit declaration on update_score. The second case will be inferred using abstract interpretation of the code of update_score.
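
The two cases can be sketched as a small set computation. This is only an illustration in Python, not the analyzer's actual code; the function `meaningful_after_call` and the representation of registers and memory locations as plain strings are assumptions made for the example.

```python
def meaningful_after_call(before, outputs, changed):
    """Locations that still hold a meaningful value after a `jsr`.

    before:  locations known to be meaningful just before the call
    outputs: locations the callee explicitly declares as outputs
    changed: locations the callee may modify (inferred from its body)
    """
    # Anything the callee does not touch keeps its meaning (second case);
    # anything it declares as an output becomes meaningful (first case).
    survivors = before - changed
    return survivors | outputs


# `lda #0` has made A meaningful before the call.
before = {"a"}

# update_score clobbers A and declares no outputs: `sta` would be rejected.
clobbered = meaningful_after_call(before, outputs=set(), changed={"a", "score"})

# update_score leaves A alone: `sta` is fine.
preserved = meaningful_after_call(before, outputs=set(), changed={"score"})

# update_score changes A but declares it as an output: `sta` is fine.
declared = meaningful_after_call(before, outputs={"a"}, changed={"a", "score"})
```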

Structured Programming

SixtyPical eschews labels for code and instead organizes code into blocks.

Instead of the assembly-language subroutine, SixtyPical provides the routine as the abstraction for a reusable sequence of code. A routine may be called, or may be included inline, by another routine. The body of a routine is a block.

Along with routines, you get if, repeat, and with constructs which take blocks. The with construct takes an instruction like sei and implicitly (and unavoidably) inserts the corresponding cli at the end of the block.

Abstract interpretation extends to if blocks. The two incoming contexts are merged, and any storage locations poisoned in either context are considered poisoned in the result context.
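
The merge rule can be written down in a few lines. The following Python sketch assumes a context is simply a dictionary from storage location name to a "poisoned" flag; that representation and the name `merge_contexts` are invented for illustration and are not taken from the reference implementation.

```python
def merge_contexts(then_ctx, else_ctx):
    """Merge the contexts leaving the two branches of an `if`.

    Each context maps a storage location name to True if it is poisoned.
    A location is poisoned in the result if it is poisoned in either branch;
    a location missing from a context is treated as not poisoned there.
    """
    locations = set(then_ctx) | set(else_ctx)
    return {
        loc: then_ctx.get(loc, False) or else_ctx.get(loc, False)
        for loc in locations
    }


# A is poisoned only in the "then" branch, so it is poisoned afterwards.
merged = merge_contexts({"a": True, "x": False}, {"a": False, "x": False})
```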

(The same should apply to repeat and with and, really, to many other cases for which there just aren't enough test cases yet.)

For More Information

For more information, see the docs, which are written in the form of Falderal literate test suites. If you have Falderal installed, you can run the tests with ./test.sh.

Ideas

These aren't implemented yet:

  • Inside a routine, an address may be declared with temporary. This is like static in C, except the value at that address is not guaranteed to be retained between invocations of the routine. Such addresses may only be used within the routine where they are declared. If analysis indicates that two temporary addresses are never used simultaneously, they may be merged to the same address.
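
Since this idea is not implemented, the following is only one way the merging might work, sketched in Python. It assumes the analysis can summarize each temporary as a single (first use, last use) range of instruction positions, which is a simplification; `assign_slots` and that representation are hypothetical.

```python
def assign_slots(live_ranges):
    """Assign temporaries to shared slots.

    live_ranges maps a temporary's name to (first_use, last_use), both
    inclusive. Two temporaries share a slot only if their ranges do not
    overlap. Returns a dict from name to slot number.
    """
    slots = []  # slots[i] is the last_use of the temporary last placed in slot i
    assignment = {}
    # Visit temporaries in order of first use and reuse the first free slot.
    for name, (start, end) in sorted(live_ranges.items(), key=lambda kv: kv[1][0]):
        for i, busy_until in enumerate(slots):
            if busy_until < start:
                slots[i] = end
                assignment[name] = i
                break
        else:
            slots.append(end)
            assignment[name] = len(slots) - 1
    return assignment


# t1 and t2 are never in use at the same time, so they can share an address;
# t3 overlaps both and needs its own.
slots = assign_slots({"t1": (0, 3), "t2": (4, 7), "t3": (2, 5)})
```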

TODO

  • Initial values for reserved, incl. tables
  • give length for tables, must be there for reserved, if no init val
  • Character tables ("strings" to everybody else)
  • Addressing modes — indexed mode on more instructions
  • jsr (vector)
  • jmp routine
  • comments in any spaces; forget the eol thing
  • outputs on externals
  • Routine is a kind of StorageLocation? (Location)?
  • remove DELTA -> ADD/SUB (requires carry be notated on ADD and SUB though)
  • explicit with syntax