diff --git a/.hgignore b/.hgignore deleted file mode 100644 index bb6d589..0000000 --- a/.hgignore +++ /dev/null @@ -1,3 +0,0 @@ -syntax: glob - -*.pyc diff --git a/.hgtags b/.hgtags deleted file mode 100644 index d8a6bfa..0000000 --- a/.hgtags +++ /dev/null @@ -1,11 +0,0 @@ -923e42a2d0c156f3eeea006301a067e693658fe1 0.1 -6490aea10e20c346a2f1598f698a781799b522ba 0.1-2014.1230 -9ad29480d9bb8425445504ff90fc14a1b1862789 0.2 -5d95f1d75a3226cb117e6b33abc5bdb04e161ef8 0.3 -d30f05a8bb46a2e6c5c61424204a4c8db6365723 0.4 -7a39b84bb002a0938109d705e0faf90d9fd8c424 0.6 -7a39b84bb002a0938109d705e0faf90d9fd8c424 0.6 -0000000000000000000000000000000000000000 0.6 -19c782179db9ed786d6e93a57d53bb31ac355dc4 0.5 -0000000000000000000000000000000000000000 0.6 -f89772f47de989c54434ec11935693c8b8d0f46d 0.6 diff --git a/HISTORY.md b/HISTORY.md index 62a3214..08e004d 100644 --- a/HISTORY.md +++ b/HISTORY.md @@ -1,6 +1,27 @@ History of SixtyPical ===================== +0.13 +---- + +* It is a static analysis error if it cannot be proven that a read or write + to a table falls within the defined size of that table. +* The reference analyzer's ability to prove this is currently fairly weak, + but it does exist: + * Loading a constant into a memory location means we know the range + is exactly that one constant value. + * `AND`ing a memory location with a value means the range of the + memory location cannot exceed the range of the value. + * Doing arithmetic on a memory location invalidates our knowledge + of its range. + * Copying a value from one memory location to another copies the + known range as well. +* Cleaned up the internals of the reference implementation (incl. the AST) + and re-organized the example programs in the `eg` subdirectory. +* Most errors produced by the reference implementation now include a line number. +* Compiler supports multiple preludes, specifically both Commodore 64 and + Commodore VIC-20; the `loadngo.sh` script supports both architectures too. 
+ 0.12 ---- diff --git a/LICENSE b/LICENSE index 35e493b..9d5dd32 100644 --- a/LICENSE +++ b/LICENSE @@ -7,7 +7,7 @@ covered by the following BSD-compatible license, modelled after the ----------------------------------------------------------------------------- - Copyright (c)2014-2015 Chris Pressey, Cat's Eye Technologies. + Copyright (c)2014-2018 Chris Pressey, Cat's Eye Technologies. The authors intend this Report to belong to the entire SixtyPical community, and so we grant permission to copy and distribute it for @@ -24,7 +24,7 @@ The source code for the reference implementation and supporting tools (in the ----------------------------------------------------------------------------- - Copyright (c)2014-2015, Chris Pressey, Cat's Eye Technologies. + Copyright (c)2014-2018, Chris Pressey, Cat's Eye Technologies. All rights reserved. Redistribution and use in source and binary forms, with or without diff --git a/README.md b/README.md index 1509c44..eb6d894 100644 --- a/README.md +++ b/README.md @@ -1,15 +1,28 @@ SixtyPical ========== -_Version 0.12. Work-in-progress, everything is subject to change._ +_Version 0.13. Work-in-progress, everything is subject to change._ -SixtyPical is a very low-level programming language, similar to 6502 assembly, -with static analysis through abstract interpretation. +**SixtyPical** is a 6502-assembly-like programming language with advanced +static analysis. + +"6502-assembly-like" means that it has similar restrictions as programming +in 6502 assembly (e.g. the programmer must choose the registers that +values will be stored in) and is concomitantly easy for a compiler to +translate it to 6502 machine language code. + +"Advanced static analysis" includes _abstract interpretation_, where we +go through the program step by step, tracking not just the changes that +happen during a _specific_ execution of the program, but _sets_ of changes +that could _possibly_ happen in any run of the program. 
This lets us +determine that certain things can never happen, which we can present as +safety guarantees. In practice, this means it catches things like * you forgot to clear carry before adding something to the accumulator * a subroutine that you call trashes a register you thought was preserved +* you tried to read or write a byte beyond the end of a byte array * you tried to write the address of something that was not a routine, to a jump vector @@ -17,13 +30,29 @@ and suchlike. It also provides some convenient operations and abstractions based on common machine-language programming idioms, such as * copying values from one register to another (via a third register when - there are no underlying instructions that directly support it) + there are no underlying instructions that directly support it); this + includes 16-bit values, which are copied in two steps * explicit tail calls * indirect subroutine calls The reference implementation can analyze and compile SixtyPical programs to 6502 machine code. +Quick Start +----------- + +If you have the [VICE][] emulator installed, from this directory, you can run + + ./loadngo.sh c64 eg/c64/hearts.60p + +and it will compile the [hearts.60p source code](eg/c64/hearts.60p) and +automatically start it in the `x64` emulator, and you should see: + +![Screenshot of result of running hearts.60p](https://raw.github.com/catseye/SixtyPical/master/images/hearts.png) + +You can try the `loadngo.sh` script on other sources in the `eg` directory +tree. There is an entire small game(-like program) in [demo-game.60p](eg/c64/demo-game.60p). 
+ Documentation ------------- @@ -31,7 +60,6 @@ Documentation * [SixtyPical specification](doc/SixtyPical.md) * [SixtyPical revision history](HISTORY.md) * [Literate test suite for SixtyPical syntax](tests/SixtyPical%20Syntax.md) -* [Literate test suite for SixtyPical execution](tests/SixtyPical%20Execution.md) * [Literate test suite for SixtyPical analysis](tests/SixtyPical%20Analysis.md) * [Literate test suite for SixtyPical compilation](tests/SixtyPical%20Compilation.md) * [6502 Opcodes used/not used in SixtyPical](doc/6502%20Opcodes.md) @@ -39,26 +67,46 @@ Documentation TODO ---- +### `for`-like loop + +We have range-checking in the abstract analysis now, but we lack practical ways +to use it. + +We can `and` a value to ensure it is within a certain range. However, in the 6502 +ISA the only register you can `and` is `A`, while loops are done with `X` or `Y`. +Insisting this as the way to do it would result in a lot of `TXA`s and `TAX`s. + +What would be better is a dedicated `for` loop, like + + for x in 0 to 15 { + // in here, we know the range of x is exactly 0-15 inclusive + // also in here: we are disallowed from changing x + } + +However, this is slightly restrictive, and hides a lot. + +However however, options which do not hide a lot, require a lot of looking at +(to ensure: did you increment the loop variable? only once? etc.) + +The leading compromise so far is an "open-faced for loop", like + + ld x, 15 + for x downto 0 { + // same as above + } + +This makes it a little more explicit, at least, even though the loop +decrementation is still hidden. + ### Save registers on stack This preserves them, so that, semantically, they can be used later even though they are trashed inside the block. -### Range checking in the abstract interpretation - -If you copy the address of a buffer (say it is size N) to a pointer, it is valid. -If you add a value from 0 to N-1 to the pointer, it is still valid. -But if you add a value ≥ N to it, it becomes invalid. 
-This should be tracked in the abstract interpretation. -(If only because abstract interpretation is the major point of this project!) - -Range-checking buffers might be too difficult. Range checking tables will be easier. -If a value is ANDed with 15, its range must be 0-15, etc. - ### Re-order routines and optimize tail-calls to fallthroughs Not because it saves 3 bytes, but because it's a neat trick. Doing it optimally -is probably NP-complete. But doing it adeuqately is probably not that hard. +is probably NP-complete. But doing it adequately is probably not that hard. ### And at some point... @@ -71,8 +119,9 @@ is probably NP-complete. But doing it adeuqately is probably not that hard. * `static` pointers -- currently not possible because pointers must be zero-page, thus `@`, thus uninitialized. * Question the value of the "consistent initialization" principle for `if` statement analysis. * `interrupt` routines -- to indicate that "the supervisor" has stored values on the stack, so we can trash them. -* Error messages that include the line number of the source code. * Add absolute addressing in shl/shr, absolute-indexed for add, sub, etc. * Automatic tail-call optimization (could be tricky, w/constraints?) * Possibly `ld x, [ptr] + y`, possibly `st x, [ptr] + y`. * Maybe even `copy [ptra] + y, [ptrb] + y`, which can be compiled to indirect LDA then indirect STA! 
+ +[VICE]: http://vice-emu.sourceforge.net/ diff --git a/bin/sixtypical b/bin/sixtypical index 62b3c6a..72aabef 100755 --- a/bin/sixtypical +++ b/bin/sixtypical @@ -13,7 +13,7 @@ sys.path.insert(0, join(dirname(realpath(sys.argv[0])), '..', 'src')) # ----------------------------------------------------------------- # import codecs -from optparse import OptionParser +from argparse import ArgumentParser from pprint import pprint import sys import traceback @@ -25,28 +25,44 @@ from sixtypical.compiler import Compiler if __name__ == '__main__': - optparser = OptionParser(__doc__.strip()) + argparser = ArgumentParser(__doc__.strip()) - optparser.add_option("--analyze-only", - action="store_true", - help="Only parse and analyze the program; do not compile it.") - optparser.add_option("--basic-prelude", - action="store_true", - help="Insert a Commodore BASIC 2.0 snippet before the program " - "so that it can be LOADed and RUN on Commodore platforms.") - optparser.add_option("--debug", - action="store_true", - help="Display debugging information when analyzing and compiling.") - optparser.add_option("--parse-only", - action="store_true", - help="Only parse the program; do not analyze or compile it.") - optparser.add_option("--traceback", - action="store_true", - help="When an error occurs, display a full Python traceback.") + argparser.add_argument( + 'filenames', metavar='FILENAME', type=str, nargs='+', + help="The SixtyPical source files to compile." + ) + argparser.add_argument( + "--analyze-only", + action="store_true", + help="Only parse and analyze the program; do not compile it." + ) + argparser.add_argument( + "--prelude", type=str, + help="Insert a snippet before the compiled program " + "so that it can be LOADed and RUN on a certain platforms. " + "Also sets the origin. " + "Options are: c64 or vic20." + ) + argparser.add_argument( + "--debug", + action="store_true", + help="Display debugging information when analyzing and compiling." 
+ ) + argparser.add_argument( + "--parse-only", + action="store_true", + help="Only parse the program; do not analyze or compile it." + ) + argparser.add_argument( + "--traceback", + action="store_true", + help="When an error occurs, display a full Python traceback." + ) - (options, args) = optparser.parse_args(sys.argv[1:]) + options, unknown = argparser.parse_known_args(sys.argv[1:]) + remainder = ' '.join(unknown) - for filename in args: + for filename in options.filenames: text = open(filename).read() try: @@ -78,10 +94,16 @@ if __name__ == '__main__': fh = sys.stdout start_addr = 0xc000 prelude = [] - if options.basic_prelude: + if options.prelude == 'c64': start_addr = 0x0801 prelude = [0x10, 0x08, 0xc9, 0x07, 0x9e, 0x32, 0x30, 0x36, 0x31, 0x00, 0x00, 0x00] + elif options.prelude == 'vic20': + start_addr = 0x1001 + prelude = [0x0b, 0x10, 0xc9, 0x07, 0x9e, 0x34, + 0x31, 0x30, 0x39, 0x00, 0x00, 0x00] + else: + raise NotImplementedError # we are outputting a .PRG, so we output the load address first # we don't use the Emitter for this b/c not part of addr space diff --git a/doc/SixtyPical.md b/doc/SixtyPical.md index 79f4782..7e29545 100644 --- a/doc/SixtyPical.md +++ b/doc/SixtyPical.md @@ -25,9 +25,9 @@ There are five *primitive types* in SixtyPical: There are also three *type constructors*: -* T table[N] (N is a power of 2, 1 ≤ N ≤ 256; each entry holds a value +* T table[N] (N entries, 1 ≤ N ≤ 256; each entry holds a value of type T, where T is `byte`, `word`, or `vector`) -* buffer[N] (N entries; each entry is a byte; N is a power of 2, ≤ 64K) +* buffer[N] (N entries; each entry is a byte; 1 ≤ N ≤ 65536) * vector T (address of a value of type T; T must be a routine type) ### User-defined ### diff --git a/eg/proto-game.60p b/eg/c64/demo-game.60p similarity index 98% rename from eg/proto-game.60p rename to eg/c64/demo-game.60p index 7ce80de..afcda06 100644 --- a/eg/proto-game.60p +++ b/eg/c64/demo-game.60p @@ -385,6 +385,13 @@ define game_state_title_screen 
game_state_routine { ld y, 0 repeat { + + // First we "clip" the index to 0-31 to ensure we don't + // read outside the bounds of the table: + ld a, y + and a, 31 + ld y, a + ld a, press_fire_msg + y st on, c diff --git a/eg/screen2.60p b/eg/c64/hearts.60p similarity index 93% rename from eg/screen2.60p rename to eg/c64/hearts.60p index eb45824..43df36d 100644 --- a/eg/screen2.60p +++ b/eg/c64/hearts.60p @@ -1,7 +1,7 @@ // Displays 256 hearts at the top of the Commodore 64's screen. // Define where the screen starts in memory: -byte table screen @ 1024 +byte table[256] screen @ 1024 routine main // These are the values that will be written to by this routine: diff --git a/eg/intr1.60p b/eg/c64/intr1.60p similarity index 100% rename from eg/intr1.60p rename to eg/c64/intr1.60p diff --git a/eg/joystick.60p b/eg/c64/joystick.60p similarity index 100% rename from eg/joystick.60p rename to eg/c64/joystick.60p diff --git a/eg/screen1.60p b/eg/c64/screen1.60p similarity index 100% rename from eg/screen1.60p rename to eg/c64/screen1.60p diff --git a/eg/rudiments/README.md b/eg/rudiments/README.md new file mode 100644 index 0000000..51230f4 --- /dev/null +++ b/eg/rudiments/README.md @@ -0,0 +1,12 @@ +This directory contains example sources which demonstrate +the rudiments of SixtyPical. + +Some are meant to fail and produce an error message. + +They are not meant to be specific to any architecture, but +many do assume the existence of a routine at 65490 which +outputs the value of the accumulator as an ASCII character, +simply for the purposes of producing some observable output. +(This is an address of a KERNAL routine which does this +on both the Commodore 64 and the Commodore VIC-20, so these +sources should be usable on these architectures.) 
diff --git a/eg/add-fail.60p b/eg/rudiments/add-fail.60p similarity index 100% rename from eg/add-fail.60p rename to eg/rudiments/add-fail.60p diff --git a/eg/add-pass.60p b/eg/rudiments/add-pass.60p similarity index 100% rename from eg/add-pass.60p rename to eg/rudiments/add-pass.60p diff --git a/eg/add-word.60p b/eg/rudiments/add-word.60p similarity index 100% rename from eg/add-word.60p rename to eg/rudiments/add-word.60p diff --git a/eg/bad-vector.60p b/eg/rudiments/bad-vector.60p similarity index 100% rename from eg/bad-vector.60p rename to eg/rudiments/bad-vector.60p diff --git a/eg/buffer.60p b/eg/rudiments/buffer.60p similarity index 100% rename from eg/buffer.60p rename to eg/rudiments/buffer.60p diff --git a/eg/call.60p b/eg/rudiments/call.60p similarity index 100% rename from eg/call.60p rename to eg/rudiments/call.60p diff --git a/eg/conditional.60p b/eg/rudiments/conditional.60p similarity index 100% rename from eg/conditional.60p rename to eg/rudiments/conditional.60p diff --git a/eg/conditional2.60p b/eg/rudiments/conditional2.60p similarity index 100% rename from eg/conditional2.60p rename to eg/rudiments/conditional2.60p diff --git a/eg/copy.60p b/eg/rudiments/copy.60p similarity index 100% rename from eg/copy.60p rename to eg/rudiments/copy.60p diff --git a/eg/example.60p b/eg/rudiments/example.60p similarity index 100% rename from eg/example.60p rename to eg/rudiments/example.60p diff --git a/eg/forever.60p b/eg/rudiments/forever.60p similarity index 100% rename from eg/forever.60p rename to eg/rudiments/forever.60p diff --git a/eg/goto.60p b/eg/rudiments/goto.60p similarity index 100% rename from eg/goto.60p rename to eg/rudiments/goto.60p diff --git a/eg/if.60p b/eg/rudiments/if.60p similarity index 100% rename from eg/if.60p rename to eg/rudiments/if.60p diff --git a/eg/loop.p60 b/eg/rudiments/loop.60p similarity index 100% rename from eg/loop.p60 rename to eg/rudiments/loop.60p diff --git a/eg/memloc.p60 b/eg/rudiments/memloc.60p similarity 
index 100% rename from eg/memloc.p60 rename to eg/rudiments/memloc.60p diff --git a/eg/new-style-routine.60p b/eg/rudiments/new-style-routine.60p similarity index 100% rename from eg/new-style-routine.60p rename to eg/rudiments/new-style-routine.60p diff --git a/eg/print.60p b/eg/rudiments/print.60p similarity index 100% rename from eg/print.60p rename to eg/rudiments/print.60p diff --git a/eg/rudiments/range-error.60p b/eg/rudiments/range-error.60p new file mode 100644 index 0000000..32b61a1 --- /dev/null +++ b/eg/rudiments/range-error.60p @@ -0,0 +1,9 @@ +byte table[8] message : "WHAT?" + +routine main + inputs message + outputs x, a, z, n +{ + ld x, 9 + ld a, message + x +} diff --git a/eg/vector-table.60p b/eg/rudiments/vector-table.60p similarity index 100% rename from eg/vector-table.60p rename to eg/rudiments/vector-table.60p diff --git a/eg/vector.60p b/eg/rudiments/vector.60p similarity index 100% rename from eg/vector.60p rename to eg/rudiments/vector.60p diff --git a/eg/word-table.60p b/eg/rudiments/word-table.60p similarity index 91% rename from eg/word-table.60p rename to eg/rudiments/word-table.60p index a0dc978..ef40dd8 100644 --- a/eg/word-table.60p +++ b/eg/rudiments/word-table.60p @@ -1,5 +1,5 @@ word one -word table many +word table[256] many routine main inputs one, many diff --git a/images/hearts.png b/images/hearts.png new file mode 100644 index 0000000..355a506 Binary files /dev/null and b/images/hearts.png differ diff --git a/loadngo.sh b/loadngo.sh index 0a3401e..b3d0e1d 100755 --- a/loadngo.sh +++ b/loadngo.sh @@ -1,19 +1,41 @@ #!/bin/sh -if [ "X$X64" = "X" ]; then - X64=x64 -fi -SRC=$1 -if [ "X$1" = "X" ]; then - echo "Usage: ./loadngo.sh " - exit 1 -fi -OUT=/tmp/a-out.prg -bin/sixtypical --traceback --basic-prelude $SRC > $OUT || exit 1 -ls -la $OUT -if [ -e vicerc ]; then - $X64 -config vicerc $OUT +usage="Usage: loadngo.sh (c64|vic20) [--dry-run] " + +arch="$1" +shift 1 +if [ "X$arch" = "Xc64" ]; then + prelude='c64' + if [ -e vicerc 
]; then + emu="x64 -config vicerc" + else + emu="x64" + fi +elif [ "X$arch" = "Xvic20" ]; then + prelude='vic20' + if [ -e vicerc ]; then + emu="xvic -config vicerc" + else + emu="xvic" + fi else - $X64 $OUT + echo $usage && exit 1 fi -rm -f $OUT + +if [ "X$1" = "X--dry-run" ]; then + shift 1 + emu='echo' +fi + +src="$1" +if [ "X$src" = "X" ]; then + echo $usage && exit 1 +fi + +### do it ### + +out=/tmp/a-out.prg +bin/sixtypical --traceback --prelude=$prelude $src > $out || exit 1 +ls -la $out +$emu $out +rm -f $out diff --git a/src/sixtypical/analyzer.py b/src/sixtypical/analyzer.py index ecc88e4..88088aa 100644 --- a/src/sixtypical/analyzer.py +++ b/src/sixtypical/analyzer.py @@ -1,6 +1,6 @@ # encoding: UTF-8 -from sixtypical.ast import Program, Routine, Block, Instr +from sixtypical.ast import Program, Routine, Block, Instr, SingleOp, If, Repeat, WithInterruptsOff from sixtypical.model import ( TYPE_BYTE, TYPE_WORD, TableType, BufferType, PointerType, VectorType, RoutineType, @@ -10,7 +10,16 @@ from sixtypical.model import ( class StaticAnalysisError(ValueError): - pass + def __init__(self, ast, message): + super(StaticAnalysisError, self).__init__(ast, message) + + def __str__(self): + ast = self.args[0] + message = self.args[1] + if isinstance(ast, Routine): + return "{} (in {}, line {})".format(message, ast.name, ast.line_number) + else: + return "{} (line {})".format(message, ast.line_number) class UnmeaningfulReadError(StaticAnalysisError): @@ -37,7 +46,12 @@ class IllegalJumpError(StaticAnalysisError): pass +class RangeExceededError(StaticAnalysisError): + pass + + class ConstraintsError(StaticAnalysisError): + """The constraints of a routine (inputs, outputs, trashes) have been violated.""" pass @@ -66,11 +80,17 @@ class Context(object): """ A location is touched if it was changed (or even potentially changed) during this routine, or some routine called by this routine. 
- + A location is meaningful if it was an input to this routine, or if it was set to a meaningful value by some operation in this - routine (or some routine called by this routine. - + routine (or some routine called by this routine). + + If a location is meaningful, it has a range. This range represents + the lowest and highest values that it might possibly be (i.e. we know + it cannot possibly be below the lowest or above the highest.) In the + absence of any usage information, the range of a byte, is 0..255 and + the range of a word is 0..65535. + A location is writeable if it was listed in the outputs and trashes lists of this routine. """ @@ -78,40 +98,40 @@ class Context(object): self.routines = routines # Location -> AST node self.routine = routine self._touched = set() - self._meaningful = set() + self._range = dict() self._writeable = set() for ref in inputs: if ref.is_constant(): - raise ConstantConstraintError('%s in %s' % (ref.name, routine.name)) - self._meaningful.add(ref) + raise ConstantConstraintError(self.routine, ref.name) + self._range[ref] = ref.max_range() output_names = set() for ref in outputs: if ref.is_constant(): - raise ConstantConstraintError('%s in %s' % (ref.name, routine.name)) + raise ConstantConstraintError(self.routine, ref.name) output_names.add(ref.name) self._writeable.add(ref) for ref in trashes: if ref.is_constant(): - raise ConstantConstraintError('%s in %s' % (ref.name, routine.name)) + raise ConstantConstraintError(self.routine, ref.name) if ref.name in output_names: - raise InconsistentConstraintsError('%s in %s' % (ref.name, routine.name)) + raise InconsistentConstraintsError(self.routine, ref.name) self._writeable.add(ref) def __str__(self): - return "Context(\n _touched={},\n _meaningful={},\n _writeable={}\n)".format( - LocationRef.format_set(self._touched), LocationRef.format_set(self._meaningful), LocationRef.format_set(self._writeable) + return "Context(\n _touched={},\n _range={},\n _writeable={}\n)".format( + 
LocationRef.format_set(self._touched), LocationRef.format_set(self._range), LocationRef.format_set(self._writeable) ) def clone(self): c = Context(self.routines, self.routine, [], [], []) c._touched = set(self._touched) - c._meaningful = set(self._meaningful) + c._range = dict(self._range) c._writeable = set(self._writeable) return c def each_meaningful(self): - for ref in self._meaningful: + for ref in self._range.keys(): yield ref def each_touched(self): @@ -127,11 +147,11 @@ class Context(object): if ref.is_constant() or ref in self.routines: pass elif isinstance(ref, LocationRef): - if ref not in self._meaningful: - message = '%s in %s' % (ref.name, self.routine.name) + if ref not in self._range: + message = ref.name if kwargs.get('message'): message += ' (%s)' % kwargs['message'] - raise exception_class(message) + raise exception_class(self.routine, message) elif isinstance(ref, IndexedRef): self.assert_meaningful(ref.ref, **kwargs) self.assert_meaningful(ref.index, **kwargs) @@ -145,23 +165,70 @@ class Context(object): if routine_has_static(self.routine, ref): continue if ref not in self._writeable: - message = '%s in %s' % (ref.name, self.routine.name) + message = ref.name if kwargs.get('message'): message += ' (%s)' % kwargs['message'] - raise exception_class(message) + raise exception_class(self.routine, message) + + def assert_in_range(self, inside, outside): + # FIXME there's a bit of I'm-not-sure-the-best-way-to-do-this-ness, here... 
+ + # inside should always be meaningful + inside_range = self._range[inside] + + # outside might not be meaningful, so default to max range if necessary + if outside in self._range: + outside_range = self._range[outside] + else: + outside_range = outside.max_range() + if isinstance(outside.type, TableType): + outside_range = (0, outside.type.size-1) + + if inside_range[0] < outside_range[0] or inside_range[1] > outside_range[1]: + raise RangeExceededError(self.routine, + "Possible range of {} {} exceeds acceptable range of {} {}".format( + inside, inside_range, outside, outside_range + ) + ) def set_touched(self, *refs): for ref in refs: self._touched.add(ref) + # TODO: it might be possible to invalidate the range here def set_meaningful(self, *refs): for ref in refs: - self._meaningful.add(ref) + if ref not in self._range: + self._range[ref] = ref.max_range() + + def set_top_of_range(self, ref, top): + self.assert_meaningful(ref) + (bottom, _) = self._range[ref] + self._range[ref] = (bottom, top) + + def get_top_of_range(self, ref): + if isinstance(ref, ConstantRef): + return ref.value + self.assert_meaningful(ref) + (_, top) = self._range[ref] + return top + + def copy_range(self, src, dest): + self.assert_meaningful(src) + if src in self._range: + src_range = self._range[src] + else: + src_range = src.max_range() + self._range[dest] = src_range + + def invalidate_range(self, ref): + self.assert_meaningful(ref) + self._range[ref] = ref.max_range() def set_unmeaningful(self, *refs): for ref in refs: - if ref in self._meaningful: - self._meaningful.remove(ref) + if ref in self._range: + del self._range[ref] def set_written(self, *refs): """A "helper" method which does the following common sequence for @@ -183,9 +250,7 @@ class Analyzer(object): def assert_type(self, type, *locations): for location in locations: if location.type != type: - raise TypeMismatchError('%s in %s' % - (location.name, self.current_routine.name) - ) + raise 
TypeMismatchError(self.current_routine, location.name) def assert_affected_within(self, name, affecting_type, limiting_type): assert name in ('inputs', 'outputs', 'trashes') @@ -194,13 +259,13 @@ class Analyzer(object): overage = affected - limited_to if not overage: return - message = 'in %s: %s for %s are %s\n\nbut %s affects %s\n\nwhich exceeds it by: %s ' % ( - self.current_routine.name, name, + message = '%s for %s are %s\n\nbut %s affects %s\n\nwhich exceeds it by: %s ' % ( + name, limiting_type, LocationRef.format_set(limited_to), affecting_type, LocationRef.format_set(affected), LocationRef.format_set(overage) ) - raise IncompatibleConstraintsError(message) + raise IncompatibleConstraintsError(self.current_routine, message) def analyze_program(self, program): assert isinstance(program, Program) @@ -223,7 +288,7 @@ class Analyzer(object): print context self.analyze_block(routine.block, context) - trashed = set(context.each_touched()) - context._meaningful + trashed = set(context.each_touched()) - set(context.each_meaningful()) if self.debug: print "at end of routine `{}`:".format(routine.name) @@ -240,26 +305,37 @@ class Analyzer(object): # even if we goto another routine, we can't trash an output. 
for ref in trashed: if ref in type_.outputs: - raise UnmeaningfulOutputError('%s in %s' % (ref.name, routine.name)) + raise UnmeaningfulOutputError(routine, ref.name) if not self.has_encountered_goto: for ref in type_.outputs: context.assert_meaningful(ref, exception_class=UnmeaningfulOutputError) for ref in context.each_touched(): if ref not in type_.outputs and ref not in type_.trashes and not routine_has_static(routine, ref): - message = '%s in %s' % (ref.name, routine.name) - raise ForbiddenWriteError(message) + raise ForbiddenWriteError(routine, ref.name) self.current_routine = None def analyze_block(self, block, context): assert isinstance(block, Block) for i in block.instrs: if self.has_encountered_goto: - raise IllegalJumpError(i) + raise IllegalJumpError(i, i) self.analyze_instr(i, context) def analyze_instr(self, instr, context): - assert isinstance(instr, Instr) + if isinstance(instr, SingleOp): + self.analyze_single_op(instr, context) + elif isinstance(instr, If): + self.analyze_if(instr, context) + elif isinstance(instr, Repeat): + self.analyze_repeat(instr, context) + elif isinstance(instr, WithInterruptsOff): + self.analyze_block(instr.block, context) + else: + raise NotImplementedError + + def analyze_single_op(self, instr, context): + opcode = instr.opcode dest = instr.dest src = instr.src @@ -269,46 +345,44 @@ class Analyzer(object): if TableType.is_a_table_type(src.ref.type, TYPE_BYTE) and dest.type == TYPE_BYTE: pass else: - raise TypeMismatchError('%s and %s in %s' % - (src.ref.name, dest.name, self.current_routine.name) - ) + raise TypeMismatchError(instr, '{} and {}'.format(src.ref.name, dest.name)) context.assert_meaningful(src, src.index) + context.assert_in_range(src.index, src.ref) elif isinstance(src, IndirectRef): # copying this analysis from the matching branch in `copy`, below if isinstance(src.ref.type, PointerType) and dest.type == TYPE_BYTE: pass else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, 
dest)) context.assert_meaningful(src.ref, REG_Y) elif src.type != dest.type: - raise TypeMismatchError('%s and %s in %s' % - (src.name, dest.name, self.current_routine.name) - ) + raise TypeMismatchError(instr, '{} and {}'.format(src.name, dest.name)) else: context.assert_meaningful(src) + context.copy_range(src, dest) context.set_written(dest, FLAG_Z, FLAG_N) elif opcode == 'st': if isinstance(dest, IndexedRef): if src.type == TYPE_BYTE and TableType.is_a_table_type(dest.ref.type, TYPE_BYTE): pass else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) context.assert_meaningful(dest.index) + context.assert_in_range(dest.index, dest.ref) context.set_written(dest.ref) elif isinstance(dest, IndirectRef): # copying this analysis from the matching branch in `copy`, below if isinstance(dest.ref.type, PointerType) and src.type == TYPE_BYTE: pass else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) context.assert_meaningful(dest.ref, REG_Y) context.set_written(dest.ref) elif src.type != dest.type: - raise TypeMismatchError('%r and %r in %s' % - (src, dest, self.current_routine.name) - ) + raise TypeMismatchError(instr, '{} and {}'.format(src, dest)) else: context.set_written(dest) + # FIXME: context.copy_range(src, dest) ? 
context.assert_meaningful(src) elif opcode == 'add': context.assert_meaningful(src, dest, FLAG_C) @@ -327,6 +401,7 @@ class Analyzer(object): context.set_unmeaningful(REG_A) else: self.assert_type(TYPE_WORD, dest) + context.invalidate_range(dest) elif opcode == 'sub': context.assert_meaningful(src, dest, FLAG_C) if src.type == TYPE_BYTE: @@ -337,22 +412,34 @@ class Analyzer(object): context.set_written(dest, FLAG_Z, FLAG_N, FLAG_C, FLAG_V) context.set_touched(REG_A) context.set_unmeaningful(REG_A) + context.invalidate_range(dest) elif opcode in ('inc', 'dec'): self.assert_type(TYPE_BYTE, dest) context.assert_meaningful(dest) context.set_written(dest, FLAG_Z, FLAG_N) + context.invalidate_range(dest) elif opcode == 'cmp': self.assert_type(TYPE_BYTE, src, dest) context.assert_meaningful(src, dest) context.set_written(FLAG_Z, FLAG_N, FLAG_C) - elif opcode in ('and', 'or', 'xor'): + elif opcode == 'and': self.assert_type(TYPE_BYTE, src, dest) context.assert_meaningful(src, dest) context.set_written(dest, FLAG_Z, FLAG_N) + # If you AND the A register with a value V, the resulting value of A + # cannot exceed the value of V; i.e. the maximum value of A becomes + # the maximum value of V. 
+ context.set_top_of_range(dest, context.get_top_of_range(src)) + elif opcode in ('or', 'xor'): + self.assert_type(TYPE_BYTE, src, dest) + context.assert_meaningful(src, dest) + context.set_written(dest, FLAG_Z, FLAG_N) + context.invalidate_range(dest) elif opcode in ('shl', 'shr'): self.assert_type(TYPE_BYTE, dest) context.assert_meaningful(dest, FLAG_C) context.set_written(dest, FLAG_Z, FLAG_N, FLAG_C) + context.invalidate_range(dest) elif opcode == 'call': type = instr.location.type if isinstance(type, VectorType): @@ -365,61 +452,9 @@ class Analyzer(object): context.assert_writeable(ref) context.set_touched(ref) context.set_unmeaningful(ref) - elif opcode == 'if': - incoming_meaningful = set(context.each_meaningful()) - - context1 = context.clone() - context2 = context.clone() - self.analyze_block(instr.block1, context1) - if instr.block2 is not None: - self.analyze_block(instr.block2, context2) - - outgoing_meaningful = set(context1.each_meaningful()) & set(context2.each_meaningful()) - outgoing_trashes = incoming_meaningful - outgoing_meaningful - - # TODO may we need to deal with touched separately here too? - # probably not; if it wasn't meaningful in the first place, it - # doesn't really matter if you modified it or not, coming out. - for ref in context1.each_meaningful(): - if ref in outgoing_trashes: - continue - context2.assert_meaningful( - ref, exception_class=InconsistentInitializationError, - message='initialized in block 1 but not in block 2 of `if {}`'.format(src) - ) - for ref in context2.each_meaningful(): - if ref in outgoing_trashes: - continue - context1.assert_meaningful( - ref, exception_class=InconsistentInitializationError, - message='initialized in block 2 but not in block 1 of `if {}`'.format(src) - ) - - # merge the contexts. 
this used to be a method called `set_from` - context._touched = set(context1._touched) | set(context2._touched) - context._meaningful = outgoing_meaningful - context._writeable = set(context1._writeable) | set(context2._writeable) - - for ref in outgoing_trashes: - context.set_touched(ref) - context.set_unmeaningful(ref) - - elif opcode == 'repeat': - # it will always be executed at least once, so analyze it having - # been executed the first time. - self.analyze_block(instr.block, context) - if src is not None: # None indicates 'repeat forever' - context.assert_meaningful(src) - - # now analyze it having been executed a second time, with the context - # of it having already been executed. - self.analyze_block(instr.block, context) - if src is not None: - context.assert_meaningful(src) - elif opcode == 'copy': if dest == REG_A: - raise ForbiddenWriteError("{} cannot be used as destination for copy".format(dest)) + raise ForbiddenWriteError(instr, "{} cannot be used as destination for copy".format(dest)) # 1. 
check that their types are compatible @@ -427,17 +462,17 @@ class Analyzer(object): if isinstance(src.ref.type, BufferType) and isinstance(dest.type, PointerType): pass else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef): if src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): pass else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef): if isinstance(src.ref.type, PointerType) and dest.type == TYPE_BYTE: pass else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndexedRef): if src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): @@ -449,7 +484,8 @@ class Analyzer(object): RoutineType.executable_types_compatible(src.type, dest.ref.type.of_type)): pass else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) + context.assert_in_range(dest.index, dest.ref) elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef): if TableType.is_a_table_type(src.ref.type, TYPE_WORD) and dest.type == TYPE_WORD: @@ -458,7 +494,8 @@ class Analyzer(object): RoutineType.executable_types_compatible(src.ref.type.of_type, dest.type.of_type)): pass else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) + context.assert_in_range(src.index, src.ref) elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, LocationRef): if src.type == dest.type: @@ -468,9 +505,9 @@ class Analyzer(object): self.assert_affected_within('outputs', src.type, dest.type.of_type) self.assert_affected_within('trashes', src.type, dest.type.of_type) else: - raise TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) else: - raise 
TypeMismatchError((src, dest)) + raise TypeMismatchError(instr, (src, dest)) # 2. check that the context is meaningful @@ -497,15 +534,12 @@ class Analyzer(object): context.set_touched(REG_A, FLAG_Z, FLAG_N) context.set_unmeaningful(REG_A, FLAG_Z, FLAG_N) - - elif opcode == 'with-sei': - self.analyze_block(instr.block, context) elif opcode == 'goto': location = instr.location type_ = location.type if not isinstance(type_, (RoutineType, VectorType)): - raise TypeMismatchError(location) + raise TypeMismatchError(instr, location) # assert that the dest routine's inputs are all initialized if isinstance(type_, VectorType): @@ -525,3 +559,55 @@ class Analyzer(object): context.set_unmeaningful(instr.dest) else: raise NotImplementedError(opcode) + + def analyze_if(self, instr, context): + incoming_meaningful = set(context.each_meaningful()) + + context1 = context.clone() + context2 = context.clone() + self.analyze_block(instr.block1, context1) + if instr.block2 is not None: + self.analyze_block(instr.block2, context2) + + outgoing_meaningful = set(context1.each_meaningful()) & set(context2.each_meaningful()) + outgoing_trashes = incoming_meaningful - outgoing_meaningful + + # TODO may we need to deal with touched separately here too? + # probably not; if it wasn't meaningful in the first place, it + # doesn't really matter if you modified it or not, coming out. + for ref in context1.each_meaningful(): + if ref in outgoing_trashes: + continue + context2.assert_meaningful( + ref, exception_class=InconsistentInitializationError, + message='initialized in block 1 but not in block 2 of `if {}`'.format(instr.src) + ) + for ref in context2.each_meaningful(): + if ref in outgoing_trashes: + continue + context1.assert_meaningful( + ref, exception_class=InconsistentInitializationError, + message='initialized in block 2 but not in block 1 of `if {}`'.format(instr.src) + ) + + # merge the contexts. 
this used to be a method called `set_from` + context._touched = set(context1._touched) | set(context2._touched) + context.set_meaningful(*list(outgoing_meaningful)) + context._writeable = set(context1._writeable) | set(context2._writeable) + + for ref in outgoing_trashes: + context.set_touched(ref) + context.set_unmeaningful(ref) + + def analyze_repeat(self, instr, context): + # it will always be executed at least once, so analyze it having + # been executed the first time. + self.analyze_block(instr.block, context) + if instr.src is not None: # None indicates 'repeat forever' + context.assert_meaningful(instr.src) + + # now analyze it having been executed a second time, with the context + # of it having already been executed. + self.analyze_block(instr.block, context) + if instr.src is not None: + context.assert_meaningful(instr.src) diff --git a/src/sixtypical/ast.py b/src/sixtypical/ast.py index bbcc68a..0ab9f33 100644 --- a/src/sixtypical/ast.py +++ b/src/sixtypical/ast.py @@ -1,8 +1,30 @@ # encoding: UTF-8 class AST(object): - def __init__(self, **kwargs): - self.attrs = kwargs + children_attrs = () + child_attrs = () + value_attrs = () + + def __init__(self, line_number, **kwargs): + self.line_number = line_number + self.attrs = {} + for attr in self.children_attrs: + self.attrs[attr] = kwargs.pop(attr, []) + for child in self.attrs[attr]: + assert child is None or isinstance(child, AST), \ + "child %s=%r of %r is not an AST node" % (attr, child, self) + for attr in self.child_attrs: + self.attrs[attr] = kwargs.pop(attr, None) + child = self.attrs[attr] + assert child is None or isinstance(child, AST), \ + "child %s=%r of %r is not an AST node" % (attr, child, self) + for attr in self.value_attrs: + self.attrs[attr] = kwargs.pop(attr, None) + assert (not kwargs), "extra arguments supplied to {} node: {}".format(self.type, kwargs) + + @property + def type(self): + return self.__class__.__name__ def __repr__(self): return "%s(%r)" % (self.__class__.__name__, 
self.attrs) @@ -12,22 +34,54 @@ class AST(object): return self.attrs[name] raise AttributeError(name) + def all_children(self): + for attr in self.children_attrs: + for child in self.attrs[attr]: + yield child + for subchild in child.all_children(): + yield subchild + for attr in self.child_attrs: + child = self.attrs[attr] + yield child + for subchild in child.all_children(): + yield subchild + class Program(AST): - pass + children_attrs = ('defns', 'routines',) class Defn(AST): - pass + value_attrs = ('name', 'addr', 'initial', 'location',) class Routine(AST): - pass + value_attrs = ('name', 'addr', 'initial', 'location',) + children_attrs = ('statics',) + child_attrs = ('block',) class Block(AST): - pass + children_attrs = ('instrs',) class Instr(AST): pass + + +class SingleOp(Instr): + value_attrs = ('opcode', 'dest', 'src', 'location',) + + +class If(Instr): + value_attrs = ('src', 'inverted') + child_attrs = ('block1', 'block2',) + + +class Repeat(Instr): + value_attrs = ('src', 'inverted') + child_attrs = ('block',) + + +class WithInterruptsOff(Instr): + child_attrs = ('block',) diff --git a/src/sixtypical/compiler.py b/src/sixtypical/compiler.py index 6ea7c4e..7e160c3 100644 --- a/src/sixtypical/compiler.py +++ b/src/sixtypical/compiler.py @@ -1,6 +1,6 @@ # encoding: UTF-8 -from sixtypical.ast import Program, Routine, Block, Instr +from sixtypical.ast import Program, Routine, Block, Instr, SingleOp, If, Repeat, WithInterruptsOff from sixtypical.model import ( ConstantRef, LocationRef, IndexedRef, IndirectRef, AddressRef, TYPE_BIT, TYPE_BYTE, TYPE_WORD, @@ -142,7 +142,19 @@ class Compiler(object): self.compile_instr(instr) def compile_instr(self, instr): - assert isinstance(instr, Instr) + if isinstance(instr, SingleOp): + return self.compile_single_op(instr) + elif isinstance(instr, If): + return self.compile_if(instr) + elif isinstance(instr, Repeat): + return self.compile_repeat(instr) + elif isinstance(instr, WithInterruptsOff): + return 
self.compile_with_interrupts_off(instr) + else: + raise NotImplementedError + + def compile_single_op(self, instr): + opcode = instr.opcode dest = instr.dest src = instr.src @@ -356,7 +368,164 @@ class Compiler(object): self.emitter.emit(JMP(Indirect(label))) else: raise NotImplementedError - elif opcode == 'if': + elif opcode == 'copy': + self.compile_copy_op(instr) + elif opcode == 'trash': + pass + else: + raise NotImplementedError(opcode) + + def compile_copy_op(self, instr): + + opcode = instr.opcode + dest = instr.dest + src = instr.src + + if isinstance(src, ConstantRef) and isinstance(dest, IndirectRef) and src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): + ### copy 123, [ptr] + y + dest_label = self.get_label(dest.ref.name) + self.emitter.emit(LDA(Immediate(Byte(src.value)))) + self.emitter.emit(STA(IndirectY(dest_label))) + elif isinstance(src, LocationRef) and isinstance(dest, IndirectRef) and src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): + ### copy b, [ptr] + y + src_label = self.get_label(src.name) + dest_label = self.get_label(dest.ref.name) + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(IndirectY(dest_label))) + elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef) and dest.type == TYPE_BYTE and isinstance(src.ref.type, PointerType): + ### copy [ptr] + y, b + src_label = self.get_label(src.ref.name) + dest_label = self.get_label(dest.name) + self.emitter.emit(LDA(IndirectY(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) + elif isinstance(src, AddressRef) and isinstance(dest, LocationRef) and isinstance(src.ref.type, BufferType) and isinstance(dest.type, PointerType): + ### copy ^buf, ptr + src_label = self.get_label(src.ref.name) + dest_label = self.get_label(dest.name) + self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) + self.emitter.emit(STA(ZeroPage(dest_label))) + self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) + 
self.emitter.emit(STA(ZeroPage(Offset(dest_label, 1)))) + elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): + ### copy w, wtab + y + src_label = self.get_label(src.name) + dest_label = self.get_label(dest.ref.name) + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) + self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) + self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) + elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and isinstance(src.type, VectorType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType): + ### copy vec, vtab + y + # FIXME this is the exact same as above - can this be simplified? + src_label = self.get_label(src.name) + dest_label = self.get_label(dest.ref.name) + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) + self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) + self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) + elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and isinstance(src.type, RoutineType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType): + ### copy routine, vtab + y + src_label = self.get_label(src.name) + dest_label = self.get_label(dest.ref.name) + self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) + self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) + self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) + self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) + elif isinstance(src, ConstantRef) and isinstance(dest, IndexedRef) and src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, 
TYPE_WORD): + ### copy 9999, wtab + y + dest_label = self.get_label(dest.ref.name) + self.emitter.emit(LDA(Immediate(Byte(src.low_byte())))) + self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) + self.emitter.emit(LDA(Immediate(Byte(src.high_byte())))) + self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) + elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef) and TableType.is_a_table_type(src.ref.type, TYPE_WORD) and dest.type == TYPE_WORD: + ### copy wtab + y, w + src_label = self.get_label(src.ref.name) + dest_label = self.get_label(dest.name) + self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(Offset(src_label, 256)))) + self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) + elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef) and isinstance(dest.type, VectorType) and isinstance(src.ref.type, TableType) and isinstance(src.ref.type.of_type, VectorType): + ### copy vtab + y, vec + # FIXME this is the exact same as above - can this be simplified? 
+ src_label = self.get_label(src.ref.name) + dest_label = self.get_label(dest.name) + self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(Offset(src_label, 256)))) + self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) + elif src.type == TYPE_BYTE and dest.type == TYPE_BYTE and not isinstance(src, ConstantRef): + ### copy b1, b2 + src_label = self.get_label(src.name) + dest_label = self.get_label(dest.name) + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) + elif src.type == TYPE_WORD and dest.type == TYPE_WORD and isinstance(src, ConstantRef): + ### copy 9999, w + dest_label = self.get_label(dest.name) + self.emitter.emit(LDA(Immediate(Byte(src.low_byte())))) + self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(Immediate(Byte(src.high_byte())))) + self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) + elif src.type == TYPE_WORD and dest.type == TYPE_WORD and not isinstance(src, ConstantRef): + ### copy w1, w2 + src_label = self.get_label(src.name) + dest_label = self.get_label(dest.name) + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) + self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) + elif isinstance(src.type, VectorType) and isinstance(dest.type, VectorType): + ### copy v1, v2 + src_label = self.get_label(src.name) + dest_label = self.get_label(dest.name) + self.emitter.emit(LDA(Absolute(src_label))) + self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) + self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) + elif isinstance(src.type, RoutineType) and isinstance(dest.type, VectorType): + ### copy routine, vec + src_label = self.get_label(src.name) + dest_label = self.get_label(dest.name) + 
self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) + self.emitter.emit(STA(Absolute(dest_label))) + self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) + self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) + else: + raise NotImplementedError(src.type) + + def compile_if(self, instr): + cls = { + False: { + 'c': BCC, + 'z': BNE, + }, + True: { + 'c': BCS, + 'z': BEQ, + }, + }[instr.inverted].get(instr.src.name) + if cls is None: + raise UnsupportedOpcodeError(instr) + else_label = Label('else_label') + self.emitter.emit(cls(Relative(else_label))) + self.compile_block(instr.block1) + if instr.block2 is not None: + end_label = Label('end_label') + self.emitter.emit(JMP(Absolute(end_label))) + self.emitter.resolve_label(else_label) + self.compile_block(instr.block2) + self.emitter.resolve_label(end_label) + else: + self.emitter.resolve_label(else_label) + + def compile_repeat(self, instr): + top_label = self.emitter.make_label() + self.compile_block(instr.block) + if instr.src is None: # indicates 'repeat forever' + self.emitter.emit(JMP(Absolute(top_label))) + else: cls = { False: { 'c': BCC, @@ -366,169 +535,12 @@ class Compiler(object): 'c': BCS, 'z': BEQ, }, - }[instr.inverted].get(src.name) + }[instr.inverted].get(instr.src.name) if cls is None: raise UnsupportedOpcodeError(instr) - else_label = Label('else_label') - self.emitter.emit(cls(Relative(else_label))) - self.compile_block(instr.block1) - if instr.block2 is not None: - end_label = Label('end_label') - self.emitter.emit(JMP(Absolute(end_label))) - self.emitter.resolve_label(else_label) - self.compile_block(instr.block2) - self.emitter.resolve_label(end_label) - else: - self.emitter.resolve_label(else_label) - elif opcode == 'repeat': - top_label = self.emitter.make_label() - self.compile_block(instr.block) - if src is None: # indicates 'repeat forever' - self.emitter.emit(JMP(Absolute(top_label))) - else: - cls = { - False: { - 'c': BCC, - 'z': BNE, - }, - True: { - 'c': BCS, - 'z': 
BEQ, - }, - }[instr.inverted].get(src.name) - if cls is None: - raise UnsupportedOpcodeError(instr) - self.emitter.emit(cls(Relative(top_label))) - elif opcode == 'with-sei': - self.emitter.emit(SEI()) - self.compile_block(instr.block) - self.emitter.emit(CLI()) - elif opcode == 'copy': - if isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef): - if src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): - if isinstance(src, ConstantRef): - dest_label = self.get_label(dest.ref.name) - self.emitter.emit(LDA(Immediate(Byte(src.value)))) - self.emitter.emit(STA(IndirectY(dest_label))) - elif isinstance(src, LocationRef): - src_label = self.get_label(src.name) - dest_label = self.get_label(dest.ref.name) - self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(IndirectY(dest_label))) - else: - raise NotImplementedError((src, dest)) - else: - raise NotImplementedError((src, dest)) - elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef): - if dest.type == TYPE_BYTE and isinstance(src.ref.type, PointerType): - src_label = self.get_label(src.ref.name) - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(IndirectY(src_label))) - self.emitter.emit(STA(Absolute(dest_label))) - else: - raise NotImplementedError((src, dest)) - elif isinstance(src, AddressRef) and isinstance(dest, LocationRef) and \ - isinstance(src.ref.type, BufferType) and isinstance(dest.type, PointerType): - src_label = self.get_label(src.ref.name) - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) - self.emitter.emit(STA(ZeroPage(dest_label))) - self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) - self.emitter.emit(STA(ZeroPage(Offset(dest_label, 1)))) - elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef): - if src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): - src_label = self.get_label(src.name) - dest_label = 
self.get_label(dest.ref.name) - self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) - self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) - elif isinstance(src.type, VectorType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType): - # FIXME this is the exact same as above - can this be simplified? - src_label = self.get_label(src.name) - dest_label = self.get_label(dest.ref.name) - self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) - self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) - elif isinstance(src.type, RoutineType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType): - src_label = self.get_label(src.name) - dest_label = self.get_label(dest.ref.name) - self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) - self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) - else: - raise NotImplementedError - elif isinstance(src, ConstantRef) and isinstance(dest, IndexedRef): - if src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): - dest_label = self.get_label(dest.ref.name) - self.emitter.emit(LDA(Immediate(Byte(src.low_byte())))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) - self.emitter.emit(LDA(Immediate(Byte(src.high_byte())))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) - else: - raise NotImplementedError - elif isinstance(src, IndexedRef) and 
isinstance(dest, LocationRef): - if TableType.is_a_table_type(src.ref.type, TYPE_WORD) and dest.type == TYPE_WORD: - src_label = self.get_label(src.ref.name) - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(src_label))) - self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(Offset(src_label, 256)))) - self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - elif isinstance(dest.type, VectorType) and isinstance(src.ref.type, TableType) and isinstance(src.ref.type.of_type, VectorType): - # FIXME this is the exact same as above - can this be simplified? - src_label = self.get_label(src.ref.name) - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(src_label))) - self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(Offset(src_label, 256)))) - self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - else: - raise NotImplementedError + self.emitter.emit(cls(Relative(top_label))) - elif not isinstance(src, (ConstantRef, LocationRef)) or not isinstance(dest, LocationRef): - raise NotImplementedError((src, dest)) - elif src.type == TYPE_BYTE and dest.type == TYPE_BYTE: - if isinstance(src, ConstantRef): - raise NotImplementedError - else: - src_label = self.get_label(src.name) - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(Absolute(dest_label))) - elif src.type == TYPE_WORD and dest.type == TYPE_WORD: - if isinstance(src, ConstantRef): - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(Immediate(Byte(src.low_byte())))) - self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(Immediate(Byte(src.high_byte())))) - self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - else: - src_label = self.get_label(src.name) - dest_label = self.get_label(dest.name) - 
self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) - self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - elif isinstance(src.type, VectorType) and isinstance(dest.type, VectorType): - src_label = self.get_label(src.name) - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) - self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - elif isinstance(src.type, RoutineType) and isinstance(dest.type, VectorType): - src_label = self.get_label(src.name) - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) - self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) - self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - else: - raise NotImplementedError(src.type) - elif opcode == 'trash': - pass - else: - raise NotImplementedError(opcode) + def compile_with_interrupts_off(self, instr): + self.emitter.emit(SEI()) + self.compile_block(instr.block) + self.emitter.emit(CLI()) diff --git a/src/sixtypical/model.py b/src/sixtypical/model.py index dde7da9..75f7a7f 100644 --- a/src/sixtypical/model.py +++ b/src/sixtypical/model.py @@ -2,8 +2,9 @@ class Type(object): - def __init__(self, name): + def __init__(self, name, max_range=None): self.name = name + self.max_range = max_range def __repr__(self): return 'Type(%r)' % self.name @@ -32,9 +33,9 @@ class Type(object): self.trashes = set([resolve(w) for w in self.trashes]) -TYPE_BIT = Type('bit') -TYPE_BYTE = Type('byte') -TYPE_WORD = Type('word') +TYPE_BIT = Type('bit', max_range=(0, 1)) +TYPE_BYTE = Type('byte', max_range=(0, 255)) +TYPE_WORD = Type('word', max_range=(0, 65535)) @@ -130,7 +131,10 @@ class Ref(object): """read-only means that the program cannot change the value of a 
location. constant means that the value of the location will not change during the lifetime of the program.""" - raise NotImplementedError + raise NotImplementedError("class {} must implement is_constant()".format(self.__class__.__name__)) + + def max_range(self): + raise NotImplementedError("class {} must implement max_range()".format(self.__class__.__name__)) class LocationRef(Ref): @@ -160,6 +164,12 @@ class LocationRef(Ref): def is_constant(self): return isinstance(self.type, RoutineType) + def max_range(self): + try: + return self.type.max_range + except: + return (0, 0) + @classmethod def format_set(cls, location_refs): return '{%s}' % ', '.join([str(loc) for loc in sorted(location_refs)]) @@ -277,6 +287,9 @@ class ConstantRef(Ref): def is_constant(self): return True + def max_range(self): + return (self.value, self.value) + def high_byte(self): return (self.value >> 8) & 255 diff --git a/src/sixtypical/parser.py b/src/sixtypical/parser.py index d7f5cd4..4caf862 100644 --- a/src/sixtypical/parser.py +++ b/src/sixtypical/parser.py @@ -1,12 +1,12 @@ # encoding: UTF-8 -from sixtypical.ast import Program, Defn, Routine, Block, Instr +from sixtypical.ast import Program, Defn, Routine, Block, SingleOp, If, Repeat, WithInterruptsOff from sixtypical.model import ( TYPE_BIT, TYPE_BYTE, TYPE_WORD, RoutineType, VectorType, TableType, BufferType, PointerType, LocationRef, ConstantRef, IndirectRef, IndexedRef, AddressRef, ) -from sixtypical.scanner import Scanner +from sixtypical.scanner import Scanner, SixtyPicalSyntaxError class SymEntry(object): @@ -30,6 +30,9 @@ class Parser(object): self.symbols[token] = SymEntry(None, LocationRef(TYPE_BIT, token)) self.backpatch_instrs = [] + def syntax_error(self, msg): + raise SixtyPicalSyntaxError(self.scanner.line_number, msg) + def soft_lookup(self, name): if name in self.current_statics: return self.current_statics[name].model @@ -40,7 +43,7 @@ class Parser(object): def lookup(self, name): model = self.soft_lookup(name) if 
model is None: - raise SyntaxError('Undefined symbol "%s"' % name) + self.syntax_error('Undefined symbol "{}"'.format(name)) return model # --- grammar productions @@ -56,7 +59,7 @@ class Parser(object): defn = self.defn() name = defn.name if name in self.symbols: - raise SyntaxError('Symbol "%s" already declared' % name) + self.syntax_error('Symbol "%s" already declared' % name) self.symbols[name] = SymEntry(defn, defn.location) defns.append(defn) while self.scanner.on('define', 'routine'): @@ -68,7 +71,7 @@ class Parser(object): routine = self.legacy_routine() name = routine.name if name in self.symbols: - raise SyntaxError('Symbol "%s" already declared' % name) + self.syntax_error('Symbol "%s" already declared' % name) self.symbols[name] = SymEntry(routine, routine.location) routines.append(routine) self.scanner.check_type('EOF') @@ -84,26 +87,26 @@ class Parser(object): if instr.opcode in ('call', 'goto'): name = instr.location if name not in self.symbols: - raise SyntaxError('Undefined routine "%s"' % name) + self.syntax_error('Undefined routine "%s"' % name) if not isinstance(self.symbols[name].model.type, (RoutineType, VectorType)): - raise SyntaxError('Illegal call of non-executable "%s"' % name) + self.syntax_error('Illegal call of non-executable "%s"' % name) instr.location = self.symbols[name].model if instr.opcode in ('copy',) and isinstance(instr.src, basestring): name = instr.src if name not in self.symbols: - raise SyntaxError('Undefined routine "%s"' % name) + self.syntax_error('Undefined routine "%s"' % name) if not isinstance(self.symbols[name].model.type, (RoutineType, VectorType)): - raise SyntaxError('Illegal copy of non-executable "%s"' % name) + self.syntax_error('Illegal copy of non-executable "%s"' % name) instr.src = self.symbols[name].model - return Program(defns=defns, routines=routines) + return Program(self.scanner.line_number, defns=defns, routines=routines) def typedef(self): self.scanner.expect('typedef') type_ = self.defn_type() 
name = self.defn_name() if name in self.typedefs: - raise SyntaxError('Type "%s" already declared' % name) + self.syntax_error('Type "%s" already declared' % name) self.typedefs[name] = type_ return type_ @@ -127,11 +130,11 @@ class Parser(object): self.scanner.scan() if initial is not None and addr is not None: - raise SyntaxError("Definition cannot have both initial value and explicit address") + self.syntax_error("Definition cannot have both initial value and explicit address") location = LocationRef(type_, name) - return Defn(name=name, addr=addr, initial=initial, location=location) + return Defn(self.scanner.line_number, name=name, addr=addr, initial=initial, location=location) def defn_size(self): self.scanner.expect('[') @@ -146,6 +149,8 @@ class Parser(object): if self.scanner.consume('table'): size = self.defn_size() + if size <= 0 or size > 256: + self.syntax_error("Table size must be > 0 and <= 256") type_ = TableType(type_, size) return type_ @@ -165,7 +170,7 @@ class Parser(object): elif self.scanner.consume('vector'): type_ = self.defn_type_term() if not isinstance(type_, RoutineType): - raise SyntaxError("Vectors can only be of a routine, not %r" % type_) + self.syntax_error("Vectors can only be of a routine, not %r" % type_) type_ = VectorType(type_) elif self.scanner.consume('routine'): (inputs, outputs, trashes) = self.constraints() @@ -179,7 +184,7 @@ class Parser(object): type_name = self.scanner.token self.scanner.scan() if type_name not in self.typedefs: - raise SyntaxError("Undefined type '%s'" % type_name) + self.syntax_error("Undefined type '%s'" % type_name) type_ = self.typedefs[type_name] return type_ @@ -218,6 +223,7 @@ class Parser(object): addr = None location = LocationRef(type_, name) return Routine( + self.scanner.line_number, name=name, block=block, addr=addr, location=location ) @@ -225,7 +231,7 @@ class Parser(object): def routine(self, name): type_ = self.defn_type() if not isinstance(type_, RoutineType): - raise 
SyntaxError("Can only define a routine, not %r" % type_) + self.syntax_error("Can only define a routine, not %r" % type_) statics = [] if self.scanner.consume('@'): self.scanner.check_type('integer literal') @@ -242,6 +248,7 @@ class Parser(object): addr = None location = LocationRef(type_, name) return Routine( + self.scanner.line_number, name=name, block=block, addr=addr, location=location, statics=statics ) @@ -251,7 +258,7 @@ class Parser(object): for defn in statics: name = defn.name if name in self.symbols or name in self.current_statics: - raise SyntaxError('Symbol "%s" already declared' % name) + self.syntax_error('Symbol "%s" already declared' % name) c[name] = SymEntry(defn, defn.location) return c @@ -315,20 +322,23 @@ class Parser(object): loc = self.locexpr() return AddressRef(loc) else: - loc = self.locexpr(forward=forward) - if not isinstance(loc, basestring): - index = None - if self.scanner.consume('+'): - index = self.locexpr() - loc = IndexedRef(loc, index) - return loc + return self.indexed_locexpr(forward=forward) + + def indexed_locexpr(self, forward=False): + loc = self.locexpr(forward=forward) + if not isinstance(loc, basestring): + index = None + if self.scanner.consume('+'): + index = self.locexpr() + loc = IndexedRef(loc, index) + return loc def statics(self): defns = [] while self.scanner.consume('static'): defn = self.defn() if defn.initial is None: - raise SyntaxError("Static definition {} must have initial value".format(defn)) + self.syntax_error("Static definition {} must have initial value".format(defn)) defns.append(defn) return defns @@ -338,7 +348,7 @@ class Parser(object): while not self.scanner.on('}'): instrs.append(self.instr()) self.scanner.expect('}') - return Block(instrs=instrs) + return Block(self.scanner.line_number, instrs=instrs) def instr(self): if self.scanner.consume('if'): @@ -350,8 +360,7 @@ class Parser(object): block2 = None if self.scanner.consume('else'): block2 = self.block() - return Instr(opcode='if', 
dest=None, src=src, - block1=block1, block2=block2, inverted=inverted) + return If(self.scanner.line_number, src=src, block1=block1, block2=block2, inverted=inverted) elif self.scanner.consume('repeat'): inverted = False src = None @@ -362,8 +371,7 @@ class Parser(object): src = self.locexpr() else: self.scanner.expect('forever') - return Instr(opcode='repeat', dest=None, src=src, - block=block, inverted=inverted) + return Repeat(self.scanner.line_number, src=src, block=block, inverted=inverted) elif self.scanner.token in ("ld",): # the same as add, sub, cmp etc below, except supports an indlocexpr for the src opcode = self.scanner.token @@ -371,35 +379,32 @@ class Parser(object): dest = self.locexpr() self.scanner.expect(',') src = self.indlocexpr() - return Instr(opcode=opcode, dest=dest, src=src, index=None) + return SingleOp(self.scanner.line_number, opcode=opcode, dest=dest, src=src) elif self.scanner.token in ("add", "sub", "cmp", "and", "or", "xor"): opcode = self.scanner.token self.scanner.scan() dest = self.locexpr() self.scanner.expect(',') - src = self.locexpr() - index = None - if self.scanner.consume('+'): - index = self.locexpr() - return Instr(opcode=opcode, dest=dest, src=src, index=index) + src = self.indexed_locexpr() + return SingleOp(self.scanner.line_number, opcode=opcode, dest=dest, src=src) elif self.scanner.token in ("st",): opcode = self.scanner.token self.scanner.scan() src = self.locexpr() self.scanner.expect(',') dest = self.indlocexpr() - return Instr(opcode=opcode, dest=dest, src=src, index=None) + return SingleOp(self.scanner.line_number, opcode=opcode, dest=dest, src=src) elif self.scanner.token in ("shl", "shr", "inc", "dec"): opcode = self.scanner.token self.scanner.scan() dest = self.locexpr() - return Instr(opcode=opcode, dest=dest, src=None) + return SingleOp(self.scanner.line_number, opcode=opcode, dest=dest, src=None) elif self.scanner.token in ("call", "goto"): opcode = self.scanner.token self.scanner.scan() name = 
self.scanner.token self.scanner.scan() - instr = Instr(opcode=opcode, location=name, dest=None, src=None) + instr = SingleOp(self.scanner.line_number, opcode=opcode, location=name, dest=None, src=None) self.backpatch_instrs.append(instr) return instr elif self.scanner.token in ("copy",): @@ -408,16 +413,16 @@ class Parser(object): src = self.indlocexpr(forward=True) self.scanner.expect(',') dest = self.indlocexpr() - instr = Instr(opcode=opcode, dest=dest, src=src) + instr = SingleOp(self.scanner.line_number, opcode=opcode, dest=dest, src=src) self.backpatch_instrs.append(instr) return instr elif self.scanner.consume("with"): self.scanner.expect("interrupts") self.scanner.expect("off") block = self.block() - return Instr(opcode='with-sei', dest=None, src=None, block=block) + return WithInterruptsOff(self.scanner.line_number, block=block) elif self.scanner.consume("trash"): dest = self.locexpr() - return Instr(opcode='trash', src=None, dest=dest) + return SingleOp(self.scanner.line_number, opcode='trash', src=None, dest=dest) else: - raise ValueError('bad opcode "%s"' % self.scanner.token) + self.syntax_error('bad opcode "%s"' % self.scanner.token) diff --git a/src/sixtypical/scanner.py b/src/sixtypical/scanner.py index fd48b3b..72bca62 100644 --- a/src/sixtypical/scanner.py +++ b/src/sixtypical/scanner.py @@ -3,11 +3,20 @@ import re +class SixtyPicalSyntaxError(ValueError): + def __init__(self, line_number, message): + super(SixtyPicalSyntaxError, self).__init__(line_number, message) + + def __str__(self): + return "Line {}: {}".format(self.args[0], self.args[1]) + + class Scanner(object): def __init__(self, text): self.text = text self.token = None self.type = None + self.line_number = 1 self.scan() def scan_pattern(self, pattern, type, token_group=1, rest_group=2): @@ -19,6 +28,7 @@ class Scanner(object): self.type = type self.token = match.group(token_group) self.text = match.group(rest_group) + self.line_number += self.token.count('\n') return True def 
scan(self): @@ -46,14 +56,15 @@ class Scanner(object): if self.scan_pattern(r'.', 'unknown character'): return else: - raise AssertionError("this should never happen, self.text=(%s)" % self.text) + raise AssertionError("this should never happen, self.text=({})".format(self.text)) def expect(self, token): if self.token == token: self.scan() else: - raise SyntaxError("Expected '%s', but found '%s'" % - (token, self.token)) + raise SixtyPicalSyntaxError(self.line_number, "Expected '{}', but found '{}'".format( + token, self.token + )) def on(self, *tokens): return self.token in tokens @@ -63,8 +74,9 @@ class Scanner(object): def check_type(self, type): if not self.type == type: - raise SyntaxError("Expected %s, but found %s ('%s')" % - (type, self.type, self.token)) + raise SixtyPicalSyntaxError(self.line_number, "Expected {}, but found '{}'".format( + type, self.token + )) def consume(self, token): if self.token == token: diff --git a/tests/SixtyPical Analysis.md b/tests/SixtyPical Analysis.md index 05c556f..47a9124 100644 --- a/tests/SixtyPical Analysis.md +++ b/tests/SixtyPical Analysis.md @@ -42,7 +42,7 @@ If a routine declares it outputs a location, that location should be initialized | { | ld x, 0 | } - ? UnmeaningfulOutputError: a in main + ? UnmeaningfulOutputError: a | routine main | inputs a @@ -73,7 +73,7 @@ If a routine modifies a location, it needs to either output it or trash it. | { | ld x, 0 | } - ? ForbiddenWriteError: x in main + ? ForbiddenWriteError: x | routine main | outputs x, z, n @@ -96,7 +96,7 @@ This is true regardless of whether it's an input or not. | { | ld x, 0 | } - ? ForbiddenWriteError: x in main + ? ForbiddenWriteError: x | routine main | inputs x @@ -127,14 +127,14 @@ If a routine trashes a location, this must be declared. | { | trash x | } - ? ForbiddenWriteError: x in foo + ? ForbiddenWriteError: x | routine foo | outputs x | { | trash x | } - ? UnmeaningfulOutputError: x in foo + ?
UnmeaningfulOutputError: x If a routine causes a location to be trashed, this must be declared in the caller. @@ -162,7 +162,7 @@ If a routine causes a location to be trashed, this must be declared in the calle | { | call trash_x | } - ? ForbiddenWriteError: x in foo + ? ForbiddenWriteError: x | routine trash_x | trashes x, z, n @@ -176,7 +176,7 @@ If a routine causes a location to be trashed, this must be declared in the calle | { | call trash_x | } - ? UnmeaningfulOutputError: x in foo + ? UnmeaningfulOutputError: x (in foo, line 12) If a routine reads or writes a user-define memory location, it needs to declare that too. @@ -214,7 +214,7 @@ Can't `ld` from a memory location that isn't initialized. | { | ld a, x | } - ? UnmeaningfulReadError: x in main + ? UnmeaningfulReadError: x Can't `ld` to a memory location that doesn't appear in (outputs ∪ trashes). @@ -246,14 +246,14 @@ Can't `ld` to a memory location that doesn't appear in (outputs ∪ trashes). | { | ld a, 0 | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a | routine main | trashes a, n | { | ld a, 0 | } - ? ForbiddenWriteError: z in main + ? ForbiddenWriteError: z Can't `ld` a `word` type. @@ -265,7 +265,7 @@ Can't `ld` a `word` type. | { | ld a, foo | } - ? TypeMismatchError: foo and a in main + ? TypeMismatchError: foo and a ### st ### @@ -286,7 +286,7 @@ Can't `st` from a memory location that isn't initialized. | { | st x, lives | } - ? UnmeaningfulReadError: x in main + ? UnmeaningfulReadError: x Can't `st` to a memory location that doesn't appear in (outputs ∪ trashes). @@ -312,7 +312,7 @@ Can't `st` to a memory location that doesn't appear in (outputs ∪ trashes). | { | st 0, lives | } - ? ForbiddenWriteError: lives in main + ? ForbiddenWriteError: lives Can't `st` a `word` type. @@ -517,8 +517,9 @@ Copying to and from a word table. ? TypeMismatchError You can also copy a literal word to a word table. +(Even if the table has fewer than 256 entries.) 
- | word table[256] many + | word table[32] many | | routine main | inputs many @@ -530,6 +531,154 @@ You can also copy a literal word to a word table. | } = ok +#### tables: range checking #### + +It is a static analysis error if it cannot be proven that a read or write +to a table falls within the defined size of that table. + +(If a table has 256 entries, then there is never a problem, because a byte +cannot index any entry outside of 0..255.) + +A SixtyPical implementation must be able to prove that the index is inside +the range of the table in various ways. The simplest is to show that a +constant value falls inside or outside the range of the table. + + | byte table[32] many + | + | routine main + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 31 + | ld a, many + x + | st a, many + x + | } + = ok + + | byte table[32] many + | + | routine main + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 32 + | ld a, many + x + | } + ? RangeExceededError + + | byte table[32] many + | + | routine main + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 32 + | ld a, 0 + | st a, many + x + | } + ? RangeExceededError + +This applies to `copy` as well. + + | word one: 77 + | word table[32] many + | + | routine main + | inputs many, one + | outputs many, one + | trashes a, x, n, z + | { + | ld x, 31 + | copy one, many + x + | copy many + x, one + | } + = ok + + | word one: 77 + | word table[32] many + | + | routine main + | inputs many, one + | outputs many, one + | trashes a, x, n, z + | { + | ld x, 32 + | copy many + x, one + | } + ? RangeExceededError + + | word one: 77 + | word table[32] many + | + | routine main + | inputs many, one + | outputs many, one + | trashes a, x, n, z + | { + | ld x, 32 + | copy one, many + x + | } + ? RangeExceededError + +`AND`'ing a register with a value ensures the range of the +register will not exceed the range of the value. 
This can +be used to "clip" the range of an index so that it fits in +a table. + + | word one: 77 + | word table[32] many + | + | routine main + | inputs a, many, one + | outputs many, one + | trashes a, x, n, z + | { + | and a, 31 + | ld x, a + | copy one, many + x + | copy many + x, one + | } + = ok + +Test for "clipping", but not enough. + + | word one: 77 + | word table[32] many + | + | routine main + | inputs a, many, one + | outputs many, one + | trashes a, x, n, z + | { + | and a, 63 + | ld x, a + | copy one, many + x + | copy many + x, one + | } + ? RangeExceededError + +If you alter the value after "clipping" it, the range can +no longer be guaranteed. + + | word one: 77 + | word table[32] many + | + | routine main + | inputs a, many, one + | outputs many, one + | trashes a, x, n, z + | { + | and a, 31 + | ld x, a + | inc x + | copy one, many + x + | copy many + x, one + | } + ? RangeExceededError + ### add ### Can't `add` from or to a memory location that isn't initialized. @@ -553,7 +702,7 @@ Can't `add` from or to a memory location that isn't initialized. | st off, c | add a, lives | } - ? UnmeaningfulReadError: lives in main + ? UnmeaningfulReadError: lives | byte lives | routine main @@ -564,7 +713,7 @@ Can't `add` from or to a memory location that isn't initialized. | st off, c | add a, lives | } - ? UnmeaningfulReadError: a in main + ? UnmeaningfulReadError: a Can't `add` to a memory location that isn't writeable. @@ -575,7 +724,7 @@ Can't `add` to a memory location that isn't writeable. | st off, c | add a, 0 | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a You can `add` a word constant to a word memory location. @@ -601,7 +750,7 @@ You can `add` a word constant to a word memory location. | st off, c | add score, 1999 | } - ? UnmeaningfulOutputError: a in main + ? UnmeaningfulOutputError: a To be sure, `add`ing a word constant to a word memory location trashes `a`. 
@@ -614,7 +763,7 @@ To be sure, `add`ing a word constant to a word memory location trashes `a`. | st off, c | add score, 1999 | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a You can `add` a word memory location to another word memory location. @@ -642,7 +791,7 @@ You can `add` a word memory location to another word memory location. | st off, c | add score, delta | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a You can `add` a word memory location, or a constant, to a pointer. @@ -672,7 +821,7 @@ You can `add` a word memory location, or a constant, to a pointer. | add ptr, delta | add ptr, word 1 | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a ### sub ### @@ -697,7 +846,7 @@ Can't `sub` from or to a memory location that isn't initialized. | st off, c | sub a, lives | } - ? UnmeaningfulReadError: lives in main + ? UnmeaningfulReadError: lives | byte lives | routine main @@ -708,7 +857,7 @@ Can't `sub` from or to a memory location that isn't initialized. | st off, c | sub a, lives | } - ? UnmeaningfulReadError: a in main + ? UnmeaningfulReadError: a Can't `sub` to a memory location that isn't writeable. @@ -719,7 +868,7 @@ Can't `sub` to a memory location that isn't writeable. | st off, c | sub a, 0 | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a You can `sub` a word constant from a word memory location. @@ -745,7 +894,7 @@ You can `sub` a word constant from a word memory location. | st on, c | sub score, 1999 | } - ? UnmeaningfulOutputError: a in main + ? UnmeaningfulOutputError: a You can `sub` a word memory location from another word memory location. @@ -773,7 +922,7 @@ You can `sub` a word memory location from another word memory location. | st off, c | sub score, delta | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a ### inc ### @@ -785,7 +934,7 @@ Location must be initialized and writeable. | { | inc x | } - ? UnmeaningfulReadError: x in main + ? 
UnmeaningfulReadError: x | routine main | inputs x @@ -793,7 +942,7 @@ Location must be initialized and writeable. | { | inc x | } - ? ForbiddenWriteError: x in main + ? ForbiddenWriteError: x | routine main | inputs x @@ -815,7 +964,7 @@ Can't `inc` a `word` type. | { | inc foo | } - ? TypeMismatchError: foo in main + ? TypeMismatchError: foo ### dec ### @@ -827,7 +976,7 @@ Location must be initialized and writeable. | { | dec x | } - ? UnmeaningfulReadError: x in main + ? UnmeaningfulReadError: x | routine main | inputs x @@ -835,7 +984,7 @@ Location must be initialized and writeable. | { | dec x | } - ? ForbiddenWriteError: x in main + ? ForbiddenWriteError: x | routine main | inputs x @@ -857,7 +1006,7 @@ Can't `dec` a `word` type. | { | dec foo | } - ? TypeMismatchError: foo in main + ? TypeMismatchError: foo ### cmp ### @@ -877,14 +1026,14 @@ Some rudimentary tests for cmp. | { | cmp a, 4 | } - ? ForbiddenWriteError: c in main + ? ForbiddenWriteError: c | routine main | trashes z, c, n | { | cmp a, 4 | } - ? UnmeaningfulReadError: a in main + ? UnmeaningfulReadError: a ### and ### @@ -904,14 +1053,14 @@ Some rudimentary tests for and. | { | and a, 4 | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a | routine main | trashes z, n | { | and a, 4 | } - ? UnmeaningfulReadError: a in main + ? UnmeaningfulReadError: a ### or ### @@ -931,14 +1080,14 @@ Writing unit tests on a train. Wow. | { | or a, 4 | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a | routine main | trashes z, n | { | or a, 4 | } - ? UnmeaningfulReadError: a in main + ? UnmeaningfulReadError: a ### xor ### @@ -958,14 +1107,14 @@ Writing unit tests on a train. Wow. | { | xor a, 4 | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a | routine main | trashes z, n | { | xor a, 4 | } - ? UnmeaningfulReadError: a in main + ? UnmeaningfulReadError: a ### shl ### @@ -985,7 +1134,7 @@ Some rudimentary tests for shl. | { | shl a | } - ? 
ForbiddenWriteError: a in main + ? ForbiddenWriteError: a | routine main | inputs a @@ -993,7 +1142,7 @@ Some rudimentary tests for shl. | { | shl a | } - ? UnmeaningfulReadError: c in main + ? UnmeaningfulReadError: c ### shr ### @@ -1013,7 +1162,7 @@ Some rudimentary tests for shr. | { | shr a | } - ? ForbiddenWriteError: a in main + ? ForbiddenWriteError: a | routine main | inputs a @@ -1021,7 +1170,7 @@ Some rudimentary tests for shr. | { | shr a | } - ? UnmeaningfulReadError: c in main + ? UnmeaningfulReadError: c ### call ### @@ -1041,7 +1190,7 @@ initialized. | { | call foo | } - ? UnmeaningfulReadError: x in main + ? UnmeaningfulReadError: x Note that if you call a routine that trashes a location, you also trash it. @@ -1060,7 +1209,7 @@ Note that if you call a routine that trashes a location, you also trash it. | ld x, 0 | call foo | } - ? ForbiddenWriteError: lives in main + ? ForbiddenWriteError: lives | byte lives | @@ -1097,7 +1246,7 @@ You can't output a value that the thing you called trashed. | ld x, 0 | call foo | } - ? UnmeaningfulOutputError: lives in main + ? UnmeaningfulOutputError: lives ...unless you write to it yourself afterwards. @@ -1148,7 +1297,7 @@ calling it. | call foo | ld a, x | } - ? UnmeaningfulReadError: x in main + ? UnmeaningfulReadError: x If a routine trashes locations, they are uninitialized in the caller after calling it. @@ -1173,7 +1322,7 @@ calling it. | call foo | ld a, x | } - ? UnmeaningfulReadError: x in main + ? UnmeaningfulReadError: x Calling an extern is just the same as calling a defined routine with the same constraints. @@ -1201,7 +1350,7 @@ same constraints. | { | call chrout | } - ? UnmeaningfulReadError: a in main + ? UnmeaningfulReadError: a | routine chrout | inputs a @@ -1215,7 +1364,7 @@ same constraints. | call chrout | ld x, a | } - ? UnmeaningfulReadError: a in main + ? 
UnmeaningfulReadError: a ### trash ### @@ -1241,7 +1390,7 @@ Trash does nothing except indicate that we do not care about the value anymore. | ld a, 0 | trash a | } - ? UnmeaningfulOutputError: a in foo + ? UnmeaningfulOutputError: a | routine foo | inputs a @@ -1252,7 +1401,7 @@ Trash does nothing except indicate that we do not care about the value anymore. | trash a | st a, x | } - ? UnmeaningfulReadError: a in foo + ? UnmeaningfulReadError: a ### if ### @@ -1417,7 +1566,7 @@ trashes {`a`, `b`}. | trash x | } | } - ? ForbiddenWriteError: x in foo + ? ForbiddenWriteError: x (in foo, line 10) | routine foo | inputs a, x, z @@ -1429,7 +1578,7 @@ trashes {`a`, `b`}. | trash x | } | } - ? ForbiddenWriteError: a in foo + ? ForbiddenWriteError: a (in foo, line 10) ### repeat ### @@ -1482,7 +1631,7 @@ initialized at the start. | cmp x, 10 | } until z | } - ? UnmeaningfulReadError: y in main + ? UnmeaningfulReadError: y And if you trash the test expression (i.e. `z` in the below) inside the loop, this is an error too. @@ -1499,7 +1648,7 @@ this is an error too. | copy one, two | } until z | } - ? UnmeaningfulReadError: z in main + ? UnmeaningfulReadError: z The body of `repeat forever` can be empty. @@ -1531,7 +1680,7 @@ Can't `copy` from a memory location that isn't initialized. | { | copy x, lives | } - ? UnmeaningfulReadError: x in main + ? UnmeaningfulReadError: x Can't `copy` to a memory location that doesn't appear in (outputs ∪ trashes). @@ -1559,7 +1708,7 @@ Can't `copy` to a memory location that doesn't appear in (outputs ∪ trashes). | { | copy 0, lives | } - ? ForbiddenWriteError: lives in main + ? ForbiddenWriteError: lives a, z, and n are trashed, and must be declared as such @@ -1569,7 +1718,7 @@ a, z, and n are trashed, and must be declared as such | { | copy 0, lives | } - ? ForbiddenWriteError: n in main + ? ForbiddenWriteError: n a, z, and n are trashed, and must not be declared as outputs. 
@@ -1579,7 +1728,7 @@ a, z, and n are trashed, and must not be declared as outputs. | { | copy 0, lives | } - ? UnmeaningfulOutputError: n in main + ? UnmeaningfulOutputError: n Unless of course you subsequently initialize them. @@ -1808,7 +1957,7 @@ as an input to, an output of, or as a trashed value of a routine. | { | copy foo, vec | } - ? ConstantConstraintError: foo in main + ? ConstantConstraintError: foo | vector routine | inputs x @@ -1830,7 +1979,7 @@ as an input to, an output of, or as a trashed value of a routine. | { | copy foo, vec | } - ? ConstantConstraintError: foo in main + ? ConstantConstraintError: foo | vector routine | inputs x @@ -1852,7 +2001,7 @@ as an input to, an output of, or as a trashed value of a routine. | { | copy foo, vec | } - ? ConstantConstraintError: foo in main + ? ConstantConstraintError: foo You can copy the address of a routine into a vector, if that vector is declared appropriately. @@ -1981,7 +2130,7 @@ Calling the vector does indeed trash the things the vector says it does. | copy bar, foo | call foo | } - ? UnmeaningfulOutputError: x in main + ? UnmeaningfulOutputError: x `goto`, if present, must be in tail position (the final instruction in a routine.) @@ -2107,7 +2256,7 @@ vector says it does. | call sub | ld a, x | } - ? UnmeaningfulReadError: x in main + ? UnmeaningfulReadError: x | vector routine | outputs x @@ -2230,7 +2379,7 @@ A vector in a vector table cannot be directly called. | copy bar, many + x | call many + x | } - ? ValueError + ? SyntaxError ### typedef ### diff --git a/tests/SixtyPical Compilation.md b/tests/SixtyPical Compilation.md index 831b94e..2ead442 100644 --- a/tests/SixtyPical Compilation.md +++ b/tests/SixtyPical Compilation.md @@ -7,7 +7,7 @@ SixtyPical to 6502 machine code. 
[Falderal]: http://catseye.tc/node/Falderal -> Functionality "Compile SixtyPical program" is implemented by - -> shell command "bin/sixtypical --basic-prelude --traceback %(test-body-file) >/tmp/foo && tests/appliances/bin/dcc6502-adapter </tmp/foo" + -> shell command "bin/sixtypical --prelude=c64 --traceback %(test-body-file) >/tmp/foo && tests/appliances/bin/dcc6502-adapter </tmp/foo" Tests for functionality "Compile SixtyPical program" diff --git a/tests/SixtyPical Syntax.md b/tests/SixtyPical Syntax.md index 1a3b5b4..a434345 100644 --- a/tests/SixtyPical Syntax.md +++ b/tests/SixtyPical Syntax.md @@ -153,6 +153,45 @@ Tables of different types. | } = ok +The number of entries in a table must be +greater than 0 and less than or equal to 256. + + | word table[512] many + | + | routine main + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | copy 9999, many + x + | } + ? SyntaxError + + | word table[0] many + | + | routine main + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | copy 9999, many + x + | } + ? SyntaxError + + | word table[48] many + | + | routine main + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | copy 9999, many + x + | } + = ok + Typedefs of different types. | typedef byte octet diff --git a/tests/appliances/bin/dcc6502-adapter b/tests/appliances/bin/dcc6502-adapter index d92339b..e11b6aa 100755 --- a/tests/appliances/bin/dcc6502-adapter +++ b/tests/appliances/bin/dcc6502-adapter @@ -1,6 +1,6 @@ #!/usr/bin/env python -# script that allows the binary output of sixtypical --basic-prelude --compile to be +# script that allows the binary output of sixtypical --prelude=c64 --compile to be # disassembled by https://github.com/tcarmelveilleux/dcc6502 import sys