From b19267d3ba4f9e93af07f2b988cf58558e25143d Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Mon, 8 Apr 2019 11:50:54 +0100 Subject: [PATCH 01/18] Checkpoint import of changes for version 0.19. --- HISTORY.md | 18 ++ TODO.md | 69 +++-- bin/sixtypical | 20 +- doc/SixtyPical.md | 125 +++++---- eg/c64/demo-game/demo-game.60p | 112 ++++---- eg/rudiments/buffer.60p | 2 +- src/sixtypical/analyzer.py | 196 ++++++++++---- src/sixtypical/ast.py | 5 + src/sixtypical/compiler.py | 92 ++++--- src/sixtypical/model.py | 108 ++------ src/sixtypical/parser.py | 38 +-- tests/SixtyPical Analysis.md | 458 +++++++++++++++++++++++++++----- tests/SixtyPical Compilation.md | 291 ++++++++++++++++---- tests/SixtyPical Syntax.md | 99 ++++++- 14 files changed, 1189 insertions(+), 444 deletions(-) diff --git a/HISTORY.md b/HISTORY.md index 2493386..554a82c 100644 --- a/HISTORY.md +++ b/HISTORY.md @@ -1,6 +1,24 @@ History of SixtyPical ===================== +0.19 +---- + +* A `table` may be defined with more than 256 entries, even + though the conventional index syntax can only refer to the + first 256 entries. +* A `pointer` may point inside values of type `byte table`, + allowing access to entries beyond the 256th. +* `buffer` types have been eliminated from the language, + as the above two improvements allow `byte table`s to + do everything `buffer`s previously did. +* When accessing a table with an index, a constant offset + can also be given. +* Accessing a `table` through a `pointer` must be done in + the context of a `point ... into` block. This allows the + analyzer to check *which* table is being modified. +* Added `--dump-exit-contexts` option to `sixtypical`. + 0.18 ---- diff --git a/TODO.md b/TODO.md index 6e7857e..f0b1a6c 100644 --- a/TODO.md +++ b/TODO.md @@ -12,37 +12,66 @@ Allow Which uses some other storage location instead of the stack. A local static would be a good candidate for such. 
-### Associate each pointer with the buffer it points into +### Analyze `call` within blocks? -Check that the buffer being read or written to through pointer, appears in appropriate -inputs or outputs set. +What happens if you call another routine from inside a `with interrupts off` block? -In the analysis, when we obtain a pointer, we need to record, in context, what buffer -that pointer came from. +What happens if you call another routine from inside a `save` block? -When we write through that pointer, we need to set that buffer as written. +What happens if you call another routine from inside a `point into` block? -When we read through the pointer, we need to check that the buffer is readable. +What happens if you call another routine from inside a `for` block? -### Table overlays +Remember that any of these may have a `goto` ... and they may have a second +instance of the same block (e.g. `with interrupts off` nested within +`with interrupts off` shouldn't be allowed to turn them back on after the +inner block has finished -- even if there is no `call`.) -They are uninitialized, but the twist is, the address is a buffer that is -an input to and/or output of the routine. So, they are defined (insofar -as the buffer is defined.) +These holes need to be plugged. -They are therefore a "view" of a section of a buffer. +### Pointers associated globally with a table -This is slightly dangerous since it does permit aliases: the buffer and the -table refer to the same memory. +We have `point into` blocks, but we would also like to sometimes pass a pointer +around to different routines, and have them all "know" what table it operates on. -Although, if they are `static`, you could say, in the routine in which they -are `static`, as soon as you've established one, you can no longer use the -buffer; and the ones you establish must be disjoint. +We could associate every pointer variable with a specific table variable, in its +declaration. 
This makes some things simple, and would allow us to know what table a +pointer is supposed to point into, even if that pointer was passed into our routine. -(That seems to be the most compelling case for restricting them to `static`.) +One drawback is that it would limit each pointer to be used only on one table. Since a +pointer basically represents a zero-page location, and since those are a relatively scarce +resource, we would prefer if a single pointer could be used to point into different tables +at different times. -An alternative would be `static` pointers, which are currently not possible because -pointers must be zero-page, thus `@`, thus uninitialized. +These can co-exist with general, non-specific-table-linked `pointer` variables. + +### Local non-statics + +Somewhat related to the above, it should be possible to declare a local storage +location which is not static. + +In this case, it would be considered uninitialized each time the routine was +entered. + +So, you do not have a guarantee that it has a valid value. But you are guaranteed +that no other routine can read or modify it. + +It also enables a trick: if there are two routines A and B, and A never calls B +(even indirectly), and B never calls A (even indirectly), then their locals can +be allocated at the same space. + +A local could also be given an explicit address. In this case, two locals in +different routines could be given the same address, and as long as the condition +in the above paragraph holds, that's okay. (If it doesn't, the analyzer should +detect it.) + +This would permit local pointers, which would be one way of addressing the +"same pointer to different tables" problem. + +### Copy byte to/from table + +Do we want a `copy bytevar, table + x` instruction? We don't currently have one. +You have to `ld a`, `st a`. I think maybe we should have one. 
### Tail-call optimization diff --git a/bin/sixtypical b/bin/sixtypical index eb1867c..505a301 100755 --- a/bin/sixtypical +++ b/bin/sixtypical @@ -12,8 +12,9 @@ sys.path.insert(0, join(dirname(realpath(sys.argv[0])), '..', 'src')) # ----------------------------------------------------------------- # -import codecs from argparse import ArgumentParser +import codecs +import json from pprint import pprint import sys import traceback @@ -43,14 +44,19 @@ def process_input_files(filenames, options): program = merge_programs(programs) analyzer = Analyzer(debug=options.debug) - analyzer.analyze_program(program) + + try: + analyzer.analyze_program(program) + finally: + if options.dump_exit_contexts: + sys.stdout.write(json.dumps(analyzer.exit_contexts_map, indent=4, sort_keys=True, separators=(',', ':'))) + sys.stdout.write("\n") compilation_roster = None if options.optimize_fallthru: from sixtypical.fallthru import FallthruAnalyzer def dump(data, label=None): - import json if not options.dump_fallthru_info: return if label: @@ -114,6 +120,12 @@ if __name__ == '__main__': action="store_true", help="Only parse and analyze the program; do not compile it." ) + argparser.add_argument( + "--dump-exit-contexts", + action="store_true", + help="Dump a map, in JSON, of the analysis context at each exit of each routine " + "after analyzing the program." + ) argparser.add_argument( "--optimize-fallthru", action="store_true", @@ -123,7 +135,7 @@ if __name__ == '__main__': argparser.add_argument( "--dump-fallthru-info", action="store_true", - help="Dump the fallthru map and ordering to stdout after analyzing the program." + help="Dump the ordered fallthru map, in JSON, to stdout after analyzing the program." 
) argparser.add_argument( "--parse-only", diff --git a/doc/SixtyPical.md b/doc/SixtyPical.md index d9b177d..1107ef4 100644 --- a/doc/SixtyPical.md +++ b/doc/SixtyPical.md @@ -1,7 +1,7 @@ SixtyPical ========== -This document describes the SixtyPical programming language version 0.15, +This document describes the SixtyPical programming language version 0.19, both its static semantics (the capabilities and limits of the static analyses it defines) and its runtime semantics (with reference to the semantics of 6502 machine code.) @@ -12,34 +12,55 @@ are even more normative. Refer to the bottom of this document for an EBNF grammar of the syntax of the language. -Types ------ +Data Model +---------- -There are five *primitive types* in SixtyPical: +SixtyPical defines a data model where every value has some type +information associated with it. The values include those that are +directly manipulable by a SixtyPical program, but are not limited to them. +Type information includes not only what kind of structure the data has, +but other properties as well (sometimes called "type annotations".) + +### Basic types ### + +SixtyPical defines a handful of basic types. There are three types that +are "primitive" in that they are not parameterized in any way: * bit (2 possible values) * byte (256 possible values) * word (65536 possible values) -* routine (code stored somewhere in memory, read-only) -* pointer (address of a byte in a buffer) -There are also three *type constructors*: +Types can also be parameterized and constructed from other types +(which is a kind of parameterization). 
One such type constructor is -* T table[N] (N entries, 1 ≤ N ≤ 256; each entry holds a value - of type T, where T is `byte`, `word`, or `vector`) -* buffer[N] (N entries; each entry is a byte; 1 ≤ N ≤ 65536) +* pointer (16-bit address of a byte inside a byte table) * vector T (address of a value of type T; T must be a routine type) -### User-defined ### +Values of the above-listed types are directly manipulable by a SixtyPical +program. Other types describe values which can only be indirectly +manipulated by a program: + +* routine (code stored somewhere in memory, read-only) +* T table[N] (series of 1 ≤ N ≤ 65536 values of type T) + +There are some restrictions here; for example, a table may only +consist of `byte`, `word`, or `vector` types. A pointer may only +point to a byte inside a `table` of `byte` type. + +Each routine is associated with a rich set of type information, +which is basically the types and statuses of memory locations that +have been declared as being relevant to that routine. + +#### User-defined #### A program may define its own types using the `typedef` feature. Typedefs must occur before everything else in the program. A typedef takes a type expression and an identifier which has not previously been used in the program. It associates that identifer with that type. This is merely -a type alias; two types with different names will compare as equal. +a type alias; if two types have identical structure but different names, +they will compare as equal. -Memory locations ----------------- +### Memory locations ### A primary concept in SixtyPical is the *memory location*. At any given point in time during execution, each memory location is either *uninitialized* or @@ -51,7 +72,7 @@ the program text; thus, it is a static property. There are four general kinds of memory location. The first three are pre-defined and built-in. -### Registers ### +#### Registers #### Each of these hold a byte. They are initially uninitialized. 
@@ -59,7 +80,7 @@ Each of these hold a byte. They are initially uninitialized. x y -### Flags ### +#### Flags #### Each of these hold a bit. They are initially uninitialized. @@ -68,7 +89,7 @@ Each of these hold a bit. They are initially uninitialized. v (overflow) n (negative) -### Constants ### +#### Constants #### It may be strange to think of constants as memory locations, but keep in mind that a memory location in SixtyPical need not map to a memory location in the @@ -97,7 +118,7 @@ and sixty-five thousand five hundred and thirty-six word constants, Note that if a word constant is between 256 and 65535, the leading `word` token can be omitted. -### User-defined ### +#### User-defined #### There may be any number of user-defined memory locations. They are defined by giving the type (which may be any type except `bit` and `routine`) and the @@ -137,13 +158,37 @@ This is actually useful, at least at this point, as you can rely on the fact that literal integers in the code are always immediate values. (But this may change at some point.) -### Buffers and Pointers ### +### Tables and Pointers ### -Roughly speaking, a `buffer` is a table that can be longer than 256 bytes, -and a `pointer` is an address within a buffer. +A table is a collection of memory locations that can be indexed in a number +of ways. + +The simplest way is to use another memory location as an index. There +are restrictions on which memory locations can be used as indexes; +only the `x` and `y` locations can be used this way. Since those can +only hold a byte, this method, by itself, only allows access to the first +256 entries of the table. + + byte table[1024] tab + ... + ld a, tab + x + st a, tab + y + +However, by combining indexing with a constant _offset_, entries beyond the +256th entry can be accessed. + + byte table[1024] tab + ... + ld a, tab + 512 + x + st a, tab + 512 + y + +Even with an offset, the range of indexing still cannot exceed 256 entries. 
+Accessing entries at an arbitrary address inside a table can be done with +a `pointer`. Pointers can only point inside `byte` tables. When a +pointer is used, indexing with `x` or `y` will also take place. A `pointer` is implemented as a zero-page memory location, and accessing the -buffer pointed to is implemented with "indirect indexed" addressing, as in +table pointed to is implemented with "indirect indexed" addressing, as in LDA ($02), Y STA ($02), Y @@ -151,14 +196,15 @@ buffer pointed to is implemented with "indirect indexed" addressing, as in There are extended instruction modes for using these types of memory location. See `copy` below, but here is some illustrative example code: - copy ^buf, ptr // this is the only way to initialize a pointer - add ptr, 4 // ok, but only if it does not exceed buffer's size - ld y, 0 // you must set this to something yourself - copy [ptr] + y, byt // read memory through pointer, into byte - copy 100, [ptr] + y // write memory through pointer (still trashes a) + point ptr into buf { // this is the only way to initialize a pointer + add ptr, 4 // note, this is unchecked against table's size! + ld y, 0 // you must set this to something yourself + copy [ptr] + y, byt // read memory through pointer, into byte + copy 100, [ptr] + y // write memory through pointer (still trashes a) + } // after this block, ptr can no longer be used -where `ptr` is a user-defined storage location of `pointer` type, and the -`+ y` part is mandatory. +where `ptr` is a user-defined storage location of `pointer` type, `buf` +is a `table` of `byte` type, and the `+ y` part is mandatory. Routines -------- @@ -300,17 +346,7 @@ and it trashes the `z` and `n` flags and the `a` register. After execution, dest is considered initialized, and `z` and `n`, and `a` are considered uninitialized. -There are two extra modes that this instruction can be used in. 
The first is -to load an address into a pointer: - - copy ^, - -This copies the address of src into dest. In this case, src must be -of type buffer, and dest must be of type pointer. src will not be -considered a memory location that is read, since it is only its address -that is being retrieved. - -The second is to read or write indirectly through a pointer. +There is an extra mode that this instruction can be used in: copy [] + y, copy , [] + y @@ -350,7 +386,7 @@ In fact, this instruction trashes the `a` register in all cases except when the dest is `a`. NOTE: If dest is a pointer, the addition does not check if the result of -the pointer arithmetic continues to be valid (within a buffer) or not. +the pointer arithmetic continues to be valid (within a table) or not. ### inc ### @@ -581,11 +617,10 @@ Grammar Program ::= {ConstDefn | TypeDefn} {Defn} {Routine}. ConstDefn::= "const" Ident Const. TypeDefn::= "typedef" Type Ident. - Defn ::= Type Ident [Constraints] (":" Const | "@" LitWord). + Defn ::= Type Ident (":" Const | "@" LitWord). Type ::= TypeTerm ["table" TypeSize]. TypeExpr::= "byte" | "word" - | "buffer" TypeSize | "pointer" | "vector" TypeTerm | "routine" Constraints @@ -594,10 +629,8 @@ Grammar TypeSize::= "[" LitWord "]". Constrnt::= ["inputs" LocExprs] ["outputs" LocExprs] ["trashes" LocExprs]. Routine ::= "define" Ident Type (Block | "@" LitWord). - | "routine" Ident Constraints (Block | "@" LitWord) - . LocExprs::= LocExpr {"," LocExpr}. - LocExpr ::= Register | Flag | Const | Ident. + LocExpr ::= Register | Flag | Const | Ident [["+" Const] "+" Register]. Register::= "a" | "x" | "y". Flag ::= "c" | "z" | "n" | "v". Const ::= Literal | Ident. diff --git a/eg/c64/demo-game/demo-game.60p b/eg/c64/demo-game/demo-game.60p index 1e6fbd6..3135895 100644 --- a/eg/c64/demo-game/demo-game.60p +++ b/eg/c64/demo-game/demo-game.60p @@ -21,24 +21,16 @@ // and the end of their own routines, so the type needs to be compatible. 
// (In a good sense, it is a continuation.) // -// Further, -// -// It's very arguable that screen1/2/3/4 and colormap1/2/3/4 are not REALLY inputs. -// They're only there to support the fact that game states sometimes clear the -// screen, and sometimes don't. When they don't, they preserve the screen, and -// currently the way to say "we preserve the screen" is to have it as both input -// and output. There is probably a better way to do this, but it needs thought. -// typedef routine inputs joy2, press_fire_msg, dispatch_game_state, actor_pos, actor_delta, actor_logic, player_died, - screen, screen1, screen2, screen3, screen4, colormap1, colormap2, colormap3, colormap4 + screen, colormap outputs dispatch_game_state, actor_pos, actor_delta, actor_logic, player_died, - screen, screen1, screen2, screen3, screen4, colormap1, colormap2, colormap3, colormap4 + screen, colormap trashes a, x, y, c, z, n, v, pos, new_pos, delta, ptr, dispatch_logic game_state_routine @@ -62,18 +54,8 @@ typedef routine byte vic_border @ 53280 byte vic_bg @ 53281 - -byte table[256] screen1 @ 1024 -byte table[256] screen2 @ 1274 -byte table[256] screen3 @ 1524 -byte table[256] screen4 @ 1774 - -byte table[256] colormap1 @ 55296 -byte table[256] colormap2 @ 55546 -byte table[256] colormap3 @ 55796 -byte table[256] colormap4 @ 56046 - -buffer[2048] screen @ 1024 +byte table[2048] screen @ 1024 +byte table[2048] colormap @ 55296 byte joy2 @ $dc00 // ---------------------------------------------------------------- @@ -187,22 +169,22 @@ define check_button routine } define clear_screen routine - outputs screen1, screen2, screen3, screen4, colormap1, colormap2, colormap3, colormap4 + outputs screen, colormap trashes a, y, c, n, z { ld y, 0 repeat { ld a, 1 - st a, colormap1 + y - st a, colormap2 + y - st a, colormap3 + y - st a, colormap4 + y + st a, colormap + y + st a, colormap + 250 + y + st a, colormap + 500 + y + st a, colormap + 750 + y ld a, 32 - st a, screen1 + y - st a, screen2 + y - st 
a, screen3 + y - st a, screen4 + y + st a, screen + y + st a, screen + 250 + y + st a, screen + 500 + y + st a, screen + 750 + y inc y cmp y, 250 @@ -282,13 +264,14 @@ define player_logic logic_routine call check_new_position_in_bounds if c { - copy ^screen, ptr - st off, c - add ptr, new_pos - ld y, 0 + point ptr into screen { + st off, c + add ptr, new_pos + ld y, 0 + // check collision. + ld a, [ptr] + y + } - // check collision. - ld a, [ptr] + y // if "collision" is with your own self, treat it as if it's blank space! cmp a, 81 if z { @@ -296,17 +279,19 @@ define player_logic logic_routine } cmp a, 32 if z { - copy ^screen, ptr - st off, c - add ptr, pos - copy 32, [ptr] + y + point ptr into screen { + st off, c + add ptr, pos + copy 32, [ptr] + y + } copy new_pos, pos - copy ^screen, ptr - st off, c - add ptr, pos - copy 81, [ptr] + y + point ptr into screen { + st off, c + add ptr, pos + copy 81, [ptr] + y + } } else { ld a, 1 st a, player_died @@ -321,13 +306,13 @@ define enemy_logic logic_routine call check_new_position_in_bounds if c { - copy ^screen, ptr - st off, c - add ptr, new_pos - ld y, 0 - - // check collision. - ld a, [ptr] + y + point ptr into screen { + st off, c + add ptr, new_pos + ld y, 0 + // check collision. + ld a, [ptr] + y + } // if "collision" is with your own self, treat it as if it's blank space! cmp a, 82 if z { @@ -335,17 +320,19 @@ define enemy_logic logic_routine } cmp a, 32 if z { - copy ^screen, ptr - st off, c - add ptr, pos - copy 32, [ptr] + y + point ptr into screen { + st off, c + add ptr, pos + copy 32, [ptr] + y + } copy new_pos, pos - copy ^screen, ptr - st off, c - add ptr, pos - copy 82, [ptr] + y + point ptr into screen { + st off, c + add ptr, pos + copy 82, [ptr] + y + } } } else { copy delta, compare_target @@ -372,7 +359,7 @@ define game_state_title_screen game_state_routine st on, c sub a, 64 // yuck. 
oh well - st a, screen1 + y + st a, screen + y } st off, c @@ -444,8 +431,7 @@ define our_cinv game_state_routine define main routine inputs cinv - outputs cinv, save_cinv, pos, dispatch_game_state, - screen1, screen2, screen3, screen4, colormap1, colormap2, colormap3, colormap4 + outputs cinv, save_cinv, pos, dispatch_game_state, screen, colormap trashes a, y, n, c, z, vic_border, vic_bg { ld a, 5 diff --git a/eg/rudiments/buffer.60p b/eg/rudiments/buffer.60p index 7772449..72ff18a 100644 --- a/eg/rudiments/buffer.60p +++ b/eg/rudiments/buffer.60p @@ -1,7 +1,7 @@ // Include `support/${PLATFORM}.60p` before this source // Should print Y -buffer[2048] buf +byte table[2048] buf pointer ptr @ 254 byte foo diff --git a/src/sixtypical/analyzer.py b/src/sixtypical/analyzer.py index 39ed406..7d115e8 100644 --- a/src/sixtypical/analyzer.py +++ b/src/sixtypical/analyzer.py @@ -1,10 +1,12 @@ # encoding: UTF-8 -from sixtypical.ast import Program, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save +from sixtypical.ast import ( + Program, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save, PointInto +) from sixtypical.model import ( TYPE_BYTE, TYPE_WORD, - TableType, BufferType, PointerType, VectorType, RoutineType, - ConstantRef, LocationRef, IndirectRef, IndexedRef, AddressRef, + TableType, PointerType, VectorType, RoutineType, + ConstantRef, LocationRef, IndirectRef, IndexedRef, REG_A, REG_Y, FLAG_Z, FLAG_N, FLAG_V, FLAG_C ) @@ -107,13 +109,14 @@ class Context(object): unwriteable in certain contexts, such as `for` loops. 
""" def __init__(self, routines, routine, inputs, outputs, trashes): - self.routines = routines # Location -> AST node - self.routine = routine - self._touched = set() - self._range = dict() - self._writeable = set() + self.routines = routines # LocationRef -> Routine (AST node) + self.routine = routine # Routine (AST node) + self._touched = set() # {LocationRef} + self._range = dict() # LocationRef -> (Int, Int) + self._writeable = set() # {LocationRef} self._terminated = False self._gotos_encountered = set() + self._pointer_assoc = dict() for ref in inputs: if ref.is_constant(): @@ -137,21 +140,38 @@ class Context(object): LocationRef.format_set(self._touched), LocationRef.format_set(self._range), LocationRef.format_set(self._writeable) ) + def to_json_data(self): + return { + 'touched': ','.join(sorted(loc.name for loc in self._touched)), + 'range': dict((loc.name, '{}-{}'.format(rng[0], rng[1])) for (loc, rng) in self._range.items()), + 'writeable': ','.join(sorted(loc.name for loc in self._writeable)), + 'terminated': self._terminated, + 'gotos_encountered': ','.join(sorted(loc.name for loc in self._gotos_encountered)), + } + def clone(self): c = Context(self.routines, self.routine, [], [], []) c._touched = set(self._touched) c._range = dict(self._range) c._writeable = set(self._writeable) + c._pointer_assoc = dict(self._pointer_assoc) + c._gotos_encountered = set(self._gotos_encountered) return c def update_from(self, other): + """Replaces the information in this context, with the information from the other context. + This is an overwriting action - it does not attempt to merge the contexts. + + We do not replace the gotos_encountered for technical reasons. 
(In `analyze_if`, + we merge those sets afterwards; at the end of `analyze_routine`, they are not distinct in the + set of contexts we are updating from, and we want to retain our own.)""" self.routines = other.routines self.routine = other.routine self._touched = set(other._touched) self._range = dict(other._range) self._writeable = set(other._writeable) self._terminated = other._terminated - self._gotos_encounters = set(other._gotos_encountered) + self._pointer_assoc = dict(other._pointer_assoc) def each_meaningful(self): for ref in self._range.keys(): @@ -197,8 +217,11 @@ class Context(object): message += ' (%s)' % kwargs['message'] raise exception_class(self.routine, message) - def assert_in_range(self, inside, outside): - # FIXME there's a bit of I'm-not-sure-the-best-way-to-do-this-ness, here... + def assert_in_range(self, inside, outside, offset): + """Given two locations, assert that the first location, offset by the given offset, + is contained 'inside' the second location.""" + assert isinstance(inside, LocationRef) + assert isinstance(outside, LocationRef) # inside should always be meaningful inside_range = self._range[inside] @@ -208,13 +231,11 @@ class Context(object): outside_range = self._range[outside] else: outside_range = outside.max_range() - if isinstance(outside.type, TableType): - outside_range = (0, outside.type.size-1) - if inside_range[0] < outside_range[0] or inside_range[1] > outside_range[1]: + if (inside_range[0] + offset.value) < outside_range[0] or (inside_range[1] + offset.value) > outside_range[1]: raise RangeExceededError(self.routine, - "Possible range of {} {} exceeds acceptable range of {} {}".format( - inside, inside_range, outside, outside_range + "Possible range of {} {} (+{}) exceeds acceptable range of {} {}".format( + inside, inside_range, offset, outside, outside_range ) ) @@ -311,17 +332,17 @@ class Context(object): def has_terminated(self): return self._terminated - def assert_types_for_read_table(self, instr, src, 
dest, type_): + def assert_types_for_read_table(self, instr, src, dest, type_, offset): if (not TableType.is_a_table_type(src.ref.type, type_)) or (not dest.type == type_): raise TypeMismatchError(instr, '{} and {}'.format(src.ref.name, dest.name)) self.assert_meaningful(src, src.index) - self.assert_in_range(src.index, src.ref) + self.assert_in_range(src.index, src.ref, offset) - def assert_types_for_update_table(self, instr, dest, type_): + def assert_types_for_update_table(self, instr, dest, type_, offset): if not TableType.is_a_table_type(dest.ref.type, type_): raise TypeMismatchError(instr, '{}'.format(dest.ref.name)) self.assert_meaningful(dest.index) - self.assert_in_range(dest.index, dest.ref) + self.assert_in_range(dest.index, dest.ref, offset) self.set_written(dest.ref) def extract(self, location): @@ -359,6 +380,12 @@ class Context(object): elif location in self._writeable: self._writeable.remove(location) + def get_assoc(self, pointer): + return self._pointer_assoc.get(pointer) + + def set_assoc(self, pointer, table): + self._pointer_assoc[pointer] = table + class Analyzer(object): @@ -366,6 +393,7 @@ class Analyzer(object): self.current_routine = None self.routines = {} self.debug = debug + self.exit_contexts_map = {} def assert_type(self, type_, *locations): for location in locations: @@ -398,32 +426,21 @@ class Analyzer(object): assert isinstance(routine, Routine) if routine.block is None: # it's an extern, that's fine - return + return None self.current_routine = routine type_ = routine.location.type context = Context(self.routines, routine, type_.inputs, type_.outputs, type_.trashes) self.exit_contexts = [] - if self.debug: - print("at start of routine `{}`:".format(routine.name)) - print(context) - self.analyze_block(routine.block, context) trashed = set(context.each_touched()) - set(context.each_meaningful()) - if self.debug: - print("at end of routine `{}`:".format(routine.name)) - print(context) - print("trashed: ", 
LocationRef.format_set(trashed)) - print("outputs: ", LocationRef.format_set(type_.outputs)) - trashed_outputs = type_.outputs & trashed - if trashed_outputs: - print("TRASHED OUTPUTS: ", LocationRef.format_set(trashed_outputs)) - print('') - print('-' * 79) - print('') + self.exit_contexts_map[routine.name] = { + 'end_context': context.to_json_data(), + 'exit_contexts': [e.to_json_data() for e in self.exit_contexts] + } if self.exit_contexts: # check that they are all consistent @@ -438,6 +455,10 @@ class Analyzer(object): raise InconsistentExitError("Exit contexts are not consistent") if set(ex.each_writeable()) != exit_writeable: raise InconsistentExitError("Exit contexts are not consistent") + + # We now set the main context to the (consistent) exit context + # so that this routine is perceived as having the same effect + # that any of the goto'ed routines have. context.update_from(exit_context) # these all apply whether we encountered goto(s) in this routine, or not...: @@ -480,6 +501,8 @@ class Analyzer(object): raise IllegalJumpError(instr, instr) elif isinstance(instr, Save): self.analyze_save(instr, context) + elif isinstance(instr, PointInto): + self.analyze_point_into(instr, context) else: raise NotImplementedError @@ -494,13 +517,19 @@ class Analyzer(object): if opcode == 'ld': if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE) + context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) elif isinstance(src, IndirectRef): # copying this analysis from the matching branch in `copy`, below if isinstance(src.ref.type, PointerType) and dest.type == TYPE_BYTE: pass else: raise TypeMismatchError(instr, (src, dest)) + + origin = context.get_assoc(src.ref) + if not origin: + raise UnmeaningfulReadError(instr, src.ref) + context.assert_meaningful(origin) + context.assert_meaningful(src.ref, REG_Y) elif src.type != dest.type: raise TypeMismatchError(instr, '{} and {}'.format(src.name, dest.name)) @@ 
-512,15 +541,22 @@ class Analyzer(object): if isinstance(dest, IndexedRef): if src.type != TYPE_BYTE: raise TypeMismatchError(instr, (src, dest)) - context.assert_types_for_update_table(instr, dest, TYPE_BYTE) + context.assert_types_for_update_table(instr, dest, TYPE_BYTE, dest.offset) elif isinstance(dest, IndirectRef): # copying this analysis from the matching branch in `copy`, below if isinstance(dest.ref.type, PointerType) and src.type == TYPE_BYTE: pass else: raise TypeMismatchError(instr, (src, dest)) + context.assert_meaningful(dest.ref, REG_Y) - context.set_written(dest.ref) + + target = context.get_assoc(dest.ref) + if not target: + raise ForbiddenWriteError(instr, dest.ref) + context.set_touched(target) + context.set_written(target) + elif src.type != dest.type: raise TypeMismatchError(instr, '{} and {}'.format(src, dest)) else: @@ -530,7 +566,7 @@ class Analyzer(object): elif opcode == 'add': context.assert_meaningful(src, dest, FLAG_C) if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE) + context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) elif src.type == TYPE_BYTE: self.assert_type(TYPE_BYTE, src, dest) if dest != REG_A: @@ -551,7 +587,7 @@ class Analyzer(object): elif opcode == 'sub': context.assert_meaningful(src, dest, FLAG_C) if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE) + context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) elif src.type == TYPE_BYTE: self.assert_type(TYPE_BYTE, src, dest) if dest != REG_A: @@ -566,7 +602,7 @@ class Analyzer(object): elif opcode == 'cmp': context.assert_meaningful(src, dest) if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE) + context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) elif src.type == TYPE_BYTE: self.assert_type(TYPE_BYTE, src, dest) else: @@ -576,7 +612,7 @@ class Analyzer(object): 
context.set_written(FLAG_Z, FLAG_N, FLAG_C) elif opcode == 'and': if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE) + context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) else: self.assert_type(TYPE_BYTE, src, dest) context.assert_meaningful(src, dest) @@ -588,7 +624,7 @@ class Analyzer(object): context.set_top_of_range(dest, context.get_top_of_range(src)) elif opcode in ('or', 'xor'): if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE) + context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) else: self.assert_type(TYPE_BYTE, src, dest) context.assert_meaningful(src, dest) @@ -597,7 +633,7 @@ class Analyzer(object): elif opcode in ('inc', 'dec'): context.assert_meaningful(dest) if isinstance(dest, IndexedRef): - context.assert_types_for_update_table(instr, dest, TYPE_BYTE) + context.assert_types_for_update_table(instr, dest, TYPE_BYTE, dest.offset) context.set_written(dest.ref, FLAG_Z, FLAG_N) #context.invalidate_range(dest) else: @@ -620,7 +656,7 @@ class Analyzer(object): elif opcode in ('shl', 'shr'): context.assert_meaningful(dest, FLAG_C) if isinstance(dest, IndexedRef): - context.assert_types_for_update_table(instr, dest, TYPE_BYTE) + context.assert_types_for_update_table(instr, dest, TYPE_BYTE, dest.offset) context.set_written(dest.ref, FLAG_Z, FLAG_N, FLAG_C) #context.invalidate_range(dest) else: @@ -647,12 +683,7 @@ class Analyzer(object): # 1. 
check that their types are compatible - if isinstance(src, AddressRef) and isinstance(dest, LocationRef): - if isinstance(src.ref.type, BufferType) and isinstance(dest.type, PointerType): - pass - else: - raise TypeMismatchError(instr, (src, dest)) - elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef): + if isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef): if src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): pass else: @@ -679,7 +710,7 @@ class Analyzer(object): pass else: raise TypeMismatchError(instr, (src, dest)) - context.assert_in_range(dest.index, dest.ref) + context.assert_in_range(dest.index, dest.ref, dest.offset) elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef): if TableType.is_a_table_type(src.ref.type, TYPE_WORD) and dest.type == TYPE_WORD: @@ -689,7 +720,7 @@ class Analyzer(object): pass else: raise TypeMismatchError(instr, (src, dest)) - context.assert_in_range(src.index, src.ref) + context.assert_in_range(src.index, src.ref, src.offset) elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, LocationRef): if src.type == dest.type: @@ -707,16 +738,37 @@ class Analyzer(object): if isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef): context.assert_meaningful(src, REG_Y) - # TODO this will need to be more sophisticated. it's the thing ref points to that is written, not ref itself. - context.set_written(dest.ref) + + target = context.get_assoc(dest.ref) + if not target: + raise ForbiddenWriteError(instr, dest.ref) + context.set_touched(target) + context.set_written(target) + elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef): context.assert_meaningful(src.ref, REG_Y) - # TODO more sophisticated? 
+ + origin = context.get_assoc(src.ref) + if not origin: + raise UnmeaningfulReadError(instr, src.ref) + context.assert_meaningful(origin) + + context.set_touched(dest) context.set_written(dest) elif isinstance(src, IndirectRef) and isinstance(dest, IndirectRef): context.assert_meaningful(src.ref, REG_Y) - # TODO more sophisticated? - context.set_written(dest.ref) + + origin = context.get_assoc(src.ref) + if not origin: + raise UnmeaningfulReadError(instr, src.ref) + context.assert_meaningful(origin) + + target = context.get_assoc(dest.ref) + if not target: + raise ForbiddenWriteError(instr, dest.ref) + context.set_touched(target) + context.set_written(target) + elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef): context.assert_meaningful(src, dest.ref, dest.index) context.set_written(dest.ref) @@ -905,3 +957,29 @@ class Analyzer(object): else: context.set_touched(REG_A) context.set_unmeaningful(REG_A) + + def analyze_point_into(self, instr, context): + if not isinstance(instr.pointer.type, PointerType): + raise TypeMismatchError(instr, instr.pointer) + if not TableType.is_a_table_type(instr.table.type, TYPE_BYTE): + raise TypeMismatchError(instr, instr.table) + + # check that pointer is not yet associated with any table. + + if context.get_assoc(instr.pointer): + raise ForbiddenWriteError(instr, instr.pointer) + + # associate pointer with table, mark it as meaningful. + + context.set_assoc(instr.pointer, instr.table) + context.set_meaningful(instr.pointer) + context.set_touched(instr.pointer) + + self.analyze_block(instr.block, context) + if context.encountered_gotos(): + raise IllegalJumpError(instr, instr) + + # unassociate pointer with table, mark as unmeaningful. 
+ + context.set_assoc(instr.pointer, None) + context.set_unmeaningful(instr.pointer) diff --git a/src/sixtypical/ast.py b/src/sixtypical/ast.py index f7aad36..9866921 100644 --- a/src/sixtypical/ast.py +++ b/src/sixtypical/ast.py @@ -97,3 +97,8 @@ class WithInterruptsOff(Instr): class Save(Instr): value_attrs = ('locations',) child_attrs = ('block',) + + +class PointInto(Instr): + value_attrs = ('pointer', 'table',) + child_attrs = ('block',) diff --git a/src/sixtypical/compiler.py b/src/sixtypical/compiler.py index fe1e1c5..63b13b5 100644 --- a/src/sixtypical/compiler.py +++ b/src/sixtypical/compiler.py @@ -1,10 +1,12 @@ # encoding: UTF-8 -from sixtypical.ast import Program, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save +from sixtypical.ast import ( + Program, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save, PointInto +) from sixtypical.model import ( - ConstantRef, LocationRef, IndexedRef, IndirectRef, AddressRef, + ConstantRef, LocationRef, IndexedRef, IndirectRef, TYPE_BIT, TYPE_BYTE, TYPE_WORD, - TableType, BufferType, PointerType, RoutineType, VectorType, + TableType, PointerType, RoutineType, VectorType, REG_A, REG_X, REG_Y, FLAG_C ) from sixtypical.emitter import Byte, Word, Table, Label, Offset, LowAddressByte, HighAddressByte @@ -55,8 +57,6 @@ class Compiler(object): length = 2 elif isinstance(type_, TableType): length = type_.size * (1 if type_.of_type == TYPE_BYTE else 2) - elif isinstance(type_, BufferType): - length = type_.size if length is None: raise NotImplementedError("Need size for type {}".format(type_)) return length @@ -172,6 +172,8 @@ class Compiler(object): return self.compile_with_interrupts_off(instr) elif isinstance(instr, Save): return self.compile_save(instr) + elif isinstance(instr, PointInto): + return self.compile_point_into(instr) else: raise NotImplementedError @@ -190,9 +192,9 @@ class Compiler(object): elif isinstance(src, ConstantRef): self.emitter.emit(LDA(Immediate(Byte(src.value)))) 
elif isinstance(src, IndexedRef) and src.index == REG_X: - self.emitter.emit(LDA(AbsoluteX(self.get_label(src.ref.name)))) + self.emitter.emit(LDA(AbsoluteX(Offset(self.get_label(src.ref.name), src.offset.value)))) elif isinstance(src, IndexedRef) and src.index == REG_Y: - self.emitter.emit(LDA(AbsoluteY(self.get_label(src.ref.name)))) + self.emitter.emit(LDA(AbsoluteY(Offset(self.get_label(src.ref.name), src.offset.value)))) elif isinstance(src, IndirectRef) and isinstance(src.ref.type, PointerType): self.emitter.emit(LDA(IndirectY(self.get_label(src.ref.name)))) else: @@ -203,7 +205,7 @@ class Compiler(object): elif isinstance(src, ConstantRef): self.emitter.emit(LDX(Immediate(Byte(src.value)))) elif isinstance(src, IndexedRef) and src.index == REG_Y: - self.emitter.emit(LDX(AbsoluteY(self.get_label(src.ref.name)))) + self.emitter.emit(LDX(AbsoluteY(Offset(self.get_label(src.ref.name), src.offset.value)))) else: self.emitter.emit(LDX(self.absolute_or_zero_page(self.get_label(src.name)))) elif dest == REG_Y: @@ -212,7 +214,7 @@ class Compiler(object): elif isinstance(src, ConstantRef): self.emitter.emit(LDY(Immediate(Byte(src.value)))) elif isinstance(src, IndexedRef) and src.index == REG_X: - self.emitter.emit(LDY(AbsoluteX(self.get_label(src.ref.name)))) + self.emitter.emit(LDY(AbsoluteX(Offset(self.get_label(src.ref.name), src.offset.value)))) else: self.emitter.emit(LDY(self.absolute_or_zero_page(self.get_label(src.name)))) else: @@ -234,7 +236,7 @@ class Compiler(object): REG_X: AbsoluteX, REG_Y: AbsoluteY, }[dest.index] - operand = mode_cls(self.get_label(dest.ref.name)) + operand = mode_cls(Offset(self.get_label(dest.ref.name), dest.offset.value)) elif isinstance(dest, IndirectRef) and isinstance(dest.ref.type, PointerType): operand = IndirectY(self.get_label(dest.ref.name)) else: @@ -250,7 +252,8 @@ class Compiler(object): if isinstance(src, ConstantRef): self.emitter.emit(ADC(Immediate(Byte(src.value)))) elif isinstance(src, IndexedRef): - 
self.emitter.emit(ADC(self.addressing_mode_for_index(src.index)(self.get_label(src.ref.name)))) + mode = self.addressing_mode_for_index(src.index) + self.emitter.emit(ADC(mode(Offset(self.get_label(src.ref.name), src.offset.value)))) else: self.emitter.emit(ADC(Absolute(self.get_label(src.name)))) elif isinstance(dest, LocationRef) and src.type == TYPE_BYTE and dest.type == TYPE_BYTE: @@ -316,7 +319,8 @@ class Compiler(object): if isinstance(src, ConstantRef): self.emitter.emit(SBC(Immediate(Byte(src.value)))) elif isinstance(src, IndexedRef): - self.emitter.emit(SBC(self.addressing_mode_for_index(src.index)(self.get_label(src.ref.name)))) + mode = self.addressing_mode_for_index(src.index) + self.emitter.emit(SBC(mode(Offset(self.get_label(src.ref.name), src.offset.value)))) else: self.emitter.emit(SBC(Absolute(self.get_label(src.name)))) elif isinstance(dest, LocationRef) and src.type == TYPE_BYTE and dest.type == TYPE_BYTE: @@ -367,7 +371,8 @@ class Compiler(object): if isinstance(src, ConstantRef): self.emitter.emit(cls(Immediate(Byte(src.value)))) elif isinstance(src, IndexedRef): - self.emitter.emit(cls(self.addressing_mode_for_index(src.index)(self.get_label(src.ref.name)))) + mode = self.addressing_mode_for_index(src.index) + self.emitter.emit(cls(mode(Offset(self.get_label(src.ref.name), src.offset.value)))) else: self.emitter.emit(cls(self.absolute_or_zero_page(self.get_label(src.name)))) else: @@ -384,7 +389,8 @@ class Compiler(object): if dest == REG_A: self.emitter.emit(cls()) elif isinstance(dest, IndexedRef): - self.emitter.emit(cls(self.addressing_mode_for_index(dest.index)(self.get_label(dest.ref.name)))) + mode = self.addressing_mode_for_index(dest.index) + self.emitter.emit(cls(mode(Offset(self.get_label(dest.ref.name), dest.offset.value)))) else: self.emitter.emit(cls(self.absolute_or_zero_page(self.get_label(dest.name)))) elif opcode == 'call': @@ -455,7 +461,8 @@ class Compiler(object): self.emitter.emit(cls(Immediate(Byte(src.value)))) elif 
isinstance(src, IndexedRef): # FIXME might not work for some dest's (that is, cls's) - self.emitter.emit(cls(self.addressing_mode_for_index(src.index)(self.get_label(src.ref.name)))) + mode = self.addressing_mode_for_index(src.index) + self.emitter.emit(cls(mode(Offset(self.get_label(src.ref.name), src.offset.value)))) else: self.emitter.emit(cls(Absolute(self.get_label(src.name)))) @@ -466,7 +473,8 @@ class Compiler(object): elif dest == REG_Y: self.emitter.emit(INY()) elif isinstance(dest, IndexedRef): - self.emitter.emit(INC(self.addressing_mode_for_index(dest.index)(self.get_label(dest.ref.name)))) + mode = self.addressing_mode_for_index(dest.index) + self.emitter.emit(INC(mode(Offset(self.get_label(dest.ref.name), dest.offset.value)))) else: self.emitter.emit(INC(Absolute(self.get_label(dest.name)))) @@ -477,7 +485,8 @@ class Compiler(object): elif dest == REG_Y: self.emitter.emit(DEY()) elif isinstance(dest, IndexedRef): - self.emitter.emit(DEC(self.addressing_mode_for_index(dest.index)(self.get_label(dest.ref.name)))) + mode = self.addressing_mode_for_index(dest.index) + self.emitter.emit(DEC(mode(Offset(self.get_label(dest.ref.name), dest.offset.value)))) else: self.emitter.emit(DEC(Absolute(self.get_label(dest.name)))) @@ -505,62 +514,60 @@ class Compiler(object): dest_label = self.get_label(dest.ref.name) self.emitter.emit(LDA(IndirectY(src_label))) self.emitter.emit(STA(IndirectY(dest_label))) - elif isinstance(src, AddressRef) and isinstance(dest, LocationRef) and isinstance(src.ref.type, BufferType) and isinstance(dest.type, PointerType): - ### copy ^buf, ptr - src_label = self.get_label(src.ref.name) - dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) - self.emitter.emit(STA(ZeroPage(dest_label))) - self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) - self.emitter.emit(STA(ZeroPage(Offset(dest_label, 1)))) elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and src.type == 
TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): ### copy w, wtab + y src_label = self.get_label(src.name) dest_label = self.get_label(dest.ref.name) + mode = self.addressing_mode_for_index(dest.index) self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) + self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value)))) self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) + self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value + 256)))) elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and isinstance(src.type, VectorType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType): ### copy vec, vtab + y # FIXME this is the exact same as above - can this be simplified? src_label = self.get_label(src.name) dest_label = self.get_label(dest.ref.name) + mode = self.addressing_mode_for_index(dest.index) self.emitter.emit(LDA(Absolute(src_label))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) + self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value)))) self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) + self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value + 256)))) elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and isinstance(src.type, RoutineType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType): ### copy routine, vtab + y src_label = self.get_label(src.name) dest_label = self.get_label(dest.ref.name) + mode = self.addressing_mode_for_index(dest.index) self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) + 
self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value)))) self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) + self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value + 256)))) elif isinstance(src, ConstantRef) and isinstance(dest, IndexedRef) and src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): ### copy 9999, wtab + y dest_label = self.get_label(dest.ref.name) + mode = self.addressing_mode_for_index(dest.index) self.emitter.emit(LDA(Immediate(Byte(src.low_byte())))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label))) + self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value)))) self.emitter.emit(LDA(Immediate(Byte(src.high_byte())))) - self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256)))) + self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value + 256)))) elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef) and TableType.is_a_table_type(src.ref.type, TYPE_WORD) and dest.type == TYPE_WORD: ### copy wtab + y, w src_label = self.get_label(src.ref.name) dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(src_label))) + mode = self.addressing_mode_for_index(src.index) + self.emitter.emit(LDA(mode(Offset(src_label, src.offset.value)))) self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(Offset(src_label, 256)))) + self.emitter.emit(LDA(mode(Offset(src_label, src.offset.value + 256)))) self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef) and isinstance(dest.type, VectorType) and isinstance(src.ref.type, TableType) and isinstance(src.ref.type.of_type, VectorType): ### copy vtab + y, vec # FIXME this is the exact same as above - can this be 
simplified? src_label = self.get_label(src.ref.name) dest_label = self.get_label(dest.name) - self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(src_label))) + mode = self.addressing_mode_for_index(src.index) + self.emitter.emit(LDA(mode(Offset(src_label, src.offset.value)))) self.emitter.emit(STA(Absolute(dest_label))) - self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(Offset(src_label, 256)))) + self.emitter.emit(LDA(mode(Offset(src_label, src.offset.value + 256)))) self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) elif src.type == TYPE_BYTE and dest.type == TYPE_BYTE and not isinstance(src, ConstantRef): ### copy b1, b2 @@ -700,3 +707,14 @@ class Compiler(object): src_label = self.get_label(location.name) self.emitter.emit(PLA()) self.emitter.emit(STA(Absolute(src_label))) + + def compile_point_into(self, instr): + src_label = self.get_label(instr.table.name) + dest_label = self.get_label(instr.pointer.name) + + self.emitter.emit(LDA(Immediate(HighAddressByte(src_label)))) + self.emitter.emit(STA(ZeroPage(dest_label))) + self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) + self.emitter.emit(STA(ZeroPage(Offset(dest_label, 1)))) + + self.compile_block(instr.block) diff --git a/src/sixtypical/model.py b/src/sixtypical/model.py index 7204dde..f89340d 100644 --- a/src/sixtypical/model.py +++ b/src/sixtypical/model.py @@ -9,14 +9,8 @@ class Type(object): def __repr__(self): return 'Type(%r)' % self.name - def __str__(self): - return self.name - def __eq__(self, other): - return isinstance(other, Type) and other.name == self.name - - def __hash__(self): - return hash(self.name) + return other.__class__ == self.__class__ and other.name == self.name TYPE_BIT = Type('bit', max_range=(0, 1)) @@ -24,31 +18,25 @@ TYPE_BYTE = Type('byte', max_range=(0, 255)) TYPE_WORD = Type('word', max_range=(0, 65535)) - class RoutineType(Type): """This memory location contains the code for a routine.""" def __init__(self, inputs, outputs, 
trashes): - self.name = 'routine' self.inputs = inputs self.outputs = outputs self.trashes = trashes def __repr__(self): - return '%s(%r, inputs=%r, outputs=%r, trashes=%r)' % ( - self.__class__.__name__, self.name, self.inputs, self.outputs, self.trashes + return '%s(inputs=%r, outputs=%r, trashes=%r)' % ( + self.__class__.__name__, self.inputs, self.outputs, self.trashes ) def __eq__(self, other): return isinstance(other, RoutineType) and ( - other.name == self.name and other.inputs == self.inputs and other.outputs == self.outputs and other.trashes == self.trashes ) - def __hash__(self): - return hash(self.name) ^ hash(self.inputs) ^ hash(self.outputs) ^ hash(self.trashes) - @classmethod def executable_types_compatible(cls_, src, dest): """Returns True iff a value of type `src` can be assigned to a storage location of type `dest`.""" @@ -70,7 +58,6 @@ class RoutineType(Type): class VectorType(Type): """This memory location contains the address of some other type (currently, only RoutineType).""" def __init__(self, of_type): - self.name = 'vector' self.of_type = of_type def __repr__(self): @@ -79,38 +66,38 @@ class VectorType(Type): ) def __eq__(self, other): - return self.name == other.name and self.of_type == other.of_type - - def __hash__(self): - return hash(self.name) ^ hash(self.of_type) + return isinstance(other, VectorType) and self.of_type == other.of_type class TableType(Type): def __init__(self, of_type, size): self.of_type = of_type self.size = size - self.name = '{} table[{}]'.format(self.of_type.name, self.size) def __repr__(self): return '%s(%r, %r)' % ( self.__class__.__name__, self.of_type, self.size ) + def __eq__(self, other): + return isinstance(other, TableType) and self.of_type == other.of_type and self.size == other.size + + @property + def max_range(self): + return (0, self.size - 1) + @classmethod def is_a_table_type(cls_, x, of_type): return isinstance(x, TableType) and x.of_type == of_type -class BufferType(Type): - def __init__(self, 
size): - self.size = size - self.name = 'buffer[%s]' % self.size - - class PointerType(Type): def __init__(self): self.name = 'pointer' + def __eq__(self, other): + return other.__class__ == self.__class__ + class Ref(object): def is_constant(self): @@ -139,7 +126,7 @@ class LocationRef(Ref): return equal def __hash__(self): - return hash(self.name + str(self.type)) + return hash(self.name + repr(self.type)) def __repr__(self): return '%s(%r, %r)' % (self.__class__.__name__, self.type, self.name) @@ -183,77 +170,28 @@ class IndirectRef(Ref): class IndexedRef(Ref): - def __init__(self, ref, index): + def __init__(self, ref, offset, index): self.ref = ref + self.offset = offset self.index = index def __eq__(self, other): - return isinstance(other, self.__class__) and self.ref == other.ref and self.index == other.index + return isinstance(other, self.__class__) and self.ref == other.ref and self.offset == other.offset and self.index == other.index def __hash__(self): - return hash(self.__class__.name) ^ hash(self.ref) ^ hash(self.index) + return hash(self.__class__.name) ^ hash(self.ref) ^ hash(self.offset) ^ hash(self.index) def __repr__(self): - return '%s(%r, %r)' % (self.__class__.__name__, self.ref, self.index) + return '%s(%r, %r, %r)' % (self.__class__.__name__, self.ref, self.offset, self.index) @property def name(self): - return '{}+{}'.format(self.ref.name, self.index.name) + return '{}+{}+{}'.format(self.ref.name, self.offset, self.index.name) def is_constant(self): return False -class AddressRef(Ref): - def __init__(self, ref): - self.ref = ref - - def __eq__(self, other): - return self.ref == other.ref - - def __hash__(self): - return hash(self.__class__.name) ^ hash(self.ref) - - def __repr__(self): - return '%s(%r)' % (self.__class__.__name__, self.ref) - - @property - def name(self): - return '^{}'.format(self.ref.name) - - def is_constant(self): - return True - - -class PartRef(Ref): - """For 'low byte of' location and 'high byte of' location 
modifiers. - - height=0 = low byte, height=1 = high byte. - - NOTE: Not actually used yet. Might require more thought before it's usable. - """ - def __init__(self, ref, height): - assert isinstance(ref, Ref) - assert ref.type == TYPE_WORD - self.ref = ref - self.height = height - self.type = TYPE_BYTE - - def __eq__(self, other): - return isinstance(other, PartRef) and ( - other.height == self.height and other.ref == self.ref - ) - - def __hash__(self): - return hash(self.ref) ^ hash(self.height) ^ hash(self.type) - - def __repr__(self): - return '%s(%r, %r)' % (self.__class__.__name__, self.ref, self.height) - - def is_constant(self): - return self.ref.is_constant() - - class ConstantRef(Ref): def __init__(self, type, value): self.type = type @@ -296,6 +234,10 @@ class ConstantRef(Ref): value -= 256 return ConstantRef(self.type, value) + @property + def name(self): + return 'constant({})'.format(self.value) + REG_A = LocationRef(TYPE_BYTE, 'a') REG_X = LocationRef(TYPE_BYTE, 'x') diff --git a/src/sixtypical/parser.py b/src/sixtypical/parser.py index 9bae211..0a8e299 100644 --- a/src/sixtypical/parser.py +++ b/src/sixtypical/parser.py @@ -1,10 +1,12 @@ # encoding: UTF-8 -from sixtypical.ast import Program, Defn, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save +from sixtypical.ast import ( + Program, Defn, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save, PointInto +) from sixtypical.model import ( TYPE_BIT, TYPE_BYTE, TYPE_WORD, - RoutineType, VectorType, TableType, BufferType, PointerType, - LocationRef, ConstantRef, IndirectRef, IndexedRef, AddressRef, + RoutineType, VectorType, TableType, PointerType, + LocationRef, ConstantRef, IndirectRef, IndexedRef, ) from sixtypical.scanner import Scanner @@ -81,9 +83,9 @@ class Parser(object): def backpatch_constraint_labels(type_): def resolve(w): - if not isinstance(w, ForwardReference): - return w - return self.lookup(w.name) + if not isinstance(w, ForwardReference): + return w + 
return self.lookup(w.name) if isinstance(type_, TableType): backpatch_constraint_labels(type_.of_type) elif isinstance(type_, VectorType): @@ -122,7 +124,7 @@ class Parser(object): self.typedef() if self.scanner.on('const'): self.defn_const() - typenames = ['byte', 'word', 'table', 'vector', 'buffer', 'pointer'] # 'routine', + typenames = ['byte', 'word', 'table', 'vector', 'pointer'] # 'routine', typenames.extend(self.context.typedefs.keys()) while self.scanner.on(*typenames): defn = self.defn() @@ -222,8 +224,8 @@ class Parser(object): if self.scanner.consume('table'): size = self.defn_size() - if size <= 0 or size > 256: - self.syntax_error("Table size must be > 0 and <= 256") + if size <= 0 or size > 65536: + self.syntax_error("Table size must be > 0 and <= 65536") type_ = TableType(type_, size) return type_ @@ -248,9 +250,6 @@ class Parser(object): elif self.scanner.consume('routine'): (inputs, outputs, trashes) = self.constraints() type_ = RoutineType(inputs=inputs, outputs=outputs, trashes=trashes) - elif self.scanner.consume('buffer'): - size = self.defn_size() - type_ = BufferType(size) elif self.scanner.consume('pointer'): type_ = PointerType() else: @@ -351,9 +350,6 @@ class Parser(object): self.scanner.expect('+') self.scanner.expect('y') return IndirectRef(loc) - elif self.scanner.consume('^'): - loc = self.locexpr() - return AddressRef(loc) else: return self.indexed_locexpr() @@ -361,9 +357,13 @@ class Parser(object): loc = self.locexpr() if not isinstance(loc, str): index = None + offset = ConstantRef(TYPE_BYTE, 0) if self.scanner.consume('+'): + if self.scanner.token in self.context.consts or self.scanner.on_type('integer literal'): + offset = self.const() + self.scanner.expect('+') index = self.locexpr() - loc = IndexedRef(loc, index) + loc = IndexedRef(loc, offset, index) return loc def statics(self): @@ -474,6 +474,12 @@ class Parser(object): locations = self.locexprs() block = self.block() return Save(self.scanner.line_number, 
locations=locations, block=block) + elif self.scanner.consume("point"): + pointer = self.locexpr() + self.scanner.expect("into") + table = self.locexpr() + block = self.block() + return PointInto(self.scanner.line_number, pointer=pointer, table=table, block=block) elif self.scanner.consume("trash"): dest = self.locexpr() return SingleOp(self.scanner.line_number, opcode='trash', src=None, dest=dest) diff --git a/tests/SixtyPical Analysis.md b/tests/SixtyPical Analysis.md index 4493d7c..89ae302 100644 --- a/tests/SixtyPical Analysis.md +++ b/tests/SixtyPical Analysis.md @@ -503,6 +503,52 @@ The index must be initialized. | } ? UnmeaningfulReadError: x +Storing to a table, you may also include a constant offset. + + | byte one + | byte table[256] many + | + | define main routine + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | ld a, 0 + | st a, many + 100 + x + | } + = ok + +Reading from a table, you may also include a constant offset. + + | byte table[256] many + | + | define main routine + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | ld a, many + 100 + x + | } + = ok + +Using a constant offset, you can read and write entries in +the table beyond the 256th. + + | byte one + | byte table[1024] many + | + | define main routine + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | ld a, many + 999 + x + | st a, many + 1000 + x + | } + = ok + There are other operations you can do on tables. (1/3) | byte table[256] many @@ -614,13 +660,35 @@ You can also copy a literal word to a word table. | } = ok +Copying to and from a word table with a constant offset. 
+ + | word one + | word table[256] many + | + | define main routine + | inputs one, many + | outputs one, many + | trashes a, x, n, z + | { + | ld x, 0 + | copy one, many + 100 + x + | copy many + 100 + x, one + | copy 9999, many + 1 + x + | } + = ok + #### tables: range checking #### It is a static analysis error if it cannot be proven that a read or write to a table falls within the defined size of that table. -(If a table has 256 entries, then there is never a problem, because a byte -cannot index any entry outside of 0..255.) +If a table has 256 entries, then there is never a problem (so long as +no constant offset is supplied), because a byte cannot index any entry +outside of 0..255. + +But if the table has fewer than 256 entries, or if a constant offset is +supplied, there is the possibility that the index will refer to an +entry in the table which does not exist. A SixtyPical implementation must be able to prove that the index is inside the range of the table in various ways. The simplest is to show that a @@ -664,6 +732,33 @@ constant value falls inside or outside the range of the table. | } ? RangeExceededError +Any constant offset is taken into account in this check. + + | byte table[32] many + | + | define main routine + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 31 + | ld a, many + 1 + x + | } + ? RangeExceededError + + | byte table[32] many + | + | define main routine + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 31 + | ld a, 0 + | st a, many + 1 + x + | } + ? RangeExceededError + This applies to `copy` as well. | word one: 77 @@ -706,6 +801,34 @@ This applies to `copy` as well. | } ? RangeExceededError +Any constant offset is taken into account in this check. + + | word one: 77 + | word table[32] many + | + | define main routine + | inputs many, one + | outputs many, one + | trashes a, x, n, z + | { + | ld x, 31 + | copy many + 1 + x, one + | } + ? 
RangeExceededError + + | word one: 77 + | word table[32] many + | + | define main routine + | inputs many, one + | outputs many, one + | trashes a, x, n, z + | { + | ld x, 31 + | copy one, many + 1 + x + | } + ? RangeExceededError + `AND`'ing a register with a value ensures the range of the register will not exceed the range of the value. This can be used to "clip" the range of an index so that it fits in @@ -726,7 +849,7 @@ a table. | } = ok -Test for "clipping", but not enough. +Tests for "clipping", but not enough. | word one: 77 | word table[32] many @@ -743,6 +866,21 @@ Test for "clipping", but not enough. | } ? RangeExceededError + | word one: 77 + | word table[32] many + | + | define main routine + | inputs a, many, one + | outputs many, one + | trashes a, x, n, z + | { + | and a, 31 + | ld x, a + | copy one, many + 1 + x + | copy many + 1 + x, one + | } + ? RangeExceededError + If you alter the value after "clipping" it, the range can no longer be guaranteed. @@ -2779,140 +2917,338 @@ Can't `copy` from a `word` to a `byte`. | } ? TypeMismatchError -### Buffers and pointers ### +### point ... into blocks ### -Note that `^buf` is a constant value, so it by itself does not require `buf` to be -listed in any input/output sets. +Pointer must be a pointer type. -However, if the code reads from it through a pointer, it *should* be in `inputs`. - -Likewise, if the code writes to it through a pointer, it *should* be in `outputs`. - -Of course, unless you write to *all* the bytes in a buffer, some of those bytes -might not be meaningful. So how meaningful is this check? - -This is an open problem. - -For now, convention says: if it is being read, list it in `inputs`, and if it is -being modified, list it in both `inputs` and `outputs`. - -Write literal through a pointer. 
- - | buffer[2048] buf - | pointer ptr + | byte table[256] tab + | word ptr | | define main routine - | inputs buf - | outputs y, buf + | inputs tab + | outputs y, tab | trashes a, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr - | copy 123, [ptr] + y + | point ptr into tab { + | copy 123, [ptr] + y + | } | } - = ok + ? TypeMismatchError -It does use `y`. +Cannot write through pointer outside a `point ... into` block. - | buffer[2048] buf + | byte table[256] tab | pointer ptr | | define main routine - | inputs buf - | outputs buf + | inputs tab, ptr + | outputs y, tab | trashes a, z, n, ptr | { - | copy ^buf, ptr + | ld y, 0 | copy 123, [ptr] + y | } + ? ForbiddenWriteError + + | byte table[256] tab + | pointer ptr + | + | define main routine + | inputs tab + | outputs y, tab + | trashes a, z, n, ptr + | { + | ld y, 0 + | point ptr into tab { + | copy 123, [ptr] + y + | } + | copy 123, [ptr] + y + | } + ? ForbiddenWriteError + +Write literal through a pointer into a table. + + | byte table[256] tab + | pointer ptr + | + | define main routine + | inputs tab + | outputs y, tab + | trashes a, z, n, ptr + | { + | ld y, 0 + | point ptr into tab { + | copy 123, [ptr] + y + | } + | } + = ok + +Writing into a table via a pointer does use `y`. + + | byte table[256] tab + | pointer ptr + | + | define main routine + | inputs tab + | outputs tab + | trashes a, z, n, ptr + | { + | point ptr into tab { + | copy 123, [ptr] + y + | } + | } ? UnmeaningfulReadError -Write stored value through a pointer. +Write stored value through a pointer into a table. - | buffer[2048] buf + | byte table[256] tab | pointer ptr | byte foo | | define main routine - | inputs foo, buf - | outputs y, buf + | inputs foo, tab + | outputs y, tab | trashes a, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr - | copy foo, [ptr] + y + | point ptr into tab { + | copy foo, [ptr] + y + | } | } = ok -Read through a pointer. +Read a table entry via a pointer. 
- | buffer[2048] buf + | byte table[256] tab | pointer ptr | byte foo | | define main routine - | inputs buf + | inputs tab | outputs foo | trashes a, y, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr + | point ptr into tab { | copy [ptr] + y, foo + | } | } = ok -Read and write through two pointers. +Read and write through two pointers into a table. - | buffer[2048] buf + | byte table[256] tab | pointer ptra | pointer ptrb | | define main routine - | inputs buf - | outputs buf + | inputs tab + | outputs tab | trashes a, y, z, n, ptra, ptrb | { | ld y, 0 - | copy ^buf, ptra - | copy ^buf, ptrb - | copy [ptra] + y, [ptrb] + y + | point ptra into tab { + | point ptrb into tab { + | copy [ptra] + y, [ptrb] + y + | } + | } | } = ok -Read through a pointer to the `a` register. Note that this is done with `ld`, +Read through a pointer into a table, to the `a` register. Note that this is done with `ld`, not `copy`. - | buffer[2048] buf + | byte table[256] tab | pointer ptr | byte foo | | define main routine - | inputs buf + | inputs tab | outputs a | trashes y, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr - | ld a, [ptr] + y + | point ptr into tab { + | ld a, [ptr] + y + | } | } = ok -Write the `a` register through a pointer. Note that this is done with `st`, +Write the `a` register through a pointer into a table. Note that this is done with `st`, not `copy`. - | buffer[2048] buf + | byte table[256] tab | pointer ptr | byte foo | | define main routine - | inputs buf - | outputs buf + | inputs tab + | outputs tab | trashes a, y, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr - | ld a, 255 - | st a, [ptr] + y + | point ptr into tab { + | ld a, 255 + | st a, [ptr] + y + | } + | } + = ok + +Cannot get a pointer into a non-byte (for instance, word) table. + + | word table[256] tab + | pointer ptr + | byte foo + | + | define main routine + | inputs tab + | outputs foo + | trashes a, y, z, n, ptr + | { + | ld y, 0 + | point ptr into tab { + | copy [ptr] + y, foo + | } + | } + ? 
TypeMismatchError + +Cannot get a pointer into a non-byte (for instance, vector) table. + + | vector (routine trashes a, z, n) table[256] tab + | pointer ptr + | vector (routine trashes a, z, n) foo + | + | define main routine + | inputs tab + | outputs foo + | trashes a, y, z, n, ptr + | { + | ld y, 0 + | point ptr into tab { + | copy [ptr] + y, foo + | } + | } + ? TypeMismatchError + +`point into` by itself only requires `ptr` to be writeable. By itself, +it does not require `tab` to be readable or writeable. + + | byte table[256] tab + | pointer ptr + | + | define main routine + | trashes a, z, n, ptr + | { + | point ptr into tab { + | ld a, 0 + | } + | } + = ok + + | byte table[256] tab + | pointer ptr + | + | define main routine + | trashes a, z, n + | { + | point ptr into tab { + | ld a, 0 + | } + | } + ? ForbiddenWriteError + +After a `point into` block, the pointer is no longer meaningful and cannot +be considered an output of the routine. + + | byte table[256] tab + | pointer ptr + | + | define main routine + | inputs tab + | outputs y, tab, ptr + | trashes a, z, n + | { + | ld y, 0 + | point ptr into tab { + | copy 123, [ptr] + y + | } + | } + ? UnmeaningfulOutputError + +If code in a routine reads from a table through a pointer, the table must be in +the `inputs` of that routine. + + | byte table[256] tab + | pointer ptr + | byte foo + | + | define main routine + | outputs foo + | trashes a, y, z, n, ptr + | { + | ld y, 0 + | point ptr into tab { + | copy [ptr] + y, foo + | } + | } + ? UnmeaningfulReadError + +Likewise, if code in a routine writes into a table via a pointer, the table must +be in the `outputs` of that routine. + + | byte table[256] tab + | pointer ptr + | + | define main routine + | inputs tab + | trashes a, y, z, n, ptr + | { + | ld y, 0 + | point ptr into tab { + | copy 123, [ptr] + y + | } + | } + ? 
ForbiddenWriteError + +If code in a routine reads from a table through a pointer, the pointer *should* +remain inside the range of the table. This is currently not checked. + + | byte table[32] tab + | pointer ptr + | byte foo + | + | define main routine + | inputs tab + | outputs foo + | trashes a, y, c, z, n, v, ptr + | { + | ld y, 0 + | point ptr into tab { + | st off, c + | add ptr, word 100 + | copy [ptr] + y, foo + | } + | } + = ok + +Likewise, if code in a routine writes into a table through a pointer, the pointer +*should* remain inside the range of the table. This is currently not checked. + + | byte table[32] tab + | pointer ptr + | + | define main routine + | inputs tab + | outputs tab + | trashes a, y, c, z, n, v, ptr + | { + | ld y, 0 + | point ptr into tab { + | st off, c + | add ptr, word 100 + | copy 123, [ptr] + y + | } | } = ok diff --git a/tests/SixtyPical Compilation.md b/tests/SixtyPical Compilation.md index c3a4ec7..ec29a16 100644 --- a/tests/SixtyPical Compilation.md +++ b/tests/SixtyPical Compilation.md @@ -385,6 +385,68 @@ Some instructions on tables. (3/3) = $081B DEC $081F,X = $081E RTS +Using a constant offset, you can read and write entries in +the table beyond the 256th. + + | byte one + | byte table[1024] many + | + | define main routine + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | ld a, many + x + | st a, many + x + | ld a, many + 999 + x + | st a, many + 1000 + x + | } + = $080D LDX #$00 + = $080F LDA $081D,X + = $0812 STA $081D,X + = $0815 LDA $0C04,X + = $0818 STA $0C05,X + = $081B RTS + +Instructions on tables, with constant offsets. 
+ + | byte table[256] many + | + | define main routine + | inputs many + | outputs many + | trashes a, x, c, n, z, v + | { + | ld x, 0 + | ld a, 0 + | st off, c + | add a, many + 1 + x + | sub a, many + 2 + x + | cmp a, many + 3 + x + | and a, many + 4 + x + | or a, many + 5 + x + | xor a, many + 6 + x + | shl many + 7 + x + | shr many + 8 + x + | inc many + 9 + x + | dec many + 10 + x + | } + = $080D LDX #$00 + = $080F LDA #$00 + = $0811 CLC + = $0812 ADC $0832,X + = $0815 SBC $0833,X + = $0818 CMP $0834,X + = $081B AND $0835,X + = $081E ORA $0836,X + = $0821 EOR $0837,X + = $0824 ROL $0838,X + = $0827 ROR $0839,X + = $082A INC $083A,X + = $082D DEC $083B,X + = $0830 RTS + Compiling 16-bit `cmp`. | word za @ 60001 @@ -876,6 +938,42 @@ Copy routine (by forward reference) to vector. = $0818 INX = $0819 RTS +Copy byte to byte table and back, with both `x` and `y` as indexes, +plus constant offsets. + + | byte one + | byte table[256] many + | + | define main routine + | inputs one, many + | outputs one, many + | trashes a, x, y, n, z + | { + | ld x, 0 + | ld y, 0 + | ld a, 77 + | st a, many + x + | st a, many + y + | st a, many + 1 + x + | st a, many + 1 + y + | ld a, many + x + | ld a, many + y + | ld a, many + 8 + x + | ld a, many + 8 + y + | } + = $080D LDX #$00 + = $080F LDY #$00 + = $0811 LDA #$4D + = $0813 STA $082D,X + = $0816 STA $082D,Y + = $0819 STA $082E,X + = $081C STA $082E,Y + = $081F LDA $082D,X + = $0822 LDA $082D,Y + = $0825 LDA $0835,X + = $0828 LDA $0835,Y + = $082B RTS + Copy word to word table and back, with both `x` and `y` as indexes. | word one @@ -918,6 +1016,48 @@ Copy word to word table and back, with both `x` and `y` as indexes. = $0848 STA $084D = $084B RTS +Copy word to word table and back, with constant offsets. 
+ + | word one + | word table[256] many + | + | define main routine + | inputs one, many + | outputs one, many + | trashes a, x, y, n, z + | { + | ld x, 0 + | ld y, 0 + | copy 777, one + | copy one, many + 1 + x + | copy one, many + 2 + y + | copy many + 3 + x, one + | copy many + 4 + y, one + | } + = $080D LDX #$00 + = $080F LDY #$00 + = $0811 LDA #$09 + = $0813 STA $084C + = $0816 LDA #$03 + = $0818 STA $084D + = $081B LDA $084C + = $081E STA $084F,X + = $0821 LDA $084D + = $0824 STA $094F,X + = $0827 LDA $084C + = $082A STA $0850,Y + = $082D LDA $084D + = $0830 STA $0950,Y + = $0833 LDA $0851,X + = $0836 STA $084C + = $0839 LDA $0951,X + = $083C STA $084D + = $083F LDA $0852,Y + = $0842 STA $084C + = $0845 LDA $0952,Y + = $0848 STA $084D + = $084B RTS + Indirect call. | vector routine @@ -1025,6 +1165,57 @@ Copying to and from a vector table. = $0842 JMP ($0846) = $0845 RTS +Copying to and from a vector table, with constant offsets. + + | vector routine + | outputs x + | trashes a, z, n + | one + | vector routine + | outputs x + | trashes a, z, n + | table[256] many + | + | define bar routine outputs x trashes a, z, n { + | ld x, 200 + | } + | + | define main routine + | inputs one, many + | outputs one, many + | trashes a, x, n, z + | { + | ld x, 0 + | copy bar, one + | copy bar, many + 1 + x + | copy one, many + 2 + x + | copy many + 3 + x, one + | call one + | } + = $080D LDX #$00 + = $080F LDA #$3F + = $0811 STA $0846 + = $0814 LDA #$08 + = $0816 STA $0847 + = $0819 LDA #$3F + = $081B STA $0849,X + = $081E LDA #$08 + = $0820 STA $0949,X + = $0823 LDA $0846 + = $0826 STA $084A,X + = $0829 LDA $0847 + = $082C STA $094A,X + = $082F LDA $084B,X + = $0832 STA $0846 + = $0835 LDA $094B,X + = $0838 STA $0847 + = $083B JSR $0842 + = $083E RTS + = $083F LDX #$C8 + = $0841 RTS + = $0842 JMP ($0846) + = $0845 RTS + ### add, sub Various modes of `add`. @@ -1207,20 +1398,21 @@ Subtracting a word memory location from another word memory location. 
= $081D STA $0822 = $0820 RTS -### Buffers and Pointers +### Tables and Pointers -Load address into pointer. +Load address of table into pointer. - | buffer[2048] buf + | byte table[256] tab | pointer ptr @ 254 | | define main routine - | inputs buf - | outputs buf, y + | inputs tab + | outputs tab, y | trashes a, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr + | point ptr into tab { + | } | } = $080D LDY #$00 = $080F LDA #$18 @@ -1231,17 +1423,18 @@ Load address into pointer. Write literal through a pointer. - | buffer[2048] buf + | byte table[256] tab | pointer ptr @ 254 | | define main routine - | inputs buf - | outputs buf, y + | inputs tab + | outputs tab, y | trashes a, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr - | copy 123, [ptr] + y + | point ptr into tab { + | copy 123, [ptr] + y + | } | } = $080D LDY #$00 = $080F LDA #$1C @@ -1254,43 +1447,45 @@ Write literal through a pointer. Write stored value through a pointer. - | buffer[2048] buf + | byte table[256] tab | pointer ptr @ 254 | byte foo | | define main routine - | inputs foo, buf - | outputs y, buf + | inputs foo, tab + | outputs y, tab | trashes a, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr - | copy foo, [ptr] + y + | point ptr into tab { + | copy foo, [ptr] + y + | } | } = $080D LDY #$00 = $080F LDA #$1D = $0811 STA $FE = $0813 LDA #$08 = $0815 STA $FF - = $0817 LDA $101D + = $0817 LDA $091D = $081A STA ($FE),Y = $081C RTS Read through a pointer, into a byte storage location, or the `a` register. - | buffer[2048] buf + | byte table[256] tab | pointer ptr @ 254 | byte foo | | define main routine - | inputs buf + | inputs tab | outputs y, foo | trashes a, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr - | copy [ptr] + y, foo - | ld a, [ptr] + y + | point ptr into tab { + | copy [ptr] + y, foo + | ld a, [ptr] + y + | } | } = $080D LDY #$00 = $080F LDA #$1F @@ -1298,25 +1493,27 @@ Read through a pointer, into a byte storage location, or the `a` register. 
= $0813 LDA #$08 = $0815 STA $FF = $0817 LDA ($FE),Y - = $0819 STA $101F + = $0819 STA $091F = $081C LDA ($FE),Y = $081E RTS Read and write through two pointers. - | buffer[2048] buf + | byte table[256] tab | pointer ptra @ 252 | pointer ptrb @ 254 | | define main routine - | inputs buf - | outputs buf + | inputs tab + | outputs tab | trashes a, y, z, n, ptra, ptrb | { | ld y, 0 - | copy ^buf, ptra - | copy ^buf, ptrb - | copy [ptra] + y, [ptrb] + y + | point ptra into tab { + | point ptrb into tab { + | copy [ptra] + y, [ptrb] + y + | } + | } | } = $080D LDY #$00 = $080F LDA #$24 @@ -1333,19 +1530,20 @@ Read and write through two pointers. Write the `a` register through a pointer. - | buffer[2048] buf + | byte table[256] tab | pointer ptr @ 254 | byte foo | | define main routine - | inputs buf - | outputs buf + | inputs tab + | outputs tab | trashes a, y, z, n, ptr | { | ld y, 0 - | copy ^buf, ptr - | ld a, 255 - | st a, [ptr] + y + | point ptr into tab { + | ld a, 255 + | st a, [ptr] + y + | } | } = $080D LDY #$00 = $080F LDA #$1C @@ -1359,28 +1557,29 @@ Write the `a` register through a pointer. Add a word memory location, and a literal word, to a pointer, and then read through it. Note that this is *not* range-checked. (Yet.) - | buffer[2048] buf + | byte table[256] tab | pointer ptr @ 254 | byte foo | word delta | | define main routine - | inputs buf + | inputs tab | outputs y, foo, delta | trashes a, c, v, z, n, ptr | { | copy 619, delta | ld y, 0 | st off, c - | copy ^buf, ptr - | add ptr, delta - | add ptr, word 1 - | copy [ptr] + y, foo + | point ptr into tab { + | add ptr, delta + | add ptr, word 1 + | copy [ptr] + y, foo + | } | } = $080D LDA #$6B - = $080F STA $1043 + = $080F STA $0943 = $0812 LDA #$02 - = $0814 STA $1044 + = $0814 STA $0944 = $0817 LDY #$00 = $0819 CLC = $081A LDA #$42 @@ -1388,10 +1587,10 @@ Note that this is *not* range-checked. (Yet.) 
= $081E LDA #$08 = $0820 STA $FF = $0822 LDA $FE - = $0824 ADC $1043 + = $0824 ADC $0943 = $0827 STA $FE = $0829 LDA $FF - = $082B ADC $1044 + = $082B ADC $0944 = $082E STA $FF = $0830 LDA $FE = $0832 ADC #$01 @@ -1400,7 +1599,7 @@ Note that this is *not* range-checked. (Yet.) = $0838 ADC #$00 = $083A STA $FF = $083C LDA ($FE),Y - = $083E STA $1042 + = $083E STA $0942 = $0841 RTS ### Trash diff --git a/tests/SixtyPical Syntax.md b/tests/SixtyPical Syntax.md index 7a660c8..ea02c99 100644 --- a/tests/SixtyPical Syntax.md +++ b/tests/SixtyPical Syntax.md @@ -163,6 +163,9 @@ Basic "open-faced for" loops, up and down. Other blocks. + | byte table[256] tab + | pointer ptr + | | define main routine trashes a, x, c, z, v { | with interrupts off { | save a, x, c { @@ -172,6 +175,9 @@ Other blocks. | save a, x, c { | ld a, 0 | } + | point ptr into tab { + | ld a, [ptr] + y + | } | } = ok @@ -180,7 +186,7 @@ User-defined memory addresses of different types. | byte byt | word wor | vector routine trashes a vec - | buffer[2048] buf + | byte table[2048] buf | pointer ptr | | define main routine { @@ -192,6 +198,8 @@ Tables of different types and some operations on them. | byte table[256] many | word table[256] wmany | vector (routine trashes a) table[256] vmany + | byte bval + | word wval | | define main routine { | ld x, 0 @@ -207,11 +215,48 @@ Tables of different types and some operations on them. | shr many + x | inc many + x | dec many + x + | ld a, many + x + | st a, many + x + | copy wval, wmany + x + | copy wmany + x, wval + | } + = ok + +Indexing with an offset in some tables. 
+ + | byte table[256] many + | word table[256] wmany + | byte bval + | word wval + | + | define main routine { + | ld x, 0 + | ld a, 0 + | st off, c + | add a, many + 100 + x + | sub a, many + 100 + x + | cmp a, many + 100 + x + | and a, many + 100 + x + | or a, many + 100 + x + | xor a, many + 100 + x + | shl many + 100 + x + | shr many + 100 + x + | inc many + 100 + x + | dec many + 100 + x + | ld a, many + 100 + x + | st a, many + 100 + x + | copy wval, wmany + 100 + x + | copy wmany + 100 + x, wval | } = ok The number of entries in a table must be -greater than 0 and less than or equal to 256. +greater than 0 and less than or equal to 65536. + +(In previous versions, a table could have at +most 256 entries. They can now have more, however +the offset-access syntax can only access the +first 256. To access more, a pointer is required.) | word table[512] many | @@ -223,6 +268,30 @@ greater than 0 and less than or equal to 256. | ld x, 0 | copy 9999, many + x | } + = ok + + | byte table[65536] many + | + | define main routine + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | copy 99, many + x + | } + = ok + + | byte table[65537] many + | + | define main routine + | inputs many + | outputs many + | trashes a, x, n, z + | { + | ld x, 0 + | copy 99, many + x + | } ? SyntaxError | word table[0] many @@ -285,6 +354,19 @@ Constants. | } = ok +Named constants can be used as offsets. + + | const lives 3 + | const w1 1000 + | + | byte table[w1] those + | + | define main routine { + | ld y, 0 + | ld a, those + lives + y + | } + = ok + Can't have two constants with the same name. | const w1 1000 @@ -590,18 +672,19 @@ But you can't `goto` a label that never gets defined. | } ? Expected '}', but found 'ld' -Buffers and pointers. +Tables and pointers. 
- | buffer[2048] buf + | byte table[2048] buf | pointer ptr | pointer ptrb | byte foo | | define main routine { - | copy ^buf, ptr - | copy 123, [ptr] + y - | copy [ptr] + y, foo - | copy [ptr] + y, [ptrb] + y + | point ptr into buf { + | copy 123, [ptr] + y + | copy [ptr] + y, foo + | copy [ptr] + y, [ptrb] + y + | } | } = ok From bd462d6d8b7ed2ff30fea9ba93128a02d523a9c5 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Mon, 8 Apr 2019 12:23:44 +0100 Subject: [PATCH 02/18] Include more info in --dump-exit-contexts. --- src/sixtypical/analyzer.py | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/sixtypical/analyzer.py b/src/sixtypical/analyzer.py index 7d115e8..5b1aa6b 100644 --- a/src/sixtypical/analyzer.py +++ b/src/sixtypical/analyzer.py @@ -141,7 +141,11 @@ class Context(object): ) def to_json_data(self): + type_ = self.routine.location.type return { + 'routine_inputs': ','.join(sorted(loc.name for loc in type_.inputs)), + 'routine_outputs': ','.join(sorted(loc.name for loc in type_.outputs)), + 'routine_trashes': ','.join(sorted(loc.name for loc in type_.trashes)), 'touched': ','.join(sorted(loc.name for loc in self._touched)), 'range': dict((loc.name, '{}-{}'.format(rng[0], rng[1])) for (loc, rng) in self._range.items()), 'writeable': ','.join(sorted(loc.name for loc in self._writeable)), From 4615d8d054be771f3533a4cc29f397fec61ab38d Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Mon, 8 Apr 2019 16:26:51 +0100 Subject: [PATCH 03/18] Distinct AST nodes for call and goto instructions. 
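A minimal sketch of the idea behind this change, assuming simplified stand-in classes rather than the real `sixtypical.ast` nodes: once `call` and `goto` have their own node classes, each phase can dispatch on the class instead of comparing an opcode string. The `describe` and `block_ends_here` helpers are hypothetical names used only for illustration and do not appear in the patch.

```python
# Illustrative sketch only: simplified stand-ins, not the real sixtypical.ast classes.


class Instr:
    """Minimal stand-in for an AST instruction node."""
    value_attrs = ()

    def __init__(self, line_number, **kwargs):
        self.line_number = line_number
        # Store only the attributes this node class declares.
        for attr in self.value_attrs:
            setattr(self, attr, kwargs.get(attr))


class SingleOp(Instr):
    # After the change, SingleOp no longer carries a 'location' attribute.
    value_attrs = ('opcode', 'dest', 'src')


class Call(Instr):
    value_attrs = ('location',)


class GoTo(Instr):
    value_attrs = ('location',)


def describe(instr):
    """Hypothetical dispatcher: branch on node class, not on an opcode string."""
    if isinstance(instr, Call):
        return 'call ' + instr.location
    elif isinstance(instr, GoTo):
        return 'goto ' + instr.location
    elif isinstance(instr, SingleOp):
        return 'op ' + instr.opcode
    raise NotImplementedError(type(instr).__name__)


def block_ends_here(instr):
    """Hypothetical helper: a block stops at a goto, detected by class alone."""
    return isinstance(instr, GoTo)
```

In this sketch a location is a plain string; in the patch itself the parser passes a `ForwardReference` that is resolved later.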
--- src/sixtypical/analyzer.py | 142 +++++++++++++++++++------------------ src/sixtypical/ast.py | 10 ++- src/sixtypical/compiler.py | 58 ++++++++------- src/sixtypical/parser.py | 17 +++-- 4 files changed, 126 insertions(+), 101 deletions(-) diff --git a/src/sixtypical/analyzer.py b/src/sixtypical/analyzer.py index 5b1aa6b..158232f 100644 --- a/src/sixtypical/analyzer.py +++ b/src/sixtypical/analyzer.py @@ -1,7 +1,7 @@ # encoding: UTF-8 from sixtypical.ast import ( - Program, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save, PointInto + Program, Routine, Block, SingleOp, Call, GoTo, If, Repeat, For, WithInterruptsOff, Save, PointInto ) from sixtypical.model import ( TYPE_BYTE, TYPE_WORD, @@ -493,6 +493,10 @@ class Analyzer(object): def analyze_instr(self, instr, context): if isinstance(instr, SingleOp): self.analyze_single_op(instr, context) + elif isinstance(instr, Call): + self.analyze_call(instr, context) + elif isinstance(instr, GoTo): + self.analyze_goto(instr, context) elif isinstance(instr, If): self.analyze_if(instr, context) elif isinstance(instr, Repeat): @@ -667,20 +671,6 @@ class Analyzer(object): self.assert_type(TYPE_BYTE, dest) context.set_written(dest, FLAG_Z, FLAG_N, FLAG_C) context.invalidate_range(dest) - elif opcode == 'call': - type = instr.location.type - if not isinstance(type, (RoutineType, VectorType)): - raise TypeMismatchError(instr, instr.location) - if isinstance(type, VectorType): - type = type.of_type - for ref in type.inputs: - context.assert_meaningful(ref) - for ref in type.outputs: - context.set_written(ref) - for ref in type.trashes: - context.assert_writeable(ref) - context.set_touched(ref) - context.set_unmeaningful(ref) elif opcode == 'copy': if dest == REG_A: raise ForbiddenWriteError(instr, "{} cannot be used as destination for copy".format(dest)) @@ -789,59 +779,6 @@ class Analyzer(object): context.set_touched(REG_A, FLAG_Z, FLAG_N) context.set_unmeaningful(REG_A, FLAG_Z, FLAG_N) - elif opcode == 'goto': 
- location = instr.location - type_ = location.type - - if not isinstance(type_, (RoutineType, VectorType)): - raise TypeMismatchError(instr, location) - - # assert that the dest routine's inputs are all initialized - if isinstance(type_, VectorType): - type_ = type_.of_type - for ref in type_.inputs: - context.assert_meaningful(ref) - - # and that this routine's trashes and output constraints are a - # superset of the called routine's - current_type = self.current_routine.location.type - self.assert_affected_within('outputs', type_, current_type) - self.assert_affected_within('trashes', type_, current_type) - - context.encounter_gotos(set([instr.location])) - - # Now that we have encountered a goto, we update the - # context here to match what someone calling the goto'ed - # function directly, would expect. (which makes sense - # when you think about it; if this goto's F, then calling - # this is like calling F, from the perspective of what is - # returned.) - # - # However, this isn't the current context anymore. This - # is an exit context of this routine. - - exit_context = context.clone() - - for ref in type_.outputs: - exit_context.set_touched(ref) # ? - exit_context.set_written(ref) - - for ref in type_.trashes: - exit_context.assert_writeable(ref) - exit_context.set_touched(ref) - exit_context.set_unmeaningful(ref) - - self.exit_contexts.append(exit_context) - - # When we get to the end, we'll check that all the - # exit contexts are consistent with each other. - - # We set the current context as having terminated. - # If we are in a branch, the merge will deal with - # having terminated. If we are at the end of the - # routine, the routine end will deal with that. 
- - context.set_terminated() elif opcode == 'trash': context.set_touched(instr.dest) @@ -851,6 +788,75 @@ class Analyzer(object): else: raise NotImplementedError(opcode) + def analyze_call(self, instr, context): + type = instr.location.type + if not isinstance(type, (RoutineType, VectorType)): + raise TypeMismatchError(instr, instr.location) + if isinstance(type, VectorType): + type = type.of_type + for ref in type.inputs: + context.assert_meaningful(ref) + for ref in type.outputs: + context.set_written(ref) + for ref in type.trashes: + context.assert_writeable(ref) + context.set_touched(ref) + context.set_unmeaningful(ref) + + def analyze_goto(self, instr, context): + location = instr.location + type_ = location.type + + if not isinstance(type_, (RoutineType, VectorType)): + raise TypeMismatchError(instr, location) + + # assert that the dest routine's inputs are all initialized + if isinstance(type_, VectorType): + type_ = type_.of_type + for ref in type_.inputs: + context.assert_meaningful(ref) + + # and that this routine's trashes and output constraints are a + # superset of the called routine's + current_type = self.current_routine.location.type + self.assert_affected_within('outputs', type_, current_type) + self.assert_affected_within('trashes', type_, current_type) + + context.encounter_gotos(set([instr.location])) + + # Now that we have encountered a goto, we update the + # context here to match what someone calling the goto'ed + # function directly, would expect. (which makes sense + # when you think about it; if this goto's F, then calling + # this is like calling F, from the perspective of what is + # returned.) + # + # However, this isn't the current context anymore. This + # is an exit context of this routine. + + exit_context = context.clone() + + for ref in type_.outputs: + exit_context.set_touched(ref) # ? 
+ exit_context.set_written(ref) + + for ref in type_.trashes: + exit_context.assert_writeable(ref) + exit_context.set_touched(ref) + exit_context.set_unmeaningful(ref) + + self.exit_contexts.append(exit_context) + + # When we get to the end, we'll check that all the + # exit contexts are consistent with each other. + + # We set the current context as having terminated. + # If we are in a branch, the merge will deal with + # having terminated. If we are at the end of the + # routine, the routine end will deal with that. + + context.set_terminated() + def analyze_if(self, instr, context): incoming_meaningful = set(context.each_meaningful()) diff --git a/src/sixtypical/ast.py b/src/sixtypical/ast.py index 9866921..bc9a5ee 100644 --- a/src/sixtypical/ast.py +++ b/src/sixtypical/ast.py @@ -72,7 +72,15 @@ class Instr(AST): class SingleOp(Instr): - value_attrs = ('opcode', 'dest', 'src', 'location',) + value_attrs = ('opcode', 'dest', 'src',) + + +class Call(Instr): + value_attrs = ('location',) + + +class GoTo(Instr): + value_attrs = ('location',) class If(Instr): diff --git a/src/sixtypical/compiler.py b/src/sixtypical/compiler.py index 63b13b5..0d0d862 100644 --- a/src/sixtypical/compiler.py +++ b/src/sixtypical/compiler.py @@ -1,7 +1,7 @@ # encoding: UTF-8 from sixtypical.ast import ( - Program, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save, PointInto + Program, Routine, Block, SingleOp, Call, GoTo, If, Repeat, For, WithInterruptsOff, Save, PointInto ) from sixtypical.model import ( ConstantRef, LocationRef, IndexedRef, IndirectRef, @@ -162,6 +162,10 @@ class Compiler(object): def compile_instr(self, instr): if isinstance(instr, SingleOp): return self.compile_single_op(instr) + elif isinstance(instr, Call): + return self.compile_call(instr) + elif isinstance(instr, GoTo): + return self.compile_goto(instr) elif isinstance(instr, If): return self.compile_if(instr) elif isinstance(instr, Repeat): @@ -393,31 +397,6 @@ class Compiler(object): 
self.emitter.emit(cls(mode(Offset(self.get_label(dest.ref.name), dest.offset.value)))) else: self.emitter.emit(cls(self.absolute_or_zero_page(self.get_label(dest.name)))) - elif opcode == 'call': - location = instr.location - label = self.get_label(instr.location.name) - if isinstance(location.type, RoutineType): - self.emitter.emit(JSR(Absolute(label))) - elif isinstance(location.type, VectorType): - trampoline = self.trampolines.setdefault( - location, Label(location.name + '_trampoline') - ) - self.emitter.emit(JSR(Absolute(trampoline))) - else: - raise NotImplementedError - elif opcode == 'goto': - self.final_goto_seen = True - if self.skip_final_goto: - pass - else: - location = instr.location - label = self.get_label(instr.location.name) - if isinstance(location.type, RoutineType): - self.emitter.emit(JMP(Absolute(label))) - elif isinstance(location.type, VectorType): - self.emitter.emit(JMP(Indirect(label))) - else: - raise NotImplementedError elif opcode == 'copy': self.compile_copy(instr, instr.src, instr.dest) elif opcode == 'trash': @@ -427,6 +406,33 @@ class Compiler(object): else: raise NotImplementedError(opcode) + def compile_call(self, instr): + location = instr.location + label = self.get_label(instr.location.name) + if isinstance(location.type, RoutineType): + self.emitter.emit(JSR(Absolute(label))) + elif isinstance(location.type, VectorType): + trampoline = self.trampolines.setdefault( + location, Label(location.name + '_trampoline') + ) + self.emitter.emit(JSR(Absolute(trampoline))) + else: + raise NotImplementedError + + def compile_goto(self, instr): + self.final_goto_seen = True + if self.skip_final_goto: + pass + else: + location = instr.location + label = self.get_label(instr.location.name) + if isinstance(location.type, RoutineType): + self.emitter.emit(JMP(Absolute(label))) + elif isinstance(location.type, VectorType): + self.emitter.emit(JMP(Indirect(label))) + else: + raise NotImplementedError + def compile_cmp(self, instr, src, dest): 
"""`instr` is only for reporting purposes""" if isinstance(src, LocationRef) and src.type == TYPE_WORD: diff --git a/src/sixtypical/parser.py b/src/sixtypical/parser.py index 0a8e299..e94a205 100644 --- a/src/sixtypical/parser.py +++ b/src/sixtypical/parser.py @@ -1,7 +1,7 @@ # encoding: UTF-8 from sixtypical.ast import ( - Program, Defn, Routine, Block, SingleOp, If, Repeat, For, WithInterruptsOff, Save, PointInto + Program, Defn, Routine, Block, SingleOp, Call, GoTo, If, Repeat, For, WithInterruptsOff, Save, PointInto ) from sixtypical.model import ( TYPE_BIT, TYPE_BYTE, TYPE_WORD, @@ -113,6 +113,8 @@ class Parser(object): resolve_fwd_reference(node, 'location') resolve_fwd_reference(node, 'src') resolve_fwd_reference(node, 'dest') + if isinstance(node, (Call, GoTo)): + resolve_fwd_reference(node, 'location') # --- grammar productions @@ -380,7 +382,7 @@ class Parser(object): self.scanner.expect('{') while not self.scanner.on('}'): instrs.append(self.instr()) - if isinstance(instrs[-1], SingleOp) and instrs[-1].opcode == 'goto': + if isinstance(instrs[-1], GoTo): break self.scanner.expect('}') return Block(self.scanner.line_number, instrs=instrs) @@ -450,12 +452,15 @@ class Parser(object): opcode = self.scanner.token self.scanner.scan() return SingleOp(self.scanner.line_number, opcode=opcode, dest=None, src=None) - elif self.scanner.token in ("call", "goto"): - opcode = self.scanner.token - self.scanner.scan() + elif self.scanner.consume("call"): name = self.scanner.token self.scanner.scan() - instr = SingleOp(self.scanner.line_number, opcode=opcode, location=ForwardReference(name), dest=None, src=None) + instr = Call(self.scanner.line_number, location=ForwardReference(name)) + return instr + elif self.scanner.consume("goto"): + name = self.scanner.token + self.scanner.scan() + instr = GoTo(self.scanner.line_number, location=ForwardReference(name)) return instr elif self.scanner.token in ("copy",): opcode = self.scanner.token From 
394fbddad65f627dc4e4f51d7af112c2fa8f63bf Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Tue, 9 Apr 2019 09:42:22 +0100 Subject: [PATCH 04/18] Refactor to avoid storing LocationRefs in SymEntry. --- src/sixtypical/parser.py | 50 ++++++++++++++++++++-------------------- 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/src/sixtypical/parser.py b/src/sixtypical/parser.py index e94a205..c23ec54 100644 --- a/src/sixtypical/parser.py +++ b/src/sixtypical/parser.py @@ -12,12 +12,12 @@ from sixtypical.scanner import Scanner class SymEntry(object): - def __init__(self, ast_node, model): + def __init__(self, ast_node, type_): self.ast_node = ast_node - self.model = model + self.type_ = type_ def __repr__(self): - return "%s(%r, %r)" % (self.__class__.__name__, self.ast_node, self.model) + return "%s(%r, %r)" % (self.__class__.__name__, self.ast_node, self.type_) class ForwardReference(object): @@ -35,19 +35,19 @@ class ParsingContext(object): self.typedefs = {} # token -> Type AST self.consts = {} # token -> Loc - for token in ('a', 'x', 'y'): - self.symbols[token] = SymEntry(None, LocationRef(TYPE_BYTE, token)) - for token in ('c', 'z', 'n', 'v'): - self.symbols[token] = SymEntry(None, LocationRef(TYPE_BIT, token)) + for name in ('a', 'x', 'y'): + self.symbols[name] = SymEntry(None, TYPE_BYTE) + for name in ('c', 'z', 'n', 'v'): + self.symbols[name] = SymEntry(None, TYPE_BIT) def __str__(self): return "Symbols: {}\nStatics: {}\nTypedefs: {}\nConsts: {}".format(self.symbols, self.statics, self.typedefs, self.consts) - def fetch(self, name): + def fetch_ref(self, name): if name in self.statics: - return self.statics[name].model + return LocationRef(self.statics[name].type_, name) if name in self.symbols: - return self.symbols[name].model + return LocationRef(self.symbols[name].type_, name) return None @@ -60,18 +60,18 @@ class Parser(object): self.scanner.syntax_error(msg) def lookup(self, name): - model = self.context.fetch(name) + model = 
self.context.fetch_ref(name) if model is None: self.syntax_error('Undefined symbol "{}"'.format(name)) return model - def declare(self, name, symentry, static=False): - if self.context.fetch(name): + def declare(self, name, ast_node, type_, static=False): + if self.context.fetch_ref(name): self.syntax_error('Symbol "%s" already declared' % name) if static: - self.context.statics[name] = symentry + self.context.statics[name] = SymEntry(ast_node, type_) else: - self.context.symbols[name] = symentry + self.context.symbols[name] = SymEntry(ast_node, type_) def clear_statics(self): self.context.statics = {} @@ -129,14 +129,14 @@ class Parser(object): typenames = ['byte', 'word', 'table', 'vector', 'pointer'] # 'routine', typenames.extend(self.context.typedefs.keys()) while self.scanner.on(*typenames): - defn = self.defn() - self.declare(defn.name, SymEntry(defn, defn.location)) + type_, defn = self.defn() + self.declare(defn.name, defn, type_) defns.append(defn) while self.scanner.consume('define'): name = self.scanner.token self.scanner.scan() - routine = self.routine(name) - self.declare(name, SymEntry(routine, routine.location)) + type_, routine = self.routine(name) + self.declare(name, routine, type_) routines.append(routine) self.scanner.check_type('EOF') @@ -191,7 +191,7 @@ class Parser(object): location = LocationRef(type_, name) - return Defn(self.scanner.line_number, name=name, addr=addr, initial=initial, location=location) + return type_, Defn(self.scanner.line_number, name=name, addr=addr, initial=initial, location=location) def const(self): if self.scanner.token in ('on', 'off'): @@ -300,16 +300,16 @@ class Parser(object): self.clear_statics() for defn in statics: - self.declare(defn.name, SymEntry(defn, defn.location), static=True) + self.declare(defn.name, defn, defn.location.type, static=True) block = self.block() self.clear_statics() addr = None location = LocationRef(type_, name) - return Routine( + return type_, Routine( self.scanner.line_number, 
name=name, block=block, addr=addr, - location=location, statics=statics + location=location, statics=statics, ) def labels(self): @@ -339,7 +339,7 @@ class Parser(object): else: name = self.scanner.token self.scanner.scan() - loc = self.context.fetch(name) + loc = self.context.fetch_ref(name) if loc: return loc else: @@ -371,7 +371,7 @@ class Parser(object): def statics(self): defns = [] while self.scanner.consume('static'): - defn = self.defn() + type_, defn = self.defn() if defn.initial is None: self.syntax_error("Static definition {} must have initial value".format(defn)) defns.append(defn) From a0328b8840b5b7817c27a1d68a9ac459d8f71421 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Wed, 10 Apr 2019 08:48:33 +0100 Subject: [PATCH 05/18] Store type information in SymbolTable shared across phases. --- bin/sixtypical | 14 +-- src/sixtypical/analyzer.py | 205 +++++++++++++++++++++---------------- src/sixtypical/ast.py | 4 +- src/sixtypical/compiler.py | 109 ++++++++++++-------- src/sixtypical/fallthru.py | 5 +- src/sixtypical/model.py | 50 ++------- src/sixtypical/parser.py | 112 ++++++++++---------- 7 files changed, 263 insertions(+), 236 deletions(-) diff --git a/bin/sixtypical b/bin/sixtypical index 505a301..7a86cce 100755 --- a/bin/sixtypical +++ b/bin/sixtypical @@ -19,22 +19,22 @@ from pprint import pprint import sys import traceback -from sixtypical.parser import Parser, ParsingContext, merge_programs +from sixtypical.parser import Parser, SymbolTable, merge_programs from sixtypical.analyzer import Analyzer from sixtypical.outputter import outputter_class_for from sixtypical.compiler import Compiler def process_input_files(filenames, options): - context = ParsingContext() + symtab = SymbolTable() programs = [] for filename in options.filenames: text = open(filename).read() - parser = Parser(context, text, filename) + parser = Parser(symtab, text, filename) if options.debug: - print(context) + print(symtab) program = parser.program() 
programs.append(program) @@ -43,7 +43,7 @@ def process_input_files(filenames, options): program = merge_programs(programs) - analyzer = Analyzer(debug=options.debug) + analyzer = Analyzer(symtab, debug=options.debug) try: analyzer.analyze_program(program) @@ -64,7 +64,7 @@ def process_input_files(filenames, options): sys.stdout.write(json.dumps(data, indent=4, sort_keys=True, separators=(',', ':'))) sys.stdout.write("\n") - fa = FallthruAnalyzer(debug=options.debug) + fa = FallthruAnalyzer(symtab, debug=options.debug) fa.analyze_program(program) compilation_roster = fa.serialize() dump(compilation_roster) @@ -82,7 +82,7 @@ def process_input_files(filenames, options): with open(options.output, 'wb') as fh: outputter = outputter_class_for(options.output_format)(fh, start_addr=start_addr) outputter.write_prelude() - compiler = Compiler(outputter.emitter) + compiler = Compiler(symtab, outputter.emitter) compiler.compile_program(program, compilation_roster=compilation_roster) outputter.write_postlude() if options.debug: diff --git a/src/sixtypical/analyzer.py b/src/sixtypical/analyzer.py index 158232f..68ff759 100644 --- a/src/sixtypical/analyzer.py +++ b/src/sixtypical/analyzer.py @@ -80,16 +80,7 @@ class IncompatibleConstraintsError(ConstraintsError): pass -def routine_has_static(routine, ref): - if not hasattr(routine, 'statics'): - return False - for static in routine.statics: - if static.location == ref: - return True - return False - - -class Context(object): +class AnalysisContext(object): """ A location is touched if it was changed (or even potentially changed) during this routine, or some routine called by this routine. @@ -108,8 +99,8 @@ class Context(object): lists of this routine. A location can also be temporarily marked unwriteable in certain contexts, such as `for` loops. 
""" - def __init__(self, routines, routine, inputs, outputs, trashes): - self.routines = routines # LocationRef -> Routine (AST node) + def __init__(self, symtab, routine, inputs, outputs, trashes): + self.symtab = symtab self.routine = routine # Routine (AST node) self._touched = set() # {LocationRef} self._range = dict() # LocationRef -> (Int, Int) @@ -119,29 +110,30 @@ class Context(object): self._pointer_assoc = dict() for ref in inputs: - if ref.is_constant(): + if self.is_constant(ref): raise ConstantConstraintError(self.routine, ref.name) - self._range[ref] = ref.max_range() + self._range[ref] = self.max_range(ref) output_names = set() for ref in outputs: - if ref.is_constant(): + if self.is_constant(ref): raise ConstantConstraintError(self.routine, ref.name) output_names.add(ref.name) self._writeable.add(ref) for ref in trashes: - if ref.is_constant(): + if self.is_constant(ref): raise ConstantConstraintError(self.routine, ref.name) if ref.name in output_names: raise InconsistentConstraintsError(self.routine, ref.name) self._writeable.add(ref) def __str__(self): - return "Context(\n _touched={},\n _range={},\n _writeable={}\n)".format( + return "{}(\n _touched={},\n _range={},\n _writeable={}\n)".format( + self.__class__.__name__, LocationRef.format_set(self._touched), LocationRef.format_set(self._range), LocationRef.format_set(self._writeable) ) def to_json_data(self): - type_ = self.routine.location.type + type_ = self.symtab.fetch_global_type(self.routine.name) return { 'routine_inputs': ','.join(sorted(loc.name for loc in type_.inputs)), 'routine_outputs': ','.join(sorted(loc.name for loc in type_.outputs)), @@ -154,7 +146,7 @@ class Context(object): } def clone(self): - c = Context(self.routines, self.routine, [], [], []) + c = AnalysisContext(self.symtab, self.routine, [], [], []) c._touched = set(self._touched) c._range = dict(self._range) c._writeable = set(self._writeable) @@ -169,7 +161,6 @@ class Context(object): We do not replace the 
gotos_encountered for technical reasons. (In `analyze_if`, we merge those sets afterwards; at the end of `analyze_routine`, they are not distinct in the set of contexts we are updating from, and we want to retain our own.)""" - self.routines = other.routines self.routine = other.routine self._touched = set(other._touched) self._range = dict(other._range) @@ -193,9 +184,9 @@ class Context(object): exception_class = kwargs.get('exception_class', UnmeaningfulReadError) for ref in refs: # statics are always meaningful - if routine_has_static(self.routine, ref): + if self.symtab.fetch_static_ref(self.routine.name, ref.name): continue - if ref.is_constant() or ref in self.routines: + if self.is_constant(ref): pass elif isinstance(ref, LocationRef): if ref not in self._range: @@ -213,7 +204,7 @@ class Context(object): exception_class = kwargs.get('exception_class', ForbiddenWriteError) for ref in refs: # statics are always writeable - if routine_has_static(self.routine, ref): + if self.symtab.fetch_static_ref(self.routine.name, ref.name): continue if ref not in self._writeable: message = ref.name @@ -234,7 +225,7 @@ class Context(object): if outside in self._range: outside_range = self._range[outside] else: - outside_range = outside.max_range() + outside_range = self.max_range(outside) if (inside_range[0] + offset.value) < outside_range[0] or (inside_range[1] + offset.value) > outside_range[1]: raise RangeExceededError(self.routine, @@ -251,7 +242,7 @@ class Context(object): def set_meaningful(self, *refs): for ref in refs: if ref not in self._range: - self._range[ref] = ref.max_range() + self._range[ref] = self.max_range(ref) def set_top_of_range(self, ref, top): self.assert_meaningful(ref) @@ -293,12 +284,12 @@ class Context(object): if src in self._range: src_range = self._range[src] else: - src_range = src.max_range() + src_range = self.max_range(src) self._range[dest] = src_range def invalidate_range(self, ref): self.assert_meaningful(ref) - self._range[ref] = 
ref.max_range() + self._range[ref] = self.max_range(ref) def set_unmeaningful(self, *refs): for ref in refs: @@ -336,19 +327,6 @@ class Context(object): def has_terminated(self): return self._terminated - def assert_types_for_read_table(self, instr, src, dest, type_, offset): - if (not TableType.is_a_table_type(src.ref.type, type_)) or (not dest.type == type_): - raise TypeMismatchError(instr, '{} and {}'.format(src.ref.name, dest.name)) - self.assert_meaningful(src, src.index) - self.assert_in_range(src.index, src.ref, offset) - - def assert_types_for_update_table(self, instr, dest, type_, offset): - if not TableType.is_a_table_type(dest.ref.type, type_): - raise TypeMismatchError(instr, '{}'.format(dest.ref.name)) - self.assert_meaningful(dest.index) - self.assert_in_range(dest.index, dest.ref, offset) - self.set_written(dest.ref) - def extract(self, location): """Sets the given location as writeable in the context, and returns a 'baton' representing the previous state of context for that location. This 'baton' can be used to later restore @@ -390,18 +368,53 @@ class Context(object): def set_assoc(self, pointer, table): self._pointer_assoc[pointer] = table + def is_constant(self, ref): + """read-only means that the program cannot change the value + of a location. 
constant means that the value of the location + will not change during the lifetime of the program.""" + if isinstance(ref, ConstantRef): + return True + if isinstance(ref, (IndirectRef, IndexedRef)): + return False + if isinstance(ref, LocationRef): + type_ = self.symtab.fetch_global_type(ref.name) + return isinstance(type_, RoutineType) + raise NotImplementedError + + def max_range(self, ref): + if isinstance(ref, ConstantRef): + return (ref.value, ref.value) + elif self.symtab.has_static(self.routine.name, ref.name): + return self.symtab.fetch_static_type(self.routine.name, ref.name).max_range + else: + return self.symtab.fetch_global_type(ref.name).max_range + class Analyzer(object): - def __init__(self, debug=False): + def __init__(self, symtab, debug=False): + self.symtab = symtab self.current_routine = None - self.routines = {} self.debug = debug self.exit_contexts_map = {} + # - - - - helper methods - - - - + + def get_type_for_name(self, name): + if self.current_routine and self.symtab.has_static(self.current_routine.name, name): + return self.symtab.fetch_static_type(self.current_routine.name, name) + return self.symtab.fetch_global_type(name) + + def get_type(self, ref): + if isinstance(ref, ConstantRef): + return ref.type + if not isinstance(ref, LocationRef): + raise NotImplementedError + return self.get_type_for_name(ref.name) + def assert_type(self, type_, *locations): for location in locations: - if location.type != type_: + if self.get_type(location) != type_: raise TypeMismatchError(self.current_routine, location.name) def assert_affected_within(self, name, affecting_type, limiting_type): @@ -419,9 +432,23 @@ class Analyzer(object): ) raise IncompatibleConstraintsError(self.current_routine, message) + def assert_types_for_read_table(self, context, instr, src, dest, type_, offset): + if (not TableType.is_a_table_type(self.get_type(src.ref), type_)) or (not self.get_type(dest) == type_): + raise TypeMismatchError(instr, '{} and 
{}'.format(src.ref.name, dest.name)) + context.assert_meaningful(src, src.index) + context.assert_in_range(src.index, src.ref, offset) + + def assert_types_for_update_table(self, context, instr, dest, type_, offset): + if not TableType.is_a_table_type(self.get_type(dest.ref), type_): + raise TypeMismatchError(instr, '{}'.format(dest.ref.name)) + context.assert_meaningful(dest.index) + context.assert_in_range(dest.index, dest.ref, offset) + context.set_written(dest.ref) + + # - - - - visitor methods - - - - + def analyze_program(self, program): assert isinstance(program, Program) - self.routines = {r.location: r for r in program.routines} for routine in program.routines: context = self.analyze_routine(routine) routine.encountered_gotos = list(context.encountered_gotos()) if context else [] @@ -433,8 +460,8 @@ class Analyzer(object): return None self.current_routine = routine - type_ = routine.location.type - context = Context(self.routines, routine, type_.inputs, type_.outputs, type_.trashes) + type_ = self.get_type_for_name(routine.name) + context = AnalysisContext(self.symtab, routine, type_.inputs, type_.outputs, type_.trashes) self.exit_contexts = [] self.analyze_block(routine.block, context) @@ -478,7 +505,10 @@ class Analyzer(object): # if something was touched, then it should have been declared to be writable. for ref in context.each_touched(): - if ref not in type_.outputs and ref not in type_.trashes and not routine_has_static(routine, ref): + # FIXME once we have namedtuples, go back to comparing the ref directly! 
+ outputs_names = [r.name for r in type_.outputs] + trashes_names = [r.name for r in type_.trashes] + if ref.name not in outputs_names and ref.name not in trashes_names and not self.symtab.has_static(routine.name, ref.name): raise ForbiddenWriteError(routine, ref.name) self.exit_contexts = None @@ -525,10 +555,10 @@ class Analyzer(object): if opcode == 'ld': if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) + self.assert_types_for_read_table(context, instr, src, dest, TYPE_BYTE, src.offset) elif isinstance(src, IndirectRef): # copying this analysis from the matching branch in `copy`, below - if isinstance(src.ref.type, PointerType) and dest.type == TYPE_BYTE: + if isinstance(self.get_type(src.ref), PointerType) and self.get_type(dest) == TYPE_BYTE: pass else: raise TypeMismatchError(instr, (src, dest)) @@ -539,7 +569,7 @@ class Analyzer(object): context.assert_meaningful(origin) context.assert_meaningful(src.ref, REG_Y) - elif src.type != dest.type: + elif self.get_type(src) != self.get_type(dest): raise TypeMismatchError(instr, '{} and {}'.format(src.name, dest.name)) else: context.assert_meaningful(src) @@ -547,12 +577,12 @@ class Analyzer(object): context.set_written(dest, FLAG_Z, FLAG_N) elif opcode == 'st': if isinstance(dest, IndexedRef): - if src.type != TYPE_BYTE: + if self.get_type(src) != TYPE_BYTE: raise TypeMismatchError(instr, (src, dest)) - context.assert_types_for_update_table(instr, dest, TYPE_BYTE, dest.offset) + self.assert_types_for_update_table(context, instr, dest, TYPE_BYTE, dest.offset) elif isinstance(dest, IndirectRef): # copying this analysis from the matching branch in `copy`, below - if isinstance(dest.ref.type, PointerType) and src.type == TYPE_BYTE: + if isinstance(self.get_type(dest.ref), PointerType) and self.get_type(src) == TYPE_BYTE: pass else: raise TypeMismatchError(instr, (src, dest)) @@ -565,7 +595,7 @@ class Analyzer(object): context.set_touched(target) 
context.set_written(target) - elif src.type != dest.type: + elif self.get_type(src) != self.get_type(dest): raise TypeMismatchError(instr, '{} and {}'.format(src, dest)) else: context.set_written(dest) @@ -574,18 +604,19 @@ class Analyzer(object): elif opcode == 'add': context.assert_meaningful(src, dest, FLAG_C) if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) - elif src.type == TYPE_BYTE: + self.assert_types_for_read_table(context, instr, src, dest, TYPE_BYTE, src.offset) + elif self.get_type(src) == TYPE_BYTE: self.assert_type(TYPE_BYTE, src, dest) if dest != REG_A: context.set_touched(REG_A) context.set_unmeaningful(REG_A) else: self.assert_type(TYPE_WORD, src) - if dest.type == TYPE_WORD: + dest_type = self.get_type(dest) + if dest_type == TYPE_WORD: context.set_touched(REG_A) context.set_unmeaningful(REG_A) - elif isinstance(dest.type, PointerType): + elif isinstance(dest_type, PointerType): context.set_touched(REG_A) context.set_unmeaningful(REG_A) else: @@ -595,8 +626,8 @@ class Analyzer(object): elif opcode == 'sub': context.assert_meaningful(src, dest, FLAG_C) if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) - elif src.type == TYPE_BYTE: + self.assert_types_for_read_table(context, instr, src, dest, TYPE_BYTE, src.offset) + elif self.get_type(src) == TYPE_BYTE: self.assert_type(TYPE_BYTE, src, dest) if dest != REG_A: context.set_touched(REG_A) @@ -610,8 +641,8 @@ class Analyzer(object): elif opcode == 'cmp': context.assert_meaningful(src, dest) if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) - elif src.type == TYPE_BYTE: + self.assert_types_for_read_table(context, instr, src, dest, TYPE_BYTE, src.offset) + elif self.get_type(src) == TYPE_BYTE: self.assert_type(TYPE_BYTE, src, dest) else: self.assert_type(TYPE_WORD, src, dest) @@ -620,7 +651,7 @@ class Analyzer(object): 
context.set_written(FLAG_Z, FLAG_N, FLAG_C) elif opcode == 'and': if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) + self.assert_types_for_read_table(context, instr, src, dest, TYPE_BYTE, src.offset) else: self.assert_type(TYPE_BYTE, src, dest) context.assert_meaningful(src, dest) @@ -632,7 +663,7 @@ class Analyzer(object): context.set_top_of_range(dest, context.get_top_of_range(src)) elif opcode in ('or', 'xor'): if isinstance(src, IndexedRef): - context.assert_types_for_read_table(instr, src, dest, TYPE_BYTE, src.offset) + self.assert_types_for_read_table(context, instr, src, dest, TYPE_BYTE, src.offset) else: self.assert_type(TYPE_BYTE, src, dest) context.assert_meaningful(src, dest) @@ -641,7 +672,7 @@ class Analyzer(object): elif opcode in ('inc', 'dec'): context.assert_meaningful(dest) if isinstance(dest, IndexedRef): - context.assert_types_for_update_table(instr, dest, TYPE_BYTE, dest.offset) + self.assert_types_for_update_table(context, instr, dest, TYPE_BYTE, dest.offset) context.set_written(dest.ref, FLAG_Z, FLAG_N) #context.invalidate_range(dest) else: @@ -664,7 +695,7 @@ class Analyzer(object): elif opcode in ('shl', 'shr'): context.assert_meaningful(dest, FLAG_C) if isinstance(dest, IndexedRef): - context.assert_types_for_update_table(instr, dest, TYPE_BYTE, dest.offset) + self.assert_types_for_update_table(context, instr, dest, TYPE_BYTE, dest.offset) context.set_written(dest.ref, FLAG_Z, FLAG_N, FLAG_C) #context.invalidate_range(dest) else: @@ -678,51 +709,51 @@ class Analyzer(object): # 1. 
check that their types are compatible if isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef): - if src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): + if self.get_type(src) == TYPE_BYTE and isinstance(self.get_type(dest.ref), PointerType): pass else: raise TypeMismatchError(instr, (src, dest)) elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef): - if isinstance(src.ref.type, PointerType) and dest.type == TYPE_BYTE: + if isinstance(self.get_type(src.ref), PointerType) and self.get_type(dest) == TYPE_BYTE: pass else: raise TypeMismatchError(instr, (src, dest)) elif isinstance(src, IndirectRef) and isinstance(dest, IndirectRef): - if isinstance(src.ref.type, PointerType) and isinstance(dest.ref.type, PointerType): + if isinstance(self.get_type(src.ref), PointerType) and isinstance(self.get_type(dest.ref), PointerType): pass else: raise TypeMismatchError(instr, (src, dest)) elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndexedRef): - if src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): + if self.get_type(src) == TYPE_WORD and TableType.is_a_table_type(self.get_type(dest.ref), TYPE_WORD): pass - elif (isinstance(src.type, VectorType) and isinstance(dest.ref.type, TableType) and - RoutineType.executable_types_compatible(src.type.of_type, dest.ref.type.of_type)): + elif (isinstance(self.get_type(src), VectorType) and isinstance(self.get_type(dest.ref), TableType) and + RoutineType.executable_types_compatible(self.get_type(src).of_type, self.get_type(dest.ref).of_type)): pass - elif (isinstance(src.type, RoutineType) and isinstance(dest.ref.type, TableType) and - RoutineType.executable_types_compatible(src.type, dest.ref.type.of_type)): + elif (isinstance(self.get_type(src), RoutineType) and isinstance(self.get_type(dest.ref), TableType) and + RoutineType.executable_types_compatible(self.get_type(src), self.get_type(dest.ref).of_type)): pass else: raise 
TypeMismatchError(instr, (src, dest)) context.assert_in_range(dest.index, dest.ref, dest.offset) elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef): - if TableType.is_a_table_type(src.ref.type, TYPE_WORD) and dest.type == TYPE_WORD: + if TableType.is_a_table_type(self.get_type(src.ref), TYPE_WORD) and self.get_type(dest) == TYPE_WORD: pass - elif (isinstance(src.ref.type, TableType) and isinstance(dest.type, VectorType) and - RoutineType.executable_types_compatible(src.ref.type.of_type, dest.type.of_type)): + elif (isinstance(self.get_type(src.ref), TableType) and isinstance(self.get_type(dest), VectorType) and + RoutineType.executable_types_compatible(self.get_type(src.ref).of_type, self.get_type(dest).of_type)): pass else: raise TypeMismatchError(instr, (src, dest)) context.assert_in_range(src.index, src.ref, src.offset) elif isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, LocationRef): - if src.type == dest.type: + if self.get_type(src) == self.get_type(dest): pass - elif isinstance(src.type, RoutineType) and isinstance(dest.type, VectorType): - self.assert_affected_within('inputs', src.type, dest.type.of_type) - self.assert_affected_within('outputs', src.type, dest.type.of_type) - self.assert_affected_within('trashes', src.type, dest.type.of_type) + elif isinstance(self.get_type(src), RoutineType) and isinstance(self.get_type(dest), VectorType): + self.assert_affected_within('inputs', self.get_type(src), self.get_type(dest).of_type) + self.assert_affected_within('outputs', self.get_type(src), self.get_type(dest).of_type) + self.assert_affected_within('trashes', self.get_type(src), self.get_type(dest).of_type) else: raise TypeMismatchError(instr, (src, dest)) else: @@ -789,7 +820,7 @@ class Analyzer(object): raise NotImplementedError(opcode) def analyze_call(self, instr, context): - type = instr.location.type + type = self.get_type(instr.location) if not isinstance(type, (RoutineType, VectorType)): raise TypeMismatchError(instr, 
instr.location) if isinstance(type, VectorType): @@ -805,7 +836,7 @@ class Analyzer(object): def analyze_goto(self, instr, context): location = instr.location - type_ = location.type + type_ = self.get_type(instr.location) if not isinstance(type_, (RoutineType, VectorType)): raise TypeMismatchError(instr, location) @@ -818,7 +849,7 @@ class Analyzer(object): # and that this routine's trashes and output constraints are a # superset of the called routine's - current_type = self.current_routine.location.type + current_type = self.get_type_for_name(self.current_routine.name) self.assert_affected_within('outputs', type_, current_type) self.assert_affected_within('trashes', type_, current_type) @@ -969,9 +1000,9 @@ class Analyzer(object): context.set_unmeaningful(REG_A) def analyze_point_into(self, instr, context): - if not isinstance(instr.pointer.type, PointerType): + if not isinstance(self.get_type(instr.pointer), PointerType): raise TypeMismatchError(instr, instr.pointer) - if not TableType.is_a_table_type(instr.table.type, TYPE_BYTE): + if not TableType.is_a_table_type(self.get_type(instr.table), TYPE_BYTE): raise TypeMismatchError(instr, instr.table) # check that pointer is not yet associated with any table. 
diff --git a/src/sixtypical/ast.py b/src/sixtypical/ast.py index bc9a5ee..fc5f96f 100644 --- a/src/sixtypical/ast.py +++ b/src/sixtypical/ast.py @@ -54,11 +54,11 @@ class Program(AST): class Defn(AST): - value_attrs = ('name', 'addr', 'initial', 'location',) + value_attrs = ('name', 'addr', 'initial',) class Routine(AST): - value_attrs = ('name', 'addr', 'initial', 'location',) + value_attrs = ('name', 'addr', 'initial',) children_attrs = ('statics',) child_attrs = ('block',) diff --git a/src/sixtypical/compiler.py b/src/sixtypical/compiler.py index 0d0d862..e033cb1 100644 --- a/src/sixtypical/compiler.py +++ b/src/sixtypical/compiler.py @@ -30,7 +30,8 @@ class UnsupportedOpcodeError(KeyError): class Compiler(object): - def __init__(self, emitter): + def __init__(self, symtab, emitter): + self.symtab = symtab self.emitter = emitter self.routines = {} # routine.name -> Routine self.routine_statics = {} # routine.name -> { static.name -> Label } @@ -38,7 +39,19 @@ class Compiler(object): self.trampolines = {} # Location -> Label self.current_routine = None - # helper methods + # - - - - helper methods - - - - + + def get_type_for_name(self, name): + if self.current_routine and self.symtab.has_static(self.current_routine.name, name): + return self.symtab.fetch_static_type(self.current_routine.name, name) + return self.symtab.fetch_global_type(name) + + def get_type(self, ref): + if isinstance(ref, ConstantRef): + return ref.type + if not isinstance(ref, LocationRef): + raise NotImplementedError + return self.get_type_for_name(ref.name) def addressing_mode_for_index(self, index): if index == REG_X: @@ -50,7 +63,7 @@ class Compiler(object): def compute_length_of_defn(self, defn): length = None - type_ = defn.location.type + type_ = self.get_type_for_name(defn.name) if type_ == TYPE_BYTE: length = 1 elif type_ == TYPE_WORD or isinstance(type_, (PointerType, VectorType)): @@ -74,18 +87,18 @@ class Compiler(object): else: return Absolute(label) - # visitor methods + # - - 
- - visitor methods - - - - def compile_program(self, program, compilation_roster=None): assert isinstance(program, Program) - defn_labels = [] + declarations = [] for defn in program.defns: length = self.compute_length_of_defn(defn) label = Label(defn.name, addr=defn.addr, length=length) self.labels[defn.name] = label - defn_labels.append((defn, label)) + declarations.append((defn, self.symtab.fetch_global_type(defn.name), label)) for routine in program.routines: self.routines[routine.name] = routine @@ -95,13 +108,15 @@ class Compiler(object): self.labels[routine.name] = label if hasattr(routine, 'statics'): + self.current_routine = routine static_labels = {} for defn in routine.statics: length = self.compute_length_of_defn(defn) label = Label(defn.name, addr=defn.addr, length=length) static_labels[defn.name] = label - defn_labels.append((defn, label)) + declarations.append((defn, self.symtab.fetch_static_type(routine.name, defn.name), label)) self.routine_statics[routine.name] = static_labels + self.current_routine = None if compilation_roster is None: compilation_roster = [['main']] + [[routine.name] for routine in program.routines if routine.name != 'main'] @@ -118,10 +133,9 @@ class Compiler(object): self.emitter.emit(RTS()) # initialized data - for defn, label in defn_labels: + for defn, type_, label in declarations: if defn.initial is not None: initial_data = None - type_ = defn.location.type if type_ == TYPE_BYTE: initial_data = Byte(defn.initial) elif type_ == TYPE_WORD: @@ -137,7 +151,7 @@ class Compiler(object): self.emitter.emit(initial_data) # uninitialized, "BSS" data - for defn, label in defn_labels: + for defn, type_, label in declarations: if defn.initial is None and defn.addr is None: self.emitter.resolve_bss_label(label) @@ -199,7 +213,7 @@ class Compiler(object): self.emitter.emit(LDA(AbsoluteX(Offset(self.get_label(src.ref.name), src.offset.value)))) elif isinstance(src, IndexedRef) and src.index == REG_Y: 
self.emitter.emit(LDA(AbsoluteY(Offset(self.get_label(src.ref.name), src.offset.value)))) - elif isinstance(src, IndirectRef) and isinstance(src.ref.type, PointerType): + elif isinstance(src, IndirectRef) and isinstance(self.get_type(src.ref), PointerType): self.emitter.emit(LDA(IndirectY(self.get_label(src.ref.name)))) else: self.emitter.emit(LDA(self.absolute_or_zero_page(self.get_label(src.name)))) @@ -241,7 +255,7 @@ class Compiler(object): REG_Y: AbsoluteY, }[dest.index] operand = mode_cls(Offset(self.get_label(dest.ref.name), dest.offset.value)) - elif isinstance(dest, IndirectRef) and isinstance(dest.ref.type, PointerType): + elif isinstance(dest, IndirectRef) and isinstance(self.get_type(dest.ref), PointerType): operand = IndirectY(self.get_label(dest.ref.name)) else: operand = self.absolute_or_zero_page(self.get_label(dest.name)) @@ -260,7 +274,7 @@ class Compiler(object): self.emitter.emit(ADC(mode(Offset(self.get_label(src.ref.name), src.offset.value)))) else: self.emitter.emit(ADC(Absolute(self.get_label(src.name)))) - elif isinstance(dest, LocationRef) and src.type == TYPE_BYTE and dest.type == TYPE_BYTE: + elif isinstance(dest, LocationRef) and self.get_type(src) == TYPE_BYTE and self.get_type(dest) == TYPE_BYTE: if isinstance(src, ConstantRef): dest_label = self.get_label(dest.name) self.emitter.emit(LDA(Absolute(dest_label))) @@ -274,7 +288,7 @@ class Compiler(object): self.emitter.emit(STA(Absolute(dest_label))) else: raise UnsupportedOpcodeError(instr) - elif isinstance(dest, LocationRef) and src.type == TYPE_WORD and dest.type == TYPE_WORD: + elif isinstance(dest, LocationRef) and self.get_type(src) == TYPE_WORD and self.get_type(dest) == TYPE_WORD: if isinstance(src, ConstantRef): dest_label = self.get_label(dest.name) self.emitter.emit(LDA(Absolute(dest_label))) @@ -294,7 +308,7 @@ class Compiler(object): self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) else: raise UnsupportedOpcodeError(instr) - elif isinstance(dest, LocationRef) and 
src.type == TYPE_WORD and isinstance(dest.type, PointerType): + elif isinstance(dest, LocationRef) and self.get_type(src) == TYPE_WORD and isinstance(self.get_type(dest), PointerType): if isinstance(src, ConstantRef): dest_label = self.get_label(dest.name) self.emitter.emit(LDA(ZeroPage(dest_label))) @@ -327,7 +341,7 @@ class Compiler(object): self.emitter.emit(SBC(mode(Offset(self.get_label(src.ref.name), src.offset.value)))) else: self.emitter.emit(SBC(Absolute(self.get_label(src.name)))) - elif isinstance(dest, LocationRef) and src.type == TYPE_BYTE and dest.type == TYPE_BYTE: + elif isinstance(dest, LocationRef) and self.get_type(src) == TYPE_BYTE and self.get_type(dest) == TYPE_BYTE: if isinstance(src, ConstantRef): dest_label = self.get_label(dest.name) self.emitter.emit(LDA(Absolute(dest_label))) @@ -341,7 +355,7 @@ class Compiler(object): self.emitter.emit(STA(Absolute(dest_label))) else: raise UnsupportedOpcodeError(instr) - elif isinstance(dest, LocationRef) and src.type == TYPE_WORD and dest.type == TYPE_WORD: + elif isinstance(dest, LocationRef) and self.get_type(src) == TYPE_WORD and self.get_type(dest) == TYPE_WORD: if isinstance(src, ConstantRef): dest_label = self.get_label(dest.name) self.emitter.emit(LDA(Absolute(dest_label))) @@ -409,15 +423,16 @@ class Compiler(object): def compile_call(self, instr): location = instr.location label = self.get_label(instr.location.name) - if isinstance(location.type, RoutineType): + location_type = self.get_type(location) + if isinstance(location_type, RoutineType): self.emitter.emit(JSR(Absolute(label))) - elif isinstance(location.type, VectorType): + elif isinstance(location_type, VectorType): trampoline = self.trampolines.setdefault( location, Label(location.name + '_trampoline') ) self.emitter.emit(JSR(Absolute(trampoline))) else: - raise NotImplementedError + raise NotImplementedError(location_type) def compile_goto(self, instr): self.final_goto_seen = True @@ -426,16 +441,17 @@ class Compiler(object): else: 
location = instr.location label = self.get_label(instr.location.name) - if isinstance(location.type, RoutineType): + location_type = self.get_type(location) + if isinstance(location_type, RoutineType): self.emitter.emit(JMP(Absolute(label))) - elif isinstance(location.type, VectorType): + elif isinstance(location_type, VectorType): self.emitter.emit(JMP(Indirect(label))) else: - raise NotImplementedError + raise NotImplementedError(location_type) def compile_cmp(self, instr, src, dest): """`instr` is only for reporting purposes""" - if isinstance(src, LocationRef) and src.type == TYPE_WORD: + if isinstance(src, LocationRef) and self.get_type(src) == TYPE_WORD: src_label = self.get_label(src.name) dest_label = self.get_label(dest.name) self.emitter.emit(LDA(Absolute(dest_label))) @@ -446,7 +462,7 @@ class Compiler(object): self.emitter.emit(CMP(Absolute(Offset(src_label, 1)))) self.emitter.resolve_label(end_label) return - if isinstance(src, ConstantRef) and src.type == TYPE_WORD: + if isinstance(src, ConstantRef) and self.get_type(src) == TYPE_WORD: dest_label = self.get_label(dest.name) self.emitter.emit(LDA(Absolute(dest_label))) self.emitter.emit(CMP(Immediate(Byte(src.low_byte())))) @@ -497,30 +513,41 @@ class Compiler(object): self.emitter.emit(DEC(Absolute(self.get_label(dest.name)))) def compile_copy(self, instr, src, dest): - if isinstance(src, ConstantRef) and isinstance(dest, IndirectRef) and src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): + + if isinstance(src, (IndirectRef, IndexedRef)): + src_ref_type = self.get_type(src.ref) + else: + src_type = self.get_type(src) + + if isinstance(dest, (IndirectRef, IndexedRef)): + dest_ref_type = self.get_type(dest.ref) + else: + dest_type = self.get_type(dest) + + if isinstance(src, ConstantRef) and isinstance(dest, IndirectRef) and src_type == TYPE_BYTE and isinstance(dest_ref_type, PointerType): ### copy 123, [ptr] + y dest_label = self.get_label(dest.ref.name) 
self.emitter.emit(LDA(Immediate(Byte(src.value)))) self.emitter.emit(STA(IndirectY(dest_label))) - elif isinstance(src, LocationRef) and isinstance(dest, IndirectRef) and src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType): + elif isinstance(src, LocationRef) and isinstance(dest, IndirectRef) and src_type == TYPE_BYTE and isinstance(dest_ref_type, PointerType): ### copy b, [ptr] + y src_label = self.get_label(src.name) dest_label = self.get_label(dest.ref.name) self.emitter.emit(LDA(Absolute(src_label))) self.emitter.emit(STA(IndirectY(dest_label))) - elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef) and dest.type == TYPE_BYTE and isinstance(src.ref.type, PointerType): + elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef) and dest_type == TYPE_BYTE and isinstance(src_ref_type, PointerType): ### copy [ptr] + y, b src_label = self.get_label(src.ref.name) dest_label = self.get_label(dest.name) self.emitter.emit(LDA(IndirectY(src_label))) self.emitter.emit(STA(Absolute(dest_label))) - elif isinstance(src, IndirectRef) and isinstance(dest, IndirectRef) and isinstance(src.ref.type, PointerType) and isinstance(dest.ref.type, PointerType): + elif isinstance(src, IndirectRef) and isinstance(dest, IndirectRef) and isinstance(src_ref_type, PointerType) and isinstance(dest_ref_type, PointerType): ### copy [ptra] + y, [ptrb] + y src_label = self.get_label(src.ref.name) dest_label = self.get_label(dest.ref.name) self.emitter.emit(LDA(IndirectY(src_label))) self.emitter.emit(STA(IndirectY(dest_label))) - elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): + elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and src_type == TYPE_WORD and TableType.is_a_table_type(dest_ref_type, TYPE_WORD): ### copy w, wtab + y src_label = self.get_label(src.name) dest_label = self.get_label(dest.ref.name) @@ -529,7 +556,7 @@ class 
Compiler(object): self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value)))) self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value + 256)))) - elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and isinstance(src.type, VectorType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType): + elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and isinstance(src_type, VectorType) and isinstance(dest_ref_type, TableType) and isinstance(dest_ref_type.of_type, VectorType): ### copy vec, vtab + y # FIXME this is the exact same as above - can this be simplified? src_label = self.get_label(src.name) @@ -539,7 +566,7 @@ class Compiler(object): self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value)))) self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value + 256)))) - elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and isinstance(src.type, RoutineType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType): + elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef) and isinstance(src_type, RoutineType) and isinstance(dest_ref_type, TableType) and isinstance(dest_ref_type.of_type, VectorType): ### copy routine, vtab + y src_label = self.get_label(src.name) dest_label = self.get_label(dest.ref.name) @@ -548,7 +575,7 @@ class Compiler(object): self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value)))) self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value + 256)))) - elif isinstance(src, ConstantRef) and isinstance(dest, IndexedRef) and src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD): + elif isinstance(src, ConstantRef) and isinstance(dest, IndexedRef) and src_type == TYPE_WORD and 
TableType.is_a_table_type(dest_ref_type, TYPE_WORD): ### copy 9999, wtab + y dest_label = self.get_label(dest.ref.name) mode = self.addressing_mode_for_index(dest.index) @@ -556,7 +583,7 @@ class Compiler(object): self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value)))) self.emitter.emit(LDA(Immediate(Byte(src.high_byte())))) self.emitter.emit(STA(mode(Offset(dest_label, dest.offset.value + 256)))) - elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef) and TableType.is_a_table_type(src.ref.type, TYPE_WORD) and dest.type == TYPE_WORD: + elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef) and TableType.is_a_table_type(src_ref_type, TYPE_WORD) and dest_type == TYPE_WORD: ### copy wtab + y, w src_label = self.get_label(src.ref.name) dest_label = self.get_label(dest.name) @@ -565,7 +592,7 @@ class Compiler(object): self.emitter.emit(STA(Absolute(dest_label))) self.emitter.emit(LDA(mode(Offset(src_label, src.offset.value + 256)))) self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef) and isinstance(dest.type, VectorType) and isinstance(src.ref.type, TableType) and isinstance(src.ref.type.of_type, VectorType): + elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef) and isinstance(dest_type, VectorType) and isinstance(src_ref_type, TableType) and isinstance(src_ref_type.of_type, VectorType): ### copy vtab + y, vec # FIXME this is the exact same as above - can this be simplified? 
src_label = self.get_label(src.ref.name) @@ -575,20 +602,20 @@ class Compiler(object): self.emitter.emit(STA(Absolute(dest_label))) self.emitter.emit(LDA(mode(Offset(src_label, src.offset.value + 256)))) self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - elif src.type == TYPE_BYTE and dest.type == TYPE_BYTE and not isinstance(src, ConstantRef): + elif src_type == TYPE_BYTE and dest_type == TYPE_BYTE and not isinstance(src, ConstantRef): ### copy b1, b2 src_label = self.get_label(src.name) dest_label = self.get_label(dest.name) self.emitter.emit(LDA(Absolute(src_label))) self.emitter.emit(STA(Absolute(dest_label))) - elif src.type == TYPE_WORD and dest.type == TYPE_WORD and isinstance(src, ConstantRef): + elif src_type == TYPE_WORD and dest_type == TYPE_WORD and isinstance(src, ConstantRef): ### copy 9999, w dest_label = self.get_label(dest.name) self.emitter.emit(LDA(Immediate(Byte(src.low_byte())))) self.emitter.emit(STA(Absolute(dest_label))) self.emitter.emit(LDA(Immediate(Byte(src.high_byte())))) self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - elif src.type == TYPE_WORD and dest.type == TYPE_WORD and not isinstance(src, ConstantRef): + elif src_type == TYPE_WORD and dest_type == TYPE_WORD and not isinstance(src, ConstantRef): ### copy w1, w2 src_label = self.get_label(src.name) dest_label = self.get_label(dest.name) @@ -596,7 +623,7 @@ class Compiler(object): self.emitter.emit(STA(Absolute(dest_label))) self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - elif isinstance(src.type, VectorType) and isinstance(dest.type, VectorType): + elif isinstance(src_type, VectorType) and isinstance(dest_type, VectorType): ### copy v1, v2 src_label = self.get_label(src.name) dest_label = self.get_label(dest.name) @@ -604,7 +631,7 @@ class Compiler(object): self.emitter.emit(STA(Absolute(dest_label))) self.emitter.emit(LDA(Absolute(Offset(src_label, 1)))) 
self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) - elif isinstance(src.type, RoutineType) and isinstance(dest.type, VectorType): + elif isinstance(src_type, RoutineType) and isinstance(dest_type, VectorType): ### copy routine, vec src_label = self.get_label(src.name) dest_label = self.get_label(dest.name) @@ -613,7 +640,7 @@ class Compiler(object): self.emitter.emit(LDA(Immediate(LowAddressByte(src_label)))) self.emitter.emit(STA(Absolute(Offset(dest_label, 1)))) else: - raise NotImplementedError(src.type) + raise NotImplementedError(src_type) def compile_if(self, instr): cls = { diff --git a/src/sixtypical/fallthru.py b/src/sixtypical/fallthru.py index 995fbcf..66d95e2 100644 --- a/src/sixtypical/fallthru.py +++ b/src/sixtypical/fallthru.py @@ -7,7 +7,8 @@ from sixtypical.model import RoutineType class FallthruAnalyzer(object): - def __init__(self, debug=False): + def __init__(self, symtab, debug=False): + self.symtab = symtab self.debug = debug def analyze_program(self, program): @@ -16,7 +17,7 @@ class FallthruAnalyzer(object): self.fallthru_map = {} for routine in program.routines: encountered_gotos = list(routine.encountered_gotos) - if len(encountered_gotos) == 1 and isinstance(encountered_gotos[0].type, RoutineType): + if len(encountered_gotos) == 1 and isinstance(self.symtab.fetch_global_type(encountered_gotos[0].name), RoutineType): self.fallthru_map[routine.name] = encountered_gotos[0].name else: self.fallthru_map[routine.name] = None diff --git a/src/sixtypical/model.py b/src/sixtypical/model.py index f89340d..6b8724d 100644 --- a/src/sixtypical/model.py +++ b/src/sixtypical/model.py @@ -57,6 +57,8 @@ class RoutineType(Type): class VectorType(Type): """This memory location contains the address of some other type (currently, only RoutineType).""" + max_range = (0, 0) + def __init__(self, of_type): self.of_type = of_type @@ -92,6 +94,8 @@ class TableType(Type): class PointerType(Type): + max_range = (0, 0) + def __init__(self): self.name = 'pointer' 
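Reviewer note on the pattern these hunks converge on: once `LocationRef` stops carrying a `type`, every consumer has to resolve types by name through the symbol table, as `FallthruAnalyzer` does above with `self.symtab.fetch_global_type(...)`. The sketch below illustrates that lookup in isolation. It is not the project's code: `SymEntry`, `SymbolTable`, the one-field `LocationRef` and the `get_type` helper are simplified stand-ins written for this note (the real `get_type` methods are not shown in this patch and may differ), and types are plain strings rather than `Type` objects.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for the classes in parser.py / model.py.
SymEntry = namedtuple('SymEntry', ['ast_node', 'type_'])
LocationRef = namedtuple('LocationRef', ['name'])
ConstantRef = namedtuple('ConstantRef', ['type', 'value'])


class SymbolTable(object):
    def __init__(self):
        self.symbols = {}   # symbol name -> SymEntry
        self.statics = {}   # routine name -> (symbol name -> SymEntry)

    def has_static(self, routine_name, name):
        return name in self.statics.get(routine_name, {})

    def fetch_global_type(self, name):
        return self.symbols[name].type_

    def fetch_static_type(self, routine_name, name):
        return self.statics[routine_name][name].type_


def get_type(symtab, ref, routine_name=None):
    """Assumed lookup order (mirrors Parser.lookup): constants carry
    their own type, then globals, then the routine's statics."""
    if isinstance(ref, ConstantRef):
        return ref.type
    if ref.name in symtab.symbols:
        return symtab.fetch_global_type(ref.name)
    return symtab.fetch_static_type(routine_name, ref.name)


symtab = SymbolTable()
symtab.symbols['score'] = SymEntry(None, 'word')
symtab.statics.setdefault('main', {})['counter'] = SymEntry(None, 'byte')

score_type = get_type(symtab, LocationRef('score'))
counter_type = get_type(symtab, LocationRef('counter'), routine_name='main')
const_type = get_type(symtab, ConstantRef('byte', 7))
```

Because `declare_static` rejects a static whose name is already a global, the order of the two table lookups should not change the result; the sketch simply follows the order used in `Parser.lookup`.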
@@ -100,48 +104,24 @@ class PointerType(Type): class Ref(object): - def is_constant(self): - """read-only means that the program cannot change the value - of a location. constant means that the value of the location - will not change during the lifetime of the program.""" - raise NotImplementedError("class {} must implement is_constant()".format(self.__class__.__name__)) - - def max_range(self): - raise NotImplementedError("class {} must implement max_range()".format(self.__class__.__name__)) + pass class LocationRef(Ref): def __init__(self, type, name): - self.type = type self.name = name def __eq__(self, other): - # Ordinarily there will only be one ref with a given name, - # but because we store the type in here and we want to treat - # these objects as immutable, we compare the types, too, - # just to be sure. - equal = isinstance(other, self.__class__) and other.name == self.name - if equal: - assert other.type == self.type, repr((self, other)) - return equal + return self.__class__ is other.__class__ and self.name == other.name def __hash__(self): - return hash(self.name + repr(self.type)) + return hash(self.name) def __repr__(self): - return '%s(%r, %r)' % (self.__class__.__name__, self.type, self.name) + return '%s(%r)' % (self.__class__.__name__, self.name) def __str__(self): - return "{}:{}".format(self.name, self.type) - - def is_constant(self): - return isinstance(self.type, RoutineType) - - def max_range(self): - try: - return self.type.max_range - except: - return (0, 0) + return self.name @classmethod def format_set(cls, location_refs): @@ -165,9 +145,6 @@ class IndirectRef(Ref): def name(self): return '[{}]+y'.format(self.ref.name) - def is_constant(self): - return False - class IndexedRef(Ref): def __init__(self, ref, offset, index): @@ -188,9 +165,6 @@ class IndexedRef(Ref): def name(self): return '{}+{}+{}'.format(self.ref.name, self.offset, self.index.name) - def is_constant(self): - return False - class ConstantRef(Ref): def __init__(self, 
type, value): @@ -208,12 +182,6 @@ class ConstantRef(Ref): def __repr__(self): return '%s(%r, %r)' % (self.__class__.__name__, self.type, self.value) - def is_constant(self): - return True - - def max_range(self): - return (self.value, self.value) - def high_byte(self): return (self.value >> 8) & 255 diff --git a/src/sixtypical/parser.py b/src/sixtypical/parser.py index c23ec54..c5034f0 100644 --- a/src/sixtypical/parser.py +++ b/src/sixtypical/parser.py @@ -28,12 +28,12 @@ class ForwardReference(object): return "%s(%r)" % (self.__class__.__name__, self.name) -class ParsingContext(object): +class SymbolTable(object): def __init__(self): - self.symbols = {} # token -> SymEntry - self.statics = {} # token -> SymEntry - self.typedefs = {} # token -> Type AST - self.consts = {} # token -> Loc + self.symbols = {} # symbol name -> SymEntry + self.statics = {} # routine name -> (symbol name -> SymEntry) + self.typedefs = {} # type name -> Type AST + self.consts = {} # const name -> ConstantRef for name in ('a', 'x', 'y'): self.symbols[name] = SymEntry(None, TYPE_BYTE) @@ -43,38 +43,55 @@ class ParsingContext(object): def __str__(self): return "Symbols: {}\nStatics: {}\nTypedefs: {}\nConsts: {}".format(self.symbols, self.statics, self.typedefs, self.consts) - def fetch_ref(self, name): - if name in self.statics: - return LocationRef(self.statics[name].type_, name) + def has_static(self, routine_name, name): + return name in self.statics.get(routine_name, {}) + + def fetch_global_type(self, name): + return self.symbols[name].type_ + + def fetch_static_type(self, routine_name, name): + return self.statics[routine_name][name].type_ + + def fetch_global_ref(self, name): if name in self.symbols: return LocationRef(self.symbols[name].type_, name) return None + def fetch_static_ref(self, routine_name, name): + routine_statics = self.statics.get(routine_name, {}) + if name in routine_statics: + return LocationRef(routine_statics[name].type_, name) + return None + class 
Parser(object): - def __init__(self, context, text, filename): - self.context = context + def __init__(self, symtab, text, filename): + self.symtab = symtab self.scanner = Scanner(text, filename) + self.current_routine_name = None def syntax_error(self, msg): self.scanner.syntax_error(msg) - def lookup(self, name): - model = self.context.fetch_ref(name) + def lookup(self, name, allow_forward=False, routine_name=None): + model = self.symtab.fetch_global_ref(name) + if model is None and routine_name: + model = self.symtab.fetch_static_ref(routine_name, name) + if model is None and allow_forward: + return ForwardReference(name) if model is None: self.syntax_error('Undefined symbol "{}"'.format(name)) return model - def declare(self, name, ast_node, type_, static=False): - if self.context.fetch_ref(name): + def declare(self, name, ast_node, type_): + if self.symtab.fetch_global_ref(name): self.syntax_error('Symbol "%s" already declared' % name) - if static: - self.context.statics[name] = SymEntry(ast_node, type_) - else: - self.context.symbols[name] = SymEntry(ast_node, type_) + self.symtab.symbols[name] = SymEntry(ast_node, type_) - def clear_statics(self): - self.context.statics = {} + def declare_static(self, routine_name, name, ast_node, type_): + if self.symtab.fetch_global_ref(name): + self.syntax_error('Symbol "%s" already declared' % name) + self.symtab.statics.setdefault(routine_name, {})[name] = SymEntry(ast_node, type_) # ---- symbol resolution @@ -95,10 +112,8 @@ class Parser(object): type_.outputs = set([resolve(w) for w in type_.outputs]) type_.trashes = set([resolve(w) for w in type_.trashes]) - for defn in program.defns: - backpatch_constraint_labels(defn.location.type) - for routine in program.routines: - backpatch_constraint_labels(routine.location.type) + for name, symentry in self.symtab.symbols.items(): + backpatch_constraint_labels(symentry.type_) def resolve_fwd_reference(obj, field): field_value = getattr(obj, field, None) @@ -110,7 +125,6 @@ 
class Parser(object): for node in program.all_children(): if isinstance(node, SingleOp): - resolve_fwd_reference(node, 'location') resolve_fwd_reference(node, 'src') resolve_fwd_reference(node, 'dest') if isinstance(node, (Call, GoTo)): @@ -127,7 +141,7 @@ class Parser(object): if self.scanner.on('const'): self.defn_const() typenames = ['byte', 'word', 'table', 'vector', 'pointer'] # 'routine', - typenames.extend(self.context.typedefs.keys()) + typenames.extend(self.symtab.typedefs.keys()) while self.scanner.on(*typenames): type_, defn = self.defn() self.declare(defn.name, defn, type_) @@ -135,9 +149,11 @@ class Parser(object): while self.scanner.consume('define'): name = self.scanner.token self.scanner.scan() + self.current_routine_name = name type_, routine = self.routine(name) self.declare(name, routine, type_) routines.append(routine) + self.current_routine_name = None self.scanner.check_type('EOF') program = Program(self.scanner.line_number, defns=defns, routines=routines) @@ -148,18 +164,18 @@ class Parser(object): self.scanner.expect('typedef') type_ = self.defn_type() name = self.defn_name() - if name in self.context.typedefs: + if name in self.symtab.typedefs: self.syntax_error('Type "%s" already declared' % name) - self.context.typedefs[name] = type_ + self.symtab.typedefs[name] = type_ return type_ def defn_const(self): self.scanner.expect('const') name = self.defn_name() - if name in self.context.consts: + if name in self.symtab.consts: self.syntax_error('Const "%s" already declared' % name) loc = self.const() - self.context.consts[name] = loc + self.symtab.consts[name] = loc return loc def defn(self): @@ -189,9 +205,7 @@ class Parser(object): if initial is not None and addr is not None: self.syntax_error("Definition cannot have both initial value and explicit address") - location = LocationRef(type_, name) - - return type_, Defn(self.scanner.line_number, name=name, addr=addr, initial=initial, location=location) + return type_, 
Defn(self.scanner.line_number, name=name, addr=addr, initial=initial) def const(self): if self.scanner.token in ('on', 'off'): @@ -208,8 +222,8 @@ class Parser(object): loc = ConstantRef(TYPE_WORD, int(self.scanner.token)) self.scanner.scan() return loc - elif self.scanner.token in self.context.consts: - loc = self.context.consts[self.scanner.token] + elif self.scanner.token in self.symtab.consts: + loc = self.symtab.consts[self.scanner.token] self.scanner.scan() return loc else: @@ -257,9 +271,9 @@ class Parser(object): else: type_name = self.scanner.token self.scanner.scan() - if type_name not in self.context.typedefs: + if type_name not in self.symtab.typedefs: self.syntax_error("Undefined type '%s'" % type_name) - type_ = self.context.typedefs[type_name] + type_ = self.symtab.typedefs[type_name] return type_ @@ -297,20 +311,9 @@ class Parser(object): self.scanner.scan() else: statics = self.statics() - - self.clear_statics() - for defn in statics: - self.declare(defn.name, defn, defn.location.type, static=True) block = self.block() - self.clear_statics() - addr = None - location = LocationRef(type_, name) - return type_, Routine( - self.scanner.line_number, - name=name, block=block, addr=addr, - location=location, statics=statics, - ) + return type_, Routine(self.scanner.line_number, name=name, block=block, addr=addr, statics=statics) def labels(self): accum = [] @@ -334,16 +337,12 @@ class Parser(object): return accum def locexpr(self): - if self.scanner.token in ('on', 'off', 'word') or self.scanner.token in self.context.consts or self.scanner.on_type('integer literal'): + if self.scanner.token in ('on', 'off', 'word') or self.scanner.token in self.symtab.consts or self.scanner.on_type('integer literal'): return self.const() else: name = self.scanner.token self.scanner.scan() - loc = self.context.fetch_ref(name) - if loc: - return loc - else: - return ForwardReference(name) + return self.lookup(name, allow_forward=True, routine_name=self.current_routine_name) 
def indlocexpr(self): if self.scanner.consume('['): @@ -361,7 +360,7 @@ class Parser(object): index = None offset = ConstantRef(TYPE_BYTE, 0) if self.scanner.consume('+'): - if self.scanner.token in self.context.consts or self.scanner.on_type('integer literal'): + if self.scanner.token in self.symtab.consts or self.scanner.on_type('integer literal'): offset = self.const() self.scanner.expect('+') index = self.locexpr() @@ -374,6 +373,7 @@ class Parser(object): type_, defn = self.defn() if defn.initial is None: self.syntax_error("Static definition {} must have initial value".format(defn)) + self.declare_static(self.current_routine_name, defn.name, defn, type_) defns.append(defn) return defns From 8d6e5e090deec8fbd552c1ec898b1aecac46ab20 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Wed, 10 Apr 2019 08:50:13 +0100 Subject: [PATCH 06/18] The Type and Ref class hierarchies are now namedtuples. --- src/sixtypical/analyzer.py | 13 ++- src/sixtypical/model.py | 180 ++++++++++++------------------------- src/sixtypical/parser.py | 33 +++---- 3 files changed, 78 insertions(+), 148 deletions(-) diff --git a/src/sixtypical/analyzer.py b/src/sixtypical/analyzer.py index 68ff759..7008cdb 100644 --- a/src/sixtypical/analyzer.py +++ b/src/sixtypical/analyzer.py @@ -184,7 +184,7 @@ class AnalysisContext(object): exception_class = kwargs.get('exception_class', UnmeaningfulReadError) for ref in refs: # statics are always meaningful - if self.symtab.fetch_static_ref(self.routine.name, ref.name): + if self.symtab.has_static(self.routine.name, ref.name): continue if self.is_constant(ref): pass @@ -204,7 +204,7 @@ class AnalysisContext(object): exception_class = kwargs.get('exception_class', ForbiddenWriteError) for ref in refs: # statics are always writeable - if self.symtab.fetch_static_ref(self.routine.name, ref.name): + if self.symtab.has_static(self.routine.name, ref.name): continue if ref not in self._writeable: message = ref.name @@ -505,10 +505,7 @@ class Analyzer(object): # 
if something was touched, then it should have been declared to be writable. for ref in context.each_touched(): - # FIXME once we have namedtuples, go back to comparing the ref directly! - outputs_names = [r.name for r in type_.outputs] - trashes_names = [r.name for r in type_.trashes] - if ref.name not in outputs_names and ref.name not in trashes_names and not self.symtab.has_static(routine.name, ref.name): + if ref not in type_.outputs and ref not in type_.trashes and not self.symtab.has_static(routine.name, ref.name): raise ForbiddenWriteError(routine, ref.name) self.exit_contexts = None @@ -822,7 +819,7 @@ class Analyzer(object): def analyze_call(self, instr, context): type = self.get_type(instr.location) if not isinstance(type, (RoutineType, VectorType)): - raise TypeMismatchError(instr, instr.location) + raise TypeMismatchError(instr, instr.location.name) if isinstance(type, VectorType): type = type.of_type for ref in type.inputs: @@ -839,7 +836,7 @@ class Analyzer(object): type_ = self.get_type(instr.location) if not isinstance(type_, (RoutineType, VectorType)): - raise TypeMismatchError(instr, location) + raise TypeMismatchError(instr, location.name) # assert that the dest routine's inputs are all initialized if isinstance(type_, VectorType): diff --git a/src/sixtypical/model.py b/src/sixtypical/model.py index 6b8724d..716cbca 100644 --- a/src/sixtypical/model.py +++ b/src/sixtypical/model.py @@ -1,41 +1,35 @@ """Data/storage model for SixtyPical.""" - -class Type(object): - def __init__(self, name, max_range=None): - self.name = name - self.max_range = max_range - - def __repr__(self): - return 'Type(%r)' % self.name - - def __eq__(self, other): - return other.__class__ == self.__class__ and other.name == self.name +from collections import namedtuple -TYPE_BIT = Type('bit', max_range=(0, 1)) -TYPE_BYTE = Type('byte', max_range=(0, 255)) -TYPE_WORD = Type('word', max_range=(0, 65535)) +class BitType(namedtuple('BitType', ['typename'])): + max_range = (0, 1) 
+ def __new__(cls): + return super(BitType, cls).__new__(cls, 'bit') +TYPE_BIT = BitType() -class RoutineType(Type): +class ByteType(namedtuple('ByteType', ['typename'])): + max_range = (0, 255) + def __new__(cls): + return super(ByteType, cls).__new__(cls, 'byte') +TYPE_BYTE = ByteType() + + +class WordType(namedtuple('WordType', ['typename'])): + max_range = (0, 65535) + def __new__(cls): + return super(WordType, cls).__new__(cls, 'word') +TYPE_WORD = WordType() + + +class RoutineType(namedtuple('RoutineType', ['typename', 'inputs', 'outputs', 'trashes'])): """This memory location contains the code for a routine.""" - def __init__(self, inputs, outputs, trashes): - self.inputs = inputs - self.outputs = outputs - self.trashes = trashes + max_range = (0, 0) - def __repr__(self): - return '%s(inputs=%r, outputs=%r, trashes=%r)' % ( - self.__class__.__name__, self.inputs, self.outputs, self.trashes - ) - - def __eq__(self, other): - return isinstance(other, RoutineType) and ( - other.inputs == self.inputs and - other.outputs == self.outputs and - other.trashes == self.trashes - ) + def __new__(cls, *args): + return super(RoutineType, cls).__new__(cls, 'routine', *args) @classmethod def executable_types_compatible(cls_, src, dest): @@ -55,34 +49,18 @@ class RoutineType(Type): return False -class VectorType(Type): +class VectorType(namedtuple('VectorType', ['typename', 'of_type'])): """This memory location contains the address of some other type (currently, only RoutineType).""" - max_range = (0, 0) + max_range = (0, 65535) - def __init__(self, of_type): - self.of_type = of_type - - def __repr__(self): - return '%s(%r)' % ( - self.__class__.__name__, self.of_type - ) - - def __eq__(self, other): - return isinstance(other, VectorType) and self.of_type == other.of_type + def __new__(cls, *args): + return super(VectorType, cls).__new__(cls, 'vector', *args) -class TableType(Type): - def __init__(self, of_type, size): - self.of_type = of_type - self.size = size +class 
TableType(namedtuple('TableType', ['typename', 'of_type', 'size'])): - def __repr__(self): - return '%s(%r, %r)' % ( - self.__class__.__name__, self.of_type, self.size - ) - - def __eq__(self, other): - return isinstance(other, TableType) and self.of_type == other.of_type and self.size == other.size + def __new__(cls, *args): + return super(TableType, cls).__new__(cls, 'table', *args) @property def max_range(self): @@ -93,94 +71,46 @@ class TableType(Type): return isinstance(x, TableType) and x.of_type == of_type -class PointerType(Type): - max_range = (0, 0) +class PointerType(namedtuple('PointerType', ['typename'])): + max_range = (0, 65535) - def __init__(self): - self.name = 'pointer' - - def __eq__(self, other): - return other.__class__ == self.__class__ + def __new__(cls): + return super(PointerType, cls).__new__(cls, 'pointer') -class Ref(object): - pass +# -------------------------------------------------------- -class LocationRef(Ref): - def __init__(self, type, name): - self.name = name - - def __eq__(self, other): - return self.__class__ is other.__class__ and self.name == other.name - - def __hash__(self): - return hash(self.name) - - def __repr__(self): - return '%s(%r)' % (self.__class__.__name__, self.name) - - def __str__(self): - return self.name +class LocationRef(namedtuple('LocationRef', ['reftype', 'name'])): + def __new__(cls, *args): + return super(LocationRef, cls).__new__(cls, 'location', *args) @classmethod def format_set(cls, location_refs): return '{%s}' % ', '.join([str(loc) for loc in sorted(location_refs, key=lambda x: x.name)]) -class IndirectRef(Ref): - def __init__(self, ref): - self.ref = ref - - def __eq__(self, other): - return isinstance(other, self.__class__) and self.ref == other.ref - - def __hash__(self): - return hash(self.__class__.name) ^ hash(self.ref) - - def __repr__(self): - return '%s(%r)' % (self.__class__.__name__, self.ref) +class IndirectRef(namedtuple('IndirectRef', ['reftype', 'ref'])): + def __new__(cls, 
*args): + return super(IndirectRef, cls).__new__(cls, 'indirect', *args) @property def name(self): return '[{}]+y'.format(self.ref.name) -class IndexedRef(Ref): - def __init__(self, ref, offset, index): - self.ref = ref - self.offset = offset - self.index = index - - def __eq__(self, other): - return isinstance(other, self.__class__) and self.ref == other.ref and self.offset == other.offset and self.index == other.index - - def __hash__(self): - return hash(self.__class__.name) ^ hash(self.ref) ^ hash(self.offset) ^ hash(self.index) - - def __repr__(self): - return '%s(%r, %r, %r)' % (self.__class__.__name__, self.ref, self.offset, self.index) +class IndexedRef(namedtuple('IndexedRef', ['reftype', 'ref', 'offset', 'index'])): + def __new__(cls, *args): + return super(IndexedRef, cls).__new__(cls, 'indexed', *args) @property def name(self): return '{}+{}+{}'.format(self.ref.name, self.offset, self.index.name) -class ConstantRef(Ref): - def __init__(self, type, value): - self.type = type - self.value = value - - def __eq__(self, other): - return isinstance(other, ConstantRef) and ( - other.type == self.type and other.value == self.value - ) - - def __hash__(self): - return hash(str(self.value) + str(self.type)) - - def __repr__(self): - return '%s(%r, %r)' % (self.__class__.__name__, self.type, self.value) +class ConstantRef(namedtuple('ConstantRef', ['reftype', 'type', 'value'])): + def __new__(cls, *args): + return super(ConstantRef, cls).__new__(cls, 'constant', *args) def high_byte(self): return (self.value >> 8) & 255 @@ -207,11 +137,11 @@ class ConstantRef(Ref): return 'constant({})'.format(self.value) -REG_A = LocationRef(TYPE_BYTE, 'a') -REG_X = LocationRef(TYPE_BYTE, 'x') -REG_Y = LocationRef(TYPE_BYTE, 'y') +REG_A = LocationRef('a') +REG_X = LocationRef('x') +REG_Y = LocationRef('y') -FLAG_Z = LocationRef(TYPE_BIT, 'z') -FLAG_C = LocationRef(TYPE_BIT, 'c') -FLAG_N = LocationRef(TYPE_BIT, 'n') -FLAG_V = LocationRef(TYPE_BIT, 'v') +FLAG_Z = LocationRef('z') 
+FLAG_C = LocationRef('c') +FLAG_N = LocationRef('n') +FLAG_V = LocationRef('v') diff --git a/src/sixtypical/parser.py b/src/sixtypical/parser.py index c5034f0..8a8cb29 100644 --- a/src/sixtypical/parser.py +++ b/src/sixtypical/parser.py @@ -54,13 +54,13 @@ class SymbolTable(object): def fetch_global_ref(self, name): if name in self.symbols: - return LocationRef(self.symbols[name].type_, name) + return LocationRef(name) return None def fetch_static_ref(self, routine_name, name): routine_statics = self.statics.get(routine_name, {}) if name in routine_statics: - return LocationRef(routine_statics[name].type_, name) + return LocationRef(name) return None @@ -98,22 +98,25 @@ class Parser(object): def resolve_symbols(self, program): # This could stand to be better unified. - def backpatch_constraint_labels(type_): - def resolve(w): - if not isinstance(w, ForwardReference): - return w - return self.lookup(w.name) + def resolve(w): + return self.lookup(w.name) if isinstance(w, ForwardReference) else w + + def backpatched_type(type_): if isinstance(type_, TableType): - backpatch_constraint_labels(type_.of_type) + return TableType(backpatched_type(type_.of_type), type_.size) elif isinstance(type_, VectorType): - backpatch_constraint_labels(type_.of_type) + return VectorType(backpatched_type(type_.of_type)) elif isinstance(type_, RoutineType): - type_.inputs = set([resolve(w) for w in type_.inputs]) - type_.outputs = set([resolve(w) for w in type_.outputs]) - type_.trashes = set([resolve(w) for w in type_.trashes]) + return RoutineType( + frozenset([resolve(w) for w in type_.inputs]), + frozenset([resolve(w) for w in type_.outputs]), + frozenset([resolve(w) for w in type_.trashes]), + ) + else: + return type_ for name, symentry in self.symtab.symbols.items(): - backpatch_constraint_labels(symentry.type_) + symentry.type_ = backpatched_type(symentry.type_) def resolve_fwd_reference(obj, field): field_value = getattr(obj, field, None) @@ -265,7 +268,7 @@ class Parser(object): 
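Reviewer note on the namedtuple conversion: the `resolve_symbols` hunk above switches from assigning to `type_.inputs` in place to building a fresh `RoutineType` from `frozenset`s, which is what immutable, hashable namedtuples require. The following is a minimal sketch of that behaviour, assuming only the standard library. `LocationRef` and `RoutineType` copy the shape used in this patch; `ForwardReference` as a namedtuple and the simplified `resolve` (which builds a `LocationRef` directly instead of calling `Parser.lookup`) are assumptions made for illustration.

```python
from collections import namedtuple


class LocationRef(namedtuple('LocationRef', ['reftype', 'name'])):
    def __new__(cls, *args):
        # The first field is a fixed tag, so callers pass only the name.
        return super(LocationRef, cls).__new__(cls, 'location', *args)


class RoutineType(namedtuple('RoutineType', ['typename', 'inputs', 'outputs', 'trashes'])):
    def __new__(cls, *args):
        return super(RoutineType, cls).__new__(cls, 'routine', *args)


# Hypothetical placeholder for a name that has not been resolved yet.
ForwardReference = namedtuple('ForwardReference', ['name'])


def resolve(w):
    # Simplified: the real code looks the name up in the symbol table.
    return LocationRef(w.name) if isinstance(w, ForwardReference) else w


def backpatched_type(type_):
    # Return a new RoutineType rather than mutating the old one.
    if isinstance(type_, RoutineType):
        return RoutineType(
            frozenset(resolve(w) for w in type_.inputs),
            frozenset(resolve(w) for w in type_.outputs),
            frozenset(resolve(w) for w in type_.trashes),
        )
    return type_


unresolved = RoutineType(
    frozenset([ForwardReference('a')]), frozenset([LocationRef('x')]), frozenset()
)
resolved = backpatched_type(unresolved)

# Value equality and hashing come from tuple, with no hand-written __eq__/__hash__.
same_ref = LocationRef('a') == LocationRef('a')
same_hash = hash(resolved) == hash(backpatched_type(unresolved))

# Fields cannot be reassigned, which is why backpatching has to rebuild.
try:
    resolved.inputs = frozenset()
    mutated = True
except AttributeError:
    mutated = False
```

One plausible reason for the fixed first field (`reftype`, `typename`): namedtuples compare as plain tuples, so without a tag two different classes holding the same field values would compare equal. That motivation is a guess on the reviewer's part, not something the commit message states.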
             type_ = VectorType(type_)
         elif self.scanner.consume('routine'):
             (inputs, outputs, trashes) = self.constraints()
-            type_ = RoutineType(inputs=inputs, outputs=outputs, trashes=trashes)
+            type_ = RoutineType(frozenset(inputs), frozenset(outputs), frozenset(trashes))
         elif self.scanner.consume('pointer'):
             type_ = PointerType()
         else:
@@ -302,7 +305,7 @@ class Parser(object):
     def routine(self, name):
         type_ = self.defn_type()
         if not isinstance(type_, RoutineType):
-            self.syntax_error("Can only define a routine, not %r" % type_)
+            self.syntax_error("Can only define a routine, not {}".format(repr(type_)))
         statics = []
         if self.scanner.consume('@'):
             self.scanner.check_type('integer literal')

From 21a187a1055527030bfb8d6c216221125bd1344b Mon Sep 17 00:00:00 2001
From: Chris Pressey
Date: Wed, 10 Apr 2019 09:29:45 +0100
Subject: [PATCH 07/18] Update docs.

---
 HISTORY.md |   2 +-
 README.md  | 139 ++++++++++++++++++++++++++++++++++-----------------
 TODO.md    |   8 +++
 3 files changed, 99 insertions(+), 50 deletions(-)

diff --git a/HISTORY.md b/HISTORY.md
index 554a82c..5249b86 100644
--- a/HISTORY.md
+++ b/HISTORY.md
@@ -16,7 +16,7 @@ History of SixtyPical
   can also be given.
 * Accessing a `table` through a `pointer` must be done in
   the context of a `point ... into` block. This allows the
-  analyzer to check *which* table is being modified.
+  analyzer to check *which* table is being accessed.
 * Added `--dump-exit-contexts` option to `sixtypical`.
 
 0.18
diff --git a/README.md b/README.md
index fc947c1..1981aa2 100644
--- a/README.md
+++ b/README.md
@@ -1,60 +1,25 @@
 SixtyPical
 ==========
 
-_Version 0.18. Work-in-progress, everything is subject to change._
+_Version 0.19. Work-in-progress, everything is subject to change._
 
-**SixtyPical** is a low-level programming language with advanced
-static analysis. Many of its primitive instructions resemble
-those of the 6502 CPU — in fact it is intended to be compiled to
-6502 machine code — but along with these instructions are
-constructs which ease structuring and analyzing the code. The
-language aims to fill this niche:
+**SixtyPical** is a [very low-level](#very-low-level) programming language
+supporting a [sophisticated static analysis](#sophisticated-static-analysis).
 
-* You'd use assembly, but you don't want to spend hours
-  debugging (say) a memory overrun that happened because of a
-  ridiculous silly error.
-* You'd use C or some other "high-level" language, but you don't
-  want the extra overhead added by the compiler to manage the
-  stack and registers.
-
-SixtyPical gives the programmer a coding regimen on par with assembly
-language in terms of size and hands-on-ness, but also able to catch
-many ridiculous silly errors at compile time, such as
-
-* you forgot to clear carry before adding something to the accumulator
-* a subroutine that you called trashes a register you thought it preserved
-* you tried to read or write a byte beyond the end of a byte array
-* you tried to write the address of something that was not a routine, to
-  a jump vector
-
-Many of these checks are done with _abstract interpretation_, where we
-go through the program step by step, tracking not just the changes that
-happen during a _specific_ execution of the program, but _sets_ of changes
-that could _possibly_ happen in any run of the program.
-
-SixtyPical also provides some convenient operations based on
-machine-language programming idioms, such as
-
-* copying values from one register to another (via a third register when
-  there are no underlying instructions that directly support it); this
-  includes 16-bit values, which are copied in two steps
-* explicit tail calls
-* indirect subroutine calls
-
-SixtyPical is defined by a specification document, a set of test cases,
-and a reference implementation written in Python 2. The reference
-implementation can analyze and compile SixtyPical programs to 6502 machine
-code, which can be run on several 6502-based 8-bit architectures:
-
-* Commodore 64
-* Commodore VIC-20
-* Atari 2600 VCS
-* Apple II
+Its reference compiler can generate [efficient code](#efficient-code) for
+[several 6502-based platforms](#target-platforms) while catching many
+common mistakes at compile-time, reducing the time spent in debugging.
 
 Quick Start
 -----------
 
-If you have the [VICE][] emulator installed, from this directory, you can run
+Make sure you have Python (2.7 or 3.5+) installed. Then
+clone this repository and put its `bin` directory on your
+executable search path. Then run:
+
+    sixtypical
+
+If you have the [VICE][] emulator installed, you can run
 
     ./loadngo.sh c64 eg/c64/hearts.60p
 
@@ -67,11 +32,80 @@ You can try the `loadngo.sh` script on other sources in the `eg` directory
 tree, which contains more extensive examples, including an entire
 game(-like program); see [eg/README.md](eg/README.md) for a listing.
 
-[VICE]: http://vice-emu.sourceforge.net/
+Features
+--------
+
+SixtyPical aims to fill this niche:
+
+* You'd use assembly, but you don't want to spend hours
+  debugging (say) a memory overrun that happened because of a
+  ridiculous silly error.
+* You'd use C or some other "high-level" language, but you don't
+  want the extra overhead added by the compiler to manage the
+  stack and registers.
+
+SixtyPical gives the programmer a coding regimen on par with assembly
+language in terms of size and hands-on-ness, but also able to catch
+many ridiculous silly errors at compile time.
+
+### Very low level
+
+Many of SixtyPical's primitive instructions resemble
+those of the 6502 CPU — in fact it is intended to be compiled to
+6502 machine code — but along with these instructions are
+constructs which ease structuring and analyzing the code.
+ +However, SixtyPical also does provide some "higher-level" operations +based on common 8-bit machine-language programming idioms, including + +* copying values from one register to another (via a third register when + there are no underlying instructions that directly support it) +* copying, adding, and comparing 16-bit values (done in two steps) +* explicit tail calls +* indirect subroutine calls + +While a programmer will find these constructs convenient, their +inclusion in the language is primarily to make programs easier to analyze. + +### Sophisticated static analysis + +The language defines an [effect system][], and the reference +compiler [abstractly interprets][] the input program to check that +it conforms to it. It can detect common mistakes such as + +* you forgot to clear carry before adding something to the accumulator +* a subroutine that you called trashes a register you thought it preserved +* you tried to read or write a byte beyond the end of a byte array +* you tried to write the address of something that was not a routine, to + a jump vector + +### Efficient code + +Unlike most languages, in SixtyPical the programmer must manage memory very +explicitly, selecting the registers and memory locations to store all data in. +So, unlike a C compiler such as [cc65][], a SixtyPical compiler doesn't need +to generate code to handle stack management or register spilling. This results +in smaller (and thus faster) programs. + +The flagship demo, a minigame for the Commodore 64, compiles to +a **930**-byte `.PRG` file. 
+ +### Target platforms + +The reference implementation can analyze and compile SixtyPical programs to +6502 machine code formats which can run on several 6502-based 8-bit architectures: + +* [Commodore 64][] -- examples in [eg/c64/](eg/c64/) +* [Commodore VIC-20][] -- examples in [eg/vic20/](eg/vic20/) +* [Atari 2600][] -- examples in [eg/atari2600/](eg/atari2600/) +* [Apple II series][] -- examples in [eg/apple2/](eg/apple2/) Documentation ------------- +SixtyPical is defined by a specification document, a set of test cases, +and a reference implementation written in Python. + * [Design Goals](doc/Design%20Goals.md) * [SixtyPical specification](doc/SixtyPical.md) * [SixtyPical revision history](HISTORY.md) @@ -82,3 +116,10 @@ Documentation * [6502 Opcodes used/not used in SixtyPical](doc/6502%20Opcodes.md) * [Output formats supported by `sixtypical`](doc/Output%20Formats.md) * [TODO](TODO.md) + +[VICE]: http://vice-emu.sourceforge.net/ +[cc65]: https://cc65.github.io/ +[Commodore 64]: https://en.wikipedia.org/wiki/Commodore_64 +[Commodore VIC-20]: https://en.wikipedia.org/wiki/Commodore_VIC-20 +[Atari 2600]: https://en.wikipedia.org/wiki/Atari_2600 +[Apple II series]: https://en.wikipedia.org/wiki/Apple_II_series diff --git a/TODO.md b/TODO.md index f0b1a6c..7a0913d 100644 --- a/TODO.md +++ b/TODO.md @@ -29,6 +29,14 @@ inner block has finished -- even if there is no `call`.) These holes need to be plugged. +### Reset pointer in `point into` blocks + +We have `point into` blocks, but maybe the action when entering such a +block shouldn't always be to set the given pointer to the start of the given table. + +That is, sometimes we would like to start at some fixed offset. And +sometimes we want to (re)set the pointer, without closing and starting a new block. 
+ ### Pointers associated globally with a table We have `point into` blocks, but we would also like to sometimes pass a pointer From 652ab1dc5c3f1c2de732b7ed636def763e3c0ed3 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Wed, 10 Apr 2019 10:42:50 +0100 Subject: [PATCH 08/18] Clean up ArgumentParser usage message, add --version argument. --- bin/sixtypical | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/bin/sixtypical b/bin/sixtypical index 7a86cce..42e0c65 100755 --- a/bin/sixtypical +++ b/bin/sixtypical @@ -1,10 +1,5 @@ #!/usr/bin/env python -"""Usage: sixtypical [OPTIONS] FILES - -Analyzes and compiles a Sixtypical program. -""" - from os.path import realpath, dirname, join import sys @@ -92,7 +87,7 @@ def process_input_files(filenames, options): if __name__ == '__main__': - argparser = ArgumentParser(__doc__.strip()) + argparser = ArgumentParser() argparser.add_argument( 'filenames', metavar='FILENAME', type=str, nargs='+', @@ -152,6 +147,11 @@ if __name__ == '__main__': action="store_true", help="When an error occurs, display a full Python traceback." ) + argparser.add_argument( + "--version", + action="version", + version="%(prog)s 0.19" + ) options, unknown = argparser.parse_known_args(sys.argv[1:]) remainder = ' '.join(unknown) From 7854f7170692aad5e00691c584e7a88fe0fbe45a Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Wed, 10 Apr 2019 12:02:35 +0100 Subject: [PATCH 09/18] More edits to docs. --- HISTORY.md | 4 ++++ README.md | 23 ++++++++++++++--------- eg/README.md | 6 ++++++ 3 files changed, 24 insertions(+), 9 deletions(-) diff --git a/HISTORY.md b/HISTORY.md index 5249b86..eea41ac 100644 --- a/HISTORY.md +++ b/HISTORY.md @@ -17,6 +17,10 @@ History of SixtyPical * Accessing a `table` through a `pointer` must be done in the context of a `point ... into` block. This allows the analyzer to check *which* table is being accessed. 
+* Refactored compiler internals so that type information + is stored in a single symbol table shared by all phases. +* Refactored internal data structures that represent + references and types to be immutable `namedtuple`s. * Added `--dump-exit-contexts` option to `sixtypical`. 0.18 diff --git a/README.md b/README.md index 1981aa2..921e801 100644 --- a/README.md +++ b/README.md @@ -5,9 +5,8 @@ _Version 0.19. Work-in-progress, everything is subject to change._ **SixtyPical** is a [very low-level](#very-low-level) programming language supporting a [sophisticated static analysis](#sophisticated-static-analysis). - Its reference compiler can generate [efficient code](#efficient-code) for -[several 6502-based platforms](#target-platforms) while catching many +several 6502-based [target platforms](#target-platforms) while catching many common mistakes at compile-time, reducing the time spent in debugging. Quick Start @@ -15,7 +14,7 @@ Quick Start Make sure you have Python (2.7 or 3.5+) installed. Then clone this repository and put its `bin` directory on your -executable search path. Then run: +executable search path. Then you can run: sixtypical @@ -84,8 +83,8 @@ it conforms to it. It can detect common mistakes such as Unlike most languages, in SixtyPical the programmer must manage memory very explicitly, selecting the registers and memory locations to store all data in. So, unlike a C compiler such as [cc65][], a SixtyPical compiler doesn't need -to generate code to handle stack management or register spilling. This results -in smaller (and thus faster) programs. +to generate code to handle [call stack management][] or [register allocation][]. +This results in smaller (and thus faster) programs. The flagship demo, a minigame for the Commodore 64, compiles to a **930**-byte `.PRG` file. @@ -95,10 +94,12 @@ a **930**-byte `.PRG` file. 
The reference implementation can analyze and compile SixtyPical programs to 6502 machine code formats which can run on several 6502-based 8-bit architectures: -* [Commodore 64][] -- examples in [eg/c64/](eg/c64/) -* [Commodore VIC-20][] -- examples in [eg/vic20/](eg/vic20/) -* [Atari 2600][] -- examples in [eg/atari2600/](eg/atari2600/) -* [Apple II series][] -- examples in [eg/apple2/](eg/apple2/) +* [Commodore 64][] +* [Commodore VIC-20][] +* [Atari 2600][] +* [Apple II series][] + +For example programs for each of these, see [eg/README.md](eg/README.md). Documentation ------------- @@ -117,6 +118,10 @@ and a reference implementation written in Python. * [Output formats supported by `sixtypical`](doc/Output%20Formats.md) * [TODO](TODO.md) +[effect system]: https://en.wikipedia.org/wiki/Effect_system +[abstractly interprets]: https://en.wikipedia.org/wiki/Abstract_interpretation +[call stack management]: https://en.wikipedia.org/wiki/Call_stack +[register allocation]: https://en.wikipedia.org/wiki/Register_allocation [VICE]: http://vice-emu.sourceforge.net/ [cc65]: https://cc65.github.io/ [Commodore 64]: https://en.wikipedia.org/wiki/Commodore_64 diff --git a/eg/README.md b/eg/README.md index 51c3d70..eb4be82 100644 --- a/eg/README.md +++ b/eg/README.md @@ -46,4 +46,10 @@ Atari 2600 (4K cartridge). The directory itself contains a simple demo, [smiley.60p](atari2600/smiley.60p) which was converted from an older Atari 2600 skeleton program written in [Ophis][]. +### apple2 + +In the [apple2](apple2/) directory are programs that run on +Apple II series computers (Apple II+, Apple //e). `sixtypical`'s +support for this architecture could be called embryonic. + [Ophis]: http://michaelcmartin.github.io/Ophis/ From 1ca5cb0336b6977934b7fd9c3c5eaf3e362e000f Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Wed, 10 Apr 2019 12:23:50 +0100 Subject: [PATCH 10/18] You could argue that it's not *that* low level, so, okay. 
--- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 921e801..6352595 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ SixtyPical _Version 0.19. Work-in-progress, everything is subject to change._ -**SixtyPical** is a [very low-level](#very-low-level) programming language +**SixtyPical** is a [low-level](#low-level) programming language supporting a [sophisticated static analysis](#sophisticated-static-analysis). Its reference compiler can generate [efficient code](#efficient-code) for several 6502-based [target platforms](#target-platforms) while catching many @@ -47,7 +47,7 @@ SixtyPical gives the programmer a coding regimen on par with assembly language in terms of size and hands-on-ness, but also able to catch many ridiculous silly errors at compile time. -### Very low level +### Low level Many of SixtyPical's primitive instructions resemble those of the 6502 CPU — in fact it is intended to be compiled to From c906ab7817dffd546e920c990885d6449849ed63 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Wed, 10 Apr 2019 12:29:59 +0100 Subject: [PATCH 11/18] A few further small edits to README. --- README.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index 6352595..fc9c2d4 100644 --- a/README.md +++ b/README.md @@ -49,12 +49,9 @@ many ridiculous silly errors at compile time. ### Low level -Many of SixtyPical's primitive instructions resemble -those of the 6502 CPU — in fact it is intended to be compiled to -6502 machine code — but along with these instructions are -constructs which ease structuring and analyzing the code. - -However, SixtyPical also does provide some "higher-level" operations +Many of SixtyPical's primitive instructions resemble those of the +[MOS Technology 6502][] — it is in fact intended to be compiled to 6502 +machine code. 
However, it also does provide some "higher-level" operations based on common 8-bit machine-language programming idioms, including * copying values from one register to another (via a third register when @@ -83,7 +80,7 @@ it conforms to it. It can detect common mistakes such as Unlike most languages, in SixtyPical the programmer must manage memory very explicitly, selecting the registers and memory locations to store all data in. So, unlike a C compiler such as [cc65][], a SixtyPical compiler doesn't need -to generate code to handle [call stack management][] or [register allocation][]. +to generate code to handle [calling conventions][] or [register allocation][]. This results in smaller (and thus faster) programs. The flagship demo, a minigame for the Commodore 64, compiles to @@ -118,9 +115,10 @@ and a reference implementation written in Python. * [Output formats supported by `sixtypical`](doc/Output%20Formats.md) * [TODO](TODO.md) +[MOS Technology 6502]: https://en.wikipedia.org/wiki/MOS_Technology_6502 [effect system]: https://en.wikipedia.org/wiki/Effect_system [abstractly interprets]: https://en.wikipedia.org/wiki/Abstract_interpretation -[call stack management]: https://en.wikipedia.org/wiki/Call_stack +[calling conventions]: https://en.wikipedia.org/wiki/Calling_convention [register allocation]: https://en.wikipedia.org/wiki/Register_allocation [VICE]: http://vice-emu.sourceforge.net/ [cc65]: https://cc65.github.io/ From 04a9438898ba0e43acb39b8d8ae5cb3f06e9265c Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Wed, 10 Apr 2019 16:53:01 +0100 Subject: [PATCH 12/18] The VICE emulators just keep going if they can't find the vicerc.
--- eg/rudiments/loadngo.sh | 12 ++---------- loadngo.sh | 12 ++---------- 2 files changed, 4 insertions(+), 20 deletions(-) diff --git a/eg/rudiments/loadngo.sh b/eg/rudiments/loadngo.sh index 8fdb811..9b7d98a 100755 --- a/eg/rudiments/loadngo.sh +++ b/eg/rudiments/loadngo.sh @@ -6,18 +6,10 @@ arch="$1" shift 1 if [ "X$arch" = "Xc64" ]; then output_format='c64-basic-prg' - if [ -e vicerc ]; then - emu="x64 -config vicerc" - else - emu="x64" - fi + emu="x64 -config vicerc" elif [ "X$arch" = "Xvic20" ]; then output_format='vic20-basic-prg' - if [ -e vicerc ]; then - emu="xvic -config vicerc" - else - emu="xvic" - fi + emu="xvic -config vicerc" else echo $usage && exit 1 fi diff --git a/loadngo.sh b/loadngo.sh index b2d6461..c87b359 100755 --- a/loadngo.sh +++ b/loadngo.sh @@ -6,18 +6,10 @@ arch="$1" shift 1 if [ "X$arch" = "Xc64" ]; then output_format='c64-basic-prg' - if [ -e vicerc ]; then - emu="x64 -config vicerc" - else - emu="x64" - fi + emu="x64 -config vicerc" elif [ "X$arch" = "Xvic20" ]; then output_format='vic20-basic-prg' - if [ -e vicerc ]; then - emu="xvic -config vicerc" - else - emu="xvic" - fi + emu="xvic -config vicerc" elif [ "X$arch" = "Xatari2600" ]; then output_format='atari2600-cart' emu='stella' From ce8e83908bb6c990ca160707c3468cf8ee30b7f8 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Thu, 11 Apr 2019 16:53:43 +0100 Subject: [PATCH 13/18] First cut at a --run option for sixtypical, replacing loadngo.sh. 
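The load-and-go step can be pictured as a lookup from output format to emulator command. The following is only a rough sketch of that idea, not the code in `bin/sixtypical`: the names `EMULATORS` and `emulator_command` are hypothetical, and only the format-to-emulator table is taken from this patch.

```python
# Hypothetical sketch of the --run lookup; emulator_command is not a real
# sixtypical function. The table mirrors the one this patch adds.
EMULATORS = {
    'c64-basic-prg': "x64 -config vicerc",
    'vic20-basic-prg': "xvic -config vicerc",
    'atari2600-cart': "stella",
}


def emulator_command(output_format, output_filename):
    """Return the shell command that boots output_filename in an emulator."""
    emu = EMULATORS.get(output_format)
    if not emu:
        # Formats such as 'raw' have no emulator associated with them.
        raise ValueError(
            "No emulator configured for selected --output-format '{}'".format(output_format)
        )
    return "{} {}".format(emu, output_filename)
```

In the patch itself the resulting command string is handed to `subprocess.check_call` with `shell=True`, after the compiled program has been written to a temporary file.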
--- bin/sixtypical | 50 +++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 39 insertions(+), 11 deletions(-) diff --git a/bin/sixtypical b/bin/sixtypical index 42e0c65..264c8fe 100755 --- a/bin/sixtypical +++ b/bin/sixtypical @@ -11,7 +11,9 @@ from argparse import ArgumentParser import codecs import json from pprint import pprint +from subprocess import check_call import sys +from tempfile import NamedTemporaryFile import traceback from sixtypical.parser import Parser, SymbolTable, merge_programs @@ -64,7 +66,7 @@ def process_input_files(filenames, options): compilation_roster = fa.serialize() dump(compilation_roster) - if options.analyze_only or options.output is None: + if options.analyze_only or (options.output is None and not options.run): return start_addr = None @@ -74,16 +76,36 @@ def process_input_files(filenames, options): else: start_addr = int(options.origin, 10) - with open(options.output, 'wb') as fh: - outputter = outputter_class_for(options.output_format)(fh, start_addr=start_addr) - outputter.write_prelude() - compiler = Compiler(symtab, outputter.emitter) - compiler.compile_program(program, compilation_roster=compilation_roster) - outputter.write_postlude() - if options.debug: - pprint(outputter.emitter) - else: - outputter.emitter.serialize_to(fh) + if options.run: + fh = NamedTemporaryFile(delete=False) + output_filename = fh.name + else: + fh = open(options.output, 'wb') + output_filename = options.output + + outputter = outputter_class_for(options.output_format)(fh, start_addr=start_addr) + outputter.write_prelude() + compiler = Compiler(symtab, outputter.emitter) + compiler.compile_program(program, compilation_roster=compilation_roster) + outputter.write_postlude() + if options.debug: + pprint(outputter.emitter) + else: + outputter.emitter.serialize_to(fh) + + fh.close() + + if options.run: + emu = { + 'c64-basic-prg': "x64 -config vicerc", + 'vic20-basic-prg': "xvic -config vicerc", + 'atari2600-cart': "stella" + 
}.get(options.output_format) + if not emu: + raise ValueError("No emulator configured for selected --output-format '{}'".format(options.output_format)) + + command = "{} {}".format(emu, output_filename) + check_call(command, shell=True) if __name__ == '__main__': @@ -142,6 +164,12 @@ if __name__ == '__main__': action="store_true", help="Display debugging information when analyzing and compiling." ) + argparser.add_argument( + "--run", + action="store_true", + help="Engage 'load-and-go' operation: write the output to a temporary filename, " + "infer an emulator from the given --output-format, and boot the emulator." + ) argparser.add_argument( "--traceback", action="store_true", From a44b007ff01dc62ab085a4ac2dc82dc3a6962cff Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Mon, 15 Apr 2019 13:11:43 +0100 Subject: [PATCH 14/18] Declare that --run replaces loadngo.sh, and remove the latter. --- HISTORY.md | 2 ++ README.md | 4 ++-- eg/apple2/README.md | 29 ++++++++++++++++++++++++++ loadngo.sh | 51 --------------------------------------------- 4 files changed, 33 insertions(+), 53 deletions(-) create mode 100644 eg/apple2/README.md delete mode 100755 loadngo.sh diff --git a/HISTORY.md b/HISTORY.md index eea41ac..f5c452d 100644 --- a/HISTORY.md +++ b/HISTORY.md @@ -22,6 +22,8 @@ History of SixtyPical * Refactored internal data structures that represent references and types to be immutable `namedtuple`s. * Added `--dump-exit-contexts` option to `sixtypical`. +* Added a new `--run` option to `sixtypical`, which replaces + the old `loadngo.sh` script. 0.18 ---- diff --git a/README.md b/README.md index fc9c2d4..e9bcf59 100644 --- a/README.md +++ b/README.md @@ -20,14 +20,14 @@ executable search path. 
Then you can run: If you have the [VICE][] emulator installed, you can run - ./loadngo.sh c64 eg/c64/hearts.60p + sixtypical --output-format=c64-basic-prg --run eg/c64/hearts.60p and it will compile the [hearts.60p source code](eg/c64/hearts.60p) and automatically start it in the `x64` emulator, and you should see: ![Screenshot of result of running hearts.60p](images/hearts.png?raw=true) -You can try the `loadngo.sh` script on other sources in the `eg` directory +You can try `sixtypical --run` on other sources in the `eg` directory tree, which contains more extensive examples, including an entire game(-like program); see [eg/README.md](eg/README.md) for a listing. diff --git a/eg/apple2/README.md b/eg/apple2/README.md new file mode 100644 index 0000000..77bd7d8 --- /dev/null +++ b/eg/apple2/README.md @@ -0,0 +1,29 @@ +This directory contains SixtyPical example programs +specifically for the Apple II series of computers. + +See the [README in the parent directory](../README.md) for +more information on these example programs. + +Note that `sixtypical` does not currently support "load +and go" execution of these programs, because constructing +an Apple II disk image file on the fly is not something +it can currently do. If you have the linapple sources +checked out, and the a2tools available, you could do +something like this: + + bin/sixtypical --traceback --origin=0x2000 --output-format=raw eg/apple2/prog.60p --output prog.bin + cp /path/to/linapple/res/Master.dsk sixtypical.dsk + a2rm sixtypical.dsk PROG + a2in B sixtypical.dsk PROG prog.bin + linapple -d1 sixtypical.dsk -autoboot + +and then enter + + BLOAD PROG + CALL 8192 + +Ideally you could + + BRUN PROG + +But that does not always return to BASIC and I'm not sure why. 
diff --git a/loadngo.sh b/loadngo.sh deleted file mode 100755 index c87b359..0000000 --- a/loadngo.sh +++ /dev/null @@ -1,51 +0,0 @@ -#!/bin/sh - -usage="Usage: loadngo.sh (c64|vic20|atari2600|apple2) [--dry-run] " - -arch="$1" -shift 1 -if [ "X$arch" = "Xc64" ]; then - output_format='c64-basic-prg' - emu="x64 -config vicerc" -elif [ "X$arch" = "Xvic20" ]; then - output_format='vic20-basic-prg' - emu="xvic -config vicerc" -elif [ "X$arch" = "Xatari2600" ]; then - output_format='atari2600-cart' - emu='stella' -elif [ "X$arch" = "Xapple2" ]; then - src="$1" - out=/tmp/a-out.bin - bin/sixtypical --traceback --origin=0x2000 --output-format=raw $src --output $out || exit 1 - ls -la $out - cp ~/scratchpad/linapple/res/Master.dsk sixtypical.dsk - # TODO: replace HELLO with something that does like - # BLOAD "PROG" - # CALL 8192 - # (not BRUN because it does not always return to BASIC afterwards not sure why) - a2rm sixtypical.dsk PROG - a2in B sixtypical.dsk PROG $out - linapple -d1 sixtypical.dsk -autoboot - rm -f $out sixtypical.dsk - exit 0 -else - echo $usage && exit 1 -fi - -if [ "X$1" = "X--dry-run" ]; then - shift 1 - emu='echo' -fi - -src="$1" -if [ "X$src" = "X" ]; then - echo $usage && exit 1 -fi - -### do it ### - -out=/tmp/a-out.prg -bin/sixtypical --traceback --output-format=$output_format $src --output $out || exit 1 -ls -la $out -$emu $out -rm -f $out From 6d867867fe8f29865fc8a6e2be9e75f6c666da23 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Mon, 15 Apr 2019 13:16:07 +0100 Subject: [PATCH 15/18] Remove the "local" loadngo script from here as well. --- eg/rudiments/README.md | 13 +++++-------- eg/rudiments/loadngo.sh | 28 ---------------------------- 2 files changed, 5 insertions(+), 36 deletions(-) delete mode 100755 eg/rudiments/loadngo.sh diff --git a/eg/rudiments/README.md b/eg/rudiments/README.md index 009854e..914e2cf 100644 --- a/eg/rudiments/README.md +++ b/eg/rudiments/README.md @@ -6,15 +6,12 @@ are in the `errorful/` subdirectory. 
These files are intended to be architecture-agnostic. For the ones that do produce output, an appropriate source -under `platform/`, should be included first, like +under `support/` should be included first, so that system entry +points such as `chrout` are defined. In addition, some of these +programs use "standard" support modules, so those should be included +first too. For example: - sixtypical platform/c64.60p vector-table.60p - -so that system entry points such as `chrout` are defined. - -There's a `loadngo.sh` script in this directory that does this. - - ./loadngo.sh c64 vector-table.60p + sixtypical --output-format=c64-basic-prg --run support/c64.60p support/stdlib.60p vector-table.60p `chrout` is a routine with outputs the value of the accumulator as an ASCII character, disturbing none of the other registers, diff --git a/eg/rudiments/loadngo.sh b/eg/rudiments/loadngo.sh deleted file mode 100755 index 9b7d98a..0000000 --- a/eg/rudiments/loadngo.sh +++ /dev/null @@ -1,28 +0,0 @@ -#!/bin/sh - -usage="Usage: loadngo.sh (c64|vic20) " - -arch="$1" -shift 1 -if [ "X$arch" = "Xc64" ]; then - output_format='c64-basic-prg' - emu="x64 -config vicerc" -elif [ "X$arch" = "Xvic20" ]; then - output_format='vic20-basic-prg' - emu="xvic -config vicerc" -else - echo $usage && exit 1 -fi - -src="$1" -if [ "X$src" = "X" ]; then - echo $usage && exit 1 -fi - -### do it ### - -out=/tmp/a-out.prg -../../bin/sixtypical --traceback --output-format=$output_format support/$arch.60p support/stdlib.60p $src --output $out || exit 1 -ls -la $out -$emu $out -rm -f $out From c24642493051161d7916016a2c373d683d63f779 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Mon, 15 Apr 2019 17:35:17 +0100 Subject: [PATCH 16/18] Fix bug raising InconsistentExitError, and type-bug in ribos2.60p. 
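The check that raises `InconsistentExitError` compares every exit context of a routine against the first one. A minimal sketch of that idea follows, assuming each context is reduced to three plain sets; the real analyzer works on context objects, and both `check_exit_contexts` and this simplified exception class are hypothetical stand-ins.

```python
class InconsistentExitError(Exception):
    """Simplified stand-in: carries the offending routine and a message."""

    def __init__(self, routine, message):
        super(InconsistentExitError, self).__init__(routine, message)
        self.routine = routine
        self.message = message


def check_exit_contexts(routine, exit_contexts):
    """Return the common exit context, or raise if any two exits disagree.

    Each context is assumed to be a dict holding three sets of location
    names under the keys 'meaningful', 'touched' and 'writeable'.
    """
    if not exit_contexts:
        return None
    first = exit_contexts[0]
    for ex in exit_contexts[1:]:
        for key in ('meaningful', 'touched', 'writeable'):
            if ex[key] != first[key]:
                # Pass the routine along so the error can name it.
                raise InconsistentExitError(routine, "Exit contexts are not consistent")
    return first
```

The point of passing `routine` as the first argument is that the error report can then say which routine has the mismatched exits.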
--- eg/c64/ribos/ribos2.60p | 5 ++- src/sixtypical/analyzer.py | 6 +-- tests/SixtyPical Analysis.md | 78 +++++++++++++++++++++++++++++++++++- 3 files changed, 83 insertions(+), 6 deletions(-) diff --git a/eg/c64/ribos/ribos2.60p b/eg/c64/ribos/ribos2.60p index ee86006..0f8a361 100644 --- a/eg/c64/ribos/ribos2.60p +++ b/eg/c64/ribos/ribos2.60p @@ -50,8 +50,9 @@ byte scanline : 85 // %01010101 // be practical. So we just jump to this location instead. define pla_tay_pla_tax_pla_rti routine - inputs a - trashes a + inputs border_color, vic_intr + outputs border_color, vic_intr + trashes a, z, n, c @ $EA81 // ----- Interrupt Handler ----- diff --git a/src/sixtypical/analyzer.py b/src/sixtypical/analyzer.py index 7008cdb..197b68c 100644 --- a/src/sixtypical/analyzer.py +++ b/src/sixtypical/analyzer.py @@ -481,11 +481,11 @@ class Analyzer(object): exit_writeable = set(exit_context.each_writeable()) for ex in self.exit_contexts[1:]: if set(ex.each_meaningful()) != exit_meaningful: - raise InconsistentExitError("Exit contexts are not consistent") + raise InconsistentExitError(routine, "Exit contexts are not consistent") if set(ex.each_touched()) != exit_touched: - raise InconsistentExitError("Exit contexts are not consistent") + raise InconsistentExitError(routine, "Exit contexts are not consistent") if set(ex.each_writeable()) != exit_writeable: - raise InconsistentExitError("Exit contexts are not consistent") + raise InconsistentExitError(routine, "Exit contexts are not consistent") # We now set the main context to the (consistent) exit context # so that this routine is perceived as having the same effect diff --git a/tests/SixtyPical Analysis.md b/tests/SixtyPical Analysis.md index 89ae302..040f48f 100644 --- a/tests/SixtyPical Analysis.md +++ b/tests/SixtyPical Analysis.md @@ -3921,7 +3921,83 @@ Here is like the above, but the two routines have different inputs, and that's O | } = ok -TODO: we should have a lot more test cases for the above, here. 
+Another inconsistent exit test, this one based on "real" code +(the `ribos2` demo). + + | typedef routine + | inputs border_color, vic_intr + | outputs border_color, vic_intr + | trashes a, z, n, c + | irq_handler + | + | vector irq_handler cinv @ $314 + | vector irq_handler saved_irq_vec + | byte vic_intr @ $d019 + | byte border_color @ $d020 + | + | define pla_tay_pla_tax_pla_rti routine + | inputs a + | trashes a + | @ $EA81 + | + | define our_service_routine irq_handler + | { + | ld a, vic_intr + | st a, vic_intr + | and a, 1 + | cmp a, 1 + | if not z { + | goto saved_irq_vec + | } else { + | ld a, border_color + | xor a, $ff + | st a, border_color + | goto pla_tay_pla_tax_pla_rti + | } + | } + | + | define main routine + | { + | } + ? InconsistentExitError + + | typedef routine + | inputs border_color, vic_intr + | outputs border_color, vic_intr + | trashes a, z, n, c + | irq_handler + | + | vector irq_handler cinv @ $314 + | vector irq_handler saved_irq_vec + | byte vic_intr @ $d019 + | byte border_color @ $d020 + | + | define pla_tay_pla_tax_pla_rti routine + | inputs border_color, vic_intr + | outputs border_color, vic_intr + | trashes a, z, n, c + | @ $EA81 + | + | define our_service_routine irq_handler + | { + | ld a, vic_intr + | st a, vic_intr + | and a, 1 + | cmp a, 1 + | if not z { + | goto saved_irq_vec + | } else { + | ld a, border_color + | xor a, $ff + | st a, border_color + | goto pla_tay_pla_tax_pla_rti + | } + | } + | + | define main routine + | { + | } + = ok Can't `goto` a routine that outputs or trashes more than the current routine. From a10f1c65288db36176bcb5c16680698807c758e1 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Tue, 16 Apr 2019 10:35:59 +0100 Subject: [PATCH 17/18] Replace --run option with terser --run-on option. 
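With `--run-on`, the emulator name becomes the key, and both the output format and the launch command are derived from it. A rough sketch of that lookup, under the assumption that the two tables in this patch are merged into one; `RUN_ON_TARGETS` and `resolve_run_on` are hypothetical names, not part of `bin/sixtypical`.

```python
# Hypothetical sketch: one table keyed by emulator name, giving the output
# format to compile to and the command used to launch the emulator.
RUN_ON_TARGETS = {
    'x64': ('c64-basic-prg', "x64 -config vicerc"),
    'xvic': ('vic20-basic-prg', "xvic -config vicerc"),
    'stella': ('atari2600-cart', "stella"),
}


def resolve_run_on(run_on):
    """Return (output_format, emulator_command) for the given emulator name."""
    try:
        return RUN_ON_TARGETS[run_on]
    except KeyError:
        # Report the value the user actually passed to --run-on.
        raise ValueError(
            "No emulator configured for selected --run-on '{}'".format(run_on)
        )
```

Resolving the name once, before any output is written, would also let an unknown emulator be rejected before the compiler runs.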
--- HISTORY.md | 4 ++-- README.md | 10 +++++----- bin/sixtypical | 32 +++++++++++++++++++------------- eg/rudiments/README.md | 2 +- 4 files changed, 27 insertions(+), 21 deletions(-) diff --git a/HISTORY.md b/HISTORY.md index f5c452d..ea5938b 100644 --- a/HISTORY.md +++ b/HISTORY.md @@ -22,8 +22,8 @@ History of SixtyPical * Refactored internal data structures that represent references and types to be immutable `namedtuple`s. * Added `--dump-exit-contexts` option to `sixtypical`. -* Added a new `--run` option to `sixtypical`, which replaces - the old `loadngo.sh` script. +* Added a new `--run-on=` option to `sixtypical`, which + replaces the old `loadngo.sh` script. 0.18 ---- diff --git a/README.md b/README.md index e9bcf59..12d864b 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ SixtyPical _Version 0.19. Work-in-progress, everything is subject to change._ **SixtyPical** is a [low-level](#low-level) programming language -supporting a [sophisticated static analysis](#sophisticated-static-analysis). +supporting a sophisticated [static analysis](#static-analysis). Its reference compiler can generate [efficient code](#efficient-code) for several 6502-based [target platforms](#target-platforms) while catching many common mistakes at compile-time, reducing the time spent in debugging. @@ -18,16 +18,16 @@ executable search path. 
Then you can run: sixtypical -If you have the [VICE][] emulator installed, you can run +If you have the [VICE][] emulator suite installed, you can run - sixtypical --output-format=c64-basic-prg --run eg/c64/hearts.60p + sixtypical --run-on=x64 eg/c64/hearts.60p and it will compile the [hearts.60p source code](eg/c64/hearts.60p) and automatically start it in the `x64` emulator, and you should see: ![Screenshot of result of running hearts.60p](images/hearts.png?raw=true) -You can try `sixtypical --run` on other sources in the `eg` directory +You can try `sixtypical --run-on` on other sources in the `eg` directory tree, which contains more extensive examples, including an entire game(-like program); see [eg/README.md](eg/README.md) for a listing. @@ -63,7 +63,7 @@ based on common 8-bit machine-language programming idioms, including While a programmer will find these constructs convenient, their inclusion in the language is primarily to make programs easier to analyze. -### Sophisticated static analysis +### Static analysis The language defines an [effect system][], and the reference compiler [abstractly interprets][] the input program to check that diff --git a/bin/sixtypical b/bin/sixtypical index 264c8fe..7767078 100755 --- a/bin/sixtypical +++ b/bin/sixtypical @@ -66,7 +66,7 @@ def process_input_files(filenames, options): compilation_roster = fa.serialize() dump(compilation_roster) - if options.analyze_only or (options.output is None and not options.run): + if options.analyze_only or (options.output is None and not options.run_on): return start_addr = None @@ -76,14 +76,20 @@ def process_input_files(filenames, options): else: start_addr = int(options.origin, 10) - if options.run: + if options.run_on: fh = NamedTemporaryFile(delete=False) output_filename = fh.name + Outputter = outputter_class_for({ + 'x64': 'c64-basic-prg', + 'xvic': 'vic20-basic-prg', + 'stella': 'atari2600-cart', + }.get(options.run_on)) else: fh = open(options.output, 'wb') output_filename = 
options.output + Outputter = outputter_class_for(options.output_format) - outputter = outputter_class_for(options.output_format)(fh, start_addr=start_addr) + outputter = Outputter(fh, start_addr=start_addr) outputter.write_prelude() compiler = Compiler(symtab, outputter.emitter) compiler.compile_program(program, compilation_roster=compilation_roster) @@ -95,14 +101,14 @@ def process_input_files(filenames, options): fh.close() - if options.run: + if options.run_on: emu = { - 'c64-basic-prg': "x64 -config vicerc", - 'vic20-basic-prg': "xvic -config vicerc", - 'atari2600-cart': "stella" - }.get(options.output_format) + 'x64': "x64 -config vicerc", + 'xvic': "xvic -config vicerc", + 'stella': "stella" + }.get(options.run_on) if not emu: - raise ValueError("No emulator configured for selected --output-format '{}'".format(options.output_format)) + raise ValueError("No emulator configured for selected --run-on '{}'".format(options.run_on)) command = "{} {}".format(emu, output_filename) check_call(command, shell=True) @@ -165,10 +171,10 @@ if __name__ == '__main__': help="Display debugging information when analyzing and compiling." ) argparser.add_argument( - "--run", - action="store_true", - help="Engage 'load-and-go' operation: write the output to a temporary filename, " - "infer an emulator from the given --output-format, and boot the emulator." + "--run-on", type=str, default=None, + help="If given, engage 'load-and-go' operation with the given emulator: write " + "the output to a temporary filename, using an appropriate --output-format " + "and boot the emulator with it. Options are: x64, xvic, stella." ) argparser.add_argument( "--traceback", diff --git a/eg/rudiments/README.md b/eg/rudiments/README.md index 914e2cf..a3b675d 100644 --- a/eg/rudiments/README.md +++ b/eg/rudiments/README.md @@ -11,7 +11,7 @@ points such as `chrout` are defined. In addition, some of these programs use "standard" support modules, so those should be included first too. 
For example: - sixtypical --output-format=c64-basic-prg --run support/c64.60p support/stdlib.60p vector-table.60p + sixtypical --run-on=x64 support/c64.60p support/stdlib.60p vector-table.60p `chrout` is a routine with outputs the value of the accumulator as an ASCII character, disturbing none of the other registers, From 0c65954bc57b2540a1265b3094973754954cb0e5 Mon Sep 17 00:00:00 2001 From: Chris Pressey Date: Tue, 16 Apr 2019 11:37:46 +0100 Subject: [PATCH 18/18] Updates to README. Fix awkward punctuation in usage message. --- README.md | 29 +++++++++++++++++++---------- bin/sixtypical | 2 +- 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/README.md b/README.md index 12d864b..436ef5e 100644 --- a/README.md +++ b/README.md @@ -51,7 +51,7 @@ many ridiculous silly errors at compile time. Many of SixtyPical's primitive instructions resemble those of the [MOS Technology 6502][] — it is in fact intended to be compiled to 6502 -machine code. However, it also does provide some "higher-level" operations +machine code. However, it also provides some "higher-level" operations based on common 8-bit machine-language programming idioms, including * copying values from one register to another (via a third register when @@ -65,7 +65,7 @@ inclusion in the language is primarily to make programs easier to analyze. ### Static analysis -The language defines an [effect system][], and the reference +The SixtyPical language defines an [effect system][], and the reference compiler [abstractly interprets][] the input program to check that it conforms to it. It can detect common mistakes such as @@ -77,11 +77,11 @@ it conforms to it. It can detect common mistakes such as ### Efficient code -Unlike most languages, in SixtyPical the programmer must manage memory very -explicitly, selecting the registers and memory locations to store all data in. 
-So, unlike a C compiler such as [cc65][], a SixtyPical compiler doesn't need -to generate code to handle [calling conventions][] or [register allocation][]. -This results in smaller (and thus faster) programs. +Unlike most conventional languages, in SixtyPical the programmer must manage +memory very explicitly, selecting the registers and memory locations to store +each piece of data in. So, unlike a C compiler such as [cc65][], a SixtyPical +compiler doesn't need to generate code to handle [calling conventions][] or +[register allocation][]. This results in smaller (and thus faster) programs. The flagship demo, a minigame for the Commodore 64, compiles to a **930**-byte `.PRG` file. @@ -98,19 +98,26 @@ The reference implementation can analyze and compile SixtyPical programs to For example programs for each of these, see [eg/README.md](eg/README.md). -Documentation +Specification ------------- SixtyPical is defined by a specification document, a set of test cases, and a reference implementation written in Python. -* [Design Goals](doc/Design%20Goals.md) +There are over 400 test cases, written in [Falderal][] format for readability. +In order to run the tests for compilation, [dcc6502][] needs to be installed. 
+ * [SixtyPical specification](doc/SixtyPical.md) -* [SixtyPical revision history](HISTORY.md) * [Literate test suite for SixtyPical syntax](tests/SixtyPical%20Syntax.md) * [Literate test suite for SixtyPical analysis](tests/SixtyPical%20Analysis.md) * [Literate test suite for SixtyPical compilation](tests/SixtyPical%20Compilation.md) * [Literate test suite for SixtyPical fallthru optimization](tests/SixtyPical%20Fallthru.md) + +Documentation +------------- + +* [Design Goals](doc/Design%20Goals.md) +* [SixtyPical revision history](HISTORY.md) * [6502 Opcodes used/not used in SixtyPical](doc/6502%20Opcodes.md) * [Output formats supported by `sixtypical`](doc/Output%20Formats.md) * [TODO](TODO.md) @@ -126,3 +133,5 @@ and a reference implementation written in Python. [Commodore VIC-20]: https://en.wikipedia.org/wiki/Commodore_VIC-20 [Atari 2600]: https://en.wikipedia.org/wiki/Atari_2600 [Apple II series]: https://en.wikipedia.org/wiki/Apple_II_series +[Falderal]: https://catseye.tc/node/Falderal +[dcc6502]: https://github.com/tcarmelveilleux/dcc6502 diff --git a/bin/sixtypical b/bin/sixtypical index 7767078..2777b8f 100755 --- a/bin/sixtypical +++ b/bin/sixtypical @@ -173,7 +173,7 @@ if __name__ == '__main__': argparser.add_argument( "--run-on", type=str, default=None, help="If given, engage 'load-and-go' operation with the given emulator: write " - "the output to a temporary filename, using an appropriate --output-format " + "the output to a temporary filename using an appropriate --output-format, " "and boot the emulator with it. Options are: x64, xvic, stella." ) argparser.add_argument(