Ophis Command Reference
Command Modes
These mostly follow the MOS Technology 6500
Microprocessor Family Programming Manual, except
for the Accumulator mode. Accumulator instructions are written
and interpreted identically to Implied mode instructions.
Implied: RTS
Accumulator: LSR
Immediate: LDA #$06
Zero Page: LDA $7C
Zero Page, X: LDA $7C,X
Zero Page, Y: LDA $7C,Y
Absolute: LDA $D020
Absolute, X: LDA $D000,X
Absolute, Y: LDA $D000,Y
(Zero Page Indirect, X): LDA ($80, X)
(Zero Page Indirect), Y: LDA ($80), Y
(Absolute Indirect): JMP ($A000)
Relative: BNE loop
(Absolute Indirect, X): JMP ($A000, X) - Only available with 65C02 extensions
(Zero Page Indirect): LDA ($80) - Only available with 65C02 extensions
Basic arguments
Most arguments are just a number or label. The formats for
these are below.
Numeric types
Hex: $41 (Prefixed with $)
Decimal: 65 (No markings)
Octal: 0101 (Prefixed with zero)
Binary: %01000001 (Prefixed with %)
Character: 'A (Prefixed with single quote)
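As a quick illustration of the list above, the following sketch writes the same value five ways. It assumes nothing beyond the .byte directive described later in this reference.

```asm
; Each operand below is the value 65 ($41), written in a different notation.
.byte $41        ; hex
.byte 65         ; decimal
.byte 0101       ; octal (note the leading zero)
.byte %01000001  ; binary
.byte 'A         ; character
```

One consequence worth watching for: because a leading zero marks octal, a decimal constant padded with zeroes (say, 010) is read as octal and will not have the value you expect.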
Label types
Normal labels are simply referred to by name. Anonymous
labels may be referenced with strings of - or + signs (the
label - refers to the immediately
preceding anonymous label, -- the
one before that, etc., while +
refers to the next anonymous label), and the special
label ^ refers to the program
counter at the start of the current instruction or directive.
Normal labels are defined by
prefixing a line with the label name and then a colon
(e.g., label:). Anonymous labels
are defined by prefixing a line with an asterisk
(e.g., *).
Temporary labels are only reachable from inside the
innermost enclosing .scope
statement. They are identical to normal labels in every
way, except that they start with an underscore.
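The label kinds described above can be sketched together as follows. This is only an illustration: the names print, message, delay_x, and _loop are made up for the example, and $FFD2 is assumed to be a character-output routine (it is CHROUT on the Commodore 64).

```asm
print:   ldx #$00
*        lda message,x   ; first anonymous label: top of the loop
         beq +           ; forward to the next anonymous label
         jsr $FFD2       ; assumed output routine
         inx
         bne -           ; back to the most recent anonymous label
*        rts             ; second anonymous label: target of the + above
message: .byte "HELLO", 0

.scope
delay_x: ldx #$10        ; normal label: visible everywhere
_loop:   dex             ; temporary label: visible only in this scope
         bne _loop
         rts
.scend
```

A second .scope block elsewhere could define its own _loop without clashing with this one.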
String types
Strings are enclosed in double quotation marks. Backslashed
characters (including backslashes and double quotes) are
treated literally, so the string "The man said,
\"The \\ character is the backslash.\"" produces
the ASCII sequence for The man said, "The \
character is the backslash."
Strings are generally only used as arguments to assembler
directives—usually for filenames
(e.g., .include) but also for string
data (in association with .byte).
It is legal, though unusual, to pass a string to
the other data statements. This produces a series of
words/dwords in which every byte other than the least-significant
one is zero. Endianness and size match what the directive
itself indicates.
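A short sketch of what this means in practice. The byte sequences in the comments are derived from the description above (ASCII codes, with zero padding for the wider directives) rather than copied from assembler output.

```asm
.byte "HI", 0          ; $48 $49 $00
.byte "Say \"hi\""     ; backslash-escaped quotes inside a string
.word "HI"             ; $48 $00 $49 $00  (one little-endian word per character)
.wordbe "HI"           ; $00 $48 $00 $49  (one big-endian word per character)
```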
Compound Arguments
Compound arguments may be built up from simple ones, using the
standard +, -, *, and / operators, which carry the usual
precedence. Also, the unary operators > and <, which
bind more tightly than anything else, provide the high and low
bytes of 16-bit values, respectively.
Use brackets [ ] instead of parentheses ( ) when grouping
arithmetic operations, as the parentheses are needed for the
indirect addressing modes.
Examples:
$D000 evaluates to $D000
$D000+32 evaluates to $D020
$D000+$20 also evaluates to $D020
<$D000+32 evaluates to $20
>$D000+32 evaluates to $F0
>[$D000+32] evaluates to $D0
>[$D000-275] evaluates to $CE
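In real code these operators mostly show up when loading an address one byte at a time. The sketch below assumes a Commodore 64 layout (screen RAM at $0400, 40 columns per row, and $FB/$FC free in the zero page); the names screen and row are invented for the example.

```asm
.alias screen $0400           ; assumed base of screen RAM
.alias row    40              ; assumed columns per row

        lda #<[screen+row*2]  ; low byte of $0450, i.e. $50
        sta $FB
        lda #>[screen+row*2]  ; high byte of $0450, i.e. $04
        sta $FC
```

Without the brackets, >screen+row*2 would take the high byte of screen first ($04) and then add 80, giving $54, which is almost never what is wanted.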
Memory Model
In order to properly compute the locations of labels and the
like, Ophis must keep track of where assembled code will
actually be sitting in memory, and it strives to do this in a
way that is independent both of the target file and of the
target machine.
Basic PC tracking
The primary technique Ophis uses is program counter
tracking. As it assembles the code, it keeps
track of a virtual program counter, and uses that to
determine where the labels should go.
In the absence of an .org directive, it
assumes a starting PC of zero. .org
is a simple directive, setting the PC to the value
that .org specifies. In the simplest
case, one .org directive appears at the
beginning of the code and sets the location for the rest of
the code, which is one contiguous block.
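For instance, in the sketch below the label values follow directly from the .org and the instruction sizes. The address $C000 and the label names are arbitrary choices for illustration.

```asm
.org $C000          ; assemble as if the code will sit at $C000
start:  lda #$00    ; start = $C000 (LDA immediate takes 2 bytes)
        sta $D020   ; assembled at $C002 (STA absolute takes 3 bytes)
done:   rts         ; done = $C005
```

Remove the .org line and the same labels become $0000 and $0005.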
Basic Segmentation Simulation
However, this isn't always practical. Often one wishes to
have a region of memory reserved for data without actually
mapping that memory to the file. On some systems (typically
cartridge-based systems where ROM and RAM are separate, and
the target file only specifies the ROM image) this is
mandatory. In order to access these variables symbolically,
it's necessary to put the values into the label lookup
table.
It is possible, but inconvenient, to do this
with .alias, assigning a specific
memory location to each variable. This requires careful
coordination through your code, and makes creating reusable
libraries all but impossible.
A better approach is to reserve a section at the beginning
or end of your program, put an .org
directive in, then use the .space
directive to divide up the data area. This is still a bit
inconvenient, though, because all variables must be
declared in one place. What we'd really like is to keep
multiple program counters, one for data and one for code.
The .text
and .data directives do this. Each
has its own PC that starts at zero, and you can switch
between the two at any point without corrupting the other's
counter. In this way each function can have
a .data section (filled
with .space commands) and
a .text section (that contains the
actual code). This lets our library routines be almost
completely self-contained - we can have one source file
that could be .included by multiple
projects without getting in anything's way.
However, any given program may have its own ideas about
where data and code go, and it's good to ensure with
a .checkpc at the end of your code
that you haven't accidentally overwritten code with data or
vice versa. If your .data
segment did start at zero, it's
probably wise to make sure you aren't smashing the stack,
too (which is sitting in the region from $0100 to
$01FF).
If you write code with no segment-defining statements in
it, the default segment
is text.
The data segment is designed only
for organizing labels. As such, errors will be flagged if
you attempt to actually output information into
a data segment.
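A minimal sketch of the pattern just described, with each routine declaring its own variables next to its code. The addresses $0200 and $0800 and all label names are assumptions made for the example, not requirements.

```asm
.data
.org $0200            ; hypothetical start of the variable area
.text
.org $0800            ; hypothetical load address for the code

; --- first routine ---
.data
.space counter 1      ; counter = $0200
.text
bump:   inc counter
        rts

; --- second routine ---
.data
.space total 2        ; total = $0201, right after counter
.text
add:    clc
        lda total
        adc counter
        sta total
        bcc +
        inc total+1   ; carry into the high byte
*       rts

.data
.checkpc $0800        ; the variables must end before the code begins
```

Because each segment keeps its own counter, switching back to .data for the second routine picks up at $0201 rather than wherever the code left off.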
General Segmentation Simulation
One text and data segment each is usually sufficient, but
for the cases where it is not, Ophis allows for user-defined
segments. Putting a label
after .text
or .data produces a new segment with
the specified name.
Say, for example, that we have access to the RAM at the low
end of the address space, but want to reserve the zero page
for truly critical variables, and use the rest of RAM for
everything else. Let's also assume that this is a 6510
chip, and locations $00 and $01 are reserved for the I/O
port. We could start our program off with:
.data
.org $200
.data zp
.org $2
.text
.org $800
And, to be safe, we would probably want to end our code
with checks to make sure we aren't overwriting anything:
.data
.checkpc $800
.data zp
.checkpc $100
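Between that setup and the closing checks, variables can then be declared in whichever segment suits them. The following sketch assumes the three .org lines shown earlier and that these are the first .space directives in each segment; ptr and buffer are invented names.

```asm
.data zp
.space ptr 2          ; zero page: ptr = $02
.data
.space buffer 256     ; general RAM: buffer = $0200
.text
        lda #<buffer
        sta ptr
        lda #>buffer
        sta ptr+1
        ldy #$00
        lda (ptr),y   ; pointers used this way must live in the zero page
```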
Macros
Assembly language is a powerful tool; however, there are
many tasks that need to be done repeatedly, and with
mind-numbing minor modifications. Ophis includes a macro
facility to automate these. Ophis macros
are very similar in form to function calls in higher level
languages.
Defining Macros
Macros are defined with the .macro
and .macend commands. Here's a
simple one that will clear the screen on a Commodore
64:
.macro clr'screen
lda #147
jsr $FFD2
.macend
Invoking Macros
To invoke a macro, either use
the .invoke command or backquote the
name of the routine. The previous macro may be expanded
at any point in the source in either of two
equivalent ways:
.invoke clr'screen
or
`clr'screen
Passing Arguments to Macros
Macros may take arguments. The arguments to a macro are
all of the word
type, though byte values may
be passed and used as bytes as well. The first argument in
an invocation is bound to the label
_1, the second
to _2, and so on. Here's a macro
for storing a 16-bit value into a word pointer:
.macro store16 ; `store16 dest, src
lda #<_2
sta _1
lda #>_2
sta _1+1
.macend
Macro arguments behave, for the most part, as if they were
defined by .alias
commands in the calling context.
(They differ in that they will not produce duplicate-label
errors if those names already exist in the calling scope,
and in that they disappear after the call is
completed.)
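As a sketch of how an invocation plays out, assume the store16 macro above has already been defined. The names ptr and table, and the choice of $FB, are made up for the example.

```asm
.alias ptr $FB                  ; hypothetical zero-page pointer

        `store16 ptr, table     ; short form
        .invoke store16 ptr, table   ; long form, same expansion
        rts
table:  .byte 1, 2, 3

; Each invocation behaves as if the following had been written in place,
; with _1 bound to ptr and _2 bound to table:
;       lda #<table
;       sta ptr
;       lda #>table
;       sta ptr+1
```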
Features and Restrictions of the Ophis Macro Model
Unlike most macro systems (which do textual replacement),
Ophis macros evaluate their arguments and bind them into the
symbol table as temporary labels. This produces some
benefits, but it also puts some restrictions on what kinds of
macros may be defined.
The primary benefit of this expand-via-binding
discipline is that there are no surprises in the semantics.
The expression _1+1 in the macro above
will always evaluate to one more than the value that was
passed as the first argument, even if that first argument is
some immensely complex expression that an
expand-via-substitution method may accidentally
mangle.
The primary disadvantage of the expand-via-binding
discipline is that only fixed numbers of words and bytes
may be passed. A substitution-based system could define a
macro including the line LDA _1 and
accept as arguments both $C000
(which would put the value of memory location $C000 into
the accumulator) and #$40 (which
would put the immediate value $40 into the accumulator).
If you really need this kind of
behavior, run a C preprocessor over your Ophis source,
and use #define to your heart's
content.
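Within Ophis itself, one way to live with this restriction is to write one macro per addressing mode. The sketch below pairs the store16 macro from earlier (which treats its second argument as a constant) with a hypothetical copy16 macro that treats it as an address; neither the name nor the approach is prescribed by Ophis.

```asm
.macro copy16        ; `copy16 dest, src  (hypothetical companion macro)
        lda _2       ; low byte stored at address src
        sta _1
        lda _2+1     ; high byte stored at address src+1
        sta _1+1
.macend

        `store16 $FB, $C000   ; $FB/$FC receive the constant $C000
        `copy16  $FB, $C000   ; $FB/$FC receive the bytes held at $C000/$C001
```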
Assembler directives
Assembler directives are instructions to the assembler
itself rather than processor instructions. Ophis's set of directives
follows.
.advance address:
Forces the program counter to
be address. Unlike
the .org
directive, .advance outputs zeroes until the
program counter reaches the specified address. Attempting
to .advance to a point behind the current
program counter is an assemble-time error.
.alias label value: The
.alias directive assigns an arbitrary value to a label. This
value may be an arbitrary argument, but cannot reference any
label that has not already been defined (this prevents
recursive label dependencies).
.byte arg [ , arg, ... ]:
Specifies a series of arguments, which are evaluated, and
strings, which are included as raw ASCII data. The final
result of each evaluated argument must fit in one byte. Separate
constants are separated by commas.
.checkpc address: Ensures that the
program counter is less than or equal to the address
specified, and emits an assemble-time error if it is not.
This produces no code in the final binary - it is there to
ensure that linking a large amount of data together does not
overstep memory boundaries.
.data [label]: Sets the segment to
the segment name specified and disallows output. If no label
is given, switches to the default data segment.
.incbin filename: Inserts the
contents of the file specified as binary data. Use it to
include graphics information, precompiled code, or other
non-assembler data.
.include filename: Includes the
entirety of the file specified at that point in the program.
Use this to order your final sources.
.org address: Sets the program
counter to the address specified. This does not emit any
code in and of itself, nor does it overwrite anything that
previously existed. If you wish to jump ahead in memory,
use .advance.
.require filename: Includes the entirety
of the file specified at that point in the program. Unlike .include,
however, code included with .require will only be inserted once.
The .require directive is useful for ensuring that certain code libraries
are somewhere in the final binary. It is also very useful for guaranteeing that
macro libraries are available.
.space label size: This
directive is used to organize global variables. It defines the
label specified to be at the current location of the program
counter, and then advances the program counter size
steps ahead. No actual code is produced. This is equivalent
to label: .org ^+size.
.text [label]: Sets the segment to
the segment name specified and allows output. If no label is
given, switches to the default text segment.
.word arg [ , arg, ... ]:
Like .byte, but values are all treated as two-byte
values and stored low byte first (as is the 6502's wont). Use
this to create jump tables (an unadorned label will evaluate
to that label's location) or otherwise store 16-bit
data.
.dword arg [ , arg, ...]:
Like .word, but for 32-bit values.
.wordbe arg [ , arg, ...]:
Like .word, but stores the value in a big-endian format (high byte first).
.dwordbe arg [ , arg, ...]:
Like .dword, but stores the value high byte first.
.scope: Starts a new scope block. Labels
that begin with an underscore are only reachable from within
their innermost enclosing .scope statement.
.scend: Ends a scope block. Makes the
temporary labels defined since the last .scope
statement unreachable, and permits them to be redefined in a
new scope.
.macro name: Begins a macro
definition block. This is a scope block that can be inlined
at arbitrary points with .invoke. Arguments to the
macro will be bound to temporary labels with names like
_1, _2, etc.
.macend: Ends a macro definition
block.
.invoke label [argument [,
argument ...]]: Invokes (inlines) the specified
macro, binding the values of the arguments to the labels
(_1, _2, etc.) that the macro definition reads. As a shorthand
for .invoke, write the name of the macro prefixed with a backquote.
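Several of the directives above can be seen working together in the following sketch of a 32 KB ROM image occupying $8000 through $FFFF. The layout and label names are assumptions chosen for illustration, and the byte sequences in the comments are derived from the descriptions above rather than from assembler output.

```asm
.org $8000
table:  .byte 1, 2, 3          ; $01 $02 $03
        .word $1234            ; $34 $12
        .wordbe $1234          ; $12 $34
        .dword $12345678       ; $78 $56 $34 $12
        .dwordbe $12345678     ; $12 $34 $56 $78
jumps:  .word reset, reset     ; unadorned labels evaluate to their addresses
reset:  jmp ^                  ; ^ is this instruction's own address: loop forever

        .checkpc $FFFA         ; error out if the code above ran past $FFFA
        .advance $FFFA         ; zero-fill up to the 6502 vector table
        .word reset, reset, reset   ; NMI, RESET, and IRQ vectors
```

Here .advance is what pads the file, so the vectors land at the right offset in the output; an .org at the same spot would move the program counter without emitting the padding.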