mirror of
https://github.com/michaelcmartin/Ophis.git
synced 2024-12-21 12:29:46 +00:00
347 lines
14 KiB
Plaintext
347 lines
14 KiB
Plaintext
<chapter id="part1">
|
|
<title>The basics</title>
|
|
|
|
<para>
|
|
In this first part of the tutorial we will create a
|
|
simple <quote>Hello World</quote> program to run on the Commodore
|
|
64. This will cover:
|
|
|
|
<itemizedlist>
|
|
<listitem><para>How to make programs run on a Commodore 64</para></listitem>
|
|
<listitem><para>Writing simple code with labels</para></listitem>
|
|
<listitem><para>Numeric and string data</para></listitem>
|
|
<listitem><para>Invoking the assembler</para></listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<section>
|
|
<title>A note on numeric notation</title>
|
|
|
|
<para>
|
|
Throughout these tutorials, I will be using a lot of both
|
|
decimal and hexadecimal notation. Hex numbers will have a
|
|
dollar sign in front of them. Thus, 100 = $64, and $100 = 256.
|
|
</para>
|
|
</section>
|
|
|
|
<section>
|
|
<title>Producing Commodore 64 programs</title>
|
|
|
|
<para>
|
|
Commodore 64 programs are stored in
|
|
the <filename>PRG</filename> format on disk. Some emulators
|
|
(such as CCS64 or VICE) can run <filename>PRG</filename>
|
|
programs directly; others need them to be transferred to
|
|
a <filename>D64</filename> image first.
|
|
</para>
|
|
|
|
<para>
|
|
The <filename>PRG</filename> format is ludicrously simple. It
|
|
has two bytes of header data: This is a little-endian number
|
|
indicating the starting address. The rest of the file is a
|
|
single continuous chunk of data loaded into memory, starting at
|
|
that address. BASIC memory starts at memory location 2048, and
|
|
that's probably where we'll want to start.
|
|
</para>
|
|
|
|
<para>
|
|
Well, not quite. We want our program to be callable from BASIC,
|
|
so we should have a BASIC program at the start. We guess the
|
|
size of a simple one line BASIC program to be about 16 bytes.
|
|
Thus, we start our program at memory location 2064 ($0810), and
|
|
the BASIC program looks like this:
|
|
</para>
|
|
|
|
<programlisting>
|
|
10 SYS 2064
|
|
</programlisting>
|
|
|
|
<para>
|
|
We <userinput>SAVE</userinput> this program to a file, then
|
|
study it with a hex dumper. It's 15 bytes long:
|
|
</para>
|
|
|
|
<screen>
|
|
00000000 01 08 0c 08 0a 00 9e 20 32 30 36 34 00 00 00 |....... 2064...|
|
|
</screen>
|
|
|
|
<para>
|
|
The first two bytes are the memory location: $0801. The rest of
|
|
the data breaks down as follows:
|
|
</para>
|
|
|
|
<table frame="all">
|
|
<title>BASIC program breakdown</title>
|
|
<tgroup cols='3'>
|
|
<thead>
|
|
<row>
|
|
<entry align="center">File Offsets</entry>
|
|
<entry align="center">Memory Locations</entry>
|
|
<entry align="center">Value</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row><entry>0-1</entry><entry>Nowhere</entry><entry>2-byte pointer to where in memory to load the rest of the file ($0801).</entry></row>
|
|
<row><entry>2-3</entry><entry>$0801-$0802</entry><entry>2-byte pointer to the next line of BASIC code ($080C).</entry></row>
|
|
<row><entry>4-5</entry><entry>$0803-$0804</entry><entry>2-byte line number ($000A = 10).</entry></row>
|
|
<row><entry>6</entry><entry>$0805</entry><entry>Byte code for the <userinput>SYS</userinput> command.</entry></row>
|
|
<row><entry>7-11</entry><entry>$0806-$080A</entry><entry>The rest of the line, which is just the string <quote> 2064</quote>.</entry></row>
|
|
<row><entry>12</entry><entry>$080B</entry><entry>Null byte, terminating the line.</entry></row>
|
|
<row><entry>13-14</entry><entry>$080C-$080D</entry><entry>2-byte pointer to the next line of BASIC code ($0000 = end of program).</entry></row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
That's 15 bytes, of which 13 are actually loaded into memory.
|
|
We started at 2049, so we need 2 more bytes of filler to make
|
|
our code actually start at location 2064. These 17 bytes will
|
|
give us the file format and the BASIC code we need to have our
|
|
machine language program run.
|
|
</para>
|
|
|
|
<para>
|
|
These are just bytes—indistinguishable from any other sort of
|
|
data. In Ophis, bytes of data are specified with
|
|
the <literal>.byte</literal> command. We'll also have to tell
|
|
Ophis what the program counter should be, so that it knows what
|
|
values to assign to our labels. The <literal>.org</literal>
|
|
(origin) command tells Ophis this. Thus, the Ophis code for our
|
|
header and linking info is:
|
|
</para>
|
|
|
|
<programlisting>
|
|
.byte $01, $08, $0C, $08, $0A, $00, $9E, $20
|
|
.byte $32, $30, $36, $34, $00, $00, $00, $00
|
|
.byte $00, $00
|
|
.org $0810
|
|
</programlisting>
|
|
|
|
<para>
|
|
This gets the job done, but it's completely incomprehensible,
|
|
and it only uses two directives—not very good for a
|
|
tutorial. Here's a more complicated, but much clearer, way of
|
|
saying the same thing.
|
|
</para>
|
|
|
|
<programlisting>
|
|
.word $0801
|
|
.org $0801
|
|
|
|
.word next, 10 ; Next line and current line number
|
|
.byte $9e," 2064",0 ; SYS 2064
|
|
next: .word 0 ; End of program
|
|
|
|
.advance 2064
|
|
</programlisting>
|
|
|
|
<para>
|
|
This code has many advantages over the first.
|
|
|
|
<itemizedlist>
|
|
<listitem><para> It describes better what is actually
|
|
happening. The <literal>.word</literal> directive at the
|
|
beginning indicates a 16-bit value stored in the typical
|
|
65xx way (small byte first). This is followed by
|
|
an <literal>.org</literal> statement, so we let the
|
|
assembler know right away where everything is supposed to
|
|
be.
|
|
</para></listitem>
|
|
<listitem><para>
|
|
Instead of hardcoding in the value $080C, we instead use a
|
|
label to identify the location it's pointing to. Ophis
|
|
will compute the address of <literal>next</literal> and
|
|
put that value in as data. We also describe the line
|
|
number in decimal since BASIC line numbers
|
|
generally <emphasis>are</emphasis> in decimal. Labels are
|
|
defined by putting their name, then a colon, as seen in
|
|
the definition of <literal>next</literal>.
|
|
</para></listitem>
|
|
<listitem><para>
|
|
Instead of putting in the hex codes for the string part of
|
|
the BASIC code, we included the string directly. Each
|
|
character in the string becomes one byte.
|
|
</para></listitem>
|
|
<listitem><para>
|
|
Instead of adding the buffer ourselves, we
|
|
used <literal>.advance</literal>, which outputs zeros until
|
|
the specified address is reached. Attempting
|
|
to <literal>.advance</literal> backwards produces an
|
|
assemble-time error. (If we wanted to output something
|
|
besides zeros, we could add it as a second
|
|
argument: <literal>.advance 2064,$FF</literal>, for
|
|
instance.)
|
|
</para></listitem>
|
|
<listitem><para>
|
|
It has comments that explain what the data are for. The
|
|
semicolon is the comment marker; everything from a semicolon
|
|
to the end of the line is ignored.
|
|
</para></listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>
|
|
We can do better still, though. That initial starting address
|
|
of 2064 was only ever a guess; now that we know that we overshot
|
|
by two bytes, we can simply change the starting address to 2062
|
|
and omit the <literal>.advance</literal> directive entirely. In
|
|
fact, we can even remove the space before the number and make it
|
|
2061 instead—BASIC doesn't need that space in its
|
|
instruction and it's arguably a wasted byte.
|
|
</para>
|
|
</section>
|
|
|
|
<section>
|
|
<title>Related commands and options</title>
|
|
|
|
<para>
|
|
This code includes constants that are both in decimal and in
|
|
hex. It is also possible to specify constants in octal, binary,
|
|
or with an ASCII character.
|
|
|
|
<itemizedlist>
|
|
<listitem><para>To specify decimal constants, simply write the number.</para></listitem>
|
|
<listitem><para>To specify hexadecimal constants, put a $ in front.</para></listitem>
|
|
<listitem><para>To specify octal constants, put a 0 (zero) in front.</para></listitem>
|
|
<listitem><para>To specify binary constants, put a % in front.</para></listitem>
|
|
<listitem><para>To specify ASCII constants, put an apostrophe in front.</para></listitem>
|
|
</itemizedlist>
|
|
|
|
Example: 65 = $41 = 0101 = %1000001 = 'A
|
|
</para>
|
|
<para>
|
|
There are other commands besides <literal>.byte</literal>
|
|
and <literal>.word</literal> to specify data. In particular,
|
|
the <literal>.dword</literal> command specifies four-byte values
|
|
which some applications will find useful. Also, some linking
|
|
formats (such as the <filename>SID</filename> format) have
|
|
header data in big-endian (high byte first) format.
|
|
The <literal>.wordbe</literal> and <literal>.dwordbe</literal>
|
|
directives provide a way to specify multibyte constants in
|
|
big-endian formats cleanly.
|
|
</para>
|
|
</section>
|
|
|
|
<section>
|
|
<title>Writing the actual code</title>
|
|
|
|
<para>
|
|
Now that we have our header information, let's actually write
|
|
the <quote>Hello world</quote> program. It's pretty
|
|
short—a simple loop that steps through a hardcoded array
|
|
until it reaches a 0 or outputs 256 characters. It then returns
|
|
control to BASIC with an <literal>RTS</literal> statement.
|
|
</para>
|
|
|
|
<para>
|
|
Each character in the array is passed as an argument to a
|
|
subroutine at memory location $FFD2. This is part of the
|
|
Commodore 64's BIOS software, which its development
|
|
documentation calls the KERNAL. Location $FFD2 prints out the
|
|
character corresponding to the character code in the
|
|
accumulator.
|
|
</para>
|
|
|
|
<programlisting>
|
|
ldx #0
|
|
loop: lda hello, x
|
|
beq done
|
|
jsr $ffd2
|
|
inx
|
|
bne loop
|
|
done: rts
|
|
|
|
hello: .byte "HELLO, WORLD!", 0
|
|
</programlisting>
|
|
|
|
<para>
|
|
The complete, final source is available in
|
|
the <xref linkend="tutor1-src" endterm="tutor1-fname"> file.
|
|
</para>
|
|
</section>
|
|
<section>
|
|
<title>Assembling the code</title>
|
|
|
|
<para>
|
|
The Ophis assembler is a collection of Python modules,
|
|
controlled by a master script. On Windows, this should all
|
|
have been combined into an executable
|
|
file <command>ophis.exe</command>; on other platforms, the
|
|
Ophis modules should be in the library and
|
|
the <command>ophis</command> script should be in your path.
|
|
Typing <command>ophis</command> with no arguments should give a
|
|
summary of available command line options.
|
|
</para>
|
|
|
|
<para>
|
|
Ophis takes a list of source files and produces an output file
|
|
based on assembling each file you give it, in order. You can add
|
|
a line to your program like this to name the output file:
|
|
</para>
|
|
|
|
<programlisting>
|
|
.outfile "hello.prg"
|
|
</programlisting>
|
|
|
|
<para>
|
|
Alternately, you can use the <option>-o</option> option on the
|
|
command line. This will override any <literal>.outfile</literal>
|
|
directives. If you don't specify any name, it will put the
|
|
output into a file named <filename>ophis.bin</filename>.
|
|
</para>
|
|
|
|
<para>
|
|
If you are using Ophis as part of some larger toolchain, you can
|
|
also make it run in <emphasis>pipe mode</emphasis>. If you give
|
|
a dash <option>-</option> as an input file or as the output
|
|
target, Ophis will use standard input or output for
|
|
communication.
|
|
</para>
|
|
|
|
<table frame="all">
|
|
<title>Ophis Options</title>
|
|
<tgroup cols='2'>
|
|
<thead>
|
|
<row>
|
|
<entry align="center">Option</entry>
|
|
<entry align="center">Effect</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row><entry><option>-o FILE</option></entry><entry>Overrides the default filename for output.</entry></row>
|
|
<row><entry><option>-l FILE</option></entry><entry>Specifies an optional listing file that gives the emitted binary in human-readable form, with disassembly.</entry></row>
|
|
<row><entry><option>-m FILE</option></entry><entry>Specifies an optional map file that gives the in-source names for every label used in the program.</entry></row>
|
|
<row><entry><option>-u</option></entry><entry>Allows the 6510 undocumented opcodes as listed in the VICE documentation.</entry></row>
|
|
<row><entry><option>-c</option></entry><entry>Allows opcodes and addressing modes added by the 65C02.</entry></row>
|
|
<row><entry><option>-4</option></entry><entry>Allows opcodes and addressing modes added by the 4502. (Experimental.)</entry></row>
|
|
<row><entry><option>-q</option></entry><entry>Quiet operation. Only reports warnings and errors.</entry></row>
|
|
<row><entry><option>-v</option></entry><entry>Verbose operation. Reports files as they are loaded.</entry></row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
The only options Ophis demands are an input file and an output
|
|
file. Here's a sample session, assembling the tutorial file
|
|
here:
|
|
</para>
|
|
<screen>
|
|
localhost$ ophis -v hello1.oph
|
|
Loading hello1.oph
|
|
Assembly complete: 45 bytes output (14 code, 29 data, 2 filler)
|
|
</screen>
|
|
<para>
|
|
This will produce a file
|
|
named <filename>hello.prg</filename>. If your emulator can
|
|
run <filename>PRG</filename> files directly, this file will now
|
|
run (and print <computeroutput>HELLO, WORLD!</computeroutput>)
|
|
as many times as you type <userinput>RUN</userinput>.
|
|
Otherwise, use a <filename>D64</filename> management utility to
|
|
put the <filename>PRG</filename> on a <filename>D64</filename>,
|
|
then load and run the file off that. If you have access to a
|
|
device like the 1541 Ultimate II, you can even load the file
|
|
directly into the actual hardware.
|
|
</para>
|
|
</section>
|
|
</chapter>
|