An informal introduction to code relocation and multi-file linking
using Macross and Slinky

To use:

First, let me explain that if you don't want to use the linker and assemble
your programs in pieces, you need do nothing. Macross will continue to
operate for you just as it always has (except that it will be somewhat faster
since it no longer will be keeping track of some stuff internally that only
matters to the linker). If you're not interested, stop reading this right now.

There is a new command line option for Macross, '-c'. This signals to the
assembler that, rather than generating a plain old ordinary object file, you
want it to produce a special linkable object file that contains all kinds of
information that the linker will need later. (I know the letter is obscure, but
I'm running out of letters, and the C compiler uses this flag to signal the
same thing. Maybe it should be '-m', or would that be even more obscure?) E.g.:

    macross -c -o foo foo.m

To link object files together, you use Slinky. The command is

    slinky file1 file2 file3 ...

where file1, file2, etc. were generated by Macross using the '-c' option
described in the previous paragraph. By default the output will go in the
file 's.out', but the aesthetically enlightened will use Slinky's '-o' option,
which names the output file whatever you want. E.g.:

    slinky -o foobar file1 file2 file3 ...

will name the output 'foobar' (just like the '-o' option to Macross). The
output file from Slinky will be a standard a65-style object file suitable for
downloading to your Atari or whatever. *Don't* try to directly download an
unlinked object that was produced by Macross using '-c'. Doing so will make
the downloader choke, puke and die (actually, I haven't tried this, but in any
case it will be wrong and the result will most likely be ugly).
Simple, no? Actually no, because...
You need to write your Macross programs in a way that can allow the various
files to be assembled separately. This requires an understanding of two
important concepts: 1. relocation and 2. external symbols.
A Macross relocatable object file consists of a bunch of little pieces of
object code (called 'segments') plus some other bookkeeping info. Each of
these little pieces of object code is either 'absolute' or 'relocatable'.
Being absolute means that a segment is given a fixed, pre-specified location
in memory. Being relocatable means that the segment can go wherever the
linker finds room to put it.
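
As a rough illustration of the difference, here is a minimal Python sketch of
how a linker might place segments. It is not Slinky's actual algorithm: the
'place' function, the first-fit strategy and the 64K address space are all
assumptions made up for this example, and it does not check absolute segments
for overlap. Absolute segments keep the address they ask for; relocatable ones
go into the first gap that is big enough.

```python
# Illustrative model only: not Slinky's real placement algorithm.
MEMORY_SIZE = 0x10000  # assumed 64K address space


def place(segments):
    """segments: list of (name, size, address-or-None).
    An address of None means the segment is relocatable.
    Returns {name: start_address}."""
    placed = {}
    used = []  # occupied (start, end) ranges, end exclusive

    # Absolute segments go exactly where they were told to go.
    for name, size, addr in segments:
        if addr is not None:
            placed[name] = addr
            used.append((addr, addr + size))

    # Relocatable segments go in the first gap that is big enough.
    for name, size, addr in segments:
        if addr is None:
            start = 0
            for lo, hi in sorted(used):
                if start + size <= lo:
                    break  # fits in the gap before this occupied range
                start = max(start, hi)
            if start + size > MEMORY_SIZE:
                raise MemoryError("no room for segment " + name)
            placed[name] = start
            used.append((start, start + size))
    return placed


layout = place([
    ("a", 0x20, None),      # relocatable
    ("b", 0x100, 0x0000),   # absolute at 0x0000
    ("c", 0x30, None),      # relocatable, right after "a" in the source
    ("d", 0x10, 0x0120),    # absolute at 0x0120
])
```

In this made-up layout the relocatable segments "a" and "c" are neighbors in
the input, yet the absolute segment "d" ends up between them in memory.
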
All right, you ask, how do my pieces of code get to be absolute or
relocatable? Well, at any given time, Macross is assembling in either
absolute-mode or relocatable-mode (it starts out in relocatable-mode (always
did, bet you didn't even notice!)). It gets into absolute-mode using the
'org' statement with an absolute value as the address to org to, e.g.,

    org 0x1000

This starts an absolute segment at location 0x1000. The segment continues
until the next 'org' or 'rel' statement. Macross gets into relocatable-mode
via the 'rel' statement:

    rel

or by an 'org' statement with a relocatable value as the address to org to
(actually, using an org in this way is, as of the current implementation,
somewhat questionable). The relocatable segment continues until the next
'org' kicks Macross out of relocatable-mode or you use a 'constrain' or
'align' statement. Each of the latter two (in relocatable-mode but not in
absolute-mode) starts a new relocatable segment which is aligned or
constrained appropriately at link-time. Also, a new relocatable segment
starts at the end of the block that is the argument of a 'constrain'
statement (again, in relocatable mode only).
It is important for you to know where (in your source code) relocatable
segments begin and end, because each segment is relocated by the linker
independently of all the others. Even though two segments might be adjacent
in your Macross source, in the eventual linked object they might not be.
Consequently, relative branches across segment boundaries may go out of range,
and you cannot expect the flow of program execution to be continuous from one
segment to another, i.e., you cannot expect that the last instruction in a
segment will be followed immediately by the first instruction of the segment
that follows it in the source. For example, in the following:

    ...stuff...
    and 123
    constrain (0x100) {
        sta foobar
        ...more stuff...

you can't assume that the 'sta' will follow the 'and'. So

RULE #1 -- Don't allow program flow of control to fall through from one
segment to the next.

RULE #2 -- Don't do relative branches across segment boundaries ('jmp's and
'jsr's are OK).

COROLLARY -- You can't put an 'align' or 'constrain' statement inside an 'if',
'while', 'do while' or 'do until' statement, and putting one inside a macro is
very likely to result in a weird program bug unless you really understand what
you are doing.

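
To see why Rule #2 matters, consider a short Python sketch. It assumes a
6502-style relative branch: a two-byte instruction whose operand is a signed
8-bit offset measured from the address of the following instruction. The
function name 'branch_offset' and the addresses are invented for illustration.

```python
def branch_offset(branch_addr, target_addr):
    """Return the one-byte operand for a relative branch, or raise
    ValueError if the target is out of range. Assumes a two-byte branch
    whose offset is measured from the address of the next instruction."""
    offset = target_addr - (branch_addr + 2)
    if not -128 <= offset <= 127:
        raise ValueError("branch out of range: offset %d" % offset)
    return offset & 0xFF  # two's-complement byte


# Within one segment the distance is fixed at assembly time.
near = branch_offset(0x1000, 0x1010)

# Across segments the linker may put the target segment anywhere,
# for instance far out of reach of a one-byte offset.
try:
    branch_offset(0x1000, 0x4000)
    far_ok = True
except ValueError:
    far_ok = False
```

A 'jmp' or 'jsr' carries a full two-byte address, so it has no such range
limit, which is why those are OK across segment boundaries.
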
As with segments, a symbol in your Macross program is either absolute or
relocatable. The value of an absolute symbol (called an 'absolute value') is
a location in an absolute segment (or simply a fixed number like 42 or 137).
A relocatable symbol has a value (called a 'relocatable value') which is a
location in a relocatable segment. The important point to note is that the
value of a relocatable symbol is not known until the program is linked and the
relocatable segment to which it refers is given an actual location in memory.
This in turn means that Macross can't do any assembly-time arithmetic with the
symbol since it doesn't know what value to compute with. Since storing away
whole expressions in the object file would be both costly and messy we don't
even try. The only arithmetic operation that the linker knows how to do is
simple addition. Thus, the only operation you can perform with a relocatable
symbol or value is simple addition, and then only in contexts where the result
of the computation will get stored in the resultant object somewhere, such as
the argument to an instruction or a 'byte' or 'word' statement. Thus the
following are OK, for example (let's say 'foo' and 'bar' are relocatable
symbols):

    and foo+3
    ora bar-10      ; OK since this is just adding -10
    word foo+bar

but these are not:

    and foo*3       ; not addition
    ora 10-bar      ; can't subtract 'bar'
    lda (bar+10)/7  ; even though 'bar' gets added, the result of
                    ; the addition is needed for the division

So,
RULE #3 -- No arithmetic more complicated than simple addition is allowed with
relocatable symbols or values.
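
One way to picture Rule #3 is to imagine the object file recording each
relocatable operand as "these symbols, plus this constant". The Python sketch
below models that idea. The 'Reloc' class and its layout are hypothetical and
are not the actual Macross object format; the point is only that addition
keeps a value in that form while other operations do not.

```python
class Reloc:
    """A value of the form sym1 + sym2 + ... + constant (hypothetical model)."""

    def __init__(self, symbols=(), constant=0):
        self.symbols = tuple(symbols)  # symbols whose addresses get added in
        self.constant = constant       # the part known at assembly time

    def __add__(self, other):
        if isinstance(other, int):
            return Reloc(self.symbols, self.constant + other)
        if isinstance(other, Reloc):
            return Reloc(self.symbols + other.symbols,
                         self.constant + other.constant)
        return NotImplemented

    __radd__ = __add__  # 3 + foo is the same as foo + 3

    def __sub__(self, other):
        # foo - 10 is just foo + (-10); subtracting a symbol is not allowed.
        if isinstance(other, int):
            return self + (-other)
        return NotImplemented

    def resolve(self, addresses):
        """What a linker could do once it knows where everything went."""
        return sum(addresses[s] for s in self.symbols) + self.constant


foo, bar = Reloc(["foo"]), Reloc(["bar"])
addresses = {"foo": 0x2000, "bar": 0x3000}  # made-up link-time addresses

# The three allowed forms from the examples above.
values = [e.resolve(addresses) for e in (foo + 3, bar - 10, foo + bar)]

# The three disallowed forms all fail, because no such operation exists.
rejected = []
for bad in (lambda: foo * 3, lambda: 10 - bar, lambda: (bar + 10) / 7):
    try:
        bad()
    except TypeError:
        rejected.append(True)
```
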
Now to explain external symbols...
First of all, you need to understand about symbols being *defined*. When we
say that a symbol is 'defined' we mean that it has a value that the assembler
or linker can use. A symbol gets defined by the Macross 'define' statement or
by being used as a label. In order to actually use a symbol, e.g. as part of
the operand of an instruction, the symbol must be defined *somewhere*.
Now let's say you have a subroutine that is defined in one file (actually, the
label which is associated with the entry point to the subroutine is defined in
that file, but let's not quibble), but which is called (using a 'jsr') from a
second file. Two pieces of information need to be given to Macross when it
assembles these files. In the first file, where the subroutine label is
defined, you need to tell Macross, "Hey, this label is going to be used
outside of this file, so put something in the object file that will tell the
linker that it's here." In the second file, where the label is used (but not
defined), you need to tell Macross, "Yes, I know this symbol isn't defined
here. Don't worry about it. It's defined elsewhere and the linker will worry
about it." Both of these pieces of information are conveyed by declaring the
symbol to be external using the Macross 'extern' statement. E.g., in the
first file:

        extern foo
        ...stuff...
    foo: ...more stuff...
        rts

and in the second file:

        extern foo
        ...stuff...
        jsr foo
        ...more stuff...

In addition to this, a shorthand form of declaring a label to be external when
it is defined is supported: you simply use two colons instead of one. In our
example above, then, the first file could be:

        ...stuff...
    foo:: ...more stuff...
        rts

and the result would be exactly the same. So,
RULE #4 -- If you want a symbol's value to be carried across multiple files,
that symbol MUST be declared external in the file where it is defined and in
all files in which it is used.
Note that, as with relocatable symbols, symbols which are defined externally
(that is, declared external in this file but defined in another) do not have a
value which is known to Macross at assembly-time. Symbols which are external
but which are defined in the file being assembled do have such a value,
presuming that they are not also relocatable. This means that the same
restrictions about arithmetic on relocatable symbols apply to externally
defined symbols:
RULE #3a -- No arithmetic more complicated than simple addition is allowed
with externally defined symbols.
Not to belabor the obvious, but it is of course an error to define an external
symbol in more than one file. The linker will catch this and complain if you
try.
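
The bookkeeping behind Rule #4 and the duplicate-definition check can be
sketched in Python as follows. The data layout (each object file as a pair of
"external symbols defined here" and "external symbols used here"), the
function name 'resolve_externals' and the error wording are assumptions for
illustration, not a description of Slinky's internals or messages. The check
for symbols that are used but never defined is likewise an assumption.

```python
def resolve_externals(objects):
    """objects: {filename: (defs, refs)}, where defs maps each external
    symbol defined in that file to its value and refs is the set of
    external symbols that file uses. Returns the global symbol table."""
    table = {}
    owner = {}

    # Pass 1: collect every external definition, rejecting duplicates.
    for filename, (defs, refs) in objects.items():
        for name, value in defs.items():
            if name in table:
                raise ValueError("%s defined in both %s and %s"
                                 % (name, owner[name], filename))
            table[name] = value
            owner[name] = filename

    # Pass 2: every external reference must now have a definition.
    for filename, (defs, refs) in objects.items():
        for name in sorted(refs):
            if name not in table:
                raise ValueError("%s used in %s but never defined"
                                 % (name, filename))
    return table


objects = {
    "file1": ({"foo": 0x1234}, set()),  # defines foo (as with 'foo::')
    "file2": ({}, {"foo"}),             # 'extern foo' ... 'jsr foo'
}
table = resolve_externals(objects)
```
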
In summary then:

RULE #1 -- Don't allow program flow of control to fall through from one
segment to the next.

RULE #2 -- Don't do relative branches across segment boundaries ('jmp's and
'jsr's are OK).

COROLLARY -- You can't put an 'align' or 'constrain' statement inside an 'if',
'while', 'do while' or 'do until' statement, and putting one inside a macro is
very likely to result in a weird program bug unless you really understand what
you are doing.

RULE #3 -- No arithmetic more complicated than simple addition is allowed with
relocatable symbols or values or symbols which are defined externally.

RULE #4 -- If you want a symbol's value to be carried across multiple files,
that symbol MUST be declared external in the file in which it is defined and
in all files in which it is used.