mirror of https://github.com/cc65/cc65.git synced 2025-01-13 09:31:53 +00:00

Converted the ld65 docs to sgml

git-svn-id: svn://svn.cc65.org/cc65/trunk@525 b7a2c559-68d2-44c3-8de9-860c34a00d81
cuz 2000-12-02 22:14:05 +00:00
parent fa46d84571
commit 812152fa50
4 changed files with 877 additions and 720 deletions


@ -9,7 +9,8 @@ SGML = ar65.sgml \
ca65.sgml \
cc65.sgml \
cl65.sgml \
dio.sgml
dio.sgml \
ld65.sgml
TXT = $(SGML:.sgml=.txt)
HTML = $(SGML:.sgml=.html)


@ -48,7 +48,7 @@ the assembler, have a look at ca65.txt).
<sect1>Command line option overview<p>
The compiler may be called as follows:
The compiler may be called as follows:
<tscreen><verb>
---------------------------------------------------------------------------
@ -154,6 +154,7 @@ Here is a description of all the command line options:
<itemize>
<item>none
<item>atari
<item>c64
<item>c128
<item>plus4

doc/ld65.sgml Normal file

@ -0,0 +1,873 @@
<!doctype linuxdoc system>
<article>
<title>ld65 Users Guide
<author>Ullrich von Bassewitz, <htmlurl url="mailto:uz@cc65.org" name="uz@cc65.org">
<date>02.12.2000
<abstract>
The ld65 linker combines object files into an executable file. ld65 is highly
configurable and uses configuration files for high flexibility.
</abstract>
<!-- Table of contents -->
<toc>
<!-- Begin the document -->
<sect>Overview<p>
The ld65 linker combines several object modules created by the ca65
assembler, producing an executable file. The object modules may be read
from a library created by the ar65 archiver (this is somewhat faster and
more convenient). The linker was designed to be as flexible as possible.
It complements the features that are built into the ca65 macroassembler:
<itemize>
<item> Accept any number of segments to form an executable module.
<item> Resolve arbitrary expressions stored in the object files.
<item> In case of errors, use the meta information stored in the object files
to produce helpful error messages. In case of undefined symbols,
expression range errors, or symbol type mismatches, ld65 is able to
tell you the exact location in the original assembler source, where
the symbol was referenced.
<item> Flexible output. The output of ld65 is highly configurable by a config
file. More common platforms are supported by builtin configurations
that may be activated by naming the target system. The output
generation was designed with different output formats in mind, so
adding other formats shouldn't be a great problem.
</itemize>
<sect>Usage<p>
<sect1>Command line option overview<p>
The linker is called as follows:
<tscreen><verb>
---------------------------------------------------------------------------
Usage: ld65 [options] module ...
Short options:
-h Help (this text)
-m name Create a map file
-o name Name the default output file
-t sys Set the target system
-v Verbose mode
-vm Verbose map file
-C name Use linker config file
-Ln name Create a VICE label file
-Lp Mark write protected segments as such (VICE)
-S addr Set the default start address
-V Print the linker version
Long options:
--help Help (this text)
--mapfile name Create a map file
--target sys Set the target system
--version Print the linker version
---------------------------------------------------------------------------
</verb></tscreen>
<sect1>Command line options in detail<p>
Here is a description of all the command line options:
<descrip>
<tag><tt>-h, --help</tt></tag>
Print the short option summary shown above.
<label id="option-m">
<tag><tt>-m name, --mapfile name</tt></tag>
This option (which needs an argument that will be used as the filename for
the generated map file) causes the linker to generate a map file.
The map file contains a detailed overview of the modules used, the
sizes of the different segments, and a table of exported
symbols.
<label id="option-o">
<tag><tt>-o name</tt></tag>
The -o switch is used to give the name of the default output file.
Depending on your output configuration, this name may not actually be used
as the name of the output file. The builtin configurations, however, do use
it as the output file name.
<label id="option-t">
<tag><tt>-t sys, --target sys</tt></tag>
The argument for the -t switch is the name of the target system. Since this
switch will activate a builtin configuration, it may not be used together
with the <tt><ref id="option-C" name="-C"></tt> option. The following target
systems are currently supported:
<itemize>
<item>none
<item>atari
<item>c64
<item>c128
<item>plus4
<item>cbm610 (all CBM series-II computers with 80 column video)
<item>pet (all CBM PET systems except the 2001)
<item>apple2
<item>geos
</itemize>
There are a few more targets defined, but none of them is actually
supported. See <ref id="builtin-configs" name="builtin configurations"> for
more information.
<label id="option-v">
<tag><tt>-v, --verbose</tt></tag>
Using the -v option, you may enable more output that may help you to
locate problems. If an undefined symbol is encountered, -v causes the
linker to print a detailed list of the references (that is, source file
and line) for this symbol.
<tag><tt>-vm</tt></tag>
Must be used in conjunction with <tt><ref id="option-m" name="-m"></tt>
(generate map file). Normally the map file will not include empty segments
and sections, or unreferenced symbols. Using this option, you can force the
linker to include all this information into the map file.
<label id="option-C">
<tag><tt>-C name</tt></tag>
This gives the name of an output config file to use. See section 4 for more
information about config files. -C may not be used together with <tt><ref
id="option-t" name="-t"></tt>.
<tag><tt>-Ln name</tt></tag>
This option allows you to create a file that contains all global labels and
may be loaded into the VICE emulator using the <tt/ll/ (load label) command. You
may use this to debug your code with VICE. Note: Older versions of VICE had
some bugs in the label code. If you have problems, please get the latest VICE
version.
<tag><tt>-Lp</tt></tag>
Deprecated option.
<label id="option-S">
<tag><tt>-S addr, --start-addr addr</tt></tag>
Using -S you may define the default starting address. Whether and how this
address is used depends on the config file in use. For the builtin
configurations, only the "none" system honors an explicit start address;
all other builtin configurations provide their own.
<tag><tt>-V, --version</tt></tag>
This option prints the version number of the linker. If you send any
suggestions or bugfixes, please include this number.
</descrip>
If one of the modules is not found in the current directory, and the module
name does not have a path component, the value of the environment variable
<tt/CC65_LIB/ is prepended to the name, and the linker tries to open the
module with this new name.
<sect>Detailed workings<p>
The linker does several things when combining object modules:
First, the command line is parsed from left to right. For each object file
encountered (object files are recognized by a magic word in the header, so
the linker does not care about the name), imported and exported
identifiers are read from the file and inserted in a table. If a library
name is given (libraries are also recognized by a magic word, there are no
special naming conventions), each module in the library is checked to see
whether one of its exports would satisfy an import from another module. All
modules where this is the case are marked. If duplicate identifiers are
found, the linker issues a warning.
This procedure (parsing and reading from left to right) means that a
library can only satisfy references from object modules (given directly or from
a library) named <em/before/ that library. With the command line
<tscreen><verb>
ld65 crt0.o clib.lib test.o
</verb></tscreen>
the module test.o must not contain references to modules in the library
clib.lib, because the linker cannot resolve them. If it does, you have to
change the order of the modules on the command line:
<tscreen><verb>
ld65 crt0.o test.o clib.lib
</verb></tscreen>
Step two is to read the configuration file, assign start addresses
for the segments, and define any linker symbols (see <ref id="config-files"
name="Configuration files">).
After that, the linker is ready to produce an output file. Before doing so,
it checks its data for consistency (this is step three). That is, it checks
for unresolved externals (if the output format is not relocatable) and for
symbol type mismatches (for example, a zero page symbol imported by a module
as an absolute symbol).
Step four is to write the actual target files. In this step, the linker will
resolve any expressions contained in the segment data. Circular references are
also detected in this step (a symbol may have a circular reference that goes
unnoticed if the symbol is not used).
Step five is to output a map file with a detailed list of all modules,
segments and symbols encountered.
And, last step, if you give the <tt><ref id="option-v" name="-v"></tt> switch
twice, you get a dump of the segment data. However, this may be quite
unreadable if you're not a developer:-)
<sect>Configuration files<label id="config-files"><p>
Configuration files are used to describe the layout of the output file(s). Two
major topics are covered in a config file: The memory layout of the target
architecture, and the assignment of segments to memory areas. In addition,
several other attributes may be specified.
Case is ignored for keywords, that is, section or attribute names, but it is
<em/not/ ignored for names and strings.
<sect1>Introduction<p>
Memory areas are specified in a <tt/MEMORY/ section. Let's have a look at an
example (this one describes the usable memory layout of the C64):
<tscreen><verb>
MEMORY {
RAM1: start = $0800, size = $9800;
ROM1: start = $A000, size = $2000;
RAM2: start = $C000, size = $1000;
ROM2: start = $E000, size = $2000;
}
</verb></tscreen>
As you can see, there are two RAM areas and two ROM areas. The names
(before the colon) are arbitrary names that must start with a letter, with
the remaining characters being letters or digits. The names of the memory
areas are used when assigning segments. As mentioned above, case is
significant for these names.
The syntax above is used in all sections of the config file. The name
(<tt/ROM1/ etc.) is said to be an identifier, the remaining tokens up to the
semicolon specify attributes for this identifier. You may use an equal sign
to assign values to attributes and a comma to separate attributes, but both
are optional and may be left out. You <em/must/, however, use a semicolon to
mark the end of the attributes for one identifier. The section above could
also have been written like this:
<tscreen><verb>
# Start of memory section
MEMORY
{
RAM1:
start $0800
size $9800;
ROM1:
start $A000
size $2000;
RAM2:
start $C000
size $1000;
ROM2:
start $E000
size $2000;
}
</verb></tscreen>
There are of course more attributes for a memory section than just start and
size. Start and size are mandatory attributes, which means that each memory
area defined <em/must/ have these attributes given (the linker will check
that). I will cover other attributes later. As you may have noticed, I've used
a comment in the example above. Comments start with a hash mark (`#'); the
remainder of the line after this character is ignored.
Let's assume you have written a program for your trusty old C64, and you would
like to run it. For testing purposes, it should run in the <tt/RAM/ area. So
we will start to assign segments to memory sections in the <tt/SEGMENTS/
section:
<tscreen><verb>
SEGMENTS {
CODE: load = RAM1, type = ro;
RODATA: load = RAM1, type = ro;
DATA: load = RAM1, type = rw;
BSS: load = RAM1, type = bss, define = yes;
}
</verb></tscreen>
What we are doing here is telling the linker that all segments go into the
<tt/RAM1/ memory area in the order specified in the <tt/SEGMENTS/ section. So
the linker will first write the <tt/CODE/ segment, then the <tt/RODATA/
segment, then the <tt/DATA/ segment - but it will not write the <tt/BSS/
segment. Why? Enter the segment type: For each segment specified, you may also
specify a segment type. There are five possible segment types:
<tscreen><verb>
ro means readonly
wprot same as ro but will be marked as write protected in
the VICE label file if -Lp is given
rw means read/write
bss means that this is an uninitialized segment
empty will not go in any output file
</verb></tscreen>
So, because we specified that the segment with the name BSS is of type bss,
the linker knows that this is uninitialized data, and will not write it to an
output file. This is an important point: For the assembler, the <tt/BSS/
segment has no special meaning. You specify which segments have the bss
attribute when linking. This approach is much more flexible than having one
fixed bss segment, and is a result of the design decision to support an
arbitrary segment count.
If you specify "<tt/type = bss/" for a segment, the linker will make sure that
this segment contains only uninitialized data (that is, zeroes), and issue
a warning if this is not the case.
For a <tt/bss/ type segment to be useful, it must be cleared somehow by your
program (this usually happens in the startup code - for example the startup
code for cc65 generated programs takes care of that). But how does your
code know where the segment starts, and how big it is? The linker is able to
give that information, but you must request it. This is what we're doing with
the "<tt/define = yes/" attribute in the <tt/BSS/ definition. For each
segment where this attribute is true, the linker will export three symbols.
<tscreen><verb>
__NAME_LOAD__ This is set to the address where the
segment is loaded.
__NAME_RUN__ This is set to the run address of the
segment. We will cover run addresses
later.
__NAME_SIZE__ This is set to the segment size.
</verb></tscreen>
Replace <tt/NAME/ by the name of the segment; in the example above, this would
be <tt/BSS/. These symbols may be accessed by your code.
Now, as we've configured the linker to write the first three segments and
create symbols for the last one, there's only one question left: Where does
the linker put the data? It would be very convenient to have the data in a
file, wouldn't it?
We don't have any files specified above, and indeed, this is not needed in a
simple configuration like the one above. There is an additional attribute
"file" that may be specified for a memory area, that gives a file name to
write the area data into. If there is no file name given, the linker will
assign the default file name. This is "a.out" or the one given with the
<tt><ref id="option-o" name="-o"></tt> option on the command line. Since the
default behaviour is ok for our purposes, I did not use the attribute in the
example above. Let's have a look at it now.
The "file" attribute (the keyword may also be written as "FILE" if you like
that better) takes a string enclosed in double quotes (`"') that specifies the
file where the data is written. You may specify the same file several times;
in that case the data for all memory areas having this file name is written
into this file, in the order of the memory areas defined in the <tt/MEMORY/
section. Let's specify some file names in the <tt/MEMORY/ section used above:
<tscreen><verb>
MEMORY {
RAM1: start = $0800, size = $9800, file = %O;
ROM1: start = $A000, size = $2000, file = "rom1.bin";
RAM2: start = $C000, size = $1000, file = %O;
ROM2: start = $E000, size = $2000, file = "rom2.bin";
}
</verb></tscreen>
The <tt/%O/ used here is a way to specify the default behaviour explicitly:
<tt/%O/ is replaced by a string (including the quotes) that contains the
default output name, that is, "a.out" or the name specified with the <tt><ref
id="option-o" name="-o"></tt> option on the command line. Into this file, the
linker will first write any segments that go into <tt/RAM1/, and will then
append the segments for <tt/RAM2/, because the memory areas are given in this
order. So, for the RAM areas, nothing has really changed.
We've not used the ROM areas, but we will do that below, so we give the file
names here. Segments that go into <tt/ROM1/ will be written to a file named
"rom1.bin", and segments that go into <tt/ROM2/ will be written to a file
named "rom2.bin". The name given on the command line is ignored in both cases.
Let us now look at a more complex example. Say you've successfully tested
your new "Super Operating System" (SOS for short) for the C64, and you
will now go and replace the ROMs by your own code. When doing that, you
face a new problem: If the code runs in RAM, we need not care about
read/write data. But now, if the code is in ROM, we must care about it.
Remember the default segments (you may of course specify your own):
<tscreen><verb>
CODE read only code
RODATA read only data
DATA read/write data
BSS uninitialized data, read/write
</verb></tscreen>
Since <tt/BSS/ is not initialized, we need not care about it now, but what
about <tt/DATA/? <tt/DATA/ contains initialized data, that is, data that was
explicitly assigned a value. And your program will rely on these values on
startup. Since there's no way to preserve the contents of the data
segment other than storing it in one of the ROMs, we have to put it there. But
unfortunately, ROM is not writable, so we have to copy it into RAM before
running the actual code.
The linker cannot copy the data from ROM into RAM for you (this must be
done by the startup code of your program), but it has some features that will
help you in this process.
First, you may not only specify a "<tt/load/" attribute for a segment, but
also a "<tt/run/" attribute. The "<tt/load/" attribute is mandatory, and, if
you don't specify a "<tt/run/" attribute, the linker assumes that load area
and run area are the same. We will use this feature for our data area:
<tscreen><verb>
SEGMENTS {
CODE: load = ROM1, type = ro;
RODATA: load = ROM2, type = ro;
DATA: load = ROM2, run = RAM2, type = rw, define = yes;
BSS: load = RAM2, type = bss, define = yes;
}
</verb></tscreen>
Let's have a closer look at this <tt/SEGMENTS/ section. We specify that the
<tt/CODE/ segment goes into <tt/ROM1/ (the one at $A000). The readonly data
goes into <tt/ROM2/. Read/write data will be loaded into <tt/ROM2/ but is run
in <tt/RAM2/. That means that all references to labels in the <tt/DATA/
segment are relocated to be in <tt/RAM2/, but the segment is written to
<tt/ROM2/. All your startup code has to do is copy the data from its
location in <tt/ROM2/ to the final location in <tt/RAM2/.
So, how do you know where the data is located? This is the second point
where you get help from the linker. Remember the "<tt/define/" attribute?
Since we have set this attribute to true, the linker will define three
external symbols for the data segment that may be accessed from your code:
<tscreen><verb>
__DATA_LOAD__ This is set to the address where the segment
is loaded, in this case, it is an address in
ROM2.
__DATA_RUN__ This is set to the run address of the segment,
in this case, it is an address in RAM2.
__DATA_SIZE__ This is set to the segment size.
</verb></tscreen>
So, what your startup code must do is copy <tt/__DATA_SIZE__/ bytes from
<tt/__DATA_LOAD__/ to <tt/__DATA_RUN__/ before any other routines are called.
All references to labels in the <tt/DATA/ segment are relocated to <tt/RAM2/
by the linker, so things will work properly.
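Putting the pieces together, the complete configuration for the ROM version
of the example might look like the following sketch. It merely combines the
<tt/MEMORY/ section with file names and the <tt/SEGMENTS/ section shown
above; nothing in it is new, and the comments are only there for orientation.
<tscreen><verb>
# Sketch only: the MEMORY and SEGMENTS sections from above, combined
MEMORY {
    RAM1: start = $0800, size = $9800, file = %O;
    ROM1: start = $A000, size = $2000, file = "rom1.bin";
    RAM2: start = $C000, size = $1000, file = %O;
    ROM2: start = $E000, size = $2000, file = "rom2.bin";
}
SEGMENTS {
    # Code is written to rom1.bin
    CODE:   load = ROM1, type = ro;
    # Constants and the initial contents of DATA are written to rom2.bin
    RODATA: load = ROM2, type = ro;
    # Stored in ROM2, but all label references point into RAM2
    DATA:   load = ROM2, run = RAM2, type = rw, define = yes;
    # Uninitialized, so nothing is written for it
    BSS:    load = RAM2, type = bss, define = yes;
}
</verb></tscreen>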
There are some other attributes not covered above. Before starting the
reference section, I will discuss the remaining things here.
You may also request symbol definitions for memory areas. This may be
useful for things like a software stack, or an i/o area.
<tscreen><verb>
MEMORY {
STACK: start = $C000, size = $1000, define = yes;
}
</verb></tscreen>
This will define three external symbols that may be used in your code:
<tscreen><verb>
__STACK_START__ This is set to the start of the memory
area, $C000 in this example.
__STACK_SIZE__ The size of the area, here $1000.
__STACK_LAST__ This is NOT the same as START+SIZE.
Instead, it is defined as the first
address that is not used by data. If we
don't define any segments for this area,
the value will be the same as START.
</verb></tscreen>
A memory section may also have a type. Valid types are
<tscreen><verb>
ro for readonly memory
rw for read/write memory.
</verb></tscreen>
The linker will ensure that no segment marked as read/write or bss is put
into a memory area that is marked as readonly.
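As a sketch, the C64 layout from the introduction could be annotated with
types like this (which area gets which type is an assumption made for
illustration):
<tscreen><verb>
MEMORY {
    RAM1: start = $0800, size = $9800, type = rw;
    # ROM areas are marked readonly, so the linker can check segment
    # assignments against them
    ROM1: start = $A000, size = $2000, type = ro;
    RAM2: start = $C000, size = $1000, type = rw;
    ROM2: start = $E000, size = $2000, type = ro;
}
</verb></tscreen>
With such a layout, assigning a <tt/bss/ segment to <tt/ROM1/ by mistake is
the kind of error this check is meant to catch.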
Unused memory in a memory area may be filled. Use the "<tt/fill = yes/"
attribute to request this. The default value to fill unused space is zero. If
you don't like this, you may specify a byte value that is used to fill these
areas with the "<tt/fillval/" attribute. This value is also used to fill unfilled
areas generated by the assembler's <tt/.ALIGN/ and <tt/.RES/ directives.
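A typical use is a ROM image that should cover the complete memory area. The
following is a minimal sketch, assuming the <tt/ROM1/ area and file name from
the earlier examples; the fill value $FF is an arbitrary choice for
illustration:
<tscreen><verb>
MEMORY {
    # Unused space in ROM1 is filled with $FF instead of zero
    ROM1: start = $A000, size = $2000, file = "rom1.bin",
          fill = yes, fillval = $FF;
}
</verb></tscreen>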
Segments may be aligned to some memory boundary. Specify "<tt/align = num/" to
request this feature. Num must be a power of two. To align all segments on a
page boundary, use
<tscreen><verb>
SEGMENTS {
CODE: load = ROM1, type = ro, align = $100;
RODATA: load = ROM2, type = ro, align = $100;
DATA: load = ROM2, run = RAM2, type = rw, define = yes,
align = $100;
BSS: load = RAM2, type = bss, define = yes, align = $100;
}
</verb></tscreen>
If an alignment is requested, the linker will add enough space to the output
file, so that the new segment starts at an address that is divisible by the
given number without a remainder. All addresses are adjusted accordingly. To
fill the unused space, bytes of zero are used, or, if the memory area has a
"<tt/fillval/" attribute, that value. Alignment is always needed if you have
used the <tt/.ALIGN/ command in the assembler. The alignment of a segment
must be equal to or greater than the alignment used in the <tt/.ALIGN/ command.
The linker will check that, and issue a warning if the alignment of a segment
is lower than the alignment requested in a <tt/.ALIGN/ command of one of the
modules making up this segment.
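To make the arithmetic concrete, here is a small worked sketch. The segment
name <tt/TABLES/ and the size of <tt/CODE/ are hypothetical and only serve
as an illustration:
<tscreen><verb>
SEGMENTS {
    # Assume CODE turns out to be $123 bytes long, so it occupies
    # $A000-$A122 in ROM1.
    CODE:   load = ROM1, type = ro;
    # The next free address is $A123. With align = $100, TABLES starts
    # at $A200 instead, and $DD fill bytes are inserted in front of it.
    TABLES: load = ROM1, type = ro, align = $100;
}
</verb></tscreen>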
For a given segment you may also specify a fixed offset into a memory area or
a fixed start address. Use this if you want the code to run at a specific
address (a prominent case is the interrupt vector table which must go at
address $FFFA). Only one of <tt/ALIGN/, <tt/OFFSET/ or <tt/START/ may be
specified. If the directive creates empty space, it will be filled with zero,
or with the value specified with the "<tt/fillval/" attribute if one is given.
The linker will warn you if it is not possible to put the code at the
specified offset (this may happen if other segments in this area are too
large). Here's an example:
<tscreen><verb>
SEGMENTS {
VECTORS: load = ROM2, type = ro, start = $FFFA;
}
</verb></tscreen>
or (for the segment definitions from above)
<tscreen><verb>
SEGMENTS {
VECTORS: load = ROM2, type = ro, offset = $1FFA;
}
</verb></tscreen>
File names may be empty. Data from segments assigned to a memory area with
an empty file name is discarded. This is useful if a memory area has only
segments assigned that are empty (for example because they are of type
bss). In that case, the linker would otherwise create an empty output file.
This may be suppressed by assigning an empty file name to that memory area.
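As a sketch, assume the RAM-only test setup from the introduction, but with
the <tt/BSS/ segment moved into <tt/RAM2/ (this split is an assumption made
for illustration). <tt/RAM2/ then holds nothing that needs to be written, so
it gets an empty file name:
<tscreen><verb>
MEMORY {
    RAM1: start = $0800, size = $9800, file = %O;
    # Only BSS lives here, so there is nothing to write for this area
    RAM2: start = $C000, size = $1000, file = "";
}
SEGMENTS {
    CODE:   load = RAM1, type = ro;
    RODATA: load = RAM1, type = ro;
    DATA:   load = RAM1, type = rw;
    BSS:    load = RAM2, type = bss, define = yes;
}
</verb></tscreen>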
The symbol <tt/%S/ may be used to access the default start address (that is,
$200 or the value given on the command line with the <tt><ref id="option-S"
name="-S"></tt> option).
<sect1>Reference<p>
<sect1>Builtin configurations<label id="builtin-configs"><p>
Here is a list of the builtin configurations for the different target
types:
<descrip>
<tag><tt>none</tt></tag>
<tscreen><verb>
MEMORY {
RAM: start = %S, size = $10000, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = rw;
RODATA: load = RAM, type = rw;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
<tag><tt>atari</tt></tag>
<tscreen><verb>
MEMORY {
ZP: start = $82, size = $7E, type = rw;
HEADER: start = $0000, size = $6, file = %O;
RAM: start = $1F00, size = $9D1F, file = %O;
}
SEGMENTS {
EXEHDR: load = HEADER, type = wprot;
CODE: load = RAM, type = wprot, define = yes;
RODATA: load = RAM, type = wprot;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
ZEROPAGE: load = ZP, type = zp;
AUTOSTRT: load = RAM, type = wprot;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
<tag><tt>c64</tt></tag>
<tscreen><verb>
MEMORY {
ZP: start = $02, size = $1A, type = rw;
RAM: start = $7FF, size = $c801, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = wprot;
RODATA: load = RAM, type = wprot;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
ZEROPAGE: load = ZP, type = zp;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
<tag><tt>c128</tt></tag>
<tscreen><verb>
MEMORY {
ZP: start = $02, size = $1A, type = rw;
RAM: start = $1bff, size = $a401, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = wprot;
RODATA: load = RAM, type = wprot;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
ZEROPAGE: load = ZP, type = zp;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
<tag><tt>plus4</tt></tag>
<tscreen><verb>
MEMORY {
ZP: start = $02, size = $1A, type = rw;
RAM: start = $0fff, size = $7001, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = wprot;
RODATA: load = RAM, type = wprot;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
ZEROPAGE: load = ZP, type = zp;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
<tag><tt>cbm610</tt></tag>
<tscreen><verb>
MEMORY {
ZP: start = $02, size = $1A, type = rw;
RAM: start = $0001, size = $FFF0, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = wprot;
RODATA: load = RAM, type = wprot;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
ZEROPAGE: load = ZP, type = zp;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
<tag><tt>pet</tt></tag>
<tscreen><verb>
MEMORY {
ZP: start = $02, size = $1A, type = rw;
RAM: start = $03FF, size = $7BFF, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = wprot;
RODATA: load = RAM, type = wprot;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
ZEROPAGE: load = ZP, type = zp;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
<tag><tt>apple2</tt></tag>
<tscreen><verb>
MEMORY {
ZP: start = $00, size = $1A, type = rw;
RAM: start = $800, size = $8E00, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
ZEROPAGE: load = ZP, type = zp;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
<tag><tt>geos</tt></tag>
<tscreen><verb>
MEMORY {
HEADER: start = $204, size = 508, file = %O;
RAM: start = $400, size = $7C00, file = %O;
}
SEGMENTS {
HEADER: load = HEADER, type = ro;
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
FEATURES {
CONDES: segment = RODATA,
type = constructor,
label = __CONSTRUCTOR_TABLE__,
count = __CONSTRUCTOR_COUNT__;
CONDES: segment = RODATA,
type = destructor,
label = __DESTRUCTOR_TABLE__,
count = __DESTRUCTOR_COUNT__;
}
</verb></tscreen>
</descrip>
The "<tt/start/" attribute for the <tt/RAM/ memory area of the CBM systems is
two less than the actual start of the basic RAM to account for the two-byte
load address that is needed on disk and supplied by the startup code.
<sect>Bugs/Feedback<p>
If you have problems using the linker, if you find any bugs, or if you're
doing something interesting with it, I would be glad to hear from you. Feel
free to contact me by email (<htmlurl url="mailto:uz@cc65.org"
name="uz@cc65.org">).
<sect>Copyright<p>
ld65 (and all cc65 binutils) are (C) Copyright 1998-2000 Ullrich von
Bassewitz. For usage of the binaries and/or sources the following
conditions do apply:
This software is provided 'as-is', without any expressed or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
<enum>
<item> The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
<item> Altered source versions must be plainly marked as such, and must not
be misrepresented as being the original software.
<item> This notice may not be removed or altered from any source
distribution.
</enum>
</article>


@ -1,718 +0,0 @@
ld65
A Linker for ca65 Object modules
(C) Copyright 1998-2000 Ullrich von Bassewitz
(uz@musoftware.de)
Contents
--------
1. Overview
2. Usage
3. Detailed workings
4. Output configuration files
4.1 Introduction
4.2 Reference
4.3 Builtin configurations
5. Bugs/Feedback
6. Copyright
1. Overview
-----------
The ld65 linker combines several object modules created by the ca65
assembler, producing an executable file. The object modules may be read
from a library created by the ar65 archiver (this is somewhat faster and
more convenient). The linker was designed to be as flexible as possible.
It complements the features that are built into the ca65 macroassembler:
* Accept any number of segments to form an executable module.
* Resolve arbitrary expressions stored in the object files.
* In case of errors, use the meta information stored in the object files
to produce helpful error messages. In case of undefined symbols,
expression range errors, or symbol type mismatches, ld65 is able to
tell you the exact location in the original assembler source, where
the symbol was referenced.
* Flexible output. The output of ld65 is highly configurable by a
config file. More common platforms are supported by builtin
configurations that may be activated by naming the target system.
The output generation was designed with different output formats in
mind, so adding other formats shouldn't be a great problem.
2. Usage
--------
The linker is called as follows:
---------------------------------------------------------------------------
Usage: ld65 [options] module ...
Short options:
  -h                    Help (this text)
  -m name               Create a map file
  -o name               Name the default output file
  -t sys                Set the target system
  -v                    Verbose mode
  -vm                   Verbose map file
  -C name               Use linker config file
  -Ln name              Create a VICE label file
  -Lp                   Mark write protected segments as such (VICE)
  -S addr               Set the default start address
  -V                    Print the linker version
Long options:
  --help                Help (this text)
  --mapfile name        Create a map file
  --target sys          Set the target system
  --version             Print the linker version
---------------------------------------------------------------------------
-h
--help
Print the short option summary shown above.
-m name
--mapfile name
This option (which needs an argument that will be used as the filename for
the generated map file) causes the linker to generate a map file. The map
file contains a detailed overview of the modules used, the sizes of the
different segments, and a table of exported symbols.
-o name
The -o switch is used to give the name of the default output file.
Depending on your output configuration, this name may NOT be used as
name for the output file. However, for the builtin configurations, this
name is used for the output file name.
-t sys
--target sys
The argument for the -t switch is the name of the target system. Since
this switch will activate a builtin configuration, it may not be used
together with the -C option. The following target systems are currently
supported:
none
atari
c64
c128
plus4
cbm610
pet
apple2
geos
There are a few more targets defined, but none of them is actually
supported. See section 4.3 for more information about the builtin
configurations.
-v
--verbose
Using the -v option, you may enable more output that may help you to
locate problems. If an undefined symbol is encountered, -v causes the
linker to print a detailed list of the references (that is, source file
and line) for this symbol.
-vm
Must be used in conjunction with -m (generate map file). Normally the
map file will not include empty segments and sections, or unreferenced
symbols. Using this option, you can force the linker to include all
this information into the map file.
-C name
This gives the name of an output config file to use. See section 4 for
more information about config files. -C may not be used together with
-t.
-Ln name
This option allows you to create a file that contains all global labels
and may be loaded into the VICE emulator using the pb (playback) command.
You may use this to debug your code with VICE. Note: The label feature
is very new in VICE and has some bugs. If you have problems, please get
the latest VICE version.
-Lp
Deprecated option.
-S addr
--start-addr addr
Using -S you may define the default starting address. If and how this
address is used depends on the config file in use. Of the builtin
configurations, only the "none" system honors an explicit start address;
all other builtin configurations provide their own.
-V
--version
This option prints the version number of the linker. If you send any
suggestions or bugfixes, please include this number.
If one of the modules is not found in the current directory, and the
module name does not have a path component, the value of the environment
variable CC65_LIB is prepended to the name, and the linker tries to open
the module with this new name.
3. Detailed workings
--------------------
The linker does several things when combining object modules:
First, the command line is parsed from left to right. For each object file
encountered (object files are recognized by a magic word in the header, so
the linker does not care about the name), imported and exported
identifiers are read from the file and inserted in a table. If a library
name is given (libraries are also recognized by a magic word, there are no
special naming conventions), all modules in the library are checked if an
export from this module would satisfy an import from other modules. All
modules where this is the case are marked. If duplicate identifiers are
found, the linker issues a warning.
This procedure (parsing and reading from left to right) means that a
library may only satisfy references for object modules (given directly or
from a library) named BEFORE that library. With the command line
ld65 crt0.o clib.lib test.o
the module test.o may not contain references to modules in the library
clib.lib. If this is the case, you have to change the order of the modules
on the command line:
ld65 crt0.o test.o clib.lib
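The left-to-right rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not ld65's implementation: the symbol names (_start, _main, _printf), the module name printf.o and the function unresolved_symbols are made up for illustration.

```python
def unresolved_symbols(command_line, objects, libraries):
    """Return the imports left unsatisfied after one left-to-right pass."""
    exports, imports = set(), set()
    for name in command_line:
        if name in objects:
            # An object file always contributes its exports and imports.
            module_exports, module_imports = objects[name]
            exports |= module_exports
            imports |= module_imports
        else:
            # A library is only checked against imports seen BEFORE it.
            wanted = imports - exports
            for module_exports, module_imports in libraries[name].values():
                if module_exports & wanted:
                    exports |= module_exports
                    imports |= module_imports
    return imports - exports


# Hypothetical symbol tables: file name -> (exports, imports).
objects = {
    "crt0.o": ({"_start"}, {"_main"}),
    "test.o": ({"_main"}, {"_printf"}),
}
libraries = {"clib.lib": {"printf.o": ({"_printf"}, set())}}

# Library named before test.o: _printf stays unresolved.
print(unresolved_symbols(["crt0.o", "clib.lib", "test.o"], objects, libraries))
# Library named last: nothing is left unresolved.
print(unresolved_symbols(["crt0.o", "test.o", "clib.lib"], objects, libraries))
```

With the first order the library is scanned before test.o has asked for _printf, so the module providing it is never marked.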
Step two is to read the configuration file, assign start addresses to
the segments, and define any linker symbols (see section 4).
After that, the linker is ready to produce an output file. Before doing
that, it checks its data for consistency. That is, it checks for
unresolved externals (if the output format is not relocatable) and for
symbol type mismatches (for example, a zero page symbol that is imported
by a module as an absolute symbol).
Step four is to write the actual target files. In this step, the linker
will resolve any expressions contained in the segment data. Circular
references are also detected in this step (a symbol may have a circular
reference that goes unnoticed if the symbol is not used).
Step five is to output a map file with a detailed list of all modules,
segments and symbols encountered.
And, last step, if you give the -v switch twice, you get a dump of the
segment data. However, this may be quite unreadable if you're not a
developer:-)
4. Output configuration files
-----------------------------
Configuration files are used to describe the layout of the output file(s).
Two major topics are covered in a config file: The memory layout of the
target architecture, and the assignment of segments to memory areas. In
addition, several other attributes may be specified.
Case is ignored for keywords, that is, section or attribute names, but it
is NOT ignored for names and strings.
4.1 Introduction
----------------
Memory areas are specified in a "MEMORY" section. Let's have a look at an
example (this one describes the usable memory layout of the C64):
MEMORY {
RAM1: start = $0800, size = $9800;
ROM1: start = $A000, size = $2000;
RAM2: start = $C000, size = $1000;
ROM2: start = $E000, size = $2000;
}
As you can see, there are two RAM areas and two ROM areas. The names
(before the colon) are arbitrary names that must start with a letter, with
the remaining characters being letters or digits. The names of the memory
areas are used when assigning segments. As mentioned above, case is
significant for these names.
The syntax above is used in all sections of the config file. The name
("ROM1" etc.) is said to be an identifier; the remaining tokens up to the
semicolon specify attributes for this identifier. You may use the equal
sign to assign values to attributes, and you may use a comma to separate
attributes, but you may also leave both out. You MUST, however, use a
semicolon to mark the end of the attributes for one identifier. The
section above could also have been written like this:
# Start of memory section
MEMORY
{
RAM1:
start $0800
size $9800;
ROM1:
start $A000
size $2000;
RAM2:
start $C000
size $1000;
ROM2:
start $E000
size $2000;
}
There are of course more attributes for a memory section than just start
and size. Start and size are mandatory attributes, that means, each memory
area defined MUST have these attributes given (the linker will check
that). I will cover other attributes later. As you may have noticed, I've
used a comment in the example above. Comments start with a hash mark
(`#'), the remainder of the line is ignored if this character is found.
Let's assume you have written a program for your trusty old C64, and you
would like to run it. For testing purposes, it should run in the RAM area.
So we will start to assign segments to memory sections in the SEGMENTS
section:
SEGMENTS {
CODE: load = RAM1, type = ro;
RODATA: load = RAM1, type = ro;
DATA: load = RAM1, type = rw;
BSS: load = RAM1, type = bss, define = yes;
}
What we are doing here is telling the linker that all segments go into
the RAM1 memory area in the order specified in the SEGMENTS section. So
the linker will first write the CODE segment, then the RODATA segment,
then the DATA segment - but it will not write the BSS segment. Why? Enter
the segment type: For each segment specified, you may also specify a
segment type. There are five possible segment types:
ro      means readonly
wprot   same as ro but will be marked as write protected in
        the VICE label file if -Lp is given
rw      means read/write
bss     means that this is an uninitialized segment
empty   will not go in any output file
So, because we specified that the segment with the name BSS is of type
bss, the linker knows that this is uninitialized data, and will not write
it to an output file. This is an important point: For the assembler, the
BSS segment has no special meaning. You specify which segments have the
bss attribute when linking. This approach is much more flexible than
having one fixed bss segment, and is a result of the design decision to
support an arbitrary segment count.
If you specify "type = bss" for a segment, the linker will make sure that
this segment contains only uninitialized data (that is, zeroes), and
issue a warning if this is not the case.
For a bss type segment to be useful, it must be cleared somehow by your
program (this usually happens in the startup code - for example, the
startup code for cc65 generated programs takes care of that). But how
does your code know where the segment starts, and how big it is? The
linker is able to give that information, but you must request it. This is
what we're doing with the "define = yes" attribute in the BSS definition.
For each segment where this attribute is true, the linker will export
three symbols:
__NAME_LOAD__   This is set to the address where the segment is
                loaded.
__NAME_RUN__    This is set to the run address of the segment.
                We will cover run addresses later.
__NAME_SIZE__   This is set to the segment size.
Replace "NAME" by the name of the segment; in the example above, this
would be "BSS". These symbols may be accessed by your code.
Now, as we've configured the linker to write the first three segments and
create symbols for the last one, there's only one question left: Where
does the linker put the data? It would be very convenient to have the data
in a file, wouldn't it?
We don't have any files specified above, and indeed, this is not needed in
a simple configuration like the one above. There is an additional
attribute "file" that may be specified for a memory area, that gives a
file name to write the area data into. If there is no file name given, the
linker will assign the default file name. This is "a.out" or the one given
with the -o option on the command line. Since the default behaviour is ok
for our purposes, I did not use the attribute in the example above. Let's
have a look at it now.
The "file" attribute (the keyword may also be written as "FILE" if you
like that better) takes a string enclosed in double quotes (`"') that
specifies the file where the data is written. You may specify the same
file several times; in that case the data for all memory areas having this
file name is written into this file, in the order of the memory areas
defined in the MEMORY section. Let's specify some file names in the MEMORY
section used above:
MEMORY {
RAM1: start = $0800, size = $9800, file = %O;
ROM1: start = $A000, size = $2000, file = "rom1.bin";
RAM2: start = $C000, size = $1000, file = %O;
ROM2: start = $E000, size = $2000, file = "rom2.bin";
}
The %O used here is a way to specify the default behaviour explicitly: %O
is replaced by a string (including the quotes) that contains the default
output name, that is, "a.out" or the name specified with the -o option on
the command line. Into this file, the linker will first write any segments
that go into RAM1, and will then append the segments for RAM2, because the
memory areas are given in this order. So, for the RAM areas, nothing has
really changed.
We've not used the ROM areas, but we will do that below, so we give the
file names here. Segments that go into ROM1 will be written to a file
named "rom1.bin", and segments that go into ROM2 will be written to a file
named "rom2.bin". The name given on the command line is ignored in both
cases.
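The mapping from memory areas to output files can be sketched in Python as follows. This is a model of the rule described above under simplifying assumptions, not the linker's code; the function name output_files and the example name "sos.bin" are hypothetical.

```python
def output_files(memory_areas, default_name="a.out"):
    """Group memory areas by output file, keeping MEMORY section order.

    memory_areas is a list of (area name, file attribute) pairs. A missing
    attribute (None) and "%O" both select the default output name.
    """
    files = {}
    for area, file_attr in memory_areas:
        name = default_name if file_attr in (None, "%O") else file_attr
        files.setdefault(name, []).append(area)
    return files


# The MEMORY section from above, linked with "-o sos.bin" (hypothetical name).
areas = [
    ("RAM1", "%O"),
    ("ROM1", "rom1.bin"),
    ("RAM2", "%O"),
    ("ROM2", "rom2.bin"),
]
print(output_files(areas, default_name="sos.bin"))
```

RAM1 and RAM2 end up in the same file, in that order, while each ROM area gets its own file regardless of the name given with -o.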
Let us look now at a more complex example. Say, you've successfully tested
your new "Super Operating System" (SOS for short) for the C64, and you
will now go and replace the ROMs by your own code. When doing that, you
face a new problem: If the code runs in RAM, we need not care about
read/write data. But now, if the code is in ROM, we must care about it.
Remember the default segments (you may of course specify your own):
CODE     read only code
RODATA   read only data
DATA     read/write data
BSS      uninitialized data, read/write
Since the BSS is not initialized, we need not care about it now, but what
about DATA? DATA contains initialized data, that is, data that was
explicitly assigned a value. And your program will rely on these values on
startup. Since there's no other way to remember the contents of the data
segment than storing it in one of the ROMs, we have to put it there.
But unfortunately, ROM is not writeable, so we have to copy it into RAM
before running the actual code.
The linker cannot copy the data from ROM into RAM for you (this must
be done by the startup code of your program), but it has some features
that will help you in this process.
First, you may not only specify a "load" attribute for a segment, but also
a "run" attribute. The "load" attribute is mandatory, and, if you don't
specify a "run" attribute, the linker assumes that load area and run area
are the same. We will use this feature for our data area:
SEGMENTS {
CODE: load = ROM1, type = ro;
RODATA: load = ROM2, type = ro;
DATA: load = ROM2, run = RAM2, type = rw, define = yes;
BSS: load = RAM2, type = bss, define = yes;
}
Let's have a closer look at this SEGMENTS section. We specify that the
CODE segment goes into ROM1 (the one at $A000). The readonly data goes
into ROM2. Read/write data will be loaded into ROM2 but is run in RAM2.
That means that all references to labels in the DATA segment are relocated
to be in RAM2, but the segment is written to ROM2. All your startup code
has to do is copy the data from its location in ROM2 to the final
location in RAM2.
So how do you know where the data is located? This is the second point
where you get help from the linker. Remember the "define" attribute? Since
we have set this attribute to true, the linker will define three external
symbols for the data segment that may be accessed from your code:
__DATA_LOAD__   This is set to the address where the segment is
                loaded, in this case, it is an address in ROM2.
__DATA_RUN__    This is set to the run address of the segment, in
                this case, it is an address in RAM2.
__DATA_SIZE__   This is set to the segment size.
So what your startup code must do is copy __DATA_SIZE__ bytes from
__DATA_LOAD__ to __DATA_RUN__ before any other routines are called. All
references to labels in the DATA segment are relocated to RAM2 by the
linker, so things will work properly.
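What the startup code has to do can be simulated in Python on a 64K byte array. This is only a model of the copy step: real startup code would be 6502 code, and the three address values below are made-up stand-ins for what the linker would assign to __DATA_LOAD__, __DATA_RUN__ and __DATA_SIZE__.

```python
def copy_data_segment(memory, load_addr, run_addr, size):
    """Copy the DATA segment image from its load address to its run address."""
    memory[run_addr:run_addr + size] = memory[load_addr:load_addr + size]


memory = bytearray(0x10000)      # model of the 64K address space

# Hypothetical values for the linker-defined symbols.
DATA_LOAD = 0xE100               # somewhere in ROM2
DATA_RUN = 0xC000                # start of RAM2
DATA_SIZE = 4

# The initialized data as it was written into the ROM image.
memory[DATA_LOAD:DATA_LOAD + DATA_SIZE] = b"\x01\x02\x03\x04"

copy_data_segment(memory, DATA_LOAD, DATA_RUN, DATA_SIZE)
print(memory[DATA_RUN:DATA_RUN + DATA_SIZE].hex())
```

After the copy, code that refers to labels in DATA finds the initial values at the run addresses in RAM2.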
There are some other attributes not covered above. Before starting the
reference section, I will discuss the remaining things here.
You may also request symbol definitions for memory areas. This may be
useful for things like a software stack, or an i/o area.
MEMORY {
STACK: start = $C000, size = $1000, define = yes;
}
This will define three external symbols that may be used in your code:
__STACK_START__   This is set to the start of the memory
                  area, $C000 in this example.
__STACK_SIZE__    The size of the area, here $1000.
__STACK_LAST__    This is NOT the same as START+SIZE.
                  Instead, it is defined as the first
                  address that is not used by data. If we
                  don't define any segments for this area,
                  the value will be the same as START.
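The difference between START+SIZE and LAST can be shown with a small Python sketch. It assumes that segments are placed back to back with no alignment gaps, which is a simplification; the function name memory_area_symbols and the segment sizes are hypothetical.

```python
def memory_area_symbols(name, start, size, segment_sizes=()):
    """Model the three symbols defined for a memory area with define = yes."""
    last = start + sum(segment_sizes)   # first address not used by data
    if last > start + size:
        raise ValueError("segments do not fit into memory area " + name)
    return {
        f"__{name}_START__": start,
        f"__{name}_SIZE__": size,
        f"__{name}_LAST__": last,
    }


# No segments assigned: LAST equals START.
empty = memory_area_symbols("STACK", 0xC000, 0x1000)
# Two hypothetical segments of $20 and $10 bytes: LAST moves up by $30.
used = memory_area_symbols("STACK", 0xC000, 0x1000, [0x20, 0x10])
print(hex(empty["__STACK_LAST__"]), hex(used["__STACK_LAST__"]))
```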
A memory section may also have a type. Valid types are
ro for readonly memory
and rw for read/write memory.
The linker will ensure that no segment marked as read/write or bss is put
into a memory area that is marked as readonly.
Unused memory in a memory area may be filled. Use the "fill = yes"
attribute to request this. The default value to fill unused space is zero.
If you don't like this, you may specify a byte value that is used to fill
these areas with the "fillval" attribute. This value is also used to fill
unfilled areas generated by the assembler's .ALIGN and .RES directives.
Segments may be aligned to some memory boundary. Specify "align = num" to
request this feature. Num must be a power of two. To align all segments on
a page boundary, use
SEGMENTS {
CODE: load = ROM1, type = ro, align = $100;
RODATA: load = ROM2, type = ro, align = $100;
DATA: load = ROM2, run = RAM2, type = rw, define = yes,
align = $100;
BSS: load = RAM2, type = bss, define = yes, align = $100;
}
If an alignment is requested, the linker will add enough space to the
output file so that the new segment starts at an address that is
divisible by the given number without a remainder. All addresses are
adjusted accordingly. To fill the unused space, bytes of zero are used,
or, if the memory area has a "fillval" attribute, that value. Alignment is
always needed if you have used the .ALIGN command in the assembler.
The alignment of a segment must be equal to or greater than the alignment
used in the .ALIGN command. The linker will check that, and issue a
warning if the alignment of a segment is lower than the alignment
requested in a .ALIGN command of one of the modules making up this
segment.
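The address arithmetic behind "align = num" can be sketched in Python. This illustrates the rule stated above (round up to the next multiple of a power of two and fill the gap); it is not taken from the linker source, and the function names are made up.

```python
def align_up(address, alignment):
    """Round address up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return (address + alignment - 1) & ~(alignment - 1)


def fill_bytes(address, alignment, fillval=0x00):
    """Return the bytes inserted in front of a segment to align it."""
    return bytes([fillval]) * (align_up(address, alignment) - address)


# A segment that would start at $A123 is moved to the next page boundary.
print(hex(align_up(0xA123, 0x100)), len(fill_bytes(0xA123, 0x100)))
```

An address that is already aligned is left alone, so no fill bytes are written in that case.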
For a given segment you may also specify a fixed offset into a memory area or
a fixed start address. Use this if you want the code to run at a specific
address (a prominent case is the interrupt vector table which must go at
address $FFFA). Only one of ALIGN or OFFSET or START may be specified. If the
directive creates empty space, it will be filled with zero, or with the value
specified with the "fillval" attribute if one is given. The linker will warn
you if it is not possible to put the code at the specified offset (this may
happen if other segments in this area are too large). Here's an example:
SEGMENTS {
VECTORS: load = ROM2, type = ro, start = $FFFA;
}
or (for the segment definitions from above)
SEGMENTS {
VECTORS: load = ROM2, type = ro, offset = $1FFA;
}
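The two VECTORS definitions are equivalent because an offset is simply the start address minus the start of the memory area. A short Python check of that arithmetic, using the ROM2 values from the MEMORY section above (the function name is hypothetical):

```python
def offset_for_start(area_start, area_size, address):
    """Convert a fixed start address into an offset into a memory area."""
    offset = address - area_start
    if not 0 <= offset < area_size:
        raise ValueError("address lies outside the memory area")
    return offset


# ROM2 starts at $E000 and is $2000 bytes long; the vectors go at $FFFA.
print(hex(offset_for_start(0xE000, 0x2000, 0xFFFA)))
```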
File names may be empty; data from segments assigned to a memory area with
an empty file name is discarded. This is useful if a memory area has
segments assigned that are empty (for example because they are of type
bss). In that case, the linker will create an empty output file. This may
be suppressed by assigning an empty file name to that memory area.
The symbol %S may be used to access the default start address (that is,
$200 or the value given on the command line with the -S option).
4.2 Reference
-------------
4.3 Builtin configurations
--------------------------
Here is a list of the builtin configurations for the different target
types:
none:
MEMORY {
RAM: start = %S, size = $10000, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = rw;
RODATA: load = RAM, type = rw;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
atari:
MEMORY {
HEADER: start = $0000, size = $6, file = %O;
RAM: start = $1F00, size = $6100, file = %O;
}
SEGMENTS {
EXEHDR: load = HEADER, type = wprot;
CODE: load = RAM, type = wprot, define = yes;
RODATA: load = RAM, type = wprot;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
AUTOSTRT: load = RAM, type = wprot;
}
c64:
MEMORY {
RAM: start = $7FF, size = $c801, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
c128:
MEMORY {
RAM: start = $1bff, size = $a401, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
ace:
(non-existent)
plus4:
MEMORY {
RAM: start = $0fff, size = $7001, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
cbm610:
MEMORY {
RAM: start = $0001, size = $FFF0, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
pet:
MEMORY {
RAM: start = $03FF, size = $7BFF, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
apple2:
MEMORY {
RAM: start = $800, size = $8E00, file = %O;
}
SEGMENTS {
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
geos:
MEMORY {
HEADER: start = $204, size = 508, file = %O;
RAM: start = $400, size = $7C00, file = %O;
}
SEGMENTS {
HEADER: load = HEADER, type = ro;
CODE: load = RAM, type = ro;
RODATA: load = RAM, type = ro;
DATA: load = RAM, type = rw;
BSS: load = RAM, type = bss, define = yes;
}
The "start" attribute for the RAM memory area of the CBM systems is two
less than the actual start of the basic RAM to account for the two-byte
load address that is needed on disk and supplied by the startup code.
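The "two less" rule can be checked with a little Python. The sketch assumes that the two header bytes hold the address of the byte following them (start + 2) in low byte first order, as is usual for 6502 addresses; that value is an assumption on my part, since the text above only says that the startup code supplies the bytes. The function name is hypothetical.

```python
def load_address_header(start):
    """Return the assumed load address and its two-byte on-disk encoding."""
    load_address = start + 2      # first byte after the two header bytes
    return load_address, load_address.to_bytes(2, "little")


# "start" values taken from the builtin configurations above.
for system, start in [("c64", 0x07FF), ("c128", 0x1BFF),
                      ("plus4", 0x0FFF), ("pet", 0x03FF)]:
    address, header = load_address_header(start)
    print(system, hex(address), header.hex())
```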
5. Bugs/Feedback
----------------
If you have problems using the linker, if you find any bugs, or if you're
doing something interesting with it, I would be glad to hear from you.
Feel free to contact me by email (uz@musoftware.de).
6. Copyright
------------
ld65 (and all cc65 binutils) are (C) Copyright 1998-2000 Ullrich von
Bassewitz. For usage of the binaries and/or sources the following
conditions do apply:
This software is provided 'as-is', without any expressed or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not
be misrepresented as being the original software.
3. This notice may not be removed or altered from any source
distribution.