cc65/doc/da65.sgml

<!doctype linuxdoc system>

<article>
<title>da65 Users Guide
<author>
<url url="mailto:uz@cc65.org" name="Ullrich von Bassewitz">,<newline>
<url url="mailto:greg.king5@verizon.net" name="Greg King">
<date>2014-11-23

<abstract>
da65 is a 6502/65C02 disassembler that is able to read user-supplied
information about its input data, for better results. The output is ready for
feeding into ca65, the macro assembler supplied with the cc65 C compiler.
</abstract>

<!-- Table of contents -->
<toc>

<!-- Begin the document -->

<sect>Overview<p>

da65 is a disassembler for 6502/65C02 code. It is supplied as a utility with
the cc65 C compiler and generates output that is suitable for the ca65
macro assembler.

Besides generating output for ca65, one of the design goals was that the user
is able to feed additional information about the code into the disassembler,
for improved results. This information may include the location and size of
tables, and their format.

One nice advantage of this concept is that disassembly of copyrighted binaries
may be handled without problems: One can just pass the information file for
disassembling the binary, so everyone with a legal copy of the binary can
generate a nicely formatted disassembly with readable labels and other
information.


<sect>Usage<p>


<sect1>Command line option overview<p>

The assembler accepts the following options:

<tscreen><verb>
---------------------------------------------------------------------------
Usage: da65 [options] [inputfile]
Short options:
  -g                    Add debug info to object file
  -h                    Help (this text)
  -i name               Specify an info file
  -o name               Name the output file
  -v                    Increase verbosity
  -F                    Add formfeeds to the output
  -S addr               Set the start/load address
  -V                    Print the disassembler version

Long options:
  --argument-column n   Specify argument start column
  --comment-column n    Specify comment start column
  --comments n          Set the comment level for the output
  --cpu type            Set cpu type
  --debug-info          Add debug info to object file
  --formfeeds           Add formfeeds to the output
  --help                Help (this text)
  --hexoffs             Use hexadecimal label offsets
  --info name           Specify an info file
  --label-break n       Add newline if label exceeds length n
  --mnemonic-column n   Specify mnemonic start column
  --pagelength n        Set the page length for the listing
  --start-addr addr     Set the start/load address
  --text-column n       Specify text start column
  --verbose             Increase verbosity
  --version             Print the disassembler version
---------------------------------------------------------------------------
</verb></tscreen>


<sect1>Command line options in detail<p>

Here is a description of all the command line options:

<descrip>

  <label id="option--argument-column">
  <tag><tt>--argument-column n</tt></tag>

  Specifies the column where the argument for a mnemonic or pseudo instruction
  starts.


  <label id="option--comment-column">
  <tag><tt>--comment-column n</tt></tag>

  Specifies the column where the comment for an instruction starts.


  <label id="option--comments">
  <tag><tt>--comments n</tt></tag>

  Set the comment level for the output. Valid arguments are 0..4. Greater
  values will increase the level of additional information written to the
  output file in form of comments.


  <label id="option--cpu">
  <tag><tt>--cpu type</tt></tag>

  Set the CPU type. The option takes a parameter, which may be one of
  <itemize>
  <item>6502
  <item>6502x
  <item>65sc02
  <item>65c02
  <item>huc6280
  <item>4510
  </itemize>

  6502x is for the NMOS 6502 with unofficial opcodes. huc6280 is the CPU of
  the PC engine. 4510 is the CPU of the Commodore C65. Support for the 65816
  currently is not available.


  <label id="option--formfeeds">
  <tag><tt>-F, --formfeeds</tt></tag>

  Add formfeeds to the generated output. This feature is useful together
  with the <tt><ref id="option--pagelength" name="--pagelength"></tt> option.
  If <tt/--formfeeds/ is given, a formfeed is added to the output after each
  page.


  <tag><tt>-g, --debug-info</tt></tag>

  This option adds the <tt/.DEBUGINFO/ command to the output file, so the
  assembler will generate debug information when re-assembling the generated
  output.


  <tag><tt>-h, --help</tt></tag>

  Print the short option summary shown above.


  <label id="option--hexoffs">
  <tag><tt>--hexoffs</tt></tag>

  Output label offsets in hexadecimal instead of decimal notation.


  <label id="option--info">
  <tag><tt>-i name, --info name</tt></tag>

  Specify an info file. The info file contains global options that may
  override or replace command line options plus informations about the code
  that has to be disassembled. See the separate section <ref id="infofile"
  name="Info File Format">.


  <label id="option-o">
  <tag><tt>-o name</tt></tag>

  Specify a name for an output file. The default is to use <tt/stdout/, so
  without this switch or the corresponding <ref id="global-options"
  name="global option"> <tt><ref id="OUTPUTNAME" name="OUTPUTNAME"></tt>,
  the output will go to the terminal.


  <label id="option--label-break">
  <tag><tt>--label-break n</tt></tag>

  Adds a newline if the length of a label exceeds the given length.
  Note: If the label would run into the code in the mid column, a
  linefeed is always inserted regardless of this setting.

  This option overrides the <ref id="global-options" name="global option">
  <tt><ref id="LABELBREAK" name="LABELBREAK"></tt>.


  <label id="option--mnemonic-column">
  <tag><tt>--mnemonic-column n</tt></tag>

  Specifies the column where a mnemonic or pseudo instrcuction is output.


  <label id="option--pagelength">
  <tag><tt>--pagelength n</tt></tag>

  Sets the length of a listing page in lines. After this number of lines, a
  new page header is generated. If the <tt><ref id="option--formfeeds"
  name="--formfeeds"></tt> is also given, a formfeed is inserted before
  generating the page header.

  A value of zero for the page length will disable paging of the output.


  <label id="option--start-addr">
  <tag><tt>-S addr, --start-addr addr</tt></tag>

  Specify the start/load address of the binary code that is going to be
  disassembled. The given address is interpreted as an octal value if
  preceded with a '0' digit, as a hexadecimal value if preceded
  with '0x', '0X', or '$', and as a decimal value in all other cases. If no
  start address is specified, $10000 minus the size of the input file is used.


  <label id="option--text-column">
  <tag><tt>--text-column n</tt></tag>

  Specifies the column where additional text is output. This additional text
  consists of the bytes encoded in this line in text representation.


  <tag><tt>-v, --verbose</tt></tag>

  Increase the disassembler verbosity. Usually only needed for debugging
  purposes. You may use this option more than one time for even more
  verbose output.


  <tag><tt>-V, --version</tt></tag>

  Print the version number of the assembler. If you send any suggestions
  or bugfixes, please include the version number.

</descrip>
<p>


<sect>Detailed workings<p>

<sect1>Supported CPUs<p>

The default (no CPU given on the command line or in the <tt/GLOBAL/ section of
the info file) is the 6502 CPU. The disassembler knows all "official" opcodes
for this CPU. Invalid opcodes are translated into <tt/.byte/ commands.

With the command line option <tt><ref id="option--cpu" name="--cpu"></tt>, the
disassembler may be told to recognize either the 65SC02 or 65C02 CPUs. The
latter understands the same opcodes as the former, plus 16 additional bit
manipulation and bit test-and-branch commands.

When disassembling 4510 code, due to handling of 16-bit wide branches, da65
can produce output that can not be re-assembled, when one or more of those
branches point outside of the disassembled memory. This can happen when text
or binary data is processed.

While there is some code for the 65816 in the sources, it is currently
unsupported.


<sect1>Attribute map<p>

The disassembler works by creating an attribute map for the whole address
space ($0000 - $FFFF). Initially, all attributes are cleared. Then, an
external info file (if given) is read. Disassembly is done in several passes.
In all passes, with the exception of the last one, information about the
disassembled code is gathered and added to the symbol and attribute maps. The
last pass generates output using the information from the maps.

<sect1>Labels<p>

Some instructions may generate labels in the first pass, while most other
instructions do not generate labels, but use them if they are available. Among
others, the branch and jump instructions will generate labels for the target
of the branch in the first pass. External labels (taken from the info file)
have precedence over internally generated ones, They must be valid identifiers
as specified for the ca65 assembler. Internal labels (generated by the
disassembler) have the form <tt/Labcd/, where <tt/abcd/ is the hexadecimal
address of the label in upper case letters. You should probably avoid using
such label names for external labels.


<sect1>Info File<p>

The info file is used to pass additional information about the input code to
the disassembler. This includes label names, data areas or tables, and global
options like input and output file names. See the <ref id="infofile"
name="next section"> for more information.


<sect>Info File Format<label id="infofile"><p>

The info file contains lists of specifications grouped together. Each group
directive has an identifying token and an attribute list enclosed in curly
braces. Attributes have a name followed by a value. The syntax of the value
depends on the type of the attribute. String attributes are places in double
quotes, numeric attributes may be specified as decimal numbers or hexadecimal
with a leading dollar sign. There are also attributes where the attribute
value is a keyword; in this case, the keyword is given as-is (without quotes or
anything). Each attribute is terminated by a semicolon.

<tscreen><verb>
        group-name { attribute1 attribute-value; attribute2 attribute-value; }
</verb></tscreen>


<sect1>Comments<p>

Comments start with a hash mark (<tt/#/); and, extend from the position of
the mark to the end of the current line. Hash marks inside of strings will
<em/not/ start a comment, of course.


<sect1>Specifying global options<label id="global-options"><p>

Global options may be specified in a group with the name <tt/GLOBAL/. The
following attributes are recognized:

<descrip>

  <tag><tt/ARGUMENTCOLUMN/</tag>
  This attribute specifies the column in the output, where the argument for
  an opcode or pseudo instruction starts. The corresponding command line
  option is
  <tt><ref id="option--argument-column" name="--argument-column"></tt>.


  <tag><tt/COMMENTCOLUMN/</tag>
  This attribute specifies the column in the output, where the comment starts
  in a line. It is only used for in-line comments. The corresponding command
  line option is
  <tt><ref id="option--comment-column" name="--comment-column"></tt>.


  <label id="COMMENTS">
  <tag><tt/COMMENTS/</tag>
  This attribute may be used instead of the <tt><ref id="option--comments"
  name="--comments"></tt> option on the command line. It takes a numerical
  parameter between 0 and 4. Higher values increase the amount of information
  written to the output file in form of comments.


  <tag><tt/CPU/</tag>
  This attribute may be used instead of the <tt><ref id="option--cpu"
  name="--cpu"></tt> option on the command line. For possible values see
  there. The value is a string and must be enclosed in quotes.


  <tag><tt/HEXOFFS/</tag>
  The attribute is followed by a boolean value. If true, offsets to labels are
  output in hex, otherwise they're output in decimal notation. The default is
  false. The attribute may be changed on the command line using the <tt><ref
  id="option--hexoffs" name="--hexoffs"></tt> option.


  <tag><tt/INPUTNAME/</tag>
  The attribute is followed by a string value, which gives the name of the
  input file to read. If it is present, the disassembler does not accept an
  input file name on the command line.


  <tag><tt/INPUTOFFS/</tag>
  The attribute is followed by a numerical value that gives an offset into
  the input file which is skipped before reading data. The attribute may be
  used to skip headers or unwanted code sections in the input file.


  <tag><tt/INPUTSIZE/</tag>
  <tt/INPUTSIZE/ is followed by a numerical value that gives the amount of
  data to read from the input file. Data beyond <tt/INPUTOFFS + INPUTSIZE/
  is ignored.


  <label id="LABELBREAK">
  <tag><tt/LABELBREAK/</tag>
  <tt/LABELBREAK/ is followed by a numerical value that specifies the label
  length that will force a newline. To have all labels on their own lines,
  you may set this value to zero.

  See also the <tt><ref id="option--label-break" name="--label-break"></tt>
  command line option. A <tt/LABELBREAK/ statement in the info file will
  override any value given on the command line.


  <tag><tt/MNEMONICCOLUMN/</tag>
  This attribute specifies the column in the output, where the mnemonic or
  pseudo instruction is placed. The corresponding command line option is
  <tt><ref id="option--mnemonic-column" name="--mnemonic-column"></tt>.


  <tag><tt/NEWLINEAFTERJMP/</tag>
  This attribute is followed by a boolean value. When true, a newline is
  inserted after each <tt/JMP/ instruction. The default is false.


  <tag><tt/NEWLINEAFTERRTS/</tag>
  This attribute is followed by a boolean value. When true, a newline is
  inserted after each <tt/RTS/ instruction. The default is false.


  <label id="OUTPUTNAME">
  <tag><tt/OUTPUTNAME/</tag>
  The attribute is followed by string value, which gives the name of the
  output file to write. If it is present, specification of an output file on
  the command line using the <tt><ref id="option-o" name="-o"></tt> option is
  not allowed.

  The default is to use <tt/stdout/ for output, so without this attribute or
  the corresponding command line option <tt/<ref id="option-o" name="-o">/
  the output will go to the terminal.


  <tag><tt/PAGELENGTH/</tag>
  This attribute may be used instead of the <tt><ref id="option--pagelength"
  name="--pagelength"></tt> option on the command line. It takes a numerical
  parameter. Using zero as page length (which is the default) means that no
  pages are generated.


  <tag><tt/STARTADDR/</tag>
  This attribute may be used instead of the <tt><ref id="option--start-addr"
  name="--start-addr"></tt> option on the command line. It takes a numerical
  parameter. The default for the start address is $10000 minus the size of
  the input file (this assumes that the input file is a ROM that contains the
  reset and irq vectors).


  <tag><tt/TEXTCOLUMN/</tag>
  This attribute specifies the column, where the data bytes are output
  translated into ASCII text. It is only used if
  <tt><ref id="COMMENTS" name="COMMENTS"></tt> is set to at least 4. The
  corresponding command line option is
  <tt><ref id="option--text-column" name="--text-column"></tt>.

</descrip>


<sect1>Specifying Ranges<p>

The <tt/RANGE/ directive is used to give information about address ranges. The
following attributes are recognized:

<descrip>

  <tag><tt>COMMENT</tt></tag>
  This attribute is only allowed if a label is also given. It takes a string
  as argument. See the description of the <tt><ref id="infofile-label"
  name="LABEL"></tt> directive for an explanation.

  <tag><tt>END</tt></tag>
  This gives the end address of the range. The end address is inclusive, that
  means, it is part of the range. Of course, it may not be smaller than the
  start address.

  <tag><tt>NAME</tt></tag>
  This is a convenience attribute. It takes a string argument and will cause
  the disassembler to define a label for the start of the range with the
  given name. So a separate <tt><ref id="infofile-label" name="LABEL"></tt>
  directive is not needed.

  <tag><tt>START</tt></tag>
  This gives the start address of the range.

  <tag><tt>TYPE</tt></tag>
  This attribute specifies the type of data within the range. The attribute
  value is one of the following keywords:

  <descrip>
    <tag><tt>ADDRTABLE</tt></tag>
    The range consists of data and is disassembled as a table of words
    (16 bit values). The difference to the <tt/WORDTABLE/ type is that
    a label is defined for each entry in the table.

    <tag><tt>BYTETABLE</tt></tag>
    The range consists of data and is disassembled as a byte table.

    <tag><tt>CODE</tt></tag>
    The range consists of code.

    <tag><tt>DBYTETABLE</tt></tag>
    The range consists of data and is disassembled as a table of dbytes
    (double byte values, 16 bit values with the low byte containing the
    most significant byte of the 16 bit value).

    <tag><tt>DWORDTABLE</tt></tag>
    The range consists of data and is disassembled as a table of double
    words (32 bit values).

    <tag><tt>RTSTABLE</tt></tag>
    The range consists of data and is disassembled as a table of words (16 bit
    values). The values are interpreted as words that are pushed onto the
    stack and jump to it via <tt/RTS/. This means that they contain
    <tt/address-1/ of a function, for which a label will get defined by the
    disassembler.

    <tag><tt>SKIP</tt></tag>
    The range is simply ignored when generating the output file. Please note
    that this means that reassembling the output file will <em/not/ generate
    the original file, not only because the missing piece in between, but also
    because the following code will be located on wrong addresses. Output
    generated with <tt/SKIP/ ranges will need manual rework.

    <tag><tt>TEXTTABLE</tt></tag>
    The range consists of readable text.

    <tag><tt>WORDTABLE</tt></tag>
    The range consists of data and is disassembled as a table of words
    (16 bit values).

  </descrip>

</descrip>


<sect1>Specifying Labels<label id="infofile-label"><p>

The <tt/LABEL/ directive is used to give names for labels in the disassembled
code. The following attributes are recognized:

<descrip>

  <tag><tt>ADDR</tt></tag>
  Followed by a numerical value. Specifies the value of the label.

  <tag><tt>COMMENT</tt></tag>
  Attribute argument is a string. The comment will show up in a separate line
  before the label, if the label is within code or data range, or after the
  label if it is outside.

  Example output:

<tscreen><verb>
        foo     := $0001        ; Comment for label named "foo"

        ; Comment for label named "bar"
        bar:
</verb></tscreen>

  <tag><tt>NAME</tt></tag>
  The attribute is followed by a string value which gives the name of the
  label. Empty names are allowed, in this case the disassembler will create
  an unnamed label (see the assembler docs for more information about unnamed
  labels).

  <tag><tt>SIZE</tt></tag>
  This attribute is optional and may be used to specify the size of the data
  that follows. If a size greater than 1 is specified, the disassembler will
  create labels in the form <tt/label+offs/ for all bytes within the given
  range, where <tt/label/ is the label name given with the <tt/NAME/
  attribute, and <tt/offs/ is the offset within the data.

</descrip>


<sect1>Specifying Segments<label id="infofile-segment"><p>

The <tt/SEGMENT/ directive is used to specify a segment within the
disassembled code. The following attributes are recognized:

<descrip>

  <tag><tt>START</tt></tag>
  Followed by a numerical value. Specifies the start address of the segment.

  <tag><tt>END</tt></tag>
  Followed by a numerical value. Specifies the end address of the segment. The
  end address is the last address that is a part of the segment.

  <tag><tt>NAME</tt></tag>
  The attribute is followed by a string value which gives the name of the
  segment.
</descrip>

All attributes are mandatory. Segments must not overlap. The disassembler will
change back to the (default) <tt/.code/ segment after the end of each defined
segment. That might not be what you want. As a rule of thumb, if you're using
segments, you should define segments for all disassembled code.


<sect1>Specifying Assembler Includes<label id="infofile-asminc"><p>

The <tt/ASMINC/ directive is used to give the names of input files containing
symbol assignments in assembler syntax:

<tscreen><verb>
        Name = value
        Name := value
</verb></tscreen>

The usual conventions apply for symbol names. Values may be specified as hex
(leading &dollar;), binary (leading %) or decimal. The values may optionally
be signed.

NOTE: The include file parser is very simple. Expressions are not allowed, and
anything but symbol assignments is flagged as an error (but see the
<tt/IGNOREUNKNOWN/ directive below).

The following attributes are recognized:

<descrip>

  <tag><tt>FILE</tt></tag>
  Followed by a string value. Specifies the name of the file to read.

  <tag><tt>COMMENTSTART</tt></tag>
  The optional attribute is followed by a character constant. It specifies the
  character that starts a comment. The default value is a semicolon. This
  value is ignored if <tt/IGNOREUNKNOWN/ is true.

  <tag><tt>IGNOREUNKNOWN</tt></tag>
  This attribute is optional and is followed by a boolean value. It allows to
  ignore input lines that don't have a valid syntax. This allows to read in
  assembler include files that contain more than just symbol assignments.
  Note: When this attribute is used, the disassembler will ignore any errors
  in the given include file. This may have undesired side effects.

</descrip>


<sect1>An Info File Example<p>

The following is a short example for an info file that contains most of the
directives explained above:

<tscreen><verb>
        # This is a comment. It extends to the end of the line
        GLOBAL {
            OUTPUTNAME      "kernal.s";
            INPUTNAME       "kernal.bin";
            STARTADDR       $E000;
            PAGELENGTH      0;                  # No paging
            CPU             "6502";
        };

        # One segment for the whole stuff
        SEGMENT { START $E000;  END   $FFFF; NAME "kernal"; };

        RANGE { START $E612;    END   $E631; TYPE Code;      };
        RANGE { START $E632;    END   $E640; TYPE ByteTable; };
        RANGE { START $EA51;    END   $EA84; TYPE RtsTable;  };
        RANGE { START $EC6C;    END   $ECAB; TYPE RtsTable;  };
        RANGE { START $ED08;    END   $ED11; TYPE AddrTable; };

        # Zero-page variables
        LABEL { NAME "fnadr";   ADDR  $90;   SIZE 3;    };
        LABEL { NAME "sal";     ADDR  $93;   };
        LABEL { NAME "sah";     ADDR  $94;   };
        LABEL { NAME "sas";     ADDR  $95;   };

        # Stack
        LABEL { NAME "stack";   ADDR  $100;  SIZE 255;  };

        # Indirect vectors
        LABEL { NAME "cinv";    ADDR  $300;  SIZE 2;    };      # IRQ
        LABEL { NAME "cbinv";   ADDR  $302;  SIZE 2;    };      # BRK
        LABEL { NAME "nminv";   ADDR  $304;  SIZE 2;    };      # NMI

        # Jump table at end of kernal ROM
        LABEL { NAME "kscrorg"; ADDR  $FFED; };
        LABEL { NAME "kplot";   ADDR  $FFF0; };
        LABEL { NAME "kiobase"; ADDR  $FFF3; };
        LABEL { NAME "kgbye";   ADDR  $FFF6; };

        # Hardware vectors
        LABEL { NAME "hanmi";   ADDR  $FFFA; };
        LABEL { NAME "hares";   ADDR  $FFFC; };
        LABEL { NAME "hairq";   ADDR  $FFFE; };
</verb></tscreen>


<sect>Copyright<p>

da65 (and all cc65 binutils) is (C) Copyright 1998-2011, Ullrich von
Bassewitz. For usage of the binaries and/or sources, the following
conditions do apply:

This software is provided 'as-is', without any expressed or implied
warranty.  In no event will the authors be held liable for any damages
arising from the use of this software.

Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:

<enum>
<item>  The origin of this software must not be misrepresented; you must not
        claim that you wrote the original software. If you use this software
        in a product, an acknowledgment in the product documentation would be
        appreciated but is not required.
<item>  Altered source versions must be plainly marked as such, and must not
        be misrepresented as being the original software.
<item>  This notice may not be removed or altered from any source
        distribution.
</enum>


</article>