

-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
x65 Assembler
-------------
x65 is an open source 6502 series assembler that supports object files,
linking, fixed address assembling and a relocatable executable.
Assemblers have existed for a long time and what they do is well documented.
x65 tries to accommodate most syntax expectations, from Kick Assembler (a
Java 6502 assembler) to Merlin (an Apple II assembler).
For debugging, dump_x65 is a tool that will show all content of x65 object
files, and x65dsasm is a disassembler intended to review the assembled
result.
Noteworthy features:
* Code with sections, object files and linking or single file fixed
address, or mix it up with fixed address sections in object files.
* Assembler listing with cycle counting for code review.
* Export multiple binaries with a single link operation.
* C style scoping within '{' and '}' with local and pool labels
respecting scopes.
* Conditional assembly with if/ifdef/else etc.
* Assembler directives representing a variety of features.
* Local labels can be defined in a number of ways, such as leading
period (.label) or leading at-sign (@label) or terminating
dollar sign (label$).
* String Symbols system allows building user expressions and macros
during assembly.
* Reassignment of symbols and labels by default.
* No indentation required for instructions, meaning that labels can't
be mnemonics, macros or directives.
* Supporting the syntax of other 6502 assemblers (Merlin syntax
requires the -merlin command line argument, -endm adds support for
sources using macro/endmacro and repeat/endrepeat combos rather
than scopes).
* Apple II GS executable output.
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Contents
--------
License
Command line arguments
CPU options
Syntax
Targets
Listing Output
Expressions
Math expression symbols supported
PC expression symbols supported
Conditional operators
Conditional assembly
65816
Data
Macros
Strings
Structs and Enums
Symbols
Label Pool
Sections
Relocatable code and linking
Merlin
All Directives
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
License
-------
Created by Carl-Henrik Skårstedt on 9/23/15.
The MIT License (MIT)
Copyright (c) 2015 Carl-Henrik Skårstedt
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
Details, source and documentation at https://github.com/Sakrac/x65.
"struse.h" can be found at https://github.com/Sakrac/struse,
only the header file is required.
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Document Updates
----------------
Nov 23 2015 - Initial pass of x65 documentation
Nov 24 2015 - More text
Nov 26 2015 - String directive and more text
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Command line arguments
----------------------
Input, output and target options are set on the command line. Many of
these options can also be controlled with assembler directives in the
source.
x65 source target [options]
Options include:
* -i(path) : Add include path
* -D(label)[=value] : Define a label with an optional value
(otherwise defined as 1)
* -cpu=6502/65c02/65c02wdc/65816: assemble with opcodes for a different cpu
* -acc=8/16: set the accumulator mode for 65816 at start, default is 8 bits
* -xy=8/16: set the index register mode for 65816 at start, default is 8 bits
* -org = $2000 or -org = 4096: set the default start address of
fixed address code
* -obj (file.x65) : generate object file for later linking
* -bin : Raw binary
* -c64 : Include load address (default)
* -a2b : Apple II Dos 3.3 Binary
* -a2p : Apple II ProDos Binary
* -a2o : Apple II GS OS executable (relocatable)
* -mrg : Force merge all sections (use with -a2o)
* -sym (file.sym) : symbol file
* -lst / -lst = (file.lst) : generate disassembly text from
result (file or stdout)
* -opcodes / -opcodes = (file.s) : dump all available opcodes (file or stdout)
* -sect: display sections loaded and built
* -vice (file.vs) : export a vice symbol file
* -merlin: use Merlin syntax
* -endm : macros end with endm or endmacro instead of scoped ('{' - '}')
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
CPU options
-----------
The CPU can be defined on the command line with the -cpu=<name> option, or as an
assembler directive with the CPU directive. The supported CPU names are:
* 6502 - basic 6502 instruction set
* 6502ill - 6502 instruction set with illegal opcodes
* 65C02 - basic 65C02 instruction set
* 65C02WDC - 65C02 instruction set with added WDC instructions
* 65816 - basic 65816 instruction set
The CPU can be changed within a source file; of the CPUs used, the one with
the highest instruction count will be used for -lst disassembly output.
65816 has additional states that the assembler needs to be aware of such as the
accumulator and index register sizes (8 or 16 bit). These can be specified
on the command line and using assembler directives like A16, A8, I16, I8 etc.
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Syntax
------
The syntax of x65 source is the result of trying to build code originally
created for a variety of assemblers, including a number of open source
games and old personal code. The primary syntax inspiration is from
Kick Assembler, but also DASM, TASM and XASM. Most of the downloaded
sample code was written for Apple II where Merlin, Orca and Lisa were
referenced.
Note that Merlin syntax requires the -merlin command line option.
In normal mode x65 does not care about indentation: labels can be indented
and instructions can be in column 1. In this mode labels cannot use
the same name as any directive or instruction, and the same goes for macros
etc. Colons are optional for labels.
Comments are line based and either semicolon or double forward slashes:
; comment
// also a comment
Local labels are any labels starting with ., !, @ or : or ending with $.
A local label will be discarded after a scope ends ( '}' ) or after a
global label is declared.
{ ; open scope
ldx #2
dex
beq .zero ; .zero is a local label within the current scope
bne ! ; address of open scope ({)
.zero
} ; close scope
Symbols are assigned with an equal sign or the EQU keyword and can be
preceded by 'CONST' to prevent changes:
BitmapStart = $2000
CONST ColorMap EQU $400
Symbols can be removed using the UNDEF directive:
UNDEF BitmapStart ; BitmapStart is no longer defined
By using the -merlin command line argument x65 is in Merlin syntax mode,
which restricts labels to column 1 and everything else to column 2
or higher. Merlin syntax also enables a number of Merlin specific assembler
directives. See the Merlin section for more information.
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Targets
-------
Most target file formats are just binary executable code with a few bytes
for load address and code size, with the exception of the Apple II GS
relocatable executable.
If building a fixed address target the initial address can be specified
with the command line option "-org" or by using an ORG directive in the source.
Multiple ORG statements are allowed in the source and the space in between
will be filled with zeroes.
In order to support larger projects an intermediate (fully assembled)
relocatable target format is available using the -obj command line option to
generate a .x65 object file. See Sections for more information about object
files.
Command line options for target output:
* -org = $2000: set the default start address of fixed address code,
default is $1000
* -obj (file.x65): generate object file for later linking
* -bin : Raw binary
* -c64 : Include load address (default)
* -a2b : Apple II Dos 3.3 Binary (load address + file size)
* -a2p : Apple II ProDos Binary (set org to $2000 otherwise binary)
* -a2o : Apple II GS OS executable (relocatable)
* -mrg : Force merge all sections (use with -a2o)
The -mrg option will combine all segments into one to allow for 16 bit
addressing to reach data in other segments, but will limit the size to fit
into a 64 k bank.
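As a rough illustration of the "few bytes for load address and code size"
mentioned above, the sketch below builds the two simplest headers in Python.
The function names are hypothetical and the byte layouts are the commonly
documented C64 .prg and Apple II DOS 3.3 binary layouts, not something taken
from the x65 source, so treat it as an assumption about what -c64 and -a2b
write.

```python
import struct

def c64_prg(load_address: int, code: bytes) -> bytes:
    """Assumed -c64 layout: 2-byte little-endian load address, then the code."""
    return struct.pack("<H", load_address) + code

def apple2_dos33_binary(load_address: int, code: bytes) -> bytes:
    """Assumed -a2b layout: load address, then file size, then the code."""
    return struct.pack("<HH", load_address, len(code)) + code

code = bytes([0xA9, 0x00, 0x60])          # lda #0 / rts
prg = c64_prg(0x1000, code)               # $1000 is the default -org
a2b = apple2_dos33_binary(0x2000, code)
```

A raw -bin target would be just the code bytes with no header at all.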
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Listing Output
--------------
The command line -lst option will enable list output which is a traditional
way to review 6502 code. -lst=(filename) will write the list output to a file
whereas -lst by itself will send the list output to stdout.
The list output will be generated after the source has been assembled. The
output will use spaces instead of tabs to keep the columns consistent in
different editors.
The order of lines in the list output will correspond to memory and not to the
order of lines in the original code, and lines that don't generate data may
be omitted.
Each scope ('{' to '}') starts and stops a cycle counter in the listing. The
start of a counter is marked by c>number and the end by c<number = time for
a single pass through all the instructions within the scope.
Columns left to right:
* Address
* Bytes (up to 4) or Cycle Counter start (c>1) / end (c<1 = ...)
* Instruction (disassembled)
* Cycle Count for Instruction
* Source line that generated the data
                                  section Code
      c>1                         Sin {
$0000 a2 03     ldx #$03     2    ldx #3
      c>2                         {
$0002 b5 e8     lda $e8,x    4    lda SinP.Ang,x
$0004 95 ec     sta $ec,x    4    sta SinP.R,x
$0006 95 e4     sta $e4,x    4    sta SinP.W0,x
$0008 95 f4     sta $f4,x    4    sta Mul824.A,x
$000a 95 f0     sta $f0,x    4    sta Mul824.B,x
$000c ca        dex          2    dex
$000d 10 f3     bpl $0002    2+   bpl !
      c<2 = 24 + 1                }
                                  ; x^2, copy to W1
$000f a9 e0     lda #$e0     2    lda #SinP.W1
$0011 20 00 00  jsr $0000    6    jsr Multiply824S_Copy
                                  ; iterate value
$0014 a0 00     ldy #$00     2    ldy #0
                                  .SinIterate
      c>2                         {
                                  ; W0 *= W1
$0016 a2 03     ldx #$03     2    ldx #3
      c>3                         {
$0018 b5 e4     lda $e4,x    4    lda SinP.W0,x ; x^(1+2n)
$001a 95 f4     sta $f4,x    4    sta Mul824.A,x
$001c b5 e0     lda $e0,x    4    lda SinP.W1,x ; x^2
$001e 95 f0     sta $f0,x    4    sta Mul824.B,x
$0020 ca        dex          2    dex
$0021 10 f5     bpl $0018    2+   bpl !
      c<3 = 20 + 1                }
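The scope totals in the listing above can be reproduced by adding up the
cycle column between a c> marker and its matching c< marker. The Python
sketch below does that for the two inner loops. Reading the trailing "+ 1"
as the extra cycle of the "2+" branch when it is taken is an assumption, and
scope_cycles is a hypothetical helper, not part of x65.

```python
def scope_cycles(cycle_column):
    """Sum a list of cycle entries such as "4" or "2+" into (base, extra)."""
    base = 0
    extra = 0
    for entry in cycle_column:
        if entry.endswith("+"):
            extra += 1            # assumed: one optional extra cycle
            entry = entry[:-1]
        base += int(entry)
    return base, extra

# Cycle columns copied from the two inner scopes of the listing
first_loop = ["4", "4", "4", "4", "4", "2", "2+"]
second_loop = ["4", "4", "4", "4", "2", "2+"]

first_total = scope_cycles(first_loop)    # listed as c<2 = 24 + 1
second_total = scope_cycles(second_loop)  # listed as c<3 = 20 + 1
```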
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Expressions
-----------
Expressions contain values, such as labels or raw numbers, and operators.
The order of operations is based on C-like precedence. Internally the
expression is converted to reverse Polish notation to make it easier to
keep track of complex expressions.
Values in expressions can be labels, symbols, strings (added as an
expression within parentheses) or raw decimal, binary or hexadecimal numbers.
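To give an idea of what converting an expression to reverse Polish notation
involves, here is a minimal shunting-yard sketch in Python. It is not the x65
implementation: it only handles binary operators, parentheses, names and
decimal or $hex numbers, and the precedence table is an assumption based on
the usual C ordering.

```python
import re

# Binary operator precedence, higher binds tighter (assumed C-like ordering).
PRECEDENCE = {"*": 5, "/": 5, "+": 4, "-": 4,
              "<<": 3, ">>": 3, "&": 2, "^": 1, "|": 0}

TOKEN = re.compile(r"\s*(<<|>>|\$[0-9a-fA-F]+|\d+|[A-Za-z_]\w*|[-+*/&|^()])")

def to_rpn(expression):
    """Convert an infix expression string to a list of RPN tokens."""
    expression = expression.strip()
    output, stack = [], []
    pos = 0
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if match is None:
            raise ValueError("unexpected character at %d" % pos)
        token = match.group(1)
        pos = match.end()
        if token in PRECEDENCE:
            # Pop operators that bind at least as tightly (left associative).
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[token]):
                output.append(stack.pop())
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parenthesis")
            stack.pop()           # discard the matching "("
        else:
            output.append(token)  # value: number or label
    while stack:
        operator = stack.pop()
        if operator == "(":
            raise ValueError("unbalanced parenthesis")
        output.append(operator)
    return output
```

For example, to_rpn("1 + 2 * 3") yields the tokens 1 2 3 * +, so the
multiplication is applied before the addition without needing parentheses.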
Math expression symbols supported:
+ Add two numbers (a+b)
- Subtract one number from another (a-b)
* Multiply two numbers (a*b)
/ Divide one number by another (a/b)
& Bitwise and of two numbers (a&b)
| Bitwise or of two numbers (a|b)
^ Bitwise exclusive or of two numbers (a^b)
<< Shift value left (multiply a by 2^b)
>> Shift value right (divide a by 2^b)
( Open parenthesis, override operator precedence
) Close parenthesis, end a parenthesis block
PC expression symbols supported:
* Current address (PC). This conflicts with the use of * as multiply
so multiply will be interpreted only after a value or right parenthesis
< If less than is not followed by another '<' in an expression this
evaluates to the low byte of a value (and $ff)
> If greater than is not followed by another '>' in an expression
this evaluates to the high byte of a value (>>8)
^ In between two values '^' is an eor operation, as a prefix to
values it extracts the bank byte (v>>16).
! Start of scope (use like an address label in expression)
% First address after scope (use like an address label in expression)
$ Precedes hexadecimal value
% If immediately followed by '0' or '1' this is a binary value and not
scope closure address
Conditional operators:
== Double equal signs yields 1 if left value is the same as the right value
< If in between two values, less than will yield 1 if left value is less
than right value
> If in between two values, greater than will yield 1 if left value is
greater than right value
<= If in between two values, less than or equal will yield 1 if left value
is less than or equal to right value
>= If in between two values, greater than or equal will yield 1 if left
value is greater than or equal to right value
Example:
lda #(((>SCREEN_MATRIX)&$3c)*4)+8
sta $d018
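The example can be worked through by hand. The Python sketch below mirrors
the < and > prefix operators as described above (and $ff, >>8) and evaluates
the same expression. The value of SCREEN_MATRIX is not given in this
document, so $0400 is an assumed value chosen only for illustration.

```python
def low_byte(value):
    return value & 0xFF      # '<' prefix: and $ff

def high_byte(value):
    return value >> 8        # '>' prefix: >>8 (16 bit values assumed)

SCREEN_MATRIX = 0x0400       # assumed value, not defined in this document

# lda #(((>SCREEN_MATRIX)&$3c)*4)+8
immediate = ((high_byte(SCREEN_MATRIX) & 0x3C) * 4) + 8
```

With that assumed address the high byte is $04, the mask keeps $04, the
multiply gives $10 and the final add gives $18.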
Avoid using a parenthesis as the first character of the parameter of an opcode
that supports indirect addressing as well as absolute addressing, since the
parentheses will be read as indirect addressing. This can be avoided by:
jmp (a+b) ; generates an indirect jump
jmp.a (a+b) ; generates an absolute jump
jmp +(a+b) ; generates an absolute jump
c = (a+b)
jmp c ; generates an absolute jump
jmp a+b ; generates an absolute jump
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Conditional assembly
--------------------
IF / ELSE / ENDIF etc. work in a similar way to C. IF exp / ELIF exp assembles if
the expression is non-zero, IFDEF symbol assembles if the symbol has been
assigned.
There isn't any particular restriction to what can be excluded in a
non-assembling block of source.
* ELIF - conditionals, "else if" following an IF or IFDEF condition
* ELSE - conditionals, following an IF or IFDEF or ELIF condition
* ENDIF - conditionals, terminates a condition
* IF - conditionals, start a block of conditional assembly if an expression
evaluates to non-zero
* IFDEF - conditionals, start a block of conditional assembly if a symbol or
label exists at this point
Example:
if 0
this part of the source will not assemble,
however a line can not start with a conditional
assembler directive such as if, ifdef, else, elif
or endif within a block that does not assemble
unless followed by a valid expression
else
; this part of the source will assemble
lda #0
rts
endif
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
65816
-----
65816 is a major expansion of 6502 and requires the assembler to be aware of
what processor flags the user has set to select instructions.
Use -cpu=65816 on the command line or CPU 65816 in source to set.
* A16 - 65816, set accumulator immediate operators to 16 bit mode
* A8 - 65816, set accumulator immediate operators to 8 bit mode
* I16 - 65816, set index register immediate operators to 16 bit mode,
same as XY16
* I8 - 65816, set index register immediate operators to 8 bit mode,
same as XY8
* XY16 - 65816, set index register immediate operators to 16 bit mode,
same as I16
* XY8 - 65816, set index register immediate operators to 8 bit mode,
same as I8
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Data
----
Data is any part of the binary that is not generated by assembler
mnemonics. Most of the directives declare specific data except for DS that
declares a repeating value.
* BYTE - data, define comma separated bytes
* BYTES - data, same as byte
* DC - data, define comma separated bytes (default), words, triples or
longs (DC.B, DC.W, DC.T, DC.L)
* DS - data, define repeated value, first value is count, optional is fill
value, default is in bytes (DS.B, DS.W, DS.T, DS.L)
* DV - data, same as DC but differentiated in DASM as allowing expressions
* IMPORT - data and sections, load a file and include it in the assembly based
on the argument
* INCBIN - data, load a file and insert it at the current address
* INCDIR - data and control, add a directory to search for INCLUDE, INCBIN,
INCOBJ or IMPORT files in
* LONG - data, define comma separated 32 bit values
* TEXT - data, insert text at the current address optionally with a filter
* WORD - data, insert comma separated 16 bit values, same as WORDS
* WORDS - data, insert comma separated 16 bit values, same as WORD
Example:
ONE_824 = 1<<24 ; 1 as a 8.24 number
CosInvPermute: ; 1 +
long -(ONE_824 + 1)/(2) ; x^2 * this
long (ONE_824 + 3*4)/(2*3*4) ; x^4 * this
long -(ONE_824 + 3*4*5*6)/(2*3*4*5*6) ; x^6 * this
long -(ONE_824 + 3*4*5*6*7*8)/(2*3*4*5*6*7*8) ; x^8 * this
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Macros
------
The default macro syntax is similar to a C inline function, using the
directive MACRO.
MACRO [name](parameter1, parameter2, etc.) {
lda #parameter1
sta parameter2
}
To use the macro use the name and specify parameters:
[name](1,dest)
The parentheses are optional both for the macro declaration and for the
macro instantiation so macros can be used as if they were instructions:
MACRO neg address {
sec
lda #0
sbc address
sta address
}
MACRO nega {
eor #$ff
sec
adc #0
}
Now 'neg' and 'nega' can be used as if they were instructions:
neg $7f80 ; negate byte at this hard coded address for some reason
lda #$6c
nega ; negate accumulator
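Both macros are two's complement negations. The Python sketch below models
the accumulator arithmetic (binary mode assumed, decimal flag clear) to show
that the two instruction sequences give the same result. The function names
simply mirror the macro names and the model is an illustration, not an
emulator.

```python
def nega(a):
    """Model of: eor #$ff / sec / adc #0 on an 8 bit accumulator."""
    a ^= 0xFF                     # eor #$ff
    carry = 1                     # sec
    return (a + 0 + carry) & 0xFF # adc #0

def neg(memory_byte):
    """Model of: sec / lda #0 / sbc address (the result is then stored)."""
    carry = 1                     # sec
    a = 0                         # lda #0
    return (a - memory_byte - (1 - carry)) & 0xFF  # sbc address

negated = nega(0x6C)              # the lda #$6c / nega example above
```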
In order to support code written for other assemblers the -endm command line
option changes the syntax for macro declarations to start on the line after
MACRO and end before the line starting with ENDM or ENDMACRO:
MACRO inca
sec
adc #0
ENDMACRO
Directives for macros:
* MACRO - macros, start a macro declaration
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Strings
-------
Strings are special symbols that contain text and were included in an
effort to support ORCA macros. The difference between ORCA and other
assemblers is that ORCA macros can build up string symbols (along with
value symbols) and combine results into a more powerful macro system.
x65 now supports the same mechanism but not the same exact keywords.
Strings can be created and passed in as a value symbol in expressions
or used directly as a macro (without parameters).
Strings are defined using the STRING directive followed by the string
name and an equal sign followed by a string expression.
Strings can include value symbols which will be evaluated and represented
by $ + the hexadecimal representation of the value.
The UNDEF directive can be used to remove String Symbols.
Example:
STRING exp = "1 + 2 + 3"
EVAL exp
result (output):
EVAL(2): "exp" = "1 + 2 + 3" = $6
Example:
STRING code_str = "lda #0\nsta $fe"
code_str
result (code):
lda #0
sta $fe
Example:
STRING concat_example = "ldx #0"
concat_example +=
Directives for String Symbols:
* STRING - declare a string symbol
* UNDEF - remove a string
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Structs and Enums
-----------------
* ENUM - structs and enums, declare enumerations like C
* STRUCT - structs and enums, declare a C-like structure of symbols
separated by dots
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Symbols
-------
Symbols are assigned with an equal sign or the keyword EQU or defined
as labels within code.
Structs and Enums are structured symbols.
INCSYM can be used to reference symbols from previously assembled
binary executables:
INCSYM EntryPoint "Binary.sym"
EntryPoint is defined from the previously assembled code using an
optional symbol file.
* INCSYM - symbols, include all or specific symbols from a .sym file
* LABEL - symbols, optional prefix to symbol assignments
* LABPOOL - symbols, a stack-like pool of addresses, same as POOL
* STRUCT - structs and enums, declare a C-like structure of symbols
separated by dots
* POOL - symbols, a stack-like pool of addresses, same as LABPOOL
* CONST - symbols, declare assigned symbol as constant and if changed
cause an error
* XDEF - sections, declare a label as external which can be referenced in
other source files by using XREF
* XREF - sections, reference a label that has been declared as global in
another file by using XDEF
* UNDEF - symbols, erase a symbol or string
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Label Pool
----------
A label pool provides temporary address labels. This is similar to how
stack frame variables are assigned in C.
A label pool is a mini stack of addresses that can be assigned as
temporary labels with a scope ('{' and '}'). This can be handy for large
functions trying to minimize use of zero page addresses, the function can
declare a range (or set of ranges) of available zero page addresses and
labels can be assigned within a scope and be deleted on scope closure.
The format of a label pool is: "pool [pool name] start-end, start-end"
and labels can then be allocated from that range by
[pool name] [label name][.b][.w]
where .b means allocate one byte and .w means allocate two bytes. The
label pools themselves are local to the scope they are defined in so
you can have label pools that are only valid for a section of your code.
Label pools work with any addresses, not just zero page addresses.
Example:
```
Function_Name: {
pool zpWork $f6-$100 ; zero page addresses for temporary labels
zpWork zpTrg.w ; zpTrg will be $fe
zpWork zpSrc.w ; zpSrc will be $fc
lda #<Src ; low byte of the source address
sta zpSrc
lda #>Src ; high byte of the source address
sta zpSrc+1
lda #<Dest ; low byte of the target address
sta zpTrg
lda #>Dest ; high byte of the target address
sta zpTrg+1
{
zpWork zpLen ; zpLen will be $fb
lda #Length
sta zpLen
}
nop
{
zpWork zpOff ; zpOff will be $fb (shared with zpLen)
}
rts
}
```
The following extensions are recognized:
* [pool name] var (no extension is one byte)
* [pool name] var.w (2 bytes)
* [pool name] var.d (2 bytes)
* [pool name] var.t (3 bytes)
* [pool name] var.l (4 bytes)
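The addresses in the example comments ($fe, $fc, $fb) suggest that labels
are handed out from the top of the range downwards and released when a scope
closes. The Python sketch below models that behaviour for a single range.
LabelPool and its methods are hypothetical names and the top-down order is
inferred from the example, not taken from the x65 source.

```python
class LabelPool:
    """Toy model of a label pool covering one range [start, end)."""

    def __init__(self, start, end):
        self.start = start
        self.top = end        # the next label ends just below this address
        self.marks = []       # saved tops, one per open scope

    def open_scope(self):     # '{'
        self.marks.append(self.top)

    def close_scope(self):    # '}' releases labels allocated in the scope
        self.top = self.marks.pop()

    def alloc(self, size=1):
        if self.top - size < self.start:
            raise MemoryError("label pool exhausted")
        self.top -= size
        return self.top

zpWork = LabelPool(0xF6, 0x100)   # pool zpWork $f6-$100
zpTrg = zpWork.alloc(2)           # zpWork zpTrg.w
zpSrc = zpWork.alloc(2)           # zpWork zpSrc.w
zpWork.open_scope()
zpLen = zpWork.alloc()            # zpWork zpLen
zpWork.close_scope()
zpWork.open_scope()
zpOff = zpWork.alloc()            # zpWork zpOff, reuses the zpLen address
zpWork.close_scope()
```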
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Sections
--------
x65 supports linking of fully assembled object files into a single
larger project. This is a fairly standard feature of compilers but
supporting both common 68000 linking style and Apple II Merlin style
means that x65 is not quite as straightforward.
The purpose of a linked project is to work in multiple source files
without worrying about where in memory each file gets compiled to.
In addition sections of code and data in a single file can be linked
to different target locations. Each source file gets assembled to an
object file (.x65) and all the internal and external references are
stored separately from the binary code to be fixed up later.
The last step of a linked project is to load all object files and
generate one or more exported programs. A special source file uses
the INCOBJ directive to bring in object files one by one, and the
LINK [segment name] directive to place them one after another at a fixed
address.
The SECTION directive starts a block of code or data to be linked
later. By default x65 creates a section named "default" which can
be used for linking as is but is intended to be replaced.
To export a label from a source file, declare it
with XDEF prior to defining it:
XDEF Function
SECTION Code
Function:
lda #1
rts
To reference an exported label from a different file use XREF:
XREF Function
SECTION Code
Code:
jsr Function
rts
To link object files (.x65) into an executable the assembled
objects need to be combined into a single source using INCOBJ:
INCOBJ "Code.x65"
INCOBJ "Routines.x65"
The result begins with the first code section, which is either the first
one included or the first one declared in the link file.
The link file can export multiple binary executable files by using
the EXPORT directive:
SECTION CodeOther, Code
EXPORT other
Code in the CodeOther section will be built as (binary)_other.(ext).
By linking multiple targets at once files can reference labels
between each other.
Sections can be named anything and still be assigned a section type:
section Gameplay, Code ; code section named Gameplay, unaligned
...
section GameBinary, Data, $100 ; data section named GameBinary, aligned
...
section Work, Zeropage ; Zeropage or Direct page section
...
section FixedZP, Zeropage
org $a0 ; Make zero page section as a fixed address
Section types include:
* Code: binary code
* Data: binary data
* BSS: uninitialized memory (for certain targets filled with zeroes)
* Zeropage: uninitialized memory restricted to the range $00 - $ff
Additional section directive styles include:
SEG segname
SEG.U segname
SEGMENT "segname": segtype
.SEGMENT "segname"
For creating relocatable files (OMF) certain sections can not be fixed address.
Special sections for Apple II GS executables:
Sections named DirectPage_Stack and of a BSS type (default) determine the size
of the direct page + stack for the executable. If multiple sections match this
rule the size will be the sum of all the sections with this name.
Zeropage sections will be linked to a fixed address (default at the highest
direct page addresses) prior to exporting the relocatable code. Zeropage
sections in x65 are intended to allocate ranges of the zero page / direct page,
which is a bit confusing with OMF that has the concept of the direct page +
stack segment.
Directives related to sections:
* DUMMY - sections, start a dummy section (defines addresses but does not
generate data, same as Merlin DUM)
* DUMMY_END - sections, end a dummy section (same as Merlin DEND)
* EXPORT - sections, this section will link or save to a separate binary file
with the argument appended to the link or binary filename.
* IMPORT - data and sections, load a file and include it in the assembly based
on the argument
* INCOBJ - sections, load an object file (.x65) of previously assembled source
* LINK - sections, links a section to the current section
* SECTION - section, declare a section; Comma separated arguments are name,
type, align where type is Code, Data, BSS or Zeropage
* SEG - section, same as SECTION
* SEGMENT - section, same as SECTION
* XDEF - sections, declare a label as external which can be referenced in
other source files by using XREF
* XREF - sections, reference a label that has been declared as global in
another file by using XDEF
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Relocatable code and linking
----------------------------
A lot of 6502 code has been built with fixed address assemblers. While
supporting fixed address assembling, x65 is built around generating relocatable
code that can be linked as a final build step.
Code and data are broken into sections, where data sections can be
uninitialized (BSS and Zeropage) or initialized. Sections with the same
type and the same name are combined before linking.
Apple II GS uses a relocatable binary format that can be exported, other
targets link to a fixed address during the linking stage.
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
Merlin
------
x65 can compile most Merlin syntax code with the -merlin command line
option.
This mode enables a variety of directives and label rules to support Merlin
assembler sources. Merlin syntax is supported in x65 since there is historic
relevance and readily available publicly released source.
Merlin Label Syntax
]label means mutable address label, also does not seem to invalidate
local labels.
:label is perfectly valid, currently treating as a local variable
labels can include '?'
Merlin labels are not allowed to include '.' as period means logical
or in Merlin, which also means that enums and structs are not
supported when assembling with Merlin syntax.
Merlin expressions
Merlin may not process expressions (probably left to right, parentheses
not allowed) the same as x65 but given that it wouldn't be intuitive
to read the code that way, there are probably very few cases where this
would be an issue.
Merlin additional directives
XC
Change processor. The first instance of XC will switch from 6502 to
65C02, the second switches from 65C02 to 65816. To return to 6502 use
XC OFF. To go directly to 65816 XC XC is supported.
MX
MX sets the immediate mode operand size for the accumulator and the index
registers. It takes a number and uses the lowest two bits. Bit 0 applies to
index registers (x, y) where 0 means 16 bits and 1 means 8 bits, bit 1
applies to the accumulator. Normally it is specified in binary using the
'%' prefix.
MX %11
LUP
LUP is Merlingo for loop. The lines following the LUP directive to
the keyword --^ are repeated the number of times that follows LUP.
MAC
MAC is short for Macro. Merlin macros are defined on the lines in between
MAC and <<< or EOM. Macro arguments are listed on the same line as
MAC and the macro identifier is the label preceding the MAC directive
on the same line.
EJECT
An old assembler directive that does not affect the assembler but if
printed would insert a page break at that point.
DS
Define storage, followed by a number of bytes. If the number is positive
insert this amount of 0 bytes, if negative, reduce the current PC.
DUM, DEND
Dummy section, this will not write any opcodes or data to the binary
output but all code and data will increment the PC address up to the
point of DEND.
PUT
A variation of INCLUDE that applies an oddball set of filename
rules. These rules apply to INCLUDE as well just in case they
make sense.
USR
In Merlin USR calls a function at a fixed address in memory, x65
safely avoids this. If there is a requirement for a user defined
macro you've got the source code to do it in.
SAV
SAV causes Merlin to save the result it has generated so far,
which is somewhat similar to the EXPORT directive.
If the SAV name is different than the source name the section
will have a different EXPORT name appended and exported to a
separate binary file.
DSK
DSK is similar to SAV
ENT
ENT defines the label that precedes it as external, same as XDEF.
EXT
EXT imports an external label, same as XREF.
LNK, STR
LNK links the contents of an object file. To fit with the named section
method of linking in x65 this keyword has been reworked to have a
similar result; the actual linking doesn't begin until the current
section is complete.
CYC
CYC starts and stops a cycle counter. x65 scoping allows for hierarchical
cycle listings but the first Merlin CYC directive starts the counter and
the next CYC stops the counter and shows the result. This is 6502 only
until data is entered for other CPUs.
ADR
Define byte triplets (like DA but three bytes instead of 2)
ADRL
Define values of four bytes.
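The XC and MX rules described earlier in this section are easy to get
backwards, so here is a small Python sketch of both. The function names are
hypothetical. For MX, the accumulator bit is assumed to use the same polarity
as the index bit (1 means 8 bits), and for XC the behaviour of a third plain
XC (staying at 65816) is an assumption.

```python
CPU_STEPS = ["6502", "65C02", "65816"]

def apply_xc(cpu, argument=""):
    """Return the CPU selected after an XC directive with this argument."""
    argument = argument.strip().upper()
    if argument == "OFF":
        return "6502"                      # XC OFF
    if argument == "XC":
        return "65816"                     # XC XC
    step = CPU_STEPS.index(cpu)
    return CPU_STEPS[min(step + 1, len(CPU_STEPS) - 1)]

def mx_sizes(mx):
    """Return (accumulator bits, index register bits) for an MX argument."""
    index_bits = 8 if mx & 1 else 16       # bit 0: index registers
    accumulator_bits = 8 if mx & 2 else 16 # bit 1: accumulator (assumed polarity)
    return accumulator_bits, index_bits

after_two_xc = apply_xc(apply_xc("6502"))
sizes = mx_sizes(0b11)                     # MX %11
```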
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-
All Directives
--------------
* A16 - 65816, set accumulator immediate operators to 16 bit mode
* A8 - 65816, set accumulator immediate operators to 8 bit mode
* ABORT - exit assembler after printing the argument to stdout and error,
same as ERR
* ALIGN - fixed address assembly align next to argument, reloc assembly set
alignment of section if immediately after section declaration
* BYTE - data, define comma separated bytes
* BYTES - data, same as byte
* CONST - symbols, declare assigned symbol as constant and if changed cause
an error
* UNDEF - symbols, erase a symbol or string
* CPU - instructions, change target processor, valid arguments are: 6502,
6502ill, 65C02, 65C02WDC, 65816; Same as PROCESSOR
* DC - data, define comma separated bytes (default), words, triples or
longs (DC.B, DC.W, DC.T, DC.L)
* DS - data, define repeated value, first value is count, optional is fill
value, default is in bytes (DS.B, DS.W, DS.T, DS.L)
* DUMMY - sections, start a dummy section (defines addresses but does not
generate data, same as Merlin DUM)
* DUMMY_END - sections, end a dummy section (same as Merlin DEND)
* DV - data, same as DC but differentiated in DASM as allowing expressions
* ECHO - status, output an expression to stdout, same as PRINT and EVAL
* ELIF - conditionals, "else if" following an IF or IFDEF condition
* ELSE - conditionals, following an IF or IFDEF or ELIF condition
* ENDIF - conditionals, terminates a condition
* ENUM - structs and enums, declare enumerations like C
* ERR - exit assembler with a message and error, same as ABORT
* EVAL - status, output an expression to stdout, same as PRINT and ECHO
* EXPORT - sections, this section will link or save to a separate binary
file with the argument appended to the link or binary filename.
* I16 - 65816, set index register immediate operators to 16 bit mode,
same as XY16
* I8 - 65816, set index register immediate operators to 8 bit mode,
same as XY8
* IF - conditionals, start a block of conditional assembly if an expression
evaluates to non-zero
* IFDEF - conditionals, start a block of conditional assembly if a symbol or
label exists at this point
* IMPORT - data and sections, load a file and include it in the assembly
based on the argument
* INCBIN - data, load a file and insert it at the current address
* INCDIR - data and control, add a directory to search for INCLUDE, INCBIN,
INCOBJ or IMPORT files in
* INCLUDE - control, load a source file and assemble it at the current address
* INCOBJ - sections, load an object file (.x65) of previously assembled source
* INCSYM - symbols, include all or specific symbols from a .sym file
* LABEL - symbols, optional prefix to symbol assignments
* LABPOOL - symbols, a stack-like pool of addresses, same as POOL
* LINK - sections, links a section to the current section
* LOAD - set the load address for fixed address binary if different than the
initial fixed address (c64 prg and Apple II Dos 3)
* LONG - data, define comma separated 32 bit values
* MACRO - macros, start a macro declaration
* ORG - set fixed address, same as PC
* PC - set fixed address, same as ORG
* POOL - symbols, a stack-like pool of addresses, same as LABPOOL
* PRINT - status, output an expression to stdout, same as ECHO and EVAL
* PROCESSOR - instructions, change target processor, valid arguments are: 6502,
6502ill, 65C02, 65C02WDC, 65816; Same as CPU
* REPEAT - repeat a block of code a number of times, same as REPT
* REPT - repeat a block of code a number of times, same as REPEAT
* STRING - strings, declare a string that can be used in expressions or
assembled as if it was a macro.
* SECTION - section, declare a section; Comma separated arguments are name,
type, align where type is Code, Data, BSS or Zeropage
* SEG - section, same as SECTION
* SEGMENT - section, same as SECTION
* STRUCT - structs and enums, declare a C-like structure of symbols separated
by dots
* TEXT - data, insert text at the current address optionally with a filter
* WORD - data, insert comma separated 16 bit values, same as WORDS
* WORDS - data, insert comma separated 16 bit values, same as WORD
* XDEF - sections, declare a label as external which can be referenced in
other source files by using XREF
* XREF - sections, reference a label that has been declared as global in
another file by using XDEF
* XY16 - 65816, set index register immediate operators to 16 bit mode,
same as I16
* XY8 - 65816, set index register immediate operators to 8 bit mode,
same as I8
-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-