# x65
6502 Macro Assembler in a single C++ file using the struse single file text parsing library. Supports most syntaxes. x65 was previously named Asm6502 but was renamed because Asm6502 is too generic; x65 has no particular meaning.
Every assembler seems to add or change its own quirks to the 6502 syntax. This implementation aims to support all of them at once as long as there is no contradiction.
To keep up with this trend x65 is adding the following features to the mix:
* Full expression evaluation everywhere values are used: [Expressions](#expressions)
* Basic relative sections and linking.
* Apple II GS executable output
* C style scoping within '{' and '}': [Scopes](#scopes)
* Reassignment of labels. This means there is no error if you declare the same label twice, but on the other hand you can do things like label = label + 2.
* [Local labels](#labels) can be defined in a number of ways, such as leading period (.label) or leading at-sign (@label) or terminating dollar sign (label$).
* [Directives](#directives) support both with and without leading period.
* Labels don't need to end with colon, but they can.
* No indentation required for instructions, meaning that labels can't be mnemonics, macros or directives.
* Conditional assembly with if/ifdef/else etc.
* As far as achievable, support the syntax of other 6502 assemblers (Merlin syntax now requires the -merlin command line argument; -endm adds support for sources using macro/endmacro and repeat/endrepeat combos rather than scopes).
In summary, if you are familiar with any 6502 assembler syntax you should feel at home with x65. If you're familiar with C programming expressions you should be comfortable with '{', '}' scoping and complex expressions.
There are no hard limits on binary size so if the address exceeds $ffff it will just wrap around to $0000. I'm not sure about the best way to handle that or if it really is a problem.
There is a sublime package for coding/building in Sublime Text 3 in the *sublime* subfolder.
## Features
* **Code**
* **Linking**
* **Comments**
* **Labels**
* **Directives**
* **Macros**
* **Expressions**
* **List File with Cycle Count**
## Prerequisite
x65.cpp requires struse.h which is a single file text parsing library that can be retrieved from https://github.com/Sakrac/struse.
### References
* [6502 opcodes](http://www.6502.org/tutorials/6502opcodes.html)
* [6502 opcode grid](http://www.llx.com/~nparker/a2/opcodes.html)
* [Codebase64 CPU section](http://codebase64.org/doku.php?id=base:6502_6510_coding)
* [6502 illegal opcodes](http://www.oxyron.de/html/opcodes02.html)
* [65816 opcodes](http://wiki.superfamicom.org/snes/show/65816+Reference#fn:14)
## Command Line Options
The command line options specify the source file, the destination file and what type of file to generate, such as c64, Apple II DOS 3.3, raw binary or an x65 specific object file. You can also generate a disassembly listing with inline source code or dump the available set of opcodes as a source file. The command line can also set labels for conditional assembly, which allows distinguishing debug builds from shippable builds.
Typical command line ([*] = optional):
```
x65 [-DLabel] [-iIncDir] (source.s) (dest.prg) [-lst[=file.lst]] [-opcodes[=file.s]]
[-sym dest.sym] [-vice dest.vs] [-obj]/[-c64]/[-bin]/[-a2b] [-merlin] [-endm]
```
**Usage**
x65 filename.s code.prg [options]
* -i(path) : Add include path
* -D(label)[=value] : Define a label with an optional value (otherwise defined as 1)
* -cpu=6502/65c02/65c02wdc/65816: assemble with opcodes for a different cpu
* -acc=8/16: set the accumulator mode for 65816 at start, default is 8 bits
* -xy=8/16: set the index register mode for 65816 at start, default is 8 bits
* -org=$2000 or -org=4096: set the default start address of fixed address code
* -obj : generate object file for later linking instead of executable binary (.x65)
* -bin : Raw binary (no load address or size included before code)
* -c64 : Include load address (default, default org is $1000)
* -a2b : Apple II Dos 3.3 Binary (changes default org to $803, adds load addr+size)
* -a2p : Apple II ProDos Binary (changes default org to $2000, sets to binary)
* -a2o : Apple II GS OS executable (writes relocatable executable binary)
* -mrg : Force merge all sections (use with -a2o)
* -sym (file.sym) : symbol file
* -lst / -lst=(file.lst) : generate disassembly text from result (file or stdout)
* -opcodes / -opcodes=(file.s) : dump all available opcodes (file or stdout)
* -sect: display sections loaded and built
* -vice (file.vs) : export a vice symbol file
* -merlin: use Merlin syntax
* -endm : macros end with endm or endmacro instead of scoped ('{' - '}')
### Code
Code is any valid mnemonic/opcode and addressing mode. At the moment only one opcode per line is assembled.
### Linking
To manage more complex projects it is desirable to link multiple assembled object files, so x65 builds object files that can be included in a final linking step.
Simply build code with or without a fixed address and the -obj filename.x65 command line argument, then use INCOBJ filename.x65 in a final linking source. The linking source can be assigned a fixed address for most targets or exported as a relocatable executable for Apple II GS.
### Relocatable executable
This applies to Apple II GS OS executables. This output requires 65816 instructions to handle the larger memory, and the entry point for code needs to be implemented correctly. Using the -mrg option merges all sections together so that 16 bit addressing is safe; otherwise different code or data segments could be loaded in different banks and 3 byte referencing is required. An important note is that I have not been significantly exposed to Apple II GS or 65816, so this feature has only been checked for correctness as far as possible without actually building a running piece of code.
### Comments
Comments are currently line based and both ';' and '//' are accepted as delimiters.
### Expressions
Anywhere a number can be entered it can also be interpreted as a full expression, for example:
```
Get123:
bytes Get1-*, Get2-*, Get3-*
Get1:
lda #1
rts
Get2:
lda #2
rts
Get3:
lda #3
rts
```
Would yield 3 bytes where the address of a label can be calculated by taking the address of the byte plus the value of the byte.
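As a worked example of that arithmetic, assume the table is assembled at a fixed address of $2000 (the address is an assumption chosen for illustration). `lda #n` takes 2 bytes and `rts` takes 1 byte, so each routine is 3 bytes long. If `*` is the address of each table byte, as described above, the values work out as follows:

```
org $2000                    ; assumed start address, for illustration only
Get123:
bytes Get1-*, Get2-*, Get3-* ; $2003-$2000, $2006-$2001, $2009-$2002 = 3, 5, 7
Get1:                        ; $2003 = $2000 + 3
lda #1
rts
Get2:                        ; $2006 = $2001 + 5
lda #2
rts
Get3:                        ; $2009 = $2002 + 7
lda #3
rts
```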
### Labels
Labels come in two flavors: **Addresses** (PC based) or **Values** (Evaluated from an expression). An address label is simply placed somewhere in code and a value label is followed by '**=**' and an expression. All labels are rewritable so it is fine to do things like NumInstance = NumInstance+1. Value assignments can be prefixed with '.const' or '.label' but are not required to be prefixed by anything. The CONST keyword should cause an error if the label is modified in the same source file.
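A minimal sketch of value labels, assuming the behavior described above. `NumInstance` is the name used in the text; `MaxInstance` is a made-up name for illustration:

```
NumInstance = 0               ; value label, assigned from an expression
NumInstance = NumInstance+1   ; reassigned, now 1
NumInstance = NumInstance+1   ; reassigned again, now 2
const MaxInstance = 8         ; constant value label (hypothetical name)
lda #NumInstance              ; should use the most recent value, 2
; MaxInstance = 9             ; reassigning a const label should be reported as an error
```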
*Local labels* exist in between *global labels* and get discarded whenever a new global label is added. The syntax for a local label is one of: prefix with period, at-sign or exclamation mark, or suffix with $, as in: **.local** or **!local** or **@local** or **local$**. Both value labels and address labels can be local labels.
```
Function: ; global label
ldx #32
.local_label ; local label
dex
bpl .local_label
rts
Next_Function: ; next global label, the local label above is now erased.
rts
```
### Directives
Directives are assembler commands that control the code generation but do not generate code by themselves. Some assemblers prefix directives with a period (.org instead of org) so a leading period is accepted but not required for directives.
* [**CPU**](#cpu) Set the CPU to assemble for.
* [**ORG**](#org) (same as **PC**): Set the current compiling address.
* [**LOAD**](#load) Set the load address for binary formats that support it.
* [**SECTION**](#section) Start a relative section
* [**LINK**](#link) Link a relative section at this address
* [**XDEF**](#xdef) Make a label available globally
* [**XREF**](#xref) Reference a label declared globally in a different object file (.x65)
* [**INCOBJ**](#incobj) Include an object file (.x65) to this file
* [**EXPORT**](#export) Save out additional binary files with argument appended to filename
* [**ALIGN**](#align) Align the address to a multiple by filling with 0s
* [**MACRO**](#macro) Declare a macro
* [**EVAL**](#eval) Log an expression during assembly.
* [**BYTES**](#bytes) Insert comma separated bytes at this address (same as **BYTE** or **DC.B**)
* [**WORDS**](#words) Insert comma separated 16 bit values at this address (same as **WORD** or **DC.W**)
* [**LONG**](#long) Insert comma separated 32 bit values at this address
* [**TEXT**](#text) Insert text at this address
* [**INCLUDE**](#include) Include another source file and assemble at this address
* [**INCBIN**](#incbin) Include a binary file at this address
* [**IMPORT**](#import) Catch-all file inclusion (source, bin, text, object, symbols)
* [**CONST**](#const) Assign a value to a label and make it constant (error if reassigned with other value)
* [**LABEL**](#label) Decorative directive to assign an expression to a label
* [**INCSYM**](#incsym) Include a symbol file with an optional set of wanted symbols.
* [**POOL**](#pool) Add a label pool for temporary address labels
* [**IF / ELSE / IFDEF / ELIF / ENDIF**](#conditional) Conditional assembly
* [**STRUCT**](#struct) Hierarchical data structures (dot separated sub structures)
* [**REPT**](#rept) Repeat a scoped block of code a number of times.
* [**INCDIR**](#incdir) Add a directory to look for binary and text include files in.
* [**65816**](#65816) A16/A8/I16/I8 Directives to control the immediate mode size
* [**MERLIN**](#merlin) A variety of directives and label rules to support Merlin assembler sources
**CPU**
Set the CPU to assemble for. This can be updated throughout the source file as needed. **PROCESSOR** is also accepted as an alias.
```
CPU 65816
```
**ORG**
```
org $2000
(or pc $2000)
```
Start a section with a fixed address. Note that source files with fixed address sections can be exported to object files and will be placed at their location in the final binary output when loaded with **INCOBJ**.
**SECTION**
```
section Code
Start:
lda #Data
sta $ff
rts
section BSS
Data:
byte 1,2,3,4
```
Starts a relative code section. Relative sections require a name and sections that share the same name will be linked sequentially. The labels will be evaluated at link time.
Sections can be aligned by adding a comma separated argument:
```
section Data,$100
```
Sections can be named and assigned a fixed address by immediately following with an ORG directive:
```
section Code
org $4000
```
If there is any code or data between the SECTION and ORG directives the ORG directive will begin a new section.
The primary purpose of relative sections (sections that are not assembled at a fixed address) is to generate object files (.x65) that can be referenced from a linking source file by using **INCOBJ** and assigned an address at that point using the **LINK** directive. Object files can mix and match relative and fixed address sections and only the relative sections need to be linked using the **LINK** directive.
Sections can be named anything and still be assigned a section type:
```
section Gameplay, Code ; code section named Gameplay, unaligned
...
section GameBinary, Data, $100 ; data section named GameBinary, aligned
...
section Work, Zeropage ; Zeropage or Direct page section
...
section FixedZP, Zeropage
org $a0 ; Make zero page section as a fixed address
```
Section types include:
* Code: binary code
* Data: binary data
* BSS: uninitialized memory, for fixed address projects the
* Zeropage: uninitialized memory restricted to the range $00 - $ff
Additional section directive styles include:
```
SEG segname
SEG.U segname
SEGMENT "segname": segtype
.SEGMENT "segname"
```
For creating relocatable files (OMF) certain sections can not be fixed address.
**XDEF**
Used in files assembled to object files to share a label globally. All labels that are not xdef'd are still processed but protected so that other objects can use the same label name without colliding. **XDEF** must be specified before the label is defined, such as at the top of the file.
Non-xdef'd labels are kept private to the object file for the purpose of late evaluations that may refer to them, and those labels should also show up in .sym and vice files.
```
XDEF InitBobs
InitBobs:
rts
```
**XREF**
In order to reference a label that was globally declared in another object file using XDEF the label must be declared by using XREF.
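A minimal sketch of how the two directives pair up across files. The file names are hypothetical, and it assumes **XREF** takes a label name in the same form as **XDEF**:

```
; bobs.s (hypothetical), assembled to an object file with -obj
XDEF InitBobs        ; declared before the label is defined
section Code
InitBobs:
rts

; main.s (hypothetical), a separate source file
XREF InitBobs        ; assumed form: same argument style as XDEF
section Code
Start:
jsr InitBobs         ; address is resolved when the objects are linked
rts
```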
**INCOBJ**
Include an object file for linking into this file. Object files are generated by the *-obj* command line option followed by a filename ("file.x65"). Any linked segments will be linked, and multiple linked files can be generated by using the [**EXPORT**](#export) directive.
**LINK**
Link a set of relative sections (sharing the same name) at this address.
The following lines will place all sections named Code sequentially at location $1000, followed by all sections named BSS:
```
org $1000
link Code
link BSS
```
Object files are currently supported (use the -obj argument to generate one); the recommended file extension for object files is .x65. In order to access symbols from object file code use **XDEF** prior to declaring a label within the object.
To inspect the contents of x65 object files there is a 'dump_x65' tool included in this archive.
Note that the assembler will link all segments in a reasonable order (first code segments from current file, then code from other files, then data, then BSS segments), so the **LINK** directive is intended to give more control but is not necessary for the linking process. **INCOBJ** is still necessary for bringing in external objects, since otherwise the linker won't know how to find the segments to link.
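Putting these pieces together, a linking source might look like the sketch below. The file names are hypothetical. It also assumes **INCOBJ** accepts a quoted filename like **INCLUDE** and **INCBIN**, and that the objects are included before the sections are linked:

```
; link.s (hypothetical linking source)
incobj "engine.x65"  ; object files built earlier with -obj
incobj "effects.x65"
org $1000
link Code            ; all sections named Code, placed sequentially from $1000
link BSS             ; followed by all sections named BSS
```

Leaving out the two `link` lines should still produce a linked result, using the default ordering described above.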
**LOAD**
```
load $2000
```
For c64 .prg files this prefixes the binary file with this address.
**EXPORT**
Allows saving multiple binary files (prg, a2b, bin, etc.) from a single source file build:
```
section gamecode_level1
export _level1
```
will export the section "gamecode_level1" to (output_file)_level1.prg while other sections would be grouped together into (output_file).prg. This allows a single linking source to combine multiple loads overlapping the same memory area ending up in separate files.
**ALIGN**
```
align $100
```
Add bytes of 0 up to the next address divisible by the alignment. If the section is a fixed address (using an ORG directive) align will be applied at the location it was specified, but if the section is relative (using the SECTION directive) the alignment will apply to the start of the section.
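For the fixed address case, the padding can be worked out by hand. The addresses and the label `Table` below are assumptions chosen for illustration:

```
org $2010
byte 1,2,3        ; occupies $2010-$2012
align $100        ; fills $2013-$20ff with zeros
Table:            ; Table = $2100, the next address divisible by $100
byte 0
```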
**MACRO**
See the '[Macro](#macro)' section below
**EVAL**
Example:
```
eval Current PC: *
```
Might yield the following in stdout:
```
Eval (15): Current PC : "*" = $2010
```
When eval is encountered on a line, x65 prints `Eval (<line number>): <text> : "<expression>" = <value>` to stdout. This can be useful for seeing the size of things or for debugging expressions.
**BYTES**
Adds the comma separated values on the current line to the assembled output, for example
```
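RandomBytes:
bytes NumRandomBytes          ; first byte holds the count, resolved once the label is assigned
{
bytes 13,1,7,19,32
NumRandomBytes = * - !        ; current address minus the start of the scope = 5
}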
```
**byte** or **dc.b** are also recognized.
**WORDS**
Adds comma separated 16 bit values similar to how **BYTES** works. **word** or **dc.w** are also recognized.
**LONGS**
Adds comma separated 32 bit values similar to how **WORDS** works.
**TEXT**
Copies the string in quotes on the same line. The plan is to do a petscii conversion step. Use the modifier 'petscii' or 'petscii_shifted' to convert alphabetic characters to range.
Example:
```
text petscii_shifted "This might work"
```
**INCLUDE**
Include another source file. This should also work with .sym files to import labels from another build. The plan is for x65 to export .sym files as well.
Example:
```
include "wizfx.s"
```
**INCBIN**
Include binary data from a file. This inserts the binary data at the current address.
Example:
```
incbin "wizfx.gfx"
```
**IMPORT**
Insert multiple types of data or code at the current address. Import takes an additional parameter to determine what to do with the file data, and can accept reading in a portion of binary data.
The options for import are:
* source: same as **INCLUDE**
* binary: same as **INCBIN**
* c64: same as **INCBIN** but skip first two bytes of file as if this was a c64 prg file
* text: include text data from another file; the default is petscii, otherwise add a modifier as used by the **TEXT** directive
* object: same as **INCOBJ**
* symbols: same as **INCSYM**, specify list of desired symbols prior to filename.
After the filename for binary and c64 files follow comma separated values for skip data size and max load size. c64 mode will add the two extra bytes to the skip size.
```
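import source "EQ.S"                                 ; same as include "EQ.S"
import binary "GFX.BIN",0,256                        ; skip 0 bytes, load at most 256 bytes
import c64 "FONT.BIN",8,8*26                         ; skip 2+8 bytes, load at most 8*26 = 208 bytes
import text petscii_shifted "LICENSE.TXT"            ; text converted with the petscii_shifted modifier
import object "engine.x65"                           ; same as incobj
import symbols InitEffect, UpdateEffect "effect.sym" ; only these two symbols are imported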
```
**CONST**
Prefix a label assignment with 'const' or '.const' to cause an error if the label gets reassigned.
```
const zpData = $fe
```
**LABEL**
Decorative directive to assign an expression to a label. Label assignments are followed by '=' and an expression.
These two assignments do the same thing (with different values):
```
label zpDest = $fc
zpDest = $fa
```
**INCSYM**
Include a symbol file with an optional set of wanted symbols.
Open a symbol file and extract a set of symbols, or all symbols if no set was specified. Local labels will be discarded if possible.
```
incsym Part1_Init, Part1_Update, Part1_Exit "part1.sym"
```
**POOL**
Add a label pool for temporary address labels. This is similar to how stack frame variables are assigned in C.
A label pool is a mini stack of addresses that can be assigned as temporary labels with a scope ('{' and '}'). This can be handy for large functions trying to minimize use of zero page addresses, the function can declare a range (or set of ranges) of available zero page addresses and labels can be assigned within a scope and be deleted on scope closure. The format of a label pool is: "pool start-end, start-end" and labels can then be allocated from that range by '