From 9f8ad61fe205fe282807663bbdacf5639bbd8479 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Carl-Henrik=20Sk=C3=A5rstedt?= Date: Thu, 26 Nov 2015 13:10:58 -0800 Subject: [PATCH] Adding string symbols - String Symbols can be evaluated as expressions or assembled as code - String Symbols can be generated by macros - Cleaning up first page - Adding more to the x65.txt documentation --- README.md | 961 +++--------------------------------------------------- x65.cpp | 355 ++++++++++++++++++-- x65.txt | 647 ++++++++++++++++++++++++++---------- 3 files changed, 845 insertions(+), 1118 deletions(-) diff --git a/README.md b/README.md index 0c29cf2..de8b03d 100644 --- a/README.md +++ b/README.md @@ -2,27 +2,35 @@ 6502 Macro Assembler in a single c++ file using the struse single file text parsing library. Supports most syntaxes. x65 was recently named Asm6502 but was renamed because Asm6502 is too generic, x65 has no particular meaning. -Every assembler seems to add or change its own quirks to the 6502 syntax. This implementation aims to support all of them at once as long as there is no contradiction. +In order to minimize the documentation and make this page shorter I've moved the [old documentation](../../wiki/Previous-first-page) here. -To keep up with this trend x65 is adding the following features to the mix: +The [up to date documentation is here](x65.txt). -* Full expression evaluation everywhere values are used: [Expressions](#expressions) -* Basic relative sections and linking. -* Apple II GS executable output -* C style scoping within '{' and '}': [Scopes](#scopes) -* Reassignment of labels. This means there is no error if you declare the same label twice, but on the other hand you can do things like label = label + 2. -* [Local labels](#labels) can be defined in a number of ways, such as leading period (.label) or leading at-sign (@label) or terminating dollar sign (label$). -* [Directives](#directives) support both with and without leading period. 
-* Labels don't need to end with colon, but they can. -* No indentation required for instructions, meaning that labels can't be mnemonics, macros or directives. +x65 can assemble 6502, 65C02 and 65816 source and build executables for c64, Apple II or just raw binary. + +Noteworthy features: + +* Code with sections, object files and linking or single file fixed + address, or mix it up with fixed address sections in object files. +* Assembler listing with cycle counting for code review. +* Export multiple binaries with a single link operation. +* C style scoping within '{' and '}' with local and pool labels + respecting scopes. * Conditional assembly with if/ifdef/else etc. -* As far as achievable, support the syntax of other 6502 assemblers (Merlin syntax now requires command line argument, -endm adds support for sources using macro/endmacro and repeat/endrepeat combos rather than scoeps). - -In summary, if you are familiar with any 6502 assembler syntax you should feel at home with x65. If you're familiar with C programming expressions you should be familiar with '{', '}' scoping and complex expressions. - -There are no hard limits on binary size so if the address exceeds $ffff it will just wrap around to $0000. I'm not sure about the best way to handle that or if it really is a problem. - -There is a sublime package for coding/building in Sublime Text 3 in the *sublime* subfolder. +* Assembler directives representing a variety of features. +* Local labels can be defined in a number of ways, such as leading + period (.label) or leading at-sign (@label) or terminating + dollar sign (label$). +* String Symbols system allows building user expressions and macros + during assembly. +* Reassignment of symbols and labels by default. +* No indentation required for instructions, meaning that labels can't + be mnemonics, macros or directives. 
+* Supporting the syntax of other 6502 assemblers (Merlin syntax + requires command line argument, -endm adds support for sources + using macro/endmacro and repeat/endrepeat combos rather + than scopes). +* Apple II GS executable output. ## Features @@ -30,6 +38,7 @@ There is a sublime package for coding/building in Sublime Text 3 in the *sublime * **Linking** * **Comments** * **Labels** +* **String Symbols** * **Directives** * **Macros** * **Expressions** @@ -39,6 +48,11 @@ There is a sublime package for coding/building in Sublime Text 3 in the *sublime x65.cpp requires struse.h which is a single file text parsing library that can be retrieved from https://github.com/Sakrac/struse. +## Additional x65 Tools + +* **dump_x65**: Inspect the contents of .x65 object files generated by x65 to track down linking issues +* **x65dsasm**: Disassemble assembled binary code for review + ### References * [6502 opcodes](http://www.6502.org/tutorials/6502opcodes.html) @@ -47,889 +61,11 @@ x65.cpp requires struse.h which is a single file text parsing library that can b * [6502 illegal opcodes](http://www.oxyron.de/html/opcodes02.html) * [65816 opcodes](http://wiki.superfamicom.org/snes/show/65816+Reference#fn:14) -## Command Line Options +### Download Binaries -The command line options specifies the source file, the destination file and what type of file to generate, such as c64 or apple II dos 3.3 or binary or an x65 specific object file. You can also generate a disassembly listing with inline source code or dump the available set of opcodes as a source file. The command line can also set labels for conditional assembly to allow for distinguishing debug builds from shippable builds.
+[Windows x64 binaries](../..//raw/master/bin/x65_x64.zip) +[Windows x86 binaries](../..//raw/master/bin/x65_win32.zip) -Typical command line ([*] = optional): - -``` -x65 [-DLabel] [-iIncDir] (source.s) (dest.prg) [-lst[=file.lst]] [-opcodes[=file.s]] - [-sym dest.sym] [-vice dest.vs] [-obj]/[-c64]/[-bin]/[-a2b] [-merlin] [-endm] -``` - -**Usage** -x65 filename.s code.prg [options] -* -i(path) : Add include path -* -D(label)[=value] : Define a label with an optional value (otherwise defined as 1) -* -cpu=6502/65c02/65c02wdc/65816: assemble with opcodes for a different cpu -* -acc=8/16: set the accumulator mode for 65816 at start, default is 8 bits -* -xy=8/16: set the index register mode for 65816 at start, default is 8 bits -* -org = $2000 or - org = 4096: set the default start address of fixed address code -* -obj : generate object file for later linking instead of executable binary (.x65) -* -bin : Raw binary (no load address or size included before code) -* -c64 : Include load address (default, default org is $1000) -* -a2b : Apple II Dos 3.3 Binary (changes default org to $803, adds load addr+size) -* -a2p : Apple II ProDos Binary (changed default org to $2000, sets to binary) -* -a2o : Apple II GS OS executable (writes relocatable executable binary) -* -mrg : Force merge all sections (use with -a2o) -* -sym (file.sym) : symbol file -* -lst / -lst = (file.lst) : generate disassembly text from result(file or stdout) -* -opcodes / -opcodes = (file.s) : dump all available opcodes(file or stdout) -* -sect: display sections loaded and built -* -vice (file.vs) : export a vice symbol file -* -merlin: use Merlin syntax -* -endm : macros end with endm or endmacro instead of scoped ('{' - '}') - -### Code - -Code is any valid mnemonic/opcode and addressing mode. At the moment only one opcode per line is assembled. 
- -### Linking - -In order to manage more complex projects linking multiple assembled object files is desirable and x65 builds object files that can be included in a final linking step. -Simply build code with or without a fixed address and the -obj filename.x65 command line argument, then use INCOBJ filename.x65 in a final linking source. The linking source can be assigned a fixed address for most targets or exported as a relocatable executable for Apple II GS. - -### Relocatable executable - -For Apple II GS OS executable. This output requires 65816 instructions to handle the larger memory and the entry point for code needs to be implemented correctly. Using the -mrg option merges all sections together so that 16 bit addressing is safe, otherwise different code or data segments could be loaded in different banks and 3 byte referencing is required. An important note is that I have not been significantly exposed to Apple II GS or 65816 so this feature is only guaranteed as far as being able to ensure the correctness without actually building a running piece of code. - -### Comments - -Comments are currently line based and both ';' and '//' are accepted as delimiters. - -### Expressions - -Anywhere a number can be entered it can also be interpreted as a full expression, for example: - -``` -Get123: - bytes Get1-*, Get2-*, Get3-* -Get1: - lda #1 - rts -Get2: - lda #2 - rts -Get3: - lda #3 - rts -``` - -Would yield 3 bytes where the address of a label can be calculated by taking the address of the byte plus the value of the byte. - -### Labels - -Labels come in two flavors: **Addresses** (PC based) or **Values** (Evaluated from an expression). An address label is simply placed somewhere in code and a value label is followed by '**=**' and an expression. All labels are rewritable so it is fine to do things like NumInstance = NumInstance+1. 
Value assignments can be prefixed with '.const' or '.label' but is not required to be prefixed by anything, the CONST keyword should cause an error if the label is modified in the same source file. - -*Local labels* exist inbetween *global labels* and gets discarded whenever a new global label is added. The syntax for local labels are one of: prefix with period, at-sign, exclamation mark or suffix with $, as in: **.local** or **!local** or **@local** or **local$**. Both value labels and address labels can be local labels. - -``` -Function: ; global label - ldx #32 -.local_label ; local label - dex - bpl .local_label - rts - -Next_Function: ; next global label, the local label above is now erased. - rts -``` - -### Directives - -Directives are assembler commands that control the code generation but that does not generate code by itself. Some assemblers prefix directives with a period (.org instead of org) so a leading period is accepted but not required for directives. - -* [**CPU**](#cpu) Set the CPU to assemble for. -* [**ORG**](#org) (same as **PC**): Set the current compiling address. -* [**LOAD**](#load) Set the load address for binary formats that support it. -* [**SECTION**](#section) Start a relative section -* [**LINK**](#link) Link a relative section at this address -* [**XDEF**](#xdef) Make a label available globally -* [**XREF**](#xref) Reference a label declared globally in a different object file (.x65) -* [**INCOBJ**](#incobj) Include an object file (.x65) to this file -* [**EXPORT**](#export) Save out additional binary files with argument appended to filename -* [**ALIGN**](#align) Align the address to a multiple by filling with 0s -* [**MACRO**](#macro) Declare a macro -* [**EVAL**](#eval) Log an expression during assembly. 
-* [**BYTES**](#bytes) Insert comma separated bytes at this address (same as **BYTE** or **DC.B**) -* [**WORDS**](#words) Insert comma separated 16 bit values at this address (same as **WORD** or **DC.W**) -* [**LONG**](#long) Insert comma separated 32 bit values at this address -* [**TEXT**](#text) Insert text at this address -* [**INCLUDE**](#include) Include another source file and assemble at this address -* [**INCBIN**](#incbin) Include a binary file at this address -* [**IMPORT**](#import) Catch-all file inclusion (source, bin, text, object, symbols) -* [**CONST**](#const) Assign a value to a label and make it constant (error if reassigned with other value) -* [**LABEL**](#label) Decorative directive to assign an expression to a label -* [**INCSYM**](#incsym) Include a symbol file with an optional set of wanted symbols. -* [**POOL**](#pool) Add a label pool for temporary address labels -* [**IF / ELSE / IFDEF / ELIF / ENDIF**](#conditional) Conditional assembly -* [**STRUCT**](#struct) Hierarchical data structures (dot separated sub structures) -* [**REPT**](#rept) Repeat a scoped block of code a number of times. -* [**INCDIR**](#incdir) Add a directory to look for binary and text include files in. -* [**65816**](#65816) A16/A8/I16/I8 Directives to control the immediate mode size -* [**MERLIN**](#merlin) A variety of directives and label rules to support Merlin assembler sources - -**CPU** - -Set the CPU to assemble for. This can be updated throughout the source file as needed. **PROCESSOR** is also accepted as an alias. - -``` - CPU 65816 -``` - -**ORG** - -``` - -org $2000 -(or pc $2000) - -``` -Start a section with a fixed addresss. Note that source files with fixed address sections can be exported to object files and will be placed at their location in the final binary output when loaded with **INCOBJ**. - -**SECTION** - -``` - section Code -Start: - lda #Data - sta $ff - rts - - section BSS -Data: - byte 1,2,3,4 -``` - -Starts a relative code section. 
Relative sections require a name and sections that share the same name will be linked sequentially. The labels will be evaluated at link time. - -Sections can be aligned by adding a comma separated argument: - -``` - section Data,$100 -``` - -Sections can be names and assigned a fixed address by immediately following with an ORG directive - -``` - section Code - org $4000 -``` - -If there is any code or data between the SECTION and ORG directives the ORG directive will begin a new section. - -The primary purpose of relative sections (sections that are not assembled at a fixed address) is to generate object files (.x65) that can be referenced from a linking source file by using **INCOBJ** and assigned an address at that point using the **LINK** directive. Object files can mix and match relative and fixed address sections and only the relative sections need to be linked using the **LINK** directive. - -Sections can be named anything and still be assigned a section type: - -``` - section Gameplay, Code ; code section named Gameplay, unaligned - ... - section GameBinary, Data, $100 ; data section named GameBinary, aligned - ... - section Work, Zeropage ; Zeropage or Direct page section - ... - section FixedZP, Zeropage - org $a0 ; Make zero page section as a fixed address -``` - -Section types include: - -* Code: binary code -* Data: binary data -* BSS: uninitialized memory, for fixed address projects the -* Zeropage: uninitialized memory restricted to the range $00 - $ff - -Additional section directive styles include: - -``` - SEG segname - SEG.U segname - SEGMENT "segname": segtype - .SEGMENT "segname" -``` - -For creating relocatable files (OMF) certain sections can not be fixed address. - -Special sections for Apple II GS executables: - -Sections named DirectPage_Stack and of a BSS type (default) determine the size of the direct page + stack for the executable. If multiple sections match this rule the size will be the sum of all the sections with this name. 
- -Zeropage sections will be linked to a fixed address (default at the highest direct page addresses) prior to exporting the relocatable code. Zeropage sections in x65 is intended to allocate ranges of the zero page / direct page which is a bit confusing with OMF that has the concept of the direct page + stack segment. - -**XDEF** - -Used in files assembled to object files to share a label globally. All labels that are not xdef'd are still processed but protected so that other objects can use the same label name without colliding. **XDEF ** must be specified before the label is defined, such as at the top of the file. - -Non-xdef'd labels are kept private to the object file for the purpose of late evaluations that may refer to them, and those labels should also show up in .sym and vice files. - -``` - XDEF InitBobs - -InitBobs: - rts -``` - -**XREF** - -In order to reference a label that was globally declared in another object file using XDEF the label must be declared by using XREF. - -**INCOBJ** - -Include an object file for linking into this file. Object files are generated by the *-obj* command line option followed by a filename ("file.x65"). Any linked segments will be linked, and multiple linked files can be generated by using the [**EXPORT**](#export) directive. - -**LINK** - -Link a set of relative sections (sharing the same name) at this address - -The following lines will place all sections named Code sequentially at location $1000, followed by all sections named BSS: - -``` - org $1000 - link Code - link BSS -``` - -There is currently object file support (use -obj argument to generate), the recommended file extension for object files is .x65. In order to access symbols from object file code use **XDEF** prior to declaring a label within the object. - -To inspect the contents of x65 objects files there is a 'dump_x65' tool included in this archive. 
- -Note that the assembler will link all segments in a reasonable order (first code segments from current file, then code from other files, then data, then BSS segments), so using the **LINK** directive is intended to give more control but is not necessary for the linking process. **INCOBJ** is necessary for bringing in external objects though otherwise the linker won't know how to find the segments to link. - -**LOAD** - -``` -load $2000 -``` - -For c64 .prg files this prefixes the binary file with this address. - -**EXPORT** - -Allows saving multiple binary files (prg, a2b, bin, etc.) from a single source file build - -``` - section gamecode_level1 - export _level1 -``` - -will export the section "gamecode_level1" to (output_file)_level1.prg while other sections would be grouped together into (output_file).prg. This allows a single linking source to combine multiple loads overlapping the same memory area ending up in separate files. - -**ALIGN** - -``` -align $100 -``` - -Add bytes of 0 up to the next address divisible by the alignment. If the section is a fixed address (using an ORG directive) align will be applied at the location it was specified, but if the section is relative (using the SECTION directive) the alignment will apply to the start of the section. - -**MACRO** - -See the '[Macro](#macro)' section below - -**EVAL** - -Example: -``` -eval Current PC: * -``` -Might yield the following in stdout: -``` -Eval (15): Current PC : "*" = $2010 -``` - -When eval is encountered on a line print out "EVAL (\) \: \ = \" to stdout. This can be useful to see the size of things or debugging expressions. - -**BYTES** - -Adds the comma separated values on the current line to the assembled output, for example - -``` -RandomBytes: - bytes NumRandomBytes - { - bytes 13,1,7,19,32 - NumRandomBytes = * - ! - } -``` - -**byte** or **dc.b** are also recognized. - -**WORDS** - -Adds comma separated 16 bit values similar to how **BYTES** work. 
**word** or **dc.w** are also recognized. - -**LONGS** - -Adds comma separated 32 bit values similar to how **WORDS** work. - -**TEXT** - -Copies the string in quotes on the same line. The plan is to do a petscii conversion step. Use the modifier 'petscii' or 'petscii_shifted' to convert alphabetic characters to range. - -Example: - -``` -text petscii_shifted "This might work" -``` - -**INCLUDE** - -Include another source file. This should also work with .sym files to import labels from another build. The plan is for x65 to export .sym files as well. - -Example: - -``` -include "wizfx.s" -``` - - -**INCBIN** - -Include binary data from a file, this inserts the binary data at the current address. - -Example: - -``` -incbin "wizfx.gfx" -``` - -**IMPORT** - -Insert multiple types of data or code at the current address. Import takes an additional parameter to determine what to do with the file data, and can accept reading in a portion of binary data. - -The options for import are: -* source: same as **INCLUDE** -* binary: same as **INCBIN** -* c64: same as **INCBIN** but skip first two bytes of file as if this was a c64 prg file -* text: include text data from another file, default is petscii otherwise add another directive from the **TEXT** directive -* object: same as **INCOBJ** -* symbols: same as **INCSYM**, specify list of desired symbols prior to filename. - -After the filename for binary and c64 files follow comma separated values for skip data size and max load size. c64 mode will add the two extra bytes to the skip size. - -``` - import source "EQ.S" - import binary "GFX.BIN",0,256 - import c64 "FONT.BIN",8,8*26 - import text petscii_shifted "LICENSE.TXT" - import object "engine.x65" - import symbols InitEffect, UpdateEffect "effect.sym" -``` - -**CONST** - -Prefix a label assignment with 'const' or '.const' to cause an error if the label gets reassigned. 
- -``` -const zpData = $fe -``` - -**LABEL** - -Decorative directive to assign an expression to a label, label assignments are followed by '=' and an expression. - -These two assignments do the same thing (with different values): -``` -label zpDest = $fc -zpDest = $fa -``` - -**INCSYM** - -Include a symbol file with an optional set of wanted symbols. - -Open a symbol file and extract a set of symbols, or all symbols if no set was specified. Local labels will be discarded if possible. - -``` -incsym Part1_Init, Part1_Update, Part1_Exit "part1.sym" -``` - -**POOL** - -Add a label pool for temporary address labels. This is similar to how stack frame variables are assigned in C. - -A label pool is a mini stack of addresses that can be assigned as temporary labels with a scope ('{' and '}'). This can be handy for large functions trying to minimize use of zero page addresses, the function can declare a range (or set of ranges) of available zero page addresses and labels can be assigned within a scope and be deleted on scope closure. The format of a label pool is: "pool start-end, start-end" and labels can then be allocated from that range by ' **STRUCT** - -Hierarchical data structures (dot separated sub structures) - -Structs helps define complex data types, there are two basic types to define struct members, and as long as a struct is declared it can be used as a member type of another struct. - -The result of a struct is that each member is an offset from the start of the data block in memory. Each substruct is referenced by separating the struct names with dots. 
- -Example: - -``` -struct MyStruct { - byte count - word pointer -} - -struct TwoThings { - MyStruct thing_one - MyStruct thing_two -} - -struct Mixed { - word banana - TwoThings things -} - -Eval Mixed.things -Eval Mixed.things.thing_two -Eval Mixed.things.thing_two.pointer -Eval Mixed.things.thing_one.count -``` - -results in the output: - -``` -EVAL(16): "Mixed.things" = $2 -EVAL(27): "Mixed.things.thing_two" = $5 -EVAL(28): "Mixed.things.thing_two.pointer" = $6 -EVAL(29): "Mixed.things.thing_one.count" = $2 -``` - -**REPT** - -Repeat a scoped block of code a number of times. The syntax is rept \ { \ }. The full word **REPEAT** is also recognized. - -Example: - -``` -columns = 40 -rows = 25 -rept columns { - screen_addr = $400 + rept ; rept is the repeat counter - ldx $1000+rept - dest = screen_addr - remainder = 3 - rept (rows+remainder)/4 { - stx dest - dest = dest + 4*40 - } - rept 3 { - inx - remainder = remainder-1 - screen_addr = screen_addr + 40 - dest = screen_addr - rept (rows+remainder)/4 { - stx dest - dest = dest + 4*40 - } - } -} -``` - -Note that if the -endm command line option is used (macros are not defined with curly brackets but inbetween macro and endm*) this also affects rept so the syntax for a repeat block changes to - -``` -.REPEAT 4 - lsr -.ENDREPEAT -``` - -The symbol 'REPT' is the current repeat count within a REPT (0 outside of repeat blocks). - - -**INCDIR** - -Adds a folder to search for INCLUDE, INCBIN, etc. files in - -###**65816** - -* **A16** Set immediate mode for accumulator to be 16 bits -* **A8** Set immediate mode for accumulator to be 8 bits -* **I16** Set immediate mode for index registers to be 16 bits. **XY16** is also accepted. -* **I8** Set immediate mode for index registers to be 8 bits. **XY8** is also accepted. - -The accumulator and index register mode will be reset to 8 bits if the CPU is switched to something other than 65816. 
- -An alternative method is to add .b/.w to immediate mode instructions that support 16 bit modes such as: - -``` - ora.b #$21 - ora.w #$2322 -``` -This alleviates some confusion about which mode a certain line might be assembled for when looking at source code. - -Note that in case a 4 digit hex value is used in 8 bit mode and an immediate mode is allowed but is not currently enable a two byte value will be emitted - -``` - lda #$0043 ; will be 16 bit regardless of accumulator mode if in 65816 mode - lda #$43 ; will be 8 bit or 16 bit depending on accumulator mode -``` - -Similarly for instructions that accept 3 byte addresses (bank + address) adding .l instructs the assembler to choose a bank address: - -``` - and.l $222120 - and.l $222120,x - jsr.l $203212 -``` -Although if six hexadecimal digits are specified the bank + address instruction will be assembled without decoration. - -Alternatively Merlin simply adds an 'l' to the instruction: - -``` - andl $222120 - andl $222120,x - jsrl $203212 -``` - -x65 Labels are not restricted to 16 bits, the bank byte can be extracted from a label with the '^' operator: - -``` - lda #^label -``` - - -###**MERLIN** - -A variety of directives and label rules to support Merlin assembler sources. Merlin syntax is supported in x65 since there is historic relevance and readily available publicly release source. - -* [Pinball Construction Set source](https://github.com/billbudge/PCS_AppleII) (Bill Budge) -* [Prince of Persia source](https://github.com/jmechner/Prince-of-Persia-Apple-II) (Jordan Mechner) - -To enable Merlin 8.16 syntax use the '-merlin' command line argument. Where it causes no harm, Merlin directives are supported for non-merlin mode. - -*LABELS* - -]label means mutable address label, also does not seem to invalidate local labels. - -:label is perfectly valid, currently treating as a local variable - -labels can include '?' - -Merlin labels are not allowed to include '.' 
as period means logical or in merlin, which also means that enums and structs are not supported when assembling with merlin syntax. - -*Expressions* - -Merlin may not process expressions (probably left to right, parenthesis not allowed) the same as x65 but given that it wouldn't be intuitive to read the code that way, there are probably very few cases where this would be an issue. - -**XC** - -Change processor. The first instance of XC will switch from 6502 to 65C02, the second switches from 65C02 to 65816. To return to 6502 use **XC OFF**. To go directly to 65816 **XC XC** is supported. - -**MX** - -MX sets the immediate mode accumulator instruction size, it takes a number and uses the lowest two bits. Bit 0 applies to index registers (x, y) where 0 means 16 bits and 1 means 8 bits, bit 1 applies to the accumulator. Normally it is specified in binary using the '%' prefix. - -``` - MX %11 -``` - -**LUP** - -LUP is Merlingo for loop. The lines following the LUP directive to the keyword --^ are repeated the number of times that follows LUP. - -**MAC** - -MAC is short for Macro. Merlin macros are defined on line inbetween MAC and <<< or EOM. Macro arguments are listed on the same line as MAC and the macro identifier is the label preceeding the MAC directive on the same line. - -**EJECT** - -An old assembler directive that does not affect the assembler but if printed would insert a page break at that point. - -**DS** - -Define section, followed by a number of bytes. If number is positive insert this amount of 0 bytes, if negative, reduce the current PC. - -**DUM**, **DEND** - -Dummy section, this will not write any opcodes or data to the binary output but all code and data will increment the PC addres up to the point of DEND. - -**PUT** - -A variation of **INCLUDE** that applies an oddball set of filename rules. These rules apply to **INCLUDE** as well just in case they make sense. 
- -**USR** - -In Merlin USR calls a function at a fixed address in memory, x65 safely avoids this. If there is a requirement for a user defined macro you've got the source code to do it in. - -**SAV** - -SAV causes Merlin to save the result it has generated so far, which is somewhat similar to the [EXPORT](#export) directive. If the SAV name is different than the source name the section will have a different EXPORT name appended and exported to a separate binary file. - -**DSK** - -DSK is similar to SAV - -**ENT** - -ENT defines the label that preceeds it as external, same as [**XDEF**](#xdef). - -**EXT** - -EXT imports an external label, same as [**XREF**](#xref). - -**LNK** / **STR** - -LNK links the contents of an object file, to fit with the named section method of linking in x65 this keyword has been reworked to have a similar result, the actual linking doesn't begin until the current section is complete. - -**CYC** - -CYC starts and stops a cycle counter, x65 scoping allows for hierarchical cycle listings but the first merlin directive CYC starts the counter and the next CYC stops the counter and shows the result. This is 6502 only until data is entered for other CPUs. - -**ADR** - -Define byte triplets (like **DA** but three bytes instead of 2) - -**ADRL** - -Define values of four bytes. - -## List File - -This is a typical list file. Columns from left to right are: - -* Address -* Bytes (up to 4) generated by this line -* Instruction (disassembled) -* Cycles for this instruction, + indicates this instruction can be an extra cycle due to condition or for 65816 multiple extra cycles. -* Source code that generated this line - -For scope lines ('{' - '}') the sum of the cycles within the scope is added up as are the additional cycles. 
- -``` - c>1 Sin { -$0000 a2 03 ldx #$03 2 ldx #3 - c>2 { -$0002 b5 e8 lda $e8,x 4 lda SinP.Ang,x -$0004 95 ec sta $ec,x 4 sta SinP.R,x ; result starts with x -$0006 95 e4 sta $e4,x 4 sta SinP.W0,x -$0008 95 f4 sta $f4,x 4 sta Mul824.A,x -$000a 95 f0 sta $f0,x 4 sta Mul824.B,x -$000c ca dex 2 dex -$000d 10 f3 bpl $0002 2+ bpl ! - c<2 = 24 + 1 } - ; x^2, copy to W1 -$000f a9 e0 lda #$e0 2 lda #SinP.W1 -$0011 20 00 00 jsr $0000 6 jsr Multiply824S_Copy - ; iterate value -$0014 a0 00 ldy #$00 2 ldy #0 - .SinIterate - c>2 { - ; W0 *= W1 -$0016 a2 03 ldx #$03 2 ldx #3 - c>3 { -$0018 b5 e4 lda $e4,x 4 lda SinP.W0,x ; x^(1+2n) -$001a 95 f4 sta $f4,x 4 sta Mul824.A,x -$001c b5 e0 lda $e0,x 4 lda SinP.W1,x ; x^2 -$001e 95 f0 sta $f0,x 4 sta Mul824.B,x -$0020 ca dex 2 dex -$0021 10 f5 bpl $0018 2+ bpl ! - c<3 = 20 + 1 } -$0023 a9 e4 lda #$e4 2 lda #SinP.W0 ; Copy to W0 -$0025 20 00 00 jsr $0000 6 jsr Multiply824S_Copy -$0028 a2 e4 ldx #$e4 2 ldx #SinP.W0 ; Copy W0 to A -$002a a9 f4 lda #$f4 2 lda #Mul824.A -$002c 20 00 00 jsr $0000 6 jsr Cpy824Z -$002f a2 00 ldx #$00 2 ldx #0 - c>3 { -$0031 b9 00 00 lda $0000,y 4+ lda SinInvPermute,y -$0034 95 f0 sta $f0,x 4 sta Mul824.B,x -$0036 c8 iny 2 iny -$0037 e8 inx 2 inx -$0038 e0 04 cpx #$04 2 cpx #4 -$003a d0 f5 bne $0031 2+ bne ! - c<3 = 16 + 2 } -``` - -## Expression syntax - -Expressions contain values, such as labels or raw numbers and operators including +, -, \*, /, & (and), | (or), ^ (eor), << (shift left), >> (shift right) similar to how expressions work in C. Parenthesis are supported for managing order of operations where C style precedence needs to be overridden. In addition there are some special characters supported: - -* \*: Current address (PC). 
This conflicts with the use of \* as multiply so multiply will be interpreted only after a value or right parenthesis -* <: If less than is not followed by another '<' in an expression this evaluates to the low byte of a value (and $ff) -* >: If greater than is not followed by another '>' in an expression this evaluates to the high byte of a value (>>8) -* ^: Inbetween two values '^' is an eor operation, as a prefix to values it extracts the bank byte (v>>24). -* !: Start of scope (use like an address label in expression) -* %: First address after scope (use like an address label in expression) -* $: Precedes hexadecimal value -* %: If immediately followed by '0' or '1' this is a binary value and not scope closure address - -**Conditional operators** - -* ==: Double equal signs yields 1 if left value is the same as the right value -* <: If inbetween two values, less than will yield 1 if left value is less than right value -* >: If inbetween two values, greater than will yield 1 if left value is greater than right value -* <=: If inbetween two values, less than or equal will yield 1 if left value is less than or equal to right value -* >=: If inbetween two values, greater than or equal will yield 1 if left value is greater than or equal to right value - -Example: - -``` -lda #(((>SCREEN_MATRIX)&$3c)*4)+8 -sta $d018 -``` - -Avoid using parenthesis as the first character of the parameter of an opcode that can be relative addressed instead of an absolute address. 
This can be avoided by - -``` - jmp (a+b) ; generates a relative jump - jmp.a (a+b) ; generates an absolute jump - jmp +(a+b) ; generates an absolute jump - -c = (a+b) - jmp c ; generates an absolute jump - jmp a+b ; generates an absolute jump -``` - -## Macros - -A macro can be defined by the using the directive macro and includes the line within the following scope: - -Example: -``` -macro ShiftLeftA(Source) { - rol Source - rol A -} -``` - -The macro will be instantiated anytime the macro name is encountered: -``` -lda #0 -ShiftLeftA($a0) -``` - -The parameter field is optional for both the macro declaration and instantiation, if there is a parameter in the declaration but not in the instantiation the parameter will be removed from the macro. If there are no parameters in the declaration the parenthesis can be omitted and will be slightly more efficient to assemble, as in: - -``` -.macro GetBit { - asl - bne % - jsr GetByte -} -``` - -Alternative syntax for macros: - -To support the syntax of other assemblers macro parameters can also be defined through space separated arguments: - -``` - macro loop_end op lbl { - op - bne lbl - } - - ldx #4 - { - sta buf,x - loop_end dex ! - } -``` - -Other assemblers use a directive to end macros rather than a scope (inbetween { and }). This is supported by adding '-endm' to the command line: - -``` -macro ShiftLeftA source - rol source - asl -endm -``` - -As long as the macro end directive starts with endm it will be accepted, so endmacro will work as well. - -Currently macros with parameters use search and replace without checking if the parameter is a whole word, the plan is to fix this. - -## Scopes - -Scopes are lines inbetween '{' and '}' including macros. The purpose of scopes is to reduce the need for local labels and the scopes nest just like C code to support function level and loops and inner loop scoping. '!' is a label that is the first address of the scope and '%' the first address after the scope. 
- -Additionally scopes have a meaning for counting cycles when exporting a .lst file, each open scope '{' will add a new counter of CPU cycles that will accumulate until the corresponding '}' which will be shown on that line in the listing file. Use -lst as a command line option to generate a listing file. - -This means you can write -``` -{ - lda #0 - ldx #8 - { - sta Label,x - dex - bpl ! - } -} -``` -to construct a loop without adding a label. - -##Examples - -Using scoping to avoid local labels - -``` -; set zpTextPtr to a memory location with text -; return: y is the offset to the first space. -; (y==0 means either first is space or not found.) -FindFirstSpace - ldy #0 - { - lda (zpTextPtr),y - cmp #$20 - beq % ; found, exit - iny - bne ! ; not found, keep searching - } - rts -``` ### Acknowledgments This project would not be completed without the direct or indirect support of great people, some which I can currently remember: @@ -951,38 +87,17 @@ Primarily tested with personal archive of sources written for Kick assmebler, DA * irp (indefinite repeat) **FIXED** +* Added string symbols * Resolved the DirectPage_Stack section vs. Zeropage section for Apple II GS/OS executables. * OMF export for Apple II GS/OS executables -* More DASM directives supported (ERR, DV, DS.B, DS.W, DS.L) -* Removed the concept of linking by merging sections and instead keeping the sections separate and individually assigned memory addresses so they won't overlap. -* Fixed up Merlin LNK directive to work with new linker -* Fixed linker merged section reloc confusion. -* -org command line argument to override the built-in assumption of org $1000, to avoid ever having to use the ORG directive inlined in code. -* dump_x65 now shows the code offset of each section into the .x65 file which can be copied and pasted into the disassembler in case the object file assembler output needs to be inspected. 
-* A linker export summary is shown when building binary fixed address, this shows how the linker re-arranged the sections in memory. The section addresses are also included in the .lst file even if the section didn't generate any listing information, such as included object files. -* BSS sections are handled similar to CODE and DATA sections but will not write out BSS bytes at end of binary data. This should complete the section handling necessary to build a relocatable executable. -* Replaced the fixed address linker so it doesn't merge sections but just assigns addresses. This is more similar to how a relocatable code loader would handle it. I may need to merge sections for OMF to reduce the number of code sections. -* dc.t (3 bytes) dc.l (4 bytes) for data declaration -* Linking of zero page / direct page sections -* Added section types, should cover most intuitive formats (seg.type; segment name; segment "name": type; etc. etc.) -* Changed the data for relocs to better match Apple II GS OMF format which also changes the object file format. 
-* Added a disassembler (disassembler/x65dsasm.cpp) -* % evaluates to the current end of scope instead of whatever scope ends first -* rept crash fix if not resolved until assembly completed -* rept and symbol reference with forward reference label was not taking section into account -* Link append sections target confusion cleared up (caused crash/link errors/freeze) -* XREF prevented linking with same name symbol included from .x65 object causing a linker failure -* << was mistakenly interpreted as shift right -* REPT is also a value that can be used in expressions as a repeat counter -* LONG allows for adding 4 byte values like BYTE/WORD -* Cycle listing for 65816 - required more complex visualization due to more than 1 cycle extra [(older fixes)](../../wiki/fixes) Revisions: +* 10 - String Symbols * 9 - Apple II GS OS executable * 8 - Fish food / Linking tested and passed with external project (Apple II gs Rastan) -* 7 - 65816 support +* 7 - 65816 Support * 6 - 65C02 support * 5 - Merlin syntax * 4 - Object files, relative sections and linking diff --git a/x65.cpp b/x65.cpp index ba2b52f..0e148c0 100644 --- a/x65.cpp +++ b/x65.cpp @@ -234,9 +234,11 @@ enum AssemblerDirective { AD_TEXT, // TEXT: Add text to output AD_INCLUDE, // INCLUDE: Load and assemble another file at this address AD_INCBIN, // INCBIN: Load and directly insert another file at this address - AD_CONST, // CONST: Prevent a label from mutating during assemble AD_IMPORT, // IMPORT: Include or Incbin or Incobj or Incsym + AD_CONST, // CONST: Prevent a label from mutating during assemble AD_LABEL, // LABEL: Create a mutable label (optional) + AD_STRING, // STRING: Declare a string symbol + AD_UNDEF, // UNDEF: remove a string or a label AD_INCSYM, // INCSYM: Reference labels from another assemble AD_LABPOOL, // POOL: Create a pool of addresses to assign as labels dynamically AD_IF, // #IF: Conditional assembly follows based on expression @@ -304,7 +306,8 @@ enum EvalOperator { EVOP_STP, // u, 
Unexpected input, should stop and evaluate what we have EVOP_NRY, // v, Not ready yet EVOP_XRF, // w, value from XREF label - EVOP_ERR, // x, Error + EVOP_EXP, // x, sub expression + EVOP_ERR, // y, Error }; // Opcode encoding @@ -940,6 +943,8 @@ DirectiveName aDirectiveNames[] { { "IMPORT", AD_IMPORT }, { "CONST", AD_CONST }, { "LABEL", AD_LABEL }, + { "STRING", AD_STRING }, + { "UNDEF", AD_UNDEF }, { "INCSYM", AD_INCSYM }, { "LABPOOL", AD_LABPOOL }, { "POOL", AD_LABPOOL }, @@ -1274,6 +1279,24 @@ public: bool reference; // this label is accessed from external and can't be used for evaluation locally } Label; + +// String data +typedef struct { +public: + strref string_name; // name of the string + strref string_const; // string contents if source reference + strovl string_value; // string contents if modified, initialized to null string + + StatusCode Append(strref append); + StatusCode ParseLine(strref line); + + strref get() { return string_value.valid() ? string_value.get_strref() : string_const; } + void clear() { if (string_value.cap()) { free(string_value.charstr()); + string_value.invalidate(); string_value.clear(); } + string_const.clear(); + } +} StringSymbol; + // If an expression can't be evaluated immediately, this is required // to reconstruct the result when it can be. 
typedef struct { @@ -1408,6 +1431,7 @@ public: class Asm { public: pairArray labels; + pairArray strings; pairArray macros; pairArray labelPools; pairArray labelStructs; @@ -1528,7 +1552,7 @@ public: EvalOperator RPNToken_Merlin(strref &expression, const struct EvalContext &etx, EvalOperator prev_op, short &section, int &value); EvalOperator RPNToken(strref &expression, const struct EvalContext &etx, - EvalOperator prev_op, short &section, int &value); + EvalOperator prev_op, short &section, int &value, strref &subexp); StatusCode EvalExpression(strref expression, const struct EvalContext &etx, int &result); void SetEvalCtxDefaults(struct EvalContext &etx); int ReptCnt() const; @@ -1541,7 +1565,13 @@ public: StatusCode AssignLabel(strref label, strref line, bool make_constant = false); StatusCode AddressLabel(strref label); void LabelAdded(Label *pLabel, bool local = false); - void IncludeSymbols(strref line); + StatusCode IncludeSymbols(strref line); + + // Strings + StringSymbol *GetString(strref string_name); + StringSymbol *AddString(strref string_name, strref string_value); + StatusCode StringAction(StringSymbol *pStr, strref line); + StatusCode ParseStringOp(StringSymbol *pStr, strref line); // Manage locals void MarkLabelLocal(strref label, bool scope_label = false); @@ -1564,6 +1594,8 @@ public: StatusCode ApplyDirective(AssemblerDirective dir, strref line, strref source_file); StatusCode Directive_Rept(strref line, strref source_file); StatusCode Directive_Macro(strref line, strref source_file); + StatusCode Directive_String(strref line); + StatusCode Directive_Undef(strref line); StatusCode Directive_Include(strref line); StatusCode Directive_Incbin(strref line, int skip=0, int len=0); StatusCode Directive_Import(strref line); @@ -1630,6 +1662,12 @@ void Asm::Cleanup() { labels.clear(); macros.clear(); allSections.clear(); + for (unsigned int i = 0; i < strings.count(); ++i) { + StringSymbol &str = strings.getValue(i); + if (str.string_value.cap()) + 
free(str.string_value.charstr()); + } + strings.clear(); for (std::vector::iterator exti = externals.begin(); exti !=externals.end(); ++exti) exti->labels.clear(); externals.clear(); @@ -3111,7 +3149,7 @@ EvalOperator Asm::RPNToken_Merlin(strref &expression, const struct EvalContext & } // Get a single token from most non-apple II assemblers -EvalOperator Asm::RPNToken(strref &exp, const struct EvalContext &etx, EvalOperator prev_op, short &section, int &value) +EvalOperator Asm::RPNToken(strref &exp, const struct EvalContext &etx, EvalOperator prev_op, short &section, int &value, strref &subexp) { char c = exp.get_first(); switch (c) { @@ -3165,6 +3203,7 @@ EvalOperator Asm::RPNToken(strref &exp, const struct EvalContext &etx, EvalOpera if (ret != STATUS_NOT_STRUCT) return EVOP_ERR; // partial struct } if (!pLabel && label.same_str("rept")) { value = etx.rept_cnt; return EVOP_VAL; } + if (!pLabel) { if (StringSymbol *pStr = GetString(label)) subexp = pStr->get(); return EVOP_EXP; } if (!pLabel || !pLabel->evaluated) return EVOP_NRY; // this label could not be found (yet) value = pLabel->value; section = pLabel->section; return pLabel->reference ? EVOP_XRF : EVOP_VAL; } @@ -3200,11 +3239,16 @@ static int mul_as_shift(int scalar) return scalar == 1 ? 
shift : 0; } +#define MAX_EXPR_STACK 2 + StatusCode Asm::EvalExpression(strref expression, const struct EvalContext &etx, int &result) { int numValues = 0; int numOps = 0; + strref expression_stack[MAX_EXPR_STACK]; + int exp_sp = 0; + char ops[MAX_EVAL_OPER]; // RPN expression int values[MAX_EVAL_VALUES]; // RPN values (in order of RPN EVOP_VAL operations) short section_ids[MAX_EVAL_SECTIONS]; // local index of each referenced section @@ -3217,19 +3261,29 @@ StatusCode Asm::EvalExpression(strref expression, const struct EvalContext &etx, char op_stack[MAX_EVAL_OPER]; EvalOperator prev_op = EVOP_NONE; expression.trim_whitespace(); - while (expression) { + while (expression || exp_sp) { int value = 0; short section = -1, index_section = -1; EvalOperator op = EVOP_NONE; - if (syntax == SYNTAX_MERLIN) + strref subexp; + if (!expression && exp_sp) { + expression = expression_stack[--exp_sp]; + op = EVOP_RPR; + } else if (syntax == SYNTAX_MERLIN) op = RPNToken_Merlin(expression, etx, prev_op, section, value); else - op = RPNToken(expression, etx, prev_op, section, value); + op = RPNToken(expression, etx, prev_op, section, value, subexp); if (op == EVOP_ERR) return ERROR_UNEXPECTED_CHARACTER_IN_EXPRESSION; else if (op == EVOP_NRY) return STATUS_NOT_READY; - else if (op == EVOP_XRF) { + else if (op == EVOP_EXP) { + if (exp_sp >= MAX_EXPR_STACK) + return ERROR_TOO_MANY_VALUES_IN_EXPRESSION; + expression_stack[exp_sp++] = expression; + expression = subexp; + op = EVOP_LPR; + } else if (op == EVOP_XRF) { xrefd = true; op = EVOP_VAL; } @@ -4016,7 +4070,7 @@ StatusCode Asm::AddressLabel(strref label) } // include symbols listed from a .sym file or all if no listing -void Asm::IncludeSymbols(strref line) +StatusCode Asm::IncludeSymbols(strref line) { strref symlist = line.before('"').get_trimmed_ws(); line = line.between('"', '"'); @@ -4044,10 +4098,143 @@ void Asm::IncludeSymbols(strref line) } } loadedData.push_back(buffer); - } + } else + return ERROR_COULD_NOT_INCLUDE_FILE; 
+ return STATUS_OK; } +// Get a string record if it exists +StringSymbol *Asm::GetString(strref string_name) +{ + unsigned int string_hash = string_name.fnv1a(); + unsigned int index = FindLabelIndex(string_hash, strings.getKeys(), strings.count()); + while (index < strings.count() && string_hash == strings.getKey(index)) { + if (string_name.same_str(strings.getValue(index).string_name)) + return strings.getValues() + index; + index++; + } + return nullptr; +} +// Add or modify a string record +StringSymbol *Asm::AddString(strref string_name, strref string_value) +{ + StringSymbol *pStr = GetString(string_name); + if (pStr==nullptr) { + unsigned int string_hash = string_name.fnv1a(); + unsigned int index = FindLabelIndex(string_hash, strings.getKeys(), strings.count()); + strings.insert(index, string_hash); + pStr = strings.getValues() + index; + pStr->string_name = string_name; + pStr->string_value.invalidate(); + pStr->string_value.clear(); + } + if (pStr->string_value.cap()) { + free(pStr->string_value.charstr()); + pStr->string_value.invalidate(); + pStr->string_value.clear(); + } + pStr->string_const = string_value; + return pStr; +} + +// append a string to another string +StatusCode StringSymbol::Append(strref append) +{ + if (!append) + return STATUS_OK; + + strl_t add_len = append.get_len(); + + if (!string_value.cap()) { + strl_t new_len = (add_len + 0xff)&(~(strl_t)0xff); + char *buf = (char*)malloc(new_len); + if (!buf) + return ERROR_OUT_OF_MEMORY; + string_value.set_overlay(buf, new_len); + string_value.copy(string_const); + } else if (string_value.cap() < (string_value.get_len() + add_len)) { + strl_t new_len = (string_value.get_len() + add_len + 0xff)&(~(strl_t)0xff); + char *buf = (char*)malloc(new_len); + if (!buf) + return ERROR_OUT_OF_MEMORY; + strovl ovl(buf, new_len); + ovl.copy(string_value.get_strref()); + free(string_value.charstr()); + string_value.set_overlay(buf, new_len); + } + string_const.clear(); + string_value.append(append); + 
return STATUS_OK; +} + +StatusCode Asm::ParseStringOp(StringSymbol *pStr, strref line) +{ + line.skip_whitespace(); + if (line[0] == '+') + ++line; + for (;;) { + line.skip_whitespace(); + if (line[0] == '"') { + strref substr = line.between('"', '"'); + line += substr.get_len() + 2; + pStr->Append(substr); + } else { + strref label = line.split_range(syntax == SYNTAX_MERLIN ? + label_end_char_range_merlin : label_end_char_range); + if (StringSymbol *pStr2 = GetString(label)) + pStr->Append(pStr2->get()); + else if (Label *pLabel = GetLabel(label)) { + if (!pLabel->evaluated) + return ERROR_TARGET_ADDRESS_MUST_EVALUATE_IMMEDIATELY; + strown<32> lblstr; + lblstr.sprintf("$%x", pLabel->value); + pStr->Append(lblstr.get_strref()); + } else + break; + } + line.skip_whitespace(); + if (!line || line[0] != '+') + break; + ++line; + line.skip_whitespace(); + } + return STATUS_OK; +} + +StatusCode Asm::StringAction(StringSymbol *pStr, strref line) +{ + line.skip_whitespace(); + if (line[0] == '+' && line[1] == '=') { // append strings + line += 2; + line.skip_whitespace(); + return ParseStringOp(pStr, line); + } else if (line[0] == '=') { + ++line; + line.skip_whitespace(); + pStr->clear(); + return ParseStringOp(pStr, line); + } else { + strref str = pStr->string_value.valid() ? 
+ pStr->string_value.get_strref() : pStr->string_const; + if (!str) + return STATUS_OK; + char *macro = (char*)malloc(str.get_len()); + strovl mac(macro, str.get_len()); + mac.copy(str); + mac.replace("\\n", "\n"); + loadedData.push_back(macro); + contextStack.push(contextStack.curr().source_name, mac.get_strref(), mac.get_strref()); + if (scope_depth >= (MAX_SCOPE_DEPTH - 1)) + return ERROR_TOO_DEEP_SCOPE; + else + scope_address[++scope_depth] = CurrSection().GetPC(); + contextStack.curr().scoped_context = true; + return STATUS_OK; + + } + return STATUS_OK; +} // // @@ -4251,6 +4438,62 @@ StatusCode Asm::Directive_Macro(strref line, strref source_file) return STATUS_OK; } +// string: create a symbolic string +StatusCode Asm::Directive_String(strref line) +{ + line.skip_whitespace(); + strref string_name = line.split_range_trim(word_char_range, line[0]=='.' ? 1 : 0); + if (line[0]=='=' || keyword_equ.is_prefix_word(line)) { + line.next_word_ws(); + strref substr = line; + if (line[0] == '"') { + substr = line.between('"', '"'); + line += substr.get_len() + 2; + StringSymbol *pStr = AddString(string_name, substr); + if (pStr == nullptr) + return ERROR_OUT_OF_MEMORY; + line.skip_whitespace(); + if (line[0] == '+') + return ParseStringOp(pStr, line); + } else { + StringSymbol *pStr = AddString(string_name, strref()); + return ParseStringOp(pStr, line); + } + } else { + if (!AddString(string_name, strref())) + return ERROR_OUT_OF_MEMORY; + } + return STATUS_OK; +} + +StatusCode Asm::Directive_Undef(strref line) +{ + strref name = line.split_range_trim(syntax == SYNTAX_MERLIN ? 
label_end_char_range_merlin : label_end_char_range); + unsigned int name_hash = name.fnv1a(); + unsigned int index = FindLabelIndex(name_hash, labels.getKeys(), labels.count()); + while (index < labels.count() && name_hash == labels.getKey(index)) { + if (name.same_str(labels.getValue(index).label_name)) { + labels.remove(index); + return STATUS_OK; + } + index++; + } + index = FindLabelIndex(name_hash, strings.getKeys(), strings.count()); + while (index < strings.count() && name_hash == strings.getKey(index)) { + if (name.same_str(strings.getValue(index).string_name)) { + StringSymbol str = strings.getValue(index); + if (str.string_value.cap()) { + free(str.string_value.charstr()); + str.string_value.invalidate(); + } + strings.remove(index); + return STATUS_OK; + } + index++; + } + return STATUS_OK; +} + // include: read in a source file and assemble at this point StatusCode Asm::Directive_Include(strref line) { @@ -4355,10 +4598,16 @@ StatusCode Asm::Directive_Import(strref line) line += import_text.get_len(); line.skip_whitespace(); strref text_type = "petscii"; - if (line[0]!='"') { - text_type = line.get_word_ws(); - line += text_type.get_len(); - line.skip_whitespace(); + while (line[0]!='"') { + strref word = line.get_word_ws(); + if (word.same_str("petscii") || word.same_str("petscii_shifted")) { + text_type = line.get_word_ws(); + line += text_type.get_len(); + line.skip_whitespace(); + } else if (StringSymbol *pStr = GetString(line.get_word_ws())) { + line = pStr->get(); + break; + } } CurrSection().AddText(line, text_type); return STATUS_OK; @@ -4369,8 +4618,7 @@ StatusCode Asm::Directive_Import(strref line) } else if (import_symbols.is_prefix_word(line)) { line += import_symbols.get_len(); line.skip_whitespace(); - IncludeSymbols(line); - return STATUS_OK; + return IncludeSymbols(line); } return STATUS_OK; @@ -4570,20 +4818,44 @@ StatusCode Asm::Directive_EVAL(strref line) line.trim_whitespace(); struct EvalContext etx; SetEvalCtxDefaults(etx); + 
strref lab1 = line; + lab1 = lab1.split_token_any_trim(syntax == SYNTAX_MERLIN ? label_end_char_range_merlin : label_end_char_range); + StringSymbol *pStr = line.same_str_case(lab1) ? GetString(lab1) : nullptr; + if (line && EvalExpression(line, etx, value) == STATUS_OK) { if (description) { - printf("EVAL(%d): " STRREF_FMT ": \"" STRREF_FMT "\" = $%x\n", - contextStack.curr().source_file.count_lines(description) + 1, STRREF_ARG(description), STRREF_ARG(line), value); + if (pStr != nullptr) { + printf("EVAL(%d): " STRREF_FMT ": \"" STRREF_FMT "\" = \"" STRREF_FMT "\" = $%x\n", + contextStack.curr().source_file.count_lines(description) + 1, STRREF_ARG(description), STRREF_ARG(line), STRREF_ARG(pStr->get()), value); + } else { + printf("EVAL(%d): " STRREF_FMT ": \"" STRREF_FMT "\" = $%x\n", + contextStack.curr().source_file.count_lines(description) + 1, STRREF_ARG(description), STRREF_ARG(line), value); + } } else { - printf("EVAL(%d): \"" STRREF_FMT "\" = $%x\n", - contextStack.curr().source_file.count_lines(line) + 1, STRREF_ARG(line), value); + if (pStr != nullptr) { + printf("EVAL(%d): \"" STRREF_FMT "\" = \"" STRREF_FMT "\" = $%x\n", + contextStack.curr().source_file.count_lines(line) + 1, STRREF_ARG(line), STRREF_ARG(pStr->get()), value); + } else { + printf("EVAL(%d): \"" STRREF_FMT "\" = $%x\n", + contextStack.curr().source_file.count_lines(line) + 1, STRREF_ARG(line), value); + } } } else if (description) { - printf("EVAL(%d): \"" STRREF_FMT ": " STRREF_FMT"\"\n", - contextStack.curr().source_file.count_lines(description) + 1, STRREF_ARG(description), STRREF_ARG(line)); + if (pStr != nullptr) { + printf("EVAL(%d): " STRREF_FMT ": \"" STRREF_FMT "\" = \"" STRREF_FMT "\"\n", + contextStack.curr().source_file.count_lines(description) + 1, STRREF_ARG(description), STRREF_ARG(line), STRREF_ARG(pStr->get())); + } else { + printf("EVAL(%d): \"" STRREF_FMT ": " STRREF_FMT"\"\n", + contextStack.curr().source_file.count_lines(description) + 1, STRREF_ARG(description), 
STRREF_ARG(line)); + } } else { - printf("EVAL(%d): \"" STRREF_FMT "\"\n", - contextStack.curr().source_file.count_lines(line) + 1, STRREF_ARG(line)); + if (pStr != nullptr) { + printf("EVAL(%d): \"" STRREF_FMT "\" = \"" STRREF_FMT "\"\n", + contextStack.curr().source_file.count_lines(line) + 1, STRREF_ARG(line), STRREF_ARG(pStr->get())); + } else { + printf("EVAL(%d): \"" STRREF_FMT "\"\n", + contextStack.curr().source_file.count_lines(line) + 1, STRREF_ARG(line)); + } } return STATUS_OK; } @@ -4774,8 +5046,20 @@ StatusCode Asm::ApplyDirective(AssemblerDirective dir, strref line, strref sourc break; case AD_TEXT: { // text: add text within quotes - strref text_prefix = line.before('"').get_trimmed_ws(); - line = line.between('"', '"'); + strref text_prefix; + while (line[0] != '"') { + strref word = line.get_word_ws(); + if (word.same_str("petscii") || word.same_str("petscii_shifted")) { + text_prefix = line.get_word_ws(); + line += text_prefix.get_len(); + line.skip_whitespace(); + } else if (StringSymbol *pStr = GetString(line.get_word_ws())) { + line = pStr->get(); + break; + } + } + if (line[0] == '"') + line = line.between('"', '"'); CurrSection().AddText(line, text_prefix); break; } @@ -4803,10 +5087,15 @@ StatusCode Asm::ApplyDirective(AssemblerDirective dir, strref line, strref sourc error = ERROR_UNEXPECTED_LABEL_ASSIGMENT_FORMAT; break; } + + case AD_STRING: + return Directive_String(line); + case AD_UNDEF: + return Directive_Undef(line); + case AD_INCSYM: - IncludeSymbols(line); - break; + return IncludeSymbols(line); case AD_LABPOOL: { strref name = line.split_range_trim(word_char_range, line[0]=='.' ? 
1 : 0); @@ -4831,7 +5120,8 @@ StatusCode Asm::ApplyDirective(AssemblerDirective dir, strref line, strref sourc CheckConditionalDepth(); // Check if nesting bool conditional_result; error = EvalStatement(line, conditional_result); - if (GetLabel(line.get_trimmed_ws()) != nullptr) + strref name = line.get_trimmed_ws(); + if (GetLabel(name) != nullptr || GetString(name) != nullptr) ConsumeConditional(); else SetConditional(); @@ -5432,7 +5722,10 @@ StatusCode Asm::BuildLine(strref line) labPool++; } if (!gotConstruct) { - if (syntax==SYNTAX_MERLIN && strref::is_ws(line_start[0])) { + if (StringSymbol *pStr = GetString(label)) { + StringAction(pStr, line); + line.clear(); + } else if (syntax==SYNTAX_MERLIN && strref::is_ws(line_start[0])) { error = ERROR_UNDEFINED_CODE; } else if (label[0]=='$' || strref::is_number(label[0])) line.clear(); diff --git a/x65.txt b/x65.txt index 592a9bf..df065a6 100644 --- a/x65.txt +++ b/x65.txt @@ -20,22 +20,25 @@ result. Noteworthy features: -* Full expression evaluation everywhere values are used. -* Basic relative sections and linking in addition to fixed address. -* C style scoping within '{' and '}' +* Code with sections, object files and linking or single file fixed + address, or mix it up with fixed address sections in object files. +* Assembler listing with cycle counting for code review. +* Export multiple binaries with a single link operation. +* C style scoping within '{' and '}' with local and pool labels + respecting scopes. * Conditional assembly with if/ifdef/else etc. -* Directives support both with and without leading period. +* Assembler directives representing a variety of features. * Local labels can be defined in a number of ways, such as leading period (.label) or leading at-sign (@label) or terminating dollar sign (label$). -* Reassignment of symbols. This means there is no error if you declare - the same label twice, but on the other hand you can do things like - label = label + 2. 
+* String Symbols system allows building user expressions and macros + during assembly. +* Reassignment of symbols and labels by default. * No indentation required for instructions, meaning that labels can't be mnemonics, macros or directives. -* As far as achievable, support the syntax of other 6502 assemblers - (Merlin syntax now requires command line argument, -endm adds support - for sources using macro/endmacro and repeat/endrepeat combos rather +* Supporting the syntax of other 6502 assemblers (Merlin syntax + requires command line argument, -endm adds support for sources + using macro/endmacro and repeat/endrepeat combos rather than scoeps). * Apple II GS executable output. @@ -43,6 +46,37 @@ Noteworthy features: -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- +Contents +-------- + + +License +Command line arguments +CPU options +Syntax +Targets +Listing Output +Expressions + Math expression symbols supported + PC expression symbols supported + Conditional operators +Conditional assembly +65816 +Data +Macros +Strings +Structs and Enums +Symbols +Label Pool +Sections +Relocatable code and linking +Merlin +All Directives + + +-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- + + License ------- @@ -53,20 +87,23 @@ The MIT License (MIT) Copyright (c) 2015 Carl-Henrik Skårstedt -Permission is hereby granted, free of charge, to any person obtaining a copy of this software -and associated documentation files (the "Software"), to deal in the Software without restriction, -including without limitation the rights to use, copy, modify, merge, publish, distribute, -sublicense, and/or sell copies of the Software, and to permit persons to whom the Software -is furnished to do so, subject to the following conditions: +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the "Software"), +to deal in the Software without restriction, including 
without limitation +the rights to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following conditions: -The above copyright notice and this permission notice shall be included in all copies or -substantial portions of the Software. +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, -INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR -PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE -FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, -ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, +ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. Details, source and documentation at https://github.com/Sakrac/x65. @@ -82,6 +119,7 @@ Document Updates Nov 23 2015 - Initial pass of x65 documentation Nov 24 2015 - More text +Nov 26 2015 - String directive and more text -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- @@ -90,34 +128,37 @@ Command line arguments ---------------------- +Input, output and target options are set on the command line, many of +these options can be controlled with assembler directives in code as +well as the command line. 
+ x65 source target [options] -Where "options" include - - * -i(path) : Add include path - * -D(label)[=value] : Define a label with an optional value - (otherwise defined as 1) - * -cpu=6502/65c02/65c02wdc/65816: assemble with opcodes for a different cpu - * -acc=8/16: set the accumulator mode for 65816 at start, default is 8 bits - * -xy=8/16: set the index register mode for 65816 at start, default is 8 bits - * -org = $2000 or - org = 4096: set the default start address of - fixed address code - * -obj (file.x65) : generate object file for later linking - * -bin : Raw binary - * -c64 : Include load address (default) - * -a2b : Apple II Dos 3.3 Binary - * -a2p : Apple II ProDos Binary - * -a2o : Apple II GS OS executable (relocatable) - * -mrg : Force merge all sections (use with -a2o) - * -sym (file.sym) : symbol file - * -lst / -lst = (file.lst) : generate disassembly text from - result (file or stdout) - * -opcodes / -opcodes = (file.s) : dump all available opcodes(file or stdout) - * -sect: display sections loaded and built - * -vice (file.vs) : export a vice symbol file - * -merlin: use Merlin syntax - * -endm : macros end with endm or endmacro instead of scoped('{' - '}') +Options include: +* -i(path) : Add include path +* -D(label)[=value] : Define a label with an optional value + (otherwise defined as 1) +* -cpu=6502/65c02/65c02wdc/65816: assemble with opcodes for a different cpu +* -acc=8/16: set the accumulator mode for 65816 at start, default is 8 bits +* -xy=8/16: set the index register mode for 65816 at start, default is 8 bits +* -org = $2000 or - org = 4096: set the default start address of + fixed address code +* -obj (file.x65) : generate object file for later linking +* -bin : Raw binary +* -c64 : Include load address (default) +* -a2b : Apple II Dos 3.3 Binary +* -a2p : Apple II ProDos Binary +* -a2o : Apple II GS OS executable (relocatable) +* -mrg : Force merge all sections (use with -a2o) +* -sym (file.sym) : symbol file +* -lst / -lst = 
(file.lst) : generate disassembly text from + result (file or stdout) +* -opcodes / -opcodes = (file.s) : dump all available opcodes(file or stdout) +* -sect: display sections loaded and built +* -vice (file.vs) : export a vice symbol file +* -merlin: use Merlin syntax +* -endm : macros end with endm or endmacro instead of scoped('{' - '}') -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- @@ -152,11 +193,12 @@ Syntax ------ -The syntax of x65 source is the result of trying to build code from a -variety of assemblers, including a number of open source games and old -personal code. The primary syntax inspiration is from Kick Assembler, -but also DASM, TASM and XASM. Most of the downloaded sample code was -written for Apple II where Merlin, Orca and Lisa were referenced. +The syntax of x65 source is the result of trying to build code originally +created for a variety of assemblers, including a number of open source +games and old personal code. The primary syntax inspiration is from +Kick Assembler, but also DASM, TASM and XASM. Most of the downloaded +sample code was written for Apple II where Merlin, Orca and Lisa were +referenced. Note that Merlin syntax requires the -merlin command line option. @@ -218,26 +260,25 @@ generate a .x65 object file. More information about object files in Sections. 
Command line options for target output: - * -org = $2000: set the default start address of fixed address code, - default is $1000 - * -obj (file.x65): generate object file for later linking - * -bin : Raw binary - * -c64 : Include load address (default) - * -a2b : Apple II Dos 3.3 Binary (load address + file size) - * -a2p : Apple II ProDos Binary (set org to $2000 otherwise binary) - * -a2o : Apple II GS OS executable (relocatable) - * -mrg : Force merge all sections (use with -a2o) +* -org = $2000: set the default start address of fixed address code, + default is $1000 +* -obj (file.x65): generate object file for later linking +* -bin : Raw binary +* -c64 : Include load address (default) +* -a2b : Apple II Dos 3.3 Binary (load address + file size) +* -a2p : Apple II ProDos Binary (set org to $2000 otherwise binary) +* -a2o : Apple II GS OS executable (relocatable) +* -mrg : Force merge all sections (use with -a2o) The -mrg option will combine all segments into one to allow for 16 bit addressing to reach data in other segments, but will limit the size to fit into a 64 k bank. - -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- -List Output +Listing Output ----------- @@ -309,6 +350,8 @@ the order of operations is based on C like precedence. Internally the expression is converted to reverse polish notation to make it easier to keep track of complex expressions. +Values in expressions can be labels, symbols, strings (added as an +expression within parenthesis) or raw decimal, binary or hexadecimal numbers. Math expression symbols supported: @@ -420,6 +463,20 @@ non-assembling block of source. 
* IFDEF - conditionals, start a block of conditional assembly if a symbol or label exists at this point +Example: + +if 0 +this part of the source will not assemble, +however a line can not start with a conditional +assembler directive such as if, ifdef, else, elseif +or endif within a block that does not assemble +unless followed by a valid expression +else + ; this part of the source will assemble + lda #0 + rts +endif + -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- @@ -428,9 +485,11 @@ non-assembling block of source. ----- -65816 is large expansion of 6502 and requires the assembler to be aware of +65816 is a major expansion of 6502 and requires the assembler to be aware of what processor flags the user has set to select instructions. +Use -cpu=65816 on the command line or CPU 65816 in source to select it. + * A16 - 65816, set accumulator immediate operators to 16 bit mode * A8 - 65816, set accumulator immediate operators to 8 bit mode * I16 - 65816, set index register immediate operators to 16 bit mode, @@ -442,6 +501,8 @@ what processor flags the user has set to select instructions. * XY8 - 65816, set index register immediate operators to 8 bit mode, same as I8 + + -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- @@ -470,6 +531,17 @@ declares a repeating value.
* WORD - data, insert comma separated 16 bit values, same as WORDS * WORDS - data, insert comma seperated 16 bit values, same as WORD + +Example: + +ONE_824 = 1<<24 ; 1 as an 8.24 number +CosInvPermute: ; 1 + + long -(ONE_824 + 1)/(2) ; x^2 * this + long (ONE_824 + 3*4)/(2*3*4) ; x^4 * this + long -(ONE_824 + 3*4*5*6)/(2*3*4*5*6) ; x^6 * this + long -(ONE_824 + 3*4*5*6*7*8)/(2*3*4*5*6*7*8) ; x^8 * this + + -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- @@ -493,129 +565,101 @@ The parenthesis are optional both for the macro declaration and for the macro instantiation so macros can be used as if they were instructions MACRO neg address { - sec - lda #0 - sbc source - sta source + sec + lda #0 + sbc source + sta source } - MACRO nega { - eor #$ff - sec - adc #0 - } + MACRO nega { + eor #$ff + sec + adc #0 + } Now 'neg' and 'nega' can be used as if it was an instruction: - neg $7f80 ; negate byte at this hard coded address for some reason - lda #$6c - nega ; negate accumulator + neg $7f80 ; negate byte at this hard coded address for some reason + lda #$6c + nega ; negate accumulator In order to support code written for other assemblers the -endm command line option changes the syntax for macro declarations to start on the line after MACRO and end before the line starting with ENDM or ENDMACRO: - MACRO inca - sec - adc #0 - ENDMACRO + MACRO inca + sec + adc #0 + ENDMACRO Directives for macros: * MACRO - macros, start a macro declaration + -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- + +Strings +------- + + +Strings are special symbols that contain text and were included in an +effort to support ORCA macros. The difference between ORCA and other +assemblers is that the macros can build up string symbols (along with +value symbols) and combine results into a more powerful macro system. + +x65 now supports the same mechanism but not the exact same keywords.
+ +Strings can be created and passed in as a value symbol in expressions +or used directly as a macro (without parameters). + +Strings are defined using the STRING directive followed by the string +name and an equal sign followed by a string expression. + +Strings can include value symbols which will be evaluated and represented +by $ + the hexadecimal representation of the value. + +Example: + + STRING exp = "1 + 2 + 3" + EVAL exp + +result (output): + +EVAL(2): "exp" = "1 + 2 + 3" = $6 + +Example: + + STRING code_str = "lda #0\nsta $fe" + code_str + +result (code): + + lda #0 + sta $fe + +Example: + + STRING concat_example = "ldx #0" + concat_example += + +Directives for String Symbols + +* STRING - declare a string symbol + + +-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- + + Structs and Enums +----------------- + * ENUM - structs and enums, declare enumerations like C * STRUCT - structs and enums, declare a C-like structure of symbols separated by dots --0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- - -#Sections - -x65 supports linking of fully assembled object files into a single -larger project. This is a fairly standard feature of compilers but -supporting both common 68000 linking style and Apple II Merlin style -means that x65 is not quite as straightforward. - -The purpose of a linked project is to work in multiple source files -without worrying about where in memory each file gets compiled to. -In addition sections of code and data in a single file can be linked -to different target locations. Each source file gets assembled to an -object file (.x65) and all the internal and external references are -stored separately from the binary code to be fixed up later. - -The last step of a linked project is to load all object files and -generate one or more exported programs. 
A special source file uses -the INCOBJ directive to bring in object files one by one and piled up -by using the LINK [segment name] at a fixed address. - -The SECTION directive starts a block of code or data to be linked -later. By default x65 creates a section named "default" which can -be used for linking as is but is intended to be replaced. - -In order to export labels from a source file it should be declared -with XDEF prior to being defined: - - XDEF Function - - SECTION Code - -Function: - lda #1 - rts - -To reference an exported label from a different file use XREF - - XREF Function - - SECTION Code -Code: - jsr Function - rts - -To link object files (.x65) into an executable the assembled -objects need to be combined into a single source using INCOBJ - - INCOBJ "Code.x65" - INCOBJ "Routines.x65" - -The result will put the first included code section OR the first code -section declared in the link file. - -The link file can export multiple binary executable files by using -the EXPORT directive - - SECTION CodeOther, Code - EXPORT other - -Code in the CodeOther section will be built as (binary)_other.(ext) - -By linking multiple targets at once files can reference labels -between eachother. - - -* DUMMY - sections, start a dummy section (defines addresses but does not - generate data, same as Merlin DUM) -* DUMMY_END - sections, end a dummy section (same as Merlin DEND) -* EXPORT - sections, this section will link or save to a separate binary file - with the argument appended to the link or binary filename. 
-* IMPORT - data and sections, load a file and include it in the assembly based - on the argument -* INCOBJ - sections, load an object file (.x65) of previously assembled source -* LINK - sections, links a section to the current section -* SECTION - section, declare a section; Comma separated arguments are name, - type, align where type is Code, Data, BSS or Zeropage -* SEG - section, same as SECTION -* SEGMENT - section, same as SECTION -* XDEF - sections, declare a label as external which can be referenced in - other source files by using XREF -* XREF - sections, reference a label that has been declared as global in - another file by using XDEF - -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- @@ -716,11 +760,137 @@ The following extensions are recognized: * [pool name] var.l (4 bytes) +-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- + + +Sections +-------- + + +x65 supports linking of fully assembled object files into a single +larger project. This is a fairly standard feature of compilers but +supporting both common 68000 linking style and Apple II Merlin style +means that x65 is not quite as straightforward. + +The purpose of a linked project is to work in multiple source files +without worrying about where in memory each file gets compiled to. +In addition sections of code and data in a single file can be linked +to different target locations. Each source file gets assembled to an +object file (.x65) and all the internal and external references are +stored separately from the binary code to be fixed up later. + +The last step of a linked project is to load all object files and +generate one or more exported programs. A special source file uses +the INCOBJ directive to bring in object files one by one and piled up +by using the LINK [segment name] at a fixed address. + +The SECTION directive starts a block of code or data to be linked +later. 
By default x65 creates a section named "default" which can +be used for linking as is but is intended to be replaced. + +In order to export a label from a source file it should be declared +with XDEF prior to being defined: + + XDEF Function + + SECTION Code + +Function: + lda #1 + rts + +To reference an exported label from a different file use XREF + + XREF Function + + SECTION Code +Code: + jsr Function + rts + +To link object files (.x65) into an executable the assembled +objects need to be combined into a single source using INCOBJ + + INCOBJ "Code.x65" + INCOBJ "Routines.x65" + +The result will put the first included code section OR the first code +section declared in the link file. + +The link file can export multiple binary executable files by using +the EXPORT directive + + SECTION CodeOther, Code + EXPORT other + +Code in the CodeOther section will be built as (binary)_other.(ext) + +By linking multiple targets at once files can reference labels +between each other. + +Sections can be named anything and still be assigned a section type: + + section Gameplay, Code ; code section named Gameplay, unaligned + ... + section GameBinary, Data, $100 ; data section named GameBinary, aligned + ... + section Work, Zeropage ; Zeropage or Direct page section + ... + section FixedZP, Zeropage + org $a0 ; Make the zero page section a fixed address + +Section types include: + +* Code: binary code +* Data: binary data +* BSS: uninitialized memory (for certain targets filled with zeroes) +* Zeropage: uninitialized memory restricted to the range $00 - $ff + +Additional section directive styles include: + + SEG segname + SEG.U segname + SEGMENT "segname": segtype + .SEGMENT "segname" + +For creating relocatable files (OMF) certain sections cannot have a fixed address. + +Special sections for Apple II GS executables: + +Sections named DirectPage_Stack and of a BSS type (default) determine the size of the direct page + stack for the executable.
If multiple sections match this rule the size will be the sum of all the sections with this name. + +Zeropage sections will be linked to a fixed address (default at the highest direct page addresses) prior to exporting the relocatable code. Zeropage sections in x65 are intended to allocate ranges of the zero page / direct page, which is a bit confusing with OMF, which has the concept of the direct page + stack segment. + + +Directives related to sections: + + +* DUMMY - sections, start a dummy section (defines addresses but does not + generate data, same as Merlin DUM) +* DUMMY_END - sections, end a dummy section (same as Merlin DEND) +* EXPORT - sections, this section will link or save to a separate binary file + with the argument appended to the link or binary filename. +* IMPORT - data and sections, load a file and include it in the assembly based + on the argument +* INCOBJ - sections, load an object file (.x65) of previously assembled source +* LINK - sections, links a section to the current section +* SECTION - section, declare a section; Comma separated arguments are name, + type, align where type is Code, Data, BSS or Zeropage +* SEG - section, same as SECTION +* SEGMENT - section, same as SECTION +* XDEF - sections, declare a label as external which can be referenced in + other source files by using XREF +* XREF - sections, reference a label that has been declared as global in + another file by using XDEF + + -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- Relocatable code and linking +---------------------------- + A lot of 6502 code has been built with fixed address assemblers. While supporting fixed address assembling, x65 is built around generating relocatable @@ -734,6 +904,155 @@ Apple II GS uses a relocatable binary format that can be exported, other targets link to a fixed address during the linking stage.
+-0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0- + + +Merlin +------ + +x65 can compile most Merlin syntax code with the -merlin command line +option. + +A variety of directives and label rules are included to support Merlin +assembler sources. Merlin syntax is supported in x65 since there is +historic relevance and readily available publicly released source. + +Merlin Label Syntax + +]label means mutable address label, also does not seem to invalidate + local labels. + +:label is perfectly valid, currently treated as a local variable + +labels can include '?' + +Merlin labels are not allowed to include '.' as period means logical +or in Merlin, which also means that enums and structs are not +supported when assembling with Merlin syntax. + + +Merlin expressions + +Merlin may not process expressions (probably left to right, parentheses +not allowed) the same way as x65 but given that it wouldn't be intuitive +to read the code that way, there are probably very few cases where this +would be an issue. + + +Merlin additional directives + +XC + +Change processor. The first instance of XC will switch from 6502 to +65C02, the second switches from 65C02 to 65816. To return to 6502 use +XC OFF. To go directly to 65816 XC XC is supported. + + +MX + +MX sets the immediate mode instruction size for the accumulator and +index registers. It takes a number and uses the lowest two bits. Bit 0 +applies to index registers (x, y) where 0 means 16 bits and 1 means 8 +bits, bit 1 applies to the accumulator. Normally it is specified in +binary using the '%' prefix. + + MX %11 + + +LUP + +LUP is Merlingo for loop. The lines following the LUP directive to +the keyword --^ are repeated the number of times that follows LUP. + + +MAC + +MAC is short for Macro. Merlin macros are defined on the lines between +MAC and <<< or EOM. Macro arguments are listed on the same line as +MAC and the macro identifier is the label preceding the MAC directive +on the same line.
+ + +EJECT + +An old assembler directive that does not affect the assembler but if +printed would insert a page break at that point. + + +DS + +Define storage, followed by a number of bytes. If the number is positive +insert this amount of 0 bytes, if negative, reduce the current PC. + + +DUM, DEND + +Dummy section, this will not write any opcodes or data to the binary +output but all code and data will increment the PC address up to the +point of DEND. + + +PUT + +A variation of INCLUDE that applies an oddball set of filename +rules. These rules apply to INCLUDE as well just in case they +make sense. + + +USR + +In Merlin USR calls a function at a fixed address in memory, x65 +safely avoids this. If there is a requirement for a user defined +macro you've got the source code to do it in. + + +SAV + +SAV causes Merlin to save the result it has generated so far, +which is somewhat similar to the [EXPORT](#export) directive. +If the SAV name is different than the source name the section +will have a different EXPORT name appended and exported to a +separate binary file. + + +DSK + +DSK is similar to SAV. + + +ENT + +ENT defines the label that precedes it as external, same as XDEF. + +EXT + +EXT imports an external label, same as XREF. + + +LNK, STR + +LNK links the contents of an object file. To fit with the named section +method of linking in x65 this keyword has been reworked to have a +similar result; the actual linking doesn't begin until the current +section is complete. + + +CYC + +CYC starts and stops a cycle counter, x65 scoping allows for hierarchical +cycle listings but the first Merlin directive CYC starts the counter and +the next CYC stops the counter and shows the result. This is 6502 only +until data is entered for other CPUs. + + +ADR + +Define byte triplets (like DA but three bytes instead of two). + + +ADRL + +Define values of four bytes. + -0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0--0-