2018-09-28 10:05:11 -07:00
|
|
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
|
|
|
<html xmlns="http://www.w3.org/1999/xhtml">
|
|
|
|
|
|
|
|
<head>
|
|
|
|
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
|
|
|
|
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
|
|
|
<link href="main.css" rel="stylesheet" type="text/css" />
|
|
|
|
<title>Code Generation & Assembly - 6502bench SourceGen</title>
|
|
|
|
</head>
|
|
|
|
|
|
|
|
<body>
|
2018-10-09 10:04:10 -07:00
|
|
|
<div id="content">
|
2018-09-28 10:05:11 -07:00
|
|
|
<h1>6502bench SourceGen: Code Generation & Assembly</h1>
|
|
|
|
<p><a href="index.html">Back to index</a></p>
|
|
|
|
|
|
|
|
<p>SourceGen can generate an assembly source file that, when fed into
|
|
|
|
the target assembler, will recreate the original data file exactly.
|
2018-10-03 18:03:04 -07:00
|
|
|
Every assembler is different, so support must be added to SourceGen
|
|
|
|
for each.</p>
|
2018-09-28 10:05:11 -07:00
|
|
|
<p>The generation / assembly dialog can be opened with File > Assemble.</p>
|
|
|
|
|
|
|
|
|
|
|
|
<h2><a name="supported">Supported Assemblers</a></h2>
|
|
|
|
|
|
|
|
<p>SourceGen currently supports the following cross-assemblers:</p>
|
|
|
|
<ul>
|
2018-10-23 16:06:29 -07:00
|
|
|
<li><a href="https://sourceforge.net/projects/tass64/">64tass</a> v1.53.1515 or later</li>
|
2018-09-28 10:05:11 -07:00
|
|
|
<li><a href="https://cc65.github.io/">cc65</a> v2.17 or later</li>
|
|
|
|
<li><a href="https://www.brutaldeluxe.fr/products/crossdevtools/merlin/">Merlin 32</a> v1.0.0 or later</li>
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
|
|
|
|
<h3><a name="version">Version-Specific Code Generation</a></h3>
|
|
|
|
|
|
|
|
<p>Code generation must be tailored to the specific version of the
|
|
|
|
assembler. This is most easily understood with an example.</p>
|
|
|
|
<p>If you write <code>MVN $01,$02</code>, the assembler is expected to output
|
|
|
|
<code>54 02 01</code>, with the arguments reversed. cc65 v2.17 doesn't
|
|
|
|
do that; this is a bug that was fixed in a later version. So if you're
|
|
|
|
generating code for v2.17, you want to create source code with the
|
2018-10-03 18:03:04 -07:00
|
|
|
arguments the wrong way around.</p>
|
2018-09-28 10:05:11 -07:00
|
|
|
<p>Having version-dependent source code is a bad idea, so SourceGen
|
|
|
|
just outputs raw hex bytes for MVN/MVP instructions. This yields the
|
|
|
|
correct code for all versions of the assembler, but is ugly and
|
|
|
|
annoying. So we want to output actual MVN/MVP instructions when producing
|
|
|
|
code for newer versions of the assembler.</p>
|
2019-04-18 10:41:01 -07:00
|
|
|
<p>When you configure a cross-assembler, SourceGen runs the executable with
|
|
|
|
version query args, and extracts the version information from the output
|
|
|
|
stream. This is used by the generator to ensure that the output will compile.
|
2018-09-28 10:05:11 -07:00
|
|
|
If no assembler is configured, SourceGen will produce code optimized
|
|
|
|
for the latest version of the assembler.</p>
|
|
|
|
|
|
|
|
<h2><a name="generate">Generating Source Code</a></h2>
|
|
|
|
|
|
|
|
<p>Cross assemblers tend to generate additional files, either compiler
|
|
|
|
intermediaries ("file.o") or metadata ("_FileInformation.txt"). Some
|
|
|
|
generators may produce multiple source files, perhaps a link script or
|
|
|
|
symbol definition header to go with the assembly source. To avoid
|
|
|
|
spreading files across the filesystem, SourceGen does all of its work
|
2018-10-03 18:03:04 -07:00
|
|
|
in the same directory where the project lives. Before you can generate
|
2019-04-18 10:41:01 -07:00
|
|
|
code, you have to have assigned your project a directory. This is why
|
|
|
|
you can't assemble a project until you've saved it for the first time.</p>
|
2018-09-28 10:05:11 -07:00
|
|
|
|
|
|
|
<p>The Generate and Assemble dialog has a drop-down list near the top
|
|
|
|
that lets you pick which assembler to target. The name of the assembler
|
|
|
|
will be shown with the detected version number. If the assembler
|
|
|
|
executable isn't configured, "[latest version]" will be shown instead
|
|
|
|
of a version number.</p>
|
|
|
|
<p>The Settings button will take you directly to the assembler configuration
|
2018-10-09 10:04:10 -07:00
|
|
|
tab in the application settings dialog.</p>
|
2018-09-28 10:05:11 -07:00
|
|
|
<p>Hit the Generate button to generate the source code into a file on disk.
|
|
|
|
The file will use the project name, with the ".dis65" replaced by
|
|
|
|
"_<assembler>.S".</p>
|
|
|
|
<p>The first 64KiB of each generated file will be shown in the preview
|
|
|
|
window. If multiple files were generated, you can use the "preview file"
|
|
|
|
drop-down to select between them. Line numbers are
|
|
|
|
prepended to each line to make it easier to track down errors.</p>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<h3><a name="localizer">Label Localizer</a></h3>
|
|
|
|
<p>The label localizer is an optional feature that automatically converts
|
|
|
|
some labels to an assembler-specific less-than-global label format. Local
|
|
|
|
labels may be reusable (e.g. using "]LOOP" for multiple consecutive
|
|
|
|
loops is easier to understand than giving each one a unique label) or
|
|
|
|
reduce the size of a generated link table. There are usually restrictions
|
|
|
|
on local labels, e.g. references to them may not be allowed to cross a
|
|
|
|
global label definition, which the localizer factors in automatically.</p>
|
|
|
|
<p>The localizer is somewhat experimental at this time, and can be
|
|
|
|
disabled from the
|
|
|
|
<a href="settings.html#app-settings">application settings</a>.</p>
|
|
|
|
|
|
|
|
|
|
|
|
<h2><a name="assemble">Cross-Assembling Generated Code</a></h2>
|
|
|
|
|
|
|
|
<p>After generating sources, if you have a cross-assembler executable
|
|
|
|
configured, you can run it by clicking the "Run Assembler" button. The
|
|
|
|
command-line output will be displayed, with stdout and stderr separated.
|
|
|
|
(I'd prefer them to be interleaved, but that's not what the system
|
|
|
|
provides.)</p>
|
|
|
|
|
|
|
|
<p>The output will show the assembler's exit code, which will be zero
|
2018-10-03 18:03:04 -07:00
|
|
|
on success (note: sometimes they lie.) If it appeared to succeed,
|
|
|
|
SourceGen will then compare the assembler's output to the original file,
|
|
|
|
and report any differences.</p>
|
2018-09-28 10:05:11 -07:00
|
|
|
<p>Failures here may be due to bugs in the cross-assembler or in
|
|
|
|
SourceGen. However, SourceGen can generally work around assembler bugs,
|
2018-10-03 18:03:04 -07:00
|
|
|
so any failure is an opportunity for improvement.</p>
|
2018-09-28 10:05:11 -07:00
|
|
|
|
2018-10-23 16:06:29 -07:00
|
|
|
|
|
|
|
<h2><a name="quirks">Assembler-Specific Bugs & Quirks</a></h2>
|
|
|
|
|
|
|
|
<p>This is a list of bugs and quirky behavior in cross-assemblers that
|
|
|
|
SourceGen works around when generating code.</p>
|
|
|
|
<p>Every assembler seems to have a different way of dealing with expressions.
|
|
|
|
Most of them will let you group expressions with parenthesis, but that
|
|
|
|
doesn't always help. For example, <code>PEA label >> 8 + 1</code> is
|
|
|
|
perfectly valid, but writing <code>PEA (label >> 8) + 1</code> will cause
|
|
|
|
most assemblers to assume you're trying to use an alterate form of PEA
|
|
|
|
with indirect addressing (which doesn't exist). The code generator needs
|
|
|
|
to understand expression syntax and operator precedence to generate correct
|
|
|
|
code, but also needs to know how to handle the corner cases.</p>
|
|
|
|
|
|
|
|
|
|
|
|
<h3><a name="64tass">64tass</a></h3>
|
|
|
|
|
|
|
|
<p>Code is generated for 64tass v1.53.1515.</p>
|
|
|
|
|
|
|
|
<p>Bugs:</p>
|
|
|
|
<ul>
|
2019-04-11 16:23:02 -07:00
|
|
|
<li>Undocumented opcode <code>SHA (ZP),Y</code> ($93) is not supported;
|
2018-10-23 16:06:29 -07:00
|
|
|
the assembler appears to be expecting <code>SHA ABS,X</code> instead.</li>
|
|
|
|
<li>BRK, COP, and WDM are not allowed to have operands.</li>
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
<p>Quirks:</p>
|
|
|
|
<ul>
|
|
|
|
<li>The underscore character ('_') is allowed as a character in labels,
|
|
|
|
but when used as the first character in a label it indicates the
|
|
|
|
label is local. If you create labels with leading underscores that
|
|
|
|
are not local, the labels must be altered to start with some other
|
|
|
|
character, and made unique.</li>
|
2018-10-23 20:29:24 -07:00
|
|
|
<li>Labels starting with two underscores are "reserved". Trying to
|
|
|
|
use them causes an error.</li>
|
2018-10-23 16:06:29 -07:00
|
|
|
<li>By default, 64tass sets the first two bytes of the output file to
|
|
|
|
the load address. The <code>--nostart</code> flag is used to
|
|
|
|
suppress this.</li>
|
|
|
|
<li>By default, 64tass is case-insensitive, but SourceGen treats labels
|
|
|
|
as case-sensitive. The <code>--case-sensitive</code> must be passed to
|
|
|
|
the assembler.</li>
|
|
|
|
<li>If you set the <code>--case-sensitive</code> flag, <b>all</b> opcodes
|
2018-11-02 13:49:27 -07:00
|
|
|
and operands must be lower-case. Most of the SourceGen options used to
|
|
|
|
show things in upper case must be disabled.</li>
|
2018-10-23 16:06:29 -07:00
|
|
|
<li>For 65816, selecting the bank byte is done with the back-quote ('`')
|
|
|
|
rather than the caret ('^'). (There's a note in the docs to the effect
|
|
|
|
that they plan to move to carets.)</li>
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
|
|
|
|
<h3><a name="cc65">cc65</a></h3>
|
|
|
|
|
|
|
|
<p>Code is generated for cc65 v2.27.</p>
|
|
|
|
|
|
|
|
<p>Bugs:</p>
|
|
|
|
<ul>
|
|
|
|
<li>The arguments to MVN/MVP are reversed.</li>
|
|
|
|
<li>PC relative branches don't wrap around at bank boundaries.</li>
|
|
|
|
<li>BRK <arg> is assembled to opcode $05 rather than $00.</li>
|
|
|
|
<li>WDM is not supported.</li>
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
<p>Quirks:</p>
|
|
|
|
<ul>
|
|
|
|
<li>Operator precedence is unusual. Consider <code>label >> 8 - 16</code>.
|
|
|
|
cc65 puts shift higher than subtraction, whereas languages like C
|
|
|
|
and assemblers like 64tass do it the other way around. So cc65
|
|
|
|
regards the expression as <code>(label >> 8) - 16</code>, while the
|
|
|
|
more common interpretation would be <code>label >> (8 - 16)</code>.
|
2018-10-26 15:59:00 -07:00
|
|
|
(This is actually somewhat convenient, since none of the expressions
|
|
|
|
SourceGen currently generates require parenthesis.)</li>
|
2019-04-11 16:23:02 -07:00
|
|
|
<li>Undocumented opcode <code>SBX</code> ($cb) uses the mnemonic AXS. All
|
|
|
|
other opcodes match up with the "unintended opcodes" document.</li>
|
2018-10-30 15:42:13 -07:00
|
|
|
<li>ca65 is implemented as a single-pass assembler, so label widths
|
2018-11-02 13:49:27 -07:00
|
|
|
can't always be known in time. For example, if you use some zero-page
|
|
|
|
labels, but they're defined via .ORG $0000 after the point where the
|
|
|
|
labels are used, the assembler will already have generated them as
|
|
|
|
absolute values. Width disambiguation must be applied to operands
|
|
|
|
that wouldn't be ambiguous to a multi-pass assembler.</li>
|
2018-10-31 15:28:01 -07:00
|
|
|
<li>The assembler is geared toward generating relocatable code with
|
|
|
|
multiple segments (it is, after all, an assembler for a C compiler).
|
2018-11-18 15:20:12 -08:00
|
|
|
A linker configuration script is expected to be provided for anything
|
|
|
|
complex. SourceGen generates a custom config file for each project.</li>
|
2018-10-23 16:06:29 -07:00
|
|
|
</ul>
|
|
|
|
|
|
|
|
|
|
|
|
<h3><a name="merlin32">Merlin 32</a></h3>
|
|
|
|
|
|
|
|
<p>Code is generated for Merlin 32 v1.0.</p>
|
|
|
|
|
|
|
|
<p>Bugs:</p>
|
|
|
|
<ul>
|
|
|
|
<li>PC relative branches don't wrap around at bank boundaries.</li>
|
|
|
|
<li>For some failures, an exit code of zero is returned.</li>
|
2018-11-02 13:49:27 -07:00
|
|
|
<li>Some DP indexed store instructions cause errors if the label isn't
|
|
|
|
unambiguously DP (e.g. <code>STX $00,X</code> vs.
|
|
|
|
<code>STX $0000,X</code>). This isn't a problem with project/platform
|
|
|
|
symbols, which are output as two-digit hex values when possible, but
|
|
|
|
causes failures when direct page locations are included in the project
|
|
|
|
and given labels.</li>
|
2018-11-02 17:09:02 -07:00
|
|
|
<li>The check for 64KiB overflow appears to happen before instructions
|
|
|
|
that might be absolute or direct page are resolved and reduced in size.
|
|
|
|
This makes it unlikely that a full 64KiB bank of code can be
|
|
|
|
assembled.</li>
|
2018-10-23 16:06:29 -07:00
|
|
|
</ul>
|
|
|
|
|
|
|
|
<p>Quirks:</p>
|
|
|
|
<ul>
|
2018-10-26 15:59:00 -07:00
|
|
|
<li>Operator precedence is unusual. Expressions are generally processed
|
|
|
|
from left to right. The byte-selection operators have a lower
|
|
|
|
precedence than all of the others, and so are always processed last.</li>
|
2018-10-23 16:06:29 -07:00
|
|
|
<li>The byte selection operators ('<', '>', '^') are actually
|
|
|
|
word-selection operators, yielding 16-bit values when wide registers
|
2018-10-24 13:17:03 -07:00
|
|
|
are enabled on the 65816.</li>
|
2018-10-26 15:59:00 -07:00
|
|
|
<li>Values loaded into registers are implicitly mod 256 or 65536. There
|
|
|
|
is no need to explicitly mask an expression.</li>
|
2018-10-23 16:06:29 -07:00
|
|
|
<li>The assembler tracks register widths when it sees SEP/REP instructions,
|
2019-04-11 16:23:02 -07:00
|
|
|
but doesn't attempt to track the emulation flag. So if you issue a
|
|
|
|
<code>REP #$20</code>
|
2018-10-26 15:59:00 -07:00
|
|
|
while in emulation mode, the assembler will incorrectly assume long
|
|
|
|
registers. (Really I just want to be able to turn the width-tracking
|
|
|
|
off, but there's no way to do that.)</li>
|
|
|
|
<li>Non-unique local labels should cause an error, but don't.</li>
|
2018-10-23 16:06:29 -07:00
|
|
|
</ul>
|
|
|
|
|
2018-09-28 10:05:11 -07:00
|
|
|
</div>
|
|
|
|
|
2018-10-09 10:04:10 -07:00
|
|
|
<div id="footer">
|
2018-09-28 10:05:11 -07:00
|
|
|
<p><a href="index.html">Back to index</a></p>
|
|
|
|
</div>
|
|
|
|
</body>
|
|
|
|
<!-- Copyright 2018 faddenSoft -->
|
|
|
|
</html>
|