mirror of
https://github.com/RevCurtisP/C02.git
synced 2024-10-31 10:14:09 +00:00
531 lines
14 KiB
Plaintext
531 lines
14 KiB
Plaintext
C02 Design Considerations
|
|
|
|
Variable Storage
|
|
================
|
|
|
|
Zero Page
|
|
---------
|
|
|
|
The 6502 has a Zero Page addressing mode. This differs from the absolute
|
|
addressing mode in that the address operand is only one byte instead of
|
|
two and the instructions take less cycles to execute.
|
|
|
|
Also, some systems have very limited RAM, requiring all variables to be
|
|
stored in zero page. One example is the Atari 2600, which maps the 128
|
|
bytes of RAM in the RIOT chip to the range $80-$8F and mirrors it at
|
|
$180-$18F (for the machine stack).
|
|
|
|
To allow the use of zero page variables, the compiler recognizes the
|
|
"zeropage" declaration modifier and the "#pragma zeropage" directive,
|
|
the latter of which specifies a base address for allocating zero page
|
|
variables.
|
|
|
|
I may add a -Z command line option to allow the specification of the
|
|
zero page variable base address.
|
|
|
|
ROM Based Code
|
|
--------------
|
|
|
|
For compiled programs that will reside in ROM, such as an EPROM or a
|
|
cartridge, variables will need to reside in a separate memory area than
|
|
the program.
|
|
|
|
The compiler normally allocates all variables directly after the generated
|
|
code, const variables first and regular variables afterward. In addition,
|
|
the regular variables are all all allocated as zero bytes in the assembled
|
|
object code, which can unnecesarilly inflate the size of the generated
|
|
binary file.
|
|
|
|
The compiler allows the location of non-const variables to be specified
|
|
using the "#pragma rambase" directive.
|
|
|
|
I may add a -R command line option to allow the specification of the RAM
|
|
variable base address.
|
|
|
|
Read/Write Variables
|
|
--------------------
|
|
|
|
Some systems used different memory addresses for reading and writing to
|
|
RAM. One example is Atari 2600 catridges with RAM. The SARA Superchip
|
|
contained 128 bytes of RAM, which were written at addresses $F000 through
|
|
$F01F and read from addresses $F080 through $F0FF, while CBS RAM+ had 256
|
|
bytes which were written from $F000 though $F0FF and read from $F100
|
|
through $F1FF.
|
|
|
|
To allow for this, the compiler recognizes an additional directive,
|
|
"#pragma writebase" directive. In this case the "#pragma rambase"
|
|
will specify the base address for reads. When a write base address is
|
|
specified, the compiler will allocate all variables using the read base
|
|
address and generate an offset, which will be included in all variable
|
|
assignments. Thus the code
|
|
|
|
#pragma rambase $F100
|
|
#pragma writebase $F000
|
|
char b,c;
|
|
c = b;
|
|
|
|
would generate the assembly code
|
|
|
|
LDA B
|
|
STA C-256
|
|
B EQU $F100
|
|
C EQU $F101
|
|
|
|
I may add a -W command line option to allow the specification of the write
|
|
base address.
|
|
|
|
Variable Types
|
|
==============
|
|
|
|
Pointers
|
|
--------
|
|
|
|
The 6502 differs from nearly all other microprocessors by not having
|
|
a 16 bit index register, instead using indirect indexed mode in
|
|
conjunction with zero page.
|
|
|
|
On other processors, it would be trivial to dereference a pointer
|
|
stored anywhere in memory. For example, on the 6800:
|
|
|
|
LDX P ;N = *P
|
|
LDAA IX
|
|
|
|
There is no direct equivalent on the 6502, but by using indirect
|
|
indexed mode, any zero page variable pair can be used as a pointer:
|
|
|
|
LDY #0 ;N = *P
|
|
LDA (P),Y
|
|
|
|
Additionally, accessing elements of a dereferenced zero page pointer
|
|
is just as trivial:
|
|
|
|
LDY I ;N = *P[I]
|
|
LDA (P),Y
|
|
|
|
expr ;N - *P[expr]
|
|
TAY
|
|
LDA (P),Y
|
|
|
|
However, trying to use a 16-bit value stored outside of zero-page
|
|
would take extra code and require using a a zero page byte pair:
|
|
|
|
LDY P ;N = *P
|
|
STY ZP
|
|
LDY P+1
|
|
STY ZP+1
|
|
LDY #0
|
|
LDA (ZP),Y
|
|
|
|
This violates two principles of the C02 design philosophy: that
|
|
each token correspond to one or two machine instructions and
|
|
that the compiler is completely agnostic of the system configuration.
|
|
|
|
Therefore, if pointers are implemented, they will have to be in
|
|
zero page.
|
|
|
|
Implementation:
|
|
|
|
Declaration of a pointer should use two bytes of zero page storage,
|
|
the address of which would be taken from the free zero page space
|
|
specified by the #pragma zeropage directive.
|
|
|
|
P EQU $80 ;char *p
|
|
Q EQU $82 ;char *q
|
|
|
|
Any pointers declared in a header file would added to the variable
|
|
table but not allocated, so would be defined in the accompanying
|
|
assembly file:
|
|
|
|
char *dst; //stddef.h02
|
|
char dstlo,dsthi;
|
|
|
|
DST EQU $30 ;stddef.a02
|
|
DSTLO EQU $30
|
|
DSTHI EQU $31
|
|
|
|
Since they are 16-bit values, raw pointers cannot be passed into
|
|
used directly in expressions. But could be used in standalone
|
|
assignments:
|
|
|
|
LDX Q ;p = q
|
|
STX P
|
|
LDY Q+1 ;X is used for LSB and Y as MSB to match
|
|
STY P+1 ;the address passing convention for functions
|
|
|
|
Likewise if a 16-bit integer variable type was added to C02,
|
|
then one could be assigned to the other using the exact same
|
|
assembly code.
|
|
|
|
A derefenced pointer can be used anywhere an array references
|
|
is allowed and would be subject to the same restrictions.
|
|
|
|
A raw pointer used as an argument to a function will be passed
|
|
using the same convention as an address:
|
|
|
|
LDY P+1 ;func(p)
|
|
LDX P
|
|
JSR FUNC
|
|
|
|
JSR FUNC ;p = func()
|
|
STY P+1
|
|
STX P
|
|
|
|
Since pointers are zero page the address of operator on a pointer
|
|
will generate a char value:
|
|
|
|
LDA #P ;&p
|
|
|
|
This will allow passing of pointers for use inside of functions
|
|
using indexed mode:
|
|
|
|
LDA #P ;inc(&p)
|
|
JSR FUNC
|
|
|
|
TAX ;void inc()
|
|
INC ($00,X)
|
|
RTS
|
|
|
|
The memio module makes extensive use of pointer addresses as
|
|
function arguments.
|
|
|
|
Structs
|
|
-------
|
|
|
|
For the declarations
|
|
|
|
STRUCT RECORD { CHAR NAME[8]; CHAR INDEX; CHAR DATA[128]; };
|
|
STRUCT RECORD REC;
|
|
|
|
references to the members of the struct REC will generate the code:
|
|
|
|
XXA REC+$09 ;REC.INDEX
|
|
|
|
LDX I ;REC.DATA[I]
|
|
XXA REC+$0A,X
|
|
|
|
Using the address of operator on a struct member generates assembly
|
|
code with parentherical expressions, which are not recognized by all
|
|
assemblers:
|
|
|
|
LDY #>(REC+$0A) ;FUNC(&REC.DATA)
|
|
LDX #<(REC+$0A)
|
|
JSR FUNC
|
|
|
|
The compiler could optimize the generation of code for references
|
|
to the first member of a struct, producing
|
|
|
|
XXA REC ;REC.NAME
|
|
|
|
instead of
|
|
|
|
XXA REC+$00 ;REC.NAME
|
|
|
|
but the machine code produced by the assembler should be indentical
|
|
in either case.
|
|
|
|
|
|
Expression Evaluation
|
|
=====================
|
|
|
|
Array Indexes
|
|
-------------
|
|
|
|
Array indexing normally uses the X register and indexed addressing mode:
|
|
|
|
LDX I ;R[I]
|
|
XXA R,X
|
|
|
|
If the index is a constant or literal, absolute addressing is used:
|
|
|
|
LDA R+1 ;R[1]
|
|
XXA S+0 ;S[0]
|
|
|
|
Specifying a register as the index also uses indexed addressing mode:
|
|
|
|
TAX ;R[A]
|
|
XXA R,X
|
|
XXA R,Y ;R[Y]
|
|
XXA R,X ;R[X]
|
|
|
|
Allowing for an expression as the index in the first term of an
|
|
expression uses only one extra byte of code and two machine cycles:
|
|
|
|
expression ;R[expr]
|
|
TAX
|
|
LDAA R,X
|
|
|
|
while in any other termm is uses an extra three extra bytes of code
|
|
and ten machine cycles:
|
|
|
|
PHA ;R[expr]
|
|
expr code ;code to evaluate the expression
|
|
TAX
|
|
PLA
|
|
XXA R,X
|
|
|
|
compared to the extra four to six bytes and six to eight machine cycles
|
|
used by the equivalent C02 code required to achieve the same result:
|
|
|
|
expr code ;Z = expr
|
|
STA Z
|
|
LDX Z ;R[Z]
|
|
XXA R,X
|
|
|
|
Function Calls
|
|
--------------
|
|
|
|
A function call in the first term of an expression requires additional
|
|
processing by the compiler, since the accumulator holds the return
|
|
value upon completion of the function call:
|
|
|
|
JSR FUNC ;R = func()
|
|
STA R
|
|
|
|
Allowing a function call in susbsequent terms, however, requires
|
|
extensive stack manipulation:
|
|
|
|
|
|
|
|
whereas the equivalent C02 code generates much simpler machine code:
|
|
|
|
|
|
|
|
|
|
Shift Operators
|
|
----------------
|
|
|
|
In standard C, the shift operators use the number of bits to shift as
|
|
the operand. Since the 6502 shift and rotate instructions only shift
|
|
one bit at a time, this would require the use of an index register and
|
|
six to seven bytes of code
|
|
|
|
expr code ;expr
|
|
.LOOP LDY B ;>> B
|
|
LSR
|
|
DEY
|
|
BNE .LOOP
|
|
|
|
whereas a library function would require five bytes of code:
|
|
|
|
SHIFTR: LSR ;A=Value to Shift
|
|
DEY ;Y=Bits to Shift
|
|
BNE SHIFTR
|
|
RTS
|
|
|
|
and each function call would use five bytes of code
|
|
|
|
expr code ;shiftr(expr, B)
|
|
LDY B
|
|
JSR SHIFTR
|
|
|
|
Following the philosophy that a operator should correspond to a single
|
|
machine instruction, in C02 shifting is implemented as post-operators
|
|
using the various available addressing modes:
|
|
|
|
ASL S ;S<<
|
|
|
|
LDX I ;T[I]>>
|
|
LSR T,X
|
|
|
|
ASL ;A<<
|
|
|
|
|
|
Post-Operators and Pre-Operators
|
|
--------------------------------
|
|
|
|
Parsing for post-operators in a standalone expression is trivial, since
|
|
that is the only time the relevant characters will only follow the operand.
|
|
|
|
Implementing pre-operators on standalone expressions would be redundant
|
|
since their would be no difference in the generated code.
|
|
|
|
Parsing for post-operators and/or preoperators within an expression or
|
|
evaluation, however, would complicate the detection of operators and
|
|
comparators.
|
|
|
|
In addition, the code generated from a post-operator or pre-operator
|
|
withing an expression:
|
|
|
|
DEC I ;R[--I]
|
|
LDX I
|
|
XXA R,X
|
|
|
|
LDX I ;R[I++]
|
|
XXA R,X
|
|
INC I
|
|
|
|
Is indentical to the code generated when using a standalone post-operator;
|
|
|
|
DEC I ;I--
|
|
LDX I ;R[I]
|
|
XXA R,X
|
|
|
|
LDX I ;R[I]
|
|
XXA R,X
|
|
INC I ;I++
|
|
|
|
Assignments
|
|
===========
|
|
|
|
Assignment to a variable generates an STA instruction, using either absolute
|
|
or indexed addressing mode:
|
|
|
|
expr code ;R = expr
|
|
STA R
|
|
|
|
expr code ;R[2] = expr
|
|
STA R+2
|
|
|
|
expr code ;R[I] = expr
|
|
LDX I
|
|
STA R,I
|
|
|
|
while assignment to an index register generates a transfer instruction:
|
|
|
|
expr code ;Y = expr
|
|
TAY
|
|
|
|
expr code ;X = expr
|
|
TAX
|
|
|
|
and assignment to the Accumulator is simply a syntactical convenience:
|
|
|
|
expr code ;A = expr
|
|
|
|
Specific to C02 is the implied assignment, which also generates an STA:
|
|
|
|
STA S ;S
|
|
|
|
Allowing an expression as an array index in an assignment is problematic
|
|
on an NMOS 6502 since the index expression will be evaluated prior to the
|
|
expression being assigned to the array element and neither index register
|
|
may be directly pushed or pulled from the stack.
|
|
|
|
The addition of the PLX instruction to the 65C02 would allows this to be
|
|
done using only two extra bytes of code for each variable assignment:
|
|
|
|
indexp oode ;R[indexp] = expr
|
|
PHA
|
|
expr code
|
|
PLX
|
|
STA R,X
|
|
|
|
expr oode ;R[expr], S[exps] = func()
|
|
PHA
|
|
exps code
|
|
PHA
|
|
JSR FUNC
|
|
PLX
|
|
STY S,X
|
|
PLX
|
|
STA R,X
|
|
|
|
however, this would only work with the A and Y variables of a plural
|
|
assignment.
|
|
|
|
A workaround for the NMOS 6502 would require an extra five bytes of code
|
|
|
|
indexp oode ;R[indexp] = expr
|
|
PHA
|
|
expr code
|
|
TAY
|
|
PLA
|
|
TAX
|
|
TYA
|
|
STA R,X
|
|
|
|
and the use of the Y register would limit it to the only the A variable
|
|
of a plural assigment, whereas the equivalent C02 code would use an
|
|
extra four to six bytes of code
|
|
|
|
indexp code ;I = indexp;
|
|
STA I
|
|
expr code ;R[I] = expr
|
|
LDX I
|
|
STA R,X
|
|
|
|
and works with all three variables of a plural assignment.
|
|
|
|
Conditionals
|
|
============
|
|
|
|
Conditionals are separate and distinct from expessions, due to the fact
|
|
that the comparison operators all set status flags which affect the various
|
|
branch instructions:
|
|
|
|
CMP DATA Normal Inverted
|
|
Reg < Data BCC BCS
|
|
Reg = Data BEQ BNE
|
|
Reg ≥ Data BCS BCC
|
|
Reg ≠ Data BNE BEQ
|
|
|
|
CLC:SBC DATA Normal Inverted
|
|
Reg ≤ Data BCC BCS
|
|
Reg > Data BCS BCC
|
|
|
|
When compiling a comparison, the generated code will usually, but not
|
|
always, branch when the condition is false, skipping to the end of the
|
|
block of code following the comparison. In addition, the logical not
|
|
operator will invert the comparison.
|
|
|
|
By arranging the eight standard comparisons, along with evaluation of
|
|
a term as true when non-zero, in this order:
|
|
|
|
0 1 2 3 4 5 6 7
|
|
!0 = < ≤ ≥ > ≠ 0
|
|
|
|
the comparison can be inverted with a simple exclusive-or. For this
|
|
reason, The ! operator is logical rather than bitwise and affects
|
|
the result of the comparison rather than an individual expression
|
|
or term within the expression.
|
|
|
|
Standalone Expressions
|
|
----------------------
|
|
|
|
Flags Operators
|
|
---------------
|
|
|
|
Logical Operators
|
|
-----------------
|
|
|
|
Parsing the logical operators && and || is trivial if the preceding
|
|
condition is a comparison or flag operation, since any expression
|
|
evaluation is complete before the parse encounters the initial & or |
|
|
character. For a standalone evaluation of an expression as true or\
|
|
false, however, the expression evaluator will mistake the initial
|
|
character of the && or || as a bitwise operator. Differentiating
|
|
the two would require changing to a look-ahead parser.
|
|
|
|
One solution is to enclose require parenthises around each comparison
|
|
when using logical operators, which is allowable in C syntax, but
|
|
it's just as easy, and arguably cleaner looking, to use the words
|
|
"and" and "or" instead. This is be allowble in standard C by using
|
|
the @define directive to alias "and" to "&&" and "or" to "||".
|
|
|
|
The most efficient way to implement logical operators is to use shortcut
|
|
evaluations.
|
|
|
|
Under the normal circumstances, where the generated code branches when
|
|
the condition is false, && can be implemented by evaluation the next
|
|
comparison only in the event of the first condition being true.
|
|
|
|
For ||, however, the following comparison will be evaluated only if the
|
|
first comparison was false.
|
|
|
|
LDA I ;IF (I<J
|
|
CMP J
|
|
BCS ENDIF
|
|
LDA M ;&& M<>N)
|
|
CMP N
|
|
BEQ ENDIF
|
|
|
|
LDA I ;IF (I<J
|
|
CMP J
|
|
BCC block
|
|
LDA M ;|| M<>N)
|
|
CMP N
|
|
BEQ ENDIF
|
|
block
|
|
|
|
When chaining multiple && and/or || operators, the shortcut evaluation
|
|
effectively make them right-associative. |