INTRODUCTION

C02 is a simple C-syntax language designed to generate highly optimized code for the 6502 microprocessor. The C02 specification is a highly specific subset of the C standard with some modifications and extensions.

PURPOSE

Why create a whole new language, particularly one with severe restrictions, when there are already full-featured C compilers available? It can be argued that standard C is a poor fit for processors like the 6502. The C language was designed to translate directly to machine language instructions whenever possible. This works well on 32-bit processors, but requires either a byte-code interpreter or the generation of complex code on a typical 8-bit processor. C02, on the other hand, has been designed to translate directly to 6502 machine language instructions.

The C02 language and compiler were designed with two goals in mind.

The first goal is the ability to target machines with low memory: a few kilobytes of RAM (assuming the generated object code is to be loaded into and run from RAM), or as little as 128 bytes of RAM and 2 kilobytes of ROM (assuming the object code is to be run from a ROM or PROM). The compiler is agnostic with regard to system calls and library functions. Calculations and comparisons are done with 8-bit precision. Intermediate results, array indexing, and function calls use the 6502 internal registers. While this results in compiled code with virtually no overhead, it severely restricts the syntax of the language.

The second goal is to port the compiler to C02 code so that it may be compiled by itself and run on any 6502-based machine with sufficient memory and appropriate peripherals. This slightly restricts the implementation of code structures.

SOURCE AND OUTPUT FILES

C02 source code files are denoted with the .c02 extension.
The compiler reads the source code file, processes it, and generates an assembly language file with the same name as the source code file, but with the .asm extension instead of the .c02 extension. This assembly language file is then assembled to create the final object code file.

Note: The default implementation of the compiler creates assembly language code formatted for the DASM assembler. The generation of the assembly language is parameterized, so it may be easily changed to work with other assemblers.

COMMENTS

The parser recognizes both C style and C++ style comments.

C style comments begin with /* and end at the next */. Nested C style comments are not supported.

C++ style comments begin with // and end at the next newline. C++ style comments may be nested inside C style comments.

DIRECTIVES

Directives are special instructions to the compiler. Depending on the directive, it may or may not generate compiled code. A directive is denoted by a leading # character. Unlike a statement, a directive is not followed by a semicolon.

Note: Unlike standard C and C++, which use a preprocessor to process directives, the C02 compiler processes directives directly.

DEFINE DIRECTIVE

The #define directive is used to define constants (see below).

INCLUDE DIRECTIVE

The #include directive causes the compiler to read and process an external file. In most cases, #include directives will be used with libraries of function calls, but they can also be used to modularize the code that makes up a program.

An #include directive is followed by the file name to be included. This file name may be surrounded either by < and > characters or by two " characters. In the former case, the compiler looks for the file in an implementation-specific library directory (the default being ./include), while in the latter case, the compiler looks for the file in the current working directory.

Two file types are currently supported.

Header files are denoted by the .h02 extension.
A header file is used to provide the compiler with the information necessary to use machine language system and/or library routines written in assembly language, and consists of comments and declarations. The declarations in a header file are added to the symbol table, but do not directly generate code. After a header file has been processed, the compiler reads and processes an assembly language file with the same name as the header file, but with the .a02 extension instead of the .h02 extension.

The compiler does not currently generate any assembler-required pseudo-operators, such as the specification of the target processor, or the starting address of the assembled object code. Therefore, at least one header file, with an accompanying assembly language file, is needed in order to successfully assemble the compiler generated code. Details on the structure and implementation of a typical header file can be found in the file header.txt.

Assembly language files are denoted by the .a02 extension. When the compiler processes an assembly language file, it simply inserts the contents of the file into the generated code.

PRAGMA DIRECTIVE

The #pragma directive is used to set various compiler options. A #pragma directive is followed by the pragma name and possibly an option, each separated by whitespace.

Note: The various pragma directives are specific to the cross-compiler and may be changed or omitted in future versions of the compiler.

PRAGMA ASCII

The #pragma ascii directive instructs the compiler to convert the characters in literal strings to a form expected by the target machine.

Options:
  #pragma ascii high     //Sets the high bit to 1 (e.g. Apple II)
  #pragma ascii invert   //Swaps upper and lower case (e.g. PETSCII)

PRAGMA ORIGIN

The #pragma origin directive sets the target address of compiled code.
Examples:
  #pragma origin $0400   //Compiled code starts at address 1024
  #pragma origin 8192    //Compiled code starts at address 8192

PRAGMA PADDING

The #pragma padding directive adds empty bytes to the end of the compiled program. It should be used with target systems that require the object code to be padded with a specific number of bytes.

Examples:
  #pragma padding 1      //Add one empty byte to end of code
  #pragma padding $FF    //Add 255 empty bytes to end of code

PRAGMA RAMBASE

The #pragma rambase directive sets the base address for variables in RAM (those not declared const). This is normally used when the compiled code will be stored in ROM (such as in an EPROM or cartridge), but can be used any time variables should be in a specific area of RAM.

Examples:
  #pragma rambase $0200  //Define Variable RAM base address for NES
  #pragma rambase 828    //Define Variable RAM in C64 Tape Buffer
  #pragma rambase 4096   //Define RAM base for Vic 20 ROM cartridge

PRAGMA VARTABLE

The #pragma vartable directive forces the variable table to be written. It should be used before any #include directives that need to generate code following the table.

PRAGMA WRITEBASE

The #pragma writebase directive sets the base address for writing to variables in RAM. This is used when the target system uses different addresses for reading and writing the same memory locations. This directive must be preceded by a #pragma rambase directive with a non-zero argument.

Examples:
  #pragma rambase $F080    //Define Superchip RAM Read Base for Atari 2600
  #pragma writebase $F000  //Define Superchip RAM Write Base for Atari 2600

Note: Setting a RAM write base causes the compiler to generate a write offset which is concatenated to the variable name on all assignments.

PRAGMA ZEROPAGE

The #pragma zeropage directive sets the base address for variables declared as zeropage.

Example:
  #pragma zeropage $80   //Start zeropage variables at address 128

LITERALS

A literal represents a value between 0 and 255.
Values may be written as a number (binary, decimal, or hexadecimal) or a character literal.

A binary number consists of a % followed by eight binary digits (0 or 1).

A decimal number consists of one to three decimal digits (0 through 9).

A hexadecimal number consists of a $ followed by two hexadecimal digits (0 through 9 or A through F).

A character literal consists of a single character surrounded by ' symbols. A ' character may be specified by escaping it with a \.

Examples:
  %01010101   Binary Number
  123         Decimal Number
  $FF         Hexadecimal Number
  'A'         Character Literal
  '\''        Escaped Character Literal

STRINGS

A string is a consecutive series of characters terminated by an ASCII null character (a byte with the value 0).

A string literal is written as up to 255 printable characters, prefixed and suffixed with " characters.

The " character and a subset of ASCII control characters can be specified in a string literal by using escape sequences prefixed with the \ symbol:
  \b   $08   Backspace
  \e   $1B   Escape
  \f   $0C   Form Feed
  \n   $0A   Line Feed
  \r   $0D   Carriage Return
  \t   $09   Tab
  \v   $0B   Vertical Tab
  \"   $22   Double Quotation Mark
  \\   $5C   Backslash

SYMBOLS

A symbol consists of an alphabetic character followed by zero to five alphanumeric characters. Four types of symbols are supported: labels, simple variables, variable arrays, and functions.

A label specifies a target point for a goto statement. A label is written as a symbol suffixed by a : character.

A constant represents a literal value. A constant is written as a symbol prefixed by the # character.

A simple variable represents a single byte of memory. A variable is written as a symbol without a suffix.

A variable array represents a block of up to 256 contiguous bytes in memory. An array reference is written as a symbol suffixed with a [ character, an index, and a ] character. The lowest index of an array is 0, and the highest index is one less than the number of bytes in the array.
There is no bounds checking on arrays: referencing an element beyond the end of the array will access indeterminate memory locations.

A function is a subroutine that receives zero or more values as arguments and optionally returns a value. A function is written as a symbol suffixed with a ( character, up to three arguments separated by commas, and a ) character.

The special symbols A, X, and Y represent the 6502 registers with the same names. Registers may only be used in specific circumstances (which are detailed in the following text). Various C02 statements modify registers as they are processed, so care should be taken when using them. However, when used properly, register references can increase the efficiency of compiled code.

STATEMENTS

Statements include declarations, assignments, stand-alone function calls, and control structures. Most statements are suffixed with a ; character, but some may be followed by a program block.

BLOCKS

A program block is a series of statements surrounded by the { and } characters. Program blocks may only be used with function definitions and control structures.

CONSTANTS

A constant is defined by using the #define directive followed by the constant name (without the # prefix) and the literal value to be assigned to it.

Examples:
  #define TRUE $FF
  #define FALSE 0
  #define BITS %01010101
  #define ZED 'Z'

ENUMERATIONS

An enumeration is a sequential list of constants. Enumerations are used to generate sets of related but distinct values.

An enumeration is defined using an enum statement. The enum keyword is followed by a { character, one or more constant names separated by commas, and a } character. A period may be used in place of a constant name, in which case that value in the sequence will be skipped. The enum statement is terminated with a semicolon.
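The numbering that an enum statement produces can be modelled in a few lines of Python (not C02). This is only a sketch: it assumes that numbering starts at 0 and that a period uses up its value without naming a constant, and enum_values is a hypothetical helper rather than part of the compiler.

```python
def enum_values(names):
    """Assign sequential values to enum entries; '.' skips a value."""
    values = {}
    for position, name in enumerate(names):
        if name != ".":          # a period consumes the value without naming it
            values[name] = position
    return values

print(enum_values([".", "FIRST", "SECOND", "THIRD"]))
# {'FIRST': 1, 'SECOND': 2, 'THIRD': 3}
```

Under these assumptions, a leading period is a convenient way to make an enumeration start at 1 instead of 0.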
Examples:
  enum {BLACK, WHITE, RED, CYAN, PURPLE, GREEN, BLUE, YELLOW};
  enum {., FIRST, SECOND, THIRD};
  enum {ZERO, ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN};

Note: Values are automatically assigned to the constants in an enumeration. The first constant in the enumeration is assigned the value 0, the second is assigned the value 1, and so on.

BITMASKS

Bitmasks are a list of constants with values that correspond to each bit in a byte. Bitmasks are used to allow multiple true/false flags to be combined into a single char variable.

A bitmask is defined using a bitmask statement. The bitmask keyword is followed by a { character, one to eight constant names separated by commas, and a } character. A period may be used in place of a constant name, in which case the bit value will be skipped. The bitmask statement is terminated with a semicolon.

Examples:
  bitmask {BLUE, GREEN, RED, BRIGHT, INVERT, BLINK, FLIP, BKGRND};
  bitmask {RD, RTS, DTR, RI, CD, ., CTS, DSR};

Note: Values are automatically assigned to the constants in a bitmask, each of which is a sequential power of two. The first constant in the bitmask is assigned the value 1, the second is assigned the value 2, the third is assigned the value 4, and so on.

DECLARATIONS

A declaration statement consists of a type keyword (char, int, or void) followed by one or more variable names (and optional definitions) or by a single function name and optional function block, or of the struct keyword followed by a structure name and either a definition or a variable name.

Variables may only be of type char or int, and all variable declaration statements are suffixed with a ; character.

Variables of type char may be declared as arrays by appending to the variable name the [ character, the upper bound of the array (0 to 255), and the ] character.
Examples:
  char c;       //Defines 8-bit variable c
  char hi,lo;   //Defines 8-bit variables hi and lo
  char r[7];    //Defines 8 byte array r
  int addr;     //Defines 16-bit variable addr
  int i, j;     //Defines 16-bit variables i and j

A function declaration consists of the function name suffixed with a ( character, followed by an optional parameter set and a ) character. The parameter set, if specified, may be either one to three simple char variables, a single int variable, or a char variable followed by an int variable.

A function declaration terminated with a ; character is called a forward declaration and does not generate any code, while one followed by a program block creates the specified function. Functions of type char and int explicitly return one or more values (using a return statement), while functions of type void return no explicit value.

The return statement causes the function to exit, after which control returns to the statement immediately following the function call. If the last statement before the closing } of the function body is not a return, then an implicit return is assumed.

A return statement may be followed by a list of one to three variables following the same rules as function arguments (see FUNCTION CALLS, below), in which case those values are returned by the function call; otherwise the function call will not return any explicit values.

Examples:
  void myfunc();   //Forward declaration of function myfunc
  char not(tmp) {tmp = tmp ^ $FF;}
  char max(tmp1, tmp2) {if (tmp1 > tmp2) return tmp1; else return tmp2;}
  char min(tmp1, tmp2) {if (tmp1 < tmp2) return tmp1; else return tmp2;}
  char test(b,c,d) {if (b:-) return min(c,d); else return max(c,d);}
  int swap(*,msb,lsb) {return *,lsb,msb;}   //Swap bytes in integer
  int incdec(m,i) {if (m:-) i--; else i++; return i;}   //inc/dec integer

Note: Like all variables, function parameters are global. They must be declared prior to the function declaration, and retain their values after the function call.
Although functions may be called recursively, they are not re-entrant. Allocation of variables and functions is implementation dependent: they could be placed in any part of memory and in any order. The default behavior is to place variables directly after the program code, including them as part of the generated object file.

Function arguments and return values are passed through the 6502 registers. Char type values are passed by loading A, Y, and X respectively, while int type values are passed by loading Y with the most significant byte and X with the least significant byte. A return statement without explicit return values will return whatever happens to be in the registers at that time.

STRUCTS

A struct is a special type of variable which is composed of one or more unique variables called members. Each member may be either a simple variable or an array. However, the total size of the struct may not exceed 256 bytes.

Member names are local to a struct: each member within a struct must have a unique name, but the same member name can be used in different structs and may also have the same name as a global variable.

The struct keyword is used both for defining the members of a struct type and for declaring struct type variables.

When defining a struct type, the struct keyword is followed by the name of the struct type, the { character, the member definitions, and the } character. The struct definition is terminated with a semicolon.

Each member definition is composed of a type keyword (char, int, or struct) and one or more member names, separated with commas. If a member is an array, the member name is suffixed with the [ character, the upper bound of the array, and the ] character. Each member definition is terminated with a semicolon. Any number of comments may appear before the first member, between members, and after the last member.

Member names are limited to six alphanumeric characters, the first of which must be alphabetic.
Any name is allowed, including reserved words as well as A, X, and Y (which, in this case, do not refer to registers).

When declaring a struct variable, the struct keyword is followed by the struct type name and one or more variable names, separated with commas. The struct declaration is terminated with a semicolon.

Examples:
  struct record {char name[8]; char index; int addr; char data[128];};
  struct record rec;
  struct record srcrec, dstrec;
  struct point {char x, y;};
  struct point pnt;
  struct line {struct point bgnpnt, endpnt;};

Note: Unlike simple and array variables, the members of a struct variable may not be initialized during declaration.

MODIFIERS

A modifier is used with a declaration to override the default properties of an object. Modifiers may currently only be used with simple variable and array declarations, although this may be expanded in the future.

The alias modifier specifies that a variable is to be located at a specific address. The specified address may either be a literal in the range 0 to 65535 ($0 to $FFFF) or a previously defined variable name. When using the alias modifier, the declared variable must be followed by the = character and the literal or variable name to be aliased to.

The aligned modifier specifies that the variable or array will start on a page boundary. This is used to ensure that accessing an array element will not cross a page boundary, which requires extra CPU cycles to execute.

The const modifier specifies that a variable or array should not be changed by program code. The const modifier may be preceded by an aligned or zeropage modifier. A const variable declaration may include an initial value definition in the form of an = character and literal after the variable name.
A const array is declared in one of two ways: the variable name suffixed with a [ character, a literal specifying the upper bound of the array, and a ] character; or a variable name followed by an = character and a string literal or a series of string and/or numeric literals separated by commas and surrounded by the { and } characters.

The zeropage modifier specifies that the variable will be defined in page zero (addresses 0 through 255). It should be used in conjunction with the #pragma zeropage directive.

Examples:
  alias char putcon = $F001;      //Defines variable putcon with address $F001
  alias char alpha = omega;       //Defines variable alpha aliased to omega
  aligned char table[240];        //Defines 241 byte array aligned to page boundary
  const char debug = #TRUE;       //Defines variable debug initialized to constant TRUE
  const char flag = 1;            //Defines variable flag initialized to 1
  const char s = "string";        //Defines 7 byte string s initialized to "string"
  const char n = {1,2,3};         //Defines 3 byte array n containing 1, 2, and 3
  const char m = {"abc", 123};    //Defines 5 byte array containing string and byte
  const char t = {"ab", "cd"};    //Defines 6 byte array of two strings
  aligned const char fbncci = {0, 1, 1, 2, 3, 5, 8, 13, 21, 34};
  zeropage char ptr, tmp;         //Defines zero page variables

EXPRESSIONS

An expression is a series of one or more terms separated by operators.

Each term in an expression may be any of the following:
  function call (first term only)
  subscripted array element
  char type variable, struct member, constant, or literal
  byte operation
  register (A, X, or Y)

An expression may be preceded with a - character, in which case the first term is assumed to be a literal 0.

Operators:
  +   Add the following value.
  -   Subtract the following value.
  &   Bitwise AND with the following value.
  |   Bitwise OR with the following value.
  ^   Bitwise Exclusive OR with the following value.

Arithmetic operators have no precedence. All operations are performed in left to right order.
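The effect of having no operator precedence is easiest to see by comparison with a language that has it. The Python sketch below (a model, not C02 code) applies each operator to the running result in turn. The evaluate helper is hypothetical, and the wrap-around of results to a single byte is an assumption based on the 8-bit precision described earlier.

```python
def evaluate(first, *rest):
    """Evaluate `first op term op term ...` strictly left to right in 8 bits."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "&": lambda a, b: a & b,
        "|": lambda a, b: a | b,
        "^": lambda a, b: a ^ b,
    }
    result = first & 0xFF
    # rest alternates operator, term, operator, term, ...
    for op, term in zip(rest[0::2], rest[1::2]):
        result = ops[op](result, term) & 0xFF   # assumed: results wrap to one byte
    return result

print(evaluate(6, "&", 3, "+", 1))   # left to right: (6 & 3) + 1 = 3
print(6 & 3 + 1)                     # Python and C precedence: 6 & (3 + 1) = 4
```

An expression ported from standard C may therefore need its terms reordered to give the same result.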
Expressions may not contain parentheses.

Note: The character ! may be substituted for | on systems that do not support the latter character. No escaping is necessary because a ! may not appear anywhere a | would.

After an expression has been evaluated, the A register will contain the result.

Note: Function calls are allowed in the first term of an expression because upon return from the function the return value will be in the Accumulator. However, due to the 6502 having only one Accumulator, which is used for all operations between two bytes, there is no simple system-agnostic method for allowing function calls in subsequent terms.

BYTE OPERATIONS

Byte operations allow the bytes in an integer value to be accessed as individual character values. A byte operation consists of a byte operator prefixed to an integer value.

Byte Operators:
  <   Get Least Significant Byte
  >   Get Most Significant Byte

The integer value may be an integer literal, an address, or an int type variable or struct member.

Examples:
  hi = >&r; lo = <&r;   //Set hi, lo to MSB, LSB of address of array r
  page = >53281;        //Set page to MSB of the integer literal 53281
  lsb =

  >    Evaluates to TRUE if expression is greater than term
  >=   Evaluates to TRUE if expression is greater than or equal to term
  <>   Evaluates to TRUE if expression is not equal to term

The parser considers == equivalent to a single =. The operator <> was chosen instead of the usual != because it simplified the parser design.

A test consists of an expression followed by a test-op.

Test-Ops:
  :+   Evaluates to TRUE if the result of the expression is positive
  :-   Evaluates to TRUE if the result of the expression is negative

A negative value is one in which the high bit is a 1 (128 to 255), while a positive value is one in which the high bit is a 0 (0 to 127). The primary purpose of test operators is to check the results of functions that return a positive value upon successful completion and a negative value if an error was encountered.
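The positive and negative ranges given above come down to a single check of bit 7. The following Python sketch models that check; the function names are hypothetical and the code illustrates the rule rather than the compiler's output.

```python
def is_negative(value):
    """Model of the :- test-op: true when bit 7 is set (128 to 255)."""
    return (value & 0x80) != 0

def is_positive(value):
    """Model of the :+ test-op: true when bit 7 is clear (0 to 127)."""
    return (value & 0x80) == 0

for sample in (0, 127, 128, 255):
    print(sample, is_positive(sample), is_negative(sample))
```

Note that 0 counts as positive under this rule, so a function that signals success with 0 passes a :+ test.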
Test-ops compile into smaller code than would be generated using the equivalent comparison operators.

A contention may be preceded by the negation operator (the ! character), which reverses the result of the entire contention. For example:
  ! expr
evaluates to TRUE if expr is zero, or FALSE if it is non-zero; while
  ! expr = term
evaluates to TRUE if expr and term are not equal, or FALSE if they are; and
  ! expr :+
evaluates to TRUE if expr is negative, or FALSE if it is positive.

Note: Contentions are compiled directly into 6502 conditional branch instructions, which precludes their use inside expressions. Standalone expressions and test-ops generate a single branch instruction, and therefore result in the most efficient code. Comparisons generate a compare instruction and one or two branch instructions (=, <, >=, and <> generate one, while <= and > generate two). A preceding negation operator will switch the number of branch instructions used in a comparison, but otherwise does not change the size of the generated code.

CONDITIONALS

A conditional consists of one or more contentions joined with the conjunctors "and" and "or".

If only one contention is present, the result of the conditional is the same as the result of the contention.

If two contentions are joined with "and", then the conditional is true only if both of the contentions are true. If either or both of the contentions are false, then the conditional is false.

If two contentions are joined with "or", then the conditional is true if either or both of the contentions are true. If both of the contentions are false, then the conditional is false.

When three or more contentions are chained together, the conjunctors are evaluated in left to right order, using short-circuit evaluation. If the contention to the left of an "and" is false, then the entire conditional evaluates to false, and if the contention to the left of an "or" is true, then the entire conditional evaluates to true.
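One reading of the chaining rule above is modelled in the Python sketch below. It assumes that the first conjunctor whose left-hand contention decides the outcome ends evaluation of the whole conditional, which differs from the precedence-based rules of standard C. The conditional and probe helpers are hypothetical; contentions are passed as callables so that skipped ones are visibly never run.

```python
def conditional(first, *rest):
    """Evaluate contentions joined by 'and'/'or' strictly left to right."""
    result = bool(first())
    # rest alternates conjunctor, contention, conjunctor, contention, ...
    for conjunctor, contention in zip(rest[0::2], rest[1::2]):
        if conjunctor == "and" and not result:
            return False          # left side false: whole conditional is false
        if conjunctor == "or" and result:
            return True           # left side true: whole conditional is true
        result = bool(contention())
    return result

calls = []

def probe(name, value):
    """Build a contention that records when it is evaluated."""
    def run():
        calls.append(name)
        return value
    return run

outcome = conditional(probe("a", False), "and", probe("b", True), "or", probe("c", True))
print(outcome, calls)   # prints: False ['a']
```

Under this reading, a false contention before an "and" makes the whole chain false even when a later "or" branch would have been true, so the order of contentions matters.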
In either case, evaluation stops there: no further contentions in the conditional are evaluated.

ARRAY SUBSCRIPTS

Individual elements of an array are accessed using subscript notation. Subscripted array elements may be used as terms in an expression, as well as the target variable in an assignment. They are written as the variable name suffixed with a [ character, followed by an index, and the ] character.

When assigning to an array element, the index may be a literal, constant, or simple variable.

When using an array element in an expression or pop statement, the index may be any expression.

Examples:
  z = r[i];        //Store the value from element i of array r into variable z
  r[0] = z;        //Store the value of variable z into the first element of r
  z = d[15-i];     //Store the value of element 15-i of array d into variable z
  c = t[getc()];   //Get a character, translate using array t and store in c

Note: Register references may be used as array indexes within expressions, but the contents of each register may change with each term evaluation.

Using a constant, literal, or the X or Y registers as an array index will generate the same amount of code as a simple variable reference and leave both the X and Y registers unchanged.

Using the A register as an index will generate one extra byte of code, while using a simple variable as an index will generate 1 to 2 extra bytes of code. In either case, the index value will be left in the X register.

When an expression is used as an index, one extra byte of stack space is used, and an additional three bytes of code are generated. The X register will contain the result of the expression and the Y register will be left in an unknown state.

STRUCTS

Individual members of a struct variable are specified using the struct variable name, a period, and the member name. If a member is an array, its elements are accessed using the same syntax as an array variable.
A struct variable can also be treated like an array variable, with each byte of the variable accessed as an array index.

Examples:
  i = rec.index;      //Get Struct Member
  rec.data[i] = i;    //Set Struct Member Element
  arr[i] = rec[i];    //Copy Struct Byte into Array

Note: Unlike standard C, structs may not be assigned using an equals sign. One struct variable may be copied to another byte by byte or through a function call.

SIZE-OF OPERATOR

The size-of operator @ generates a literal value equal to the size in bytes of a specified variable. It is allowed anywhere a literal would be and should be used anytime the size of an array, struct, or member is required.

When using the size-of operator, it is prefixed to the variable name or member specification.

Examples:
  for (i=0; i<=@z; i++) z[i] = r[i];   //Copy elements from r[] to z[]
  blkput(@rec, &rec);                  //Copy struct rec to next block segment
  memcpy(@rec.data, &rec.data);        //Copy member data to destination array

Note: The size-of operator is evaluated at compile time and generates two bytes of machine language code. It is the most efficient method of specifying a variable length.

INDEX-OF OPERATOR

The index-of operator ? generates a literal value equal to the offset in bytes of a specified structure member. It is allowed anywhere a literal would be and should be used anytime the offset of the member of a struct is required.

When using the index-of operator, it is prefixed to the member specification.

Examples:
  blkmem(?rec.data, &s);   //Search block for segment where data contains s
  memcpy(?rec.data, &t);   //Copy bytes of rec up to member data into t

Note: The index-of operator is evaluated at compile time and generates two bytes of machine language code. It is the most efficient method of specifying the offset of a struct member.

FUNCTION CALLS

A function call may be used as a stand-alone statement, or as the first term in an expression.
A function call consists of the function name appended with a ( character, followed by zero to three arguments separated with commas, and a closing ) character.

The first argument of a function call may be an expression, integer, address, or string (see below).

The second argument may be a term (subscripted array element, simple variable, or constant), integer, address, or string.

The third argument may only be a simple variable or constant.

If the first or second argument is an integer, address, or string, then no more arguments may be passed.

When passing the address of a variable, array, struct, or struct member into a function, the variable specification is prefixed with the address-of operator &. When passing a literal string, it is simply specified as is.

Examples:
  c = getc();               //Get character from keyboard
  n = abs(b+c-d);           //Return the absolute value of result of expression
  m = min(r[i], r[j]);      //Return lesser of two array elements
  l = strlen(&s);           //Return the length of string s
  p = strchr(c, &s);        //Return position of character c in string s
  putc(getc());             //Echo typed characters to screen
  puts("Hello World");      //Write "Hello World" to screen
  memdst(&dstrec);          //Set struct variable as destination
  memcpy(140, &srcrec);     //Copy struct variable to destination struct
  puts(&rec.name);          //Write struct member to screen

Note: This particular argument passing convention has been chosen because of the 6502's limited number of registers and stack processing instructions. When an address is passed, the high byte is stored in the Y register and the low byte in the X register. If a string is passed, it is turned into an anonymous array, and its address is passed in the Y and X registers. Otherwise, the first argument is passed in the A register, the second in the Y register, and the third in the X register.
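The register convention in the note above can be summarised as a small Python model (not C02 code). Here pass_arguments is a hypothetical helper: a byte argument is written as a plain integer and an address as the tuple ("addr", value), a representation invented purely for this sketch.

```python
def pass_arguments(*args):
    """Map up to three call arguments onto the A, Y and X registers."""
    if len(args) > 3:
        raise ValueError("at most three arguments may be passed")
    registers = {}
    byte_slots = ("A", "Y", "X")
    for position, arg in enumerate(args):
        if isinstance(arg, tuple):
            # An address occupies Y and X, so it must come last and can
            # only be the first or second argument.
            if position > 1 or position != len(args) - 1:
                raise ValueError("an address must be the last of at most two arguments")
            registers["Y"] = (arg[1] >> 8) & 0xFF   # high byte
            registers["X"] = arg[1] & 0xFF          # low byte
        else:
            registers[byte_slots[position]] = arg & 0xFF
    return registers

print(pass_arguments(0x41, ("addr", 0x1234)))   # {'A': 65, 'Y': 18, 'X': 52}
print(pass_arguments(1, 2, 3))                  # {'A': 1, 'Y': 2, 'X': 3}
```

The model makes the restriction visible: once an address has claimed Y and X, no register is left for a further argument.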
EXTENDED PARAMETER PASSING

To enable direct calling of machine language routines that do not match the built-in parameter passing convention, C02 supports the non-standard statements push, pop, and inline.

The push statement is used to push arguments onto the machine stack prior to a function call. A push statement is followed by one or more arguments, separated by commas, and terminated with a semicolon. An argument may be an expression, in which case the single byte result is pushed onto the stack, or it may be an address or string, in which case the address is pushed onto the stack, high byte first and low byte second.

The pop statement is likewise used to pop arguments off of the machine stack after a function call. A pop statement is followed by one or more simple variables or subscripted array elements, separated by commas, and terminated with a semicolon. If any of the arguments are to be discarded, a period may be specified instead of a variable name.

The number of arguments pushed and popped may or may not be the same, depending on how the machine language routine manipulates the stack pointer.

Examples:
  push d,r; mult(); pop p;                    //multiply d times r and store in p
  push x1,y1,x2,y2; rect(); pop .,.,.,.;      //draw rectangle from x1,y1 to x2,y2
  push &s, "tail"; strcat();                  //concatenate "tail" onto string s
  push x[i],y[i]; rotate(d); pop x[i],y[i];   //rotate point x[i],y[i] by d

Note: The push and pop statements could also be used to manipulate the stack inside or separate from a function, but this should be done with care.

The inline statement is used when calling machine language routines that expect constant byte or word values immediately following the 6502 JSR instruction. A routine of this type will adjust the return address to the point directly after the last instruction.

An inline statement is followed by one or more arguments, separated by commas, and terminated with a semicolon.
The arguments may be constants, addresses, or strings.

Examples:
  iprint(); inline "Hello World";   //Print "Hello World"
  irect(); inline 10,10,100,100;    //Draw rectangle from (10,10) to (100,100)

Note: If a string is specified in an inline statement, rather than creating an anonymous string and compiling the address inline, the entire string will be compiled directly inline.

ASSIGNMENTS

An assignment is a statement in which the result of an expression is stored in a variable. An assignment usually consists of a simple variable or subscripted array element, an = character, and an expression, terminated with a ; character.

Examples:
  i = i + 1;     //Add 1 to contents of variable i
  c = getchr();  //Call function and store result in variable c
  s[i] = 0;      //Terminate string at position i

SHORTCUT-IFS

A shortcut-if is a special form of assignment consisting of a condition and two expressions, one of which will be assigned based on the result of the condition.

A shortcut-if is written as a condition surrounded by ( and ) characters, followed by a ? character, the expression to be evaluated if the condition was true, a : character, and the expression to be evaluated if the condition was false.

Example:
  result = (value1 < value2) ? value1 : value2;

Note: Shortcut-ifs may only be used with assignments. This may change in the future.

POST-OPERATORS

A post-operator is a special form of assignment which modifies the value of a variable. The post-operator is suffixed to the variable it modifies.

Post-Operators:
  ++  Increment variable (increase its value by 1)
  --  Decrement variable (decrease its value by 1)
  <<  Left shift variable
  >>  Right shift variable

Post-operators may be used with either simple variables or subscripted array elements.

Examples:
  i++;     //Increment the contents of variable i
  b[i]<<;  //Left shift the contents of element i of array b

Note: Post-operators may only be used in stand-alone statements, although this may change in the future.

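The effect of the shortcut-if and the post-operators on byte-sized variables can be sketched in standard C. This is a minimal model under stated assumptions: the function names are hypothetical, values are treated as unsigned 8-bit quantities that wrap around, and the shifts are assumed to move by a single bit and to discard the bit shifted out, as the 6502 ASL and LSR instructions do.

```c
#include <stdint.h>

/* Model of i++; on a byte variable: 255 wraps around to 0. */
uint8_t post_increment(uint8_t v) { return (uint8_t)(v + 1); }

/* Model of i--; on a byte variable: 0 wraps around to 255. */
uint8_t post_decrement(uint8_t v) { return (uint8_t)(v - 1); }

/* Model of b<<; assuming a one-bit shift that drops the high bit. */
uint8_t post_shift_left(uint8_t v) { return (uint8_t)(v << 1); }

/* Model of b>>; assuming a one-bit logical shift that drops the low bit. */
uint8_t post_shift_right(uint8_t v) { return (uint8_t)(v >> 1); }

/* Model of: result = (value1 < value2) ? value1 : value2;
   The condition selects which of the two expressions is assigned. */
uint8_t shortcut_if_min(uint8_t value1, uint8_t value2) {
    uint8_t result;
    if (value1 < value2) {
        result = value1;   /* expression used when the condition is true */
    } else {
        result = value2;   /* expression used when the condition is false */
    }
    return result;
}
```

Each function takes the old value and returns the new one, which keeps the model free of side effects; in C02 the variable itself is modified in place.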
ASSIGNMENTS TO REGISTERS

Registers A, X, and Y may be assigned to using the = character. Register A (but not X or Y) may be used with the << and >> post-operators, while registers X and Y (but not A) may be used with the ++ and -- post-operators.

IMPLICIT ASSIGNMENTS

A statement consisting of only a simple variable is treated as an implicit assignment of the A register to the variable in question. This is useful on systems that use memory locations as strobe registers.

Examples:
  HMOVE;   //Move Objects (Atari VCS)
  S80VID;  //Enable 80-Column Video (Apple II)

Note: An implicit assignment generates an STA opcode with the variable as the operand.

PLURAL ASSIGNMENTS

C02 allows a function to return up to three values by specifying multiple variables, separated by commas, to the left of the assignment operator (=). Each of the variables to be assigned may be either a simple variable or a subscripted array element. Registers are not allowed in plural assignments.

Examples:
  row, col = scnpos();               //Get current screen position
  cr, mn, mx = cpmnmx(a, b);         //Compare two values, return min and max
  x[i], y[i] = rotate(x[i],y[i],d);  //Rotate x[i] and y[i] by d degrees
  x[i], y[i], z[i] = get3d(i);       //Generate 3d coordinate for index i

Note: When compiled, a plural assignment generates an STX for the third assignment (if specified), an STY for the second assignment, and an STA for the first assignment. Using a subscripted array element for the third assignment generates an overhead of three bytes of machine code.

GOTO STATEMENT

A goto statement unconditionally transfers program execution to the specified label. The goto keyword is followed by the label name and a terminating semicolon.

Example:
  goto end;

Note: A goto statement may be executed from within a loop structure (although a break or continue statement is preferred), but should not normally be used to jump from inside a function to outside of it, as this would leave the return address on the machine stack.

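Returning to the plural assignments described above, the way returned values are distributed from the registers can be modeled in standard C. This is a rough sketch only: the Returned structure, plural_assign, and divmod_model are hypothetical names made up for illustration, and divmod_model stands in for an imagined routine that returns a quotient in A and a remainder in Y.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the registers holding a function's return values. */
typedef struct {
    uint8_t a;  /* first returned value */
    uint8_t y;  /* second returned value */
    uint8_t x;  /* third returned value, if any */
} Returned;

/* Store the registers in the order given in the note: STX, STY, then STA.
   Pass NULL for third when only two variables are being assigned. */
void plural_assign(Returned regs, uint8_t *first, uint8_t *second, uint8_t *third) {
    if (third != NULL) {
        *third = regs.x;   /* STX, only when a third variable is specified */
    }
    *second = regs.y;      /* STY */
    *first = regs.a;       /* STA */
}

/* Imagined two-value routine: quotient in A, remainder in Y.
   The divisor d is assumed to be nonzero. */
Returned divmod_model(uint8_t n, uint8_t d) {
    Returned r;
    r.a = (uint8_t)(n / d);
    r.y = (uint8_t)(n % d);
    r.x = 0;               /* X is unused by this routine */
    return r;
}
```

In this model, the C02 statement q, r = divmod(n, d); would correspond to plural_assign(divmod_model(n, d), &q, &r, NULL).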
IF AND ELSE STATEMENTS

The if and else statements are used to conditionally execute blocks of code.

When using the if keyword, it is followed by a conditional (surrounded by parentheses) and the block of code to be executed if the conditional was true.

An else statement may directly follow an if statement (with no other executable code intervening). The else keyword is followed by the block of code to be executed if the conditional was false.

Examples:
  if (c = 27) goto end;
  if (n) q = div(n,d) else putln("Division by 0!");
  if (r[q]i | i > >j || >i = >j && 13) //Echo line to screen
  i=0; do {i++;} while (>i <= >j && 0;i++) {c=getc();s[i]=c} //Read characters into string s

Note: For loops are compiled using the 6502 JMP instruction, so the code blocks may be arbitrarily large. A for loop generates less efficient code than a simple while loop, but will always execute the increment assignment on a continue.

BREAK AND CONTINUE

A break statement is used to exit out of a do, for, or while loop or a case block. The continue statement is used to jump to the beginning of a do, for, or while loop. Neither may be used outside its corresponding control structures.

When a break statement is encountered, program execution is transferred to the statement immediately following the end of the block associated with the innermost do, for, while, or case statement. The break keyword is followed by a trailing semicolon.

When a continue statement is encountered, program execution is transferred to the beginning of the block associated with the innermost do, for, or while statement. In the case of a for statement, the increment assignment is executed, followed by the conditional, and in the case of a while statement, the conditional is executed. The continue keyword is followed by a trailing semicolon.

Examples:
  do {c=rdkey(); if (c=0) continue; if (c=27) break;} while (c<>13);
  for (i=0;i