In combination with earlier patches, this fixes#53.
Also, if the lint flag requiring explicit function types is set, then also require that K&R-style parameters be explicitly declared with types, rather than not being declared and defaulting to int. (This is a requirement in C99 and later.)
Previously, these would report "identifier expected"; now they correctly say "')' expected".
This introduces a new UnexpectedTokenError procedure that can be used more generally for cases where the expected token may differ based on context.
_Thread_local is recognized but gives a "not supported" error. It could arguably be 'supported' trivially by saying the execution of an ORCA/C program is just one thread and so no special handling is needed, but that likely isn't what someone using it would expect.
There would be a possible issue if a "static" or "typedef" storage class specifier occurred after a type specifier that required memory to be allocated for it, because that memory conceptually might be in the local pool, but static objects are processed at the end of the translation unit, so their types need to stick around. In practice, this should not occur, because the local pool isn't currently used for much (in particular, not for statements or declarations in the body of a function). We give an error in case this somehow might occur.
In combination with preceding commits, this fixes#14. Declaration specifiers can now appear in any order, as required by the C standards.
This includes both the standard ones (inline and _Noreturn) and the ORCA/C-specific ones (asm and pascal). They can now be freely mixed with other declaration specifiers.
Some errors related to function specifiers are not yet detected.
This could happen in a declaration like "char _Alignas(long) c;", where typeSpec wound up specifying long rather than char.
Also, tweak error checks for _Alignas and _Atomic.
_Bool, _Complex, _Imaginary, _Atomic, restrict, and _Alignas are now recognized in types, but all except restrict and _Alignas will give an error saying they are not supported.
This also introduces uniform definitions of the syntactic classes of tokens that can be used in declaration specifiers and related constructs (currently used in some places but not yet in others).
These qualifiers were previously sometimes accepted between the name and left brace of struct and enum type specifiers. This was non-standard and is no longer allowed.
Type specifiers and type qualifiers can now appear in any order, as specified by the C standards. However, storage class specifiers and function specifiers still cannot be freely mixed with them.
As of C11, type names are now used as part of the declaration syntax (in _Alignas and _Atomic specifiers), in addition to their uses in expressions. Moving the TypeName method will allow it to be called when processing declarations.
Specifically, the following six punctuator tokens are now supported:
<: :> <% %> %: %:%:
These behave the same as the existing tokens [, ], {, }, #, and ## (respectively), apart from their spelling.
This can be useful when the full ASCII character set cannot easily be displayed or input (e.g. on the IIgs text screen with certain language settings).
Specifically, the following will now be tokenized as keywords:
_Alignas
_Alignof
_Atomic
_Bool
_Complex
_Generic
_Imaginary
_Noreturn
_Static_assert
_Thread_local
restrict
('inline' was also added as a standard keyword in C99, but ORCA/C already treated it as such.)
The parser currently has no support for any of these keywords, so for now errors will still be generated if they are used, but this is a first step toward adding support for them.
This could happen in some very obscure cases like using these macros for the names of segments or include files. The fix is to just terminate precompiled header generation if they are encountered.
The issue was that invalid sym files could be generated if an #include is encountered within an #if or #ifdef block in the main source file. The fix (for now) is to simply terminate precompiled header generation if such an #include is encountered.
Fixes#2.
This can occur in cases such as trying to assign to a non-l-value.
This patch ensures consistent handling of errors and prevents null pointer dereferences.
Previously, the logic for this was incorrect and would lead to a null pointer dereference in the compiler. In most cases the generated code would not actually change the pointer.
The following program demonstrates the issue:
#include <stdio.h>
#pragma memorymodel 1
typedef char bigarray[0x20000];
bigarray big[5];
int main(void) {
bigarray *p = big;
p++;
printf("%p %p\n", (void*)big, (void*)p);
}
These are initially entered into the symbol table with no known type (itype = nil), so this case should be accounted for in NewSymbol.
This typically would not cause a problem, but might if the zero page contained certain values
In certain rare cases, constant subexpression elimination could set the left subtree of a pc_bno operation in the intermediate code to nil. This could lead to null pointer dereferences, sometimes resulting in a crash or error during native code generation.
The below program sometimes demonstrates the problem (dependent on zero page contents):
#pragma optimize 16
struct F {int *p;};
void foo(struct F* f)
{
struct {int c;} s = {0};
++f->p;
s.c |= *--f->p;
}
This could cause spurious errors, or in some cases bad code generation.
The following example illustrates the problem:
#include <stdio.h>
enum {A,B,C};
/* arr was treated as having a size of 1, rather than 3 */
char arr[(int)C+1] = {1,2,3}; /* incorrectly gave an error for initializer */
int main(void) {
static int i = (int)C+1; /* incorrectly gave an error */
printf("%zu\n", sizeof(arr));
printf("%i\n", (int)C+1); /* OK */
printf("%i\n", i);
}
const structs are wrapped in definedType. The debugger symbol table code is unaware of this, which results in missing or incomplete entries.
example:
const struct { int a; int b; } cs;
cs: isForwardDeclared = false; class = ident
4 byte constant defined type of
4 byte struct: 223978
const struct { int a; int b; } *pcs;
pcs: isForwardDeclared = false; class = ident
4 byte pointer to
4 byte constant defined type of
4 byte struct: 224145
const struct { const struct { const int a; } a[2]; } csa[5];
csa: isForwardDeclared = false; class = ident
20 byte 5 element array of
4 byte constant defined type of
4 byte struct: 225155
const struct { const struct { const int a; } a[2]; } *cspa[5];
cspa: isForwardDeclared = false; class = ident
20 byte 5 element array of
4 byte pointer to
4 byte constant defined type of
4 byte struct: 224850
This change unwraps the definedType so the underlying type info can be placed in the debugger symbol table.