In combination with earlier patches, this fixes#53.
Also, if the lint flag requiring explicit function types is set, then also require that K&R-style parameters be explicitly declared with types, rather than not being declared and defaulting to int. (This is a requirement in C99 and later.)
Previously, these would report "identifier expected"; now they correctly say "')' expected".
This introduces a new UnexpectedTokenError procedure that can be used more generally for cases where the expected token may differ based on context.
_Thread_local is recognized but gives a "not supported" error. It could arguably be 'supported' trivially by saying the execution of an ORCA/C program is just one thread and so no special handling is needed, but that likely isn't what someone using it would expect.
There would be a possible issue if a "static" or "typedef" storage class specifier occurred after a type specifier that required memory to be allocated for it, because that memory conceptually might be in the local pool, but static objects are processed at the end of the translation unit, so their types need to stick around. In practice, this should not occur, because the local pool isn't currently used for much (in particular, not for statements or declarations in the body of a function). We give an error in case this somehow might occur.
In combination with preceding commits, this fixes#14. Declaration specifiers can now appear in any order, as required by the C standards.
This includes both the standard ones (inline and _Noreturn) and the ORCA/C-specific ones (asm and pascal). They can now be freely mixed with other declaration specifiers.
Some errors related to function specifiers are not yet detected.
This could happen in a declaration like "char _Alignas(long) c;", where typeSpec wound up specifying long rather than char.
Also, tweak error checks for _Alignas and _Atomic.
_Bool, _Complex, _Imaginary, _Atomic, restrict, and _Alignas are now recognized in types, but all except restrict and _Alignas will give an error saying they are not supported.
This also introduces uniform definitions of the syntactic classes of tokens that can be used in declaration specifiers and related constructs (currently used in some places but not yet in others).
These qualifiers were previously sometimes accepted between the name and left brace of struct and enum type specifiers. This was non-standard and is no longer allowed.
Type specifiers and type qualifiers can now appear in any order, as specified by the C standards. However, storage class specifiers and function specifiers still cannot be freely mixed with them.
As of C11, type names are now used as part of the declaration syntax (in _Alignas and _Atomic specifiers), in addition to their uses in expressions. Moving the TypeName method will allow it to be called when processing declarations.
When these invalid sym files were used during subsequent compiles, certain type pointers (for what should be const-qualified struct or union types) could be left uninitialized, or possibly initialized pointing to different types. This could result in spurious errors or potentially in other problems.
This relates to unions or structs that are "filled" with zeros because the initializer does not include explicit terms for them, and that contain bit-fields or (for unions) do not start with the longest member.
The following program is an example that was miscompiled:
#include <stdio.h>
struct BF {
int i:3;
int j:4;
};
union U {
int i;
long l;
};
struct Outer1 {
int n;
struct BF bf[7];
union U u[5];
};
struct Outer2 {
long p;
struct Outer1 o1;
long q;
};
int main(void) {
static struct Outer2 s = {1,{0},212};
printf("%li %li\n", s.p, s.q);
}
This fixes case 1 (dealing with run-time initialization of structures containing bit-fields) and case 2 (dealing with initialization of structs where initializer values are not provided for all elements) from issue #59. It also fixes cases that could result in invalid initialization of unions if their first element was not the longest, as in the following example:
#include <stdio.h>
union U {
int i;
long l;
};
int main(void) {
union U a[5] = {1,2,3,4};
printf("a[0].i=%i, a[1].i=%i, a[2].i=%i, a[3].i=%i, a[4].i=%i\n",
a[0].i, a[1].i, a[2].i, a[3].i, a[4].i);
}
This was an alias for double, but it's non-standard and undocumented. Apparently it existed in some other pre-standard compilers, but it's not in any version of standard C, and I can't find any evidence of it being used. Considering the possibility for confusion, I think it's best to remove it.
This allows debuggers to stop on the declaration lines, and also provides trace-back information for them if that feature is enabled.
Currently, this applies only to declarations that occur after the first statement in the block. As such, it doesn't change the handling of traditional pre-C99-style declarations at the beginning of a block.
An extra, fourth byte was being generated for the bitfield(s). This would cause all subsequent members of the struct and any enclosing object not to be initialized at the proper locations, which would generally corrupt their values.
The following program illustrates the issue:
#include <stdio.h>
struct X {
int a:9;
int b:9;
int c;
} x = {123,234,12345};
int main(void) {
printf("x.a = %i, x.b = %i, x.b = %i\n", x.a, x.b, x.c);
}
The initialized bytes for the bitfield(s) could wind up improperly being placed after those for the non-bitfield, generally corrupting both values.
The following program illustrates the problem:
#include <stdio.h>
struct X {
int a:9;
int b;
} x = {42,123};
int main(void) {
printf("x.a = %i, x.b = %i\n", x.a, x.b);
}
Note that this code currently permits discarding the const qualifier via such an initialization. That should give a diagnostic, but currently it doesn't in this or various other cases.
The following code (derived from a csmith-generated test case) illustrates the problem:
struct S0 {
const long f4;
};
const struct S0 g_149;
const long *g_311 = &g_149.f4;
This would lead to errors in programs like the following:
int main(void) {
typedef int x;
x: ;
}
Even before support for mixed statements and declarations was introduced, this error could happen if the labeled statement was the first statement after the declarations in a block (as in the above example). Adding that support also allowed this error to happen with later statements in a block. The C4.2.4.1.CC test case was affected by this.
Under these rules, if, switch, for, while, and do statements each have their own block scopes separate from the enclosing scope, and their substatements also have their own block scopes.
This patch always applies the C99 scope rules, but a flag can be changed to disable them or make them conditional on a configuration setting.
In the case of structs or unions, an error is now produced. This addresses one of the problems mentioned in issue #53.
In the case of arrays, tentative definitions like "int i[];" are now permitted at file scope. If not completed by a subsequent definition, this winds up producing an array with one element, initialized to 0. See the discussion and example in C99/C11 section 6.9.2 (or C90 section 6.7.2 and example in TC1).
If there are no varargs calls (and nothing else that saves stack positions), then space doesn't need to be allocated for the saved stack position. This can also lead to more efficient prolog/epilog code for small functions.
For example, the following is now allowed:
typedef void v;
void foo(v) {}
This appears to be permitted under at least C99 and C11 (the C89 wording is less clear), and is accepted by other modern compilers.
These are enabled when bit 15 is set in the #pragma debug directive.
Support is still needed to ensure these work properly with pre-compiled headers.
This patch is from Kelvin Sherlock.
Previously, several optimizations would be disabled for the rest of the translation unit whenever the keyword 'volatile' appeared. Now, if 'volatile' is used within a function, it only reduces optimization for that function, since whatever was declared as 'volatile' will be out of scope after the function is over. Uses of 'volatile' outside functions still behave as before.
This problem could cause "duplicate symbol" and "undeclared identifier" errors, for example in the following program:
typedef int f1( void );
void bar( void ) {
int i;
f1 *foo;
int baz;
i = 10;
}
int foo;
long baz;
This should give C99-compatible behavior, as far as it goes. The functions aren't actually inlined, but that's just a quality-of-implementation issue. No C standard requires actual inlining.
Non-static inline functions are still not supported. The C99 semantics for them are more complicated, and they're less widely used, so they're a lower priority for now.
The "inline" function specifier can currently only come after the "static" storage class specifier. This relates to a broader issue where not all legal orderings of declaration specifiers are supported.
Since "inline" was already treated as a keyword in ORCA/C, this shouldn't create any extra compatibility issues for C89 code.