This prohibits empty initializers ({}) for arrays of unknown size, consistent with C23 requirements. Previous versions of C did not allow empty initializers at all, but ORCA/C historically did in some cases, so this patch still allows them for structs/unions/arrays of known size.
When the expression is initially parsed, we do not necessarily know whether it is the initializer for the struct/union or for its first member. That needs to be determined based on the type. To support that, a new function is added to evaluate the expression separately from using it to initialize an object.
This is currently used in a couple places in the designated initializer code (solving the problem with #pragma expand in the last commit). It could probably be used elsewhere too, but for now it is not.
As noted previously, there is some ambiguity in the standards about how anonymous structs/unions participate in initialization. ORCA/C follows the model that they do participate as structs or unions, and designated initialization of them is implemented accordingly.
This currently has a slight issue in that extra copies of the anonymous member field name will be printed in #pragma expand output.
This generally simplifies things, and always generates individual initializer records for each explicit initialization of a bit-field (which was previously done for automatic initialization, but not static).
This should work correctly for automatic initialization, but needs corresponding code changes in GenSymbols for static initialization.
This could maybe be simplified to just fill on levels with braces, but I want to consider that after implementing designated initializers for structs and unions.
The idea (not yet implemented) is to use this to support out-of-order initialization. For automatic variables, we can just initialize the subobjects in the order that initializers appear. For static variables, we will eventually need to reorder the initializers in order, but this can be done based on their recorded displacements.
The part of the declaration within the header could be ignored on subsequent compilations using the .sym file, which could lead to errors or misbehavior.
(This also applies to headers that end in the middle of a _Static_assert(...) or segment directive.)
This is preparatory to supporting designated initializers.
Any struct/union type with an anonymous member now forces .sym file generation to end, since we do not have a scheme for serializing this information in a .sym file. It would be possible to do so, but for now we just avoid this situation for simplicity.
This is a minimal implementation that does not actually inline anything, but it is intended to implement the semantics defined by the C99 and later standards.
One complication is that a declaration that appears somewhere after the function body may create an external definition for a function that appeared to be an inline definition when it was defined. To support this while preserving ORCA/C's general one-pass compilation strategy, we generate code even for inline definitions, but treat them as private and add the prefix "~inline~" to the name. If they are "un-inlined" based on a later declaration, we generate a stub with external linkage that just jumps to the apparently-inline function.
This detects errors in the following cases that were previously missed:
* A function declaration and definition being part of the same overall declaration, e.g.:
void f(void), g(void) {}
* A function declaration (not definition) with no declaration specifiers, e.g.:
f(void);
(Function definitions with no declaration specifiers continue to be accepted by default, consistent with C90 rules.)
A function declared "inline" with an explicit "extern" storage class has the same semantics as if "inline" was omitted. (It is not an inline definition as defined in the C standards.) The "inline" specifier suggests that the function should be inlined, but it is legal to just ignore it, as we already do for "static inline" functions.
Also add a test for the inline function specifier.
The scanner has been updated so that ... should always get recognized as a single token, so this is no longer necessary as a workaround. Any code that actually uses separate . . . is non-standard and will need to be changed.
This does not really do anything, because ORCA/C does not support multithreading, but the C11 and later standards indicate it should be allowed anyway.
If a struct contained a function pointer with a prototyped parameter list, processing the parameters could reset the declaredTagOrEnumConst flag, potentially leading to a spurious error, as in this example:
struct S {
int (*f)(int);
};
This also gives a better error for structs declared as containing functions.
Note that this implementation allows anonymous structures and unions to participate in initialization. That is, you can have a braced initializer list corresponding to an anonymous structure or union. Also, anonymous structures within unions follow the initialization rules for structures (and vice versa).
I think the better interpretation of the standard text is that anonymous structures and unions cannot participate in initialization as such, and instead their members are treated as members of the containing structure or union for purposes of initialization. However, all other compilers I am aware of allow anonymous structures and unions to participate in initialization, so I have implemented it that way too.
C90 had constraints requiring # and ## tokens to only appear in preprocessing directives, but C99 and later removed those constraints, so this code is no longer necessary when targeting current languages versions. (It would be necessary in a "strict C90" mode, if that was ever implemented.)
The main practical effect of this is that # and ## tokens can be passed as parameters to macros, provided the macro either ignores or stringizes that parameter. # and ## tokens still have no role in the grammar of the C language after preprocessing, so they will be an unexpected token and produce some kind of error if they appear anywhere.
This also contains a change to ensure that a line containing one or more illegal characters (e.g. $) and then a # is not treated as a preprocessing directive.
This accords with its definition in the C standards. For the time being, the old form of three separate tokens is still accepted too, because the ... token may not be scanned correctly in the obscure case where there is a line continuation between the second and third dots.
One observable effect of this is that there are no longer spaces between the dots in #pragma expand output.
This enforces the constraint from C17 section 6.7 p2 that declarations "shall declare at least a declarator (other than the parameters of a function or the members of a structure or union), a tag, or the members of an enumeration."
Somewhat relaxed rules are used for enums in the default loose type checking mode, similar to what GCC and Clang do.
A declaration of this exact form always declares the tag T within the current scope, and as such makes this "struct T" a distinct type from any other "struct T" type in an outer scope. (Similarly for unions.)
See C17 section 6.7.2.3 p7 (and corresponding places in all other C standards).
Here is an example of a program affected by this:
struct S {char a;};
int main(void) {
struct S;
struct S *sp;
struct S {long b;} s;
sp = &s;
sp->b = sizeof(*sp);
return s.b;
}