Commit Graph

160 Commits

Author SHA1 Message Date
Stephen Heumann
58d8edf1ee Handle filling of array elements without explicit initializers.
At this point, designated initializers for arrays are at least largely working.
2022-11-27 16:48:58 -06:00
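A minimal sketch of the behavior this series of array-designator commits targets, written in standard C99 rather than taken from the ORCA/C sources. The helper name `element_at` is hypothetical. Elements that no designator or positional initializer reaches should end up zero-filled:

```c
#include <stddef.h>

/* Hypothetical helper: builds a designated-initialized automatic array
   and returns one element so the fill behavior can be inspected. */
int element_at(size_t i)
{
    /* [4] is set first, then [1]; the undesignated 11 continues at [2].
       Elements 0, 3 and 5 have no explicit initializer and become 0. */
    int a[6] = { [4] = 40, [1] = 10, 11 };

    return i < 6 ? a[i] : -1;
}
```

Under these assumptions the array holds {0, 10, 11, 0, 40, 0}.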
Stephen Heumann
aa6b82a136 Ensure array designators are processed at the level with braces. 2022-11-26 23:03:20 -06:00
Stephen Heumann
5df94c953e Fix handling of initializer counts in AutoInit.
This was broken by the previous changes to it.
2022-11-26 21:09:53 -06:00
Stephen Heumann
335e8be75e Rename the procedure for initializing one element of an auto variable.
"InitializeOneElement" is more descriptive of what it does now. We also skip passing the variable, which is always the same.
2022-11-26 20:46:24 -06:00
Stephen Heumann
5f8a6baa94 Get rid of an unnecessary field in initializer records.
The "isStructOrUnion" information can now be determined simply by the type in the record.
2022-11-26 20:29:31 -06:00
Stephen Heumann
968844fb38 Make auto initialization use the type and disp in initializer record.
This simplifies the code a good bit, as well as enabling out-of-order initialization using designated initializers.
2022-11-26 20:24:33 -06:00
Stephen Heumann
d1edc8821d Record the type being initialized in auto initializer records. 2022-11-26 19:58:01 -06:00
Stephen Heumann
cd9931a60c Record displacement from start of object in initializer records.
The idea (not yet implemented) is to use this to support out-of-order initialization. For automatic variables, we can just initialize the subobjects in the order that initializers appear. For static variables, we will eventually need to sort the initializers into order, but this can be done based on their recorded displacements.
2022-11-26 19:27:17 -06:00
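The idea in this commit message can be modeled roughly as follows. This is only a sketch: `struct init_record`, `apply_in_order`, and `sort_by_disp` are hypothetical names, the record holds a single `int` value for simplicity, and none of it reflects the actual Pascal data structures in ORCA/C.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of an initializer record: where to store, and what. */
struct init_record {
    size_t disp;   /* byte displacement from the start of the object */
    int value;     /* value to store there (int only, for simplicity) */
};

/* Automatic-variable style: store each value in the order the records
   appear; the displacement says where each one goes. */
void apply_in_order(void *object, const struct init_record *recs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        memcpy((char *)object + recs[i].disp, &recs[i].value,
               sizeof recs[i].value);
}

/* Comparison callback for qsort: ascending displacement. */
static int by_disp(const void *a, const void *b)
{
    const struct init_record *ra = a, *rb = b;

    return (ra->disp > rb->disp) - (ra->disp < rb->disp);
}

/* Static-variable style: sort by displacement so the data could be
   emitted sequentially. */
void sort_by_disp(struct init_record *recs, size_t n)
{
    qsort(recs, n, sizeof recs[0], by_disp);
}
```

In this model, an initializer such as `{ [3] = 30, [1] = 10 }` for an `int` array would produce two records with displacements `3 * sizeof(int)` and `1 * sizeof(int)`.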
Stephen Heumann
8cfc14b50a Rename itype field of initializerRecord to basetype. 2022-11-26 15:45:26 -06:00
Stephen Heumann
b6d3dfb075 Designated initializers for arrays, part 1.
This can parse designated initializers for arrays, but does not create proper initializer records for them.
2022-11-26 15:22:58 -06:00
Stephen Heumann
740468f75c Avoid generating invalid .sym files if header ends with a partial prototyped function decl.
This could happen because the nested calls to DoDeclaration for the parameters would set inhibitHeader to false.
2022-11-26 14:20:58 -06:00
Stephen Heumann
2bf3862e5d Avoid generating invalid .sym files if header ends with a partial declaration.
The part of the declaration within the header could be ignored on subsequent compilations using the .sym file, which could lead to errors or misbehavior.

(This also applies to headers that end in the middle of a _Static_assert(...) or segment directive.)
2022-11-26 00:18:57 -06:00
Stephen Heumann
5500833180 Record which anon struct/union an anonymous member field came from.
This is preparatory to supporting designated initializers.

Any struct/union type with an anonymous member now forces .sym file generation to end, since we do not have a scheme for serializing this information in a .sym file. It would be possible to do so, but for now we just avoid this situation for simplicity.
2022-11-25 22:32:59 -06:00
Stephen Heumann
3f450bdb80 Support "inline" function definitions without static or extern.
This is a minimal implementation that does not actually inline anything, but it is intended to implement the semantics defined by the C99 and later standards.

One complication is that a declaration that appears somewhere after the function body may create an external definition for a function that appeared to be an inline definition when it was defined. To support this while preserving ORCA/C's general one-pass compilation strategy, we generate code even for inline definitions, but treat them as private and add the prefix "~inline~" to the name. If they are "un-inlined" based on a later declaration, we generate a stub with external linkage that just jumps to the apparently-inline function.
2022-11-19 23:04:22 -06:00
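A small sketch of the complication described in this commit message, in standard C99. The function names are made up for illustration. The first line alone would be an inline definition; the later `extern` declaration turns it into an external definition, which is the case the commit handles by emitting a stub:

```c
/* With only this declaration in the translation unit, C99 treats it as
   an inline definition: no external definition of twice() is provided. */
inline int twice(int x) { return 2 * x; }

/* A later declaration with extern changes that: this translation unit
   now provides the external definition of twice(). */
extern int twice(int x);

int call_twice(int x) { return twice(x); }
```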
Stephen Heumann
d96a5f86f9 Do not force function type info to be in the global pool.
This should no longer be necessary, because functions are not forced to be in the global symbol table.
2022-11-07 21:41:30 -06:00
Stephen Heumann
202ed3b514 Require a declarator after comma in declarations.
This gives an error for code like "int x,;".
2022-11-07 20:00:23 -06:00
Stephen Heumann
82b2944eb8 Give an error if a function is defined multiple times. 2022-11-06 20:54:53 -06:00
Stephen Heumann
d3ba8b5551 Rework handling of scopes created for function declarators.
This is preparatory to other changes.
2022-11-05 21:13:44 -05:00
Stephen Heumann
986a283540 Simplify some code in DoDeclaration and improve error detection.
This detects errors in the following cases that were previously missed:

* A function declaration and definition being part of the same overall declaration, e.g.:
void f(void), g(void) {}

* A function declaration (not definition) with no declaration specifiers, e.g.:
f(void);

(Function definitions with no declaration specifiers continue to be accepted by default, consistent with C90 rules.)
2022-11-05 20:20:04 -05:00
Stephen Heumann
7d6b732d23 Simplify some declaration-processing logic.
This should not cause any functional change.
2022-11-01 18:43:44 -05:00
Stephen Heumann
f31b5ea1e6 Allow "extern inline" functions.
A function declared "inline" with an explicit "extern" storage class has the same semantics as if "inline" were omitted. (It is not an inline definition as defined in the C standards.) The "inline" specifier suggests that the function should be inlined, but it is legal to just ignore it, as we already do for "static inline" functions.

Also add a test for the inline function specifier.
2022-10-29 19:43:57 -05:00
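For illustration, here is a sketch of the two forms this commit message mentions, assuming C99 or later semantics. The function names are hypothetical. In both cases a compiler may ignore the inlining hint and simply generate an ordinary call:

```c
/* "extern inline": an ordinary external definition; inline is only a hint. */
extern inline int add_one(int x) { return x + 1; }

/* "static inline": internal linkage; again the hint may be ignored. */
static inline int add_two(int x) { return x + 2; }

int add_three(int x) { return add_two(add_one(x)); }
```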
Stephen Heumann
f54d0e1854 Require that main have no function specifiers.
This enforces a constraint in the C standards (for a hosted environment).
2022-10-29 18:36:51 -05:00
Stephen Heumann
65ec29ee3e Use 32-bit representation for line numbers.
C99 and later specify that line numbers set via #line can be up to 2147483647, so they need to be represented as (at least) a 32-bit value.
2022-10-22 21:46:12 -05:00
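A brief sketch of why a wider representation is needed, assuming a C99 or later preprocessor. The function name is hypothetical. The line number used here already exceeds what 16 bits can hold, so the function returns `long` to be safe on targets where `int` is 16 bits:

```c
/* The source line that follows a #line directive takes the given number. */
#line 100000
long line_after_directive(void) { return __LINE__; }
```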
Stephen Heumann
bdf212ec6b Remove support for separate . . . as equivalent to a ... token.
The scanner has been updated so that ... should always get recognized as a single token, so this is no longer necessary as a workaround. Any code that actually uses separate . . . is non-standard and will need to be changed.
2022-10-19 18:14:14 -05:00
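As a reminder of the standard spelling, here is a minimal variadic function that writes the ellipsis as the single token `...`. The function name `sum_ints` is made up for this sketch:

```c
#include <stdarg.h>

/* Adds up `count` int arguments. The ellipsis is the single token "...",
   not three separate dots. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    while (count-- > 0)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```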
Stephen Heumann
6d8ca42734 Parse the _Thread_local storage-class specifier.
This does not really do anything, because ORCA/C does not support multithreading, but the C11 and later standards indicate it should be allowed anyway.
2022-10-18 21:01:26 -05:00
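A short sketch of what accepting the specifier means in practice, assuming a C11 or later compiler. The names are hypothetical. In a program with only one thread, a `_Thread_local` object behaves like an ordinary static one:

```c
/* Each thread would get its own copy; with a single thread this behaves
   like an ordinary static variable. */
static _Thread_local int call_count;

int next_call_number(void) { return ++call_count; }
```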
Stephen Heumann
afe40c0f67 Prevent spurious errors about structs containing function pointers.
If a struct contained a function pointer with a prototyped parameter list, processing the parameters could reset the declaredTagOrEnumConst flag, potentially leading to a spurious error, as in this example:

struct S {
	int (*f)(int);
};

This also gives a better error for structs declared as containing functions.
2022-10-16 19:57:14 -05:00
Stephen Heumann
a864954353 Use "declarator expected" error messages when appropriate.
Previously, some of these cases would report "identifier expected."
2022-10-16 18:45:06 -05:00
Stephen Heumann
99e268e3b9 Implement support for anonymous structures and unions (C11).
Note that this implementation allows anonymous structures and unions to participate in initialization. That is, you can have a braced initializer list corresponding to an anonymous structure or union. Also, anonymous structures within unions follow the initialization rules for structures (and vice versa).

I think the better interpretation of the standard text is that anonymous structures and unions cannot participate in initialization as such, and instead their members are treated as members of the containing structure or union for purposes of initialization. However, all other compilers I am aware of allow anonymous structures and unions to participate in initialization, so I have implemented it that way too.
2022-10-16 18:44:19 -05:00
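The initialization behavior described in this commit message can be sketched as follows, assuming a C11 compiler that lets anonymous members take part in initialization (the interpretation the message says other compilers share). The type and function names are invented for the example:

```c
struct rect {
    int id;
    struct { int w, h; };        /* anonymous structure (C11) */
    union  { int i; long l; };   /* anonymous union (C11) */
};

int rect_summary(void)
{
    /* The inner braced lists correspond to the anonymous structure
       and the anonymous union, respectively. */
    struct rect r = { 7, { 3, 4 }, { 5 } };

    /* Members of anonymous members are accessed directly. */
    return r.id + r.w * r.h + r.i;
}
```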
Stephen Heumann
b8b7dc2c2b Remove code that treats # as an illegal character in most places.
C90 had constraints requiring # and ## tokens to only appear in preprocessing directives, but C99 and later removed those constraints, so this code is no longer necessary when targeting current language versions. (It would be necessary in a "strict C90" mode, if that were ever implemented.)

The main practical effect of this is that # and ## tokens can be passed as parameters to macros, provided the macro either ignores or stringizes that parameter. # and ## tokens still have no role in the grammar of the C language after preprocessing, so any that survive preprocessing will be unexpected tokens and produce some kind of error.

This also contains a change to ensure that a line containing one or more illegal characters (e.g. $) and then a # is not treated as a preprocessing directive.
2022-10-13 18:35:26 -05:00
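A minimal sketch of the practical effect mentioned in this commit message, assuming C99 or later rules. The macro and function names are hypothetical. The # and ## tokens are passed as macro arguments and are either stringized or discarded, so none survives preprocessing:

```c
#define STR(x)    #x   /* stringizes its argument */
#define IGNORE(x) 0    /* discards its argument */

const char *hash_text(void)        { return STR(#); }
const char *double_hash_text(void) { return STR(##); }
int ignored(void)                  { return IGNORE(#); }
```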
Stephen Heumann
4fe9c90942 Parse ... as a single punctuator token.
This accords with its definition in the C standards. For the time being, the old form of three separate tokens is still accepted too, because the ... token may not be scanned correctly in the obscure case where there is a line continuation between the second and third dots.

One observable effect of this is that there are no longer spaces between the dots in #pragma expand output.
2022-10-10 18:06:01 -05:00
Stephen Heumann
f263066f61 Give an error for declarations that do not declare anything.
This enforces the constraint from C17 section 6.7 p2 that declarations "shall declare at least a declarator (other than the parameters of a function or the members of a structure or union), a tag, or the members of an enumeration."

Somewhat relaxed rules are used for enums in the default loose type checking mode, similar to what GCC and Clang do.
2022-10-09 22:03:06 -05:00
Stephen Heumann
995ded07a5 Always treat "struct T;" as declaring the tag within the current scope.
A declaration of this exact form always declares the tag T within the current scope, and as such makes this "struct T" a distinct type from any other "struct T" type in an outer scope. (Similarly for unions.)

See C17 section 6.7.2.3 p7 (and corresponding places in all other C standards).

Here is an example of a program affected by this:

struct S {char a;};
int main(void) {
        struct S;
        struct S *sp;
        struct S {long b;} s;
        sp = &s;
        sp->b = sizeof(*sp);
        return s.b;
}
2022-10-04 18:45:11 -05:00
Stephen Heumann
3cea478e5e Clarify a comment. 2022-10-02 22:05:05 -05:00
Stephen Heumann
53baef0fb3 Make isPascal variable local to DoDeclaration.
This avoids the need to save/restore it elsewhere.
2022-10-02 22:04:46 -05:00
Stephen Heumann
1fa3ec8fdd Eliminate global variables for declaration specifiers.
They are now represented in local structures instead. This keeps the representation of declaration specifiers together and eliminates the need for awkward and error-prone code to save and restore the global variables.
2022-10-01 21:28:16 -05:00
Stephen Heumann
05ecf5eef3 Add option to use the declared type for float/double/comp params.
This differs from the usual ORCA/C behavior of treating all floating-point parameters as extended. With the option enabled, they will still be passed in the extended format, but will be converted to their declared type at the start of the function. This is needed for strict standards conformance, because you should be able to take the address of a parameter and get a usable pointer to its declared type. The difference in types can also affect the behavior of _Generic expressions.

The implementation of this is based on ORCA/Pascal, which already did the same thing (unconditionally) with real/double/comp parameters.
2022-09-18 21:16:46 -05:00
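The two conformance points in this commit message can be sketched like this, assuming a C11 compiler in which parameters keep their declared type (the behavior the new option enables). The function names are made up for illustration:

```c
/* Returns 1 when the parameter really has its declared type float. */
int param_is_float(float f)
{
    return _Generic(f, float: 1, default: 0);
}

/* Taking the address of a float parameter should yield a usable float *. */
float add_one_via_pointer(float f)
{
    float *p = &f;

    *p += 1.0f;
    return f;
}
```

If floating-point parameters were instead treated as extended, the `_Generic` selection would presumably fall through to the `default` association.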
Stephen Heumann
6e3fca8b82 Implement strict type checking for enum types.
If strict type checking is enabled, this will prohibit redefinition of enums, like:

enum E {a,b,c};
enum E {x,y,z};

It also prohibits use of an "enum E" type specifier if the enum has not been previously declared (with its constants).

These things were historically supported by ORCA/C, but they are prohibited by constraints in section 6.7.2.3 of C99 and later. (The C90 wording was different and less clear, but I think they were not intended to be valid there either.)
2022-07-19 20:35:44 -05:00
Stephen Heumann
6d07043783 Do not treat uses of enum types from outer scopes as redeclarations.
This affects code like the following:

enum E {a,b,c};
int main(void) {
        enum E e;
        struct E {int x;}; /* or: enum E {x,y,z}; */
}

The line "enum E e;" should refer to the enum type declared in the outer scope, but not redeclare it in the inner scope. Therefore, a subsequent struct, union, or enum declaration using the same tag in the same scope is acceptable.
2022-07-18 21:34:29 -05:00
Stephen Heumann
fd54fd70d0 Remove some unnecessary/duplicate code.
This mainly comments out statements that zero out data that was already set to zero by a preceding Calloc call.
2022-07-18 21:19:44 -05:00
Stephen Heumann
c36bf9bf0a Ignore storage class when creating enum tag symbols.
This avoids strangeness where an enum tag declared within a typedef declaration would act like a typedef. For example, the following would compile without error:

typedef enum E {a,b,c} T;
E e;
2022-07-18 18:37:26 -05:00
Stephen Heumann
2cbcdc736c Allow the same identifier to be used as a typedef and an enum tag.
This should be allowed (because they are in separate name spaces), but was not.

This affected code like the following:

typedef int T;
enum T {a,b,c};
2022-07-18 18:33:54 -05:00
Stephen Heumann
5e20e02d06 Add a function to make a pointer type.
This allows us to refactor out code that was doing this in several places.
2022-06-19 17:55:08 -05:00
Stephen Heumann
58849607a1 Use cgPointerSize for size of pointers in various places.
This makes no practical difference when targeting the GS, but it better documents what the relevant size is.
2022-06-18 19:30:20 -05:00
Stephen Heumann
3c2b492618 Add support for compound literals within functions.
The basic approach is to generate a single expression tree containing the code for the initialization plus the reference to the compound literal (or its address). The various subexpressions are joined together with pc_bno pcodes, similar to the code generated for the comma operator. The initializer expressions are placed in a balanced binary tree, so that it is not excessively deep.

Note: Common subexpression elimination has poor performance for very large trees. This is not specific to compound literals, but compound literals for relatively large arrays can run into this issue. It will eventually complete and generate a correct program, but it may be quite slow. To avoid this, turn off CSE.
2022-06-08 21:34:12 -05:00
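For reference, a small sketch of the language feature this commit adds, in standard C99. The type and function names are hypothetical. It uses both the address of a structure compound literal and an element of an array compound literal inside a function:

```c
struct point { int x, y; };

static int coord_sum(const struct point *p) { return p->x + p->y; }

int compound_literal_demo(void)
{
    /* Address of a struct compound literal, plus an element of an array
       compound literal; both objects live until the enclosing block ends. */
    return coord_sum(&(struct point){ 3, 4 }) + (int[]){ 10, 20, 30 }[1];
}
```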
Stephen Heumann
2a9ec8fc43 Explicitly terminate PCH generation if there is an initialized variable.
Initialized variables have always been one of the things that stops PCH generation, but previously this was only detected when trying to write out the symbol records at the point of a later #include. On a subsequent compile using the sym file, nothing would recognize that PCH generation had stopped for this reason, so the PCH code would recognize the later #include as a potential opportunity to extend the sym file, and therefore would delete it to force regeneration next time. This led to the sym file being deleted and regenerated on alternate compiles, so its full benefit was not realized.

There is code in Header.pas to abort PCH generation if an initialized symbol is found. That is probably superfluous after this change, but it has been left in place for now.
2022-02-19 14:22:25 -06:00
Stephen Heumann
3893db1346 Make sure #pragma expand is properly applied in all cases.
There were various places where the flag for macro expansions was saved, set to false, and then later restored. If #pragma expand was used within those areas, it would not be properly applied. Here is an example showing that problem:

void f(void
#pragma expand 1
) {}

This could also affect some uses of #pragma expand within precompiled headers, e.g.:

#pragma expand 1
#include "a.h"
#undef foobar
#include "b.h"
...

Also, add a note saying that code in precompiled headers will not be expanded. (This has always been the case, but was not clearly documented.)
2022-02-15 20:50:02 -06:00
Stephen Heumann
785a6997de Record source file changes within a function as part of debug info.
This affects functions whose body spans multiple files due to includes, or is treated as doing so due to #line directives. ORCA/C will now generate a COP 6 instruction to record each source file change, allowing debuggers to properly track the flow of execution across files.
2022-02-05 18:32:11 -06:00
Stephen Heumann
e36503508a Allow more forms of address expressions in static initializers.
There were several forms that should be permitted but were not, such as &"str"[1], &*"str", &*a (where a is an array), and &*f (where f is a function).

This fixes #15 and also certain other cases illustrated in the following example:

char a[10];
int main(void);
static char *s1 = &"string"[1];
static char *s2 = &*"string";
static char *s3 = &*a;
static int (*f2)(void)=&*main;
2022-01-29 21:59:25 -06:00
Stephen Heumann
8eda03436a Preserve qualifiers when changing float/double/comp parameters to extended.
Changing the type is still non-standard, but at least this allows us to detect and report write-to-const errors.
2022-01-17 18:26:28 -06:00
Stephen Heumann
6f0b94bb7c Allow the pascal qualifier to appear anywhere types are used.
This is necessary to allow declarations of pascal-qualified function pointers as members of a structure, among other things.

Note that the behavior for "pascal" now differs from that for the standard function specifiers, which have more restrictive rules for where they can be used. This is justified by the fact that the "pascal" qualifier is allowed and meaningful for function pointer types, so it should be able to appear anywhere they can.

This fixes #28.
2022-01-13 20:11:43 -06:00