The idea (not yet implemented) is to use this to support out-of-order initialization. For automatic variables, we can just initialize the subobjects in the order in which the initializers appear. For static variables, we will eventually need to put the initializers back into their proper order, but this can be done based on their recorded displacements.
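For illustration, this is the kind of out-of-order designated initializer (standard C99 syntax) that the recorded displacements would eventually support:

struct point {int x, y;};
static struct point p = {.y = 2, .x = 1}; /* initializers appear out of member order */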
If a header ends in the middle of a declaration, the part of the declaration within the header could be ignored on subsequent compilations using the .sym file, which could lead to errors or misbehavior.
(This also applies to headers that end in the middle of a _Static_assert(...) or segment directive.)
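As a hypothetical sketch of the problem (the file names are illustrative), suppose a header that gets precompiled into a .sym file ends in the middle of a declaration:

/* test.h, precompiled into a .sym file */
long foo

/* test.c */
#include "test.h"
= 42;

On a subsequent compilation that used the .sym file, the "long foo" portion from the header could be ignored.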
Commit 9cc72c8845 introduced otherch tokens but did not properly update these tables to account for them. This would cause * not to be accepted as the first character in an expression, and might also cause other problems.
This is preparatory to supporting designated initializers.
Any struct/union type with an anonymous member now forces .sym file generation to end, since we do not have a scheme for serializing this information in a .sym file. It would be possible to do so, but for now we just avoid this situation for simplicity.
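For example, a type like the following now stops .sym file generation at that point (a minimal sketch; the names are illustrative):

struct S {
    int a;
    struct {int b; int c;}; /* anonymous member */
};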
This is a minimal implementation that does not actually inline anything, but it is intended to implement the semantics defined by the C99 and later standards.
One complication is that a declaration that appears somewhere after the function body may create an external definition for a function that appeared to be an inline definition when it was defined. To support this while preserving ORCA/C's general one-pass compilation strategy, we generate code even for inline definitions, but treat them as private and add the prefix "~inline~" to the name. If they are "un-inlined" based on a later declaration, we generate a stub with external linkage that just jumps to the apparently-inline function.
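A sketch of the "un-inlining" case described above (the names are illustrative):

inline int f(void) {return 42;} /* so far an inline definition: code generated under the name ~inline~f */
extern int f(void); /* this later declaration creates an external definition, so a stub named f is emitted */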
This still has a few issues. A \ token cannot currently be followed by u or U (because this triggers UCN processing). We should scan through the whole possible UCN until we can confirm whether it is actually a UCN, but that would require more lookahead. Also, \ is not handled correctly in stringization (it should form escape sequences).
This implements the catch-all category for preprocessing tokens for "each non-white-space character that cannot be one of the above" (C17 section 6.4). These may appear in skipped code, or in macros or macro parameters if they are never expanded or are stringized during macro processing. The affected characters are $, @, `, and many extended characters.
It is still an error if these tokens are used in contexts where they remain present after preprocessing. If #pragma ignore bit 0 is clear, these characters are also reported as errors in skipped code or preprocessor constructs.
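As a sketch (assuming #pragma ignore bit 0 is set, and with illustrative macro names), such a character is acceptable as a macro argument provided it is stringized or discarded during expansion:

#define STR(x) #x
#define IGNORE(x)
char s[] = STR(@); /* OK: the @ token is stringized, so it never survives preprocessing */
IGNORE($) /* OK: the $ token is never expanded */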
If the extern declaration refers to a global variable/function for which a declaration is already visible, the inner declaration should have the composite type (and it is an error if the types are incompatible).
This affects programs like the following:
static char a[60] = {5};

int main(void) {
    extern char a[];
    return sizeof(a) + a[0]; /* should return 65 */
}
Function declarations within a block are now entered in that block's symbol table rather than being moved to the global one. Several error checks are also added or tightened.
This fixes at least one bug: if a function declared within a block had the same name as a variable in an outer scope, the symbol table entry for that variable could be corrupted, leading to spurious errors or incorrect code generation. This example program illustrates the problem:
/* This should compile without errors and return 2 */
int f(void) {return 1;}
int g(void) {return 2;}

int main(void) {
    int (*f)(void) = g;
    {
        int f(void);
    }
    f = g;
    return f();
}
Errors now detected include the following (a sketch illustrating them appears after this list):
* Duplicate declarations of a static variable within a block (with the second one initialized)
* Duplicate declarations of the same variable as static and non-static
* Declaration of the same identifier as a typedef and a variable (at file scope)
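A minimal sketch of these newly detected errors (the identifiers are illustrative):

void f(void) {
    static int a;
    static int a = 1; /* error: duplicate static declaration, with the second one initialized */
}
static int b;
int b; /* error: b declared as both static and non-static */
typedef int c;
int c; /* error: c declared as both a typedef and a variable */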
This detects errors in the following cases that were previously missed:
* A function declaration and definition being part of the same overall declaration, e.g.:
void f(void), g(void) {}
* A function declaration (not definition) with no declaration specifiers, e.g.:
f(void);
(Function definitions with no declaration specifiers continue to be accepted by default, consistent with C90 rules.)
Previously, it generally just used the later type (except for function types where only the earlier one included a prototype). One effect of this was that if a global array was first declared with a size and then redeclared without one, the size information was lost, causing the proper space not to be allocated.
See C17 section 6.2.7 p4.
Here is an example affected by the array issue (dump the object file to see the size allocated):
int foo[50];
int foo[];
This seemed to be aimed at supporting lazy allocation of symbol tables. That could be a useful optimization, but the code that existed was incomplete and did not do anything useful. Similar code could be reintroduced as part of a full implementation of lazy allocation, if that is ever done.
A function declared "inline" with an explicit "extern" storage class has the same semantics as if "inline" was omitted. (It is not an inline definition as defined in the C standards.) The "inline" specifier suggests that the function should be inlined, but it is legal to just ignore it, as we already do for "static inline" functions.
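A sketch of this case (the name is illustrative):

extern inline int f(void) {return 1;} /* an external definition; "inline" is just a suggestion */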
Also add a test for the inline function specifier.
This still works by "reconstructing" the string literal text, rather than just using what was in the source code. This is not what the standards specify and can result in slightly different behavior in some corner cases, but for realistic cases it is probably fine.
According to the C standards (C17 section 6.10.3 p8), they should not be subject to macro replacement.
A similar change also applies to the "STDC" in #pragma STDC ... (but we still allow macros for other pragmas, which is allowed as part of the implementation-defined behavior of #pragma).
Here is an example affected by this issue:
#define ifdef ifndef
#ifdef foobar
#error "foobar defined?"
#else
int main(void) {}
#endif
This could access arbitrary memory locations, and could theoretically cause misbehavior including falsely recognizing the token as a pragma or accessing a softswitch/IO location.
In certain error cases, tokens from subsequent lines could get treated as part of a preprocessor expression, causing subsequent code to be essentially ignored and producing strange error messages.
Here is an example (with an error) affected by this:
#pragma optimize 0 0
int main(void) {}
The scanner has been updated so that ... should always get recognized as a single token, so this is no longer necessary as a workaround. Any code that actually uses separate . . . is non-standard and will need to be changed.
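For example, a declaration spelled with separate dots, as below, was previously accepted as equivalent to one using the ... token, but is no longer accepted:

int printf(const char * restrict, . . .);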
This does not really do anything, because ORCA/C does not support multithreading, but the C11 and later standards indicate it should be allowed anyway.
The main changes made to most tests are:
* Declarations always include explicit types, not relying on implicit int. The declaration of main in most test programs is changed to "int main (void) {...}", adding an explicit return type and a prototype. (There are still some non-prototyped functions, though.)
* Functions are always declared before use, either by including a header or by providing a declaration for the specific function. The latter approach is usually used for printf (a typical declaration is sketched after this list), to avoid requiring ORCA/C to process stdio.h when compiling every test case (which might make test runs noticeably slower).
* All return statements in non-void functions (e.g. main) now return a value.
* Some instances of undefined behavior and type errors in printf and scanf calls are avoided.
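For reference, the kind of declaration typically provided for printf in the tests looks like this (the exact spelling may vary between tests):

int printf(const char *, ...);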
Several miscellaneous bugs are also fixed.
There are still a couple of test cases that intentionally rely on the C89 behavior, to ensure it still works.
If a struct contained a function pointer with a prototyped parameter list, processing the parameters could reset the declaredTagOrEnumConst flag, potentially leading to a spurious error, as in this example:
struct S {
    int (*f)(int);
};
This also gives a better error for structs declared as containing functions.
Note that this implementation allows anonymous structures and unions to participate in initialization. That is, you can have a braced initializer list corresponding to an anonymous structure or union. Also, anonymous structures within unions follow the initialization rules for structures (and vice versa).
I think the better interpretation of the standard text is that anonymous structures and unions cannot participate in initialization as such, and instead their members are treated as members of the containing structure or union for purposes of initialization. However, all other compilers I am aware of allow anonymous structures and unions to participate in initialization, so I have implemented it that way too.
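A sketch of both initialization styles (the type is illustrative; the second form relies on the members being treated as members of the containing structure):

struct S {
    int a;
    struct {int b, c;}; /* anonymous structure */
};
struct S x = {1, {2, 3}}; /* braced initializer corresponding to the anonymous structure */
struct S y = {1, 2, 3}; /* members treated as members of the containing structure */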
This is necessary to correctly handle line continuations in a few places:
* Between an initial . and the subsequent digit in a floating constant
* Between the third and fourth characters of a %:%: digraph
* Between the second and third dots of a ... token
Previously, these would not be tokenized correctly, leading to spurious errors in the first and second cases above.
Here is a sample program illustrating the problem:
int printf(const char * restrict, ..\
\
??/
.);
int main(void) {
double d = .??/
\
??/
\
1234;
printf("%f\n", d);
}
These identifiers (stdin, stdout, and stderr) are supposed to be macros, according to the C standards. This ordinarily doesn't matter, but it can be detected by #ifdef, as in the following program:
#include <stdio.h>
#ifdef stdin
int main(void) {
    puts("stdin is a macro");
}
#endif
Previously, it included some instances that violate the standard constraint that a declaration must declare a declarator, a tag, or an enum constant. As of commit f263066f61, this constraint is now enforced, so those cases would (properly) give errors.
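For instance, a declaration like the following declares no declarator, tag, or enum constant, and now draws an error:

int;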
C90 had constraints requiring # and ## tokens to only appear in preprocessing directives, but C99 and later removed those constraints, so this code is no longer necessary when targeting current language versions. (It would be necessary in a "strict C90" mode, if that were ever implemented.)
The main practical effect of this is that # and ## tokens can be passed as parameters to macros, provided the macro either ignores or stringizes that parameter. # and ## tokens still have no role in the grammar of the C language after preprocessing, so they will be unexpected tokens and produce some kind of error if they remain anywhere after preprocessing.
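A sketch of a use that is valid in C99 and later (the macro name is illustrative):

#define STRINGIZE(x) #x
char s[] = STRINGIZE(#); /* the # token is passed as an argument and stringized to "#" */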
This also contains a change to ensure that a line containing one or more illegal characters (e.g. $) and then a # is not treated as a preprocessing directive.
The branch range calculation treated dcl directives as taking 2 bytes rather than 4, which could result in out-of-range branches. These could cause linker errors (for forward branches) or silent generation of wrong code (for backward branches).
This patch now treats dcb, dcw, and dcl as separate directives in the native-code layer, so the appropriate length can be calculated for each.
Here is an example of code affected by this:
int main(int argc, char **argv) {
top:
    if (!argc) { /* this caused a linker error */
        asm {
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
            dcl 0
        }
        goto top; /* this generated bad code with no error */
    }
}