ORCA-C

mirror of https://github.com/byteworksinc/ORCA-C.git synced 2024-12-30 14:31:04 +00:00

Author	SHA1	Message	Date
Stephen Heumann	e3a3548443	Fix line numbering via #line when using a .sym file. The line numbering would be off by one in this case.	2022-10-22 21:56:16 -05:00
Stephen Heumann	65ec29ee3e	Use 32-bit representation for line numbers. C99 and later specify that line numbers set via #line can be up to 2147483647, so they need to be represented as (at least) a 32-bit value.	2022-10-22 21:46:12 -05:00
Stephen Heumann	760c932fea	Initial implementation of _Pragma (C99). This works for typical cases, but does not yet handle Unicode strings, very long strings, or certain escape sequences.	2022-10-22 17:08:54 -05:00
Stephen Heumann	859aa4a20a	Do not enter the editor with a negative file displacement. This could happen due to errors on the command line, e.g.: cmpl +T +E file.c cc=(invalid)	2022-10-22 12:54:59 -05:00
Stephen Heumann	946c6c1d55	Always end preprocessor expression processing at end of line. In certain error cases, tokens from subsequent lines could get treated as part of a preprocessor expression, causing subsequent code to be essentially ignored and producing strange error messages. Here is an example (with an error) affected by this: #pragma optimize 0 0 int main(void) {}	2022-10-21 18:51:53 -05:00
Stephen Heumann	bdf212ec6b	Remove support for separate . . . as equivalent to a ... token. The scanner has been updated so that ... should always get recognized as a single token, so this is no longer necessary as a workaround. Any code that actually uses separate . . . is non-standard and will need to be changed.	2022-10-19 18:14:14 -05:00
Stephen Heumann	6d8ca42734	Parse the _Thread_local storage-class specifier. This does not really do anything, because ORCA/C does not support multithreading, but the C11 and later standards indicate it should be allowed anyway.	2022-10-18 21:01:26 -05:00
Stephen Heumann	cb5db95476	Do not print "\000" at end of printf/scanf format strings. This happened due to the change to include the null terminator in the internal representation of strings.	2022-10-18 18:40:14 -05:00
Stephen Heumann	91d33b586d	Fix various C99+ conformance issues and bugs in test cases. The main changes made to most tests are: Declarations always include explicit types, not relying on implicit int. The declaration of main in most test programs is changed to be "int main (void) {...}", adding an explicit return type and a prototype. (There are still some non-prototyped functions, though.) Functions are always declared before use, either by including a header or by providing a declaration for the specific function. The latter approach is usually used for printf, to avoid requiring ORCA/C to process stdio.h when compiling every test case (which might make test runs noticeably slower). Make all return statements in non-void functions (e.g. main) return a value. Avoid some instances of undefined behavior and type errors in printf and scanf calls. Several miscellaneous bugs are also fixed. There are still a couple test cases that intentionally rely on the C89 behavior, to ensure it still works.	2022-10-17 20:17:24 -05:00
Stephen Heumann	b3c30b05d8	Add a specific test for old C89 features that were removed from C99+.	2022-10-16 21:29:55 -05:00
Stephen Heumann	afe40c0f67	Prevent spurious errors about structs containing function pointers. If a struct contained a function pointer with a prototyped parameter list, processing the parameters could reset the declaredTagOrEnumConst flag, potentially leading to a spurious error, as in this example: struct S { int (*f)(int); }; This also gives a better error for structs declared as containing functions.	2022-10-16 19:57:14 -05:00
Stephen Heumann	a864954353	Use "declarator expected" error messages when appropriate. Previously, some of these cases would report "identifier expected."	2022-10-16 18:45:06 -05:00
Stephen Heumann	99e268e3b9	Implement support for anonymous structures and unions (C11). Note that this implementation allows anonymous structures and unions to participate in initialization. That is, you can have a braced initializer list corresponding to an anonymous structure or union. Also, anonymous structures within unions follow the initialization rules for structures (and vice versa). I think the better interpretation of the standard text is that anonymous structures and unions cannot participate in initialization as such, and instead their members are treated as members of the containing structure or union for purposes of initialization. However, all other compilers I am aware of allow anonymous structures and unions to participate in initialization, so I have implemented it that way too.	2022-10-16 18:44:19 -05:00
Stephen Heumann	44a1ba5205	Print floating constants with more precision in #pragma expand output. Finite numbers should now be printed with sufficient precision to produce the same value as the original constant in the relevant type.	2022-10-15 22:20:22 -05:00
Stephen Heumann	83ac0ecebf	Add a function to peek at the next character. This is necessary to correctly handle line continuations in a few places: * Between an initial . and the subsequent digit in a floating constant * Between the third and fourth characters of a %:%: digraph * Between the second and third dots of a ... token Previously, these would not be tokenized correctly, leading to spurious errors in the first and second cases above. Here is a sample program illustrating the problem: int printf(const char * restrict, ..\ \ ??/ .); int main(void) { double d = .??/ \ ??/ \ 1234; printf("%f\n", d); }	2022-10-15 21:42:02 -05:00
Stephen Heumann	6fadd52fc2	Update release notes to cover fixes to fgets() and gets().	2022-10-15 19:11:11 -05:00
Stephen Heumann	5be888a2bd	Make stdin/stdout/stderr into macros. They are supposed to be macros, according to the C standards. This ordinarily doesn't matter, but it can be detected by #ifdef, as in the following program: #include <stdio.h> #ifdef stdin int main(void) { puts("stdin is a macro"); } #endif	2022-10-15 17:10:59 -05:00
Stephen Heumann	072f8be6bc	Adjust test of missing declarators to cover only cases that are legal. Previously, it included some instances that violate the standard constraint that a declaration must declare a declarator, a tag, or an enum constant. As of commit `f263066f61`, this constraint is now enforced, so those cases would (properly) give errors.	2022-10-13 18:52:18 -05:00
Stephen Heumann	b8b7dc2c2b	Remove code that treats # as an illegal character in most places. C90 had constraints requiring # and ## tokens to only appear in preprocessing directives, but C99 and later removed those constraints, so this code is no longer necessary when targeting current language versions. (It would be necessary in a "strict C90" mode, if that was ever implemented.) The main practical effect of this is that # and ## tokens can be passed as parameters to macros, provided the macro either ignores or stringizes that parameter. # and ## tokens still have no role in the grammar of the C language after preprocessing, so they will be an unexpected token and produce some kind of error if they appear anywhere. This also contains a change to ensure that a line containing one or more illegal characters (e.g. $) and then a # is not treated as a preprocessing directive.	2022-10-13 18:35:26 -05:00
Stephen Heumann	99a10590b1	Avoid out-of-range branches around asm code using dcl directives. The branch range calculation treated dcl directives as taking 2 bytes rather than 4, which could result in out-of-range branches. These could result in linker errors (for forward branches) or silently generating wrong code (for backward branches). This patch now treats dcb, dcw, and dcl as separate directives in the native-code layer, so the appropriate length can be calculated for each. Here is an example of code affected by this: int main(int argc, char **argv) { top: if (!argc) { /* this caused a linker error */ asm { dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 dcl 0 } goto top; /* this generated bad code with no error */ } }	2022-10-13 18:00:16 -05:00
Stephen Heumann	19683706cc	Do not optimize code from asm statements. Previously, the assembly-level optimizations applied to code in asm statements. In many cases, this was fine (and could even do useful optimizations), but occasionally the optimizations could be invalid. This was especially the case if the assembly involved tricky things like self-modifying code. To avoid these problems, this patch makes the assembly optimizers ignore code from asm statements, so it is always emitted as-is, without any changes. This fixes #34.	2022-10-12 22:03:37 -05:00
Stephen Heumann	12a2e14b6d	Follow up peephole optimizations that may enable more optimizations. If one step of peephole optimization produced code that can be further optimized with more peephole optimizations, that additional optimization was not always done. This makes sure the additional optimization is done in several such cases. This was particularly likely to affect functions containing asm blocks (because CheckLabels would never trigger rescanning in them), but could also occur in other cases. Here is an example affected by this (generating inefficient code to load a[1]): #pragma optimize 1 int a[10]; void f(int x) {} int main(int argc, char **argv) { if (argc) return 0; f(a[1]); }	2022-10-12 19:14:13 -05:00
Stephen Heumann	ca21e33ba7	Generate more efficient code for indirect function calls.	2022-10-11 21:14:40 -05:00
Stephen Heumann	4fe9c90942	Parse ... as a single punctuator token. This accords with its definition in the C standards. For the time being, the old form of three separate tokens is still accepted too, because the ... token may not be scanned correctly in the obscure case where there is a line continuation between the second and third dots. One observable effect of this is that there are no longer spaces between the dots in #pragma expand output.	2022-10-10 18:06:01 -05:00
Stephen Heumann	f263066f61	Give an error for declarations that do not declare anything. This enforces the constraint from C17 section 6.7 p2 that declarations "shall declare at least a declarator (other than the parameters of a function or the members of a structure or union), a tag, or the members of an enumeration." Somewhat relaxed rules are used for enums in the default loose type checking mode, similar to what GCC and Clang do.	2022-10-09 22:03:06 -05:00
Stephen Heumann	995ded07a5	Always treat "struct T;" as declaring the tag within the current scope. A declaration of this exact form always declares the tag T within the current scope, and as such makes this "struct T" a distinct type from any other "struct T" type in an outer scope. (Similarly for unions.) See C17 section 6.7.2.3 p7 (and corresponding places in all other C standards). Here is an example of a program affected by this: struct S {char a;}; int main(void) { struct S; struct S *sp; struct S {long b;} s; sp = &s; sp->b = sizeof(*sp); return s.b; }	2022-10-04 18:45:11 -05:00
Stephen Heumann	3cea478e5e	Clarify a comment.	2022-10-02 22:05:05 -05:00
Stephen Heumann	53baef0fb3	Make isPascal variable local to DoDeclaration. This avoids the need to save/restore it elsewhere.	2022-10-02 22:04:46 -05:00
Stephen Heumann	1fa3ec8fdd	Eliminate global variables for declaration specifiers. They are now represented in local structures instead. This keeps the representation of declaration specifiers together and eliminates the need for awkward and error-prone code to save and restore the global variables.	2022-10-01 21:28:16 -05:00
Stephen Heumann	05ecf5eef3	Add option to use the declared type for float/double/comp params. This differs from the usual ORCA/C behavior of treating all floating-point parameters as extended. With the option enabled, they will still be passed in the extended format, but will be converted to their declared type at the start of the function. This is needed for strict standards conformance, because you should be able to take the address of a parameter and get a usable pointer to its declared type. The difference in types can also affect the behavior of _Generic expressions. The implementation of this is based on ORCA/Pascal, which already did the same thing (unconditionally) with real/double/comp parameters.	2022-09-18 21:16:46 -05:00
Stephen Heumann	4e76f62b0e	Allow additional letters in identifiers. The added characters are accented roman letters that were added to the Mac OS Roman character set at some time after it was first defined. Some IIGS fonts include them, although others do not.	2022-08-01 19:59:49 -05:00
Stephen Heumann	95ad02f0b9	Detect various errors in macro definitions. These changes detect violations of several constraints in C17 section 6.10.3 and subsections.	2022-07-28 20:49:22 -05:00
Stephen Heumann	711549392c	Update displayed version number to mark this as a development version.	2022-07-25 18:33:32 -05:00
Stephen Heumann	2f75f47140	Update ORCA/C version number to 2.2.0 B6.	2022-07-19 20:40:52 -05:00
Stephen Heumann	1177ddc172	Tweak release notes. The "known issue" about not issuing required diagnostics is removed because ORCA/C has gotten significantly better about that, particularly if strict type checking is enabled. There are still probably some diagnostics that are missed, but it is no longer a big enough issue to be called out more prominently than other bugs.	2022-07-19 20:38:13 -05:00
Stephen Heumann	6e3fca8b82	Implement strict type checking for enum types. If strict type checking is enabled, this will prohibit redefinition of enums, like: enum E {a,b,c}; enum E {x,y,z}; It also prohibits use of an "enum E" type specifier if the enum has not been previously declared (with its constants). These things were historically supported by ORCA/C, but they are prohibited by constraints in section 6.7.2.3 of C99 and later. (The C90 wording was different and less clear, but I think they were not intended to be valid there either.)	2022-07-19 20:35:44 -05:00
Stephen Heumann	d576f19ede	Remove trailing whitespace in release notes. (No substantive changes.)	2022-07-18 21:45:55 -05:00
Stephen Heumann	6d07043783	Do not treat uses of enum types from outer scopes as redeclarations. This affects code like the following: enum E {a,b,c}; int main(void) { enum E e; struct E {int x;}; /* or: enum E {x,y,z}; */ } The line "enum E e;" should refer to the enum type declared in the outer scope, but not redeclare it in the inner scope. Therefore, a subsequent struct, union, or enum declaration using the same tag in the same scope is acceptable.	2022-07-18 21:34:29 -05:00
Stephen Heumann	fd54fd70d0	Remove some unnecessary/duplicate code. This mainly comments out statements that zero out data that was already set to zero by a preceding Calloc call.	2022-07-18 21:19:44 -05:00
Stephen Heumann	60efb4d882	Generate better code for indexed jumps. They now use a jmp (addr,X) instruction, rather than a more complicated code sequence using rts. This is an improvement that was suggested in an old Genie message from Todd Whitesel.	2022-07-18 21:18:26 -05:00
Stephen Heumann	c36bf9bf0a	Ignore storage class when creating enum tag symbols. This avoids strangeness where an enum tag declared within a typedef declaration would act like a typedef. For example, the following would compile without error: typedef enum E {a,b,c} T; E e;	2022-07-18 18:37:26 -05:00
Stephen Heumann	2cbcdc736c	Allow the same identifier to be used as a typedef and an enum tag. This should be allowed (because they are in separate name spaces), but was not. This affected code like the following: typedef int T; enum T {a,b,c};	2022-07-18 18:33:54 -05:00
Stephen Heumann	bdf8ed4f29	Simplify some code.	2022-07-17 18:15:29 -05:00
Stephen Heumann	6bfd491f2a	Update release notes.	2022-07-14 18:40:59 -05:00
Stephen Heumann	6934c8890d	Detect several cases of inappropriate operand types being used with ++ or --.	2022-07-12 18:35:52 -05:00
Stephen Heumann	63d33b47bf	Generate valid code for "dereferencing" pointers to void. This covers code like the following, which is very dubious but does not seem to be clearly prohibited by the standards: int main(void) { void *vp; *vp; } Previously, this would do an indirect load of a four-byte value at the location, but then treat it as void. This could lead to the four-byte value being left on the stack, eventually causing a crash. Now we just evaluate the pointer expression (in case it has side effects), but effectively cast it to void without dereferencing it.	2022-07-12 18:34:58 -05:00
Stephen Heumann	417fd1ad9c	Generate better code for && and ||.	2022-07-11 21:16:18 -05:00
Stephen Heumann	312a3a09b9	Generate better code for long long >= comparisons.	2022-07-11 19:20:55 -05:00
Stephen Heumann	687a5eaa45	Generate better code for pc_not on boolean operands.	2022-07-11 18:54:39 -05:00
Stephen Heumann	b5b76b624c	Use pei rather than load+push in a few places.	2022-07-11 18:42:14 -05:00
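
The anonymous structure and union support described in commit 99e268e3b9 can be illustrated with a minimal sketch. The type `struct Value` and the functions `make_int`, `make_pair`, and `check_layout` are hypothetical names chosen for this example and do not come from the ORCA/C sources or its test suite. The sketch assumes a C11 compiler and shows a braced initializer list that corresponds to an anonymous union, which is the behavior the commit message says was implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration only; not taken from the ORCA/C sources. */
struct Value {
    int tag;
    union {                 /* anonymous union: no member name */
        long i;
        struct {            /* anonymous structure inside the union */
            short lo;
            short hi;
        };
    };
};

/* The inner braces form an initializer list for the anonymous union itself,
   initializing its first member (i). */
struct Value make_int(long n)
{
    struct Value v = { 1, { n } };
    return v;
}

/* Members of the anonymous structure are named as if they belonged
   directly to struct Value. */
struct Value make_pair(short lo, short hi)
{
    struct Value v;
    v.tag = 2;
    v.lo = lo;
    v.hi = hi;
    return v;
}

/* i and the anonymous structure share storage, so both start at the
   same offset within struct Value. */
void check_layout(void)
{
    assert(offsetof(struct Value, i) == offsetof(struct Value, lo));
}
```

A caller would use `make_int(42).i` or `make_pair(3, 4).hi` without naming the intermediate union or structure, since neither has a member name.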

682 Commits