This occurs when the constant value is out of range of the type being assigned to. That likely indicates an error, or code that assumes types have larger ranges than they do in ORCA/C (e.g. that int is 32 bits).
This intentionally does not report cases where a value is assigned to a signed type but is within the range of the corresponding unsigned type, or vice versa. These may be done intentionally, e.g. setting an unsigned value to "-1" or setting a signed value using a hex constant with the high bit set. Also, only conversions to 8-bit or 16-bit integer types are currently checked.
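As an illustration (assuming the usual 16-bit short; the comments are stand-ins for the actual messages):

short a = 70000;        /* reported: out of range of both short and unsigned short */
unsigned short b = -1;  /* not reported: within the range of short */
short c = 0x8000;       /* not reported: within the range of unsigned short */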
The motivation for this is that allocating and clearing symbol tables is a common operation, especially with C99+, where a construct like "if (...) { ... }" involves three levels of scope with their own symbol tables. In some tests, it could take an appreciable fraction of total execution time (sometimes ~10%).
This patch allows symbol tables that have already been allocated and cleared to be reused for a subsequent scope, as long as they are still empty. It does this by maintaining a pool of empty symbol tables and, when possible, taking one from the pool rather than allocating a new one.
We impose a somewhat arbitrary limit of MaxBlock/150000 on the number of symbol tables we keep, to avoid filling up memory with them. It would probably be better to use purgeable handles here, but that would be a little more work, and this should be good enough for now.
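As an illustration, here is a minimal C sketch of the pooling scheme (the names and fixed pool size are hypothetical; the compiler itself is written in ORCA/Pascal):

#include <stdlib.h>

typedef struct {
        int entryCount;
        /* ... hash buckets, etc. ... */
} SymbolTable;

#define MAX_POOLED 8    /* stand-in for the MaxBlock/150000 limit */
static SymbolTable *pool[MAX_POOLED];
static int poolCount;

/* Get an empty symbol table, reusing a pooled one when possible. */
SymbolTable *newSymbolTable(void) {
        if (poolCount > 0)
                return pool[--poolCount];        /* already allocated and cleared */
        return calloc(1, sizeof(SymbolTable));  /* slow path: allocate and clear */
}

/* At the end of a scope, keep the table for later reuse if it is still empty. */
void disposeSymbolTable(SymbolTable *t) {
        if (t->entryCount == 0 && poolCount < MAX_POOLED)
                pool[poolCount++] = t;
        else
                free(t);
}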
This is necessary both to detect errors (using unary + on non-arithmetic types) and to correctly perform the integer promotions when unary + is used (which can be detected with sizeof or _Generic).
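For example (with ORCA/C's 16-bit int):

#include <stdio.h>

int main(void) {
        char c = 0;
        short s = 0;
        /* Unary + performs the integer promotions, so +c and +s have type int. */
        printf("%i %i\n", (int)sizeof(c), (int)sizeof(+c));   /* prints "1 2" */
        printf("%i\n", _Generic(+s, int: 1, short: 2));       /* prints "1" */
        /* By contrast, +"abc" is an error: the operand is not arithmetic. */
}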
This provides a more straightforward way to place the compiler in a "strict conformance" mode. This could essentially be achieved by setting several pragma options, but having a single setting is simpler. "Compatibility modes" for older standards can also be selected, although these actually continue to enable most C17 features (since they are unlikely to cause compatibility problems for older code).
It will now grow as needed to accommodate large segments, subject to the constraints of available memory. In practice, this mostly affects the size of initialized static arrays that can be used.
This also removes any limit apart from memory size on how large the object representation produced by a "compile to memory" operation can be, and cleans up error reporting regarding size limits.
Static initialization of arrays/structs/unions now essentially "executes" the initializer records to fill in a buffer (and keep track of relocations), then emits pcode to represent that initialized state. This supports overlapping and out-of-order initializer records, as can be produced by designated initialization.
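For example, designated initialization (which this prepares for) can produce such records:

int a[4] = {[2] = 5, [0] = 1, [2] = 7};   /* out of order, and element 2 is
                                             initialized twice; a is {1, 0, 7, 0} */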
The idea (not yet implemented) is to use this to support out-of-order initialization. For automatic variables, we can just initialize the subobjects in the order that initializers appear. For static variables, we will eventually need to sort the initializers into order, but this can be done based on their recorded displacements.
This is preparatory to supporting designated initializers.
Any struct/union type with an anonymous member now forces .sym file generation to end, since we do not have a scheme for serializing this information in a .sym file. It would be possible to do so, but for now we just avoid this situation for simplicity.
This is a minimal implementation that does not actually inline anything, but it is intended to implement the semantics defined by the C99 and later standards.
One complication is that a declaration that appears somewhere after the function body may create an external definition for a function that appeared to be an inline definition when it was defined. To support this while preserving ORCA/C's general one-pass compilation strategy, we generate code even for inline definitions, but treat them as private and add the prefix "~inline~" to the name. If they are "un-inlined" based on a later declaration, we generate a stub with external linkage that just jumps to the apparently-inline function.
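As an illustration of the scenario (the "~inline~" name is internal to the compiler):

/* At this point this appears to be an inline definition, so the code is
   generated under the private name ~inline~f. */
inline int f(void) {
        return 42;
}

/* This later declaration creates an external definition of f, so a stub
   with external linkage is generated that jumps to the code above. */
extern int f(void);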
This implements the catch-all category for preprocessing tokens for "each non-white-space character that cannot be one of the above" (C17 section 6.4). These may appear in skipped code, or in macros or macro parameters if they are never expanded or are stringized during macro processing. The affected characters are $, @, `, and many extended characters.
It is still an error if these tokens are used in contexts where they remain present after preprocessing. If #pragma ignore bit 0 is clear, these characters are also reported as errors in skipped code or preprocessor constructs.
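For example, the following should now be accepted (assuming the relevant #pragma ignore bit does not request errors for the skipped code), because the @ token is stringized and never expanded:

#define STR(x) #x
const char *s = STR(@);   /* s is "@" */

#if 0
@ $ `   /* skipped code */
#endif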
Note that this implementation allows anonymous structures and unions to participate in initialization. That is, you can have a braced initializer list corresponding to an anonymous structure or union. Also, anonymous structures within unions follow the initialization rules for structures (and vice versa).
I think the better interpretation of the standard text is that anonymous structures and unions cannot participate in initialization as such, and instead their members are treated as members of the containing structure or union for purposes of initialization. However, all other compilers I am aware of allow anonymous structures and unions to participate in initialization, so I have implemented it that way too.
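For example, under this interpretation both of the following initializations are accepted:

struct S {
        int a;
        union {          /* anonymous union */
                int b;
                long c;
        };
        int d;
};

struct S x = {1, {2}, 3};   /* braced initializer for the anonymous union */
struct S y = {1, 2, 3};     /* b treated as a member of struct S itself */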
This accords with its definition in the C standards. For the time being, the old form of three separate tokens is still accepted too, because the ... token may not be scanned correctly in the obscure case where there is a line continuation between the second and third dots.
One observable effect of this is that there are no longer spaces between the dots in #pragma expand output.
They are now represented in local structures instead. This keeps the representation of declaration specifiers together and eliminates the need for awkward and error-prone code to save and restore the global variables.
The new value of maxLocalLabel is aligned with the C99+ requirement to support "511 identifiers with block scope declared in one block".
The value of maxLabel is now the maximum it can be while keeping the size of the labelTab array under 32 KiB. (I'm not entirely sure the address calculations in the code generated by ORCA/Pascal would work correctly beyond that.)
The basic approach is to generate a single expression tree containing the code for the initialization plus the reference to the compound literal (or its address). The various subexpressions are joined together with pc_bno pcodes, similar to the code generated for the comma operator. The initializer expressions are placed in a balanced binary tree, so that it is not excessively deep.
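A sketch of the balancing step in C (hypothetical names; makeSeq stands in for joining two subtrees with a pc_bno pcode):

#include <stdlib.h>

typedef struct Node Node;
struct Node {
        Node *left, *right;
        /* ... expression payload ... */
};

/* Join two subtrees with a sequencing node (akin to pc_bno). */
static Node *makeSeq(Node *l, Node *r) {
        Node *n = malloc(sizeof(Node));
        n->left = l;
        n->right = r;
        return n;
}

/* Join exprs[lo..hi] into a single tree, splitting at the midpoint so
   that its depth is O(log n) rather than O(n). */
Node *joinBalanced(Node *exprs[], int lo, int hi) {
        if (lo == hi)
                return exprs[lo];
        int mid = (lo + hi) / 2;
        return makeSeq(joinBalanced(exprs, lo, mid),
                       joinBalanced(exprs, mid + 1, hi));
}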
Note: Common subexpression elimination has poor performance for very large trees. This is not specific to compound literals, but compound literals for relatively large arrays can run into this issue. It will eventually complete and generate a correct program, but it may be quite slow. To avoid this, turn off CSE.
The source file name, keep name, NAMES= string, and cc= string are all restricted to 255 characters, but these limits were not previously enforced, and exceeding them could lead to strange behavior.
There were a couple of issues that could occur with #pragma keep and sym files:
*If a source file used #pragma keep but it was overridden by KEEP= on the command line or {KeepName} in the shell, then the overriding keep name would be saved to the sym file. It would therefore be applied to subsequent compilations even if it was no longer specified in the command line or shell variable.
*If a source file used #pragma keep, that keep name would be recorded in the sym file. On subsequent compilations, it would always be used, overriding any keep name specified by the command line or shell, contrary to the usual rule that the name on the command line takes priority.
With this patch, the keep name recorded in the sym file (if any) should always be the one specified by #pragma keep, but it can be overridden as usual.
This affects functions whose body spans multiple files due to includes, or is treated as doing so due to #line directives. ORCA/C will now generate a COP 6 instruction to record each source file change, allowing debuggers to properly track the flow of execution across files.
This allows the length of the string plus a few extra bytes used internally to be represented by a 16-bit integer. Since the size limit for memory allocations has been raised, there is no good reason to impose a shorter limit on strings.
Note that C99 and later specify a translation limit requiring support for string constants of at least 4095 characters.
We previously ignored this, but it is a constraint violation under the C standards, so it should be reported as an error.
GCC and Clang allow this as an extension, as we were effectively doing previously. We will follow the standards for now, but if there were demand for such an extension in ORCA/C, it could be re-introduced subject to a #pragma ignore flag.
Compound literals outside of functions should work at this point.
Compound literals inside of functions are not fully implemented, so they are disabled for now. (There is some code to support them, but the code to actually initialize them at the appropriate time is not written yet.)
These are needed to correctly distinguish pointer types in _Generic. They should also be used for type compatibility checks in other contexts, but currently are not.
This also fixes a couple small problems related to type qualifiers:
*restrict was not allowed to appear after * in type-names
*volatile status was not properly recorded in sym files
Here is an example of using _Generic to distinguish pointer types based on the qualifiers of the pointed-to type:
#include <stdio.h>

#define f(e) _Generic((e),\
        int * restrict *: 1,\
        int * volatile const *: 2,\
        int **: 3,\
        default: 0)

#define g(e) _Generic((e),\
        int *: 1,\
        const int *: 2,\
        volatile int *: 3,\
        default: 0)

int main(void) {
        int * restrict * p1;
        int * volatile const * p2;
        int * const * p3;
        // should print "1 2 0 1"
        printf("%i %i %i %i\n", f(p1), f(p2), f(p3), f((int * restrict *)0));

        int *q1;
        const int *q2;
        volatile int *q3;
        const volatile int *q4;
        // should print "1 2 3 0"
        printf("%i %i %i %i\n", g(q1), g(q2), g(q3), g(q4));
}
Here is an example of a problem resulting from volatile not being recorded in sym files (if a sym file was present, the read of x was lifted out of the loop):
#pragma optimize -1
static volatile int x;
#include <stdio.h>

int main(void) {
        int y;
        for (unsigned i = 0; i < 100; i++) {
                y = x*2 + 7;
        }
}
These were previously treated as having type int. This resulted in incorrect results from sizeof, and would also be a problem for _Generic if it were implemented.
Note that this creates a token kind of "charconst", but this is not the kind for character constants in the source code. Those have type int, so their kind is intconst. The new kinds of "tokens" are created only through casts of constant expressions.
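For example (with ORCA/C's 16-bit int):

#include <stdio.h>

int main(void) {
        /* 'a' is a character constant in the source code: type int, size 2.
           (char)'a' is a constant expression of type char: size 1. */
        printf("%i %i\n", (int)sizeof('a'), (int)sizeof((char)'a'));   /* prints "2 1" */
}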
The FENV_ACCESS pragma is now implemented. It causes floating-point operations to be evaluated at run time to the maximum extent possible, so that they can affect and be affected by the floating-point environment. It also disables optimizations that might evaluate floating-point operations at compile time or move them around calls to the <fenv.h> functions.
The FP_CONTRACT and CX_LIMITED_RANGE pragmas are also recognized, but they have no effect. (FP_CONTRACT relates to "contracting" floating-point expressions in a way that ORCA/C does not do, and CX_LIMITED_RANGE relates to complex arithmetic, which ORCA/C does not support.)
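For example, with FENV_ACCESS on, the division below must be evaluated at run time so that it sets the floating-point exception flags (a small test program, not taken from the compiler sources):

#include <fenv.h>
#include <stdio.h>
#pragma STDC FENV_ACCESS ON

int main(void) {
        feclearexcept(FE_ALL_EXCEPT);
        double d = 1.0 / 0.0;   /* must not be folded at compile time */
        if (fetestexcept(FE_DIVBYZERO))
                puts("FE_DIVBYZERO was raised");
        return d > 1.0 ? 0 : 1;
}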
This means that floating-point constants can now have the range and precision of the extended type (aka long double), and floating-point constant expressions evaluated within the compiler also have that same range and precision (matching expressions evaluated at run time). This new behavior is intended to match the behavior specified in the C99 and later standards for FLT_EVAL_METHOD 2.
This fixes the previous problem where long double constants and constant expressions of type long double were not represented and evaluated with the full range and precision that they should be. It also gives extra range and precision to constants and constant expressions of type double or float. This may have pluses and minuses, but at any rate it is consistent with the existing behavior for expressions evaluated at run time, and with one of the possible models of floating point evaluation specified in the C standards.
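For example, under FLT_EVAL_METHOD 2 the intermediate result below is kept in the extended type, so it does not overflow as it would if it were evaluated in double:

#include <float.h>
#include <stdio.h>

int main(void) {
        /* DBL_MAX * 2.0 overflows double, but not the extended type. */
        double d = DBL_MAX * 2.0 / 4.0;
        printf("%i\n", d == DBL_MAX / 2.0);   /* prints "1" under extended evaluation */
}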
Currently, the actual values they can have are still constrained to the 32-bit range. Also, there are some bits of functionality (e.g. for initializers) that are not implemented yet.