For the moment, it behaves as expected with regard to token merging and stringization, but otherwise it doesn't do anything. (It can be used in attributes, but those aren't implemented yet.)
These are tokens that follow the syntax for a preprocessing number, but not for an integer or floating constant after preprocessing. They are now allowed within the preprocessing phases of the compiler. They are not legal after preprocessing, but they may be used as operands of the # and ## preprocessor operators to produce legal tokens.
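For example (a hypothetical illustration): pasting can turn a pp-number that is not a valid constant into one that is.

#define PASTE(a,b) a##b
double d = PASTE(1e, 5);   /* 1e alone is not a valid constant, but the
                              pasted token 1e5 is a valid floating constant */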
Specifically, this affects the case where a macro argument ends with the name of a function-like macro that takes 0 parameters. When that argument is initially expanded, the macro should not be expanded, even if parentheses appear later in the macro it is being passed to or in the subsequent program code. This is because the C standards specify that "The argument’s preprocessing tokens are completely macro replaced before being substituted as if they formed the rest of the preprocessing file with no other preprocessing tokens being available." (The macro may still be expanded at a later stage, but that depends on other rules that determine whether the expansion is suppressed.) The logic for this was already present for macros taking one or more arguments; this extends it to apply to function-like macros taking zero arguments as well.
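A hypothetical example of the case in question:

#define ZERO() 0
#define PASS(x) x()
int i = PASS(ZERO);   /* ZERO is not expanded while the argument is expanded,
                         even though "()" follows in PASS's replacement list;
                         the rescan of ZERO() then expands it to 0. The final
                         result is the same here, but the point at which the
                         expansion happens matters for the suppression rules. */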
I'm not sure that this makes any practical difference while cycles of mutually-referential macros still aren't handled correctly (issue #48), but if that were fixed then there would be some cases that depend on this behavior.
Previously, there were a couple of problems:
* If the parameter that was passed an empty argument appeared directly after the ##, the ## would be permanently removed from the macro record, affecting subsequent uses of the macro even if the argument was not empty.
* If the parameter that was passed an empty argument appeared between two ## operators, both would effectively be skipped, so the tokens to the left of the first ## and to the right of the second would not be combined.
This example illustrates both issues (not expected to compile; just check preprocessor output):
#pragma expand 1
#define x(a,b,c) a##b##c
x(1, ,3)
x(a,b,c)
Previously, it was not necessarily set correctly for the newly-generated token. This would result in incorrect behavior if that token was an operand to another ## operator, as in the following example:
#define x(a,b,c) a##b##c
x(1,2,3)
If a #define appears within a function, it could use the local memory pool for string allocation (via Malloc in NextToken, line 5785), which can lead to a dangling memory reference when the macro is expanded. Here is an example illustrating the problem:
void function(void) {
#define TEXT "abc"
    static struct {
        char text[sizeof(TEXT)];
    } template = { TEXT };
}
The second parameter of #pragma float is now optional, and if it is missing or invalid then the FPE slot is auto-detected by the start-up code. This is done by calling the new ~InitFloat function in the FPE version of SysFloat.
This allows valid FPE-using programs to be compiled using only #pragma float, with no changes needed to the code itself.
The slot-setting code is only generated if the slot is 1..7, and even then it can be overridden by calling setfpeslot(), so this should not cause compatibility problems for existing code.
This could occur because when FindSymbol was called to look for symbols in all spaces, it would find a tag in an inner scope before a typedef in an outer scope. The processing order has been changed to look for regular symbols (including typedefs) in any scope, and only look for tags if no regular symbol is found.
Here is an example illustrating the problem:
typedef int T;
int main(void) {
    struct T;
    T x;
}
This adds debugging code to detect null pointer dereferences, as well as pointer arithmetic on null pointers (which is also undefined behavior, and can lead to later dereferences of the resulting pointers).
Note that ORCA/Pascal can already detect null pointer dereferences as part of its more general range-checking code. This implementation for ORCA/C will report the same error as ORCA/Pascal ("Subrange exceeded"). However, it does not include any of the other forms of range checking that ORCA/Pascal does, and (unlike in ORCA/Pascal) it is controlled by a separate flag from stack overflow checking.
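For instance, code like the following (a hypothetical illustration) would be flagged at run time when the checks are enabled:

int main(void) {
    int *p = 0;
    int x = *p;       /* null pointer dereference: reported */
    int *q = p + 1;   /* pointer arithmetic on a null pointer: also reported */
    return 0;
}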
These are erroneous in situations where the expression is used for its value. For function return types, this violates a constraint (C17 6.5.2.2 p1), so a diagnostic is required. We also now diagnose this issue for identifier expressions and unary * (indirection) expressions. Those cases cause undefined behavior per C17 6.3.2.1 p2, so a diagnostic is not required, but it is nice to give one.
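A hypothetical example of the newly-diagnosed cases:

struct S;                /* incomplete type */
struct S f(void);
extern struct S s;

void g(struct S *p) {
    f();   /* constraint violation: incomplete return type (C17 6.5.2.2 p1) */
    s;     /* undefined behavior: identifier expression of incomplete type */
    *p;    /* undefined behavior: indirection yielding an incomplete type */
}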
This occurs when the constant value is out of range of the type being assigned to. This is likely indicative of an error, or of code that assumes types have larger ranges than they do in ORCA/C (e.g. 32-bit int).
This intentionally does not report cases where a value is assigned to a signed type but is within the range of the corresponding unsigned type, or vice versa. These may be done intentionally, e.g. setting an unsigned value to "-1" or setting a signed value using a hex constant with the high bit set. Also, only conversions to 8-bit or 16-bit integer types are currently checked.
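Hypothetical examples of what is and is not reported, given ORCA/C's 16-bit int:

int i = 100000;      /* reported: exceeds the range of 16-bit int */
char c = 999;        /* reported: exceeds the range of char */
unsigned u = -1;     /* not reported: within the range of the corresponding signed type */
short s = 0xFFFF;    /* not reported: within the range of unsigned short */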
I think the reason this was originally disallowed is that the old code sequence for stack repair code (in ORCA/C 2.1.0) ended with TYA. If this was followed by STA dp or STA abs, the native code peephole optimizer (prior to commit 7364e2d2d329d81) would have turned the combination into a STY instruction. That is invalid if the value in A is needed. This could come up, e.g., when assigning the return value from a function to two different variables.
This is no longer an issue, because the current code sequence for stack repair code no longer ends in TYA and is not susceptible to the same kind of invalid optimization. So it is no longer necessary to disable the native code peephole optimizer when using stack repair code (either for all calls or just varargs calls).
This is necessary both to detect errors (using unary + on non-arithmetic types) and to correctly perform the integer promotions when unary + is used (which can be detected with sizeof or _Generic).
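For example (hypothetical), the promotion is visible through sizeof:

char c = 0;
int promoted = (sizeof(+c) == sizeof(int));   /* 1: unary + promotes c to int */
/* +"abc" would now be diagnosed: unary + requires an arithmetic operand */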
This provides a more straightforward way to place the compiler in a "strict conformance" mode. This could essentially be achieved by setting several pragma options, but having a single setting is simpler. "Compatibility modes" for older standards can also be selected, although these actually continue to enable most C17 features (since they are unlikely to cause compatibility problems for older code).
It will now grow as needed to accommodate large segments, subject to the constraints of available memory. In practice, this mostly affects the size of initialized static arrays that can be used.
This also removes any limit apart from memory size on how large the object representation produced by a "compile to memory" can be, and cleans up error reporting regarding size limits.
With the addition of designated initializers, ORCA/C now supports all the major mandatory language features added between C90 and C17, apart from those made optional by C11. There are still various small areas of nonconformance and a number of missing library functions, but at this point it is reasonable for ORCA/C to report itself as being a C17 implementation.
This is currently used in a couple of places in the designated initializer code (solving the problem with #pragma expand in the last commit). It could probably be used elsewhere too, but for now it is not.
This is a minimal implementation that does not actually inline anything, but it is intended to implement the semantics defined by the C99 and later standards.
One complication is that a declaration that appears somewhere after the function body may create an external definition for a function that appeared to be an inline definition when it was defined. To support this while preserving ORCA/C's general one-pass compilation strategy, we generate code even for inline definitions, but treat them as private and add the prefix "~inline~" to the name. If they are "un-inlined" based on a later declaration, we generate a stub with external linkage that just jumps to the apparently-inline function.
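A hypothetical illustration of the situation described above:

inline int twice(int x) { return 2 * x; }   /* apparently an inline definition */

extern int twice(int x);   /* this later declaration means the definition above
                              must provide the external definition, so a stub
                              with external linkage is generated that jumps to
                              the ~inline~ version */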
This still has a few issues. A \ token cannot currently be followed by u or U, because that triggers UCN processing. We should scan through the whole possible UCN until we can confirm whether it is actually a UCN, but that would require more lookahead. Also, \ is not handled correctly in stringization (it should form escape sequences).
This implements the catch-all category for preprocessing tokens for "each non-white-space character that cannot be one of the above" (C17 section 6.4). These may appear in skipped code, or in macros or macro parameters if they are never expanded or are stringized during macro processing. The affected characters are $, @, `, and many extended characters.
It is still an error if these tokens are used in contexts where they remain present after preprocessing. If #pragma ignore bit 0 is clear, these characters are also reported as errors in skipped code or preprocessor constructs.
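Hypothetical examples of the newly-accepted uses (assuming #pragma ignore bit 0 is set):

#define AT @          /* OK as long as AT is never expanded */
#define STR(x) #x
char s[] = STR(@);    /* OK: @ is removed by stringization during preprocessing */
#if 0
$ ` @                 /* OK in skipped code */
#endif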
A function declared "inline" with an explicit "extern" storage class has the same semantics as if "inline" was omitted. (It is not an inline definition as defined in the C standards.) The "inline" specifier suggests that the function should be inlined, but it is legal to just ignore it, as we already do for "static inline" functions.
Also add a test for the inline function specifier.
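For example (hypothetical):

extern inline int cube(int x) { return x * x * x; }
/* same semantics as:  int cube(int x) { return x * x * x; }
   This is an external definition, not an inline definition, and the
   inline hint may simply be ignored. */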
This still works by "reconstructing" the string literal text, rather than just using what was in the source code. This is not what the standards specify and can result in slightly different behavior in some corner cases, but for realistic cases it is probably fine.
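For example (hypothetical):

#define STR(x) #x
const char *s = STR( 1   +   2 );   /* yields "1 + 2": the text is rebuilt from
                                       the tokens rather than copied verbatim
                                       from the source */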
According to the C standards (C17 section 6.10.3 p8), they should not be subject to macro replacement.
A similar change also applies to the "STDC" in #pragma STDC ... (but macro replacement is still performed for other pragmas, which is permitted as part of the implementation-defined behavior of #pragma).
Here is an example affected by this issue:
#define ifdef ifndef
#ifdef foobar
#error "foobar defined?"
#else
int main(void) {}
#endif
This could access arbitrary memory locations, and could theoretically cause misbehavior including falsely recognizing the token as a pragma or accessing a softswitch/IO location.
In certain error cases, tokens from subsequent lines could get treated as part of a preprocessor expression, causing subsequent code to be essentially ignored and producing strange error messages.
Here is an example (with an error) affected by this:
#pragma optimize 0 0
int main(void) {}
The scanner has been updated so that ... should always be recognized as a single token, making this workaround unnecessary. Any code that actually uses three separate . tokens is non-standard and will need to be changed.
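For example (hypothetical):

int take_args(int n, ...);      /* "..." is now always scanned as one token */
int old_form(int n, . . .);     /* no longer accepted: three separate "." tokens */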
This does not really do anything, because ORCA/C does not support multithreading, but the C11 and later standards indicate it should be allowed anyway.