111 Commits

Author SHA1 Message Date
Stephen Heumann
73d194c12f Allow string constants with up to 32760 bytes.
This allows the length of the string plus a few extra bytes used internally to be represented by a 16-bit integer. Since the size limit for memory allocations has been raised, there is no good reason to impose a shorter limit on strings.

Note that C99 and later specify a minimum translation limit for string constants of at least 4095 characters.
2021-10-24 21:43:43 -05:00
Stephen Heumann
f567d60429 Allow bit-fields in unions.
All versions of standard C allow this, but ORCA/C previously did not.
2021-10-18 21:48:18 -05:00
Stephen Heumann
692ebaba85 Structs or arrays may not contain structs with a flexible array member.
We previously ignored this, but it is a constraint violation under the C standards, so it should be reported as an error.

GCC and Clang allow this as an extension, as we were effectively doing previously. We will follow the standards for now, but if there was demand for such an extension in ORCA/C, it could be re-introduced subject to a #pragma ignore flag.
2021-10-17 22:22:42 -05:00
Stephen Heumann
ad5063a9a3 Support hexadecimal floating-point constants. 2021-10-17 18:19:29 -05:00
Stephen Heumann
5871820e0c Support UTF-8/16/32 string literals and character constants (C11).
These have u8, u, or U prefixes, respectively. The types char16_t and char32_t (defined in <uchar.h>) are used for UTF-16 and UTF-32 code points.
2021-10-11 20:54:37 -05:00
Stephen Heumann
b076f85149 Avoid possible stack overflow when merging adjacent string literals.
The code for this was recursive and could overflow if there were several dozen consecutive string literals. It has been changed to only use one level of recursion, avoiding the problem.
2021-10-11 18:55:10 -05:00
Stephen Heumann
7ae830ae7e Initial support for compound literals.
Compound literals outside of functions should work at this point.

Compound literals inside of functions are not fully implemented, so they are disabled for now. (There is some code to support them, but the code to actually initialize them at the appropriate time is not written yet.)
2021-09-16 18:34:55 -05:00
Stephen Heumann
a8682e28d3 Give an error for pointer assignments that discard qualifiers.
This is controlled by #pragma ignore bit 5, which is now a more general "loose type checks" bit.
2021-09-10 17:58:20 -05:00
Stephen Heumann
9c04b94093 Allow invalid escape sequences and UCN-like sequences in skipped code.
The standard wording is not always clear on these cases, but I think at least some of them should be allowed and others may be undefined behavior (which we can choose to allow). At any rate, this allows non-standard escape sequences targeted at other compilers to appear in skipped-over code.

There probably ought to be similar handling for #defines that are never expanded, but that would require more code changes.
2021-09-06 20:37:17 -05:00
Stephen Heumann
ea461dba7b Give clearer error messages for errors in the command line. 2021-08-31 19:23:10 -05:00
Stephen Heumann
b8c332deeb Treat invalid escape sequences as errors.
This applies to octal and hexadecimal sequences with out-of-range values, and also to unrecognized escape characters. The C standards say both of these cases are syntax/constraint violations requiring a diagnostic.
2021-08-31 18:36:06 -05:00
Stephen Heumann
2b9d332580 Give an appropriate error for an illegal operator in a constant expression.
This was being reported as an "illegal type cast".
2021-08-22 20:33:34 -05:00
Stephen Heumann
5faf219eff Update comments about pragma flags. 2021-08-22 17:35:16 -05:00
Stephen Heumann
03f267ac02 Write out long long constants when using #pragma expand. 2021-03-11 23:20:14 -06:00
Stephen Heumann
9cd2807bc8 Do not leave behind detritus from the spinner when using #pragma expand.
This could happen with the following example (under ORCA/Shell with output to the screen only):

#include <stdio.h>
#pragma expand 1

int main(void) {
}
2021-03-11 19:01:38 -06:00
Stephen Heumann
2de8ac993e Fix to make _Generic handle struct types properly.
Also, use an existing error message instead of creating a new equivalent one.
2021-03-07 23:35:12 -06:00
Stephen Heumann
bccd86a627 Implement _Generic expressions (from C11).
Note that this code relies on CompTypes for type compatibility testing, and it has slightly non-standard behavior in some cases.
2021-03-07 21:59:37 -06:00
Stephen Heumann
979852be3c Use the right types for constants cast to character types.
These were previously treated as having type int. This resulted in incorrect results from sizeof, and would also be a problem for _Generic if it was implemented.

Note that this creates a token kind of "charconst", but this is not the kind for character constants in the source code. Those have type int, so their kind is intconst. The new kinds of "tokens" are created only through casts of constant expressions.
2021-03-07 13:38:21 -06:00
Stephen Heumann
8f8e7f12e2 Distinguish the different types of floating-point constants.
As with expressions, the type does not actually limit the precision and range of values represented.
2021-03-07 00:48:51 -06:00
Stephen Heumann
f9f79983f8 Implement the standard pragmas, in particular FENV_ACCESS.
The FENV_ACCESS pragma is now implemented. It causes floating-point operations to be evaluated at run time to the maximum extent possible, so that they can affect and be affected by the floating-point environment. It also disables optimizations that might evaluate floating-point operations at compile time or move them around calls to the <fenv.h> functions.

The FP_CONTRACT and CX_LIMITED_RANGE pragmas are also recognized, but they have no effect. (FP_CONTRACT relates to "contracting" floating-point expressions in a way that ORCA/C does not do, and CX_LIMITED_RANGE relates to complex arithmetic, which ORCA/C does not support.)
2021-03-06 00:57:13 -06:00
Stephen Heumann
4ad7a65de6 Process floating-point values within the compiler using the extended type.
This means that floating-point constants can now have the range and precision of the extended type (aka long double), and floating-point constant expressions evaluated within the compiler also have that same range and precision (matching expressions evaluated at run time). This new behavior is intended to match the behavior specified in the C99 and later standards for FLT_EVAL_METHOD 2.

This fixes the previous problem where long double constants and constant expressions of type long double were not represented and evaluated with the full range and precision that they should be. It also gives extra range and precision to constants and constant expressions of type double or float. This may have pluses and minuses, but at any rate it is consistent with the existing behavior for expressions evaluated at run time, and with one of the possible models of floating point evaluation specified in the C standards.
2021-03-04 23:58:08 -06:00
Stephen Heumann
4020098dd6 Evaluate constant expressions with long long and floating operands.
Note that we currently defer evaluation of such expressions to run time if the long long value cannot be represented exactly in a double, because statically-evaluated floating point expressions use the double format rather than the extended (long double) format used at run time.
2021-02-21 18:43:53 -06:00
Stephen Heumann
6bb91d20e5 Add the predefined macro __ORCAC_HAS_LONG_LONG__.
This allows headers or other code to test for the presence of this feature.
2021-02-17 14:41:09 -06:00
Stephen Heumann
b4604e079e Do preprocessor arithmetic in intmax_t/uintmax_t (aka long long types).
This is what C99 and later require.
2021-02-17 00:04:20 -06:00
Stephen Heumann
2e29390e8e Support 64-bit decimal constants in code. 2021-02-15 12:28:30 -06:00
Stephen Heumann
5e5434987b Give an error when trying to evaluate constant expressions with long long operands. 2021-02-04 14:56:15 -06:00
Stephen Heumann
c37fae0f3b Add most of the infrastructure to support 64-bit decimal constants.
Right now, decimal constants can have long long types based on their suffix, but they are still limited to a maximum value of 2^32-1.

This also implements the C99 change where decimal constants without a u suffix always have signed types. Thus, decimal constants of 2^31 and up now have type long long, even if their values could be represented in the type unsigned long.
2021-02-04 00:22:56 -06:00
Stephen Heumann
058c0565c6 Support 64-bit integer constants in hex/octal/binary formats.
64-bit decimal constants are not supported yet.
2021-02-04 00:02:44 -06:00
Stephen Heumann
793f0a57cc Initial support for constants with long long types.
Currently, the actual values they can have are still constrained to the 32-bit range. Also, there are some bits of functionality (e.g. for initializers) that are not implemented yet.
2021-02-03 23:11:23 -06:00
Stephen Heumann
714b417261 Merge branch 'master' into longlong 2021-02-03 21:20:37 -06:00
Stephen Heumann
4a95dbc597 Give an error if you try to define a macro to + or - on the command line.
This affects command lines like:
cmpl myprog.c cc=(-da=+) ...

Previously, this would be accepted, but a was actually defined to 0 rather than +.

Now, this gives an error, consistent with other tokens that are not supported in such definitions on the command line. (Perhaps we should support definitions using any tokens, but that would require bigger code changes.)

This also cleans up some related code to avoid possible null-pointer dereferences.
2021-02-03 21:06:58 -06:00
Stephen Heumann
1b9ee39de7 Disallow duplicate suffixes on numeric constants (e.g. "123ulu"). 2021-02-02 18:28:49 -06:00
Stephen Heumann
8ac887f4dc Hexadecimal/octal constants 0x80000000+ should have type unsigned long.
They previously had type signed long (with negative values).
2021-02-02 18:26:31 -06:00
Stephen Heumann
085cd7eb1b Initial code to recognize 'long long' as a type. 2021-01-29 22:27:11 -06:00
Stephen Heumann
f0a3808c18 Add a new #pragma ignore option to treat char and unsigned char as compatible.
This is contrary to the C standards, but ORCA/C historically permitted it (as do some other compilers), and I think there is a fair amount of existing code that relies on it.
2020-05-22 17:11:13 -05:00
Stephen Heumann
5d64436e6e Implement __STDC_HOSTED__ macro (from C99).
This is normally 1 (indicating a hosted implementation, where the full standard library is available and the program starts by executing main()), but it is 0 if one of the pragmas for special types of programs with different entry points has been used.
2020-03-07 15:51:29 -06:00
Stephen Heumann
a62cbe531a Implement __STDC_NO_...__ macros as specified by C11.
These indicate that various optional features of the C standard are not supported.
2020-03-06 23:29:54 -06:00
Stephen Heumann
32614abfca Allow '/*' or '//' in character constants.
These should not start a comment.
2020-02-04 18:42:55 -06:00
Stephen Heumann
c84c4d9c5c Check for non-void functions that execute to the end without returning a value.
This generalizes the heuristic approach for checking whether _Noreturn functions could execute to the end of the function, extending it to apply to any function with a non-void return type. These checks use the same #pragma lint bit but give different messages depending on the situation.
2020-02-02 13:50:15 -06:00
Stephen Heumann
77dcfdf3ee Implement support for macros with variable arguments (C99). 2020-01-31 20:07:10 -06:00
Stephen Heumann
bc951b6735 Make lint report some more cases where noreturn functions may return.
This uses a heuristic that may produce both false positives and false negatives, but any false positives should reflect extraneous code at the end of the function that is not actually reachable.
2020-01-30 17:35:15 -06:00
Stephen Heumann
76eb476809 Address some issues with stringization of macro arguments.
We now insert spaces corresponding to whitespace between tokens, and string tokens are enclosed in quotes.

There are still issues with (at least) escape sequences in strings and comments between tokens.
2020-01-30 12:48:16 -06:00
Stephen Heumann
80c513bbf2 Add a lint flag for checking if _Noreturn functions may return.
Currently, this only flags return statements, not cases where they may execute to the end of the function. (Whether the function will actually return is not decidable in general, although it may be in special cases).
2020-01-29 19:26:45 -06:00
Stephen Heumann
4fd642abb4 Add lint check for return with no value in a non-void function.
This is disallowed in C99 and later.
2020-01-29 18:50:45 -06:00
Stephen Heumann
a9f5fb13d8 Introduce a new #pragma lint bit for syntax that C99 disallows.
This currently checks for:
*Calls to undefined functions (same as bit 0)
*Parameters not declared in K&R-style function definitions
*Declarations or type names with no type specifiers (includes but is broader than the condition checked by bit 1)
2020-01-29 18:33:19 -06:00
Stephen Heumann
ffe6c4e924 Spellcheck comments throughout the code.
There are no non-comment changes.
2020-01-29 17:09:52 -06:00
Stephen Heumann
d60104cc47 Tweak handling of lint warnings.
If there were a warning and an error on the same line, and errors were treated as terminal, the warning could sometimes be reported as an error.
2020-01-29 12:16:17 -06:00
Stephen Heumann
f5cd1e3e3a Recognize designated initializers enough to give an error and skip them.
Previously, the designated initializer syntax could confuse the parser enough to cause null pointer dereferences. This avoids that, and also gives a more meaningful error message to the user.
2020-01-28 12:48:09 -06:00
Stephen Heumann
c514c109ab Allow for function-like macros taking no parameters.
This was broken by commit 06a3719304.
2020-01-25 19:44:29 -06:00
Stephen Heumann
fe6c410271 Allow #pragma lint messages to optionally be treated as warnings.
In the #pragma lint line, the integer indicating the checks to perform can now optionally be followed by a semicolon and another integer. If these are present and the second integer is 0, then the lint checks will be performed, but will be treated as warnings rather than errors, so that they do not cause compilation to fail.
2020-01-25 11:29:12 -06:00