Commit Graph

86 Commits

Author SHA1 Message Date
Stephen Heumann
5e5434987b Give an error when trying to evaluate constant expressions with long long operands. 2021-02-04 14:56:15 -06:00
Stephen Heumann
c37fae0f3b Add most of the infrastructure to support 64-bit decimal constants.
Right now, decimal constants can have long long types based on their suffix, but they are still limited to a maximum value of 2^32-1.

This also implements the C99 change where decimal constants without a u suffix always have signed types. Thus, decimal constants of 2^31 and up now have type long long, even if their values could be represented in the type unsigned long.
2021-02-04 00:22:56 -06:00
Stephen Heumann
058c0565c6 Support 64-bit integer constants in hex/octal/binary formats.
64-bit decimal constants are not supported yet.
2021-02-04 00:02:44 -06:00
Stephen Heumann
793f0a57cc Initial support for constants with long long types.
Currently, the actual values they can have are still constrained to the 32-bit range. Also, there are some bits of functionality (e.g. for initializers) that are not implemented yet.
2021-02-03 23:11:23 -06:00
Stephen Heumann
714b417261 Merge branch 'master' into longlong 2021-02-03 21:20:37 -06:00
Stephen Heumann
4a95dbc597 Give an error if you try to define a macro to + or - on the command line.
This affects command lines like:
cmpl myprog.c cc=(-da=+) ...

Previously, this would be accepted, but a was actually defined to 0 rather than +.

Now, this gives an error, consistent with other tokens that are not supported in such definitions on the command line. (Perhaps we should support definitions using any tokens, but that would require bigger code changes.)

This also cleans up some related code to avoid possible null-pointer dereferences.
2021-02-03 21:06:58 -06:00
Stephen Heumann
1b9ee39de7 Disallow duplicate suffixes on numeric constants (e.g. "123ulu"). 2021-02-02 18:28:49 -06:00
Stephen Heumann
8ac887f4dc Hexadecimal/octal constants 0x80000000+ should have type unsigned long.
They previously had type signed long (with negative values).
2021-02-02 18:26:31 -06:00
Stephen Heumann
085cd7eb1b Initial code to recognize 'long long' as a type. 2021-01-29 22:27:11 -06:00
Stephen Heumann
f0a3808c18 Add a new #pragma ignore option to treat char and unsigned char as compatible.
This is contrary to the C standards, but ORCA/C historically permitted it (as do some other compilers), and I think there is a fair amount of existing code that relies on it.
2020-05-22 17:11:13 -05:00
Stephen Heumann
5d64436e6e Implement __STDC_HOSTED__ macro (from C99).
This is normally 1 (indicating a hosted implementation, where the full standard library is available and the program starts by executing main()), but it is 0 if one of the pragmas for special types of programs with different entry points has been used.
2020-03-07 15:51:29 -06:00
Stephen Heumann
a62cbe531a Implement __STDC_NO_...__ macros as specified by C11.
These indicate that various optional features of the C standard are not supported.
2020-03-06 23:29:54 -06:00
Stephen Heumann
32614abfca Allow '/*' or '//' in character constants.
These should not start a comment.
2020-02-04 18:42:55 -06:00
Stephen Heumann
c84c4d9c5c Check for non-void functions that execute to the end without returning a value.
This generalizes the heuristic approach for checking whether _Noreturn functions could execute to the end of the function, extending it to apply to any function with a non-void return type. These checks use the same #pragma lint bit but give different messages depending on the situation.
2020-02-02 13:50:15 -06:00
Stephen Heumann
77dcfdf3ee Implement support for macros with variable arguments (C99). 2020-01-31 20:07:10 -06:00
Stephen Heumann
bc951b6735 Make lint report some more cases where noreturn functions may return.
This uses a heuristic that may produce both false positives and false negatives, but any false positives should reflect extraneous code at the end of the function that is not actually reachable.
2020-01-30 17:35:15 -06:00
Stephen Heumann
76eb476809 Address some issues with stringization of macro arguments.
We now insert spaces corresponding to whitespace between tokens, and string tokens are enclosed in quotes.

There are still issues with (at least) escape sequences in strings and comments between tokens.
2020-01-30 12:48:16 -06:00
Stephen Heumann
80c513bbf2 Add a lint flag for checking if _Noreturn functions may return.
Currently, this only flags return statements, not cases where they may execute to the end of the function. (Whether the function will actually return is not decidable in general, although it may be in special cases).
2020-01-29 19:26:45 -06:00
Stephen Heumann
4fd642abb4 Add lint check for return with no value in a non-void function.
This is disallowed in C99 and later.
2020-01-29 18:50:45 -06:00
Stephen Heumann
a9f5fb13d8 Introduce a new #pragma lint bit for syntax that C99 disallows.
This currently checks for:
*Calls to undefined functions (same as bit 0)
*Parameters not declared in K&R-style function definitions
*Declarations or type names with no type specifiers (includes but is broader than the condition checked by bit 1)
2020-01-29 18:33:19 -06:00
Stephen Heumann
ffe6c4e924 Spellcheck comments throughout the code.
There are no non-comment changes.
2020-01-29 17:09:52 -06:00
Stephen Heumann
d60104cc47 Tweak handling of lint warnings.
If there were a warning and an error on the same line, and errors were treated as terminal, the warning could sometimes be reported as an error.
2020-01-29 12:16:17 -06:00
Stephen Heumann
f5cd1e3e3a Recognize designated initializers enough to give an error and skip them.
Previously, the designated initializer syntax could confuse the parser enough to cause null pointer dereferences. This avoids that, and also gives a more meaningful error message to the user.
2020-01-28 12:48:09 -06:00
Stephen Heumann
c514c109ab Allow for function-like macros taking no parameters.
This was broken by commit 06a3719304.
2020-01-25 19:44:29 -06:00
Stephen Heumann
fe6c410271 Allow #pragma lint messages to optionally be treated as warnings.
In the #pragma lint line, the integer indicating the checks to perform can now optionally be followed by a semicolon and another integer. If these are present and the second integer is 0, then the lint checks will be performed, but will be treated as warnings rather than errors, so that they do not cause compilation to fail.
2020-01-25 11:29:12 -06:00
Stephen Heumann
d8097e6b31 Do not accept %:%: digraph in places where ## would not be accepted.
This could happen in obscure cases like the following (outside a macro):

for(int b;;-%:%:- b) ;
2020-01-21 07:21:58 -06:00
Stephen Heumann
06a3719304 Allow for empty macro arguments, as specified by C99 and later.
These were previously allowed in some cases, but not as the last argument to a macro. Also, stringization and concatenation of them did not behave according to the standards.
2020-01-20 19:49:22 -06:00
Stephen Heumann
656868a095 Implement support for universal character names in identifiers. 2020-01-20 17:22:06 -06:00
Stephen Heumann
9862500dee Give an error if a parameter in a function definition has an incomplete type.
In combination with earlier patches, this fixes #53.

Also, if the lint flag requiring explicit function types is set, then also require that K&R-style parameters be explicitly declared with types, rather than not being declared and defaulting to int. (This is a requirement in C99 and later.)
2020-01-20 12:43:01 -06:00
Stephen Heumann
d24dacf01a Add initial support for universal character names.
This currently only works in character constants or strings, not identifiers.
2020-01-19 23:59:54 -06:00
Stephen Heumann
6e89dc5883 Give a basic error message for use of _Generic. 2020-01-19 18:03:21 -06:00
Stephen Heumann
dd92585116 Give errors for most illegal uses of "restrict". 2020-01-19 17:31:20 -06:00
Stephen Heumann
49dea49cb8 Detect and give errors for various illegal uses of _Alignas. 2020-01-19 17:06:01 -06:00
Stephen Heumann
a130e79929 Prohibit _Noreturn specifier on non-functions. 2020-01-19 14:57:28 -06:00
Stephen Heumann
b4232fd4ea Flag more appropriate errors about unexpected tokens in type names.
Previously, these would report "identifier expected"; now they correctly say "')' expected".

This introduces a new UnexpectedTokenError procedure that can be used more generally for cases where the expected token may differ based on context.
2020-01-18 16:43:25 -06:00
Stephen Heumann
df029ce06f Handle storage class specifiers in DeclarationSpecifiers.
_Thread_local is recognized but gives a "not supported" error. It could arguably be 'supported' trivially by saying the execution of an ORCA/C program is just one thread and so no special handling is needed, but that likely isn't what someone using it would expect.

There would be a possible issue if a "static" or "typedef" storage class specifier occurred after a type specifier that required memory to be allocated for it, because that memory conceptually might be in the local pool, but static objects are processed at the end of the translation unit, so their types need to stick around. In practice, this should not occur, because the local pool isn't currently used for much (in particular, not for statements or declarations in the body of a function). We give an error in case this somehow might occur.

In combination with preceding commits, this fixes #14. Declaration specifiers can now appear in any order, as required by the C standards.
2020-01-18 14:52:27 -06:00
Stephen Heumann
8341f71ffc Initial phase of support for new C99/C11 type syntax.
_Bool, _Complex, _Imaginary, _Atomic, restrict, and _Alignas are now recognized in types, but all except restrict and _Alignas will give an error saying they are not supported.

This also introduces uniform definitions of the syntactic classes of tokens that can be used in declaration specifiers and related constructs (currently used in some places but not yet in others).
2020-01-12 15:43:30 -06:00
Stephen Heumann
428c991895 Rewrite type specifier parsing.
Type specifiers and type qualifiers can now appear in any order, as specified by the C standards. However, storage class specifiers and function specifiers still cannot be freely mixed with them.
2020-01-07 20:26:56 -06:00
Stephen Heumann
3121a465f1 Implement the _Alignof operator (from C11).
In ORCA/C, the alignment of all object types is 1.
2020-01-06 20:17:29 -06:00
Stephen Heumann
9036a98e1c Implement support for digraphs.
Specifically, the following six punctuator tokens are now supported:

<: :> <% %> %: %:%:

These behave the same as the existing tokens [, ], {, }, #, and ## (respectively), apart from their spelling.

This can be useful when the full ASCII character set cannot easily be displayed or input (e.g. on the IIgs text screen with certain language settings).
2020-01-04 21:49:50 -06:00
Stephen Heumann
6f2eb301e5 Implement C11 _Static_assert mechanism.
This allows code to contain static assertions (checked at compile time).
2020-01-04 18:16:29 -06:00
Stephen Heumann
0184e3db7b Recognize the new keywords from C99 and C11 as such.
Specifically, the following will now be tokenized as keywords:

_Alignas
_Alignof
_Atomic
_Bool
_Complex
_Generic
_Imaginary
_Noreturn
_Static_assert
_Thread_local
restrict

('inline' was also added as a standard keyword in C99, but ORCA/C already treated it as such.)

The parser currently has no support for any of these keywords, so for now errors will still be generated if they are used, but this is a first step toward adding support for them.
2020-01-03 22:48:53 -06:00
Stephen Heumann
ae6de310c7 Avoid storing stale values of __DATE__ or __TIME__ in sym files.
This could happen in some very obscure cases like using these macros for the names of segments or include files. The fix is to just terminate precompiled header generation if they are encountered.
2019-12-24 15:58:12 -06:00
Stephen Heumann
095807517b Fix bug leading to spurious errors in some cases when a sym file is present.
The issue was that invalid sym files could be generated if an #include is encountered within an #if or #ifdef block in the main source file. The fix (for now) is to simply terminate precompiled header generation if such an #include is encountered.

Fixes #2.
2019-12-24 15:45:32 -06:00
Stephen Heumann
60484d6f69 Fix for including system headers via macros.
This makes something like the following work:

#define STDIO_H <stdio.h>
#include STDIO_H

It didn't previously, because workString would be overwritten by NextToken. The effect in this case was that it would erroneously try to include the header <hh>, rather than <stdio.h>.

Detected based on a couple programs from FizzBuzz-C.
2018-09-13 21:59:46 -05:00
Stephen Heumann
95f5ec9c13 Don't print a whole bunch of spaces for an error message if the column number is 0.
This could happen, e.g., for a "'}' expected" error at end-of-file. It occurred because the 0..maxint type being used caused the Pascal compiler to use unsigned comparisons, which were inappropriate here.
2018-09-10 21:55:02 -05:00
Stephen Heumann
15b1c88d44 Give accurate error message if a numeric constant is too long.
Previously, "integer overflow" was reported in this case, even for floating constants.
2018-09-08 14:40:06 -05:00
Stephen Heumann
f6381b7523 Indicate errors at correct positions when the source line contains tabs.
Previously, the error markers would generally be misaligned in this case, because a tab would expand to no spaces (in ORCA/Shell) or multiple spaces (in most other environments), but the error-printing code would use a single space to try to line up with it.

The solution adopted is just to print tabs in the error lines at the positions where they occur in the source lines. The actual amount of space displayed will depend on the console being used, but in any case it should line up correctly with the source line.
2018-09-07 17:48:19 -05:00
Stephen Heumann
d33ac61af3 Fix bug where exit-to-editor may use wrong position in certain cases.
This could happen in certain situations where an error is detected at the end of a line (for example with "cannot redefine a macro" errors).

Fixes #40.
2018-09-06 23:55:25 -05:00
Stephen Heumann
dc1b0aa29f Add lint flag to check for several forms of undefined behavior in computations.
This adds lint bit 5 (a value of 32), which currently enables checking for the following conditions:

*Integer overflow from arithmetic in constant expressions (currently only of type int).
*Invalid constant shift counts (negative, or >= the width of the type)
*Division by (constant) zero.

These (mainly the first two) can be indicative of code that was designed for larger type sizes and needs changes to support 16-bit int.
2018-09-05 23:48:35 -05:00