This is contrary to the C standards, but ORCA/C historically permitted it (as do some other compilers), and I think there is a fair amount of existing code that relies on it.
This is normally 1 (indicating a hosted implementation, where the full standard library is available and the program starts by executing main()), but it is 0 if one of the pragmas for special types of programs with different entry points has been used.
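As an illustration, a program can test this value at compile time. The sketch below assumes the macro being described is the standard __STDC_HOSTED__ (the message above does not name it); is_hosted is a hypothetical helper, not something ORCA/C provides.

```c
/* Returns 1 when compiled as a hosted program, 0 otherwise.
   Assumption: the value discussed above is __STDC_HOSTED__. */
int is_hosted(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__
    return 1;   /* full standard library, execution starts at main() */
#else
    return 0;   /* freestanding: special entry point, limited library */
#endif
}
```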
This generalizes the heuristic approach for checking whether _Noreturn functions could execute to the end of the function, extending it to apply to any function with a non-void return type. These checks use the same #pragma lint bit but give different messages depending on the situation.
This uses a heuristic that may produce both false positives and false negatives, but any false positives should reflect extraneous code at the end of the function that is not actually reachable.
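A small sketch of the kind of code this targets. The function name is hypothetical, and whether a given compiler version flags the pattern described in the comment depends on its heuristic.

```c
/* If the final return were missing, control could reach the closing
   brace when x == 0, which is the situation the check looks for in a
   function with a non-void return type. */
int sign(int x)
{
    if (x > 0)
        return 1;
    else if (x < 0)
        return -1;
    return 0;   /* covers the remaining case, so the end is never reached */
}
```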
We now insert spaces corresponding to whitespace between tokens, and string tokens are enclosed in quotes.
There are still issues with (at least) escape sequences in strings and comments between tokens.
Currently, this only flags return statements, not cases where they may execute to the end of the function. (Whether the function will actually return is not decidable in general, although it may be in special cases.)
This currently checks for:
* Calls to undefined functions (same as bit 0)
* Parameters not declared in K&R-style function definitions
* Declarations or type names with no type specifiers (includes, but is broader than, the condition checked by bit 1)
Previously, the designated initializer syntax could confuse the parser enough to cause null pointer dereferences. This avoids that, and also gives a more meaningful error message to the user.
In the #pragma lint line, the integer indicating the checks to perform can now optionally be followed by a semicolon and another integer. If these are present and the second integer is 0, then the lint checks will be performed, but will be treated as warnings rather than errors, so that they do not cause compilation to fail.
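A minimal sketch of the two-integer form. The value 32 (constant-expression checks) is only an example choice, the function name is hypothetical, and the __ORCAC__ guard is an assumption used here so that other compilers skip the pragma.

```c
#ifdef __ORCAC__
#pragma lint 32;0   /* perform the checks, but report them as warnings */
#endif

/* With a 16-bit int, 200 * 200 overflows. Under the pragma above this
   would be expected to produce a warning rather than stop compilation. */
long product(void)
{
    return 200 * 200;
}
```

Dropping the ";0" part gives the older behavior, in which a failed check is an error.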
These were previously allowed in some cases, but not as the last argument to a macro. Also, stringization and concatenation of them did not behave according to the standards.
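For illustration, assuming "these" refers to empty macro arguments (the message above does not say so explicitly), the standard behavior looks like this. STR, CAT, and the two functions are hypothetical names.

```c
#include <stddef.h>

#define STR(x) #x          /* stringization */
#define CAT(a, b) a ## b   /* concatenation */

/* Stringizing an empty argument yields the empty string "". */
size_t empty_str_len(void)
{
    return sizeof(STR()) - 1;
}

/* An empty last argument concatenates to just the other operand. */
int cat_value(void)
{
    return CAT(42, );
}
```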
In combination with earlier patches, this fixes #53.
Also, if the lint flag requiring explicit function types is set, then also require that K&R-style parameters be explicitly declared with types, rather than not being declared and defaulting to int. (This is a requirement in C99 and later.)
Previously, these would report "identifier expected"; now they correctly say "')' expected".
This introduces a new UnexpectedTokenError procedure that can be used more generally for cases where the expected token may differ based on context.
_Thread_local is recognized but gives a "not supported" error. It could arguably be 'supported' trivially by saying the execution of an ORCA/C program is just one thread and so no special handling is needed, but that likely isn't what someone using it would expect.
There would be a possible issue if a "static" or "typedef" storage class specifier occurred after a type specifier that required memory to be allocated for it, because that memory conceptually might be in the local pool, but static objects are processed at the end of the translation unit, so their types need to stick around. In practice, this should not occur, because the local pool isn't currently used for much (in particular, not for statements or declarations in the body of a function). We give an error in case this somehow might occur.
In combination with preceding commits, this fixes #14. Declaration specifiers can now appear in any order, as required by the C standards.
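A short sketch of declarations that put the storage class specifier after a type specifier. This ordering is valid C, though unusual; all names here are hypothetical.

```c
unsigned typedef long big_t;   /* same as: typedef unsigned long big_t; */
int static counter = 0;        /* same as: static int counter = 0; */
long const static step = 2;    /* same as: static const long step = 2; */

/* Advances the counter by step and returns the new value. */
big_t next_value(void)
{
    counter += (int)step;
    return (big_t)counter;
}
```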
_Bool, _Complex, _Imaginary, _Atomic, restrict, and _Alignas are now recognized in types, but all except restrict and _Alignas will give an error saying they are not supported.
This also introduces uniform definitions of the syntactic classes of tokens that can be used in declaration specifiers and related constructs (currently used in some places but not yet in others).
Type specifiers and type qualifiers can now appear in any order, as specified by the C standards. However, storage class specifiers and function specifiers still cannot be freely mixed with them.
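For example, the following declarations all name the same type, const unsigned long, with the type specifiers and qualifier in different orders. The storage class specifier is kept first, matching the restriction noted above; the names are hypothetical.

```c
static const unsigned long a = 1;
static long const unsigned b = 2;
static int long unsigned const c = 3;

/* Adds the three constants declared above. */
unsigned long total(void)
{
    return a + b + c;
}
```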
Specifically, the following six punctuator tokens are now supported:
<: :> <% %> %: %:%:
These behave the same as the existing tokens [, ], {, }, #, and ## (respectively), apart from their spelling.
This can be useful when the full ASCII character set cannot easily be displayed or input (e.g. on the IIgs text screen with certain language settings).
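A small sketch using all six alternative spellings. COUNT, PASTE, and sum_values are hypothetical names chosen for this example.

```c
%:define COUNT 3
%:define PASTE(a, b) a %:%: b   /* same as: #define PASTE(a, b) a ## b */

/* Defines sum_values(), written without [, ], {, }, #, or ##. */
int PASTE(sum_, values)(void)
<%
    int values<:COUNT:> = <% 1, 2, 3 %>;
    int total = 0;
    int i;

    for (i = 0; i < COUNT; i++)
        total += values<:i:>;
    return total;
%>
```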
Specifically, the following will now be tokenized as keywords:
_Alignas
_Alignof
_Atomic
_Bool
_Complex
_Generic
_Imaginary
_Noreturn
_Static_assert
_Thread_local
restrict
('inline' was also added as a standard keyword in C99, but ORCA/C already treated it as such.)
The parser currently has no support for any of these keywords, so for now errors will still be generated if they are used, but this is a first step toward adding support for them.
This could happen in some very obscure cases like using these macros for the names of segments or include files. The fix is to just terminate precompiled header generation if they are encountered.
The issue was that invalid sym files could be generated if an #include is encountered within an #if or #ifdef block in the main source file. The fix (for now) is to simply terminate precompiled header generation if such an #include is encountered.
Fixes #2.
This makes something like the following work:
#define STDIO_H <stdio.h>
#include STDIO_H
It didn't previously, because workString would be overwritten by NextToken. The effect in this case was that it would erroneously try to include the header <hh>, rather than <stdio.h>.
Detected based on a couple of programs from FizzBuzz-C.
This could happen, e.g., for a "'}' expected" error at end-of-file. It occurred because the 0..maxint type being used caused the Pascal compiler to use unsigned comparisons, which were inappropriate here.
Previously, the error markers would generally be misaligned in this case, because a tab would expand to no spaces (in ORCA/Shell) or multiple spaces (in most other environments), but the error-printing code would use a single space to try to line up with it.
The solution adopted is just to print tabs in the error lines at the positions where they occur in the source lines. The actual amount of space displayed will depend on the console being used, but in any case it should line up correctly with the source line.
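The idea can be sketched in C as follows. This is only an illustration of the approach, not the compiler's actual code: make_marker_line, its parameters, and the '^' marker character are all assumptions.

```c
#include <stddef.h>

/* Builds a marker line for an error at column err_col of src.
   Tabs in src are copied as tabs and every other character becomes a
   space, so the marker lines up however the console expands tabs. */
void make_marker_line(const char *src, size_t err_col,
                      char *out, size_t out_size)
{
    size_t i = 0;

    if (out_size == 0)
        return;
    /* Leave room for the marker and the terminating null character. */
    while (i < err_col && src[i] != '\0' && i + 2 < out_size) {
        out[i] = (src[i] == '\t') ? '\t' : ' ';
        i++;
    }
    if (i + 1 < out_size)
        out[i++] = '^';
    out[i] = '\0';
}
```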
This adds lint bit 5 (a value of 32), which currently enables checking for the following conditions:
* Integer overflow from arithmetic in constant expressions (currently only of type int)
* Invalid constant shift counts (negative, or >= the width of the type)
* Division by (constant) zero
These (mainly the first two) can be indicative of code that was designed for larger type sizes and needs changes to support 16-bit int.
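As a sketch of the kind of change involved, the expressions in the comments below assume a 32-bit int, while the code uses long constants so that it also works when int is 16 bits. The function names are hypothetical.

```c
/* 300 * 200 is 60000, which does not fit in a 16-bit int.
   Using long operands keeps the arithmetic in range. */
long area(void)
{
    return 300L * 200L;
}

/* 1 << 16 has a shift count equal to the width of a 16-bit int.
   Shifting an unsigned long is valid because long is at least 32 bits. */
unsigned long bit16(void)
{
    return 1UL << 16;
}
```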
Mainly, this causes the messages from the format checker to be displayed after the relevant line is printed, along with any other error messages. The wording and formatting of some of the messages is also slightly adjusted, but there should be no substantive change in what is warned about.
Previously, the characters ", \, and ? within string literals were not escaped in #pragma expand output, which could result in them being erroneously interpreted as ending the string literal, starting an escape sequence, or being part of a trigraph (respectively). Also, escape sequences were output in hexadecimal format. Since there is no length limit on hexadecimal escape sequences, this could result in subsequent characters in the string being interpreted as part of the escape sequence.
This fixes the issues by escaping the characters ", \, and ?, and by using three-digit octal escape sequences rather than hexadecimal ones.
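The difference between the two escape formats can be seen in a small sketch. Both literals are meant to hold the character with value 1 followed by the digit '1'; the names are hypothetical.

```c
#include <stddef.h>

/* Octal escapes use at most three digits, so this is two characters:
   the character with value 1, then '1'. */
static const char octal_form[] = "\0011";

/* Hexadecimal escapes have no length limit, so the trailing '1' is
   absorbed and this is one character with value 0x11. */
static const char hex_form[] = "\x011";

size_t octal_len(void) { return sizeof octal_form - 1; }
size_t hex_len(void)   { return sizeof hex_form - 1; }
```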
commit 4265329097538640e9e21202f1b141bcd42a44f3
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 21:45:32 2018 -0400
indent to match standard indent.
commit 783518fbeb01d2df43ef2083d3341004c05e4e2e
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 20:21:15 2018 -0400
clean up the typenames
commit 29b627ecf5ca9b8a143761f85a1807a6ca35ddd9
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 20:18:04 2018 -0400
enable feature_hh, warn about %n with non-int modifier.
commit fc4ac8129e3772c4eda36658e344ec475938369c
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 15:13:47 2018 -0400
warn that %lc, %ls, etc are unsupported.
commit 7e6b433ba0552f7e52f0f034d398e9195c764326
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 13:36:25 2018 -0400
warn about hh/ll modifier (if not supported)
commit 1943c9979d0013f9f38045ec04a962fbf0269f31
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 11:42:41 2018 -0400
use error facilities for format errors.
commit 7811168f56dca1387055574ba8d32638da2fad96
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Thu Mar 22 15:34:21 2018 -0400
add feature flags to disable c99 enhancements until orca lib is updated.
commit c2149cc5953155cfc3c3b4d0483cd25fb946b055
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Thu Mar 22 08:59:10 2018 -0400
Add printf/scanf format checking [WIP]
This parses out the xprintf / xscanf format string and compares it with the function arguments.
enabled via #pragma lint 16.
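A rough sketch of the general idea (not the checker in this commit): walk the format string, count the arguments it expects, and compare that with the number actually passed. count_format_args is a hypothetical helper that ignores argument types and assumes well-formed conversion specifications.

```c
#include <string.h>

/* Counts the arguments a printf-style format string expects.
   "%%" consumes no argument; each '*' width or precision consumes one. */
int count_format_args(const char *fmt)
{
    int count = 0;

    while (*fmt != '\0') {
        if (*fmt++ != '%')
            continue;
        if (*fmt == '%') {          /* literal percent sign */
            fmt++;
            continue;
        }
        /* Skip flags, width, precision, and length modifiers. */
        while (*fmt != '\0' && strchr("diouxXeEfFgGaAcspn", *fmt) == NULL) {
            if (*fmt == '*')
                count++;
            fmt++;
        }
        if (*fmt != '\0') {         /* the conversion character itself */
            count++;
            fmt++;
        }
    }
    return count;
}
```

A real checker would also compare each conversion with the type of the corresponding argument, which is where modifiers such as hh and ll matter.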