Commit Graph

57 Commits

Author SHA1 Message Date
Stephen Heumann d24dacf01a Add initial support for universal character names.
This currently only works in character constants or strings, not identifiers.
2020-01-19 23:59:54 -06:00
Stephen Heumann 6e89dc5883 Give a basic error message for use of _Generic. 2020-01-19 18:03:21 -06:00
Stephen Heumann dd92585116 Give errors for most illegal uses of "restrict". 2020-01-19 17:31:20 -06:00
Stephen Heumann 49dea49cb8 Detect and give errors for various illegal uses of _Alignas. 2020-01-19 17:06:01 -06:00
Stephen Heumann a130e79929 Prohibit _Noreturn specifier on non-functions. 2020-01-19 14:57:28 -06:00
Stephen Heumann b4232fd4ea Flag more appropriate errors about unexpected tokens in type names.
Previously, these would report "identifier expected"; now they correctly say "')' expected".

This introduces a new UnexpectedTokenError procedure that can be used more generally for cases where the expected token may differ based on context.
2020-01-18 16:43:25 -06:00
Stephen Heumann df029ce06f Handle storage class specifiers in DeclarationSpecifiers.
_Thread_local is recognized but gives a "not supported" error. It could arguably be 'supported' trivially by saying the execution of an ORCA/C program is just one thread and so no special handling is needed, but that likely isn't what someone using it would expect.

There would be a possible issue if a "static" or "typedef" storage class specifier occurred after a type specifier that required memory to be allocated for it, because that memory conceptually might be in the local pool, but static objects are processed at the end of the translation unit, so their types need to stick around. In practice, this should not occur, because the local pool isn't currently used for much (in particular, not for statements or declarations in the body of a function). We give an error in case this somehow might occur.

In combination with preceding commits, this fixes #14. Declaration specifiers can now appear in any order, as required by the C standards.
2020-01-18 14:52:27 -06:00
Stephen Heumann 8341f71ffc Initial phase of support for new C99/C11 type syntax.
_Bool, _Complex, _Imaginary, _Atomic, restrict, and _Alignas are now recognized in types, but all except restrict and _Alignas will give an error saying they are not supported.

This also introduces uniform definitions of the syntactic classes of tokens that can be used in declaration specifiers and related constructs (currently used in some places but not yet in others).
2020-01-12 15:43:30 -06:00
Stephen Heumann 428c991895 Rewrite type specifier parsing.
Type specifiers and type qualifiers can now appear in any order, as specified by the C standards. However, storage class specifiers and function specifiers still cannot be freely mixed with them.
2020-01-07 20:26:56 -06:00
Stephen Heumann 3121a465f1 Implement the _Alignof operator (from C11).
In ORCA/C, the alignment of all object types is 1.
2020-01-06 20:17:29 -06:00
Stephen Heumann 9036a98e1c Implement support for digraphs.
Specifically, the following six punctuator tokens are now supported:

<: :> <% %> %: %:%:

These behave the same as the existing tokens [, ], {, }, #, and ## (respectively), apart from their spelling.

This can be useful when the full ASCII character set cannot easily be displayed or input (e.g. on the IIgs text screen with certain language settings).
2020-01-04 21:49:50 -06:00
Stephen Heumann 6f2eb301e5 Implement C11 _Static_assert mechanism.
This allows code to contain static assertions (checked at compile time).
2020-01-04 18:16:29 -06:00
Stephen Heumann 0184e3db7b Recognize the new keywords from C99 and C11 as such.
Specifically, the following will now be tokenized as keywords:

_Alignas
_Alignof
_Atomic
_Bool
_Complex
_Generic
_Imaginary
_Noreturn
_Static_assert
_Thread_local
restrict

('inline' was also added as a standard keyword in C99, but ORCA/C already treated it as such.)

The parser currently has no support for any of these keywords, so for now errors will still be generated if they are used, but this is a first step toward adding support for them.
2020-01-03 22:48:53 -06:00
Stephen Heumann ae6de310c7 Avoid storing stale values of __DATE__ or __TIME__ in sym files.
This could happen in some very obscure cases like using these macros for the names of segments or include files. The fix is to just terminate precompiled header generation if they are encountered.
2019-12-24 15:58:12 -06:00
Stephen Heumann 095807517b Fix bug leading to spurious errors in some cases when a sym file is present.
The issue was that invalid sym files could be generated if an #include is encountered within an #if or #ifdef block in the main source file. The fix (for now) is to simply terminate precompiled header generation if such an #include is encountered.

Fixes #2.
2019-12-24 15:45:32 -06:00
Stephen Heumann 60484d6f69 Fix for including system headers via macros.
This makes something like the following work:

#define STDIO_H <stdio.h>
#include STDIO_H

It didn't previously, because workString would be overwritten by NextToken. The effect in this case was that it would erroneously try to include the header <hh>, rather than <stdio.h>.

Detected based on a couple programs from FizzBuzz-C.
2018-09-13 21:59:46 -05:00
Stephen Heumann 95f5ec9c13 Don't print a whole bunch of spaces for an error message if the column number is 0.
This could happen, e.g., for a "'}' expected" error at end-of-file. It occurred because the 0..maxint type being used caused the Pascal compiler to use unsigned comparisons, which were inappropriate here.
2018-09-10 21:55:02 -05:00
Stephen Heumann 15b1c88d44 Give accurate error message if a numeric constant is too long.
Previously, "integer overflow" was reported in this case, even for floating constants.
2018-09-08 14:40:06 -05:00
Stephen Heumann f6381b7523 Indicate errors at correct positions when the source line contains tabs.
Previously, the error markers would generally be misaligned in this case, because a tab would expand to no spaces (in ORCA/Shell) or multiple spaces (in most other environments), but the error-printing code would use a single space to try to line up with it.

The solution adopted is just to print tabs in the error lines at the positions where they occur in the source lines. The actual amount of space displayed will depend on the console being used, but in any case it should line up correctly with the source line.
2018-09-07 17:48:19 -05:00
Stephen Heumann d33ac61af3 Fix bug where exit-to-editor may use wrong position in certain cases.
This could happen in certain situations where an error is detected at the end of a line (for example with "cannot redefine a macro" errors).

Fixes #40.
2018-09-06 23:55:25 -05:00
Stephen Heumann dc1b0aa29f Add lint flag to check for several forms of undefined behavior in computations.
This adds lint bit 5 (a value of 32), which currently enables checking for the following conditions:

*Integer overflow from arithmetic in constant expressions (currently only of type int).
*Invalid constant shift counts (negative, or >= the width of the type)
*Division by (constant) zero.

These (mainly the first two) can be indicative of code that was designed for larger type sizes and needs changes to support 16-bit int.
2018-09-05 23:48:35 -05:00
Stephen Heumann 55dbc718c1 Small format checker adjustments.
Format checking for "%p" is improved: in the case of scanf, the corresponding argument must be a pointer to a pointer.
2018-09-02 15:12:52 -05:00
Stephen Heumann 69f086367c Adjust how messages from the printf/scanf format checker are displayed.
Mainly, this causes the messages from the format checker to be displayed after the relevant line is printed, along with any other error messages. The wording and formatting of some of the messages is also slightly adjusted, but there should be no substantive change in what is warned about.
2018-09-01 19:59:52 -05:00
Stephen Heumann 9ff3407c60 Avoid producing invalid string literals in #pragma expand output.
Previously, the characters ", /, and ? within string literals were not escaped in #pragma expand output, which could result in them being erroneously interpreted as ending the string literal, starting an escape sequence, or being part of a trigraph (respectively). Also, escape sequences were output in hexadecimal format. Since there is no length limit on hexadecimal escape sequences, this could result in subsequent characters in the string being interpreted as part of the escape sequence.

This fixes the issues by escaping the characters ", /, and ?, and by using three-digit octal escape sequences rather than hexadecimal ones.
2018-09-01 16:11:18 -05:00
Stephen Heumann a6f1211ee6 Properly treat #line directive as giving the next line number, not the current one. 2018-08-31 21:46:10 -05:00
Stephen Heumann caabb5addf Allow declarations in first clause of for loop (C99). 2018-04-01 16:48:11 -05:00
Stephen Heumann 275e1f080b Add a new flag to control whether mixed declarations are allowed and C99 scope rules are used.
#pragma ignore bit 4 (a value of 16) now controls these. It is on by default (allowing them), but turning it off will restore the C89 rules.
2018-04-01 14:14:18 -05:00
Kelvin Sherlock 6c1ccc5c0d Squashed commit of the following:
commit 4265329097538640e9e21202f1b141bcd42a44f3
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date:   Fri Mar 23 21:45:32 2018 -0400

    indent to match standard indent.

commit 783518fbeb01d2df43ef2083d3341004c05e4e2e
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date:   Fri Mar 23 20:21:15 2018 -0400

    clean up the typenames

commit 29b627ecf5ca9b8a143761f85a1807a6ca35ddd9
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date:   Fri Mar 23 20:18:04 2018 -0400

    enable feature_hh, warn about %n with non-int modifier.

commit fc4ac8129e3772c4eda36658e344ec475938369c
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date:   Fri Mar 23 15:13:47 2018 -0400

    warn thar %lc, %ls, etc are unsupported.

commit 7e6b433ba0552f7e52f0f034d398e9195c764326
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date:   Fri Mar 23 13:36:25 2018 -0400

    warn about hh/ll modifier (if not supported)

commit 1943c9979d0013f9f38045ec04a962fbf0269f31
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date:   Fri Mar 23 11:42:41 2018 -0400

    use error facilities for format errors.

commit 7811168f56dca1387055574ba8d32638da2fad96
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date:   Thu Mar 22 15:34:21 2018 -0400

    add feature flags to disable c99 enhancements until orca lib is updated.

commit c2149cc5953155cfc3c3b4d0483cd25fb946b055
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date:   Thu Mar 22 08:59:10 2018 -0400

    Add printf/scanf format checking [WIP]

    This parses out the xprintf / xscanf format string and compares it with the function arguments.

    enabled via #pragma lint 16.
2018-03-23 21:51:27 -04:00
Stephen Heumann c55acd1150 Give an error if the element type of an array type is an incomplete or function type. 2018-03-06 22:46:23 -06:00
Stephen Heumann bdd60d9d08 When using varargs stack repair, only disable native-code peephole opt in functions containing varargs calls.
There is no need to reduce the optimization in other functions, which will not contain any varargs stack repair code.
2018-01-13 21:37:28 -06:00
Stephen Heumann 5152790b00 Don't inappropriately re-expand a macro's name when it appears within the macro.
This should implement the C standard rules about making macro name tokens ineligible for replacement, except that it does not currently handle cases of nested replacements (such as a cycle of mutually-referential macros).

This fixes #12. There are still a couple other bugs with macro expansion in obscure cases, but I'll consider them separate issues.
2017-12-05 23:28:57 -06:00
Stephen Heumann 14cb6c6b8f Merge commit '3c09d1c4ff3ae3afc9afe0f44d58c5502ea953ed' 2017-11-12 20:08:26 -06:00
Stephen Heumann 8d31481182 Report errors for illegal pointer arithmetic operations.
These include arithmetic on pointers to incomplete types or functions, as well as subtraction of pointers to incompatible types.
2017-10-29 20:21:36 -05:00
Kelvin Sherlock 3c09d1c4ff add allowTokensAfterEndif setting for pragma ignore (from MPW ORCA/C IIgs) [WIP] 2017-10-28 20:19:00 -04:00
Stephen Heumann a69fc2be59 Restrict octal escape sequences in character constants and strings to at most three octal digits.
This is what is required by the C standards.

This partially reverts a change in ORCA/C 2.1.0, which should only have been applied to hexadecimal escape sequences.
2017-10-21 20:36:21 -05:00
Stephen Heumann 65009db03f Implement #warning preprocessor directive.
This prints a warning message, but does not abort compilation.

#warning is non-standard, but supported by other common compilers like GCC and Clang.
2017-10-21 20:36:21 -05:00
Stephen Heumann 3ce69a4070 Add support for binary constants.
This is a patch from Kelvin Sherlock, with minor changes.
2017-10-21 20:36:21 -05:00
Stephen Heumann b5bad4da72 Add support for multi-character character constants.
This is based on a patch from Kelvin Sherlock, and in turn on code from MPW IIgs ORCA/C, but with modifications to be more standards-compliant.

Bit 1 in #pragma ignore controls a new option to (non-standardly) treat character constants with three or more characters as having type long, so they can contain up to four bytes.

Note that this patch orders the bytes the opposite way from MPW IIgs ORCA/C, but the same way as GCC and Clang.
2017-10-21 20:36:21 -05:00
Stephen Heumann e7cc513ad4 Add support for inline procedure names as documented in IIgs tech note #103.
These are enabled when bit 15 is set in the #pragma debug directive.

Support is still needed to ensure these work properly with pre-compiled headers.

This patch is from Kelvin Sherlock.
2017-10-21 20:36:21 -05:00
Stephen Heumann d9523c145c Allow unknown preprocessor directives in skipped blocks.
For example, the following should not generate an error:

#if 0
#warning "..."
#endif
2017-10-21 20:36:21 -05:00
Stephen Heumann ccd653ddb9 Move some more code out of the blank segment to make space for static data. 2017-10-21 20:36:21 -05:00
Stephen Heumann 8ca3d5f4f0 Allow skipped code to contain pp-numbers that are not valid numeric constants.
The C standards define "pp-number" tokens handled by the preprocessor using a syntax that encompasses various things that aren't valid integer or floating constants, or are constants too large for ORCA/C to handle. These cases would previously give errors even in code skipped by the preprocessor. With this patch, most such errors in skipped code are now ignored.

This is useful, e.g., to allow for #ifdefed-out code containing 64-bit constants.

There are still some cases involving pp-numbers that should be allowed but aren't, particularly in the context of macros.
2017-10-21 20:36:21 -05:00
Stephen Heumann 227731a1a8 Allow "static inline" function declarations.
This should give C99-compatible behavior, as far as it goes. The functions aren't actually inlined, but that's just a quality-of-implementation issue. No C standard requires actual inlining.

Non-static inline functions are still not supported. The C99 semantics for them are more complicated, and they're less widely used, so they're a lower priority for now.

The "inline" function specifier can currently only come after the "static" storage class specifier. This relates to a broader issue where not all legal orderings of declaration specifiers are supported.

Since "inline" was already treated as a keyword in ORCA/C, this shouldn't create any extra compatibility issues for C89 code.
2017-10-21 20:36:21 -05:00
Stephen Heumann 6ea43d34a1 Skip tokens following preprocessing directives on lines that are skipped.
This allows code like the following to compile:

#if 0
#if some bogus stuff !
#endif
#endif

This is what the C standards require. The change affects #if, #ifdef, and #ifndef directives.

This may be needed to handle code targeted at other compilers that allow pseudo-functions such as "__has_feature" in preprocessor expressions.
2017-10-21 20:36:21 -05:00
Stephen Heumann 4c0b02f32e Always convert keywords and typedef names to identifiers when processing preprocessor expressions.
This avoids various problems with inappropriately processing these elements, which should not be recognized as such at preprocessing time. For example, the following program should compile without errors, but did not:

typedef long foo;
#if int+1
#if foo-1
int main(void) {}
#endif
#endif
2017-10-21 20:36:21 -05:00
Stephen Heumann d0b4b75970 Fix problem where if a macro's name appeared inside the macro, it would be expanded repeatedly, leading to a crash.
This is a problem introduced by the scanner changes between ORCA/C 2.1.0 and ORCA/C 2.1.1 B3.

The following examples demonstrate the problem:

#define m m
m

#define f(x) f(x)
f(a)
2017-10-21 20:36:21 -05:00
Stephen Heumann 5618b2810e Don’t produce spurious error messages when #error is used with tokens other than a string constant.
According to the C standards, #error is supposed to take an arbitrary sequence of pp-tokens, not just a string constant.
2017-10-21 20:36:21 -05:00
Stephen Heumann 8b4c83f527 Give an error when trying to use sizeof on incomplete struct or union types.
The following demonstrates cases that would erroneously be allowed (and misleadingly give a size of 0) before:

#include <stdio.h>
struct s *S;
int main(void)
{
        printf("%lu %lu\n", sizeof(struct s), sizeof *S);
}
2017-10-21 20:36:21 -05:00
Stephen Heumann 45cc0a0721 Prevent struct/union members from having incomplete type, except for flexible array member.
ORCA/C previously allowed struct/union members to be declared with incomplete type. Because of this, it allowed C99-style flexible array members to be declared, albeit by accident rather than by design. In some basic testing, these seem to work correctly, except that they could be initialized and that would give rise to odd behavior.

I have restricted it to allowing flexible array members only in the cases allowed by C99/C11, and otherwise disallowing members with incomplete type. I have also prohibited initializing flexible array members.
2017-10-21 20:36:20 -05:00
Stephen Heumann 4247eb2c91 Fix bug where line number was off by one when using defaults.h.
This fixes the compco11.c test case.
2017-10-21 20:36:20 -05:00