Commit Graph

94 Commits

Author SHA1 Message Date
Stephen Heumann 5871820e0c Support UTF-8/16/32 string literals and character constants (C11).
These have u8, u, or U prefixes, respectively. The types char16_t and char32_t (defined in <uchar.h>) are used for UTF-16 and UTF-32 code points.
2021-10-11 20:54:37 -05:00
Stephen Heumann 222c34a385 Fix bug in initialization using string literals with embedded nulls.
When using such a string literal to initialize an array with automatic storage duration, the bytes after the first null would be set to 0, rather than the values from the string literal.

Here is an example program showing the problem:

#include <stdio.h>
int main(void) {
        char s[] = "a\0b";
        puts(s+2);
}
2021-10-11 19:55:09 -05:00
Stephen Heumann 7ae830ae7e Initial support for compound literals.
Compound literals outside of functions should work at this point.

Compound literals inside of functions are not fully implemented, so they are disabled for now. (There is some code to support them, but the code to actually initialize them at the appropriate time is not written yet.)
2021-09-16 18:34:55 -05:00
Stephen Heumann 894baac94f Give an error if assigning to a whole struct or union that has a const member.
Such structs or unions are not modifiable lvalues, so they cannot be assigned to as a whole. Any non-const fields can be assigned to individually.
2021-09-11 18:12:58 -05:00
Stephen Heumann 2614f10ced Make the actual return type of a function be the unqualified version of the type specified.
This is a change that was introduced in C17. However, it actually keeps things closer to ORCA/C's historical behavior, which generally ignored qualifiers in return types.
2021-09-10 18:09:50 -05:00
Stephen Heumann 2f7e71cd24 Treat the fields of const structs as const-qualified.
This causes an error to be produced when trying to assign to these fields, which was being allowed before. It is also necessary for correct behavior of _Generic in some cases.
2021-09-09 18:39:19 -05:00
Stephen Heumann 00cc05a6a1 Move type qualifiers from array types to their element types.
This behavior is specified by the C standards. It can come up when declaring an array using a typedef'd array type and a qualifier.

This is necessary for correct behavior of _Generic, as well as to give an error if code tries to write to const arrays declared in this way.

Here is an example showing these issues:

#define f(e) _Generic((e), int *: 1, const int *:2, default: 0)
int main(void) {
        typedef int A[2][3];
        const A a = {{4, 5, 6}, {7, 8, 9}};
        _Static_assert(f(&a[0][0]) == 2, "qualifier error"); // OK
        a[1][1] = 42; // error
}
2021-08-30 18:30:05 -05:00
Stephen Heumann b16210a50b Record volatile and restrict qualifiers in types.
These are needed to correctly distinguish pointer types in _Generic. They should also be used for type compatibility checks in other contexts, but currently are not.

This also fixes a couple small problems related to type qualifiers:
*restrict was not allowed to appear after * in type-names
*volatile status was not properly recorded in sym files

Here is an example of using _Generic to distinguish pointer types based on the qualifiers of the pointed-to type:

#include <stdio.h>

#define f(e) _Generic((e),\
        int * restrict *: 1,\
        int * volatile const *: 2,\
        int **: 3,\
        default: 0)

#define g(e) _Generic((e),\
        int *: 1,\
        const int *: 2,\
        volatile int *: 3,\
        default: 0)

int main(void) {
        int * restrict * p1;
        int * volatile const * p2;
        int * const * p3;

        // should print "1 2 0 1"
        printf("%i %i %i %i\n", f(p1), f(p2), f(p3), f((int * restrict *)0));

        int *q1;
        const int *q2;
        volatile int *q3;
        const volatile int *q4;

        // should print "1 2 3 0"
        printf("%i %i %i %i\n", g(q1), g(q2), g(q3), g(q4));
}

Here is an example of a problem resulting from volatile not being recorded in sym files (if a sym file was present, the read of x was lifted out of the loop):

#pragma optimize -1
static volatile int x;
#include <stdio.h>
int main(void) {
        int y;
        for (unsigned i = 0; i < 100; i++) {
                y = x*2 + 7;
        }
}
2021-08-30 18:19:58 -05:00
Stephen Heumann fbdbad1f45 Report an error for certain large unsigned enumeration constants.
Enumeration constants must have values representable as an int (i.e. 16-bit signed values, in ORCA/C), but errors were not being reported if code tried to use the values 0xFFFF8000 to 0xFFFFFFFF. This problem could also affect certain larger values of type unsigned long long. The issue stemmed from not properly accounting for whether the constant expression had a signed or unsigned type.

This sample code demonstrated the problem:

enum E {
        a = 0xFFFFFFFF,
        b = 0xFFFF8000,
        y = 0x7FFFFFFFFFFFFFFFull,
        z = 0x8000000000000000
};
2021-07-07 20:06:05 -05:00
Stephen Heumann 979852be3c Use the right types for constants cast to character types.
These were previously treated as having type int. This resulted in incorrect results from sizeof, and would also be a problem for _Generic if it was implemented.

Note that this creates a token kind of "charconst", but this is not the kind for character constants in the source code. Those have type int, so their kind is intconst. The new kinds of "tokens" are created only through casts of constant expressions.
2021-03-07 13:38:21 -06:00
Stephen Heumann f9f79983f8 Implement the standard pragmas, in particular FENV_ACCESS.
The FENV_ACCESS pragma is now implemented. It causes floating-point operations to be evaluated at run time to the maximum extent possible, so that they can affect and be affected by the floating-point environment. It also disables optimizations that might evaluate floating-point operations at compile time or move them around calls to the <fenv.h> functions.

The FP_CONTRACT and CX_LIMITED_RANGE pragmas are also recognized, but they have no effect. (FP_CONTRACT relates to "contracting" floating-point expressions in a way that ORCA/C does not do, and CX_LIMITED_RANGE relates to complex arithmetic, which ORCA/C does not support.)
2021-03-06 00:57:13 -06:00
Stephen Heumann 77d66ab699 Support the predefined identifier __func__ (from C99).
This gives the name of the current function, as if the following definition appeared at the beginning of the function body:

static const char __func__[] = "function-name";
2021-03-02 22:28:28 -06:00
Stephen Heumann cf463ff155 Support switch statements using long long expressions. 2021-02-17 19:41:46 -06:00
Stephen Heumann 32ae4c2e17 Allow unsigned constants in "address+constant" constant expressions.
This affected initializers like the following:

static int a[50];
static int *ip = &a[0] + 2U;

Also, introduce some basic range checks for calculations that are obviously outside the 65816's address space.
2021-02-13 15:36:54 -06:00
Stephen Heumann 47fdd9e370 Implement support for functions returning (unsigned) long long.
These use a new calling convention specific to functions returning these types. When such functions are called, the caller must set the X register to the address within bank 0 that the return value is to be saved to. The function is then responsible for saving it there before returning to the caller.

Currently, the calling code always makes space for the return value on the stack and sets X to point to that. (As an optimization, it would be possible to have the return value written directly to a local variable on the direct page, with no change needed to the function being called, but that has not yet been implemented.)
2021-02-05 23:25:46 -06:00
Stephen Heumann 2408c9602c Make expressionValue a saturating approximation of the true value for long long expressions.
This gives sensible behavior for several things in the parser, e.g. where all negative values or all very large values should be disallowed.
2021-02-04 12:44:44 -06:00
Stephen Heumann 168a06b7bf Add support for emitting 64-bit constants in statically-initialized data. 2021-02-04 02:17:10 -06:00
Stephen Heumann 793f0a57cc Initial support for constants with long long types.
Currently, the actual values they can have are still constrained to the 32-bit range. Also, there are some bits of functionality (e.g. for initializers) that are not implemented yet.
2021-02-03 23:11:23 -06:00
Stephen Heumann 085cd7eb1b Initial code to recognize 'long long' as a type. 2021-01-29 22:27:11 -06:00
Stephen Heumann 52132db18a Implement the _Bool type from C99. 2021-01-25 21:22:58 -06:00
Stephen Heumann 5014fb97f9 Make 32-bit int (with #pragma unix 1) a distinct type from long. 2021-01-24 13:31:12 -06:00
Stephen Heumann c0b2b44cad Add a new representation of C basic types and use it for type checking.
This allows us to distinguish int from short, etc.
2020-03-01 15:00:02 -06:00
Stephen Heumann c84c4d9c5c Check for non-void functions that execute to the end without returning a value.
This generalizes the heuristic approach for checking whether _Noreturn functions could execute to the end of the function, extending it to apply to any function with a non-void return type. These checks use the same #pragma lint bit but give different messages depending on the situation.
2020-02-02 13:50:15 -06:00
Stephen Heumann bc951b6735 Make lint report some more cases where noreturn functions may return.
This uses a heuristic that may produce both false positives and false negatives, but any false positives should reflect extraneous code at the end of the function that is not actually reachable.
2020-01-30 17:35:15 -06:00
Stephen Heumann a4abc5e421 Recognize enum type specifiers as such.
They were erroneously triggering the "type specifier missing" lint warning.
2020-01-29 21:15:33 -06:00
Stephen Heumann 80c513bbf2 Add a lint flag for checking if _Noreturn functions may return.
Currently, this only flags return statements, not cases where they may execute to the end of the function. (Whether the function will actually return is not decidable in general, although it may be in special cases).
2020-01-29 19:26:45 -06:00
Stephen Heumann 4fd642abb4 Add lint check for return with no value in a non-void function.
This is disallowed in C99 and later.
2020-01-29 18:50:45 -06:00
Stephen Heumann a9f5fb13d8 Introduce a new #pragma lint bit for syntax that C99 disallows.
This currently checks for:
*Calls to undefined functions (same as bit 0)
*Parameters not declared in K&R-style function definitions
*Declarations or type names with no type specifiers (includes but is broader than the condition checked by bit 1)
2020-01-29 18:33:19 -06:00
Stephen Heumann ffe6c4e924 Spellcheck comments throughout the code.
There are no non-comment changes.
2020-01-29 17:09:52 -06:00
Stephen Heumann a72b611272 Make unnamed bit-fields take up space.
They should take up the same space as a named bit-field of the same width.

This fixes #60.
2020-01-28 22:54:40 -06:00
Stephen Heumann f5cd1e3e3a Recognize designated initializers enough to give an error and skip them.
Previously, the designated initializer syntax could confuse the parser enough to cause null pointer dereferences. This avoids that, and also gives a more meaningful error message to the user.
2020-01-28 12:48:09 -06:00
Stephen Heumann 9862500dee Give an error if a parameter in a function definition has an incomplete type.
In combination with earlier patches, this fixes #53.

Also, if the lint flag requiring explicit function types is set, then also require that K&R-style parameters be explicitly declared with types, rather than not being declared and defaulting to int. (This is a requirement in C99 and later.)
2020-01-20 12:43:01 -06:00
Stephen Heumann dd92585116 Give errors for most illegal uses of "restrict". 2020-01-19 17:31:20 -06:00
Stephen Heumann 49dea49cb8 Detect and give errors for various illegal uses of _Alignas. 2020-01-19 17:06:01 -06:00
Stephen Heumann a130e79929 Prohibit _Noreturn specifier on non-functions. 2020-01-19 14:57:28 -06:00
Stephen Heumann d10478967f Allow "restrict" in pointer declarators.
Valid uses of "restrict" should now be permitted, but invalid uses do not necessarily give errors (and it is not used for any kind of optimization).
2020-01-19 07:15:35 -06:00
Stephen Heumann b4232fd4ea Flag more appropriate errors about unexpected tokens in type names.
Previously, these would report "identifier expected"; now they correctly say "')' expected".

This introduces a new UnexpectedTokenError procedure that can be used more generally for cases where the expected token may differ based on context.
2020-01-18 16:43:25 -06:00
Stephen Heumann dbe330a7b1 Use centrally-defined token sets to recognize the beginning of declarations. 2020-01-18 15:21:27 -06:00
Stephen Heumann 0f3bb11d22 Remove special-case code for declarations with no declaration specifiers.
This can now be folded into the regular code path.
2020-01-18 15:21:24 -06:00
Stephen Heumann 08b4f8da3e Remove some unnecessary parameters and code.
These are not needed after the refactoring of declaration specifier processing.
2020-01-18 15:20:56 -06:00
Stephen Heumann df029ce06f Handle storage class specifiers in DeclarationSpecifiers.
_Thread_local is recognized but gives a "not supported" error. It could arguably be 'supported' trivially by saying the execution of an ORCA/C program is just one thread and so no special handling is needed, but that likely isn't what someone using it would expect.

There would be a possible issue if a "static" or "typedef" storage class specifier occurred after a type specifier that required memory to be allocated for it, because that memory conceptually might be in the local pool, but static objects are processed at the end of the translation unit, so their types need to stick around. In practice, this should not occur, because the local pool isn't currently used for much (in particular, not for statements or declarations in the body of a function). We give an error in case this somehow might occur.

In combination with preceding commits, this fixes #14. Declaration specifiers can now appear in any order, as required by the C standards.
2020-01-18 14:52:27 -06:00
Stephen Heumann fbe44e1852 Process function specifiers in DeclarationSpecifiers.
This includes both the standard ones (inline and _Noreturn) and the ORCA/C-specific ones (asm and pascal). They can now be freely mixed with other declaration specifiers.

Some errors related to function specifiers are not yet detected.
2020-01-15 07:28:44 -06:00
Stephen Heumann 84767f3340 Prevent output values of DeclarationSpecifiers from being corrupted by a recursive call to it.
This could happen in a declaration like "char _Alignas(long) c;", where typeSpec wound up specifying long rather than char.

Also, tweak error checks for _Alignas and _Atomic.
2020-01-12 17:15:25 -06:00
Stephen Heumann 8341f71ffc Initial phase of support for new C99/C11 type syntax.
_Bool, _Complex, _Imaginary, _Atomic, restrict, and _Alignas are now recognized in types, but all except restrict and _Alignas will give an error saying they are not supported.

This also introduces uniform definitions of the syntactic classes of tokens that can be used in declaration specifiers and related constructs (currently used in some places but not yet in others).
2020-01-12 15:43:30 -06:00
Stephen Heumann 3ce2be9f74 Simplify handling of const and volatile type qualifiers.
These qualifiers were previously sometimes accepted between the name and left brace of struct and enum type specifiers. This was non-standard and is no longer allowed.
2020-01-08 13:06:49 -06:00
Stephen Heumann 428c991895 Rewrite type specifier parsing.
Type specifiers and type qualifiers can now appear in any order, as specified by the C standards. However, storage class specifiers and function specifiers still cannot be freely mixed with them.
2020-01-07 20:26:56 -06:00
Stephen Heumann 06a027b237 Rename TypeSpecifier to DeclarationSpecifiers, consistent with C standard terminology.
Also remove its first argument, which was unused. There is no functional change yet.
2020-01-06 20:18:58 -06:00
Stephen Heumann a9fb1ba482 Move TypeName to be a top-level method in Parser.pas.
As of C11, type names are now used as part of the declaration syntax (in _Alignas and _Atomic specifiers), in addition to their uses in expressions. Moving the TypeName method will allow it to be called when processing declarations.
2020-01-06 20:18:58 -06:00
Stephen Heumann 6f2eb301e5 Implement C11 _Static_assert mechanism.
This allows code to contain static assertions (checked at compile time).
2020-01-04 18:16:29 -06:00
Stephen Heumann 9030052616 When initializing bitfields of type long, do not treat their values as pointer constants.
This was inappropriate and would lead to memory trashing.

Fixes case 3 in issue #59.
2019-12-24 18:52:25 -06:00