Commit Graph

135 Commits

Author SHA1 Message Date
Stephen Heumann a6ef872513 Add debugging option to detect illegal use of null pointers.
This adds debugging code to detect null pointer dereferences, as well as pointer arithmetic on null pointers (which is also undefined behavior, and can lead to later dereferences of the resulting pointers).

Note that ORCA/Pascal can already detect null pointer dereferences as part of its more general range-checking code. This implementation for ORCA/C will report the same error as ORCA/Pascal ("Subrange exceeded"). However, it does not include any of the other forms of range checking that ORCA/Pascal does, and (unlike in ORCA/Pascal) it is controlled by a separate flag from stack overflow checking.
2023-02-12 18:56:02 -06:00
Stephen Heumann 03fc7a43b9 Give an error for expressions with incomplete struct/union types.
These are erroneous, in situations where the expression is used for its value. For function return types, this violates a constraint (C17 6.5.2.2 p1), so a diagnostic is required. We also now diagnose this issue for identifier expressions or unary * (indirection) expressions. These cases cause undefined behavior per C17 6.3.2.1 p2, so a diagnostic is not required, but it is nice to give one.
2023-01-09 21:58:53 -06:00
Stephen Heumann 2958619726 Fix varargs stack repair.
Varargs-only stack repair (i.e. using #pragma optimize bit 3 but not bit 6) was broken by commit 32975b720f. It removed some code that was needed to allocate the direct page location used to hold the stack pointer value in that case. This would lead to invalid code being produced, which could cause a crash when run. The fix is to revert the erroneous parts of commit 32975b720f (which do not affect its core purpose of enabling intermediate code peephole optimization to be used when stack repair code is active).
2023-01-08 15:15:32 -06:00
Stephen Heumann 245dd0a3f4 Add lint check for implicit conversions that change a constant's value.
This occurs when the constant value is out of range of the type being assigned to. This is likely indicative of an error, or of code that assumes types have larger ranges than they do in ORCA/C (e.g. 32-bit int).

This intentionally does not report cases where a value is assigned to a signed type but is within the range of the corresponding unsigned type, or vice versa. These may be done intentionally, e.g. setting an unsigned value to "-1" or setting a signed value using a hex constant with the high bit set. Also, only conversions to 8-bit or 16-bit integer types are currently checked.
2023-01-03 18:57:32 -06:00
Stephen Heumann fe62f70d51 Add lint option to check for unused variables. 2022-12-12 21:47:32 -06:00
Stephen Heumann 32975b720f Allow native code peephole opt to be used when stack repair is enabled.
I think the reason this was originally disallowed is that the old code sequence for stack repair code (in ORCA/C 2.1.0) ended with TYA. If this was followed by STA dp or STA abs, the native code peephole optimizer (prior to commit 7364e2d2d3) would have turned the combination into a STY instruction. That is invalid if the value in A is needed. This could come up, e.g., when assigning the return value from a function to two different variables.

This is no longer an issue, because the current code sequence for stack repair code no longer ends in TYA and is not susceptible to the same kind of invalid optimization. So it is no longer necessary to disable the native code peephole optimizer when using stack repair code (either for all calls or just varargs calls).
2022-12-10 20:34:00 -06:00
Stephen Heumann e71fe5d785 Treat unary + as an actual operator, not a no-op.
This is necessary both to detect errors (using unary + on non-arithmetic types) and to correctly perform the integer promotions when unary + is used (which can be detected with sizeof or _Generic).
2022-12-09 19:03:38 -06:00
Stephen Heumann f027286b6a Do not generate varargs stack repair code if no variable args are passed.
This affects calls to a varargs function that do not actually supply any arguments beyond the fixed portion of the argument list, e.g. printf("foo"). Since these calls do not supply any variable arguments, they clearly do not include any extra variable arguments beyond those used by the function, so the standards-conformance issue that requires varargs stack repair code does not apply.

It is possible that the call may include too few variable arguments, but that is illegal behavior, and it will trash the stack even when varargs stack repair code is present (although in some cases programs may "get away" with it). If stack repair code around these calls is still desired, then general stack repair code can be enabled for all function calls.
2022-12-08 19:27:37 -06:00
Stephen Heumann 3f450bdb80 Support "inline" function definitions without static or extern.
This is a minimal implementation that does not actually inline anything, but it is intended to implement the semantics defined by the C99 and later standards.

One complication is that a declaration that appears somewhere after the function body may create an external definition for a function that appeared to be an inline definition when it was defined. To support this while preserving ORCA/C's general one-pass compilation strategy, we generate code even for inline definitions, but treat them as private and add the prefix "~inline~" to the name. If they are "un-inlined" based on a later declaration, we generate a stub with external linkage that just jumps to the apparently-inline function.
2022-11-19 23:04:22 -06:00
Stephen Heumann 65ec29ee3e Use 32-bit representation for line numbers.
C99 and later specify that line numbers set via #line can be up to 2147483647, so they need to be represented as (at least) a 32-bit value.
2022-10-22 21:46:12 -05:00
Stephen Heumann 1fa3ec8fdd Eliminate global variables for declaration specifiers.
They are now represented in local structures instead. This keeps the representation of declaration specifiers together and eliminates the need for awkward and error-prone code to save and restore the global variables.
2022-10-01 21:28:16 -05:00
Stephen Heumann fd54fd70d0 Remove some unnecessary/duplicate code.
This mainly comments out statements that zero out data that was already set to zero by a preceding Calloc call.
2022-07-18 21:19:44 -05:00
Stephen Heumann 6934c8890d Detect several cases of inappropriate operand types being used with ++ or --. 2022-07-12 18:35:52 -05:00
Stephen Heumann 63d33b47bf Generate valid code for "dereferencing" pointers to void.
This covers code like the following, which is very dubious but does not seem to be clearly prohibited by the standards:

int main(void) {
        void *vp;
        *vp;
}

Previously, this would do an indirect load of a four-byte value at the location, but then treat it as void. This could lead to the four-byte value being left on the stack, eventually causing a crash. Now we just evaluate the pointer expression (in case it has side effects), but effectively cast it to void without dereferencing it.
2022-07-12 18:34:58 -05:00
Stephen Heumann 11a3195c49 Use properly result type for statically evaluated ternary operators.
Also, update the tests to include statically-evaluated cases.
2022-07-04 22:30:25 -05:00
Stephen Heumann 102d6873a3 Fix type checking and result type computation for ? : operator.
This was non-standard in various ways, mainly in regard to pointer types. It has been rewritten to closely follow the specification in the C standards.

Several helper functions dealing with types have been introduced. They are currently only used for ? :, but they might also be useful for other purposes.

New tests are also introduced to check the behavior for the ? : operator.

This fixes #35 (including the initializer-specific case).
2022-06-23 22:05:34 -05:00
Stephen Heumann 15dc3a46c4 Allow casts between long long and pointer types.
This applies to casts in executable code. Some casts in initializers still don't work.
2022-06-20 21:55:54 -05:00
Stephen Heumann 5e20e02d06 Add a function to make a pointer type.
This allows us to refactor out code that was doing this in several places.
2022-06-19 17:55:08 -05:00
Stephen Heumann 58849607a1 Use cgPointerSize for size of pointers in various places.
This makes no practical difference when targeting the GS, but it better documents what the relevant size is.
2022-06-18 19:30:20 -05:00
Stephen Heumann 802ba3b0ba Make unary & always yield a pointer type, not an array.
This affects expressions like &*a (where a is an array) or &*"string". In most contexts, these undergo array-to-pointer conversion anyway, but as an operand of sizeof they do not. This leads to sizeof either giving the wrong value (the size of the array rather than of a pointer) or reporting an error when the array size is not recorded as part of the type (which is currently the case for string constants).

In combination with an earlier patch, this fixes #8.
2022-06-18 18:53:29 -05:00
Stephen Heumann 67ffeac7d4 Use the proper type for expressions like &"string".
These should have a pointer-to-array type, but they were treated like pointers to the first element.
2022-06-17 18:45:11 -05:00
Stephen Heumann 3c2b492618 Add support for compound literals within functions.
The basic approach is to generate a single expression tree containing the code for the initialization plus the reference to the compound literal (or its address). The various subexpressions are joined together with pc_bno pcodes, similar to the code generated for the comma operator. The initializer expressions are placed in a balanced binary tree, so that it is not excessively deep.

Note: Common subexpression elimination has poor performance for very large trees. This is not specific to compound literals, but compound literals for relatively large arrays can run into this issue. It will eventually complete and generate a correct program, but it may be quite slow. To avoid this, turn off CSE.
2022-06-08 21:34:12 -05:00
Stephen Heumann daff1754b2 Make volatile loads from IO/softswitches access exactly the byte(s) specified.
Previously, one-byte loads were typically done by reading a 16-bit value and then masking off the upper 8 bits. This is a problem when accessing softswitches or slot IO locations, because reading the subsequent byte may have some undesired effect. Now, ORCA/C will do an 8-bit read for such cases, if the volatile qualifier is used.

There were also a couple optimizations that could occasionally result in not all the bytes of a larger value actually being read. These are now disabled for volatile loads that may access softswitches or IO.

These changes should make ORCA/C more suitable for writing low-level software like device drivers.
2022-05-23 21:10:29 -05:00
Stephen Heumann 5357e65859 Fix indentation of a few lines. 2022-01-22 18:21:11 -06:00
Stephen Heumann 5871820e0c Support UTF-8/16/32 string literals and character constants (C11).
These have u8, u, or U prefixes, respectively. The types char16_t and char32_t (defined in <uchar.h>) are used for UTF-16 and UTF-32 code points.
2021-10-11 20:54:37 -05:00
Stephen Heumann 1b9955bf8b Allow access to fields from all struct-typed expressions.
This affects field selection expressions where the left expressions is a struct/union assignment or a function call returning a struct or union. Such expressions should be accepted, but they were giving spurious errors.

The following program illustrates the problem:

struct S {int a,b;} x, y={2,3};

struct S f(void) {
        struct S s = {7,8};
        return s;
}

int main(void) {
        return f().a + (x=y).b;
}
2021-09-17 22:04:10 -05:00
Stephen Heumann 7ae830ae7e Initial support for compound literals.
Compound literals outside of functions should work at this point.

Compound literals inside of functions are not fully implemented, so they are disabled for now. (There is some code to support them, but the code to actually initialize them at the appropriate time is not written yet.)
2021-09-16 18:34:55 -05:00
Stephen Heumann 894baac94f Give an error if assigning to a whole struct or union that has a const member.
Such structs or unions are not modifiable lvalues, so they cannot be assigned to as a whole. Any non-const fields can be assigned to individually.
2021-09-11 18:12:58 -05:00
Stephen Heumann 7848e50218 Implement stricter type checks for comparisons.
These rules are used if loose type checks are disabled. They are intended to strictly implement the constraints in C17 sections 6.5.9 and 6.5.10.

This patch also fixes a bug where object pointer comparisons to "const void *" should be permitted but were not.
2021-09-10 21:02:55 -05:00
Stephen Heumann a8682e28d3 Give an error for pointer assignments that discard qualifiers.
This is controlled by #pragma ignore bit 5, which is now a more general "loose type checks" bit.
2021-09-10 17:58:20 -05:00
Stephen Heumann 2f7e71cd24 Treat the fields of const structs as const-qualified.
This causes an error to be produced when trying to assign to these fields, which was being allowed before. It is also necessary for correct behavior of _Generic in some cases.
2021-09-09 18:39:19 -05:00
Stephen Heumann 99f5e2fc87 Avoid leaking memory when processing _Generic expressions. 2021-09-07 19:30:57 -05:00
Stephen Heumann b16210a50b Record volatile and restrict qualifiers in types.
These are needed to correctly distinguish pointer types in _Generic. They should also be used for type compatibility checks in other contexts, but currently are not.

This also fixes a couple small problems related to type qualifiers:
*restrict was not allowed to appear after * in type-names
*volatile status was not properly recorded in sym files

Here is an example of using _Generic to distinguish pointer types based on the qualifiers of the pointed-to type:

#include <stdio.h>

#define f(e) _Generic((e),\
        int * restrict *: 1,\
        int * volatile const *: 2,\
        int **: 3,\
        default: 0)

#define g(e) _Generic((e),\
        int *: 1,\
        const int *: 2,\
        volatile int *: 3,\
        default: 0)

int main(void) {
        int * restrict * p1;
        int * volatile const * p2;
        int * const * p3;

        // should print "1 2 0 1"
        printf("%i %i %i %i\n", f(p1), f(p2), f(p3), f((int * restrict *)0));

        int *q1;
        const int *q2;
        volatile int *q3;
        const volatile int *q4;

        // should print "1 2 3 0"
        printf("%i %i %i %i\n", g(q1), g(q2), g(q3), g(q4));
}

Here is an example of a problem resulting from volatile not being recorded in sym files (if a sym file was present, the read of x was lifted out of the loop):

#pragma optimize -1
static volatile int x;
#include <stdio.h>
int main(void) {
        int y;
        for (unsigned i = 0; i < 100; i++) {
                y = x*2 + 7;
        }
}
2021-08-30 18:19:58 -05:00
Stephen Heumann 2b9d332580 Give an appropriate error for an illegal operator in a constant expression.
This was being reported as an "illegal type cast".
2021-08-22 20:33:34 -05:00
Stephen Heumann debd0ccffc Always allow the middle expression of a ? : expression to use the comma operator.
This should be allowed, but it previously could lead to spurious errors in contexts like argument lists, where a comma would normally be expected to end the expression.

The following example program demonstrated the problem:

#include <stdlib.h>
int main(void) {
        return abs(1 ? 2,-3 : 4);
}
2021-03-16 18:20:42 -05:00
Stephen Heumann 031af54112 Save the original value when doing postfix ++/-- on fp types.
The old code would add 1 and then subtract 1, which does not necessarily give the original value (e.g. if it is much less than 1).
2021-03-09 19:29:55 -06:00
Stephen Heumann 4381b97f86 Report an error if a type name is missing in a _Generic expression. 2021-03-09 17:45:49 -06:00
Stephen Heumann f2414cd815 Create a new function that checks for compatible types strictly according to the C standards.
For now, this is only used for _Generic expressions. Eventually, it should probably replace the current CompTypes, but CompTypes currently performs somewhat looser checks that are suitable for some situations, so adjustments would be needed at some call sites.
2021-03-07 23:39:30 -06:00
Stephen Heumann 2de8ac993e Fix to make _Generic handle struct types properly.
Also, use an existing error message instead of creating a new equivalent one.
2021-03-07 23:35:12 -06:00
Stephen Heumann bccd86a627 Implement _Generic expressions (from C11).
Note that this code relies on CompTypes for type compatibility testing, and it has slightly non-standard behavior in some cases.
2021-03-07 21:59:37 -06:00
Stephen Heumann 979852be3c Use the right types for constants cast to character types.
These were previously treated as having type int. This resulted in incorrect results from sizeof, and would also be a problem for _Generic if it was implemented.

Note that this creates a token kind of "charconst", but this is not the kind for character constants in the source code. Those have type int, so their kind is intconst. The new kinds of "tokens" are created only through casts of constant expressions.
2021-03-07 13:38:21 -06:00
Stephen Heumann 8f8e7f12e2 Distinguish the different types of floating-point constants.
As with expressions, the type does not actually limit the precision and range of values represented.
2021-03-07 00:48:51 -06:00
Stephen Heumann 41623529d7 Keep track of semantic type of floating-point expressions.
Previously, the type was forced to extended in many circumstances. This was visible in that the results of sizeof were incorrect. It would also affect _Generic, if and when that is implemented.

Note that this does not affect the actual format used for computations and storage of intermediates. That is still the extended format.
2021-03-06 23:54:55 -06:00
Stephen Heumann fc515108f4 Make floating-point casts reduce the range and precision of numbers.
The C standards generally allow floating-point operations to be done with extra range and precision, but they require that explicit casts convert to the actual type specified. ORCA/C was not previously doing that.

This patch relies on some new library routines (currently in ORCALib) to do this precision reduction.

This fixes #64.
2021-03-06 22:28:39 -06:00
Stephen Heumann f9f79983f8 Implement the standard pragmas, in particular FENV_ACCESS.
The FENV_ACCESS pragma is now implemented. It causes floating-point operations to be evaluated at run time to the maximum extent possible, so that they can affect and be affected by the floating-point environment. It also disables optimizations that might evaluate floating-point operations at compile time or move them around calls to the <fenv.h> functions.

The FP_CONTRACT and CX_LIMITED_RANGE pragmas are also recognized, but they have no effect. (FP_CONTRACT relates to "contracting" floating-point expressions in a way that ORCA/C does not do, and CX_LIMITED_RANGE relates to complex arithmetic, which ORCA/C does not support.)
2021-03-06 00:57:13 -06:00
Stephen Heumann 4ad7a65de6 Process floating-point values within the compiler using the extended type.
This means that floating-point constants can now have the range and precision of the extended type (aka long double), and floating-point constant expressions evaluated within the compiler also have that same range and precision (matching expressions evaluated at run time). This new behavior is intended to match the behavior specified in the C99 and later standards for FLT_EVAL_METHOD 2.

This fixes the previous problem where long double constants and constant expressions of type long double were not represented and evaluated with the full range and precision that they should be. It also gives extra range and precision to constants and constant expressions of type double or float. This may have pluses and minuses, but at any rate it is consistent with the existing behavior for expressions evaluated at run time, and with one of the possible models of floating point evaluation specified in the C standards.
2021-03-04 23:58:08 -06:00
Stephen Heumann 77d66ab699 Support the predefined identifier __func__ (from C99).
This gives the name of the current function, as if the following definition appeared at the beginning of the function body:

static const char __func__[] = "function-name";
2021-03-02 22:28:28 -06:00
Kelvin Sherlock b39dd0f34c ||, &&, ==, and != ops were clobbering the upper 32-bits
before comparing them.

#if 0xffffffff00000000==0
#error ...
#endif
#if 0xffffffff00000000!=0xffffffff00000000
#error ...
#endif
2021-03-01 18:13:16 -05:00
Stephen Heumann 21f8876f50 In PP expressions, make sure identifiers turn into 0LL. 2021-02-25 21:42:54 -06:00
Stephen Heumann 4020098dd6 Evaluate constant expressions with long long and floating operands.
Note that we currently defer evaluation of such expressions to run time if the long long value cannot be represented exactly in a double, because statically-evaluated floating point expressions use the double format rather than the extended (long double) format used at run time.
2021-02-21 18:43:53 -06:00