168 Commits

Author SHA1 Message Date
Stephen Heumann a6ef872513 Add debugging option to detect illegal use of null pointers.
This adds debugging code to detect null pointer dereferences, as well as pointer arithmetic on null pointers (which is also undefined behavior, and can lead to later dereferences of the resulting pointers).

Note that ORCA/Pascal can already detect null pointer dereferences as part of its more general range-checking code. This implementation for ORCA/C will report the same error as ORCA/Pascal ("Subrange exceeded"). However, it does not include any of the other forms of range checking that ORCA/Pascal does, and (unlike in ORCA/Pascal) it is controlled by a separate flag from stack overflow checking.
2023-02-12 18:56:02 -06:00
Stephen Heumann 245dd0a3f4 Add lint check for implicit conversions that change a constant's value.
This occurs when the constant value is out of range of the type being assigned to. This is likely indicative of an error, or of code that assumes types have larger ranges than they do in ORCA/C (e.g. 32-bit int).

This intentionally does not report cases where a value is assigned to a signed type but is within the range of the corresponding unsigned type, or vice versa. These may be done intentionally, e.g. setting an unsigned value to "-1" or setting a signed value using a hex constant with the high bit set. Also, only conversions to 8-bit or 16-bit integer types are currently checked.
2023-01-03 18:57:32 -06:00
Stephen Heumann 9f36e99194 Document __useTimeTool and add a declaration for it. 2023-01-02 18:10:41 -06:00
Stephen Heumann 5476118951 Add documentation and headers for timespec_get.
A macro is used to control whether struct timespec is declared, because GNO might want to declare it in other headers, and this would allow it to avoid duplicate declarations. (This will still require changes in the GNO headers. Currently, they declare struct timespec with different field names, although the layout is the same.)
2023-01-01 21:46:19 -06:00
Stephen Heumann 59664df9d9 Document <time.h> bug fixes. 2023-01-01 21:44:02 -06:00
Stephen Heumann f7a139b4b5 Document use of Time Tool Set by gmtime and strftime.
Also include some tests for strftime %z and %Z conversions (although just producing no output will satisfy them).
2022-12-28 19:57:19 -06:00
Stephen Heumann 7d3f1c8dd7 Add headers, documentation, and tests for tgamma(). 2022-12-24 20:21:31 -06:00
Stephen Heumann 265a16d2f5 Add headers, documentation, and tests for erf() and erfc(). 2022-12-17 22:26:59 -06:00
Stephen Heumann 4bc486eade Do not require unused static functions to be defined.
This mostly implements the rule in C17 6.9 p3, which requires a definition to be provided only if the function is used in an expression. Per that rule, we should also exclude most sizeof or _Alignof operands, but we don't do that yet.
2022-12-12 22:10:36 -06:00
Stephen Heumann fe62f70d51 Add lint option to check for unused variables. 2022-12-12 21:47:32 -06:00
Stephen Heumann 17936a14ed Rework root file code for CDevs to avoid leaking user IDs.
Formerly, the code would allocate user IDs but never free them. The result was that one user ID was leaked for each time a CDev was opened and closed.

The new root code calls new cleanup code in ORCALib, which detects if the CDev is going away and deallocates its user ID if so.
2022-12-11 22:01:29 -06:00
Stephen Heumann ecca7a7737 Never make the segment in the root file dynamic.
This would previously happen if a segment directive with "dynamic" appeared before the first function in the program. That would cause the resulting program not to work, because the root segment needs to be a static segment at the start of the program, but if it is dynamic it would come after a jump table and a static segment of library code.

The root segments are also configured to refer to main or the NDA/CDA entry points using LEXPR records, so that they can be in dynamic segments (not that they necessarily should be). That change is intentionally not done for CDEV/XCMD/NBA, because they use code resources, which do not support dynamic segments, so it is better to force a linker error in these cases.
2022-12-11 14:46:38 -06:00
Stephen Heumann 32975b720f Allow native code peephole opt to be used when stack repair is enabled.
I think the reason this was originally disallowed is that the old code sequence for stack repair code (in ORCA/C 2.1.0) ended with TYA. If this was followed by STA dp or STA abs, the native code peephole optimizer (prior to commit 7364e2d2d3) would have turned the combination into a STY instruction. That is invalid if the value in A is needed. This could come up, e.g., when assigning the return value from a function to two different variables.

This is no longer an issue, because the current code sequence for stack repair code no longer ends in TYA and is not susceptible to the same kind of invalid optimization. So it is no longer necessary to disable the native code peephole optimizer when using stack repair code (either for all calls or just varargs calls).
2022-12-10 20:34:00 -06:00
Stephen Heumann 7364e2d2d3 Fix issue with native code optimization of TYA+STA.
This would be changed to STY, but that is invalid if the A value is needed afterward. This could affect the code for certain division operations (after the optimizations in commit 4470626ade).

Here is an example that would be miscompiled:

#pragma optimize -1
#include <stdio.h>
int main(void) {
        unsigned i = 55555;
        unsigned a,b;
        a = b = i / 10000;
        printf("%u %u\n", a,b);
}

Also, remove MVN from the list of "ASafe" instructions since it really isn't, although I don't think this was affecting anything in practice.
2022-12-10 19:37:48 -06:00
Stephen Heumann e71fe5d785 Treat unary + as an actual operator, not a no-op.
This is necessary both to detect errors (using unary + on non-arithmetic types) and to correctly perform the integer promotions when unary + is used (which can be detected with sizeof or _Generic).
2022-12-09 19:03:38 -06:00
Stephen Heumann bb1bd176f4 Add a command-line option to select the C standard to use.
This provides a more straightforward way to place the compiler in a "strict conformance" mode. This could essentially be achieved by setting several pragma options, but having a single setting is simpler. "Compatibility modes" for older standards can also be selected, although these actually continue to enable most C17 features (since they are unlikely to cause compatibility problems for older code).
2022-12-07 21:35:15 -06:00
Stephen Heumann 8e1db102eb Allow line continuations within // comments.
This is what the standards specify.
2022-12-04 23:16:06 -06:00
Stephen Heumann c06d78bb5e Add __STDC_VERSION__ macro.
With the addition of designated initializers, ORCA/C now supports all the major mandatory language features added between C90 and C17, apart from those made optional by C11. There are still various small areas of nonconformance and a number of missing library functions, but at this point it is reasonable for ORCA/C to report itself as being a C17 implementation.
2022-12-04 22:25:02 -06:00
Stephen Heumann 2550081517 Fix bug with 4-byte comparisons against globals in large memory model.
Long addressing was not being used to access the values, which could lead to mis-evaluation of comparisons against values in global structs, unions, or arrays, depending on the memory layout.

This could sometimes affect the c99desinit.c test, when run with large memory model and at least intermediate code peephole optimization. It could also affect this simpler test (depending on memory layout):

#pragma memorymodel 1
#pragma optimize 1
struct S {
        void *p;
} s =  {&s};
int main(void) {
        return s.p != &s; /* should be 0 */
}
2022-12-04 21:54:29 -06:00
Stephen Heumann 736e7575cf Fix issues with type conversions in static initialization.
* Initialization of floating-point variables from unsigned long expressions with value > LONG_MAX would give the wrong value.
* Initialization of floating-point variables from (unsigned) long long expressions would give the wrong value.
* Initialization of _Bool variables should give 0 or 1, as per the usual rules for conversion to _Bool.
* Initialization of integer variables from floating-point expressions should be allowed, applying the usual conversions.
2022-12-04 16:36:16 -06:00
Stephen Heumann 7c0492cfa4 Document designated initializers in the release notes. 2022-12-03 18:04:50 -06:00
Stephen Heumann ac741e26ab Allow nested auto structs/unions to be initialized with an expression of the same type.
When the expression is initially parsed, we do not necessarily know whether it is the initializer for the struct/union or for its first member. That needs to be determined based on the type. To support that, a new function is added to evaluate the expression separately from using it to initialize an object.
2022-11-29 13:19:59 -06:00
Stephen Heumann 740468f75c Avoid generating invalid .sym files if header ends with a partial prototyped function decl.
This could happen because the nested calls to DoDeclaration for the parameters would set inhibitHeader to false.
2022-11-26 14:20:58 -06:00
Stephen Heumann 2bf3862e5d Avoid generating invalid .sym files if header ends with a partial declaration.
The part of the declaration within the header could be ignored on subsequent compilations using the .sym file, which could lead to errors or misbehavior.

(This also applies to headers that end in the middle of a _Static_assert(...) or segment directive.)
2022-11-26 00:18:57 -06:00
Stephen Heumann 3f450bdb80 Support "inline" function definitions without static or extern.
This is a minimal implementation that does not actually inline anything, but it is intended to implement the semantics defined by the C99 and later standards.

One complication is that a declaration that appears somewhere after the function body may create an external definition for a function that appeared to be an inline definition when it was defined. To support this while preserving ORCA/C's general one-pass compilation strategy, we generate code even for inline definitions, but treat them as private and add the prefix "~inline~" to the name. If they are "un-inlined" based on a later declaration, we generate a stub with external linkage that just jumps to the apparently-inline function.
2022-11-19 23:04:22 -06:00
Stephen Heumann e168a4d6cb Treat static followed by extern declarations as specifying internal linkage.
See C17 section 6.2.2 p4-5.
2022-11-06 21:19:47 -06:00
Stephen Heumann 83147655d2 Revise NewSymbol to more closely align with standards.
Function declarations within a block are now entered within its symbol table rather than moved to the global one. Several error checks are also added or tightened.

This fixes at least one bug: if a function declared within a block had the same name as a variable in an outer scope, the symbol table entry for that variable could be corrupted, leading to spurious errors or incorrect code generation. This example program illustrates the problem:

/* This should compile without errors and return 2 */
int f(void) {return 1;}
int g(void) {return 2;}
int main(void) {
        int (*f)(void) = g;
        {
                int f(void);
        }
        f = g;
        return f();
}

Errors now detected include:
* Duplicate declarations of a static variable within a block (with the second one initialized)
* Duplicate declarations of the same variable as static and non-static
* Declaration of the same identifier as a typedef and a variable (at file scope)
2022-11-06 20:50:25 -06:00
Stephen Heumann 9a7dc23c5d When a symbol is multiply declared, form the composite type.
Previously, it generally just used the later type (except for function types where only the earlier one included a prototype). One effect of this is that if a global array is first declared with a size and then redeclared without one, the size information is lost, causing the proper space not to be allocated.

See C17 section 6.2.7 p4.

Here is an example affected by the array issue (dump the object file to see the size allocated):

int foo[50];
int foo[];
2022-10-30 18:54:40 -05:00
Stephen Heumann f31b5ea1e6 Allow "extern inline" functions.
A function declared "inline" with an explicit "extern" storage class has the same semantics as if "inline" was omitted. (It is not an inline definition as defined in the C standards.) The "inline" specifier suggests that the function should be inlined, but it is legal to just ignore it, as we already do for "static inline" functions.

Also add a test for the inline function specifier.
2022-10-29 19:43:57 -05:00
Stephen Heumann 913052fe7c Add documentation and tests for _Pragma. 2022-10-29 16:02:38 -05:00
Stephen Heumann e63d827049 Do not do macro expansion on preprocessor directive names.
According to the C standards (C17 section 6.10.3 p8), they should not be subject to macro replacement.

A similar change also applies to the "STDC" in #pragma STDC ... (but we still allow macros for other pragmas, which is allowed as part of the implementation-defined behavior of #pragma).

Here is an example affected by this issue:

#define ifdef ifndef
#ifdef foobar
#error "foobar defined?"
#else
int main(void) {}
#endif
2022-10-25 22:40:20 -05:00
Stephen Heumann 81353a9f8a Always interpret the digit sequence in #line as decimal.
This is what the standards call for.
2022-10-23 13:47:59 -05:00
Stephen Heumann e3a3548443 Fix line numbering via #line when using a .sym file.
The line numbering would be off by one in this case.
2022-10-22 21:56:16 -05:00
Stephen Heumann 6d8ca42734 Parse the _Thread_local storage-class specifier.
This does not really do anything, because ORCA/C does not support multithreading, but the C11 and later standards indicate it should be allowed anyway.
2022-10-18 21:01:26 -05:00
Stephen Heumann 99e268e3b9 Implement support for anonymous structures and unions (C11).
Note that this implementation allows anonymous structures and unions to participate in initialization. That is, you can have a braced initializer list corresponding to an anonymous structure or union. Also, anonymous structures within unions follow the initialization rules for structures (and vice versa).

I think the better interpretation of the standard text is that anonymous structures and unions cannot participate in initialization as such, and instead their members are treated as members of the containing structure or union for purposes of initialization. However, all other compilers I am aware of allow anonymous structures and unions to participate in initialization, so I have implemented it that way too.
2022-10-16 18:44:19 -05:00
Stephen Heumann 83ac0ecebf Add a function to peek at the next character.
This is necessary to correctly handle line continuations in a few places:
* Between an initial . and the subsequent digit in a floating constant
* Between the third and fourth characters of a %:%: digraph
* Between the second and third dots of a ... token

Previously, these would not be tokenized correctly, leading to spurious errors in the first and second cases above.

Here is a sample program illustrating the problem:

int printf(const char * restrict, ..\
\
??/
.);
int main(void) {
        double d = .??/
\
??/
\
1234;
        printf("%f\n", d);
}
2022-10-15 21:42:02 -05:00
Stephen Heumann 6fadd52fc2 Update release notes to cover fixes to fgets() and gets(). 2022-10-15 19:11:11 -05:00
Stephen Heumann 99a10590b1 Avoid out-of-range branches around asm code using dcl directives.
The branch range calculation treated dcl directives as taking 2 bytes rather than 4, which could result in out-of-range branches. These could cause linker errors (for forward branches) or silently generate wrong code (for backward branches).

This patch now treats dcb, dcw, and dcl as separate directives in the native-code layer, so the appropriate length can be calculated for each.

Here is an example of code affected by this:

int main(int argc, char **argv) {
top:
        if (!argc) { /* this caused a linker error */
                asm {
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                }
                goto top; /* this generated bad code with no error */
        }
}
2022-10-13 18:00:16 -05:00
Stephen Heumann 19683706cc Do not optimize code from asm statements.
Previously, the assembly-level optimizations applied to code in asm statements. In many cases, this was fine (and could even do useful optimizations), but occasionally the optimizations could be invalid. This was especially the case if the assembly involved tricky things like self-modifying code.

To avoid these problems, this patch makes the assembly optimizers ignore code from asm statements, so it is always emitted as-is, without any changes.

This fixes #34.
2022-10-12 22:03:37 -05:00
Stephen Heumann 995ded07a5 Always treat "struct T;" as declaring the tag within the current scope.
A declaration of this exact form always declares the tag T within the current scope, and as such makes this "struct T" a distinct type from any other "struct T" type in an outer scope. (Similarly for unions.)

See C17 section 6.7.2.3 p7 (and corresponding places in all other C standards).

Here is an example of a program affected by this:

struct S {char a;};
int main(void) {
        struct S;
        struct S *sp;
        struct S {long b;} s;
        sp = &s;
        sp->b = sizeof(*sp);
        return s.b;
}
2022-10-04 18:45:11 -05:00
Stephen Heumann 05ecf5eef3 Add option to use the declared type for float/double/comp params.
This differs from the usual ORCA/C behavior of treating all floating-point parameters as extended. With the option enabled, they will still be passed in the extended format, but will be converted to their declared type at the start of the function. This is needed for strict standards conformance, because you should be able to take the address of a parameter and get a usable pointer to its declared type. The difference in types can also affect the behavior of _Generic expressions.

The implementation of this is based on ORCA/Pascal, which already did the same thing (unconditionally) with real/double/comp parameters.
2022-09-18 21:16:46 -05:00
Stephen Heumann 4e76f62b0e Allow additional letters in identifiers.
The added characters are accented roman letters that were added to the Mac OS Roman character set at some time after it was first defined. Some IIGS fonts include them, although others do not.
2022-08-01 19:59:49 -05:00
Stephen Heumann 1177ddc172 Tweak release notes.
The "known issue" about not issuing required diagnostics is removed because ORCA/C has gotten significantly better about that, particularly if strict type checking is enabled. There are still probably some diagnostics that are missed, but it is no longer a big enough issue to be called out more prominently than other bugs.
2022-07-19 20:38:13 -05:00
Stephen Heumann 6e3fca8b82 Implement strict type checking for enum types.
If strict type checking is enabled, this will prohibit redefinition of enums, like:

enum E {a,b,c};
enum E {x,y,z};

It also prohibits use of an "enum E" type specifier if the enum has not been previously declared (with its constants).

These things were historically supported by ORCA/C, but they are prohibited by constraints in section 6.7.2.3 of C99 and later. (The C90 wording was different and less clear, but I think they were not intended to be valid there either.)
2022-07-19 20:35:44 -05:00
Stephen Heumann d576f19ede Remove trailing whitespace in release notes.
(No substantive changes.)
2022-07-18 21:45:55 -05:00
Stephen Heumann 6d07043783 Do not treat uses of enum types from outer scopes as redeclarations.
This affects code like the following:

enum E {a,b,c};
int main(void) {
        enum E e;
        struct E {int x;}; /* or: enum E {x,y,z}; */
}

The line "enum E e;" should refer to the enum type declared in the outer scope, but not redeclare it in the inner scope. Therefore, a subsequent struct, union, or enum declaration using the same tag in the same scope is acceptable.
2022-07-18 21:34:29 -05:00
Stephen Heumann 2cbcdc736c Allow the same identifier to be used as a typedef and an enum tag.
This should be allowed (because they are in separate name spaces), but was not.

This affected code like the following:

typedef int T;
enum T {a,b,c};
2022-07-18 18:33:54 -05:00
Stephen Heumann 6bfd491f2a Update release notes. 2022-07-14 18:40:59 -05:00
Stephen Heumann 7b0dda5a5e Fix a flawed optimization.
The optimization could turn an unsigned comparison "x <= 0xFFFF" into "x < 0", which is always false for an unsigned operand.

Here is an example affected by this:

int main(void) {
        unsigned i = 1;
        return (i <= 0xffff);
}
2022-07-10 22:25:55 -05:00
Stephen Heumann 7898c619c8 Fix several cases where a condition might not be evaluated correctly.
These could occur because the code for certain operations was assumed to set the z flag based on the result value, but did not actually do so. The affected operations were shifts, loads or stores of bit-fields, and ? : expressions.

Here is an example showing the problem with a shift:

#pragma optimize 1
int main(void) {
        int i = 1, j = 0;
        return (i >> j) ? 1 : 0;
}

Here is an example showing the problem with a bit-field load:

struct {
        signed int i : 16;
} s = {1};
int main(void) {
        return (s.i) ? 1 : 0;
}

Here is an example showing the problem with a bit-field store:

#pragma optimize 1
struct {
        signed int i : 16;
} s;
int main(void) {
        return (s.i = 1) ? 1 : 0;
}

Here is an example showing the problem with a ? : expression:

#pragma optimize 1
int main(void) {
        int a = 5;
        return (a ? (a<<a) : 0) ? 0 : 1;
}
2022-07-07 18:26:37 -05:00