Commit Graph

258 Commits

Author SHA1 Message Date
Stephen Heumann
06a027b237 Rename TypeSpecifier to DeclarationSpecifiers, consistent with C standard terminology.
Also remove its first argument, which was unused. There is no functional change yet.
2020-01-06 20:18:58 -06:00
Stephen Heumann
a9fb1ba482 Move TypeName to be a top-level method in Parser.pas.
As of C11, type names are now used as part of the declaration syntax (in _Alignas and _Atomic specifiers), in addition to their uses in expressions. Moving the TypeName method will allow it to be called when processing declarations.
2020-01-06 20:18:58 -06:00
Stephen Heumann
3121a465f1 Implement the _Alignof operator (from C11).
In ORCA/C, the alignment of all object types is 1.
2020-01-06 20:17:29 -06:00
Stephen Heumann
9036a98e1c Implement support for digraphs.
Specifically, the following six punctuator tokens are now supported:

<: :> <% %> %: %:%:

These behave the same as the existing tokens [, ], {, }, #, and ## (respectively), apart from their spelling.

This can be useful when the full ASCII character set cannot easily be displayed or input (e.g. on the IIgs text screen with certain language settings).
2020-01-04 21:49:50 -06:00
Stephen Heumann
6f2eb301e5 Implement C11 _Static_assert mechanism.
This allows code to contain static assertions (checked at compile time).
2020-01-04 18:16:29 -06:00
Stephen Heumann
0184e3db7b Recognize the new keywords from C99 and C11 as such.
Specifically, the following will now be tokenized as keywords:

_Alignas
_Alignof
_Atomic
_Bool
_Complex
_Generic
_Imaginary
_Noreturn
_Static_assert
_Thread_local
restrict

('inline' was also added as a standard keyword in C99, but ORCA/C already treated it as such.)

The parser currently has no support for any of these keywords, so for now errors will still be generated if they are used, but this is a first step toward adding support for them.
2020-01-03 22:48:53 -06:00
Stephen Heumann
9030052616 When initializing bitfields of type long, do not treat their values as pointer constants.
This was inappropriate and would lead to memory trashing.

Fixes case 3 in issue #59.
2019-12-24 18:52:25 -06:00
Stephen Heumann
ae6de310c7 Avoid storing stale values of __DATE__ or __TIME__ in sym files.
This could happen in some very obscure cases like using these macros for the names of segments or include files. The fix is to just terminate precompiled header generation if they are encountered.
2019-12-24 15:58:12 -06:00
Stephen Heumann
095807517b Fix bug leading to spurious errors in some cases when a sym file is present.
The issue was that invalid sym files could be generated if an #include is encountered within an #if or #ifdef block in the main source file. The fix (for now) is to simply terminate precompiled header generation if such an #include is encountered.

Fixes #2.
2019-12-24 15:45:32 -06:00
Stephen Heumann
4db26d14bd Skip initializer processing for flexible array members.
This could result in null pointer dereferences.
2019-12-23 21:33:27 -06:00
Stephen Heumann
cb063afa47 Set expressionType to a default value when LoadAddress encounters an error.
This can occur in cases such as trying to assign to a non-l-value.

This patch ensures consistent handling of errors and prevents null pointer dereferences.
2019-12-23 20:36:27 -06:00
Stephen Heumann
91094e9292 Correctly increment/decrement pointers to large (>=64KiB) types.
Previously, the logic for this was incorrect and would lead to a null pointer dereference in the compiler. In most cases the generated code would not actually change the pointer.

The following program demonstrates the issue:

#include <stdio.h>
#pragma memorymodel 1
typedef char bigarray[0x20000];
bigarray big[5];
int main(void) {
        bigarray *p = big;
        p++;
        printf("%p %p\n", (void*)big, (void*)p);
}
2019-12-23 19:59:18 -06:00
Stephen Heumann
13a14d9389 When an operand is missing, synthesize a "0" operand for post-error processing.
This ensures consistent behavior and avoids null-pointer dereferences in some cases.
2019-12-23 18:57:17 -06:00
Stephen Heumann
b0b2b3fa91 Do not attempt to generate code for malformed initializers with no usable initializer expression.
This would lead to null pointer dereferences, and could possibly cause unpredictable behavior based on the values read.
2019-12-23 14:09:08 -06:00
Stephen Heumann
7f79b49c3a Set expressionType to a default value when an expression parsing error is encountered.
This ensures consistent handling of errors and prevents possible null pointer dereferences.
2019-12-23 14:03:46 -06:00
Stephen Heumann
a4f7284a8a Avoid null pointer dereferences when processing K&R-style function parameter declarations.
These are initially entered into the symbol table with no known type (itype = nil), so this case should be accounted for in NewSymbol.

This typically would not cause a problem, but might if the zero page contained certain values
2019-12-22 23:16:26 -06:00
Stephen Heumann
b88dc5b39c Eliminate a null pointer dereference when processing prototyped function declarators.
This should normally have been harmless, but might possibly give an error in obscure circumstances.
2019-12-22 22:16:03 -06:00
Stephen Heumann
2fb075ce58 Do not dereference and write through a null pointer during loop invariant removal. 2019-12-22 19:58:57 -06:00
Stephen Heumann
095060ca70 Avoid null pointer dereferences when generating code for initializers.
These should have been harmless under ORCA/Pascal on the IIgs, assuming the code is otherwise functioning properly.
2019-12-22 19:12:05 -06:00
Stephen Heumann
2190b7e7ed Fix two places in the optimizer where null pointers could be dereferenced.
These were generally fairly harmless, but one could have caused problems if the zero page contained certain values.
2019-12-17 18:03:51 -06:00
Stephen Heumann
a09581b84e Fix crash or error in certain cases when using common subexpression elimination.
In certain rare cases, constant subexpression elimination could set the left subtree of a pc_bno operation in the intermediate code to nil. This could lead to null pointer dereferences, sometimes resulting in a crash or error during native code generation.

The below program sometimes demonstrates the problem (dependent on zero page contents):

#pragma optimize 16
struct F {int *p;};
void foo(struct F* f)
{
    struct {int c;} s = {0};
    ++f->p;
    s.c |= *--f->p;
}
2019-12-17 16:13:07 -06:00
Stephen Heumann
8b339a9ab7 Remove a test that expected floating-point division by 0.0 to fail.
As of commit 618188c6b2, this is now allowed and does not generate an error.
2019-03-31 18:46:29 -05:00
Stephen Heumann
17de3914ad Fix problem with using values of enum constants in integer constant expressions.
This could cause spurious errors, or in some cases bad code generation.

The following example illustrates the problem:

#include <stdio.h>

enum {A,B,C};

/* arr was treated as having a size of 1, rather than 3 */
char arr[(int)C+1] = {1,2,3}; /* incorrectly gave an error for initializer */

int main(void) {
        static int i = (int)C+1; /* incorrectly gave an error */

        printf("%zu\n", sizeof(arr));
        printf("%i\n", (int)C+1); /* OK */
        printf("%i\n", i);
}
2019-03-31 18:40:14 -05:00
Kelvin Sherlock
618188c6b2 floating point division by 0.0 is well defined and occasionally used to generate +/- infinite values. 2019-03-10 16:47:18 -04:00
Stephen Heumann
1a2e4772cd Don't generate scalar load operations with bogus types.
This could occur with some operations on function parameters declared with array types, in some cases leading to compiler errors.

Fixes #61.
2019-01-27 11:53:52 -06:00
Kelvin Sherlock
6bd600157d Debugger symbol table fix
const structs are wrapped in definedType.  The debugger symbol table code is unaware of this, which results in missing or incomplete entries.

example:

const struct { int a; int b; } cs;

cs:  isForwardDeclared =    false; class = ident
    4 byte constant defined type of
    4 byte struct: 223978

const struct { int a; int b; } *pcs;

pcs:  isForwardDeclared =    false; class = ident
    4 byte  pointer to
    4 byte constant defined type of
    4 byte struct: 224145

const struct { const struct { const int a; } a[2]; } csa[5];

csa:  isForwardDeclared =    false; class = ident
    20 byte 5 element array of
    4 byte constant defined type of
    4 byte struct: 225155

const struct { const struct { const int a; } a[2]; } *cspa[5];

cspa:  isForwardDeclared =    false; class = ident
    20 byte 5 element array of
    4 byte  pointer to
    4 byte constant defined type of
    4 byte struct: 224850

This change unwraps the definedType so the underlying type info can be placed in the debugger symbol table.
2019-01-13 21:11:00 -05:00
Stephen Heumann
cfa3e4e02d Allow prototyped function parameter types to begin with "volatile".
Prototypes with such parameter types were incorrectly being rejected, e.g. in the following example:

void foo(volatile int *p);
2018-12-03 18:19:28 -06:00
Stephen Heumann
1af7505a00 Add prototypes for SANE housekeeping calls in <sane.h>.
There still aren't prototypes for the main SANE calls, since they aren't really designed to be called directly from C, and may take variable numbers of parameters depending on the operation.
2018-11-19 21:56:24 -06:00
Stephen Heumann
89bd5d9947 Update ORCA/C version number to 2.2.0 B3. 2018-09-15 12:20:08 -05:00
Stephen Heumann
3e75258c56 Update release notes. 2018-09-15 12:19:47 -05:00
Stephen Heumann
411f911b60 Bump sym file version and add a couple sanity checks.
Bumping the version forces regeneration of any sym files created by old ORCA/C versions with the bug that was just fixed.

A couple sanity checks are also introduced when reading sym files, including one that would have caught that bug.
2018-09-15 00:42:05 -05:00
Stephen Heumann
62757acdb1 Fix bug that could cause generation of invalid sym files.
When these invalid sym files were used during subsequent compiles, certain type pointers (for what should be const-qualified struct or union types) could be left uninitialized, or possibly initialized pointing to different types. This could result in spurious errors or potentially in other problems.
2018-09-15 00:11:36 -05:00
Stephen Heumann
ef099f4f83 Update test to reflect that operands of % may be negative. 2018-09-15 00:03:44 -05:00
Stephen Heumann
80b96c1147 Ensure % with negative operands is not mis-optimized in intermediate code.
This will not be triggered in most cases, but might be if one of the operand expressions was itself subject to optimization.
2018-09-14 19:18:45 -05:00
Stephen Heumann
fedd275395 Fix issues where static initialization may generate the wrong number of bytes.
This relates to unions or structs that are "filled" with zeros because the initializer does not include explicit terms for them, and that contain bit-fields or (for unions) do not start with the longest member.

The following program is an example that was miscompiled:

#include <stdio.h>

struct BF {
        int i:3;
        int j:4;
};

union U {
        int i;
        long l;
};

struct Outer1 {
        int n;
        struct BF bf[7];
        union U u[5];
};

struct Outer2 {
        long p;
        struct Outer1 o1;
        long q;
};

int main(void) {
        static struct Outer2 s = {1,{0},212};
        printf("%li %li\n", s.p, s.q);
}
2018-09-14 19:02:21 -05:00
Stephen Heumann
4d10fbae01 Fix several initialization issues.
This fixes case 1 (dealing with run-time initialization of structures containing bit-fields) and case 2 (dealing with initialization of structs where initializer values are not provided for all elements) from issue #59. It also fixes cases that could result in invalid initialization of unions if their first element was not the longest, as in the following example:

#include <stdio.h>
union U {
        int i;
        long l;
};
int main(void) {
        union U a[5] = {1,2,3,4};
        printf("a[0].i=%i, a[1].i=%i, a[2].i=%i, a[3].i=%i, a[4].i=%i\n",
                a[0].i, a[1].i, a[2].i, a[3].i, a[4].i);
}
2018-09-14 18:13:33 -05:00
Stephen Heumann
8b4213cd5a Fix bug where the condition check of the ?: operator may be mis-evaluated.
This could happen in certain cases where the condition codes might not be set at expected. The following program gives an example:

#pragma optimize 1
#include <stdio.h>
int one(void) {return 1;}
int negative_one(void) {return -1;}
int main(void) {
    puts((one() + negative_one()) ? "A" : "B");
}

This could also occur if the condition used the % operator, particularly after the recent changes to it.

Also, add unsigned multiplication, division, and modulo operations to the list of those that may not set the condition codes based on the result value, both in this and other contexts.

Detected based on several programs from FizzBuzz-C.
2018-09-14 13:25:40 -05:00
Stephen Heumann
60484d6f69 Fix for including system headers via macros.
This makes something like the following work:

#define STDIO_H <stdio.h>
#include STDIO_H

It didn't previously, because workString would be overwritten by NextToken. The effect in this case was that it would erroneously try to include the header <hh>, rather than <stdio.h>.

Detected based on a couple programs from FizzBuzz-C.
2018-09-13 21:59:46 -05:00
Stephen Heumann
95f5ec9c13 Don't print a whole bunch of spaces for an error message if the column number is 0.
This could happen, e.g., for a "'}' expected" error at end-of-file. It occurred because the 0..maxint type being used caused the Pascal compiler to use unsigned comparisons, which were inappropriate here.
2018-09-10 21:55:02 -05:00
Stephen Heumann
857e432896 Disable a native-code optimization that was generating bad code for %.
Specifically, it converted PLX followed by PHA to STA 1,S. This is invalid if the x value is actually used, which is a case that can come up in the code now generated for the % operator.

It might be possible to re-enable this optimization with tighter checks about where it's applied, but I don't think it's terribly important.

The below program demonstrates an example that was being miscompiled:

#pragma optimize -1
#include <stdio.h>
int main(void) {
        int a = 100, b = 200, c = 3, d = 4;
        printf("%i\n", (a+b) % (c+d)); /* should be 6 */
}
2018-09-10 19:29:16 -05:00
Stephen Heumann
a359543769 Add EACCES as another name for EACCESS in <errno.h>.
The EACCES name is used in the ORCA/C manual, and also matches the name in POSIX and GNO. EACCESS is retained as an alias for compatibility.
2018-09-10 18:26:24 -05:00
Stephen Heumann
2d43074d5a Make % operator give proper remainders even if one or both operands are negative.
Per the C standards, the % operator should give a remainder after division, such that (a/b)*b + a%b equals a (provided that a/b is representable). As such, the operation of % is defined for cases where either or both of the operands are negative. Since division truncates toward 0, a%b should give a negative result (or 0) in cases where a is negative.

Previously, the % operator was essentially behaving like the "mod" operator in Pascal, which is equivalent for positive operands but not if either operand is negative. It would generally give incorrect results in those cases, or in some cases give compile-time or run-time errors.

This patch addresses both 16-bit and 32-bit signed computations at run time, and operations in constant expressions. The approach at run time is to call existing division routines, which return the correct remainder, except always as a positive number. The generated code checks the sign of the first operand, and if it is negative negates the remainder.

The code generated is somewhat large (especially for the 32-bit case), so it might be sensible to put it in a library function and call that, but for now it's just generated in-line. This avoids introducing a dependency on a new library function, so the generated code remains compatible with older versions of ORCALib (e.g. the GNO one).

Fixes #10.
2018-09-10 18:21:17 -05:00
Stephen Heumann
e8d89c9b39 Add prototype for _Exit() function from C99. 2018-09-10 17:41:53 -05:00
Stephen Heumann
3352e874dd Update a test to account for correct behavior of #line directive. 2018-09-10 17:40:57 -05:00
Stephen Heumann
f7010f1746 Update release notes. 2018-09-08 23:08:17 -05:00
Stephen Heumann
dc0f7dc4e4 Correct most of the LDBL_* values in <float.h>.
These had been the values appropriate for double, not accounting for the fact that long double is now the 80-bit extended type.

The integer LDBL_* values have now all been updated to be correct, as has LDBL_EPSILON. LDBL_MAX and LDBL_MIN are still not correct, because ORCA/C internally processes floating-point constants in the double format, and so the correct values would get rounded to INF and 0, respectively.

Note that the SANE 80-bit extended format is almost like the x87 80-bit extended format used for long double on many modern systems, but not entirely. The difference is that SANE allows the biased exponent field of an extended value (either normalized or denormalized) to be 0, yielding an effective exponent of 0-16383 = -16383. x87 uses an effective exponent of -16382 in these cases (which it considers to be pseudo-denormalized or denormalized), yielding values that are twice as large as the SANE values. This difference causes LDBL_MIN_EXP to be different, and would also cause LDBL_MIN to be different if it could be represented correctly.
2018-09-08 22:39:30 -05:00
Stephen Heumann
4de0035d77 Remove support for the non-standard "long float" type.
This was an alias for double, but it's non-standard and undocumented. Apparently it existed in some other pre-standard compilers, but it's not in any version of standard C, and I can't find any evidence of it being used. Considering the possibility for confusion, I think it's best to remove it.
2018-09-08 18:12:48 -05:00
Stephen Heumann
15b1c88d44 Give accurate error message if a numeric constant is too long.
Previously, "integer overflow" was reported in this case, even for floating constants.
2018-09-08 14:40:06 -05:00
Stephen Heumann
f6381b7523 Indicate errors at correct positions when the source line contains tabs.
Previously, the error markers would generally be misaligned in this case, because a tab would expand to no spaces (in ORCA/Shell) or multiple spaces (in most other environments), but the error-printing code would use a single space to try to line up with it.

The solution adopted is just to print tabs in the error lines at the positions where they occur in the source lines. The actual amount of space displayed will depend on the console being used, but in any case it should line up correctly with the source line.
2018-09-07 17:48:19 -05:00
Stephen Heumann
d33ac61af3 Fix bug where exit-to-editor may use wrong position in certain cases.
This could happen in certain situations where an error is detected at the end of a line (for example with "cannot redefine a macro" errors).

Fixes #40.
2018-09-06 23:55:25 -05:00