Commit Graph

67 Commits

Author SHA1 Message Date
Stephen Heumann e5c7aebb3f Avoid spurious undefined variable errors for functions returning structs/unions.
When the lint check for undefined variables was enabled, a "lint: unused variable: @struct" would be produced for any function returning a struct or union, due to the special static variable that is created to hold the return value. That spurious lint message is now suppressed.
2023-04-06 18:52:45 -05:00
Stephen Heumann 0b3f48157e Simplify code for writing out extended constants.
This removes the need for the CnvSX function, so it is removed.
2023-04-04 18:11:04 -05:00
Stephen Heumann cc36e9929f Remove some unused variables. 2023-03-20 11:12:48 -05:00
Stephen Heumann 1b7b198039 Remove unneeded extern declarations. 2023-03-17 18:14:19 -05:00
Stephen Heumann 3406dbd3ae Prevent a tag declared in an inner scope from shadowing a typedef.
This could occur because when FindSymbol was called to look for symbols in all spaces, it would find a tag in an inner scope before a typedef in an outer scope. The processing order has been changed to look for regular symbols (including typedefs) in any scope, and only look for tags if no regular symbol is found.

Here is an example illustrating the problem:

typedef int T;
int main(void) {
        struct T;
        T x;
}
2023-03-06 21:38:05 -06:00
Stephen Heumann 1f6bc44b48 Fix handling of typedef names immediately after an inner scope where the identifier is redeclared.
If an identifier is used as a typedef in an outer scope but then declared as something else in an inner scope (e.g. a variable name or tag), and that same identifier is the next token after the end of the inner scope, it would not be recognized properly as a typedef name leading to spurious errors.\

Here is an example that triggered this:

typedef char Type;
void f(int Type);
Type t;

Here is another one:

int main(void) {
        typedef int S;
        if (1)
                (struct S {int a;} *)0;
        S x;
}
2023-03-05 21:40:59 -06:00
Stephen Heumann 53fcb84352 Allocate staticNum prefixes only for scopes that have statics.
This is somewhat faster, and also reduces the theoretical chance of overflowing the available staticNum values for a large source file.
2022-12-13 22:17:25 -06:00
Stephen Heumann a7551d8c44 Optimize unused variable checks to only run on scopes with variables.
This reduces the overhead of them considerably.
2022-12-13 21:34:24 -06:00
Stephen Heumann 09fbfb1905 Maintain a pool of empty symbol tables that can be reused.
The motivation for this is that allocating and clearing symbol tables is a common operation, especially with C99+, where a construct like "if (...) { ... }" involves three levels of scope with their own symbol tables. In some tests, it could take an appreciable fraction of total execution time (sometimes ~10%).

This patch allows symbol tables that have already been allocated and cleared to be reused for a subsequent scope, as long as they are still empty. It does this by maintaining a pool of empty symbol tables and taking one from there rather than allocating a new one when possible.

We impose a somewhat arbitrary limit of MaxBlock/150000 on the number of symbol tables we keep, to avoid filling up memory with them. It would probably be better to use purgeable handles here, but that would be a little more work, and this should be good enough for now.
2022-12-13 21:14:23 -06:00
Stephen Heumann 4bc486eade Do not require unused static functions to be defined.
This mostly implements the rule in C17 6.9 p3, which requires a definition to be provided only if the function is used in an expression. Per that rule, we should also exclude most sizeof or _Alignof operands, but we don't do that yet.
2022-12-12 22:10:36 -06:00
Stephen Heumann fe62f70d51 Add lint option to check for unused variables. 2022-12-12 21:47:32 -06:00
Stephen Heumann ecca7a7737 Never make the segment in the root file dynamic.
This would previously happen if a segment directive with "dynamic" appeared before the first function in the program. That would cause the resulting program not to work, because the root segment needs to be a static segment at the start of the program, but if it is dynamic it would come after a jump table and a static segment of library code.

The root segments are also configured to refer to main or the NDA/CDA entry points using LEXPR records, so that they can be in dynamic segments (not that they necessarily should be). That change is intentionally not done for CDEV/XCMD/NBA, because they use code resources, which do not support dynamic segments, so it is better to force a linker error in these cases.
2022-12-11 14:46:38 -06:00
Stephen Heumann 20770f388e Move memory allocation code to a new function in MM. 2022-12-03 18:50:26 -06:00
Stephen Heumann d56cf7e666 Pass constant data to backend as pointers into buffer.
This avoids needing to generate many intermediate code records representing the data at most 8 bytes at a time, which should reduce memory use and probably improve performance for large initialized arrays or structs.
2022-12-03 00:14:15 -06:00
Stephen Heumann 28e119afb1 Rework static initialization to support new-style initializer records.
Static initialization of arrays/structs/unions now essentially "executes" the initializer records to fill in a buffer (and keep track of relocations), then emits pcode to represent that initialized state. This supports overlapping and out-of-order initializer records, as can be produced by designated initialization.
2022-12-02 21:55:57 -06:00
Stephen Heumann 4621336c3b Give anonymous structs/unions unique internal names.
This will help deal with initialization of them.
2022-11-28 20:47:13 -06:00
Stephen Heumann 8cfc14b50a Rename itype field of initializerRecord to basetype. 2022-11-26 15:45:26 -06:00
Stephen Heumann 3f450bdb80 Support "inline" function definitions without static or extern.
This is a minimal implementation that does not actually inline anything, but it is intended to implement the semantics defined by the C99 and later standards.

One complication is that a declaration that appears somewhere after the function body may create an external definition for a function that appeared to be an inline definition when it was defined. To support this while preserving ORCA/C's general one-pass compilation strategy, we generate code even for inline definitions, but treat them as private and add the prefix "~inline~" to the name. If they are "un-inlined" based on a later declaration, we generate a stub with external linkage that just jumps to the apparently-inline function.
2022-11-19 23:04:22 -06:00
Stephen Heumann de57170ef8 Try to form composite types for extern declarations within blocks.
If the extern declaration refers to a global variable/function for which a declaration is already visible, the inner declaration should have the composite type (and it is an error if the types are incompatible).

This affects programs like the following:

static char a[60] = {5};
int main(void) {
        extern char a[];
        return sizeof(a)+a[0]; /* should return 65 */
}
2022-11-07 19:00:35 -06:00
Stephen Heumann fa166030fe Allow duplicate typedefs within block scopes (C11). 2022-11-06 21:39:58 -06:00
Stephen Heumann e168a4d6cb Treat static followed by extern declarations as specifying internal linkage.
See C17 section 6.2.2 p4-5.
2022-11-06 21:19:47 -06:00
Stephen Heumann 83147655d2 Revise NewSymbol to more closely align with standards.
Function declarations within a block are now entered within its symbol table rather than moved to the global one. Several error checks are also added or tightened.

This fixes at least one bug: if a function declared within a block had the same name as a variable in an outer scope, the symbol table entry for that variable could be corrupted, leading to spurious errors or incorrect code generation. This example program illustrates the problem:

/* This should compile without errors and return 2 */
int f(void) {return 1;}
int g(void) {return 2;}
int main(void) {
        int (*f)(void) = g;
        {
                int f(void);
        }
        f = g;
        return f();
}

Errors now detected include:
*Duplicate declarations of a static variable within a block (with the second one initialized)
*Duplicate declarations of the same variable as static and non-static
*Declaration of the same identifier as a typedef and a variable (at file scope)
2022-11-06 20:50:25 -06:00
Stephen Heumann 9a7dc23c5d When a symbol is multiply declared, form the composite type.
Previously, it generally just used the later type (except for function types where only the earlier one included a prototype). One effect of this is that if a global array is first declared with a size and then redeclared without one, the size information is lost, causing the proper space not to be allocated.

See C17 section 6.2.7 p4.

Here is an example affected by the array issue (dump the object file to see the size allocated):

int foo[50];
int foo[];
2022-10-30 18:54:40 -05:00
Stephen Heumann d4c4d18a55 Remove some unused code.
This seemed to be aimed at supporting lazy allocation of symbol tables. That could be a useful optimization, but the code that existed was incomplete and did not do anything useful. That or similar code could be reintroduced as part of a full implementation of lazy allocation, if it is ever done.
2022-10-30 15:06:28 -05:00
Stephen Heumann 99e268e3b9 Implement support for anonymous structures and unions (C11).
Note that this implementation allows anonymous structures and unions to participate in initialization. That is, you can have a braced initializer list corresponding to an anonymous structure or union. Also, anonymous structures within unions follow the initialization rules for structures (and vice versa).

I think the better interpretation of the standard text is that anonymous structures and unions cannot participate in initialization as such, and instead their members are treated as members of the containing structure or union for purposes of initialization. However, all other compilers I am aware of allow anonymous structures and unions to participate in initialization, so I have implemented it that way too.
2022-10-16 18:44:19 -05:00
Stephen Heumann 05ecf5eef3 Add option to use the declared type for float/double/comp params.
This differs from the usual ORCA/C behavior of treating all floating-point parameters as extended. With the option enabled, they will still be passed in the extended format, but will be converted to their declared type at the start of the function. This is needed for strict standards conformance, because you should be able to take the address of a parameter and get a usable pointer to its declared type. The difference in types can also affect the behavior of _Generic expressions.

The implementation of this is based on ORCA/Pascal, which already did the same thing (unconditionally) with real/double/comp parameters.
2022-09-18 21:16:46 -05:00
Stephen Heumann 102d6873a3 Fix type checking and result type computation for ? : operator.
This was non-standard in various ways, mainly in regard to pointer types. It has been rewritten to closely follow the specification in the C standards.

Several helper functions dealing with types have been introduced. They are currently only used for ? :, but they might also be useful for other purposes.

New tests are also introduced to check the behavior for the ? : operator.

This fixes #35 (including the initializer-specific case).
2022-06-23 22:05:34 -05:00
Stephen Heumann 5e20e02d06 Add a function to make a pointer type.
This allows us to refactor out code that was doing this in several places.
2022-06-19 17:55:08 -05:00
Stephen Heumann 58849607a1 Use cgPointerSize for size of pointers in various places.
This makes no practical difference when targeting the GS, but it better documents what the relevant size is.
2022-06-18 19:30:20 -05:00
Stephen Heumann a3104853fc Treat string constant base types as having unknown number of elements.
I am not aware of any effect from this, but the change makes their element count consistent with the size of 0 (indicating an incomplete type).
2022-06-18 19:18:29 -05:00
Stephen Heumann 161bb952e3 Dynamically allocate string space, and make it larger.
This increases the limit on total bytes of strings in a function, and also frees up space in the blank segment.
2022-06-08 22:09:30 -05:00
Stephen Heumann 8c0d65616c Remove an unnecessary variable. 2022-02-13 21:36:03 -06:00
Stephen Heumann b1bc840ec8 Reverse order of parameters for pascal function pointer types.
The parameters of the underlying function type were not being reversed when applying the "pascal" qualifier to a function pointer type. This resulted in the parameters not being in the expected order when a call was made using such a function pointer. This could result in spurious errors in some cases or inappropriate parameter conversions in others.

This fixes #75.
2022-01-13 19:38:22 -06:00
Stephen Heumann ed3035cb99 Fix bug in code for varargs functions with multiple fixed parameters.
This was broken by the varargs changes in commit a20d69a211. The code was not accounting for the internal representation of the parameters being in reverse order, so it was basing address calculations on the first fixed parameter rather than the last one, resulting in the wrong number of bytes being removed from the stack (generally causing a crash).

This affected the c99stdio.c test case, and is now also covered in c99stdarg.c.
2022-01-01 22:42:42 -06:00
Stephen Heumann a20d69a211 Revise variable argument handling to better comply with standards.
In the new implementation, variable arguments are not removed until the end of the function. This allows variable argument processing to be restarted, and it prevents the addresses of local variables from changing in the middle of the function. The requirement to turn off stack repair code around varargs functions is also removed.

This fixes #58.
2021-10-23 22:36:34 -05:00
Stephen Heumann 692ebaba85 Structs or arrays may not contain structs with a flexible array member.
We previously ignored this, but it is a constraint violation under the C standards, so it should be reported as an error.

GCC and Clang allow this as an extension, as we were effectively doing previously. We will follow the standards for now, but if there was demand for such an extension in ORCA/C, it could be re-introduced subject to a #pragma ignore flag.
2021-10-17 22:22:42 -05:00
Stephen Heumann 5871820e0c Support UTF-8/16/32 string literals and character constants (C11).
These have u8, u, or U prefixes, respectively. The types char16_t and char32_t (defined in <uchar.h>) are used for UTF-16 and UTF-32 code points.
2021-10-11 20:54:37 -05:00
Stephen Heumann 8077a248a4 Treat short and int as compatible if using loose type checks.
This gives a clearer pattern of matching ORCA/C's historical behavior if loose type checks are used, and the documentation is updated accordingly.

It also avoids breaking existing code that may be relying on the old behavior. I am aware of at least one place that does (conflicting declarations of InstallNetDriver in GNO's <gno/gno.h>).
2021-09-12 18:12:24 -05:00
Stephen Heumann 2614f10ced Make the actual return type of a function be the unqualified version of the type specified.
This is a change that was introduced in C17. However, it actually keeps things closer to ORCA/C's historical behavior, which generally ignored qualifiers in return types.
2021-09-10 18:09:50 -05:00
Stephen Heumann af455d1900 If not doing loose type checks, use stricter checks for function types.
These will check that the prototypes (if present) match in number and types of arguments. This primarily affects operations on function pointers, since similar checks were already done elsewhere on function declarations themselves.
2021-09-10 18:04:30 -05:00
Stephen Heumann a8682e28d3 Give an error for pointer assignments that discard qualifiers.
This is controlled by #pragma ignore bit 5, which is now a more general "loose type checks" bit.
2021-09-10 17:58:20 -05:00
Stephen Heumann 2f7e71cd24 Treat the fields of const structs as const-qualified.
This causes an error to be produced when trying to assign to these fields, which was being allowed before. It is also necessary for correct behavior of _Generic in some cases.
2021-09-09 18:39:19 -05:00
Stephen Heumann b16210a50b Record volatile and restrict qualifiers in types.
These are needed to correctly distinguish pointer types in _Generic. They should also be used for type compatibility checks in other contexts, but currently are not.

This also fixes a couple small problems related to type qualifiers:
*restrict was not allowed to appear after * in type-names
*volatile status was not properly recorded in sym files

Here is an example of using _Generic to distinguish pointer types based on the qualifiers of the pointed-to type:

#include <stdio.h>

#define f(e) _Generic((e),\
        int * restrict *: 1,\
        int * volatile const *: 2,\
        int **: 3,\
        default: 0)

#define g(e) _Generic((e),\
        int *: 1,\
        const int *: 2,\
        volatile int *: 3,\
        default: 0)

int main(void) {
        int * restrict * p1;
        int * volatile const * p2;
        int * const * p3;

        // should print "1 2 0 1"
        printf("%i %i %i %i\n", f(p1), f(p2), f(p3), f((int * restrict *)0));

        int *q1;
        const int *q2;
        volatile int *q3;
        const volatile int *q4;

        // should print "1 2 3 0"
        printf("%i %i %i %i\n", g(q1), g(q2), g(q3), g(q4));
}

Here is an example of a problem resulting from volatile not being recorded in sym files (if a sym file was present, the read of x was lifted out of the loop):

#pragma optimize -1
static volatile int x;
#include <stdio.h>
int main(void) {
        int y;
        for (unsigned i = 0; i < 100; i++) {
                y = x*2 + 7;
        }
}
2021-08-30 18:19:58 -05:00
Stephen Heumann f2414cd815 Create a new function that checks for compatible types strictly according to the C standards.
For now, this is only used for _Generic expressions. Eventually, it should probably replace the current CompTypes, but CompTypes currently performs somewhat looser checks that are suitable for some situations, so adjustments would be needed at some call sites.
2021-03-07 23:39:30 -06:00
Stephen Heumann 77d66ab699 Support the predefined identifier __func__ (from C99).
This gives the name of the current function, as if the following definition appeared at the beginning of the function body:

static const char __func__[] = "function-name";
2021-03-02 22:28:28 -06:00
Stephen Heumann 32f4e70826 Fix a comment. 2021-02-18 12:58:57 -06:00
Stephen Heumann 168a06b7bf Add support for emitting 64-bit constants in statically-initialized data. 2021-02-04 02:17:10 -06:00
Stephen Heumann 091a25b25d Update the debugging format for long long values.
For now, "long long" is represented with the existing code for the SANE comp format, since their representation is the same except for the comp NaN. This allows existing debuggers that support comp to work with it. The code for "unsigned long long" includes the unsigned flag, so it is unambiguous.
2021-01-31 20:26:51 -06:00
Stephen Heumann 085cd7eb1b Initial code to recognize 'long long' as a type. 2021-01-29 22:27:11 -06:00
Stephen Heumann 52132db18a Implement the _Bool type from C99. 2021-01-25 21:22:58 -06:00