Commit Graph

51 Commits

Author SHA1 Message Date
Stephen Heumann 81934109fc Fix issues with type names in the third expression of a for loop.
There were a couple issues here:
* If the type name contained a semicolon (for struct/union member declarations), a spurious error would be reported.
* Tags or enumeration constants declared in the type name should be in scope within the loop, but were not.

These both stemmed from the way the parser handled the third expression, which was to save the tokens from it and re-inject them at the end of the loop. To get the scope issues right, the expression really needs to be evaluated at the point where it occurs, so we now do that. To enable that while still placing the code at the end of the loop, a mechanism to remove and re-insert sections of generated code is introduced.

Here is an example illustrating the issues:

int main(void) {
        int i, j, x;
        for (i = 0; i < 123; i += sizeof(struct {int a;}))
                for (j = 0; j < 123; j += sizeof(enum E {A,B,C}))
                        x = i + j + A;
}
2024-03-13 22:09:25 -05:00
Stephen Heumann 24c6e72a83 Simplify some conditional branches.
This affects certain places where code like the following could be generated:

	bCC lab2
lab1	brl ...
lab2	...

If lab1 is no longer referenced due to previous optimizations, it can be removed. This then allows the bCC+brl combination to be shortened to a single conditional branch, if the target is close enough.

This introduces a flag for tracking and potentially removing labels that are only used as the target of one branch. This could be used more widely, but currently it is only used for the specific code sequences shown above. Using it in other places could potentially open up possibilities for invalid native-code optimizations that were previously blocked due to the presence of the label.
2024-03-05 22:20:34 -06:00
Stephen Heumann 7e860e60df Generate better code for pc_ixa in large memory model.
This improves the code for certain array indexing operations.
2023-03-23 18:41:16 -05:00
Stephen Heumann a5eafe56af Generate more efficient code for 16-bit signed comparisons.
The new code is smaller and (in the common case where the subtraction does not overflow) faster. It takes advantage of the fact that in overflow cases the carry flag always gets set to the opposite of the sign bit of the result.
2023-03-18 20:05:56 -05:00
Stephen Heumann 85890e0b6b Give an error if assembly code tries to use direct page addressing for a local variable that is out of range.
This could previously cause bad code to be produced with no error reported.
2023-03-04 21:06:07 -06:00
Stephen Heumann a6ef872513 Add debugging option to detect illegal use of null pointers.
This adds debugging code to detect null pointer dereferences, as well as pointer arithmetic on null pointers (which is also undefined behavior, and can lead to later dereferences of the resulting pointers).

Note that ORCA/Pascal can already detect null pointer dereferences as part of its more general range-checking code. This implementation for ORCA/C will report the same error as ORCA/Pascal ("Subrange exceeded"). However, it does not include any of the other forms of range checking that ORCA/Pascal does, and (unlike in ORCA/Pascal) it is controlled by a separate flag from stack overflow checking.
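
As a rough illustration (not taken from the commit itself), code like the following is what the new debugging checks are meant to catch:

int main(void) {
        int *p = 0;
        int x;

        x = *p;         /* dereference of a null pointer: undefined behavior, now detectable */
        p = p + 1;      /* arithmetic on a null pointer: also undefined, also detectable */
        return x;
}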
2023-02-12 18:56:02 -06:00
Stephen Heumann 2958619726 Fix varargs stack repair.
Varargs-only stack repair (i.e. using #pragma optimize bit 3 but not bit 6) was broken by commit 32975b720f. It removed some code that was needed to allocate the direct page location used to hold the stack pointer value in that case. This would lead to invalid code being produced, which could cause a crash when run. The fix is to revert the erroneous parts of commit 32975b720f (which do not affect its core purpose of enabling intermediate code peephole optimization to be used when stack repair code is active).
2023-01-08 15:15:32 -06:00
Stephen Heumann d68e0b268f Generate more efficient code for a single return at end of function.
When a function has a single return statement at the end and meets certain other constraints, we now generate a different intermediate code instruction to evaluate the return value as part of the return operation, rather than assigning it to (effectively) a variable and then reading that value again to return it.

This approach could actually be used for all returns in C code, but for now we only use it for a single return at the end. Directly applying it in other cases could increase the code size by duplicating the function epilogue code.
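
For illustration (a hypothetical example, not from the commit), a function of this shape has a single return at the end and can benefit:

int sum3(int a, int b, int c) {
        /* Single return at the end: the return value can be evaluated as part */
        /* of the return operation instead of going through a temporary.       */
        return a + b + c;
}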
2022-12-19 18:52:46 -06:00
Stephen Heumann ecca7a7737 Never make the segment in the root file dynamic.
This would previously happen if a segment directive with "dynamic" appeared before the first function in the program. That would cause the resulting program not to work, because the root segment needs to be a static segment at the start of the program, but if it is dynamic it would come after a jump table and a static segment of library code.

The root segments are also configured to refer to main or the NDA/CDA entry points using LEXPR records, so that they can be in dynamic segments (not that they necessarily should be). That change is intentionally not done for CDEV/XCMD/NBA, because they use code resources, which do not support dynamic segments, so it is better to force a linker error in these cases.
2022-12-11 14:46:38 -06:00
Stephen Heumann 32975b720f Allow native code peephole opt to be used when stack repair is enabled.
I think the reason this was originally disallowed is that the old code sequence for stack repair code (in ORCA/C 2.1.0) ended with TYA. If this was followed by STA dp or STA abs, the native code peephole optimizer (prior to commit 7364e2d2d3) would have turned the combination into a STY instruction. That is invalid if the value in A is needed. This could come up, e.g., when assigning the return value from a function to two different variables.

This is no longer an issue, because the current code sequence for stack repair code no longer ends in TYA and is not susceptible to the same kind of invalid optimization. So it is no longer necessary to disable the native code peephole optimizer when using stack repair code (either for all calls or just varargs calls).
2022-12-10 20:34:00 -06:00
Stephen Heumann bb1bd176f4 Add a command-line option to select the C standard to use.
This provides a more straightforward way to place the compiler in a "strict conformance" mode. This could essentially be achieved by setting several pragma options, but having a single setting is simpler. "Compatibility modes" for older standards can also be selected, although these actually continue to enable most C17 features (since they are unlikely to cause compatibility problems for older code).
2022-12-07 21:35:15 -06:00
Stephen Heumann d56cf7e666 Pass constant data to backend as pointers into buffer.
This avoids needing to generate many intermediate code records representing the data at most 8 bytes at a time, which should reduce memory use and probably improve performance for large initialized arrays or structs.
2022-12-03 00:14:15 -06:00
Stephen Heumann 99a10590b1 Avoid out-of-range branches around asm code using dcl directives.
The branch range calculation treated dcl directives as taking 2 bytes rather than 4, which could result in out-of-range branches. These could result in linker errors (for forward branches) or silently generating wrong code (for backward branches).

This patch now treats dcb, dcw, and dcl as separate directives in the native-code layer, so the appropriate length can be calculated for each.

Here is an example of code affected by this:

int main(int argc, char **argv) {
top:
        if (!argc) { /* this caused a linker error */
                asm {
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                        dcl 0
                }
                goto top; /* this generated bad code with no error */
        }
}
2022-10-13 18:00:16 -05:00
Stephen Heumann 19683706cc Do not optimize code from asm statements.
Previously, the assembly-level optimizations applied to code in asm statements. In many cases, this was fine (and could even do useful optimizations), but occasionally the optimizations could be invalid. This was especially the case if the assembly involved tricky things like self-modifying code.

To avoid these problems, this patch makes the assembly optimizers ignore code from asm statements, so it is always emitted as-is, without any changes.

This fixes #34.
2022-10-12 22:03:37 -05:00
Stephen Heumann ca21e33ba7 Generate more efficient code for indirect function calls. 2022-10-11 21:14:40 -05:00
Stephen Heumann 05ecf5eef3 Add option to use the declared type for float/double/comp params.
This differs from the usual ORCA/C behavior of treating all floating-point parameters as extended. With the option enabled, they will still be passed in the extended format, but will be converted to their declared type at the start of the function. This is needed for strict standards conformance, because you should be able to take the address of a parameter and get a usable pointer to its declared type. The difference in types can also affect the behavior of _Generic expressions.

The implementation of this is based on ORCA/Pascal, which already did the same thing (unconditionally) with real/double/comp parameters.
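
A hypothetical example of code whose meaning depends on this option:

#include <stdio.h>

static void show(float x) {
        float *p = &x;          /* with the option enabled, a usable pointer to an actual float */
        printf("%f %d\n", *p, _Generic(x, float: 1, long double: 0));
}

int main(void) {
        show(1.5f);
        return 0;
}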
2022-09-18 21:16:46 -05:00
Stephen Heumann 60efb4d882 Generate better code for indexed jumps.
They now use a jmp (addr,X) instruction, rather than a more complicated code sequence using rts. This is an improvement that was suggested in an old Genie message from Todd Whitesel.
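
Indexed jumps typically arise from dense switch statements; a hypothetical example that would likely be compiled this way:

int classify(int n) {
        switch (n) {            /* dense case values are normally dispatched through a jump table */
        case 0:  return 10;
        case 1:  return 20;
        case 2:  return 30;
        case 3:  return 40;
        default: return -1;
        }
}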
2022-07-18 21:18:26 -05:00
Stephen Heumann 9b31e7f72a Improve code generation for comparisons.
This converts comparisons like x > N (with constant N) to instead be evaluated as x >= N+1, since >= comparisons generate better code. This is possible as long as N is not the maximum value in the type, but in that case the comparison is always false. There are also a few other tweaks to the generated code in some cases.
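
For example (an illustrative sketch, not from the commit):

#include <limits.h>

int f(int x, unsigned u) {
        if (x > 5)              /* can be evaluated as x >= 6 */
                return 1;
        if (u > 9u)             /* can be evaluated as u >= 10u */
                return 2;
        if (x > INT_MAX)        /* N is the maximum value of the type: always false */
                return 3;
        return 0;
}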
2022-07-10 22:27:38 -05:00
Stephen Heumann 76e4b1f038 Optimize away some tax/tay instructions used only to set flags. 2022-07-10 17:35:56 -05:00
Stephen Heumann 00e7fe7125 Increase some size limits.
The new value of maxLocalLabel is aligned with the C99+ requirement to support "511 identifiers with block scope declared in one block".

The value of maxLabel is now the maximum it can be while keeping the size of the labelTab array under 32 KiB. (I'm not entirely sure the address calculations in the code generated by ORCA/Pascal would work correctly beyond that.)
2022-07-08 21:30:14 -05:00
Stephen Heumann 393b7304a0 Optimize 16-bit multiplication by various constants.
This optimizes most multiplications by a power of 2 or the sum of two powers of 2, converting them to equivalent operations using shifts which should be faster than the general-purpose multiplication routine.
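
A hypothetical illustration of the transformation:

unsigned scale(unsigned x) {
        /* x * 8  (a power of 2)             can become  x << 3             */
        /* x * 10 (sum of two powers of 2)   can become  (x << 3) + (x << 1) */
        return x * 10;
}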
2022-07-06 22:24:54 -05:00
Stephen Heumann 161bb952e3 Dynamically allocate string space, and make it larger.
This increases the limit on total bytes of strings in a function, and also frees up space in the blank segment.
2022-06-08 22:09:30 -05:00
Stephen Heumann 02fbf97a1e Work around the SANE comp conversion bug in another place.
This affects casts to comp evaluated at compile time, e.g.:

static comp c = (comp)-53019223785472.0;
2022-01-22 18:22:37 -06:00
Stephen Heumann b43036409e Add a new optimize flag for FP math optimizations that break IEEE rules.
There were several existing optimizations that could change behavior in ways that violated the IEEE standard with regard to infinities, NaNs, or signed zeros. They are now gated behind a new #pragma optimize flag. This change allows intermediate code peephole optimization and common subexpression elimination to be used while maintaining IEEE conformance, but also keeps the rule-breaking optimizations available if desired.

See section F.9.2 of recent C standards for a discussion of how these optimizations violate IEEE rules.
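
A couple of classic cases (illustrative, not from the commit) where such rewrites change IEEE-specified results:

double g(double x) {
        double a = x + 0.0;     /* folding this to x is wrong when x is -0.0 (the sum should be +0.0) */
        double b = x * 0.0;     /* folding this to 0.0 is wrong when x is an infinity or NaN */
        double c = x - x;       /* folding this to 0.0 is wrong when x is an infinity or NaN */
        return a + b + c;
}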
2021-11-29 20:31:15 -06:00
Stephen Heumann fc515108f4 Make floating-point casts reduce the range and precision of numbers.
The C standards generally allow floating-point operations to be done with extra range and precision, but they require that explicit casts convert to the actual type specified. ORCA/C was not previously doing that.

This patch relies on some new library routines (currently in ORCALib) to do this precision reduction.

This fixes #64.
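
An illustrative example of the required behavior (not from the commit):

int cast_narrows(void) {
        long double x = 1.0L / 3.0L;    /* carries extended range and precision */
        double d = (double)x;           /* the cast must round to an actual double */
        return d == x;                  /* 0: the extra precision really was discarded */
}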
2021-03-06 22:28:39 -06:00
Stephen Heumann c0727315e0 Recognize byte swapping and generate an xba instruction for it.
Specifically, this recognizes the pattern "(exp << 8) | (exp >> 8)", where exp has an unsigned 16-bit type and does not have side effects.
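
For example, a function like this (illustrative) matches the recognized pattern:

unsigned int swap16(unsigned int x) {   /* unsigned int is 16 bits in ORCA/C */
        return (x << 8) | (x >> 8);     /* recognized and compiled to an xba instruction */
}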
2021-03-05 22:00:13 -06:00
Stephen Heumann 4ad7a65de6 Process floating-point values within the compiler using the extended type.
This means that floating-point constants can now have the range and precision of the extended type (aka long double), and floating-point constant expressions evaluated within the compiler also have that same range and precision (matching expressions evaluated at run time). This new behavior is intended to match the behavior specified in the C99 and later standards for FLT_EVAL_METHOD 2.

This fixes the previous problem where long double constants and constant expressions of type long double were not represented and evaluated with the full range and precision that they should be. It also gives extra range and precision to constants and constant expressions of type double or float. This may have pluses and minuses, but at any rate it is consistent with the existing behavior for expressions evaluated at run time, and with one of the possible models of floating point evaluation specified in the C standards.
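
A hypothetical illustration of constants that benefit:

long double big   = 1e1000L;            /* far beyond double range, but representable in extended */
long double third = 1.0L / 3.0L;        /* constant expression now evaluated with extended precision */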
2021-03-04 23:58:08 -06:00
Stephen Heumann f19d21365a Recognize more indirect long instructions in the native code optimizer.
These instructions can be generated for indirect accesses to quad values, and the optimization can sometimes make those code sequences more efficient (e.g. avoiding unnecessary reloads of Y).
2021-03-02 19:19:00 -06:00
Stephen Heumann 043124db93 Implement support for doing quad ops without loading operands on stack.
This works when both operands are simple loads, such that they can be broken up into operations on their subwords in a standard format.

Currently, this is implemented for bitwise binary ops, but it can also be expanded to arithmetic, etc.
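
A sketch of the kind of expression this covers (illustrative):

long long mask_low(long long a, long long b) {
        /* Both operands are simple loads, so the AND can be performed on  */
        /* their subwords without first pushing the operands on the stack. */
        return a & b;
}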
2021-02-24 19:44:46 -06:00
Stephen Heumann 8faafcc7c8 Implement 64-bit shifts. 2021-02-12 15:06:15 -06:00
Stephen Heumann cb97623878 Do not copy CGI.Comments into CGI.pas.
This has no functional effect, since it is all comments. It does mean that printed listings of CGI.pas would not contain those comments, but it is easy enough to restore if someone wants such listings.

This change should make compilation slightly faster, and it also avoids issues with filetypes when using certain tools (since they cannot infer the filetype of CGI.Comments from its extension).
2021-02-11 18:53:25 -06:00
Stephen Heumann 05868667b2 Implement 64-bit division and remainder, signed and unsigned.
These operations rely on new library routines in ORCALib (~CDIV8 and ~UDIV8).
2021-02-05 12:42:48 -06:00
Stephen Heumann 08cf7a0181 Implement 64-bit multiplication support.
Signed multiplication uses the existing ~MUL8 routine in SysLib. Unsigned multiplication will use a new ~UMUL8 library routine.
2021-02-04 22:23:59 -06:00
Stephen Heumann 168a06b7bf Add support for emitting 64-bit constants in statically-initialized data. 2021-02-04 02:17:10 -06:00
Stephen Heumann 793f0a57cc Initial support for constants with long long types.
Currently, the actual values they can have are still constrained to the 32-bit range. Also, there are some bits of functionality (e.g. for initializers) that are not implemented yet.
2021-02-03 23:11:23 -06:00
Stephen Heumann 807a143e51 Implement 64-bit addition and subtraction. 2021-01-30 23:31:18 -06:00
Stephen Heumann 2e44c36c59 Implement unary negation and bitwise complement for 64-bit types. 2021-01-30 13:49:06 -06:00
Stephen Heumann abb0fa0fc1 Implement bitwise and/or/xor for 64-bit types.
This introduces three new intermediate codes for these operations.
2021-01-30 00:25:15 -06:00
Stephen Heumann 085cd7eb1b Initial code to recognize 'long long' as a type. 2021-01-29 22:27:11 -06:00
Stephen Heumann ffe6c4e924 Spellcheck comments throughout the code.
There are no non-comment changes.
2020-01-29 17:09:52 -06:00
Stephen Heumann 4a7644e0b5 Don't allocate stack space for varargs stack repair unless it's needed.
If there are no varargs calls (and nothing else that saves stack positions), then space doesn't need to be allocated for the saved stack position. This can also lead to more efficient prolog/epilog code for small functions.
2018-01-13 20:02:43 -06:00
Stephen Heumann e7cc513ad4 Add support for inline procedure names as documented in IIgs tech note #103.
These are enabled when bit 15 is set in the #pragma debug directive.

Support is still needed to ensure these work properly with pre-compiled headers.

This patch is from Kelvin Sherlock.
2017-10-21 20:36:21 -05:00
Stephen Heumann c46cf79c79 Increase the maximum allowed number of local variables from 200 to 220. 2017-10-21 20:36:21 -05:00
Stephen Heumann ccd653ddb9 Move some more code out of the blank segment to make space for static data. 2017-10-21 20:36:21 -05:00
Stephen Heumann 02de5f4137 Increase the total size of string constants permitted in each function.
The size limit is increased from 8000 bytes to 12500 bytes. This was needed to compile some functions with many string constants.
2017-10-21 20:36:21 -05:00
Stephen Heumann a4bffe65e5 Increase the limit on the number of intermediate code labels in a function from 2400 to 3200.
This is necessary to compile some very large functions, such as the main interpreter loop in Git.

This consumes about 8K of extra memory for the additional label records.
2017-10-21 20:36:21 -05:00
Stephen Heumann 709f9b3f25 Fix bug where comparing 32-bit values in static arrays or structs against 0 may give wrong results with large memory model.
The issue was that 16-bit absolute addressing (in the data bank) was being used to access the data to compare, but with the large memory model the static arrays or structs are not necessarily in the same bank, so absolute long addressing should be used.

This was sometimes causing failures in the C4.6.4.1.CC and C4.6.6.1.CC conformance tests in the ORCA/C test suite.

The following program often demonstrates the problem (depending on memory layout and contents):

#pragma memorymodel 1
#pragma optimize 1

#include <stdio.h>

int i;
char ch1[32000];
long L1[1];

int main (void)
{
    if (L1 [0] != 0)
        printf("%li\n", L1[0]); /* shouldn't print */

    /* buggy behavior can happen if the bank bytes of these pointers differ */
    printf("%p %p\n", &L1[0], &i);
}
2017-10-21 20:36:21 -05:00
Stephen Heumann 0df71da4f1 Change Byte -> UByte conversion to use a "Word -> UByte" conversion, rather than introducing a new "Byte -> UByte" conversion.
The latter would require more changes to the code generator to understand it, whereas this approach doesn't require any changes. This is arguably less clean, but it matches other places where a byte value is subsequently operated on as a word without an explicit conversion, and the assembly instruction generated is the same.
2017-10-21 20:36:20 -05:00
Stephen Heumann c28e48a54f Do an explicit conversion when converting from signed to unsigned byte values. This is needed because the value is held in a 16-bit register, sign-extended. The high 8 bits need to be cleared to convert to an unsigned byte.
This fixes the compca06.c test case.

Note that this generates inefficient code in the case of loading a signed byte value and then immediately casting it to unsigned (it first sign-extends the value, then masks off the high bits). This should be optimized, but at least the generated code is correct now.
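
A minimal illustration of the conversion in question (not the compca06.c test itself):

int main(void) {
        signed char sc = -1;    /* held sign-extended in a 16-bit register (0xFFFF) */
        unsigned char uc = sc;  /* the high 8 bits must be cleared, giving 255 */
        return uc == 255 ? 0 : 1;
}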
2017-10-21 20:36:20 -05:00
Stephen Heumann 46b6aa389f Change all text/source files to LF line endings. 2017-10-21 18:40:19 -05:00