Commit Graph

51 Commits

Author SHA1 Message Date
Stephen Heumann
75234dbf83 Handle long long in pc_equ/pc_neq optimizations. 2021-02-13 17:03:49 -06:00
Stephen Heumann
8faafcc7c8 Implement 64-bit shifts. 2021-02-12 15:06:15 -06:00
Stephen Heumann
00d72f04d3 Implement basic peephole optimizations for some 64-bit operations.
This currently covers bitwise ops, addition, and subtraction.
2021-02-11 19:47:42 -06:00
Stephen Heumann
a804d1766b Merge branch 'master' into longlong 2021-02-11 15:55:15 -06:00
Stephen Heumann
895d0585a8 Small new optimization: "anything % 1" equals 0. 2021-02-11 15:52:44 -06:00
Stephen Heumann
8078675aae Do not eliminate expressions with side effects in "exp | -1" or "exp & 0".
This was previously happening in intermediate code peephole optimization.

The following example program demonstrates the problem:

#pragma optimize 1
int main(void) {
        int i = 0;
        long j = 0;
        ++i | -1;
        ++i & 0;
        ++j | -1;
        ++j & 0;
        return i+j; /* should be 4 */
}
2021-02-11 14:50:36 -06:00
Stephen Heumann
05868667b2 Implement 64-bit division and remainder, signed and unsigned.
These operations rely on new library routines in ORCALib (~CDIV8 and ~UDIV8).
2021-02-05 12:42:48 -06:00
Stephen Heumann
08cf7a0181 Implement 64-bit multiplication support.
Signed multiplication uses the existing ~MUL8 routine in SysLib. Unsigned multiplication will use a new ~UMUL8 library routine.
2021-02-04 22:23:59 -06:00
Stephen Heumann
793f0a57cc Initial support for constants with long long types.
Currently, the actual values they can have are still constrained to the 32-bit range. Also, there are some bits of functionality (e.g. for initializers) that are not implemented yet.
2021-02-03 23:11:23 -06:00
Stephen Heumann
807a143e51 Implement 64-bit addition and subtraction. 2021-01-30 23:31:18 -06:00
Stephen Heumann
2426794194 Add support for new pcodes in optimizer. 2021-01-30 21:11:06 -06:00
Stephen Heumann
2e44c36c59 Implement unary negation and bitwise complement for 64-bit types. 2021-01-30 13:49:06 -06:00
Stephen Heumann
abb0fa0fc1 Implement bitwise and/or/xor for 64-bit types.
This introduces three new intermediate codes for these operations.
2021-01-30 00:25:15 -06:00
Stephen Heumann
fa835aca43 Implement basic load/store ops for long long.
The following intermediate codes should now work:
pc_lod
pc_pop
pc_str
pc_cop
pc_sro
pc_cpo
2021-01-29 23:11:08 -06:00
Stephen Heumann
c84c4d9c5c Check for non-void functions that execute to the end without returning a value.
This generalizes the heuristic approach for checking whether _Noreturn functions could execute to the end of the function, extending it to apply to any function with a non-void return type. These checks use the same #pragma lint bit but give different messages depending on the situation.
2020-02-02 13:50:15 -06:00
Stephen Heumann
bc951b6735 Make lint report some more cases where noreturn functions may return.
This uses a heuristic that may produce both false positives and false negatives, but any false positives should reflect extraneous code at the end of the function that is not actually reachable.
2020-01-30 17:35:15 -06:00
Stephen Heumann
ffe6c4e924 Spellcheck comments throughout the code.
There are no non-comment changes.
2020-01-29 17:09:52 -06:00
Stephen Heumann
2fb075ce58 Do not dereference and write through a null pointer during loop invariant removal. 2019-12-22 19:58:57 -06:00
Stephen Heumann
2190b7e7ed Fix two places in the optimizer where null pointers could be dereferenced.
These were generally fairly harmless, but one could have caused problems if the zero page contained certain values.
2019-12-17 18:03:51 -06:00
Stephen Heumann
a09581b84e Fix crash or error in certain cases when using common subexpression elimination.
In certain rare cases, constant subexpression elimination could set the left subtree of a pc_bno operation in the intermediate code to nil. This could lead to null pointer dereferences, sometimes resulting in a crash or error during native code generation.

The below program sometimes demonstrates the problem (dependent on zero page contents):

#pragma optimize 16
struct F {int *p;};
void foo(struct F* f)
{
    struct {int c;} s = {0};
    ++f->p;
    s.c |= *--f->p;
}
2019-12-17 16:13:07 -06:00
Stephen Heumann
80b96c1147 Ensure % with negative operands is not mis-optimized in intermediate code.
This will not be triggered in most cases, but might be if one of the operand expressions was itself subject to optimization.
2018-09-14 19:18:45 -05:00
Stephen Heumann
a9f7f97a2f Avoid errors from attempting common subexpression elimination on the left subexpression of the comma operator.
This could happen because the left subexpression does not produce a result for use in the enclosing expression, and therefore is not of the form expected by the CSE code.

The following program (derived from a csmith-generated test case) illustrates the problem:

#pragma optimize 16
int main(void) {
    int i;
    i, (i, 1);
}
2018-03-31 20:10:50 -05:00
Stephen Heumann
21493271b9 Fix optimizer bug where tests of long or floating-point constants can trash the stack.
This problem could lead to crashes in code like the following (derived from a csmith-generated test case):

#pragma optimize 1
int main (void)
{
    if (1L) ;
}
2018-03-26 21:54:01 -05:00
Stephen Heumann
f2d15b8fc7 Fix optimizer bug where casts with unused results could sometimes cause stack corruption.
This problem could lead to crashes in code like the following (derived from a csmith-generated test case):

#pragma optimize 1
static int main(void) {
    long i = 2;
    (long)(i > 1);
}
2018-03-26 19:57:18 -05:00
Stephen Heumann
7f94876fa8 Fix mis-optimization of "expression && non-zero constant" operations with 32-bit type.
The previous code may have been intended to convert this to a "!=0" test, which would have been valid if correctly implemented, but with the current code generator that actually yields worse code than the original version, so for now I just removed the optimization for this case.

This problem could lead to crashes in code like the following (derived from a csmith-generated test case):

#pragma optimize 1
int main(int argc, char *argv[]){
    long l_57 = argc;

    return (4 ^ l_57) && 6;
}
2018-03-26 18:30:45 -05:00
Stephen Heumann
db98f7842d Fix mis-evaluation of certain equality comparisons with intermediate code optimization.
This affected comparisons of the form "logical operation or comparison == constant other than 0 or 1". These should always evaluate to 0 (false), but could mis-evaluate to true due to the bad optimization.

The following program gives an example showing the problem:

#pragma optimize 1
int main(void) {
        int i = 0, j = 42;
        return (i || j) == 123;
}
2018-03-26 18:20:36 -05:00
Stephen Heumann
9b08d4337a Prevent errors in loop invariant removal from trying to remove only the left subexpression of a comma operator.
Such subexpressions are not of the right form to work with the existing code, because they do not generate a value for use in the enclosing expression. For now, the code has been changed to simply not remove the subexpression in these cases. Alternative code could be written to make it work, but that might be more trouble than it's worth.

Here's an example that shows the problem (derived from a csmith-generated test case):

#pragma optimize 32+1 /* also had a problem with just 32 */
int main(void) {
    int x, y=10; /* also had problems if x was global */
    do {
        x=42, y-=1;
    } while (y);
    return x+y;
}
2018-03-25 18:22:37 -05:00
Stephen Heumann
4e7a7e67e7 Fix problems with loop invariant removal optimization.
These mainly related to situations where the optimization of multiple natural loops (including those created by continue statements) could interact to generate invalid results. Invalid optimizations could also be performed in certain other cases where there were multiple goto statements targeting a single label and at least one of them formed a loop.

These issues are addressed by appropriately adjusting the control flow and updating various data structures after each loop is processed during loop invariant removal.

This fixes #18 (compca18.c).
2017-12-12 13:50:17 -06:00
Stephen Heumann
ba09d5ee6d Fix issues with addressing/pointer arithmetic using unsigned indexes that generate a displacement of 32K to 64K.
These cases should now always work when using an expression of type unsigned as the index. They will work in some cases but not others when using an int as the index: making those cases work consistently would require more extensive changes and/or a speed hit, so I haven't done it for now.

Note that this now uses an "unsigned multiply" operation for all 16-bit index computations. This should actually work even when the index is a negative signed value, because it will wind up producing (the low-order 16 bits of) the right answer. The signed multiply, on the other hand, generally does not produce the low-order 16 bits of the right answer in cases where it overflows.

The following program is an example that was miscompiled (both with and without optimization):

int c[20000] = {3};

int main(void) {
    int *p;
    unsigned i = 17000;

    p = c + 17000u;
    return *(p-i); /* should return 3 */
}
2017-11-12 17:21:05 -06:00
Stephen Heumann
df42ce257f Fix issue where certain address computations could be improperly restricted to a 32K or 64K range (even when using the large memory model).
This could occur with computations where multiple variables were added to a pointer.

The following program is an example that was miscompiled:

#pragma optimize 1
#pragma memorymodel 1

char c[80000];

int main(void) {
    unsigned i = 30000, j = 40000;
    c[70000] = 3;
    return *(c+i+j); /* should return 3 */
}
2017-11-12 12:29:06 -06:00
Stephen Heumann
763c5192df When optimizing certain index calculations, properly indicate whether they should be signed or unsigned.
This type information is currently used when generating code for the large memory model, but not for the short memory model (which is a bug in itself, causing issue such as #45).

Because the correct type information was not being provided, the code generator could incorrectly use signed index computations when a 16-bit unsigned index value was used in large-memory-model code. The following program is an example that was being miscompiled:

#pragma optimize 1
#pragma memorymodel 1

char c[0xFFFF];

int main(void) {
    unsigned i = 0xABCD;
    c[0xABCD] = 3;
    return c[i]; /* should return 3 */
}
2017-11-10 22:24:50 -06:00
Stephen Heumann
730544a6ce Fix optimizer bug that could limit certain address calculations to a 32k or 64k range even when using the large memory model.
This optimization could apply when indexing into an array whose elements are a power-of-2 size using a 16-bit index value. It is now only used when addressing arrays on the stack (which are necessarily smaller than 64k).

The following program demonstrates the problem:

#pragma optimize 1
#pragma memorymodel 1

long c[40000];

int main(void) {
    int i = 30000;
    c[30000] = 3;
    return c[i]; /* should return 3 */
}
2017-11-10 22:23:51 -06:00
Stephen Heumann
9144002b3b Don't remove bitfield stores during loop invariant removal.
This could generate bad code (e.g. invalidly moving stores ahead of loads, as in #44). It would be possible to do this validly in some cases, but it would take more work to do the necessary checks. For now, we'll just block the optimization for bitfield stores.

In combination with the previous commit, this fixes #44.
2017-10-28 22:41:33 -05:00
Stephen Heumann
ff90151e77 Block invalid movement of bitfield accesses in common subexpression elimination.
This fixes the problem in #44 for the case of using common subexpression elimination only. (Loop invariant removal still causes the problem.)
2017-10-28 22:16:10 -05:00
Stephen Heumann
1e8413138e Avoid lifting indirect loads out of loops where the value may be modified.
The code was not accounting for the possibility that the loaded-from location aliases with the destination of an indirect store in the loop, or for the possibility that it may be written by a function called in the loop. Since we don't have sophisticated alias analysis, we now conservatively assume there may be aliasing in all such cases.

This fixes #20 (compca20.c) and #21 (compca21.c).
2017-10-28 21:59:15 -05:00
Stephen Heumann
e242f03501 Don't attempt bogus common subexpression elimination when loading structures on the stack.
Previously, the structure load would be treated as a common subexpression eligible for elimination, but the structure would always be treated as if it had a size of 4 bytes. If it did not, this would generally lead to a crash. (I'm also not sure if dependency analysis was being performed properly for these structures.)

The following program illustrates the problem:

#pragma optimize 17
struct mystruct { char x; } ms;
static void foo(struct mystruct pk) {}
int main(void)
{
    struct mystruct *p = &ms;
    foo(*p);
    foo(*p);
}
2017-10-21 20:36:21 -05:00
Stephen Heumann
ccd653ddb9 Move some more code out of the blank segment to make space for static data. 2017-10-21 20:36:21 -05:00
Stephen Heumann
1502e48188 Fix bug causing incorrect code generation in programs that use the 'volatile' keyword.
This bug could both cause accesses to volatile variables to be omitted, and also cause other expressions to be erroneously optimized out in certain circumstances.

As an example, both the access of x and the call to bar() would be erroneously removed in the following program:

#pragma optimize 1
volatile int x;
int bar(void);
void foo(void)
{
    if(x) ;
    if(bar()) ;
}

Note that this patch disables even more optimizations than previously if the 'volatile' keyword is used anywhere in a translation unit. This is necessary for correctness given the current design of ORCA/C, but it means that care should be taken to avoid unnecessary use of 'volatile'.
2017-10-21 20:36:21 -05:00
Stephen Heumann
5c81d970b5 Fix issue where statically-evaluated conversions from floating-point values to some integer types could yield wrong values.
This occurred because the values were being rounded rather than truncated when converted to long, unsigned long, or unsigned int.

This was causing problems in the C6.2.3.5.CC test case when compiled with optimization.

The below program demonstrates the problem:

#pragma optimize 1
#include <stdio.h>
int main (void)
{
   long L;
   unsigned int ui;
   unsigned long ul;
   L = -1.5;
   ui = 1.5;
   ul = 1.5;
   printf("%li %u %lu\n", L, ui, ul); /* should print "-1 1 1" */
}
2017-10-21 20:36:21 -05:00
Stephen Heumann
aa7084dada Fix problem where long basic blocks could lead to a crash due to stack overflow in common subexpression elimination.
The issue was that one of the procedures used for CSE would recursively call itself for every 'next' link in the code of the basic block. To avoid this, I made it loop back to the top instead (i.e. did a manual tail-call elimination transformation).

This problem could be observed with large switch statements as in the following example, although other codes with very large basic blocks might have triggered it too. Whether ORCA/C actually crashes will depend on the memory layout--in my testing, this example consistently caused it to crash when running under GNO:

#pragma optimize 16
int main (int argc, char **argv)
{
    switch (argc)
    {
        case 0:  case 1:  case 2:  case 3:  case 4:  case 5:  case 6:  case 7:
        case 8:  case 9:  case 10: case 11: case 12: case 13: case 14: case 15:
        case 16: case 17: case 18: case 19: case 20: case 21: case 22: case 23:
        case 24: case 25: case 26: case 27: case 28: case 29: case 30: case 31:
        case 32: case 33: case 34: case 35: case 36: case 37: case 38: case 39:
        case 40: case 41: case 42:
        case 262:
        ;
    }
}
2017-10-21 20:36:21 -05:00
Stephen Heumann
88279484af Generate better code when accessing structs and unions within other structures.
This cuts a few instructions from code like what is shown in commit affbe9. (It also works around the bug with that example, although that patch addresses the root cause of the problem.)
2017-10-21 20:36:20 -05:00
Stephen Heumann
953a7c36d5 Optimization to avoid generating unnecessary sign-extension code when converting UChar values to Long/ULong.
This ameliorates a performance regression due to promoting char to int instead of unsigned int.
2017-10-21 20:36:20 -05:00
Stephen Heumann
cc75a9b12b Fix a problem where a conversion to a smaller type could be improperly omitted when followed by a conversion to a larger type.
This also fixes a logic error that may have permitted other conversions to be improperly omitted in some cases.

The following program demonstrates the problem (should print 211):

#pragma optimize 1
#include <stdio.h>
int main(void)
{
        unsigned int i = 1234;
        long l = (unsigned char)(i+1);
        printf("%li\n", l);
}
2017-10-21 20:36:20 -05:00
Stephen Heumann
0e82755334 Fix error that could happen when a function call was used in a comparison expression.
This resulted from the addition of the signed-to-unsigned comparison optimization. Specifically, it calls TypeOf for the expressions on each side of the comparison, and this did not handle function calls. That support has now been added, and will give the proper return type for direct and indirect calls to C functions. The IR for tool calls doesn't include the return type (just the number of bytes), so we return cgVoid for them. This is OK for the present use case.
2017-10-21 20:36:20 -05:00
Stephen Heumann
dce39a851e Peephole optimization: fix some places where an expression with side effects could be removed during optimization.
Without this fix, an expression of the form "0 * exp" would be reduced to simply "0" unless exp contained a function call; other side effects of exp (such as assignments or increments) would be removed.

A similar issue could occur with additions that use the same expression on both sides of the "+": after optimization, it would only be evaluated once. I think the cases addressed here are all undefined behavior under the C standards, so the old behavior wasn't technically wrong, but the new behavior is still less confusing.
2017-10-21 20:36:20 -05:00
Stephen Heumann
bd19465ab2 Fix so that reaching definitions are properly computed in loop optimization when there is unreachable code.
Specifically, this ensures that the depth-first numbering of basic blocks starts from 1, which is what ReachingDefinitions expects. Without this fix, reaching definitions wouldn't be correctly computed for functions that contain unreachable basic blocks (including the implicit one to return at the end). This could result in invalid hoisting of operations out of the loop.

This fixes the compca26.c test case.
2017-10-21 20:36:20 -05:00
Stephen Heumann
5ab7c7876b New optimization: Use unsigned rather than signed comparisons in some cases involving values that were originally unsigned bytes.
Specifically, convert signed word comparisons to unsigned if both sides were either unsigned byte values or non-negative constants. This is incorporated as part of intermediate code peephole optimization (bit 0).

This should alleviate some cases of performance regressions due to promoting char to int instead of unsigned int.
2017-10-21 20:36:20 -05:00
Stephen Heumann
be91f1d1cc Disable an invalid optimization that would suppress stores of address values when they appeared to the left of the comma operator.
This fixes the compca22.c test case.

This optimization could be fixed and re-enabled, but to do so, you would have to check if the stored value is ever used subsequently, which is not information that's readily available in the peephole optimization pass. It would also be necessary to check if there are any stores to the same location within the right-side expression, which could kill the optimization.
2017-10-21 20:36:20 -05:00
Stephen Heumann
b9446a28a7 Don't do common subexpression elimination for indirect accesses where there is an intervening operation that may write to the same location.
This fixes the bug in the compca05.c test case.
2017-10-21 20:36:20 -05:00
Stephen Heumann
46b6aa389f Change all text/source files to LF line endings. 2017-10-21 18:40:19 -05:00