The issue was that one of the procedures used for CSE would recursively call itself for every 'next' link in the code of the basic block. To avoid this, I made it loop back to the top instead (i.e. did a manual tail-call elimination transformation).
This problem could be observed with large switch statements as in the following example, although other codes with very large basic blocks might have triggered it too. Whether ORCA/C actually crashes will depend on the memory layout--in my testing, this example consistently caused it to crash when running under GNO:
#pragma optimize 16
int main (int argc, char **argv)
{
switch (argc)
{
case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
case 8: case 9: case 10: case 11: case 12: case 13: case 14: case 15:
case 16: case 17: case 18: case 19: case 20: case 21: case 22: case 23:
case 24: case 25: case 26: case 27: case 28: case 29: case 30: case 31:
case 32: case 33: case 34: case 35: case 36: case 37: case 38: case 39:
case 40: case 41: case 42:
case 262:
;
}
}
Also, fix a case where an uninitialized value could be used, potentially resulting in errors not being reported (although I haven’t seen that in practice).
This fixes problems where >>= operations might not use an arithmetic shift in certain cases where they should, as in the below program:
#include <stdio.h>
int main (void)
{
int i;
unsigned u;
long l;
unsigned long ul;
i = -1;
u = 1;
i >>= u;
printf("%i\n", i); /* should be -1 */
l = -1;
ul = 3;
l >>= ul;
printf("%li\n", l); /* should be -1 */
}
This is necessary so that subsequent processing sees the correct expression type and proceeds accordingly. In particular, without this patch, the casts in the below program were erroneously ignored:
#include <stdio.h>
int main(void)
{
unsigned int u;
unsigned char c;
c = 0;
u = (unsigned char)~c;
printf("%u\n", u);
c = 200;
printf("%i\n", (unsigned char)(c+c));
}
This is as required by the C standards: the type of the right operand should not affect the result type.
The following program demonstrates problems with the old behavior:
#include <stdio.h>
int main(void)
{
unsigned long ul;
long l;
unsigned u;
int i;
ul = 0x8000 << 1L; /* should be 0 */
printf("%lx\n", ul);
l = -1 >> 1U; /* should be -1 */
printf("%ld\n", l);
u = 0xFF10;
l = 8;
ul = u << l; /* should be 0x1000 */
printf("%lx\n", ul);
l = -4;
ul = 1;
l = l >> ul; /* should be -2 */
printf("%ld\n", l);
}
The following demonstrates cases that would erroneously be allowed (and misleadingly give a size of 0) before:
#include <stdio.h>
struct s *S;
int main(void)
{
printf("%lu %lu\n", sizeof(struct s), sizeof *S);
}
The following test case demonstrates the problem:
#include <stdio.h>
int main (void)
{
if (sizeof(int) - 5 < 0) puts("error 1");
if (sizeof &main - 9 < 0) puts("error 2");
}
This affected binary or unary expressions with constant operands, at least one of which was of type unsigned long.
The following test case demonstrates the problem:
#include <stdio.h>
int main (void)
{
if (0 + 0x80000000ul < 0) puts("error 1");
if (~0ul < 0) puts("error 2");
}
This cuts a few instructions from code like what is shown in commit affbe9. (It also works around the bug with that example, although that patch addresses the root cause of the problem.)
This generated invalid code in instances like the following. The code generated for "s = s->u.next" would update the most significant word of s first, then use an indirect load with the half-updated pointer value to update the least significant word of s. This would generally corrupt the result if the new and old pointers had different bank bytes.
#pragma optimize 79
#include <stdio.h>
struct S {
int i;
union {
struct S * next;
} u;
} s1 = {0, 0};
int main (void)
{
struct S * s = &s1;
s = s->u.next;
if (s != 0)
puts("compiler bug detected\n"); /* May not always be triggered, depending on memory contents. */
}
ORCA/C previously allowed struct/union members to be declared with incomplete type. Because of this, it allowed C99-style flexible array members to be declared, albeit by accident rather than by design. In some basic testing, these seem to work correctly, except that they could be initialized and that would give rise to odd behavior.
I have restricted it to allowing flexible array members only in the cases allowed by C99/C11, and otherwise disallowing members with incomplete type. I have also prohibited initializing flexible array members.
Also, generate better code for zero-initializing small arrays.
The problem was that the code would call the library routine ~ZERO with a size of 1, but it only works properly with a size of 2 or more. While adding a check here, I also changed it to not call ~ZERO for other small arrays (<=10 bytes), since it is generally more efficient to just initialize them directly.
The initializations in the following are examples that could trigger the problem:
int main(void)
{
struct { int i; char s[1]; } foo = {1, 0};
char arr[2][1] = {2};
}
This may be someone trying to use a C11-style anonymous struct/union, which should be flagged as an error until and unless those are supported. Otherwise, it probably just indicates that the programmer is confused. In any case, an error should be flagged for it.
C89 restricts bit fields to (signed) int and unsigned int only, although later standards note that additional types may be supported. ORCA/C supports the other integer types as an extension.
This fixes the compco01.c test case.
These would previously be allowed, with the void value treated as if it was unsigned long.
Void values are still allowed for the second and third operands of the ?: operator, but only if they are both void. This is as required by the C standard.
This patch should also permit the union initialization code to handle unions containing bit fields, but for the time being they are still prohibited by code elsewhere in the compiler.
This fixes a problem with ORCA/C conformance test C5.6.0.1.CC, which was introduced by commit bf9fa66. A slightly more involved fix was needed to preserve the correct behavior while avoiding the memory trashing fixed by that patch.
Unsigned constants in the range 0x8000-0xFFFF were erroneously being treated like negative signed values in some contexts, including in array declarators and in the case labels of switch statements where the expression switched over has type long or unsigned long.
This could lead to bogus compile errors for array declarations and typedefs such as the following:
typedef char foo[0x8000];
It could also lead to cases in switch statements not being properly matched, as in the following program:
#include <stdio.h>
int main(void)
{
long i = 0xFF00;
switch (i) {
case 0xFF00:
puts("good");
break;
default:
puts("bad");
}
}
The case label values are converted to the promoted type of the expression being switched on, as if by a cast. In practice, this means discarding the high bits of a 32-bit value to produce a 16-bit one.
Code requiring this is dubious and would be a good candidate for a warning or a lint error, but it's allowed under the C standards.
The following code demonstrates the issue:
#include <stdio.h>
int main(void)
{
int i = 0x1234;
switch (i) {
case 0xABCD1234:
puts("good");
break;
default:
puts("bad");
}
}
This also fixes a logic error that may have permitted other conversions to be improperly omitted in some cases.
The following program demonstrates the problem (should print 211):
#pragma optimize 1
#include <stdio.h>
int main(void)
{
unsigned int i = 1234;
long l = (unsigned char)(i+1);
printf("%li\n", l);
}
This resulted from the addition of the signed-to-unsigned comparison optimization. Specifically, it calls TypeOf for the expressions on each side of the comparison, and this did not handle function calls. That support has now been added, and will give the proper return type for direct and indirect calls to C functions. The IR for tool calls doesn't include the return type (just the number of bytes), so we return cgVoid for them. This is OK for the present use case.
Without this fix, an expression of the form "0 * exp" would be reduced to simply "0" unless exp contained a function call; other side effects of exp (such as assignments or increments) would be removed.
A similar issue could occur with additions that use the same expression on both sides of the "+": after optimization, it would only be evaluated once. I think the cases addressed here are all undefined behavior under the C standards, so the old behavior wasn't technically wrong, but the new behavior is still less confusing.
Specifically, this ensures that the depth-first numbering of basic blocks starts from 1, which is what ReachingDefinitions expects. Without this fix, reaching definitions wouldn't be correctly computed for functions that contain unreachable basic blocks (including the implicit one to return at the end). This could result in invalid hoisting of operations out of the loop.
This fixes the compca26.c test case.
Specifically, convert signed word comparisons to unsigned if both sides were either unsigned byte values or non-negative constants. This is incorporated as part of intermediate code peephole optimization (bit 0).
This should alleviate some cases of performance regressions due to promoting char to int instead of unsigned int.
This occurred because the code to handle the function-like macro use would read the following token, which could prompt processing of the following preprocessing directive in an inappropriate context.
The following example illustrates the problem (the error message would be printed):
#define A()
#define FOO 1
A()
#if !FOO
# error "shouldn't get here"
#endif
Previously, such initializations would sometimes generate a garbage value pointing up to 65535 bytes beyond the start of the string constant. (This was due to a lack of sign-extension in the object code generation.)
Computing a pointer to before the start of an object invokes undefined behavior, so the previous behavior wasn't technically wrong, but it was unintuitive and served no useful purpose. The new behavior should at least be easier to understand and debug.
The latter would require more changes to the code generator to understand it, whereas this approach doesn't require any changes. This is arguably less clean, but it matches other places where a byte value is subsequently operated on as a word without an explicit conversion, and the assembly instruction generated is the same.
This fixes the compca06.c test case.
Note that this generates inefficient code in the case of loading a signed byte value and then immediately casting it to unsigned (it first sign-extends the value, then masks off the high bits). This should be optimized, but at least the generated code is correct now.
This fixes the compca22.c test case.
This optimization could be fixed and re-enabled, but to do so, you would have to check if the stored value is ever used subsequently, which is not information that's readily available in the peephole optimization pass. It would also be necessary to check if there are any stores to the same location within the right-side expression, which could kill the optimization.