The following program (derived from a csmith-generated test case) demonstrates the crash:
#pragma optimize 8+64
#include <stdio.h>
long g = 0;
int main (void) {
long l = 0x10305070;
printf("%08lx\n", l ^ (g = (1 , 0x12345678)));
}
This was a bug with the code for moving the return address. It would generate a "LDA 0" instruction when it was trying to load the value at DP+256.
The following program (derived from a csmith-generated test case) demonstrates the crash:
#pragma optimize 8
int main (int argc, char **argv) {
char s[0xFC];
}
If there are no varargs calls (and nothing else that saves stack positions), then space doesn't need to be allocated for the saved stack position. This can also lead to more efficient prolog/epilog code for small functions.
Previously, the stack repair code always generated code to save and restore a register, but this can be omitted except in cases where a 32-bit value or pointer is returned.
Previously, when stack repair code was generated, it always included instructions to save and restore a previously-saved stack position, but this was only actually used for function calls nested within the arguments to other function calls using stack repair code. Now that code is only generated in cases where it is needed, and the stack repair code for other calls is simplified to omit it.
This optimization affects all (non-nested) function calls when not using optimize bit 3, and varargs function calls when not using optimize bit 6.
This introduces a function to check whether the index portion of a pc_ixa intermediate code operation (used for array indexing) may be negative. This is also used when generating code for the large memory model, which can allow slightly more efficient code to be generated in some cases.
This fixes#45.
These are enabled when bit 15 is set in the #pragma debug directive.
Support is still needed to ensure these work properly with pre-compiled headers.
This patch is from Kelvin Sherlock.
This could happen in certain cases where the destination is not considered "simple" (e.g. because it is a local array location that does not fit in the direct page).
The following program demonstrates the problem:
#pragma optimize 1
int main(void) {
long temp1 = 1, temp2 = 2, A[64];
long B[2] = {0};
B[1] = temp1 + temp2;
return B[1]; /* should return 3 */
}
This bug occurred because the generated code tried to store part of the return address to a direct page offset of 256, but instead an offset of 0 was used, resulting in an invalid return address and (typically) a crash. It could occur if the function took one or more parameters, and the total size of parameters and local variables (including compiler-generated ones) was 254 bytes.
The following program demonstrates the problem:
int main(int argc, char **argv) {
char x[244];
}
The issue was that 16-bit absolute addressing (in the data bank) was being used to access the data to compare, but with the large memory model the static arrays or structs are not necessarily in the same bank, so absolute long addressing should be used.
This was sometimes causing failures in the C4.6.4.1.CC and C4.6.6.1.CC conformance tests in the ORCA/C test suite.
The following program often demonstrates the problem (depending on memory layout and contents):
#pragma memorymodel 1
#pragma optimize 1
#include <stdio.h>
int i;
char ch1[32000];
long L1[1];
int main (void)
{
if (L1 [0] != 0)
printf("%li\n", L1[0]); /* shouldn't print */
/* buggy behavior can happen if the bank bytes of these pointers differ */
printf("%p %p\n", &L1[0], &i);
}
The latter would require more changes to the code generator to understand it, whereas this approach doesn't require any changes. This is arguably less clean, but it matches other places where a byte value is subsequently operated on as a word without an explicit conversion, and the assembly instruction generated is the same.
This fixes the compca06.c test case.
Note that this generates inefficient code in the case of loading a signed byte value and then immediately casting it to unsigned (it first sign-extends the value, then masks off the high bits). This should be optimized, but at least the generated code is correct now.