These involve recent standards-conformance patches for printf and scanf, which changed some (non-standard) behaviors that the test cases were expecting.
I also fixed a couple things that clang flagged as undefined behavior, even though they weren't really causing problems under ORCA/C.
This affects command lines like:
cmpl myprog.c cc=(-da=+) ...
Previously, this would be accepted, but a was actually defined to 0 rather than +.
Now, this gives an error, consistent with other tokens that are not supported in such definitions on the command line. (Perhaps we should support definitions using any tokens, but that would require bigger code changes.)
This also cleans up some related code to avoid possible null-pointer dereferences.
This affects cases where the floating value, truncated to an integer, is outside the range of the destination type. Previously, the result value might appear to be an int value outside the range of the character type.
These situations are undefined behavior under the C standards, so this was not technically a bug, but the new behavior is less surprising. (Note that it still may not raise the "invalid" floating-point exception in some cases where Annex F would call for that.)
The basic issue with all of these is that they failed to sign-extend the 8-bit signed char value to the full 16-bit A register. This could make certain operations on negative signed char values appear to yield positive values outside the range of signed char.
The following example code demonstrates the problems:
#include <stdio.h>
signed char f(void) {return -50;}
int main(void) {
long l = -123;
int i = -99;
signed char sc = -47;
signed char *scp = ≻
printf("%i\n", (signed char)l);
printf("%i\n", (signed char)i);
printf("%i\n", f());
printf("%i\n", (*scp)++);
printf("%i\n", *scp = -32);
}
There are several conversions that do not set the necessary flags, so they must be set separately before doing a comparison. Without this fix, comparisons of a value that was just converted might be mis-evaluated.
This led to bugs where the wrong side of an "if" could be followed in some cases, as in the below examples:
#include <stdio.h>
int g(void) {return 50;}
signed char h(void) {return 50;}
long lf(void) {return 50;}
int main(void) {
signed char sc = 50;
if ((int)(signed char)g()) puts("OK1");
if ((int)h()) puts("OK2");
if ((int)sc) puts("OK3");
if ((int)lf()) puts("OK4");
}
These would generally not work correctly on bit-fields, or on floating-point values that were in a structure or were accessed via a pointer.
The below program is an example that would demonstrate problems:
#include <stdio.h>
int main(void) {
struct {
signed int i:7;
unsigned long int j:6;
_Bool b:1;
double d;
} s = {-10, -20, 0, 5.0};
double d = 70.0, *dp = &d;
printf("%i\n", (int)s.i++);
printf("%i\n", (int)s.i--);
printf("%i\n", (int)++s.i);
printf("%i\n", (int)--s.i);
printf("%i\n", (int)s.i);
printf("%i\n", (int)s.j++);
printf("%i\n", (int)s.j--);
printf("%i\n", (int)++s.j);
printf("%i\n", (int)--s.j);
printf("%i\n", (int)s.j);
printf("%i\n", s.b++);
printf("%i\n", s.b--);
printf("%i\n", ++s.b);
printf("%i\n", --s.b);
printf("%i\n", s.b);
printf("%f\n", s.d++);
printf("%f\n", s.d--);
printf("%f\n", ++s.d);
printf("%f\n", --s.d);
printf("%f\n", s.d);
printf("%f\n", (*dp)++);
printf("%f\n", (*dp)--);
printf("%f\n", ++*dp);
printf("%f\n", --*dp);
printf("%f\n", *dp);
}
This is contrary to the C standards, but ORCA/C historically permitted it (as do some other compilers), and I think there is a fair amount of existing code that relies on it.
This is normally 1 (indicating a hosted implementation, where the full standard library is available and the program starts by executing main()), but it is 0 if one of the pragmas for special types of programs with different entry points has been used.
This was not happening for declared identifiers (variables and functions) or for enum constants, as demonstrated in the following example:
enum {a,b,c};
#if b
#error "bad b"
#endif
int x = 0;
#if x
#error "bad x"
#endif
This generalizes the heuristic approach for checking whether _Noreturn functions could execute to the end of the function, extending it to apply to any function with a non-void return type. These checks use the same #pragma lint bit but give different messages depending on the situation.
This uses a heuristic that may produce both false positives and false negatives, but any false positives should reflect extraneous code at the end of the function that is not actually reachable.
We now insert spaces corresponding to whitespace between tokens, and string tokens are enclosed in quotes.
There are still issues with (at least) escape sequences in strings and comments between tokens.
Currently, this only flags return statements, not cases where they may execute to the end of the function. (Whether the function will actually return is not decidable in general, although it may be in special cases).
This currently checks for:
*Calls to undefined functions (same as bit 0)
*Parameters not declared in K&R-style function definitions
*Declarations or type names with no type specifiers (includes but is broader than the condition checked by bit 1)
The register optimizer tracks when a register is known to contain the same value as a memory location (direct page or absolute) and does optimizations based on this. But it did not always recognize when this information had become invalid because of a subsequent store to the memory location, so it might perform invalid optimizations. This patch adds those checks.
This fixes#66.
Previously, the designated initializer syntax could confuse the parser enough to cause null pointer dereferences. This avoids that, and also gives a more meaningful error message to the user.
In the #pragma lint line, the integer indicating the checks to perform can now optionally be followed by a semicolon and another integer. If these are present and the second integer is 0, then the lint checks will be performed, but will be treated as warnings rather than errors, so that they do not cause compilation to fail.