This can occur in cases such as trying to assign to a non-l-value.
This patch ensures consistent handling of errors and prevents null pointer dereferences.
Previously, the logic for this was incorrect and would lead to a null pointer dereference in the compiler. In most cases the generated code would not actually change the pointer.
The following program demonstrates the issue:
#include <stdio.h>
#pragma memorymodel 1
typedef char bigarray[0x20000];
bigarray big[5];
int main(void) {
bigarray *p = big;
p++;
printf("%p %p\n", (void*)big, (void*)p);
}
This could cause spurious errors, or in some cases bad code generation.
The following example illustrates the problem:
#include <stdio.h>
enum {A,B,C};
/* arr was treated as having a size of 1, rather than 3 */
char arr[(int)C+1] = {1,2,3}; /* incorrectly gave an error for initializer */
int main(void) {
static int i = (int)C+1; /* incorrectly gave an error */
printf("%zu\n", sizeof(arr));
printf("%i\n", (int)C+1); /* OK */
printf("%i\n", i);
}
Per the C standards, the % operator should give a remainder after division, such that (a/b)*b + a%b equals a (provided that a/b is representable). As such, the operation of % is defined for cases where either or both of the operands are negative. Since division truncates toward 0, a%b should give a negative result (or 0) in cases where a is negative.
Previously, the % operator was essentially behaving like the "mod" operator in Pascal, which is equivalent for positive operands but not if either operand is negative. It would generally give incorrect results in those cases, or in some cases give compile-time or run-time errors.
This patch addresses both 16-bit and 32-bit signed computations at run time, and operations in constant expressions. The approach at run time is to call existing division routines, which return the correct remainder, except always as a positive number. The generated code checks the sign of the first operand, and if it is negative negates the remainder.
The code generated is somewhat large (especially for the 32-bit case), so it might be sensible to put it in a library function and call that, but for now it's just generated in-line. This avoids introducing a dependency on a new library function, so the generated code remains compatible with older versions of ORCALib (e.g. the GNO one).
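The run-time approach described above can be sketched in portable C. This is only an illustration, not the code the compiler emits: `remainder_magnitude` is a hypothetical stand-in for the existing division routines (which report the remainder as a non-negative number), and `signed_remainder` is a made-up name for the fix-up step.

```c
/* Hypothetical stand-in for a division routine that returns the
   remainder as a magnitude, i.e. never negative. b must not be 0. */
static unsigned remainder_magnitude(int a, int b)
{
    unsigned ua = a < 0 ? 0u - (unsigned)a : (unsigned)a;
    unsigned ub = b < 0 ? 0u - (unsigned)b : (unsigned)b;
    return ua % ub;
}

/* Fix-up step: the remainder takes the sign of the first operand. */
int signed_remainder(int a, int b)
{
    int r = (int)remainder_magnitude(a, b);
    return a < 0 ? -r : r;
}
```

With this fix-up, `(a/b)*b + signed_remainder(a, b)` equals `a` whenever `a/b` is representable, matching what the standards require of `%`.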
Fixes #10.
Mainly, this detects errors in several cases where a pointer could inappropriately be used where an arithmetic type was expected. In some cases, other types (e.g. structs) could be used too.
This adds lint bit 5 (a value of 32), which currently enables checking for the following conditions:
* Integer overflow from arithmetic in constant expressions (currently only of type int).
* Invalid constant shift counts (negative, or >= the width of the type).
* Division by (constant) zero.
These (mainly the first two) can be indicative of code that was designed for larger type sizes and needs changes to support 16-bit int.
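As a rough illustration of the three conditions, assuming a 16-bit int, the checks amount to something like the following. The function names are hypothetical and this is not the compiler's own code; it only restates the conditions in C.

```c
/* Hypothetical helpers restating the lint conditions for 16-bit int. */

/* Nonzero if a + b does not fit in a 16-bit signed int. */
int add_overflows_int16(long a, long b)
{
    long sum = a + b;
    return sum < -32768L || sum > 32767L;
}

/* Nonzero if a shift count is negative or >= the width of the type. */
int shift_count_invalid(long count, int width)
{
    return count < 0 || count >= width;
}

/* Nonzero if a constant divisor is zero. */
int divides_by_zero(long divisor)
{
    return divisor == 0;
}
```

For example, `30000 + 30000` and `1 << 16` are fine with a 32-bit int but trip the first two checks when int is 16 bits wide.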
The following program demonstrates the problem:
#include <stdio.h>
int main(void) {
long l = (char)1 - (char)5;
printf("%li\n", l); /* should print -4 */
}
Specifically:
* The result of pointer arithmetic (or equivalent operations like &a[i]) always has pointer type.
* Array types decay to pointer types in the context of comparison operations, so it is legal to compare two differently-sized arrays with the same element type.
The following program (partially derived from a csmith-generated test case) illustrates the issues:
int main(void) {
int a[2], b[10];
if (a == b) ; /* legal */
if (&a[1] != &b[0]) ; /* legal */
return sizeof(&b[1]); /* Should be sizeof(int*), i.e. 4 on GS */
}
Here's an example that shows the issues (derived from a csmith-generated test case):
#include <stdio.h>
struct S {
unsigned f;
};
void f1(struct S p) {
printf("%u\n", p.f);
}
int main(void) {
const struct S l = {123};
struct S s;
f1(l);
s = l;
printf("%u\n", s.f);
}
commit 4265329097538640e9e21202f1b141bcd42a44f3
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 21:45:32 2018 -0400
indent to match standard indent.
commit 783518fbeb01d2df43ef2083d3341004c05e4e2e
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 20:21:15 2018 -0400
clean up the typenames
commit 29b627ecf5ca9b8a143761f85a1807a6ca35ddd9
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 20:18:04 2018 -0400
enable feature_hh, warn about %n with non-int modifier.
commit fc4ac8129e3772c4eda36658e344ec475938369c
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 15:13:47 2018 -0400
warn that %lc, %ls, etc are unsupported.
commit 7e6b433ba0552f7e52f0f034d398e9195c764326
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 13:36:25 2018 -0400
warn about hh/ll modifier (if not supported)
commit 1943c9979d0013f9f38045ec04a962fbf0269f31
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Fri Mar 23 11:42:41 2018 -0400
use error facilities for format errors.
commit 7811168f56dca1387055574ba8d32638da2fad96
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Thu Mar 22 15:34:21 2018 -0400
add feature flags to disable c99 enhancements until orca lib is updated.
commit c2149cc5953155cfc3c3b4d0483cd25fb946b055
Author: Kelvin Sherlock <ksherlock@gmail.com>
Date: Thu Mar 22 08:59:10 2018 -0400
Add printf/scanf format checking [WIP]
This parses out the xprintf / xscanf format string and compares it with the function arguments.
enabled via #pragma lint 16.
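To give a feel for what such a check involves, here is a minimal sketch that only counts how many arguments a printf-style format string consumes. It is not the code from this change: `count_printf_args` is a hypothetical name, it does not validate the conversion character, and it does not look at argument types.

```c
#include <string.h>

/* Hypothetical helper: returns the number of arguments a printf-style
   format consumes, or -1 if it ends in the middle of a conversion. */
int count_printf_args(const char *fmt)
{
    int count = 0;
    while (*fmt != '\0') {
        if (*fmt++ != '%')
            continue;               /* ordinary character */
        if (*fmt == '%') {          /* "%%" prints a literal percent sign */
            fmt++;
            continue;
        }
        /* Skip flags, width, precision and length modifiers; each '*'
           takes its value from an extra int argument. */
        while (*fmt != '\0' && strchr("-+ #0123456789.*hlLjzt", *fmt) != NULL) {
            if (*fmt == '*')
                count++;
            fmt++;
        }
        if (*fmt == '\0')
            return -1;              /* truncated conversion */
        count++;                    /* the conversion itself */
        fmt++;
    }
    return count;
}
```

A checker along these lines would compare the returned count with the number of variadic arguments actually passed at the call site.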
If there are no varargs calls (and nothing else that saves stack positions), then space doesn't need to be allocated for the saved stack position. This can also lead to more efficient prolog/epilog code for small functions.
These cases should now always work when using an expression of type unsigned as the index. They will work in some cases but not others when using an int as the index: making those cases work consistently would require more extensive changes and/or a speed hit, so I haven't done it for now.
Note that this now uses an "unsigned multiply" operation for all 16-bit index computations. This should actually work even when the index is a negative signed value, because it will wind up producing (the low-order 16 bits of) the right answer. The signed multiply, on the other hand, generally does not produce the low-order 16 bits of the right answer in cases where it overflows.
The following program is an example that was miscompiled (both with and without optimization):
int c[20000] = {3};
int main(void) {
int *p;
unsigned i = 17000;
p = c + 17000u;
return *(p-i); /* should return 3 */
}
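The point about the unsigned multiply can be checked with fixed-width types on any host. This is a sketch only: `scaled_index16` is a hypothetical name, and the widening to 32 bits is just a portable way of keeping the low-order 16 bits of the product.

```c
#include <stdint.h>

/* Hypothetical model of a 16-bit index computation: reinterpret the
   index as unsigned, multiply, and keep the low-order 16 bits. */
uint16_t scaled_index16(int16_t index, uint16_t elem_size)
{
    uint32_t product = (uint32_t)(uint16_t)index * elem_size;
    return (uint16_t)product;
}
```

For an index of -3 and an element size of 2 this yields 0xFFFA, which is the 16-bit two's-complement representation of -6, so the offset is still correct for a negative index. An index of 17000 gives 34000, which does not fit in a signed 16-bit value but is the right unsigned offset.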
This could already be optimized out by the peephole optimizer, but it's bad enough code that it really shouldn't be generated even when not using that optimization.
This could occur because a temporary location might be used both in the l-value and r-value computations, but the final assignment code assumed it still had the value from the l-value computation.
The following function demonstrates this problem (*ip is not updated, and *p is trashed):
int badinc(char **p, int *ip)
{
*ip += *(*p)++;
}
Previously the result type was based on the operand types (using the arithmetic conversions), which is incorrect. The following program illustrates the issue:
#include <stdio.h>
int main(void)
{
/* should print "1 0 2 2" */
printf("%i %i %lu %lu\n", 0L || 2, 0.0 && 2,
sizeof(1L || 5), sizeof(1.0 && 2.5));
}
The following is an example of a program that requires this:
#include <stdio.h>
int main(void)
{
int true = !0.0;
int false = !1.1;
printf("%i %i\n", true, false);
}
This is as required by C90. C99 and later require (u)intmax_t, which must be 64-bit or greater.
The following example shows problems with the previous behavior:
#if (30000 + 30000 == 60000) && (1 << 16 == 0x10000)
int main(void) {}
#else
#error "preprocessor error"
#endif
The following is an example that would give a compile error before this patch:
int main(void)
{
unsigned long i = 1 % 3000000000;
}
The remainder operation still does not work properly for signed types when either operand is negative. It gives either errors or incorrect values in various cases, both when evaluated at compile time and run time. Fully addressing this (including the run-time cases) would require library updates.
This fixes the compco09.c test case.
This implementation permits duplicate copies of type qualifiers to appear. This is technically illegal in C90, but it’s legal in C99 and later, and ORCA/C already allows this in other contexts.
Also, fix a case where an uninitialized value could be used, potentially resulting in errors not being reported (although I haven’t seen that in practice).
This fixes problems where >>= operations might not use an arithmetic shift in certain cases where they should, as in the below program:
#include <stdio.h>
int main (void)
{
int i;
unsigned u;
long l;
unsigned long ul;
i = -1;
u = 1;
i >>= u;
printf("%i\n", i); /* should be -1 */
l = -1;
ul = 3;
l >>= ul;
printf("%li\n", l); /* should be -1 */
}
This is necessary so that subsequent processing sees the correct expression type and proceeds accordingly. In particular, without this patch, the casts in the below program were erroneously ignored:
#include <stdio.h>
int main(void)
{
unsigned int u;
unsigned char c;
c = 0;
u = (unsigned char)~c;
printf("%u\n", u);
c = 200;
printf("%i\n", (unsigned char)(c+c));
}
This is as required by the C standards: the type of the right operand should not affect the result type.
The following program demonstrates problems with the old behavior:
#include <stdio.h>
int main(void)
{
unsigned long ul;
long l;
unsigned u;
int i;
ul = 0x8000 << 1L; /* should be 0 */
printf("%lx\n", ul);
l = -1 >> 1U; /* should be -1 */
printf("%ld\n", l);
u = 0xFF10;
l = 8;
ul = u << l; /* should be 0x1000 */
printf("%lx\n", ul);
l = -4;
ul = 1;
l = l >> ul; /* should be -2 */
printf("%ld\n", l);
}
The following demonstrates cases that would erroneously be allowed (and misleadingly give a size of 0) before:
#include <stdio.h>
struct s *S;
int main(void)
{
printf("%lu %lu\n", sizeof(struct s), sizeof *S);
}
The following test case demonstrates the problem:
#include <stdio.h>
int main (void)
{
if (sizeof(int) - 5 < 0) puts("error 1");
if (sizeof &main - 9 < 0) puts("error 2");
}
This affected binary or unary expressions with constant operands, at least one of which was of type unsigned long.
The following test case demonstrates the problem:
#include <stdio.h>
int main (void)
{
if (0 + 0x80000000ul < 0) puts("error 1");
if (~0ul < 0) puts("error 2");
}
These would previously be allowed, with the void value treated as if it was unsigned long.
Void values are still allowed for the second and third operands of the ?: operator, but only if they are both void. This is as required by the C standard.
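A small example of the case that remains legal, where both the second and third operands are void expressions. The function names are made up for illustration:

```c
static int calls;                       /* counts calls to note() */

static void note(void) { calls++; }
static void skip(void) { }

/* Both operands of ?: have type void, so the whole expression has
   type void and can only be used as an expression statement. */
void run(int flag)
{
    flag ? note() : skip();
}
```

By contrast, an expression such as `flag ? note() : 0` mixes a void operand with an int operand and is one of the cases that is now rejected.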
Unsigned constants in the range 0x8000-0xFFFF were erroneously being treated like negative signed values in some contexts, including in array declarators and in the case labels of switch statements where the expression switched over has type long or unsigned long.
This could lead to bogus compile errors for array declarations and typedefs such as the following:
typedef char foo[0x8000];
It could also lead to cases in switch statements not being properly matched, as in the following program:
#include <stdio.h>
int main(void)
{
long i = 0xFF00;
switch (i) {
case 0xFF00:
puts("good");
break;
default:
puts("bad");
}
}