If such macros were used within other macros, they would generally not be expanded, due to the order in which operations were evaluated during preprocessing.
This is actually an issue that was fixed by the changes from ORCA/C 2.1.0 to 2.1.1 B3, but then broken again by commit d0b4b75970.
Here is an example with the name of a keyword:
#define X long int
#define long
X x;
int main(void) {
return sizeof(x); /* should be sizeof(int) */
}
Here is an example with the name of a typedef:
typedef short T;
#define T long
#define X T
X x;
int main(void) {
return sizeof(x); /* should be sizeof(long) */
}
This allows those tokens (asm, comp, extended, pascal, and segment) to be used as identifiers, consistent with the C standards.
A new pragma (#pragma extensions) is introduced to control this. It might also be used for other things in the future.
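For illustration, here is a sketch of using those tokens as identifiers (assuming the pragma takes a numeric flag and that a value of 0 turns the extended keywords off; the exact flag semantics are an assumption here):
#pragma extensions 0
int pascal;
int segment;
int main(void) {
pascal = 3;
segment = 4;
return pascal + segment;
}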
This did not work correctly before, because such tokens were recorded as starting with the third character of the trigraph.
Here is an example affected by this:
#define mkstr(a) # a
#include <stdio.h>
int main(void) {
puts(mkstr(??!));
puts(mkstr(??!??!));
puts(mkstr('??<'));
puts(mkstr(+??!));
puts(mkstr(+??'));
}
A suffix will now be printed on any integer constant with a type other than int, or any floating constant with a type other than double. This ensures that all constants have the correct types, and also serves as documentation of the types.
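For instance (assuming this applies to the expanded source listing produced with #pragma expand, which re-prints tokens after preprocessing), constants like these would now appear with their suffixes:
#pragma expand 1
#define TEN 10L
#define HALF 0.5F
long l = TEN; /* 10L has type long; printed as 10L rather than 10 */
float f = HALF; /* 0.5F has type float; printed as 0.5F rather than 0.5 */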
Previously, continuations or trigraphs would be included in the string as-is, which should not be the case because they are (conceptually) processed in earlier compilation phases. Initial trigraphs still do not get stringized properly, because the token starting position is not recorded correctly for them.
This fixes code like the following:
#define mkstr(a) # a
#include <stdio.h>
int main(void) {
puts(mkstr(a\
bc));
puts(mkstr(qr\
));
puts(mkstr(\
xy));
puts(mkstr(12??/
34));
puts(mkstr('??<'));
}
This is necessary for correct behavior if such tokens are subsequently stringized with #. Previously, only the first half of the token would be produced.
Here is an example demonstrating the issue:
#define mkstr(a) # a
#define in_between(a) mkstr(a)
#define joinstr(a,b) in_between(a ## b)
#include <stdio.h>
int main(void) {
puts(joinstr(123,456));
puts(joinstr(abc,def));
puts(joinstr(dou,ble));
puts(joinstr(+,=));
puts(joinstr(:,>));
}
They were not being saved, which would result in ORCA/C not searching the proper paths when looking for an include file after the sym file had ended. Here is an example showing the problem:
#pragma path "include"
#include <stdio.h>
int k = 50;
#include "n.h" /* will not find include:n.h */
There were various places where the flag for macro expansions was saved, set to false, and then later restored. If #pragma expand was used within those areas, it would not be properly applied. Here is an example showing that problem:
void f(void
#pragma expand 1
) {}
This could also affect some uses of #pragma expand within precompiled headers, e.g.:
#pragma expand 1
#include "a.h"
#undef foobar
#include "b.h"
...
Also, add a note saying that code in precompiled headers will not be expanded. (This has always been the case, but was not clearly documented.)
This would occur if the macro had already been saved in the sym file and the #undef occurred before a subsequent #include that was also recorded in the sym file. The solution is simply to terminate sym file generation if an #undef of an already-saved macro is encountered.
Here is an example showing the problem:
test.c:
#include "test1.h"
#undef x
#include "test2.h"
int main(void) {
#ifdef x
return x;
#else
return y;
#endif
}
test1.h:
#define x 27
test2.h:
#define y 6
Macros and include paths from the cc= parameters may be included in the symbol file, so incorrect behavior could result if the symbol file was used for a later compilation with different cc= parameters.
There were a couple of issues that could occur with #pragma keep and sym files:
* If a source file used #pragma keep but it was overridden by KEEP= on the command line or {KeepName} in the shell, then the overriding keep name would be saved to the sym file. It would therefore be applied to subsequent compilations even if it was no longer specified in the command line or shell variable.
* If a source file used #pragma keep, that keep name would be recorded in the sym file. On subsequent compilations, it would always be used, overriding any keep name specified by the command line or shell, contrary to the usual rule that the name on the command line takes priority.
With this patch, the keep name recorded in the sym file (if any) should always be the one specified by #pragma keep, but it can be overridden as usual.
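For example, given a source file like this (assuming the usual quoted-name form of #pragma keep), the sym file should now record "myprog", while KEEP= on the command line or {KeepName} in the shell can still override it:
#pragma keep "myprog"
int main(void) {
return 0;
}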
This affects functions whose body spans multiple files due to includes, or is treated as doing so due to #line directives. ORCA/C will now generate a COP 6 instruction to record each source file change, allowing debuggers to properly track the flow of execution across files.
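Here is a sketch (file names are hypothetical) of a function whose body spans two files; a COP 6 instruction would now be generated when the source file changes:
main.c:
int f(void) {
#include "body.h"
}
int main(void) {
return f();
}
body.h:
return 42;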
This causes __FILE__ to give the name of an include file if used within it, which seems to be what the standards intend (and what other compilers do). It also affects the file name recorded in debugging information for functions declared in an include file.
(Note that occ will generate a #line directive before an #append, essentially to work around the problem this patch fixes. After the patch, such a #line directive is effectively ignored. This should be OK, although it may result in a difference in whether a full or partial pathname is used for __FILE__ and in debug info.)
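Here is a sketch (file names are hypothetical) showing the effect on __FILE__:
main.c:
#include <stdio.h>
#include "where.h"
int main(void) {
puts(where); /* now gives "where.h" (possibly with a path), not "main.c" */
puts(__FILE__); /* still gives "main.c" */
}
where.h:
static const char *where = __FILE__;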
There were several forms that should be permitted but were not, such as &"str"[1], &*"str", &*a (where a is an array), and &*f (where f is a function).
This fixes #15 and also certain other cases illustrated in the following example:
char a[10];
int main(void);
static char *s1 = &"string"[1];
static char *s2 = &*"string";
static char *s3 = &*a;
static int (*f2)(void) = &*main;
This is necessary to allow declarations of pascal-qualified function pointers as members of a structure, among other things.
Note that the behavior for "pascal" now differs from that for the standard function specifiers, which have more restrictive rules for where they can be used. This is justified by the fact that the "pascal" qualifier is allowed and meaningful for function pointer types, so it should be able to appear anywhere they can.
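For instance, a declaration like the following should now be accepted (the struct and member names are hypothetical):
struct callbacks {
pascal int (*handler)(int event);
void (*cleanup)(void);
};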
This fixes #28.
The parameters of the underlying function type were not being reversed when applying the "pascal" qualifier to a function pointer type. This resulted in the parameters not being in the expected order when a call was made using such a function pointer. This could result in spurious errors in some cases or inappropriate parameter conversions in others.
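Here is a sketch of the kind of call affected (it assumes the usual placement of the pascal qualifier before the return type):
#include <stdio.h>
pascal int diff(int a, int b) {
return a - b;
}
int main(void) {
pascal int (*fp)(int, int) = diff;
printf("%d\n", fp(5, 3)); /* should print 2 */
}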
This fixes #75.
Failing to do this could allow the type spec to be overwritten if the expression contained another type name within it (e.g. a cast). This could cause the wrong type to be computed, which could lead to incorrect behavior for constructs that use type names, e.g. sizeof.
Here is an example program that demonstrated the problem:
int main(void) {
return sizeof(short[(long)50]);
}
There were several existing optimizations that could change behavior in ways that violated the IEEE standard with regard to infinities, NaNs, or signed zeros. They are now gated behind a new #pragma optimize flag. This change allows intermediate code peephole optimization and common subexpression elimination to be used while maintaining IEEE conformance, but also keeps the rule-breaking optimizations available if desired.
See section F.9.2 of recent C standards for a discussion of how these optimizations violate IEEE rules.
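For illustration, here are cases where those optimizations would change results (the transformations noted in the comments are examples of the rule-breaking ones):
#include <stdio.h>
int main(void) {
double zero = 0.0;
double negzero = -0.0;
double notanum = zero / zero; /* NaN, computed at run time */
printf("%g\n", negzero + 0.0); /* +0.0; folding x + 0.0 to x would yield -0.0 */
printf("%g\n", notanum * 0.0); /* NaN; folding x * 0.0 to 0.0 would be wrong */
return 0;
}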
This could give incorrect results for extended-to-comp conversions of certain negative integers like -2147483648 and -53021371269120. To get a fix for the same problem with regard to long long, ORCA/C should be linked with the latest version of ORCALib (which also works around some instances of the problem at run time). There are still other cases involving code in SysFloat that has not yet been patched.
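Here is an example of the kind of conversion affected:
#include <stdio.h>
int main(void) {
extended e = -2147483648.0;
comp c = e; /* extended-to-comp conversion */
printf("%ld\n", (long)c); /* should print -2147483648 */
}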
Parameters declared directly with array types were already adjusted to pointer types in commit 5b953e2db091, but this code is needed for the remaining case where a typedef'd array type is used.
With these changes, 'array' parameters are treated for all purposes as really having pointer types, which is what the standards call for. This affects at least their size as reported by sizeof and the debugging information generated for them.
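For example, in the following program (the typedef name is hypothetical), the parameter really has type int *, so sizeof reports the size of a pointer rather than the size of the array:
#include <stdio.h>
typedef int vec10[10];
void f(vec10 a) {
printf("%lu\n", (unsigned long)sizeof a); /* sizeof(int *), not sizeof(int[10]) */
}
int main(void) {
int v[10];
f(v);
return 0;
}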
This allows the length of the string plus a few extra bytes used internally to be represented by a 16-bit integer. Since the size limit for memory allocations has been raised, there is no good reason to impose a shorter limit on strings.
Note that C99 and later specify a minimum translation limit for string constants of at least 4095 characters.
In the new implementation, variable arguments are not removed until the end of the function. This allows variable argument processing to be restarted, and it prevents the addresses of local variables from changing in the middle of the function. The requirement to turn off stack repair code around varargs functions is also removed.
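Here is an example that restarts variable argument processing, which the new implementation supports:
#include <stdarg.h>
#include <stdio.h>
void sum_twice(int n, ...) {
va_list ap;
int i, total = 0;
va_start(ap, n);
for (i = 0; i < n; i++) total += va_arg(ap, int);
va_end(ap);
va_start(ap, n); /* restart processing of the variable arguments */
for (i = 0; i < n; i++) total += va_arg(ap, int);
va_end(ap);
printf("%d\n", total);
}
int main(void) {
sum_twice(3, 1, 2, 3); /* should print 12 */
}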
This fixes #58.
When using such a string literal to initialize an array with automatic storage duration, the bytes after the first null would be set to 0, rather than the values from the string literal.
Here is an example program showing the problem:
#include <stdio.h>
int main(void) {
char s[] = "a\0b";
puts(s+2);
}