This affects code like the following:
enum E {a,b,c};
int main(void) {
    enum E e;
    struct E {int x;}; /* or: enum E {x,y,z}; */
}
The line "enum E e;" should refer to the enum type declared in the outer scope, but not redeclare it in the inner scope. Therefore, a subsequent struct, union, or enum declaration using the same tag in the same scope is acceptable.
They now use a jmp (addr,X) instruction, rather than a more complicated code sequence using rts. This is an improvement that was suggested in an old Genie message from Todd Whitesel.
This avoids strangeness where an enum tag declared within a typedef declaration would act like a typedef. For example, the following would compile without error:
typedef enum E {a,b,c} T;
E e;
This covers code like the following, which is very dubious but does not seem to be clearly prohibited by the standards:
int main(void) {
    void *vp;
    *vp;
}
Previously, this would do an indirect load of a four-byte value at the location, but then treat it as void. This could lead to the four-byte value being left on the stack, eventually causing a crash. Now we just evaluate the pointer expression (in case it has side effects), but effectively cast it to void without dereferencing it.
This avoids any possible issue with code that expects __FE_DFL_ENV to be in the data bank when using the large memory model, although I don't think that happened in practice.
This converts comparisons like x > N (with constant N) to be evaluated as x >= N+1, since >= comparisons generate better code. This is possible as long as N is not the maximum value in the type; if it is, the comparison is always false. There are also a few other tweaks to the generated code in some cases.
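For instance (an illustrative case, not taken from the change itself), the comparison in the following function can be evaluated as x >= 6:

int past_limit(int x) {
    return x > 5;    /* evaluated as x >= 6 */
}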
The optimization could turn an unsigned comparison "x <= 0xFFFF" into "x < 0": since 0xFFFF is the maximum 16-bit unsigned value, computing N+1 wraps around to zero.
Here is an example affected by this:
int main(void) {
    unsigned i = 1;
    return (i <= 0xffff);
}
The TAY instruction would set the flags, but that is unnecessary because pc_cnv is a "NeedsCondition" operation (and some other conversions also do not reliably set the flags).
The code is also changed to preserve the Y register, possibly facilitating register optimizations.
This was a problem introduced in commit 393b7304a0. It could cause a compiler error for unoptimized array indexing code, e.g.:
int a[100];
int main(void) {
    return a[5];
}
The new value of maxLocalLabel is aligned with the C99+ requirement to support "511 identifiers with block scope declared in one block".
The value of maxLabel is now the maximum it can be while keeping the size of the labelTab array under 32 KiB. (I'm not entirely sure the address calculations in the code generated by ORCA/Pascal would work correctly beyond that.)
This affects 16-bit subtractions where only the left operand is "complex" (i.e. most things other than constants and simple loads). They were using an unnecessarily complicated code path suitable for the case where both operands are complex.
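Here is an illustrative example (hypothetical, not from the original report) where only the left operand is complex:

int f(int a, int b, int c) {
    return (a + b) - c;    /* (a + b) is complex; c is a simple load */
}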
These could occur because the code for certain operations was assumed to set the z flag based on the result value, but did not actually do so. The affected operations were shifts, loads or stores of bit-fields, and ? : expressions.
Here is an example showing the problem with a shift:
#pragma optimize 1
int main(void) {
    int i = 1, j = 0;
    return (i >> j) ? 1 : 0;
}
Here is an example showing the problem with a bit-field load:
struct {
    signed int i : 16;
} s = {1};
int main(void) {
    return (s.i) ? 1 : 0;
}
Here is an example showing the problem with a bit-field store:
#pragma optimize 1
struct {
    signed int i : 16;
} s;
int main(void) {
    return (s.i = 1) ? 1 : 0;
}
Here is an example showing the problem with a ? : expression:
#pragma optimize 1
int main(void) {
    int a = 5;
    return (a ? (a<<a) : 0) ? 0 : 1;
}
This optimizes most multiplications by a power of 2 or the sum of two powers of 2, converting them to equivalent operations using shifts, which should be faster than the general-purpose multiplication routine.
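For instance (hypothetical cases; the exact code generated may differ):

unsigned f(unsigned x) {
    unsigned a = x * 8;     /* power of 2: can become x << 3 */
    unsigned b = x * 10;    /* 8 + 2: can become (x << 3) + (x << 1) */
    return a + b;
}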
This changes unsigned 16-bit multiplies to use the new ~CUMul2 routine in ORCALib, rather than ~UMul2 in SysLib. They differ in that ~CUMul2 gives the low-order 16 bits of the true result in case of overflow. The C standards require this behavior for arithmetic on unsigned types.
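For example, assuming 16-bit unsigned int as in ORCA/C, the following should print 18928, i.e. (50000 * 3) % 65536:

#include <stdio.h>
int main(void) {
    unsigned a = 50000, b = 3;
    printf("%u\n", a * b);    /* true result 150000 wraps to 18928 */
}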
Now rewind() will always be called as a function. In combination with an update to the rewind() function in ORCALib, this will ensure that the error indicator is always cleared, as required by the C standards.
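A minimal sketch of the required behavior (illustrative, not from the test suite):

#include <stdio.h>
void reset_stream(FILE *f) {
    /* Suppose a previous operation set the error indicator on f. */
    rewind(f);    /* must clear the error indicator (C99 7.19.9.5) */
    /* ferror(f) now returns 0 */
}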
This was non-standard in various ways, mainly in regard to pointer types. It has been rewritten to closely follow the specification in the C standards.
Several helper functions dealing with types have been introduced. They are currently only used for ? :, but they might also be useful for other purposes.
New tests are also introduced to check the behavior for the ? : operator.
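As an illustration of the pointer rules involved (these examples follow C99 6.5.15; they are not necessarily the cases covered by the new tests):

void examples(int c, const int *cip, volatile int *vip, void *vp) {
    const volatile int *p1 = c ? cip : vip;    /* qualifiers of the two operands are merged */
    const void *p2 = c ? cip : vp;             /* object pointer and void pointer give a qualified void pointer */
}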
This fixes #35 (including the initializer-specific case).
This affects expressions like &*a (where a is an array) or &*"string". In most contexts, these undergo array-to-pointer conversion anyway, but as an operand of sizeof they do not. This leads to sizeof either giving the wrong value (the size of the array rather than of a pointer) or reporting an error when the array size is not recorded as part of the type (which is currently the case for string constants).
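For example (an illustrative case):

int a[10];
int main(void) {
    return sizeof &*a;    /* should be sizeof(int *), not sizeof a */
}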
In combination with an earlier patch, this fixes #8.
This makes a macro defined on the command line like -Dfoo=-1 consist of two tokens, the same as it would if defined in code. (Previously, it was just one token.)
This also somewhat expands the set of macros accepted on the command line. A prefix of +, -, *, &, ~, or ! (the one-character unary operators) can now be used ahead of any identifier, number, or string. Empty macro definitions like -Dfoo= are also permitted.
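For example, these command-line definitions (illustrative) now behave like the corresponding in-source directives:

#define foo -1    /* -Dfoo=-1: two tokens, '-' and '1' */
#define bar ~0    /* -Dbar=~0: unary-operator prefix */
#define baz       /* -Dbaz=: empty definition */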
The basic approach is to generate a single expression tree containing the code for the initialization plus the reference to the compound literal (or its address). The various subexpressions are joined together with pc_bno pcodes, similar to the code generated for the comma operator. The initializer expressions are placed in a balanced binary tree, so that it is not excessively deep.
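Here is an illustrative example of the kind of construct this supports:

int sum(void) {
    int *p = (int []){1, 2, 3};    /* compound literal; its initialization is generated as described above */
    return p[0] + p[1] + p[2];
}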
Note: Common subexpression elimination has poor performance for very large trees. This is not specific to compound literals, but compound literals for relatively large arrays can run into this issue. It will eventually complete and generate a correct program, but it may be quite slow. To avoid this, turn off CSE.
The desired location for the quad result was not saved, so it could be overwritten when generating code for the left operand. This could result in incorrect code that might trash the stack.
Here is an example affected by this:
#pragma optimize 1
int main(void) {
    long long a, b=2;
    char c = (a=1,b);
}