ORCA-C

Commit Graph

Author	SHA1	Message	Date
Stephen Heumann	00e7fe7125	Increase some size limits. The new value of maxLocalLabel is aligned with the C99+ requirement to support "511 identifiers with block scope declared in one block". The value of maxLabel is now the maximum it can be while keeping the size of the labelTab array under 32 KiB. (I'm not entirely sure the address calculations in the code generated by ORCA/Pascal would work correctly beyond that.)	2022-07-08 21:30:14 -05:00
Stephen Heumann	f0d827eade	Generate more efficient code for certain subtractions. This affects 16-bit subtractions where only the left operand is "complex" (i.e. most things other than constants and simple loads). They were using an unnecessarily complicated code path suitable for the case where both operands are complex.	2022-07-07 18:38:41 -05:00
Stephen Heumann	7898c619c8	Fix several cases where a condition might not be evaluated correctly. These could occur because the code for certain operations was assumed to set the z flag based on the result value, but did not actually do so. The affected operations were shifts, loads or stores of bit-fields, and ? : expressions. Here is an example showing the problem with a shift: #pragma optimize 1 int main(void) { int i = 1, j = 0; return (i >> j) ? 1 : 0; } Here is an example showing the problem with a bit-field load: struct { signed int i : 16; } s = {1}; int main(void) { return (s.i) ? 1 : 0; } Here is an example showing the problem with a bit-field store: #pragma optimize 1 struct { signed int i : 16; } s; int main(void) { return (s.i = 1) ? 1 : 0; } Here is an example showing the problem with a ? : expression: #pragma optimize 1 int main(void) { int a = 5; return (a ? (a<<a) : 0) ? 0 : 1; }	2022-07-07 18:26:37 -05:00
Stephen Heumann	393b7304a0	Optimize 16-bit multiplication by various constants. This optimizes most multiplications by a power of 2 or the sum of two powers of 2, converting them to equivalent operations using shifts which should be faster than the general-purpose multiplication routine.	2022-07-06 22:24:54 -05:00
Stephen Heumann	497e5c036b	Use new 16-bit unsigned multiply routine that complies with C standards. This changes unsigned 16-bit multiplies to use the new ~CUMul2 routine in ORCALib, rather than ~UMul2 in SysLib. They differ in that ~CUMul2 gives the low-order 16 bits of the true result in case of overflow. The C standards require this behavior for arithmetic on unsigned types.	2022-07-06 22:22:02 -05:00
Stephen Heumann	11a3195c49	Use proper result type for statically evaluated ternary operators. Also, update the tests to include statically-evaluated cases.	2022-07-04 22:30:25 -05:00
Stephen Heumann	f5d5b88002	Correct result strings in a couple tests.	2022-07-04 22:29:15 -05:00
Stephen Heumann	f6fedea288	Update release notes and header to reflect recent stdio fixes.	2022-07-04 22:28:45 -05:00
Stephen Heumann	06bf0c5f46	Remove macro definition of rewind() which does not clear the IO error indicator. Now rewind() will always be called as a function. In combination with an update to the rewind() function in ORCALib, this will ensure that the error indicator is always cleared, as required by the C standards.	2022-06-24 18:32:08 -05:00
Stephen Heumann	c987f240c6	Optimize out ? : operations with constant conditions. The condition expression may become a constant due to optimizations, and optimizing out the ? : operation may also enable further optimizations.	2022-06-24 18:23:29 -05:00
Stephen Heumann	102d6873a3	Fix type checking and result type computation for ? : operator. This was non-standard in various ways, mainly in regard to pointer types. It has been rewritten to closely follow the specification in the C standards. Several helper functions dealing with types have been introduced. They are currently only used for ? :, but they might also be useful for other purposes. New tests are also introduced to check the behavior for the ? : operator. This fixes #35 (including the initializer-specific case).	2022-06-23 22:05:34 -05:00
Stephen Heumann	15dc3a46c4	Allow casts between long long and pointer types. This applies to casts in executable code. Some casts in initializers still don't work.	2022-06-20 21:55:54 -05:00
Stephen Heumann	5e20e02d06	Add a function to make a pointer type. This allows us to refactor out code that was doing this in several places.	2022-06-19 17:55:08 -05:00
Stephen Heumann	e5501dc902	Update test for maximum length of string constants.	2022-06-18 22:03:22 -05:00
Stephen Heumann	58849607a1	Use cgPointerSize for size of pointers in various places. This makes no practical difference when targeting the GS, but it better documents what the relevant size is.	2022-06-18 19:30:20 -05:00
Stephen Heumann	a3104853fc	Treat string constant base types as having unknown number of elements. I am not aware of any effect from this, but the change makes their element count consistent with the size of 0 (indicating an incomplete type).	2022-06-18 19:18:29 -05:00
Stephen Heumann	802ba3b0ba	Make unary & always yield a pointer type, not an array. This affects expressions like &a (where a is an array) or &"string". In most contexts, these undergo array-to-pointer conversion anyway, but as an operand of sizeof they do not. This leads to sizeof either giving the wrong value (the size of the array rather than of a pointer) or reporting an error when the array size is not recorded as part of the type (which is currently the case for string constants). In combination with an earlier patch, this fixes #8.	2022-06-18 18:53:29 -05:00
Stephen Heumann	91b63f94d3	Note an error in the manual.	2022-06-17 18:45:59 -05:00
Stephen Heumann	67ffeac7d4	Use the proper type for expressions like &"string". These should have a pointer-to-array type, but they were treated like pointers to the first element.	2022-06-17 18:45:11 -05:00
Stephen Heumann	5e08ef01a9	Use quotes around "C" locale in release notes. This is consistent with the usage in the C standards.	2022-06-15 21:54:11 -05:00
Stephen Heumann	8406921147	Parse command-line macros more consistently with macros in code. This makes a macro defined on the command line like -Dfoo=-1 consist of two tokens, the same as it would if defined in code. (Previously, it was just one token.) This also somewhat expands the set of macros accepted on the command line. A prefix of +, -, *, &, ~, or ! (the one-character unary operators) can now be used ahead of any identifier, number, or string. Empty macro definitions like -Dfoo= are also permitted.	2022-06-15 21:52:35 -05:00
Stephen Heumann	161bb952e3	Dynamically allocate string space, and make it larger. This increases the limit on total bytes of strings in a function, and also frees up space in the blank segment.	2022-06-08 22:09:30 -05:00
Stephen Heumann	3c2b492618	Add support for compound literals within functions. The basic approach is to generate a single expression tree containing the code for the initialization plus the reference to the compound literal (or its address). The various subexpressions are joined together with pc_bno pcodes, similar to the code generated for the comma operator. The initializer expressions are placed in a balanced binary tree, so that it is not excessively deep. Note: Common subexpression elimination has poor performance for very large trees. This is not specific to compound literals, but compound literals for relatively large arrays can run into this issue. It will eventually complete and generate a correct program, but it may be quite slow. To avoid this, turn off CSE.	2022-06-08 21:34:12 -05:00
Stephen Heumann	a85846cc80	Fix codegen bug with pc_bno in some cases with a 64-bit right operand. The desired location for the quad result was not saved, so it could be overwritten when generating code for the left operand. This could result in incorrect code that might trash the stack. Here is an example affected by this: #pragma optimize 1 int main(void) { long long a, b=2; char c = (a=1,b); }	2022-06-08 20:49:32 -05:00
Stephen Heumann	0e8b485f8f	Improve debug printing of pcodes. Some pcode instruction names were missing.	2022-06-08 20:49:16 -05:00
Stephen Heumann	0b6d150198	Move Short function out of the blank segment. This makes a bit more room in the blank segment, which is necessary when codegen debugging is enabled.	2022-05-26 19:15:03 -05:00
Stephen Heumann	58771ec71c	Do not do macro expansion after each ## operator is evaluated. It should only be done after all the ## operators in the macro have been evaluated, potentially merging together several tokens via successive ## operators. Here is an example illustrating the problem: #define merge(a,b,c) a##b##c #define foobar #define foobarbaz a int merge(foo,bar,baz) = 42; int main(void) { return a; }	2022-05-24 22:38:56 -05:00
Stephen Heumann	deca73d233	Properly expand macros that have the same name as a keyword or typedef. If such macros were used within other macros, they would generally not be expanded, due to the order in which operations were evaluated during preprocessing. This is actually an issue that was fixed by the changes from ORCA/C 2.1.0 to 2.1.1 B3, but then broken again by commit `d0b4b75970`. Here is an example with the name of a keyword: #define X long int #define long X x; int main(void) { return sizeof(x); /* should be sizeof(int) */ } Here is an example with the name of a typedef: typedef short T; #define T long #define X T X x; int main(void) { return sizeof(x); /* should be sizeof(long) */ }	2022-05-24 22:22:37 -05:00
Stephen Heumann	daff1754b2	Make volatile loads from IO/softswitches access exactly the byte(s) specified. Previously, one-byte loads were typically done by reading a 16-bit value and then masking off the upper 8 bits. This is a problem when accessing softswitches or slot IO locations, because reading the subsequent byte may have some undesired effect. Now, ORCA/C will do an 8-bit read for such cases, if the volatile qualifier is used. There were also a couple optimizations that could occasionally result in not all the bytes of a larger value actually being read. These are now disabled for volatile loads that may access softswitches or IO. These changes should make ORCA/C more suitable for writing low-level software like device drivers.	2022-05-23 21:10:29 -05:00
Stephen Heumann	21f266c5df	Require use of digraphs in macro redefinitions to match the original. This is part of the general requirement that macro redefinitions be "identical" as defined in the standard. This affects code like: #define x [ #define x <:	2022-04-05 19:47:22 -05:00
Stephen Heumann	a1d57c4db3	Allow ORCA/C-specific keywords to be disabled via a new pragma. This allows those tokens (asm, comp, extended, pascal, and segment) to be used as identifiers, consistent with the C standards. A new pragma (#pragma extensions) is introduced to control this. It might also be used for other things in the future.	2022-03-26 18:45:47 -05:00
Stephen Heumann	b2edeb4ad1	Properly stringize tokens that start with a trigraph. This did not work correctly before, because such tokens were recorded as starting with the third character of the trigraph. Here is an example affected by this: #define mkstr(a) # a #include <stdio.h> int main(void) { puts(mkstr(??!)); puts(mkstr(??!??!)); puts(mkstr('??<')); puts(mkstr(+??!)); puts(mkstr(+??')); }	2022-03-25 18:10:13 -05:00
Stephen Heumann	f531f38463	Use suffixes on numeric constants in #pragma expand output. A suffix will now be printed on any integer constant with a type other than int, or any floating constant with a type other than double. This ensures that all constants have the correct types, and also serves as documentation of the types.	2022-03-01 19:46:14 -06:00
Stephen Heumann	182cf66754	Properly stringize tokens with line continuations or non-initial trigraphs. Previously, continuations or trigraphs would be included in the string as-is, which should not be the case because they are (conceptually) processed in earlier compilation phases. Initial trigraphs still do not get stringized properly, because the token starting position is not recorded correctly for them. This fixes code like the following: #define mkstr(a) # a #include <stdio.h> int main(void) { puts(mkstr(a\ bc)); puts(mkstr(qr\ )); puts(mkstr(\ xy)); puts(mkstr(12??/ 34)); puts(mkstr('??<')); }	2022-03-01 19:01:11 -06:00
Stephen Heumann	fec7b57ec2	Generate a string representation of tokens merged with ##. This is necessary for correct behavior if such tokens are subsequently stringized with #. Previously, only the first half of the token would be produced. Here is an example demonstrating the issue: #define mkstr(a) # a #define in_between(a) mkstr(a) #define joinstr(a,b) in_between(a ## b) #include <stdio.h> int main(void) { puts(joinstr(123,456)); puts(joinstr(abc,def)); puts(joinstr(dou,ble)); puts(joinstr(+,=)); puts(joinstr(:,>)); }	2022-02-22 18:48:34 -06:00
Stephen Heumann	6cfe8cc886	Remove an unused string representation of macro tokens. The string representation of macro tokens is needed for some preprocessor operations, but we get this in other ways (e.g. based on tokenStart/tokenEnd).	2022-02-21 18:39:39 -06:00
Stephen Heumann	8f27b8abdb	Print any ## tokens in #pragma expand output. Note that ## will not currently be recognized as a token in some contexts, leading to it not being printed.	2022-02-20 20:53:37 -06:00
Stephen Heumann	bf7a6fa5db	Use separate functions for merging tokens with ## and merging adjacent strings. These are conceptually separate operations occurring in different phases of the translation process. This change means that ## can no longer merge string constants: such operations will give an error about an illegal token. Cases like this are technically undefined behavior, so the old behavior could have been permitted, but it is clearer and more consistent with other compilers to treat this as an error.	2022-02-20 20:16:08 -06:00
Stephen Heumann	26e1bfc253	Allow generation of digraphs via ## token merging.	2022-02-20 18:57:03 -06:00
Stephen Heumann	2b062a8392	Make ## token merging on character constants give an error. This ultimately should be supported, but that will be more work. For now, we just set the string representation to '?', which will usually give an error when merged. (Previously, whatever was at memory location 0 would be treated as the string representation of the token. Frequently this would just be an empty string, leading to no error but incorrect results.)	2022-02-20 16:19:00 -06:00
Stephen Heumann	da978932bf	Save string representation of macros defined on command line. This is necessary for correct operation of the # and ## preprocessor operators on the tokens from such macros. Integers with a sign character still have the non-standard property of being treated as a single token, so they cannot be used with ##, but in most cases such uses will now give an error.	2022-02-20 15:35:49 -06:00
Stephen Heumann	2a9ec8fc43	Explicitly terminate PCH generation if there is an initialized variable. Initialized variables have always been one of the things that stops PCH generation, but previously this was only detected when trying to write out the symbol records at the point of a later #include. On a subsequent compile using the sym file, nothing would recognize that PCH generation had stopped for this reason, so the PCH code would recognize the later #include as a potential opportunity to extend the sym file, and therefore would delete it to force regeneration next time. This led to the sym file being deleted and regenerated on alternate compiles, so its full benefit was not realized. There is code in Header.pas to abort PCH generation if an initialized symbol is found. That is probably superfluous after this change, but it has been left in place for now.	2022-02-19 14:22:25 -06:00
Stephen Heumann	aabbadb34b	Terminate header generation if #warning is encountered. This is necessary to ensure that the warning message is printed on subsequent compiles.	2022-02-19 14:06:15 -06:00
Stephen Heumann	a73dce103b	Terminate PCH generation if an #append is encountered. If the appended file was another C file and that file contained an #include, this would create an invalid record in the sym file. It would record memory from the buffer holding the original file to the buffer holding the appended file. In general, these are not contiguous, so superfluous data from other parts of memory would be included in the sym file. This record would normally just be treated as invalid on subsequent compiles, but it could theoretically be very large (depending on the memory layout) and might contain sensitive data from other parts of memory.	2022-02-19 14:05:07 -06:00
Stephen Heumann	1e98a63bf4	Avoid generating duplicate "Including ..." messages. This could happen if a header was saved in the sym file, but the sym file data was not actually used because the source code in the main file did not match what was saved.	2022-02-16 21:31:49 -06:00
Stephen Heumann	f2d6625300	Save #pragma path directives in sym files. They were not being saved, which would result in ORCA/C not searching the proper paths when looking for an include file after the sym file had ended. Here is an example showing the problem: #pragma path "include" #include <stdio.h> int k = 50; #include "n.h" /* will not find include:n.h */	2022-02-15 21:27:35 -06:00
Stephen Heumann	30fcc7227f	Tweak comments in Scanner.asm. There are no code changes.	2022-02-15 20:51:16 -06:00
Stephen Heumann	3893db1346	Make sure #pragma expand is properly applied in all cases. There were various places where the flag for macro expansions was saved, set to false, and then later restored. If #pragma expand was used within those areas, it would not be properly applied. Here is an example showing that problem: void f(void #pragma expand 1 ) {} This could also affect some uses of #pragma expand within precompiled headers, e.g.: #pragma expand 1 #include "a.h" #undef foobar #include "b.h" ... Also, add a note saying that code in precompiled headers will not be expanded. (This has always been the case, but was not clearly documented.)	2022-02-15 20:50:02 -06:00
Stephen Heumann	8c0d65616c	Remove an unnecessary variable.	2022-02-13 21:36:03 -06:00
Stephen Heumann	c96cf4f1dd	Do not save predefined and command-line macros in the sym file. Previously, these might or might not be saved (based on the contents of uninitialized memory), but in many cases they were. This was unnecessary, since these macros are automatically defined when the scanner is initialized. Reading them from the sym file could result in duplicate copies of them in the macro list. This is usually harmless, but might result in #undefs of macros from the command line not working properly.	2022-02-13 20:17:33 -06:00
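
The two multiplication commits in the log above (`497e5c036b` and `393b7304a0`) describe behavior that can be illustrated in portable C. The following is a minimal sketch, not the code ORCA/C generates: the function names `mul16_wrap`, `mul16_by_8`, and `mul16_by_10` are hypothetical, and `uint16_t` stands in for the 16-bit `unsigned int` of the target. The first function shows the wraparound result the commit message attributes to `~CUMul2` (the low-order 16 bits of the true product); the other two show the kind of shift-based rewrite described for multiplication by a power of 2 or by a sum of two powers of 2.

```c
#include <stdint.h>

/* Low-order 16 bits of the true product: the wraparound result that
   the C standards require for arithmetic on unsigned types.
   The operands are widened to 32 bits first so that integer promotion
   on a host with 32-bit int cannot cause signed overflow. */
uint16_t mul16_wrap(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}

/* x * 8: multiplication by a single power of 2 becomes one shift. */
uint16_t mul16_by_8(uint16_t x)
{
    return (uint16_t)((uint32_t)x << 3);
}

/* x * 10: 10 = 8 + 2, a sum of two powers of 2, so the product is
   the sum of two shifted copies of x, truncated to 16 bits. */
uint16_t mul16_by_10(uint16_t x)
{
    return (uint16_t)(((uint32_t)x << 3) + ((uint32_t)x << 1));
}
```

For any 16-bit `x`, `mul16_by_10(x)` and `mul16_wrap(x, 10)` give the same value, including when the true product overflows 16 bits, which is why a shift-based form can replace a call to the general-purpose routine without changing results.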


621 Commits
