ORCA-C

mirror of https://github.com/byteworksinc/ORCA-C.git synced 2025-01-20 00:29:50 +00:00

Author	SHA1	Message	Date
Stephen Heumann	2550081517	Fix bug with 4-byte comparisons against globals in large memory model. Long addressing was not being used to access the values, which could lead to mis-evaluation of comparisons against values in global structs, unions, or arrays, depending on the memory layout. This could sometimes affect the c99desinit.c test, when run with large memory model and at least intermediate code peephole optimization. It could also affect this simpler test (depending on memory layout): #pragma memorymodel 1 #pragma optimize 1 struct S { void *p; } s = {&s}; int main(void) { return s.p != &s; /* should be 0 */ }	2022-12-04 21:54:29 -06:00
Stephen Heumann	19683706cc	Do not optimize code from asm statements. Previously, the assembly-level optimizations applied to code in asm statements. In many cases, this was fine (and could even do useful optimizations), but occasionally the optimizations could be invalid. This was especially the case if the assembly involved tricky things like self-modifying code. To avoid these problems, this patch makes the assembly optimizers ignore code from asm statements, so it is always emitted as-is, without any changes. This fixes #34.	2022-10-12 22:03:37 -05:00
Stephen Heumann	ca21e33ba7	Generate more efficient code for indirect function calls.	2022-10-11 21:14:40 -05:00
Stephen Heumann	05ecf5eef3	Add option to use the declared type for float/double/comp params. This differs from the usual ORCA/C behavior of treating all floating-point parameters as extended. With the option enabled, they will still be passed in the extended format, but will be converted to their declared type at the start of the function. This is needed for strict standards conformance, because you should be able to take the address of a parameter and get a usable pointer to its declared type. The difference in types can also affect the behavior of _Generic expressions. The implementation of this is based on ORCA/Pascal, which already did the same thing (unconditionally) with real/double/comp parameters.	2022-09-18 21:16:46 -05:00
Stephen Heumann	60efb4d882	Generate better code for indexed jumps. They now use a jmp (addr,X) instruction, rather than a more complicated code sequence using rts. This is an improvement that was suggested in an old Genie message from Todd Whitesel.	2022-07-18 21:18:26 -05:00
Stephen Heumann	bdf8ed4f29	Simplify some code.	2022-07-17 18:15:29 -05:00
Stephen Heumann	417fd1ad9c	Generate better code for && and ||.	2022-07-11 21:16:18 -05:00
Stephen Heumann	312a3a09b9	Generate better code for long long >= comparisons.	2022-07-11 19:20:55 -05:00
Stephen Heumann	687a5eaa45	Generate better code for pc_not on boolean operands.	2022-07-11 18:54:39 -05:00
Stephen Heumann	b5b76b624c	Use pei rather than load+push in a few places.	2022-07-11 18:42:14 -05:00
Stephen Heumann	607211d38e	Rearrange some labels to facilitate branch-shortening optimization.	2022-07-11 18:39:00 -05:00
Stephen Heumann	9b31e7f72a	Improve code generation for comparisons. This converts comparisons like x > N (with constant N) to instead be evaluated as x >= N+1, since >= comparisons generate better code. This is possible as long as N is not the maximum value in the type, but in that case the comparison is always false. There are also a few other tweaks to the generated code in some cases.	2022-07-10 22:27:38 -05:00
Stephen Heumann	76e4b1f038	Optimize away some tax/tay instructions used only to set flags.	2022-07-10 17:35:56 -05:00
Stephen Heumann	bf40e861aa	Fix indentation.	2022-07-10 13:12:10 -05:00
Stephen Heumann	2dff68e6ae	Eliminate an unnecessary instruction in quad-to-word conversion. The TAY instruction would set the flags, but that is unnecessary because pc_cnv is a "NeedsCondition" operation (and some other conversions also do not reliably set the flags). The code is also changed to preserve the Y register, possibly facilitating register optimizations.	2022-07-09 21:48:56 -05:00
Stephen Heumann	4470626ade	Optimize division/remainder by various constants. This generally covers powers of two and certain other values. (Details differ for signed/unsigned div/rem.)	2022-07-09 15:05:47 -05:00
Stephen Heumann	054719aab2	Fix bug in code generation for the product of two constants. This was a problem introduced in commit 393b7304a05. It could cause a compiler error for unoptimized array indexing code, e.g.: int a[100]; int main(void) { return a[5]; }	2022-07-09 15:01:25 -05:00
Stephen Heumann	f0d827eade	Generate more efficient code for certain subtractions. This affects 16-bit subtractions where only the left operand is "complex" (i.e. most things other than constants and simple loads). They were using an unnecessarily complicated code path suitable for the case where both operands are complex.	2022-07-07 18:38:41 -05:00
Stephen Heumann	7898c619c8	Fix several cases where a condition might not be evaluated correctly. These could occur because the code for certain operations was assumed to set the z flag based on the result value, but did not actually do so. The affected operations were shifts, loads or stores of bit-fields, and ? : expressions. Here is an example showing the problem with a shift: #pragma optimize 1 int main(void) { int i = 1, j = 0; return (i >> j) ? 1 : 0; } Here is an example showing the problem with a bit-field load: struct { signed int i : 16; } s = {1}; int main(void) { return (s.i) ? 1 : 0; } Here is an example showing the problem with a bit-field store: #pragma optimize 1 struct { signed int i : 16; } s; int main(void) { return (s.i = 1) ? 1 : 0; } Here is an example showing the problem with a ? : expression: #pragma optimize 1 int main(void) { int a = 5; return (a ? (a<<a) : 0) ? 0 : 1; }	2022-07-07 18:26:37 -05:00
Stephen Heumann	393b7304a0	Optimize 16-bit multiplication by various constants. This optimizes most multiplications by a power of 2 or the sum of two powers of 2, converting them to equivalent operations using shifts which should be faster than the general-purpose multiplication routine.	2022-07-06 22:24:54 -05:00
Stephen Heumann	497e5c036b	Use new 16-bit unsigned multiply routine that complies with C standards. This changes unsigned 16-bit multiplies to use the new ~CUMul2 routine in ORCALib, rather than ~UMul2 in SysLib. They differ in that ~CUMul2 gives the low-order 16 bits of the true result in case of overflow. The C standards require this behavior for arithmetic on unsigned types.	2022-07-06 22:22:02 -05:00
Stephen Heumann	161bb952e3	Dynamically allocate string space, and make it larger. This increases the limit on total bytes of strings in a function, and also frees up space in the blank segment.	2022-06-08 22:09:30 -05:00
Stephen Heumann	a85846cc80	Fix codegen bug with pc_bno in some cases with a 64-bit right operand. The desired location for the quad result was not saved, so it could be overwritten when generating code for the left operand. This could result in incorrect code that might trash the stack. Here is an example affected by this: #pragma optimize 1 int main(void) { long long a, b=2; char c = (a=1,b); }	2022-06-08 20:49:32 -05:00
Stephen Heumann	daff1754b2	Make volatile loads from IO/softswitches access exactly the byte(s) specified. Previously, one-byte loads were typically done by reading a 16-bit value and then masking off the upper 8 bits. This is a problem when accessing softswitches or slot IO locations, because reading the subsequent byte may have some undesired effect. Now, ORCA/C will do an 8-bit read for such cases, if the volatile qualifier is used. There were also a couple optimizations that could occasionally result in not all the bytes of a larger value actually being read. These are now disabled for volatile loads that may access softswitches or IO. These changes should make ORCA/C more suitable for writing low-level software like device drivers.	2022-05-23 21:10:29 -05:00
Stephen Heumann	785a6997de	Record source file changes within a function as part of debug info. This affects functions whose body spans multiple files due to includes, or is treated as doing so due to #line directives. ORCA/C will now generate a COP 6 instruction to record each source file change, allowing debuggers to properly track the flow of execution across files.	2022-02-05 18:32:11 -06:00
Stephen Heumann	5ac79ff36c	Stop capitalizing file names in debug information. This does not seem to be necessary for any of the debuggers (at least in their latest versions), and it obviously causes problems with case-sensitive filesystems.	2022-02-04 22:15:02 -06:00
Stephen Heumann	242bef1f6e	Correct a comment.	2022-01-17 18:27:10 -06:00
Stephen Heumann	7584f8185c	Add ability to force stack repair and checking off for certain calls. This can be used on library calls generated by the compiler for internal purposes.	2021-10-19 22:10:04 -05:00
Stephen Heumann	5871820e0c	Support UTF-8/16/32 string literals and character constants (C11). These have u8, u, or U prefixes, respectively. The types char16_t and char32_t (defined in <uchar.h>) are used for UTF-16 and UTF-32 code points.	2021-10-11 20:54:37 -05:00
Stephen Heumann	650ff4697f	Update release notes to include a bug fix in ORCALib. Also, update a comment to reflect the actual behavior.	2021-09-17 19:28:21 -05:00
Stephen Heumann	d72c0fb9a5	Fix bug in some cases where a byte value is loaded and then stored as a word. It could wind up storing garbage in the upper 8 bits of the destination, because it was not doing a proper 8-bit to 16-bit conversion. This is an old bug, but the change in commit 95f518244212c caused it to be triggered in more cases, e.g. in the C7.5.1.1.CC test case. Here is a case that could exhibit the bug even before that: #pragma optimize 1 #include <stdio.h> int main(void) { int k[1]; int i = 0; unsigned char uch = 'm'; k[i] = uch; printf("%i\n", k[0]); }	2021-09-03 18:10:27 -05:00
Stephen Heumann	acddd93ffb	Avoid a precision reduction in some cases where it is not needed.	2021-03-06 23:14:29 -06:00
Stephen Heumann	fc515108f4	Make floating-point casts reduce the range and precision of numbers. The C standards generally allow floating-point operations to be done with extra range and precision, but they require that explicit casts convert to the actual type specified. ORCA/C was not previously doing that. This patch relies on some new library routines (currently in ORCALib) to do this precision reduction. This fixes #64.	2021-03-06 22:28:39 -06:00
Stephen Heumann	c0727315e0	Recognize byte swapping and generate an xba instruction for it. Specifically, this recognizes the pattern "(exp << 8) | (exp >> 8)", where exp has an unsigned 16-bit type and does not have side effects.	2021-03-05 22:00:13 -06:00
Stephen Heumann	95f5182442	Change copies to stores when the value is unused. This was already done by the optimizer, but it is simple enough to just do it all the time. This avoids most performance regressions from the previous commit, and also generates more efficient code for long long stores (in the common cases where the value of an assignment expression is not used in any larger expression).	2021-03-05 19:44:38 -06:00
Stephen Heumann	4a7e994da8	Eliminate extra precision when doing floating-point assignments. The value of an assignment expression should be exactly what gets written to the destination, without any extra range or precision. Since floating-point expressions generally do have extra precision, we need to load the actual stored value to get rid of it.	2021-03-05 19:21:54 -06:00
Stephen Heumann	4ad7a65de6	Process floating-point values within the compiler using the extended type. This means that floating-point constants can now have the range and precision of the extended type (aka long double), and floating-point constant expressions evaluated within the compiler also have that same range and precision (matching expressions evaluated at run time). This new behavior is intended to match the behavior specified in the C99 and later standards for FLT_EVAL_METHOD 2. This fixes the previous problem where long double constants and constant expressions of type long double were not represented and evaluated with the full range and precision that they should be. It also gives extra range and precision to constants and constant expressions of type double or float. This may have pluses and minuses, but at any rate it is consistent with the existing behavior for expressions evaluated at run time, and with one of the possible models of floating point evaluation specified in the C standards.	2021-03-04 23:58:08 -06:00
Stephen Heumann	36d31ab37c	Optimize quad == 0 comparisons.	2021-02-25 21:40:32 -06:00
Stephen Heumann	5c92a8a0d3	Do unsigned quad inequalities without loading operands on stack.	2021-02-25 20:18:59 -06:00
Stephen Heumann	c5c401d229	Do quad equality comparisons without loading operands on stack.	2021-02-25 20:03:13 -06:00
Stephen Heumann	f1c19d2940	Do unary quad ops without loading operand on stack.	2021-02-25 19:28:36 -06:00
Stephen Heumann	0b56689626	Do quad add/subtract without loading operands on stack. As with the previous support for bitwise ops, this applies if the operands are simple quad loads.	2021-02-25 18:26:26 -06:00
Stephen Heumann	043124db93	Implement support for doing quad ops without loading operands on stack. This works when both operands are simple loads, such that they can be broken up into operations on their subwords in a standard format. Currently, this is implemented for bitwise binary ops, but it can also be expanded to arithmetic, etc.	2021-02-24 19:44:46 -06:00
Stephen Heumann	b0a61fbadf	Let functions store a long long return value directly into a variable in the caller. This optimization works when the return value is stored directly to a local variable and not used otherwise (typically only recognized when using intermediate code peephole optimization).	2021-02-21 18:37:17 -06:00
Stephen Heumann	daff197811	Optimize some quad ops to use interleaved loads and stores. This allows them to bypass the intermediate step of loading the value onto the stack. Currently, this only works for simple cases where a value is loaded and immediately stored.	2021-02-20 23:38:42 -06:00
Stephen Heumann	3c0e4baf78	Basic infrastructure for using different quadword locations in codegen. For the moment, this does not really do anything, but it lays the groundwork for not always having to load quadword values to the stack before operating on or storing them.	2021-02-20 17:07:47 -06:00
Stephen Heumann	e3b24fb50b	Add support for real to long long conversions.	2021-02-16 18:47:28 -06:00
Stephen Heumann	e38be489df	Implement comparisons for signed long long. These use a library function to perform the comparison.	2021-02-15 18:10:34 -06:00
Stephen Heumann	d2d871181a	Implement comparisons (>, >=, <, <=) for unsigned long long.	2021-02-15 14:43:26 -06:00
Stephen Heumann	c537153ee5	Implement pc_ind (load indirect) for long long.	2021-02-13 21:42:06 -06:00
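Two of the rewrites described in the messages above (9b31e7f72a: x > N evaluated as x >= N+1; 393b7304a0: multiplication by a sum of two powers of 2 done with shifts) rest on simple arithmetic identities. The following is a minimal sketch of those identities in portable C. It is not ORCA/C's generated code or compiler source; the function names gt_via_ge and mul10_via_shifts are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: evaluates x > n using only a >= comparison.
   If n is the maximum value of the type, x > n is always false;
   otherwise x > n is equivalent to x >= n + 1. */
static int gt_via_ge(int16_t x, int16_t n)
{
    if (n == INT16_MAX)
        return 0;                      /* nothing is greater than the maximum */
    return x >= (int16_t)(n + 1);      /* n + 1 cannot overflow here */
}

/* Hypothetical helper: multiplies a 16-bit unsigned value by 10,
   written as 8*x + 2*x (10 is the sum of two powers of 2).
   The cast keeps only the low-order 16 bits of the result. */
static uint16_t mul10_via_shifts(uint16_t x)
{
    uint32_t wide = x;                 /* widen so the shifts cannot overflow */
    return (uint16_t)((wide << 3) + (wide << 1));
}

/* Checks that the rewritten forms agree with the direct forms. */
static void check_identities(void)
{
    assert(gt_via_ge(5, 4) == (5 > 4));
    assert(gt_via_ge(4, 4) == (4 > 4));
    assert(mul10_via_shifts(7) == (uint16_t)(7 * 10));
}
```

Calling check_identities() from a test program compares each rewritten form against the direct expression for a few sample values.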

94 Commits
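
The byte-swap pattern named in commit c0727315e0 and the overflow behavior named in commit 497e5c036b (low-order 16 bits of the true product for unsigned multiplies) can both be written out in standard C. This is a rough sketch assuming a host compiler with <stdint.h>, not code taken from ORCA/C or ORCALib; swap_bytes16 and umul16 are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the "(exp << 8) | (exp >> 8)" pattern on an
   unsigned 16-bit value. The final cast discards any bits above bit 15
   that appear when the operand is promoted to a wider int on the host. */
static uint16_t swap_bytes16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Hypothetical helper: 16-bit unsigned multiply that yields the
   low-order 16 bits of the true product, as the C standards require
   for unsigned arithmetic. The operand is widened to 32 bits first so
   the intermediate product is computed without signed overflow. */
static uint16_t umul16(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * b);
}

/* Checks a few values worked out by hand. */
static void check_examples(void)
{
    assert(swap_bytes16(0x1234) == 0x3412);
    assert(umul16(300, 300) == 24464);   /* 90000 mod 65536 */
}
```

The widening cast in umul16 matters on hosts where int is 32 bits: without it, both operands would be promoted to signed int and a product such as 65535 * 65535 would overflow.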