Commit Graph

83 Commits

Author SHA1 Message Date
Stephen Heumann 9181b0bd73 fclose: Free the file buffer earlier.
This moves the free() call for the file buffer before the malloc() that occurs when closing a temp file, which should at least slightly reduce the chances that the malloc() call fails.
2024-02-20 22:22:26 -06:00
Stephen Heumann 7384c82667 fclose: Check for malloc failure when closing temp files.
Previously, the code for closing a temporary file assumed that malloc would succeed. If it did not, the code would trash memory and (at least in my testing) crash the system. Now it checks for and handles malloc failures, although they will still lead to the temporary file not being deleted.

Here is a test program illustrating the problem:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
        FILE *f = tmpfile();
        if (!f)
                return 0;

        void *p;
        do {
                p = malloc(8*1024);
        } while (p);

        fclose(f);
}
2024-02-20 22:20:17 -06:00
Stephen Heumann 16c7952648 fclose: close stream even if there is an error flushing buffered data.
This can happen, e.g., if there is an IO error or if there is insufficient free disk space to flush the data. In this case, fclose should return -1 to report an error, but it should still effectively close the stream and deallocate the buffer for it. (This behavior is explicitly specified in the C99 and later standards.)

Previously, ORCA/C effectively left the stream open in these cases. As a result, the buffer was not deallocated. More importantly, this could cause the program to hang at exit, because the stream would never be removed from the list of open files.

Here is an example program that demonstrates the problem:

/*
 * Run this on a volume with less than 1MB of free space, e.g. a floppy.
 * The fclose return value should be -1 (EOF), indicating an error, but
 * the two RealFreeMem values should be close to each other (indicating
 * that the buffer was freed), and the program should not hang on exit.
 */

#include <stdio.h>
#include <stddef.h>
#include <memory.h>

#define BUFFER_SIZE 1000000

int main(void) {
        size_t i;
        int ret;

        printf("At start, RealFreeMem = %lu\n", RealFreeMem());

        FILE *f = fopen("testfile", "wb");
        if (!f)
                return 0;

        setvbuf(f, NULL, _IOFBF, BUFFER_SIZE);

        for (i = 0; i < BUFFER_SIZE; i++) {
                putc('x', f);
        }

        ret = fclose(f);
        printf("fclose return value = %d\n", ret);

        printf("At end, RealFreeMem = %lu (should be close to start value)\n",
                RealFreeMem());
}
2024-02-19 22:30:15 -06:00
Stephen Heumann bf3a4d7ceb Small optimizations in library code.
There should be no functional differences.
2024-02-18 17:35:21 -06:00
Stephen Heumann 3f70daed7d Remove floating-point code from ORCALib.
It is being moved to SysFloat.
2023-06-23 15:52:46 -05:00
Stephen Heumann a5504be621 scanf: skip remaining directives after encountering EOF.
Encountering EOF is an input failure, which terminates scanf processing. Thus, remaining directives (even %n) should not be processed.
2023-06-11 16:07:19 -05:00
Stephen Heumann 6bc1c3741c scanf: Do not test for EOF at the beginning of scanf processing.
If the format string is empty or contains only %n conversions, then nothing should be read from the stream, so no error should be indicated even if it is at EOF. If a directive does read from the stream and encounter EOF, that will be handled when the directive is processed.

This could cause scanf to pause waiting for input from the console in cases where it should not.
2023-06-11 16:05:12 -05:00
Stephen Heumann 614af65c68 printf: print inf/nan rather than INF/NAN when using f and a formats.
This works in conjunction with SysFloat commit e409ecd4717, and at least that version of SysFloat is now required.
2023-06-01 20:10:12 -05:00
Stephen Heumann afff478793 More small size optimizations for (f)printf. 2023-04-18 22:27:49 -05:00
Stephen Heumann 78a9e1d93b printf: Unify code for hex and octal formatting.
These are similar enough that they can use the same code with just a few conditionals, which saves space.

(This same code can also be used for binary when that is added.)
2023-04-17 21:52:26 -05:00
Stephen Heumann 67ae5f7b44 printf: optimize hex and octal printing code.
This also fixes a bug: printf("%#.0o\n", 0) should print "0", rather than nothing.
2023-04-17 19:36:45 -05:00
Stephen Heumann 80c0bbc32b Small size optimizations in printf code. 2023-04-16 22:14:15 -05:00
Stephen Heumann 34f78fb1f2 printf: add support for 'a'/'A' conversion specifiers (C99).
These print a floating-point number in a hexadecimal format, with several variations based on the conversion specification:

Upper or lower case letters (%A or %a)
Number of digits after decimal point (precision)
Use + sign for positive numbers? (+ flag)
Use leading space for positive numbers (space flag)
Include decimal point when there are no more digits? (# flag)
Pad with leading zeros after 0x? (0 flag)

If no precision is given, enough digits are printed to represent the value exactly. Otherwise, the value is correctly rounded based on the rounding mode.
2023-04-16 20:23:53 -05:00
Stephen Heumann 578e544174 Update comments about printf. 2023-04-16 15:41:00 -05:00
Stephen Heumann b7b4182cd2 printf: ignore '0' flag if '-' is also used.
This is what the standards require. Previously, the '0' flag would effectively override '-'.

Here is a program that demonstrates the problem:

#include <stdio.h>
int main(void) {
        printf("|%-020d|\n", 123);
        printf("|%0-20d|\n", 123);
        printf("|%0*d|\n", -20, 123);
}
2023-04-16 15:39:49 -05:00
Stephen Heumann fca8c1ef85 Save a few bytes in printf and scanf. 2023-04-03 13:26:37 -05:00
Stephen Heumann 48371dc669 tmpnam: allow slightly longer temp directory name
ORCA/C's tmpnam() implementation is designed to use prefix 3 if it is defined and the path is sufficiently short. I think it was intended to allow up to a 15-character disk name to be specified, but it used a GS/OS result buffer size of 16, which only leaves 12 characters for the path, including initial and terminal : characters. As such, only up to a 10-character disk name could be used. This patch increases the specified buffer size to 21, allowing for a 17-character path that can encompass a 15-character disk name.
2023-03-08 18:59:10 -06:00
Stephen Heumann b81b4e1109 Fix several bugs in fgets() and gets().
Bugs fixes:
*fgets() would write 2 bytes in the buffer if called with n=1 (should be 1).
*fgets() would write 2 bytes in the buffer if it encountered EOF before reading any characters, but the EOF flag had not previously been set. (It should not modify the buffer in this case.)
*fgets() and gets() would return NULL if EOF was encountered after reading one or more characters. (They should return the buffer pointer).
2022-10-15 19:01:16 -05:00
Stephen Heumann 12f8d74c99 Do not use separate segments for __-prefixed versions of functions.
The __-prefixed versions were introduced for use in <stdio.h> macros that have since been removed, so they are not really necessary any more. However, they may be used in old object files, so those symbols are still included for now.
2022-07-05 18:24:18 -05:00
Stephen Heumann 463d24a028 Avoid errors caused by fseek after ungetc on read-only files.
The error could occur because fseek calls fflush followed by ftell. fflush would reset the file position as if the characters in the putback buffer were removed, but ftell would still see them and try to adjust for them (in the case of a read-only file). This could result in trying to seek before the beginning of the file, producing an error.

Here is a program that was affected:

#include <stdio.h>
int main(void) {
        FILE *f = fopen("somefile","r");
        if (!f) return 0;
        fgetc(f);
        ungetc('X', f);
        fseek(f, 0, SEEK_CUR);
        if (ferror(f)) puts("error encountered");
}
2022-07-03 21:58:00 -05:00
Stephen Heumann 219e4352a0 fseek: do not clear read/write flags for read-only/write-only streams.
This maintains the invariant that these flags stay set to reflect the setting of the stream as read-only or write-only, allowing code elsewhere to behave appropriately based on that.
2022-07-03 20:27:19 -05:00
Stephen Heumann 89b501f259 fread: do not try to read if EOF flag is set.
This behavior is implied by the specification of fread in terms of fgetc.
2022-07-03 20:04:12 -05:00
Stephen Heumann ef63f26c4f Allow writing immediately after reading to EOF.
This should be allowed (for read/write files), but it was leading to errors, both because fputc would error out if the EOF flag was set and because the FILE record was not fully reset from its "reading" state. In particular, the _IOREAD flag and the current buffer position pointer need to be reset.

Here is a program that demonstrates the problem:

#include <stdio.h>
int main(void) {
        FILE *f = fopen("somefile","r+");
        if (!f) return 0;
        while (!feof(f)) fgetc(f); /* or: fgetc then fread */
        if (fputc('X', f) == EOF)
                puts("fputc error");
}
2022-07-03 18:54:24 -05:00
Stephen Heumann c877c74b92 In fflush, reset the mark to account for flushing the buffer even on read-only files.
This is needed to keep the file mark consistent with the state of the buffer. Otherwise, subsequent reads may return data from the wrong place. (Using fflush on a read-only stream is undefined behavior, but other things in stdio call it that way, so it must work consistently with what they expect.)

Here is an example that was affected by this:

#include <stdio.h>
#include <string.h>

char buf[100];

int main(void) {
        FILE *f = fopen("somefile","r");
        if (!f) return 0;

        setvbuf(f, 0L, _IOFBF, 14);

        fgets(buf, 10, f);
        printf("first read : %s\n", buf);
        ftell(f);

        memset(buf, 0, sizeof(buf));
        fgets(buf, 10, f);
        printf("second read: %s\n", buf);

        fseek(f, 0, SEEK_CUR);
        memset(buf, 0, sizeof(buf));
        fgets(buf, 10, f);
        printf("third read : %s\n", buf);
}
2022-07-02 23:34:01 -05:00
Stephen Heumann 2d6ca8a7b2 Do not set read/write error indicator if ftell cannot get the file mark.
This is not really an IO operation on the file, and accordingly the C standards do not describe the error indicator as being set here. In particular, you cannot get the mark for a character device, but that is expected behavior, not really an error condition.

errno is still set, indicating that the ftell operation failed.
2022-07-02 22:09:56 -05:00
Stephen Heumann 3eb8a9cb55 Fix bug where ftell would inappropriately change the file mark.
ftell would set the mark as if the buffer and putback were being discarded, but not actually discard them. This resulted in characters being reread once the end of the buffer was reached.

Here is an example that illustrates the problem:

#include <stdio.h>
#include <string.h>
char buf[100];
int main(void) {
        FILE *f = fopen("somefile","r"); // a file with some text
        if (!f) return 0;
        setvbuf(f, 0L, _IOFBF, 14); // to see problem sooner
        fgets(buf, 10, f);
        printf("first read : %s\n", buf);
        ftell(f);
        memset(buf, 0, sizeof(buf));
        fgets(buf, 10, f);
        printf("second read: %s\n", buf);]
}
2022-07-02 22:05:40 -05:00
Stephen Heumann 36808404ca Fix fread bug causing it to discard buffered data.
If you read from a file using fgetc (or another function that calls it internally), and then later read from it using fread, any data left in the buffer from fgetc would be skipped over. The pattern causing this to happen was as follows:

fread called ~SetFilePointer, which (if there was buffered data) called fseek(f, 0, SEEK_CUR). fseek would call fflush, which calls ~InitBuffer. That zeros out the buffer count. fseek then calls ftell, which gets the current mark from GS/OS and then subtracts the buffer count.  But the count was set to 0 in ~InitBuffer, so ftell reflects the position after any previously buffered data. fseek sets the mark to the position returned by ftell, i.e. after any data that was previously read and buffered, so that data would get skipped over. (Before commits 0047b755c9 and c95bfc19fb the behavior would be somewhat different due to the issue with ~InitBuffer that they addressed, but you could still get similar symptoms.)

The fix is simply to use the buffered data (if any), rather than discarding it.

Here is a test program illustrating the problem:

#include <stdio.h>
char buf[BUFSIZ+1];
#define RECSIZE 2
int main(void) {
        FILE *f = fopen("somefile","r"); // a file with some data
        if (!f) return 0;
        fgetc(f);
        size_t count = fread(buf, RECSIZE, BUFSIZ/RECSIZE, f);
        printf("read %zu records: %s\n", count, buf);
}
2022-07-02 18:24:57 -05:00
Stephen Heumann 84f471474a Use newer, more efficient ph2/ph4 macros throughout ORCALib.
These push DP values with pei, rather than lda+pha as in the old versions of the macros.
2022-06-30 19:01:47 -05:00
Stephen Heumann c95bfc19fb Fix a logic error.
There was a problem with the fix in commit 0047b755c9660: an instruction should have had an immediate operand, but did not because it was missing the #. This might cause the code to behave incorrectly, depending on memory contents.
2022-06-28 22:25:10 -05:00
Stephen Heumann a2b3d4541a fread fixes.
This includes changes to fread to fix two problems. The first is that fread would ignore and discard characters put back with ungetc(). The second is that it would generally return the wrong value if reading from stdin with an element size other than 1 (it would return the count of bytes, not elements).

This fixes #9 (the first problem mentioned above).
2022-06-27 18:24:52 -05:00
Stephen Heumann 1f88a38e2e Clear the EOF flag on successful calls to ungetc().
This is specified by the C standards.
2022-06-27 17:59:21 -05:00
Stephen Heumann 38666ee111 Restore changes to allow ungetc of byte values 0x80 through 0xFF.
These are the stdio.asm changes that were present in the beta source code on Opus ][, but had been reverted in commit e3c0c962d4. As it turns out, the changes to stdio.asm were OK--the issue was simply that the definitions of stdin/stdout/stderr and the associated initialization code in vars.asm had not been updated to account for the new version of the FILE structure. That has now been done, allowing the changes to work properly.

This fixes #7.
2022-06-27 17:58:23 -05:00
Stephen Heumann 7c2fb70c4a freopen improvements
This adds a check for the filename argument being null, in which case it bails out and returns NULL. Previously, it would just dereference the null pointer and treat the contents of memory location 0 as a filename, with unpredictable results. A null filename indicates freopen should try to reopen the same file with a different mode, but the degree of support for that is implementation-defined. We still don't really support it, but at least this will report the failure and avoid unpredictable behavior.

It also removes some unused code at the end, but still forces fputc and fgetc to be linked in. These are needed because there are weak references to them in putchar and getchar, which may need to be used if stdin/stdout have been reopened.
2022-06-26 21:20:23 -05:00
Stephen Heumann 8cfb73a474 Force the file mark to EOF whenever writing to a stream in append mode.
This is what the standards require. Note that the mark is only set to EOF when actually writing to the underlying file, not merely buffering data to write later. This is consistent with the usual POSIX implementation using O_APPEND.
2022-06-26 18:59:57 -05:00
Stephen Heumann 0047b755c9 Fix bug with fseek(..., SEEK_CUR) on read-only streams.
This would seek to the wrong location, and in some cases try to seek before the beginning of the file, leading to an error condition.

The problem stemmed from fseek calling fflush, which sets up the stream's buffer state in a way appropriate for writing but not for reading. It then calls ftell, which (for a read-only stream) would misinterpret this state and miscalculate the current position.

The fix is to make fflush work correctly on a read-only stream, setting up the state for reading. (User code calling fflush on a read-only stream would be undefined behavior, but since fseek does it we need to make it work.)

This fixes #6.
2022-06-26 14:35:56 -05:00
Stephen Heumann 3581d20a7c Standardize indentation using spaces.
Most files already used spaces, but three used tabs for indentation. These have been converted to use spaces. This allows the files to be displayed with proper formatting in modern editors and on GitHub. It also removes any dependency on SysTabs settings when assembling them.

The spacing in fpextra.asm was also modified to use standard column positions.

There are no non-whitespace changes in this commit.
2022-06-25 18:27:20 -05:00
Stephen Heumann ab2f17c249 Clear the IO error indicator as part of rewind().
The C standards require it to do this, in addition to calling fseek.

Here is a test that can show the issue (in a realistic program, the indicator would be set due to an actual IO error, but for testing purposes this just sets it explicitly):

#include <stdio.h>
int main(void) {
        FILE *f = tmpfile();
        if (!f) return 0;
        f->_flag |= _IOERR;
        if (!ferror(f)) puts("bad ferror");
        rewind(f);
        if (ferror(f)) puts("rewind does not reset ferror");
        fclose(f);
}
2022-06-24 18:36:25 -05:00
Stephen Heumann ae504c6e4f Add support for EILSEQ errno value.
EILSEQ is required by C95 and later.
2021-10-02 14:34:35 -05:00
Stephen Heumann 9af245933c Change long long printing routine to a new technique not using division.
The core loop uses the 65816's decimal mode to create a BCD representation of the number, then we convert that to ASCII to print it. The idea is based on some code from Kent Dickey, with adjustments to fit this codebase. This technique should be faster, and also should typically result in smaller code, since the long long division routine will not need to be linked into every program that uses printf.
2021-02-21 18:33:04 -06:00
Stephen Heumann 6bf27c6743 Use some improved macros for 64-bit operations in the libraries.
The new m16.int64 file contains a new "negate8" macro (for 64-bit negation), as well as an improved version of "ph8". These have been added to the individual *.macros files, but if those are regenerated, m16.int64 should be used as an input to macgen ahead of m16.ORCA (which contains the original version of ph8).
2021-02-12 20:24:37 -06:00
Stephen Heumann c185face85 Add long long support for printf d/i/u conversions.
At this point, the long long support for the printf and scanf families of functions is complete.
2021-02-10 17:51:23 -06:00
Stephen Heumann 4f9b41938a Implement vscanf/vfscanf/vsscanf (from C99).
These are equivalent to the existing scanf calls, except that they use variable arguments from a va_list.
2021-02-09 22:09:00 -06:00
Stephen Heumann cfc3fd3468 scanf: if %c gets some characters but < field width, it's a matching failure.
EOF should not be returned in this case.

I think it shouldn't actually write anything to the destination in this case, but it currently does. That problem remains unfixed for the moment; addressing it would require us to have our own internal buffer.
2021-02-09 16:03:19 -06:00
Stephen Heumann 6626aad7f0 scanf: report EOF for 's' conversions, is appropriate.
They should be treated as input failures if they got EOF before finding a non-whitespace character.
2021-02-09 15:44:39 -06:00
Stephen Heumann bc21593507 scanf: 's' and '[' conversions should "put back" EOF if it is encountered.
This ensures that a subsequent %n conversion reports the right number of characters actually read (not including EOF).
2021-02-09 15:31:19 -06:00
Stephen Heumann 375d664ae1 Recognize 'L' as a length modifier for scanf.
Conversions using it will not actually work right yet (unless assignment is suppressed), because the necessary changes have not been made in SysFloat, but the infrastructure is now in place to allow for those changes.
2021-02-09 14:32:37 -06:00
Stephen Heumann 2f68708b47 scanf: Report EOF for o/u/x conversions, if appropriate.
This was not being reported unless assignment was suppressed, because the "sta ~eofFound" instruction was too late.

(Found based on test cases by Rich Felker.)
2021-02-09 13:14:24 -06:00
Stephen Heumann 66e1835175 scanf could trash a word of the caller's stack if it did not perform all possible assignments.
Code in ~scanf would call ~RemoveWord to remove the extra parameters, but ~RemoveWord is designed to be called from one of the ~Scan_... subroutines, which is (in effect) called by JSR and thus has an extra word on the stack for its return address. This stack misalignment caused ~RemoveWord to overwrite a word of the caller's stack when called from the code to remove extra parameters.

This could cause a crash in the following program:

#include <stdio.h>
void f(void) {
    int a,b;
    sscanf("Z", "%i%i", &a, &b);
}
int main(void) {
    f();
}
2021-02-09 13:09:04 -06:00
Stephen Heumann 61bfc70b9b scanf: for 'i' or 'd' specifiers, count a leading sign toward the field width. 2021-02-09 00:40:24 -06:00
Stephen Heumann de9f830c0c scanf: when an ordinary character cannot match due to EOF, flag an input failure.
This may cause scanf to return EOF, if no conversions have been done.

This seems to be the intent of the C standards, although the wording could be clearer. It is also consistent with all the other implementations I tested.
2021-02-08 00:19:23 -06:00