Add documentation of <uchar.h> functions.

2025-02-11 07:30:24 +00:00 · 2021-10-02 22:40:31 -05:00 · 020f5ca5b2
commit 020f5ca5b2
parent cc8e003860
1 changed file with 20 additions and 0 deletions
--- a/cc.notes
+++ b/cc.notes
@@ -575,6 +575,8 @@ ORCA/C now includes several new headers specified by recent C standards.

 8.  (C89 and C99) The <locale.h> header provides functions and definitions related to locale support.

+9.  (C11) The <uchar.h> header defines the types char16_t and char32_t suitable for holding UTF-16 and UTF-32 code units, and provides functions for handling Unicode characters.
+
 Library Updates
 ---------------

@@ -793,6 +795,24 @@ If s is not a null pointer, the mblen function determines the length of the mult

 If s is a null pointer, mblen returns a nonzero value if multibyte characters have state-dependent encodings, or 0 if they do not.  On ORCA/C, it returns 0.

+16. (C11) The mbrtoc16, mbrtoc32, c16rtomb, and c32rtomb functions have been added:
+
+#include <uchar.h>
+size_t mbrtoc16(char16_t * restrict pc16, const char * restrict s, size_t n,
+   mbstate_t * restrict ps);
+size_t mbrtoc32(char32_t * restrict pc32, const char * restrict s, size_t n,
+   mbstate_t * restrict ps);
+
+These functions convert a multibyte character to Unicode, using the UTF-16 or UTF-32 encoding.  They read the multibyte character pointed to by s, examining at most n bytes.  If a valid multibyte character is found, it is converted to UTF-16 and written to *pc16 (in mbrtoc16), or converted to UTF-32 and written to *pc32 (in mbrtoc32); the return value is 0 if the character is the null character, or the number of bytes in the multibyte character otherwise.  The C standard also defines the special return values (size_t)(-1) (encoding error), (size_t)(-2) (incomplete multibyte character), and (size_t)(-3) (a further code unit from an earlier call was stored), but those cases do not occur in ORCA/C provided that n>0.  The ps argument (if not null) points to an object that can hold any conversion state, but in ORCA/C it is not used.
+
+#include <uchar.h>
+size_t c16rtomb(char * restrict s, char16_t c16, mbstate_t * restrict ps);
+size_t c32rtomb(char * restrict s, char32_t c32, mbstate_t * restrict ps);
+
+These functions convert from a UTF-16-encoded or UTF-32-encoded Unicode character (c16 or c32) to a multibyte character, which is written to *s.  They return the number of bytes written to s, or return (size_t)(-1) and set errno to EILSEQ if a conversion cannot be performed (e.g. because there is no corresponding character in the multibyte character set).  The ps argument (if not null) points to an object that can hold any conversion state, but in ORCA/C it is currently not used.
+
+In ORCA/C, multibyte characters are always one byte and are considered to be encoded in the character set used by the IIGS desktop environment (known as Mac OS Roman), so these functions convert between that character set and Unicode.
+
 -- Compiler changes introduced in C 2.1.0 -----------------------------------

 The Default .h File
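
As an illustration of the <uchar.h> functions documented in this change, here is a minimal sketch. It is not taken from ORCA/C or from cc.notes. The helper names decode_first_char and encode_char are hypothetical, and the code assumes a C11 implementation that provides <uchar.h>. Which non-ASCII characters convert successfully depends on the multibyte character set in use (Mac OS Roman in ORCA/C, locale-dependent on other systems), so the sketch makes no assumption about them.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper: decode the first multibyte character of s,
   examining at most n bytes, and store its UTF-32 value in *out.
   Returns the number of bytes consumed, 0 for the null character,
   or (size_t)(-1) if mbrtoc32 reports any of its special values. */
size_t decode_first_char(const char *s, size_t n, char32_t *out)
{
    mbstate_t state;
    size_t r;

    memset(&state, 0, sizeof state);   /* start in the initial conversion state */
    r = mbrtoc32(out, s, n, &state);

    /* (size_t)(-1): encoding error; (size_t)(-2): incomplete character;
       (size_t)(-3): code unit left over from an earlier call. */
    if (r == (size_t)(-1) || r == (size_t)(-2) || r == (size_t)(-3))
        return (size_t)(-1);
    return r;
}

/* Hypothetical helper: encode the UTF-32 code point c as a multibyte
   character in buf, which must have room for MB_LEN_MAX bytes
   (MB_LEN_MAX is defined in <limits.h>).
   Returns the number of bytes written, or (size_t)(-1) with errno set
   to EILSEQ if c has no representation in the multibyte character set. */
size_t encode_char(char32_t c, char *buf)
{
    mbstate_t state;

    memset(&state, 0, sizeof state);
    return c32rtomb(buf, c, &state);
}
```

For example, decode_first_char("A", 1, &c) should return 1 and store 0x41 in c, because 'A' has the same code in ASCII, Mac OS Roman, and Unicode; encode_char(0x41, buf) should then return 1 and leave 'A' in buf[0]. Passing a fresh mbstate_t object on every call is a portability choice: the notes state that ORCA/C ignores ps, but other implementations may use it.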