syncfiles/docs/tech/apis.md

10 KiB
Raw Blame History

layout title nav_order permalink parent
page File APIs 4 /tech/apis Technical Guide

File APIs

Mac OS filesystem APIs evolved as the underlying filesystem semantics changed.

Pascal Strings

Prior to Mac OS X, Macintosh APIs used Pascal strings to represent strings. Pascal strings are passed by pointer. The first byte pointed to stores the string length, and the string contents follows. Pascal strings are not null-terminated and not compatible with C-style strings. Note that since the string length is stored in a single byte, these strings cannot be longer than 255 bytes.

Compilers for Mac OS support Pascal strings by putting the \p sequence at the beginning of a string. For example,

const unsigned char kFilename[] = "\pMy File";

This is equivalent to:

const unsigned char kFilename[8] = {
  7, 'M', 'y', ' ', 'F', 'i', 'l', 'e'
};

Pascal strings are encoded using one of the old Macintosh character encodings, such as Mac OS Roman. In some cases, the encoding is assumed to be the systems encoding, whatever that is. In other cases, the encoding is explicitly specified using a ScriptCode (although this value is somewhat ambiguous). In other cases still, the actual encoding is ignored and the string is treated as if it were encoded using the Mac OS Roman encoding.

Early Mac OS

The filesystem on the very first Macintosh, later called the Mac 128K, did not support folders or directories. Each file was identified by volume ID and name. For example, OpenDF opens the data fork of a file (presumably, DF stands for “data fork”—there is a corresponding OpenRF for the resource fork), and Create creates a new file.

OSErr OpenDF(
  const unsigned char *  fileName,
  short                  vRefNum,
  short *                refNum);

OSErr Create(
  const unsigned char *  fileName,
  short                  vRefNum,
  OSType                 creator,
  OSType                 fileType);

Filenames have a maximum length of 63 characters and are case insensitive. Systems from this era did not support multiple character encodings, so the encoding did not need to be specified. Note that the 63-character limit is an API limit, and common filesystems have a lower, 31-character limit.

You should not be using this API unless you are targeting extremely old Macintosh systems, like the Mac 128K. This API became obsolete with the introduction of 128K ROMs with the Mac Plus in 1986.

This API is not part of Carbon and cannot be used on Mac OS X.

Swapping Floppy Disks

During this era, it was common to swap floppy disks while a program was running. You can run a program from one disk and save files on another disk. When a program tries to access a file thats on a different disk, the operating system ejects the current disk and prompts the user to insert the correct disk. This happens automatically; programs do not need to include any code to make this possible.

Compatibility with HFS

Old applications written to use this API continue to work after files after the introduction of the hierarchical filesystem. The way this happens is through something called working directories.

A working directory is the combination of a volume ID and a directory ID, and it can be used in place of a volume ID in the filesystem API. The intent is that old code which only uses volume IDs can be used to save files in different locations on the filesystem. For example, if an old application creates a “save file” dialog box, instead of a volume ID, the dialog box returns a working directory pointing to the directory where the user chose to save the file.

Hierarchical API

Alongside Apples first hard disk for the Macintosh, the operating system introduced a new filesystem (HFS) which supported directories, and this required a new API. Instead of OpenDF, programs now call HOpenDF to the data fork of a file, and call HCreate instead of Create. Presumably, “H” stands for “hierarchical”.

OSErr HOpenDF(
  short                  vRefNum,
  long                   dirID,
  const unsigned char *  fileName,
  SInt8                  permission,
  short *                refNum);

OSErr HCreate(
  short                  vRefNum,
  long                   dirID,
  const unsigned char *  fileName,
  OSType                 creator,
  OSType                 fileType);

In this API, files are identified by volume ID, the directory ID within that volume, and the filename within that directory. The encoding is not specified, and presumably, files will be created using the systems default character encoding.

FSSpec API

Mac System 7 introduces the FSSPec API. It does not change semantics, but provides a simple data structure which is used to store the volume ID, directory ID, and filename. This structure is called FSSpec.

struct FSSpec {
  short          vRefNum;
  long           parID;
  unsigned char  name[64];
};

Functions which use an FSSpec are named with the FSp prefix. These functions are preferred for over the previous versions for their simplicity. For example, FSpOpenDF, which opens a files data fork, and FSpCreate, which creates a new file:

OSErr FSpOpenDF(
  const FSSpec *  spec,
  SInt8           permission,
  short *         refNum);

OSErr FSpCreate(
  const FSSpec *  spec,
  OSType          creator,
  OSType          fileType,
  ScriptCode      scriptTag);

Note that the character encoding is specified when creating a file, using the scriptTag parameter (although technically, this does not completely specify an encoding). The character encoding is not specified when opening a file, instead, the encoding is ignored and treated as if it were the Mac OS Roman encoding.

This API preserves the encoding used for filenames, but you only need the correct bytestring to refer to existing files.

Starting in Mac OS X 10.4, this API and other APIs before it are marked as deprecated.

FSRef API

Mac OS 9 introduces a new opaque alternative to FSSpec called FSRef. You can use an FSRef to refer to an existing file, but there is no way to get any information from an FSRef without invoking the filesystem API. The FSRef is not a drop-in replacement for FSSpec, beceause a FSRef must refer to an existing file, and therefore, cannot be used to create a new file.

struct FSRef {
  UInt8  hidden[80];
};

The FSRef structure is private to an application and cannot be assumed to be valid if the file is moved/renamed, if the volume is unmounted and remounted, or if the structure is passed to another process. We can speculate that the FSRef structure contains information about a file like its filesystem inode, which can be used to look up the file name.

You can convert an FSSpec to an FSRef, although this will fail if the file does not exist:

OSErr FSpMakeFSRef(
  const FSSpec *  source,
  FSRef *         newRef);

The FSRef API has some other differences. For example, when you open a file, you specify the name of the fork you want to open. Previous APIs used different functions for opening the data fork and the resource fork. File names are now specified as Unicode strings encoded with UTF-16.

OSErr FSOpenFork(
  const FSRef *    ref,
  UniCharCount     forkNameLength,
  const UniChar *  forkName,
  SInt8            permissions,
  SInt16 *         forkRefNum);

OSErr FSCreateFileUnicode(
  const FSRef *          parentRef,
  UniCharCount           nameLength,
  const UniChar *        name,
  FSCatalogInfoBitmap    whichInfo,
  const FSCatalogInfo *  catalogInfo,
  FSRef *                newRef,
  FSSpec *               newSpec);

This API is a part of Carbon. Carbon was never ported to 64-bit architectures, and was deprecated in Mac OS X 10.8. The last version that supports this API is macOS 10.14, which is the last version of macOS that supports 32-bit programs.

Unix API

Mac OS X brought us the Unix APIs:

int open(
  const char *  pathname,
  int           flags,
  ...);

int openat(
  int           fd,
  const char *  pathname,
  int           flags,
  ...);

The openat call is available from Mac OS 10.10 onwards.

Strings are now null-terminated C strings, which are interpreted as UTF-8. Mac OS filesystems do not support arbitrary bytestrings as filenames. If you try to create a file with open with a filename that is not supported by the filesystem, the system call will fail and set errno to EILSEQ (92). This is not documented in the man page for open.

Forks on Unix

The resource fork can be accessed as if it were a separate file. To access the resource fork, append /rsrc or /..namedfork/rsrc to the file path. This allows you to view resource fork data through ordinary Unix APIs or with Unix command-line utilities like hexdump and ls.

Linux also allows you to access the resource fork by appending /rsrc to the file path, for filesystems that support resource forks.

Extended Attributes

Starting in version 10.4, Mac OS X provides an interface for accessing extended attributes on a file.

ssize_t getxattr(
  const char *  path,
  const char *  name,
  void *        value,
  size_t        size,
  u_int32_t     position,
  int           options);

The resource fork is presented as an extended attribute with the name com.apple.ResourceFork. Since resource forks can be as much as 16 MiB in size, the getxattr function provides a way to read portions of the resource fork without having to read the entire fork.

Finder info is contained in an attribute named com.apple.FinderInfo.

Aliases and Bookmarks

Mac OS also provides facilities to store a reference to a file. These references are designed to be durable, and work even if the file is moved and renamed. These references work by containing various pieces of metadata about the file. If the file is moved, the metadata can be used to find it again.

On older Mac OS systems, these records are called aliases and you can create them using the alias manager APIs, which are available starting in System 7.

In Cocoa, these records are called bookmarks.

The main use of aliases and bookmarks is to store references to files in the “recently used files” menu option in applications.