Summary of POSIX filesystem (for SAFE FS discussions)

happybeing · July 31, 2020, 5:14pm

Summary of POSIX filesystem

This post summarises the POSIX filesystem features relevant to discussion of the new SAFE FS, which is being designed to support replication using CRDTs (see FileTree CRDT for SAFE Network).

inodes, Directories and Files

inodes are structures which hold metadata about the objects in a filesystem. In Unix/Linux operating systems a mounted filesystem is typically implemented as a table of inodes, which together hold the structure and metadata for all files and directories. Some filesystems fix the number of inodes when the filesystem is created, and in every case the count is bounded by the inode number space (e.g. 2^64 for 64 bit inode numbers). Each inode corresponds to an object or entry in the filesystem (e.g. a file, directory or symlink) and is referred to in the POSIX filesystem APIs by an inode number (an unsigned integer, typically 64 bits wide on modern Linux). Typical metadata stored in an inode includes access, modification and status-change times (some filesystems also record a creation time), operating system owner, group and mode (access controls).
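As a rough illustration of the metadata described above, here is a minimal Python sketch which reads a few inode fields using the standard os and stat modules. The helper names file_type and describe_inode are my own, chosen just for this example.

```python
import os
import stat
import tempfile


def file_type(mode):
    """Map an st_mode value to a short type name (only three types handled)."""
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISREG(mode):
        return "file"
    return "other"


def describe_inode(path):
    """Return some of the inode metadata for path, without following symlinks."""
    st = os.lstat(path)
    return {
        "ino": st.st_ino,                    # inode number
        "type": file_type(st.st_mode),       # object type is stored in the mode
        "mode": stat.filemode(st.st_mode),   # e.g. '-rw-r--r--'
        "nlink": st.st_nlink,                # number of directory entries (hard links)
        "uid": st.st_uid,                    # owner
        "gid": st.st_gid,                    # group
        "mtime": st.st_mtime,                # last content modification
        "ctime": st.st_ctime,                # last inode (status) change on Unix
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "example.txt")
        with open(path, "w") as f:
            f.write("hello")
        print(describe_inode(path))
```

The inode number printed will differ between filesystems and runs. Note that the name of the file is not part of what is returned, because it lives in the directory rather than the inode.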

directories are inodes which hold a list of directory entries, each mapping a name to an inode number. A file or directory is therefore independent of its name and location: either can be changed by modifying an entry in a directory, without touching the inode object itself.
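To make the name-to-inode mapping concrete, the following Python sketch lists a directory as a mapping of names to inode numbers and checks that renaming a file leaves its inode number unchanged. The functions list_entries and rename_keeps_inode, and the file names used, are made up for this example.

```python
import os
import tempfile


def list_entries(directory):
    """Return {name: inode number} for every entry in directory."""
    return {
        name: os.lstat(os.path.join(directory, name)).st_ino
        for name in os.listdir(directory)
    }


def rename_keeps_inode(directory):
    """Create a file, rename it, and report the inode number before and after."""
    old = os.path.join(directory, "old-name.txt")
    new = os.path.join(directory, "new-name.txt")
    with open(old, "w") as f:
        f.write("data")
    before = os.stat(old).st_ino
    os.rename(old, new)  # only the directory entry changes
    after = os.stat(new).st_ino
    return before, after, list_entries(directory)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        before, after, entries = rename_keeps_inode(d)
        print(before == after, entries)
```

This only covers a rename within one filesystem. Moving a file to a different mounted filesystem is normally a copy followed by a delete, so the inode number will change in that case.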

files are inodes which have content, typically recorded as a list of block locations on a storage device (disk or memory). These locations can change when a file is modified, a device is defragmented etc., causing the inode’s list of block locations to be updated.

The filesystem always has at least one directory inode, known as ‘root’, which is the ultimate parent directory and the base path for all other directories and files in the filesystem. The filesystem ‘root’ may appear at different paths on a computer (e.g. at ‘/’ or ‘/tmp’ etc.); this path is the mount point.

Implementation notes:

The root inode number appears to be chosen by each implementation rather than by POSIX: FUSE uses 1 (FUSE_ROOT_ID), while ext2/3/4 use 2. I have no POSIX reference for a required value yet.
The low-level FUSE API appears to use a zero inode value to signify no inode (like null in many languages, and None in Rust). This implies that zero is never used for a valid inode number. Again I don’t have a POSIX reference yet, but see fuse_entry_param::ino.

Symlinks

A symlink (short for ‘symbolic link’) is an inode of type symlink which holds a path that acts like a pointer to another location. This can be used to make a file or directory appear at more than one filesystem path, and to many applications it will look as if the file or directory pointed to by the symlink is at the location of the symlink.

Changes to the destination will therefore be reflected when anything accesses it via the symlink.

If the destination is renamed, moved or deleted, the symlink will no longer point to anything (it is left ‘dangling’) and the destination file or directory will no longer be accessible via the symlink.

This is different to hard-links described next.
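The symlink behaviour described above can be sketched in Python as follows. The function symlink_demo and the file names are hypothetical, and the example assumes a platform where os.symlink is permitted (e.g. Linux).

```python
import os
import tempfile


def symlink_demo(directory):
    """Create a symlink, read through it, then rename the destination."""
    target = os.path.join(directory, "target.txt")
    link = os.path.join(directory, "link.txt")
    with open(target, "w") as f:
        f.write("v1")
    os.symlink(target, link)  # the symlink stores the path, not an inode number

    with open(link) as f:     # opening the link follows it to the destination
        read_via_link = f.read()

    os.rename(target, os.path.join(directory, "moved.txt"))
    return {
        "read_via_link": read_via_link,
        "stored_path": os.readlink(link),          # still the old path
        "link_entry_exists": os.path.lexists(link),  # the symlink itself remains
        "link_resolves": os.path.exists(link),     # but it no longer leads anywhere
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(symlink_demo(d))
```

After the rename the symlink is still present in the directory, but following it fails because the path it stores no longer names anything.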

See: Symbolic link (Wikipedia)

Hard-links

A hard link is a named entry in a directory which holds an inode number. So in practice all files visible in the system have at least one hard-link, from the directory in which they appear. But in POSIX systems, more than one directory entry can refer to the same inode number, in which case the same inode appears in more than one location.

Each inode keeps track of the number of such links as part of its metadata. However, an inode is not necessarily deleted as soon as this count reaches zero, because the operating system also tracks which processes have the file open (separately from the link count). This means that an inode which has been removed completely from the directory tree is kept until every active process has closed its access to the file, or been terminated, and only then is it deleted from the filesystem.
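Here is a small Python sketch of the link counting and deferred deletion just described, assuming a Unix-like system which supports os.link. The function hard_link_demo and its file names are invented for the example.

```python
import os
import tempfile


def hard_link_demo(directory):
    """Show two names for one inode, then read a file after its last name is removed."""
    a = os.path.join(directory, "a.txt")
    b = os.path.join(directory, "b.txt")
    with open(a, "w") as f:
        f.write("shared")
    os.link(a, b)  # second directory entry for the same inode

    same_inode = os.stat(a).st_ino == os.stat(b).st_ino
    nlink_with_two_names = os.stat(a).st_nlink

    os.unlink(a)   # removes one name; the inode lives on via b
    nlink_with_one_name = os.stat(b).st_nlink

    f = open(b)
    os.unlink(b)   # no names left, but the file is still open
    remaining_names = os.listdir(directory)
    # Typically 0 on Linux at this point; not guaranteed on every filesystem.
    nlink_while_open = os.fstat(f.fileno()).st_nlink
    content_after_unlink = f.read()
    f.close()      # only now can the inode and its blocks be released

    return {
        "same_inode": same_inode,
        "nlink_with_two_names": nlink_with_two_names,
        "nlink_with_one_name": nlink_with_one_name,
        "remaining_names": remaining_names,
        "nlink_while_open": nlink_while_open,
        "content_after_unlink": content_after_unlink,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(hard_link_demo(d))
```

The content is still readable after the directory is empty, which is the behaviour a replicated filesystem would need to reproduce (or deliberately drop) when a file is deleted on one replica while open on another.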

Do we need hard-links?

Hard links mostly go unseen by applications, but are very useful in multitasking operating systems. For example, because deleting a name only unlinks it from the inode, a program, library or script can continue working even if it is deleted from the filesystem by another process - rather than causing the running program to behave unpredictably. This can be useful when developing and updating programs or scripts, and for updating system libraries without the need to shut the system down.

Hard-links can also be useful for users to create ‘views’ of data held in the filesystem and are more robust than symlinks, which are pointers to a location rather than an object. No directory entry that is a hard link is preferred over any other, so any of them can be removed or moved without affecting the others, allowing multiple directory hierarchies to co-exist. Uses include sharing access to files in different contexts, and making snapshots of large directory trees without duplicating files (cf. rsnapshot). Note that most systems only allow hard links to files, not to directories.

Summary: hard-links are very useful in some circumstances, but easily overlooked so we should be cautious about omitting them in the long term.

File Locking

File locking allows multiple programs or processes to access the same file or directory in a co-operative way to avoid unwanted side effects from data being changed unexpectedly by another process.

Several locking mechanisms exist, even on a single operating system. Most are advisory, meaning they only work between processes which check for the lock, and I think none of them are able to prevent an unco-operative process from interfering with a locked file.
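As one example of a co-operative mechanism, this Python sketch uses flock via the standard fcntl module (Unix only). It is just one of the mechanisms available, and the function advisory_lock_demo is a name I made up. It shows that a second opener which asks for the lock is refused, while one which never asks can write anyway.

```python
import fcntl
import os
import tempfile


def advisory_lock_demo(path):
    """Hold an exclusive flock, then try a polite and an unco-operative access."""
    with open(path, "w") as holder:
        fcntl.flock(holder, fcntl.LOCK_EX)  # take the exclusive lock

        # A co-operative opener asks for the lock and is refused.
        with open(path, "a") as polite:
            try:
                fcntl.flock(polite, fcntl.LOCK_EX | fcntl.LOCK_NB)
                second_lock_granted = True
            except BlockingIOError:
                second_lock_granted = False

        # An unco-operative opener never asks, so nothing stops its write.
        with open(path, "a") as rude:
            rude.write("written without asking")

        fcntl.flock(holder, fcntl.LOCK_UN)

    with open(path) as f:
        return second_lock_granted, f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(advisory_lock_demo(os.path.join(d, "locked.txt")))
```

Both openers here are in the same process for brevity. flock locks belong to the open file description, so separate open() calls conflict in the same way that separate processes would.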

[ ] To do: File locking needs further investigation.

Ref: File locking (Wikipedia)

dirvine · July 31, 2020, 5:58pm

Hey @happybeing have you seen https://github.com/ubnt-intrepid/polyfuse (async fuse)

happybeing · July 31, 2020, 9:30pm

Thanks for the tip David, looks interesting.

danda · August 1, 2020, 12:00am

Thanks for the writeup. Helpful to bring everyone to common understanding.

small nit: I think it should be “is an inode of type symlink which holds a path…”, because an inode can be one of: directory, file, symlink.

happybeing · August 1, 2020, 9:01am

You are correct, and nit picks are welcome because precision is understanding. Now fixed, thanks. Feel free to edit the OP to correct or add stuff.

drehb · August 1, 2020, 9:21pm

Thinking you meant to link to FileTree CRDT for SAFE Network here