How do I expose custom files similar to /procfs on Linux?

I have a writer process which outputs its status at regular intervals as a readable chunk of wchar_t. I need to ensure the following properties:

  1. When there’s an update, the readers shouldn’t read partial/corrupted data
  2. The file should be volatile in memory so that when the writer quits, the file is gone
  3. The file content size is variable
  4. Multiple readers may read the file in parallel; they need not all see the same version, as long as each reader sees complete (non-partial) content
  5. If using truncate and then write, clients should only read the full file and not observe such partial operations

How could I implement such a /procfs-like file outside the /procfs filesystem?

I was thinking of using the classic C file APIs on Linux and creating something under /dev/shm by default, but I find it hard to implement points 1 and 5 in particular. How could I expose such a file?

Answer

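The typical solution is to create a new file in the same directory, then rename it over the old one. (rename() replaces an existing target atomically; a plain hard link via link() would fail if the target already exists.)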

This way, processes see either an old one or a new one, never a mix; and it only depends on the moment when they open the file.

The Linux kernel takes care of the caching, so if the file is accessed often, it will be in RAM (page cache). The writer must, however, remember to delete the file when it exits.
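
As a minimal sketch of this write-then-rename approach (the helper name publish_status, the fixed-size path buffer, and the file mode are assumptions made for illustration, not part of the answer's own code):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: atomically replace 'path' with 'len' bytes of 'data'.
   Returns 0 on success, or an errno code on failure. */
int publish_status(const char *path, const char *data, size_t len)
{
    char    tmp[4096];
    size_t  done = 0;
    int     fd, n, err;

    assert(path != NULL && (data != NULL || len == 0));

    /* Temporary file next to the target, so both are on the same mount. */
    n = snprintf(tmp, sizeof tmp, "%s.new", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return ENAMETOOLONG;

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0644);
    if (fd == -1)
        return errno;

    /* Write everything; retry short writes and EINTR. */
    while (done < len) {
        ssize_t w = write(fd, data + done, len - done);
        if (w > 0) {
            done += (size_t)w;
        } else if (w == -1 && errno == EINTR) {
            continue;
        } else {
            err = (w == -1) ? errno : EIO;
            close(fd);
            unlink(tmp);
            return err;
        }
    }

    if (close(fd) == -1) {
        err = errno;
        unlink(tmp);
        return err;
    }

    /* Readers that open 'path' see either the old file or the new one. */
    if (rename(tmp, path) == -1) {
        err = errno;
        unlink(tmp);
        return err;
    }
    return 0;
}
```

A reader that already has the old file open keeps reading the old contents until it closes its descriptor, which is why no reader observes a half-written file.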


A better approach is to use fcntl()-based advisory record locks (typically over the entire file, i.e. .l_whence = SEEK_SET, .l_start = 0, .l_len = 0).

The writer will grab a write/exclusive lock before truncating and rewriting the contents, and readers a read/shared lock before reading the contents.

This requires cooperation, however, and the writer must be prepared for the lock to be unavailable (or for acquiring it to take an unbounded amount of time).
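
A rough sketch of the locking variant follows. The function names (lock_whole_file, locked_write_status, locked_read_status) are hypothetical, and the sketch uses the blocking F_SETLKW command, so a writer that cannot afford to wait would need F_SETLK and a retry policy instead:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: lock (F_RDLCK, F_WRLCK) or unlock (F_UNLCK) the
   whole file, waiting as long as needed. Returns 0, or -1 with errno set. */
static int lock_whole_file(int fd, short type)
{
    struct flock fl = { .l_type = type, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };
    int rc;
    do {
        rc = fcntl(fd, F_SETLKW, &fl);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/* Hypothetical writer: truncate and rewrite under an exclusive lock.
   Returns 0 on success, or an errno code on failure. */
int locked_write_status(const char *path, const char *data, size_t len)
{
    size_t done = 0;
    int    fd, err = 0;

    assert(path != NULL && (data != NULL || len == 0));

    /* No O_TRUNC here: truncation must happen only while locked. */
    fd = open(path, O_WRONLY | O_CREAT | O_CLOEXEC, 0644);
    if (fd == -1)
        return errno;
    if (lock_whole_file(fd, F_WRLCK) == -1 || ftruncate(fd, 0) == -1)
        err = errno;
    while (!err && done < len) {
        ssize_t n = write(fd, data + done, len - done);
        if (n > 0)
            done += (size_t)n;
        else if (n != -1)
            err = EIO;
        else if (errno != EINTR)
            err = errno;
    }
    close(fd);   /* Closing the descriptor also releases the lock. */
    return err;
}

/* Hypothetical reader: read up to 'size' bytes under a shared lock.
   Returns the number of bytes read, or -1 with errno set. */
ssize_t locked_read_status(const char *path, char *buf, size_t size)
{
    size_t have = 0;
    int    fd, err = 0;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd == -1)
        return -1;
    if (lock_whole_file(fd, F_RDLCK) == -1)
        err = errno;
    while (!err && have < size) {
        ssize_t n = read(fd, buf + have, size - have);
        if (n > 0)
            have += (size_t)n;
        else if (n == 0)
            break;                 /* End of file. */
        else if (errno != EINTR)
            err = errno;
    }
    close(fd);
    if (err) {
        errno = err;
        return -1;
    }
    return (ssize_t)have;
}
```

Because these locks are advisory, a reader that skips the lock (for example, plain cat) can still observe the file between the truncate and the write.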


A Linux-only scheme would be to combine atomic replacement (via rename()) with file leases.

(When the writer process has an exclusive lease on an open file, it gets a signal whenever another process wants to open that same file (inode, not file name). It has at least a few seconds to downgrade or release the lease, at which point the opener gets access to the contents.)

Basically, the writer process creates an empty status file, and obtains an exclusive lease on it. Whenever the writer receives a signal that a reader wants to access the status file, it writes the current status to the file, releases the lease, creates a new empty file in the same directory (same mount suffices) as the status file, obtains an exclusive lease on that one, and renames it over the status file.

If the status file contents do not change all the time, only periodically, then the writer process creates an empty status file, and obtains an exclusive lease on it. Whenever the writer receives a signal that a reader wants to access the (empty) status file, it writes the current status to the file, and releases the lease. Then, when the writer process’s status is updated, and there is no lease yet, it creates a new empty file in the status file directory, takes an exclusive lease on it, and renames it over the status file.

This way, the status file is always updated just before a reader opens it, and only then. If there are multiple readers at the same time, they can open the status file without interruption when the writer releases the lease.

It is important to note that the status information should be collected in a single structure or similar, so that writing it out to the status file is efficient. Leases are automatically broken if not released soon enough (but there are a few seconds at least to react), and the lease is on the inode – file contents – not the file name, so we still need the atomic replacement.

Here’s a crude example implementation:

#define _POSIX_C_SOURCE  200809L
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdarg.h>
#include <inttypes.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <limits.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>

#define   LEASE_SIGNAL  (SIGRTMIN+0)

static pthread_mutex_t  status_lock = PTHREAD_MUTEX_INITIALIZER;
static int              status_changed = 0;
static size_t           status_len = 0;
static char            *status = NULL;

static pthread_t        status_thread;
static char            *status_newpath = NULL;
static char            *status_path = NULL;
static int              status_fd = -1;
static int              status_errno = 0;

char *join2(const char *src1, const char *src2)
{
    const size_t  len1 = (src1) ? strlen(src1) : 0;
    const size_t  len2 = (src2) ? strlen(src2) : 0;
    char         *dst;

    dst = malloc(len1 + len2 + 1);
    if (!dst) {
        errno = ENOMEM;
        return NULL;
    }

    if (len1 > 0)
        memcpy(dst, src1, len1);
    if (len2 > 0)
        memcpy(dst+len1, src2, len2);
    dst[len1+len2] = '\0';

    return dst;
}

static void *status_worker(void *payload __attribute__((unused)))
{
    siginfo_t info;
    sigset_t  mask;
    int       err, num;

    /* This thread blocks all signals except LEASE_SIGNAL. */
    sigfillset(&mask);
    sigdelset(&mask, LEASE_SIGNAL);
    err = pthread_sigmask(SIG_BLOCK, &mask, NULL);
    if (err)
        return (void *)(intptr_t)err;

    /* Mask for LEASE_SIGNAL. */
    sigemptyset(&mask);
    sigaddset(&mask, LEASE_SIGNAL);

    /* This thread can be canceled at any cancellation point. */
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);

    while (1) {
        num = sigwaitinfo(&mask, &info);
        if (num == -1 && errno != EINTR)
            return (void *)(intptr_t)errno;

        /* Ignore all but the lease signals related to the status file. */
        if (num != LEASE_SIGNAL || info.si_signo != LEASE_SIGNAL || info.si_fd != status_fd)
            continue;

        /* We can be canceled at this point safely. */
        pthread_testcancel();

        /* Block cancelability for a sec, so that we maintain the mutex correctly. */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL);

        pthread_mutex_lock(&status_lock);        
        status_changed = 0;

        /* Write the new status to the file. */
        if (status && status_len > 0) {
            const char        *ptr = status;
            const char *const  end = status + status_len;
            ssize_t            n;

            while (ptr < end) {
                n = write(status_fd, ptr, (size_t)(end - ptr));
                if (n > 0) {
                    ptr += n;
                } else
                if (n != -1) {
                    if (!status_errno)
                        status_errno = EIO;
                    break;
                } else
                if (errno != EINTR) {
                    if (!status_errno)
                        status_errno = errno;
                    break;
                }
            }
        }

        /* Close and release lease. */
        close(status_fd);
        status_fd = -1;

        /* After we release the mutex, we can be safely canceled again. */
        pthread_mutex_unlock(&status_lock);
        pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);

        pthread_testcancel();
    }
}

static int start_status_worker(void)
{
    sigset_t          mask;
    int               result;
    pthread_attr_t    attrs;

    /* This thread should block LEASE_SIGNAL signals. */
    sigemptyset(&mask);
    sigaddset(&mask, LEASE_SIGNAL);
    result = pthread_sigmask(SIG_BLOCK, &mask, NULL);
    if (result)
        return errno = result;

    /* Create the worker thread. */
    pthread_attr_init(&attrs);
    pthread_attr_setstacksize(&attrs, 2*PTHREAD_STACK_MIN);
    result = pthread_create(&status_thread, &attrs, status_worker, NULL);
    pthread_attr_destroy(&attrs);
    if (result)
        return errno = result;  /* Report thread creation failure to the caller. */

    /* Ready. */
    return 0;
}

int set_status(const char *format, ...)
{
    va_list  args;
    char    *new_status = NULL;
    int      len;

    if (!format)
        return errno = EINVAL;

    va_start(args, format);
    len = vasprintf(&new_status, format, args);
    va_end(args);
    if (len < 0)
        return errno = EINVAL;

    pthread_mutex_lock(&status_lock);
    free(status);
    status = new_status;
    status_len = len;
    status_changed++;

    /* Do we already have a status file prepared? */
    if (status_fd != -1 || !status_newpath) {
        pthread_mutex_unlock(&status_lock);
        return 0;
    }

    /* Prepare the status file. */
    do {
        status_fd = open(status_newpath, O_WRONLY | O_CREAT | O_CLOEXEC, 0666);
    } while (status_fd == -1 && errno == EINTR);
    if (status_fd == -1) {
        pthread_mutex_unlock(&status_lock);
        return 0;
    }

    /* In case of failure, do cleanup. */
    do {
        /* Set lease signal. */
        if (fcntl(status_fd, F_SETSIG, LEASE_SIGNAL) == -1)
            break;

        /* Get exclusive lease on the status file. */
        if (fcntl(status_fd, F_SETLEASE, F_WRLCK) == -1)
            break;

        /* Replace status file with the new, leased one. */
        if (rename(status_newpath, status_path) == -1)
            break;

        /* Success. */
        pthread_mutex_unlock(&status_lock);
        return 0;
    } while (0);

    if (status_fd != -1) {
        close(status_fd);
        status_fd = -1;
    }
    unlink(status_newpath);

    pthread_mutex_unlock(&status_lock);
    return 0;
}


int main(int argc, char *argv[])
{
    char   *line = NULL;
    size_t  size = 0;
    ssize_t len;

    if (argc != 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        const char *argv0 = (argc > 0 && argv[0]) ? argv[0] : "(this)";
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv0);
        fprintf(stderr, "       %s STATUS-FILE\n", argv0);
        fprintf(stderr, "\n");
        fprintf(stderr, "This program maintains a pseudofile-like status file,\n");
        fprintf(stderr, "using the contents from standard input.\n");
        fprintf(stderr, "Supply an empty line to exit.\n");
        fprintf(stderr, "\n");
        return EXIT_FAILURE;
    }

    status_path = join2(argv[1], "");
    status_newpath = join2(argv[1], ".new");
    unlink(status_path);
    unlink(status_newpath);

    if (start_status_worker()) {
        fprintf(stderr, "Cannot start status worker thread: %s.\n", strerror(errno));
        return EXIT_FAILURE;
    }

    if (set_status("Empty\n")) {
        fprintf(stderr, "Cannot create initial empty status: %s.\n", strerror(errno));
        return EXIT_FAILURE;
    }

    while (1) {
        len = getline(&line, &size, stdin);
        if (len < 1)
            break;

        line[strcspn(line, "\n")] = '\0';
        if (line[0] == '\0')
            break;

        set_status("%s\n", line);
    }

    pthread_cancel(status_thread);
    pthread_join(status_thread, NULL);

    if (status_fd != -1)
        close(status_fd);

    unlink(status_path);
    unlink(status_newpath);

    return EXIT_SUCCESS;
}

Save the above as server.c, then compile using e.g.

gcc -Wall -Wextra -O2 server.c -lpthread -o server

This implements a status server, storing each line from standard input to the status file if necessary. Supply an empty line to exit. For example, to use the file status in the current directory, just run

./server status

Then, if you use another terminal window to examine the directory, you see it has a file named status (with typically zero size). But, cat status shows you its contents; just like procfs/sysfs pseudofiles.
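
Readers need no special code. As a hedged illustration, a C reader could look like the sketch below; read_status is a hypothetical name, and the sketch assumes the caller supplies a buffer large enough for the status text. Without O_NONBLOCK, open() waits while the writer still holds the lease, so the data is already in place when the call returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read up to 'size' bytes of the status file into 'buf'.
   Returns the number of bytes read, or -1 with errno set. */
ssize_t read_status(const char *path, char *buf, size_t size)
{
    size_t have = 0;
    int    fd;

    assert(path != NULL && buf != NULL);

    /* Blocks until the writer releases (or loses) its lease. */
    do {
        fd = open(path, O_RDONLY | O_CLOEXEC);
    } while (fd == -1 && errno == EINTR);
    if (fd == -1)
        return -1;

    while (have < size) {
        ssize_t n = read(fd, buf + have, size - have);
        if (n > 0) {
            have += (size_t)n;
        } else if (n == 0) {
            break;                  /* End of file. */
        } else if (errno != EINTR) {
            const int err = errno;
            close(fd);
            errno = err;
            return -1;
        }
    }

    close(fd);
    return (ssize_t)have;
}
```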

Note that the status file is only updated if necessary, and only for the first reader/accessor after status changes. This keeps writer/server overhead and I/O low, even if the status changes very often.

The above example program uses a worker thread to catch the lease-break signals. This is because pthread mutexes cannot be locked or released safely in a signal handler (pthread_mutex_lock() etc. are not async-signal safe). The worker thread maintains its cancelability, so that it won’t be canceled when it holds the mutex; if canceled during that time, it will be canceled after it releases the mutex. It is careful that way.

Also, the temporary replacement file is not random, it is just the status file name with .new appended at end. Anywhere on the same mount would work fine.

As long as other threads also block the lease break signal, this works fine in multithreaded programs, too. (If you create other threads after the worker thread, they’ll inherit the correct signal mask from the main thread; start_status_worker() sets the signal mask for the calling thread.)

I do trust the approach in the program, but there may be bugs (and perhaps even thinkos) in this implementation. If you find any, please comment or edit.

User contributions licensed under: CC BY-SA