In principle what I want is very simple.
Two executables, ./read and ./write, respectively read from and write to a resource (let’s say a file). Using flock(2), it is easy to prevent race conditions between arbitrary invocations of ./read and ./write at arbitrary times.
The requirement is that each invocation of ./read contains a snapshot of the resource from a previous invocation, and if the current resource matches the snapshot, ./read should wait (sleep) until an invocation of ./write changes the resource.
From what I gather, the program flows of each program should be something like:
//read.c
obtain mutex0
read resource
is resource same as our snapshot?
    release mutex0 [1]
    sleep until ./write says to wake up [2]
    obtain mutex0
    read resource
do something with resource
release mutex0

//write.c
obtain mutex0
change resource in some way
tell any sleeping ./read's to wake up
release mutex0
The main problem with this approach is that there is a tangible delay between the lines labelled [1] and [2]. This means that a ./read can release mutex0 at [1], an entire invocation of ./write can complete, and then [2] executes, but will stall indefinitely because ./write already tried to wake up any sleeping ./reads before.
Is there no easy way to do what I want, besides using an entire separate full-blown server process? Also, for those curious I want to do this for an application in CGI.
Answer
No, the program flow for the reader is incorrect. You need some sort of locking mechanism to prevent writes while one or more reads are in progress, and some sort of wakeup mechanism to notify readers whenever a write is completed.
Your program flow for the writer(s) is okay:
# Initial read of file contents
Obtain lock
Read file
Release lock

# Whenever the writer wishes to modify the file:
Obtain lock
Modify file
Signal readers
Release lock
The program flow for the reader(s) should be:
# Initial read of file contents
Obtain lock
Read file
Release lock

# Wait and respond to changes in file
On signal:
    Obtain lock
    Read file
    Release lock
    Do something with modified file contents
If there is only one reader, then a mutex (pthread_mutex_t) in shared memory (accessible to all writers and the reader) suffices; otherwise, I recommend using an rwlock (pthread_rwlock_t) instead. For waking up any waiting readers, broadcast on a condition variable (pthread_cond_t). The difficulty, of course, is setting up that shared memory.
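As a rough illustration of that setup, here is a minimal sketch that places a process-shared mutex and condition variable in a POSIX shared memory object. The struct layout, the function names (shared_open, shared_publish, shared_wait) and the generation counter are my own assumptions, not something the approach prescribes. The counter is there so that a reader which arrives after the broadcast still notices the change instead of sleeping forever. The sketch does not handle a second process attaching while the creator is still initializing, nor a process dying while it holds the mutex.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical layout of the shared segment. */
struct shared_state {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    unsigned long   generation;   /* incremented on every publish */
};

/* Map the named segment; when `create` is nonzero, create and
 * initialize it first. Returns NULL on failure. */
struct shared_state *shared_open(const char *name, int create)
{
    int flags = create ? (O_CREAT | O_EXCL | O_RDWR) : O_RDWR;
    int fd = shm_open(name, flags, 0600);
    if (fd == -1)
        return NULL;
    if (create && ftruncate(fd, sizeof(struct shared_state)) == -1) {
        close(fd);
        shm_unlink(name);
        return NULL;
    }
    struct shared_state *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
    close(fd);                    /* the mapping stays valid after close */
    if (s == MAP_FAILED) {
        if (create)
            shm_unlink(name);
        return NULL;
    }
    if (create) {
        pthread_mutexattr_t ma;
        pthread_condattr_t ca;
        /* Both objects must be marked process-shared explicitly. */
        pthread_mutexattr_init(&ma);
        pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&s->lock, &ma);
        pthread_mutexattr_destroy(&ma);
        pthread_condattr_init(&ca);
        pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
        pthread_cond_init(&s->changed, &ca);
        pthread_condattr_destroy(&ca);
        s->generation = 0;
    }
    return s;
}

/* Writer side: call after the resource has been modified. */
void shared_publish(struct shared_state *s)
{
    pthread_mutex_lock(&s->lock);
    s->generation++;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Reader side: block until the generation differs from `seen`,
 * then return the new generation. */
unsigned long shared_wait(struct shared_state *s, unsigned long seen)
{
    pthread_mutex_lock(&s->lock);
    while (s->generation == seen)
        pthread_cond_wait(&s->changed, &s->lock);
    seen = s->generation;
    pthread_mutex_unlock(&s->lock);
    return seen;
}
```

A reader would remember the generation it saw together with its snapshot and pass it to shared_wait() on the next invocation. On older glibc versions this may need -pthread and -lrt when linking.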
Advisory file locking and the fanotify interface are also sufficient. Readers install a fanotify FAN_MODIFY mark, and simply wait for the corresponding event. Writers do not need to co-operate, except for the use of an advisory lock (which exists only to stop readers from reading while the file is modified).
Unfortunately, the interface currently requires the CAP_SYS_ADMIN capability, which you definitely do not want random CGI programs to have.
Advisory file locking and the inotify interface are sufficient, and I believe the most appropriate for this, when both readers and writers open and close the file for each set of operations. The program flow for this case for the reader(s) is:
Initialize inotify interface
Add inotify watch for IN_CREATE and IN_CLOSE_WRITE for "file"
Open "file" read-only
Obtain shared/read-lock
Read contents
Release lock
Close "file"
Loop:
    Read events from inotify descriptor.
    If IN_CREATE or IN_CLOSE_WRITE for "file":
        Open "file" read-only
        Obtain shared/read-lock
        Read contents
        Release lock
        Close "file"
        Do something with file contents
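The reader flow above could look roughly like this in C. It is a sketch under a few assumptions of mine: the helper names (watch_dir, wait_for_change, read_locked) are made up, the watch is placed on the containing directory because IN_CREATE is only reported for entries of a watched directory, and IN_MOVED_TO is added so that a file renamed into place is also noticed. Because the watch is installed before the initial read, a write that lands between that read and the wait stays queued on the inotify descriptor, so the wakeup is not lost.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Watch directory `dir` for entries being created, renamed into place,
 * or closed after writing. Returns the inotify descriptor, or -1. */
int watch_dir(const char *dir)
{
    int ifd = inotify_init1(IN_CLOEXEC);
    if (ifd == -1)
        return -1;
    if (inotify_add_watch(ifd, dir,
                          IN_CREATE | IN_CLOSE_WRITE | IN_MOVED_TO) == -1) {
        close(ifd);
        return -1;
    }
    return ifd;
}

/* Block until at least one event names `name`.
 * Returns 1 on a match, -1 on error. */
int wait_for_change(int ifd, const char *name)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    for (;;) {
        ssize_t len = read(ifd, buf, sizeof buf);
        if (len == -1 && errno == EINTR)
            continue;
        if (len <= 0)
            return -1;
        int matched = 0;
        /* One read() may return several variable-length events. */
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            if (ev->len > 0 && strcmp(ev->name, name) == 0)
                matched = 1;
            p += sizeof(struct inotify_event) + ev->len;
        }
        if (matched)
            return 1;
    }
}

/* Open `path`, take a shared advisory lock, read up to `cap` bytes,
 * then unlock and close. Returns the byte count, or -1. */
ssize_t read_locked(const char *path, char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_SH) == -1) {
        close(fd);
        return -1;
    }
    size_t used = 0;
    ssize_t n = 0;
    while (used < cap && (n = read(fd, buf + used, cap - used)) > 0)
        used += (size_t)n;
    flock(fd, LOCK_UN);
    close(fd);
    return n < 0 ? -1 : (ssize_t)used;
}
```

A reader would call watch_dir() first, then read_locked() and compare against its snapshot, and only call wait_for_change() if the two match.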
The writer is still just:
# Initial read of file contents
Open "file" for read-only
Obtain shared/read-lock on "file"
Read contents
Release lock
Close "file"

# Whenever wishes to modify file:
Open "file" for read-write
Obtain exclusive/write-lock
Modify file
Release lock
Close "file"
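The modifying half of that writer flow might be sketched as follows, using flock(2) as in the question. The function name write_locked and the choice to replace the whole file contents are assumptions for illustration. Closing the descriptor is what generates the IN_CLOSE_WRITE event that wakes the readers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Replace the contents of `path` with `len` bytes from `data` while
 * holding an exclusive advisory lock. Returns 0 on success, -1 on error. */
int write_locked(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX) == -1) {   /* blocks while readers hold LOCK_SH */
        close(fd);
        return -1;
    }
    int rc = ftruncate(fd, 0);        /* discard the old contents */
    const char *p = data;
    while (rc == 0 && len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            rc = -1;
        } else {
            p += n;
            len -= (size_t)n;
        }
    }
    flock(fd, LOCK_UN);
    if (close(fd) == -1)              /* close triggers IN_CLOSE_WRITE */
        rc = -1;
    return rc;
}
```

The explicit LOCK_UN mirrors the flow above; closing the descriptor would release the lock as well.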
Even if the writers do not obtain the lock, the readers will be notified when a writer closes the file; the only risk is that another set of changes is written (by another lock-spurning modifier) while the readers are reading the file.
Even if a modifier replaces the file with a new one, the readers get correctly notified when a new one is ready (either renamed/linked on top of the old one, or the new file creator closes the file). It is important to note that if the readers keep the file open, their file descriptors will not magically jump to the new file, and they will only see the old (probably deleted) contents.
If it is for some reason important that readers and writers do not close the file, the readers can still use inotify, but an IN_MODIFY mark instead, to be notified whenever the file is truncated or written to. In this case, it is important to remember that if the file is then replaced (renamed over, or deleted and recreated), the readers and writers will not see the new file, but will operate on the old, now invisible-in-the-filesystem file contents.
The program flow for the reader:
Initialize inotify interface
Add inotify watch for IN_MODIFY for "file"
Open "file" read-only
Obtain shared/read-lock
Read contents
Release lock
Loop:
    Read events from inotify descriptor.
    If IN_MODIFY for "file":
        Obtain shared/read-lock on "file"
        Read contents
        Release lock
        Do something with file contents
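For the keep-the-file-open variant, a sketch along these lines may help. The names (watch_modify, wait_modified, reread_locked) are hypothetical, and draining the queue before re-reading is my own design choice: a single logical change, such as a truncate followed by several writes, can produce more than one IN_MODIFY event, and one re-read covers them all.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <sys/file.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Watch `path` itself for modifications; the descriptor is
 * non-blocking so that queued events can be drained. */
int watch_modify(const char *path)
{
    int ifd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
    if (ifd == -1)
        return -1;
    if (inotify_add_watch(ifd, path, IN_MODIFY) == -1) {
        close(ifd);
        return -1;
    }
    return ifd;
}

/* Sleep until at least one event is pending, then discard everything
 * queued so far. Returns 0 on success, -1 on error. */
int wait_modified(int ifd)
{
    struct pollfd pfd = { .fd = ifd, .events = POLLIN };
    int r;
    do {
        r = poll(&pfd, 1, -1);
    } while (r == -1 && errno == EINTR);
    if (r == -1)
        return -1;

    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    for (;;) {
        ssize_t len = read(ifd, buf, sizeof buf);
        if (len > 0)
            continue;                 /* keep draining */
        if (len == -1 && errno == EINTR)
            continue;
        if (len == -1 && errno == EAGAIN)
            return 0;                 /* queue is empty */
        return -1;
    }
}

/* Re-read from the start of an already open descriptor under a shared
 * advisory lock. Returns the byte count, or -1. */
ssize_t reread_locked(int fd, char *buf, size_t cap)
{
    if (flock(fd, LOCK_SH) == -1)
        return -1;
    ssize_t n = pread(fd, buf, cap, 0);
    flock(fd, LOCK_UN);
    return n;
}
```

If a writer is still in the middle of a change when the reader wakes, the reader blocks on the shared lock until the writer is done, and may then see one more event for the same change and re-read once more than strictly needed.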
The program flow for the writer is still almost the same:
# Initial read of file contents
Open "file" for read-only
Obtain shared/read-lock on "file"
Read contents
Release lock
Close "file"
Open "file" for read-write

# Whenever writer wishes to modify the file:
Obtain exclusive/write-lock
Modify file
Release lock
It may be important to note that the inotify events occur after the fact. There is usually some small latency, which might depend on the load on the machine. So, if prompt response to file changes is important for the system to work correctly, you may have to go with a mutex or rwlock and a condition variable in shared memory approach instead.
In my experience, these latencies tend to be shorter than the typical human reaction interval. Therefore, I consider the inotify interface fast and reliable enough at human timescales, and I suggest you do so too; it is not so at millisecond and sub-millisecond machine timescales.