
ACID Transactions at the File System

Background:

I am getting a temperature float from an Arduino via a serial connection. I need to cache this temperature data every 30 seconds so that other applications (e.g. web, thermostat controller) can access it without overloading the serial connection.

Currently I cache this data to RAM as a file in /run (I’m trying to follow Linux convention). Other applications can then poll the file for the temperature whenever they want, all day long, with I/O now the only bottleneck (this runs on a Raspberry Pi, so there is not a lot of enterprise-level need here).

Problem:

I think when an app reads these files, it risks reading corrupt data. Should a writer update the file, and a reader try to read the file at the same time, can corrupt data be read, causing the thermostat to behave erratically?

Should I just use sqlite3 as an overkill solution, or use file locks (and does that risk something else not working perfectly)?

This is all taking place in multiple Python processes. Is Linux able to handle this situation natively, or do I need to somehow apply the principles mentioned here?

Answer

Calls to write(2) ought to be atomic under Linux.

This means that as long as you write a single buffer in a single call, you can be certain that readers won’t read an incomplete record. You might want to use os.write to make sure that no buffering or chunking happens that you are not aware of.
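A minimal sketch of the single-buffer approach, assuming (as stated above) that one write(2) is atomic with respect to one read(2). The function names, the record layout, and the fixed record width are illustrative assumptions, not part of the original question. The fixed width is a design choice: every update overwrites the whole previous record at offset 0, so the file never needs to be truncated in a separate step.

```python
import os

# Assumed layout: 8-character right-aligned float plus a newline = 9 bytes.
RECORD_LEN = 9


def write_temperature(path, temperature):
    """Write one fixed-width record with a single os.write call."""
    record = f"{temperature:8.2f}\n".encode("ascii")
    if len(record) != RECORD_LEN:
        raise ValueError("temperature does not fit the fixed-width record")
    # No O_TRUNC: truncating first would be a separate step in which a
    # reader could observe an empty file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        written = os.write(fd, record)  # unbuffered, maps to one write(2)
        if written != RECORD_LEN:
            raise OSError("short write; record may be incomplete")
    finally:
        os.close(fd)


def read_temperature(path):
    """Read the record with a single os.read call and parse it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, RECORD_LEN)  # unbuffered, maps to one read(2)
    finally:
        os.close(fd)
    if len(data) != RECORD_LEN:
        raise ValueError("incomplete record")
    return float(data.decode("ascii"))
```

The writer process would call something like write_temperature("/run/temperature", 21.5) every 30 seconds, and each reader would call read_temperature with the same path; the path itself is a placeholder.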

If a read is happening and a file is updated, will the read use the new data while in the middle of a file, or does it somehow know how to get data from the old file (how)?

If there is exactly one read(2) and one write(2), you are guaranteed to see a consistent result. If you split your write into two calls, a reader may run after the first part has been written but before the second, which would be an atomicity violation. If you need to write multiple buffers, either combine them yourself into a single buffer or use writev(2).
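Both options for multiple buffers can be sketched as follows. This is an illustration under assumptions: the function names and the example record (a timestamp field followed by a temperature field) are hypothetical. os.writev is Python's wrapper around writev(2) and is available on Unix systems.

```python
import os


def write_parts_joined(path, parts):
    """Combine the buffers yourself, then issue one os.write call."""
    record = b"".join(parts)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        written = os.write(fd, record)
        if written != len(record):
            raise OSError("short write; record may be incomplete")
        return written
    finally:
        os.close(fd)


def write_parts_vectored(path, parts):
    """Hand all buffers to the kernel in one writev(2) call."""
    expected = sum(len(part) for part in parts)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        written = os.writev(fd, parts)  # returns total bytes written
        if written != expected:
            raise OSError("short write; record may be incomplete")
        return written
    finally:
        os.close(fd)
```

For example, write_parts_vectored(path, [b"1700000000", b" ", b"   21.50", b"\n"]) sends the four buffers as one record. Joining is simpler for small records like this one; writev avoids the extra copy when the buffers are large.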

User contributions licensed under: CC BY-SA