I have some Python code that goes roughly like this, using some libraries that you may or may not have:
```python
# Open it for writing
vcf_file = open(local_filename, "w")

# Download the region to the file.
subprocess.check_call(["bcftools", "view",
                       options.truth_url.format(sample_name),
                       "-r", "{}:{}-{}".format(ref_name, ref_start, ref_end)],
                      stdout=vcf_file)

# Close the parent process's copy of the file object
vcf_file.close()

# Upload it
file_id = job.fileStore.writeGlobalFile(local_filename)
```
Basically, I’m starting a subprocess that’s supposed to go download some data for me and print it to standard out. I’m redirecting that data to a file, and then, as soon as the subprocess call returns, I’m closing my handle to the file and then copying the file elsewhere.
I’m observing that, sometimes, the tail end of the data I’m expecting isn’t making it into the copy. Now, it’s possible that `bcftools` is just occasionally not writing that data, but I’m worried that I might be doing something unsafe and somehow getting access to the file after `subprocess.check_call()` has returned, but before the data that the child process writes to standard output makes it onto the disk where I can see it.
Looking at the C standard (since bcftools is implemented in C/C++), it looks like when a program exits normally, all open streams (including standard output) are flushed and closed. See the [lib.support.start.term] section, describing the behavior of `exit()`, which is called implicitly when `main()` returns:
– Next, all open C streams (as mediated by the function signatures declared in `<cstdio>`) with unwritten buffered data are flushed, all open C streams are closed, and all files created by calling `tmpfile()` are removed.

– Finally, control is returned to the host environment. If status is zero or EXIT_SUCCESS, an implementation-defined form of the status successful termination is returned. If status is EXIT_FAILURE, an implementation-defined form of the status unsuccessful termination is returned. Otherwise the status returned is implementation-defined.
So before the child process exits, it closes (and thus flushes) standard output.
However, the manual page for Linux `close(2)` notes that closing a file descriptor does not necessarily guarantee that any data written to it has actually made it to disk:
A successful close does not guarantee that the data has been successfully saved to disk, as the kernel defers writes. It is not common for a filesystem to flush the buffers when the stream is closed. If you need to be sure that the data is physically stored, use fsync(2). (It will depend on the disk hardware at this point.)
Thus, it would appear that, when a process exits, its standard output stream is flushed, but if that stream is actually backed by a file descriptor pointing to a file on disk, the write to disk is not guaranteed to have completed. I suspect that may be what is going on here.
So, my actual questions:
Is my reading of the specs correct? Can a child process appear to its parent to have terminated before its redirected standard output is available on disk?
Is it possible to somehow wait until all data written by the child process to files has actually been synced to disk by the OS?
Should I be calling `flush()` or some Python version of `fsync()` on the parent process’s copy of the file object? Can that force writes to the same file descriptor by child processes to be committed to disk?
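For reference, the kind of change I’ve been considering looks roughly like this (a sketch only; `os.fsync()` is the standard-library call, but whether it helps with data written by the child is exactly what I’m asking):

```python
import os
import subprocess

# Sketch: same download as above, but explicitly flush and fsync the parent's
# copy of the file object before closing it. Whether this affects data written
# through the child's duplicated file descriptor is the open question.
with open(local_filename, "w") as vcf_file:
    subprocess.check_call(["bcftools", "view",
                           options.truth_url.format(sample_name),
                           "-r", "{}:{}-{}".format(ref_name, ref_start, ref_end)],
                          stdout=vcf_file)
    vcf_file.flush()             # flush the parent's user-space buffer (empty here)
    os.fsync(vcf_file.fileno())  # ask the kernel to push dirty pages to the disk
```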
Answer
Yes, there could be minutes before the data is physically written to the disk. But you can read it long before that.

Unless you are worried about a power failure or a kernel panic, it doesn’t matter whether the data is on disk. The important part is whether the kernel thinks the data has been written.
It is safe to read from the file as soon as `check_call()` returns. If you don’t see all the data, it may indicate a bug in `bcftools` or that `writeGlobalFile()` doesn’t upload all the data from the file. You could try to work around the former by disabling block buffering for `bcftools`’ stdout (provide a pseudo-tty, use the `unbuffer` command-line utility, etc.).
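For example, a minimal sketch of the pseudo-tty route, assuming the `unbuffer` utility from the `expect` package is installed:

```python
# Sketch: run bcftools under `unbuffer`, which gives it a pseudo-tty so its
# stdout is line-buffered instead of block-buffered; the output still lands
# in the redirected file.
subprocess.check_call(["unbuffer", "bcftools", "view",
                       options.truth_url.format(sample_name),
                       "-r", "{}:{}-{}".format(ref_name, ref_start, ref_end)],
                      stdout=vcf_file)
```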
Q: Is my reading of the specs correct? Can a child process appear to its parent to have terminated before its redirected standard output is available on disk?
Yes. Yes.
Q: Is it possible to somehow wait until all data written by the child process to files has actually been synced to disk by the OS?
No. `fsync()` is not enough in the general case. Likely, you don’t need it anyway (reading the data back is a different issue from making sure that it is written to disk).
Q: Should I be calling `flush()` or some Python version of `fsync()` on the parent process’s copy of the file object? Can that force writes to the same file descriptor by child processes to be committed to disk?
It would be pointless. `.flush()` flushes buffers that are internal to the parent process (you can use `open(filename, 'wb', 0)` to avoid creating an unnecessary buffer in the parent). `fsync()` works on a file descriptor (the child has its own file descriptor). I don’t know whether the kernel uses different buffers for different file descriptors referring to the same disk file. Again, it doesn’t matter: if you observe missing data (with no crashes), `fsync()` won’t help here.
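A sketch of the unbuffered-open variant mentioned above (Python requires binary mode when buffering is disabled):

```python
# Sketch: open the target file with buffering disabled in the parent, so there
# is no user-space buffer in this process at all. The child still has its own
# stdio buffers, and the kernel still has its page cache.
with open(local_filename, "wb", 0) as vcf_file:
    subprocess.check_call(["bcftools", "view",
                           options.truth_url.format(sample_name),
                           "-r", "{}:{}-{}".format(ref_name, ref_start, ref_end)],
                          stdout=vcf_file)
```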
Q: Just to be clear, I see that you’re asserting that the data should indeed be readable by other processes, because the relevant OS buffers are shared between processes. But what’s your source for that assertion? Is there a place in a spec or the Linux documentation you can point to that guarantees that those buffers are shared?
Look for “After a `write()` to a regular file has successfully returned” in the POSIX specification for `write()`:

– Any successful `read()` from each byte position in the file that was modified by that write shall return the data specified by the `write()` for that position until such byte positions are again modified.
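In other words, once `check_call()` has returned, the child has exited and `exit()` has flushed its stdio buffers, so a plain read-back in the parent is guaranteed to see everything the child wrote, whether or not it has reached the physical disk. A sketch of a sanity check, run right after the code from the question:

```python
import os

# Sketch: the page cache is shared kernel-wide, so the parent sees the child's
# writes immediately; no fsync() is needed just to read the data back.
print("bytes visible to the parent:", os.stat(local_filename).st_size)
with open(local_filename) as readback:
    print("last line:", readback.readlines()[-1])
```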