This question is somewhat similar to Bad file descriptor, but it’s not the same at all. I know this is a “bad question” (“too localized”, maybe), but I can’t figure it out and I’m out of ideas.
Introduction
I have a manager thread that starts 75 other threads. Each of these threads does a lot of things, so I’ll describe only the relevant ones.
Please note: if I start only a few threads (for example 3, 5, or 10), this error does not appear! This makes me think that this is some multithreading issue, but it doesn’t seem to be one. You’ll see why in the next sections.
So, in the following 2 cases, I SOMETIMES receive the error Bad file descriptor:
case 1
The error appears in TinyXML
There’s an XML file that’s needed by all threads. All of these threads use TinyXML to parse the file. ALL of these threads use this file READ-ONLY! (I know this can be optimized, but whatever.)
So, the code that causes the Bad file descriptor error is this:
    // ...
    // NOTE: this is LOCAL, other threads do NOT have access to it
    TiXmlDocument doc;
    doc.LoadFile( filename );

    // and here's the LoadFile:
    bool TiXmlDocument::LoadFile( const char* _filename, TiXmlEncoding encoding )
    {
        //...
        FILE* file = fopen( value.c_str (), "rb" );
        if ( file )
        {
            // this IS executed, so file is NOT NULL for sure
            bool result = LoadFile( file, encoding );
            //...
        }
        //...
    }

    bool TiXmlDocument::LoadFile( FILE* file, TiXmlEncoding encoding )
    {
        // ...
        long length = 0;
        fseek( file, 0, SEEK_END );
        // from the code above, we are SURE that file is NOT NULL, it's valid, but
        length = ftell( file ); // RETURNS -1 with errno: 9 (BAD FILE DESCRIPTOR)
        // how is this possible, as "file" is not NULL and it appears to be valid?
        // ...
    }
case 2
This is a bit more complicated. I’ve removed the checks of the return values here, but I have them in my real code, so that is not the problem:
    int hFileR = open( sAlarmFileName.c_str(), O_CREAT | O_RDONLY,
                       S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH );
    // hFileR is > 0 for sure, so success

    flock( hFileR, LOCK_EX ); /* the result is > 0 for sure, so success */

    // read the file into a string
    while( (nRes = read(hFileR, BUFF, MAX_RW_BUFF_SIZE)) > 0 )
        // ...

    // Write new data to file: reopen/create file - write and truncate mode
    int hFileW = open( sAlarmFileName.c_str(), O_CREAT | O_WRONLY | O_TRUNC,
                       S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH );
    // hFileW is > 0 for sure, so success

    do
    {
        int nWrtRes = write( hFileW, szText + nBytesWritten, nSize - nBytesWritten );
        // nWrtRes is always >= 0, so success
        nBytesWritten += nWrtRes;
    } while( nSize > nBytesWritten );

    close( hFileW ); // this one is successful too

    if( flock(hFileR, LOCK_UN) == -1 )
    {
        // THIS FAILS and executes
        _Exit( FAILURE );
    }

    if( close( hFileR ) < 0 )
    {
        // if the previous one does not fail, this one is successful too
    }
Sorry for the long question. Any ideas?
Answer
A few words on how file descriptors work:
Open files are process-wide resources. To refer to them, a process-wide index is used: integer values called file descriptors. When a thread opens a file, the open file is referred to by such an index. This index is unique within the process (not within a thread). Once a file is closed, its file descriptor (the integer index) is no longer in use and can be reused by the process (and any of its threads).
Example:
In a process, the 1st call to open() by any thread might return 3, and the 2nd might return 4.
If 3 is then closed, the 3rd call to open() may return 3 again.
If the 1st call is made by thread 1, the 2nd by thread 2, and the 3rd by thread 3, it is easy to see that thread 1 must not close its file descriptor a second time: the value 3 might already have been recycled and be in use by thread 3. Thread 3 would then end up accessing an invalid file descriptor, because it was closed by the 2nd (erroneous) call to close() by thread 1. Ok? 😉
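Applied to case 1 of the question, this would explain how a FILE* can be non-NULL and still fail: the FILE object itself is intact, but the descriptor underneath it is gone. Below is a minimal sketch in C, not the asker's code. The helper name ftell_errno_after_stray_close is made up for illustration, and the stray close() is issued from the same thread to simulate what another thread might do.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the question): opens `path` with fopen(),
 * closes the underlying descriptor behind stdio's back, then calls ftell().
 * Returns the errno left by ftell(), 0 if ftell() unexpectedly succeeded,
 * or -1 if the file could not be opened at all. */
int ftell_errno_after_stray_close(const char *path)
{
    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return -1;

    /* Simulates the erroneous close() by some other thread:
     * `file` stays non-NULL, but its descriptor is no longer open. */
    close(fileno(file));

    errno = 0;
    long length = ftell(file);
    int err = (length == -1) ? errno : 0;

    /* Releases the FILE object; the close() it performs internally
     * fails because the descriptor is already gone. */
    fclose(file);
    return err;
}
```

On Linux with glibc, calling ftell_errno_after_stray_close("/dev/null") should give EBADF (errno 9), the same value reported in case 1.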
Try setting up some example code and inspect/log the integer values that calls to open() return as file descriptors, to get an impression of how it works.
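One possible version of such an experiment is sketched below in C. It replays the three-"thread" sequence from the example in a single thread, so the outcome is deterministic. The struct fd_trace and the function trace_fd_reuse are hypothetical names, and /dev/null is used only because it is a file that can always be opened on Linux.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical record of what the experiment observed. */
struct fd_trace {
    int first;        /* descriptor from the 1st open() ("thread 1") */
    int second;       /* descriptor from the 2nd open() ("thread 2") */
    int third;        /* descriptor from the 3rd open() ("thread 3") */
    int victim_errno; /* errno when "thread 3" uses its descriptor afterwards */
};

/* Replays the open/close sequence; returns 0 on success, -1 if an open() failed. */
int trace_fd_reuse(struct fd_trace *t)
{
    t->victim_errno = 0;

    t->first = open("/dev/null", O_RDONLY);
    t->second = open("/dev/null", O_RDONLY);
    if (t->first < 0 || t->second < 0)
        return -1;

    close(t->first);                        /* legitimate close by "thread 1" */

    t->third = open("/dev/null", O_RDONLY); /* gets the lowest free number */
    if (t->third < 0)
        return -1;

    close(t->first);                        /* BUG on purpose: 2nd close by "thread 1" */

    /* "thread 3" now checks its descriptor; F_GETFD fails if it is not open. */
    if (fcntl(t->third, F_GETFD) == -1)
        t->victim_errno = errno;
    else
        close(t->third);

    close(t->second);
    return 0;
}
```

Logging the fields after a call shows the recycling: third has the same value as first, and victim_errno is EBADF even though "thread 3" never closed anything itself.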
Note:
This might also apply to stdin, stdout and stderr, the “predefined” file descriptors 0, 1 and 2. Under recent Linux, closing stdin followed by a call to int fd = open("myfoofile.bar", ...) might very well return 0 as the file descriptor fd. However, either the kernel or glibc does not seem to handle such a 0 as expected. Obscure errors might occur when using lseek(fd, ...), for example. Try it! ;->>
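A small sketch in C of the first half of that experiment, assuming a single-threaded process: it only shows which number open() hands out once stdin is closed, and does not try to reproduce the lseek() errors mentioned above. The helper name open_after_closing_stdin is made up for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: closes stdin, opens `path`, and returns the descriptor
 * number that open() handed out (or -1 if open() failed). The original stdin
 * is restored before returning, so the caller is left unaffected. */
int open_after_closing_stdin(const char *path)
{
    int saved = dup(STDIN_FILENO);   /* -1 if stdin is already closed */
    if (saved >= 0)
        close(STDIN_FILENO);

    int fd = open(path, O_RDONLY);   /* lowest free number, which is now 0 */
    if (fd >= 0)
        close(fd);

    if (saved >= 0) {
        dup2(saved, STDIN_FILENO);   /* put the real stdin back on 0 */
        close(saved);
    }
    return fd;
}
```

With any readable file as path, the returned value is 0: the new file has taken over the number that the rest of the program still thinks of as stdin.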