I think that it’s safe to say that C locales are universally recognized as a bad idea.
Writing an application that tries to parse or write text-based machine formats (which happens quite often) with C standard library functions gets near-impossible if you have to account for the locale being set to anything other than "C". Since the locale is normally per-process (and setlocale is often not thread-safe), if you are writing a library or a multithreaded program it’s not safe even to call setlocale(LC_ALL, "C") and restore the previous locale after doing your stuff.
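One way to sidestep the global locale for machine formats, sketched here under assumptions: give each stream its own "C" locale instead of touching setlocale. The helper names parse_double_c and format_double_c are made up for this illustration, and the claim that the stream ignores the process-wide C locale holds on common implementations (for example libstdc++ on glibc) but is not guaranteed everywhere.

```cpp
#include <locale>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: parse a double with the classic "C" locale,
// regardless of what setlocale() or std::locale::global() was set to.
// Returns std::nullopt if the text is not entirely one valid number.
std::optional<double> parse_double_c(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale::classic());   // affects only this stream
    double value = 0.0;
    if (!(in >> value)) {
        return std::nullopt;            // no number at the start
    }
    char extra;
    if (in >> extra) {
        return std::nullopt;            // trailing non-whitespace, e.g. "1,5"
    }
    return value;
}

// Hypothetical helper: format a double with '.' as the decimal separator.
std::string format_double_c(double value) {
    std::ostringstream out;
    out.imbue(std::locale::classic());  // affects only this stream
    out << value;
    return out.str();
}
```

Because the locale is attached to the stream object, nothing process-wide is modified, so this does not run into the thread-safety problem of calling setlocale and restoring it afterwards.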
Now, for these reasons the rule is normally “avoid setlocale, period”; but we’ve been bitten several times in the past by the peculiar behavior of QCoreApplication and derived classes; the documentation says:
On Unix/Linux Qt is configured to use the system locale settings by default. This can cause a conflict when using POSIX functions, for instance, when converting between data types such as floats and strings, since the notation may differ between locales. To get around this problem, call the POSIX function setlocale(LC_NUMERIC,"C") right after initializing QApplication or QCoreApplication to reset the locale that is used for number formatting to “C”-locale.
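To make the documented workaround concrete, here is a minimal sketch without Qt. The function adopt_environment_locale is a stand-in for the setlocale(LC_ALL, "") call that the application constructor is said to perform; in a real program that line would be the QCoreApplication construction. All function names are invented for this illustration.

```cpp
#include <clocale>
#include <cstdio>
#include <string>

// Stand-in (assumption) for what constructing QCoreApplication does on Unix:
// adopt whatever locale the environment variables request.
void adopt_environment_locale() {
    std::setlocale(LC_ALL, "");
}

// The workaround quoted from the documentation: reset only number
// formatting to the "C" locale, leaving the other categories alone.
void reset_numeric_locale() {
    std::setlocale(LC_NUMERIC, "C");
}

// printf-family functions follow LC_NUMERIC for the decimal separator.
std::string format_with_printf(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.2f", value);
    return buffer;
}
```

Calling adopt_environment_locale() and then reset_numeric_locale() should make format_with_printf use '.' again, even when the environment selects a locale whose decimal separator is a comma. Categories other than LC_NUMERIC stay at the environment's settings.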
This behavior of QCoreApplication has been described in another question; my question is: what could be the rationale for this apparently foolish behavior? In particular, what’s so peculiar about Unix and Linux that prompted such a decision only on these platforms?
(Incidentally, will everything break if I just do setlocale(LC_ALL, "C"); after creating the QApplication? If it’s fine, why don’t they just remove their setlocale(LC_ALL, "");?)
Answer
From investigations through the Qt source code conducted by @Phil Armstrong and me (see the chat log), it seems that the setlocale call has been there since version 1 for several reasons:
- XIM, at least in ancient times, didn’t correctly “get” the current locale without such a call.
- On Solaris, it even crashed with the default C locale.
- On Unix systems, it’s used (among other systems, in a complex game of fallbacks) to “sniff” the “system character set” (whatever that means on Unix), and thus be able to convert between the QString representation and the “local” 8-bit encoding (this is particularly critical for file paths).
It’s true that it already checks the LC_* environment variables, as it does with QLocale, but I suppose that it may be useful to have nl_langinfo decode the current LC_CTYPE if the application explicitly changed it (but to see if there is an explicit change, it has to start with system defaults).
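As a rough illustration of the “sniffing” described above (not Qt’s actual code, which the investigation found to involve several fallbacks), the POSIX call nl_langinfo(CODESET) reports the character set of the current LC_CTYPE. The function names below are invented for this sketch, and it is POSIX-only.

```cpp
#include <clocale>
#include <langinfo.h>   // POSIX only: nl_langinfo, CODESET
#include <string>

// Name of the character set for the LC_CTYPE currently in effect,
// for example "UTF-8". The exact spelling is platform-dependent.
std::string current_codeset() {
    const char* name = nl_langinfo(CODESET);
    return name ? name : "";
}

// Start from the environment's defaults, then ask for the codeset.
// Without the setlocale call the program is still in the "C" locale,
// so the answer would describe "C" rather than the user's settings.
std::string environment_codeset() {
    std::setlocale(LC_CTYPE, "");
    return current_codeset();
}
```

This shows why some setlocale call is needed before nl_langinfo can say anything about the user’s environment: the query reflects the locale in effect in the process, not the environment variables directly.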
It’s interesting that they did a setlocale(LC_NUMERIC, "C") immediately after the setlocale(LC_ALL, ""), but this was removed in Qt 4.4. The rationale for this decision seems to lie in task #132859 of the old Qt bugtracker (which moved between TrollTech, Nokia and QtSoftware.com before vanishing without a trace, not even in the Wayback Machine), and it’s referenced in two bugs regarding this topic. I think that an authoritative answer on the topic was there, but I can’t find a way to recover it.
My guess is that it introduced subtle bugs, since the environment seemed pristine, but it had in fact been touched by the setlocale call in every category except LC_NUMERIC (the one where a change is most evident); they probably removed the call to make the locale setting more evident and have application developers act accordingly.