If I build a C++ app with -static-libstdc++ which loads a shared lib (via dlopen) that was also built with -static-libstdc++, then the app segfaults during dlopen.
However, this only happens in some setups:
- GCC 4.7.4, 32-bit: pass
- GCC 4.8.3, 32-bit: pass
- GCC 4.8.4, 64-bit: pass
- GCC 4.9.2, 64-bit: pass
- GCC 4.9.3, 32-bit: FAIL (unless RTLD_DEEPBIND is specified)
- GCC 4.9.3, 64-bit: pass
Findings:
- If -static-libstdc++ is not used when building for either the shared lib or the app, it works.
- If (RTLD_LAZY | RTLD_DEEPBIND) is passed to dlopen, it works. So I suspect the problem is related to symbol confusion/duplication between the app and the .so.
- Interestingly, if I have the code load the .so first with (RTLD_LAZY | RTLD_DEEPBIND), and then close it and re-load it with only RTLD_LAZY, it also works.
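The RTLD_DEEPBIND workaround from the findings above can be wrapped in a small helper. This is only a sketch under stated assumptions: the name open_deepbind is hypothetical, RTLD_DEEPBIND is a glibc extension (so this is Linux/glibc-specific), and it works around the crash rather than removing the duplicated libstdc++ symbols.

```cpp
// RTLD_DEEPBIND is a GNU extension; g++ on Linux normally defines
// _GNU_SOURCE already, but define it defensively before any include.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper (not part of the original repro): opens a shared
// library so that it prefers its own symbols over same-named symbols
// already present in the global scope. Returns NULL on failure.
void* open_deepbind(const char* path)
{
    dlerror();  // clear any stale error state
    void* handle = dlopen(path, RTLD_LAZY | RTLD_DEEPBIND);
    if (!handle) {
        const char* msg = dlerror();
        std::fprintf(stderr, "dlopen(%s) failed: %s\n",
                     path, msg ? msg : "unknown error");
    }
    return handle;
}
```

In the repro this would replace the plain dlopen("./libfunctions.so", RTLD_LAZY) call; older glibc versions also need -ldl at link time, as in the build commands.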
Repro Steps
Code:
functions.cpp
    #include <iostream>

    extern "C" {
        int ExportedFunction1()
        {
            std::cout << "\n---\n" << __FUNCTION__ << "\n---\n" << std::endl;
            return(0);
        }
    }
main.cpp
    #include <iostream>
    #include <dlfcn.h>

    int main(int argc, char * argv[])
    {
        void * ph(NULL);

        // Optional first pass: load with RTLD_DEEPBIND, then close.
        if(argc == 2 && argv[1][0] == '1')
        {
            std::cout << "Calling dlopen with flags RTLD_LAZY | RTLD_DEEPBIND..." << std::flush;
            ph = dlopen("./libfunctions.so", RTLD_LAZY | RTLD_DEEPBIND);
            std::cout << "done. Result: " << ph << std::endl;
            if(ph) dlclose(ph);
        }

        // Plain RTLD_LAZY load: this is the call that crashes.
        std::cout << "Calling dlopen with flags RTLD_LAZY..." << std::flush;
        ph = dlopen("./libfunctions.so", RTLD_LAZY);
        std::cout << "done. Result: " << ph << std::endl;
        if(ph) dlclose(ph);

        return 0;
    }
Build
    $ g++ -m32 -g -fPIC -c functions.cpp -o functions.o
    $ g++ -m32 -g -fPIC -shared -Wl,-soname,libfunctions.so -static-libgcc -static-libstdc++ functions.o -o libfunctions.so
    $ g++ -m32 -g -fPIC -static-libgcc -static-libstdc++ main.cpp -l dl -o main
Run
    $ ./main
    Calling dlopen with flags RTLD_LAZY...Segmentation fault (core dumped)
Backtrace
    $ gdb -c ./core ./main
    GNU gdb (GDB) 7.9.1
    Copyright (C) 2015 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "i686-linux".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from ./main...done.
    [New LWP 19846]
    warning: Could not load shared library symbols for linux-gate.so.1.
    Do you need "set solib-search-path" or "set sysroot"?
    Core was generated by `./main'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0 __atomic_add_single (__val=1, __mem=0x0) at /home/test/dev/3rdParty/gcc/build/i686-pc-linux-gnu/libstdc++-v3/include/ext/atomicity.h:74
    74 { *__mem += __val; }
    (gdb) bt
    #0 __atomic_add_single (__val=1, __mem=0x0) at /home/test/dev/3rdParty/gcc/build/i686-pc-linux-gnu/libstdc++-v3/include/ext/atomicity.h:74
    #1 __atomic_add_dispatch (__val=1, __mem=0x0) at /home/test/dev/3rdParty/gcc/build/i686-pc-linux-gnu/libstdc++-v3/include/ext/atomicity.h:98
    #2 _M_add_reference (this=0x0) at /home/test/dev/3rdParty/gcc/build/i686-pc-linux-gnu/libstdc++-v3/include/bits/locale_classes.h:510
    #3 std::locale::locale (this=0xb74f7ffc <__gnu_internal::buf_cout_sync+28>) at /home/test/dev/3rdParty/gcc/gcc-4.9.3/libstdc++-v3/src/c++98/locale_init.cc:223
    #4 0xb746f559 in basic_streambuf (this=<optimized out>) at /home/test/dev/3rdParty/gcc/build/i686-pc-linux-gnu/libstdc++-v3/include/streambuf:466
    #5 stdio_sync_filebuf (__f=0xb76a2a20 <_IO_2_1_stdout_>, this=<optimized out>) at /home/test/dev/3rdParty/gcc/build/i686-pc-linux-gnu/libstdc++-v3/include/ext/stdio_sync_filebuf.h:77
    #6 std::ios_base::Init::Init (this=0xb74f7a01 <std::__ioinit>) at /home/test/dev/3rdParty/gcc/gcc-4.9.3/libstdc++-v3/src/c++98/ios_init.cc:85
    #7 0xb7469419 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at /home/test/dev/3rdParty/build_toolchain/include/c++/4.9.3/iostream:74
    #8 0xb7469456 in _GLOBAL__sub_I_functions.cpp(void) () at functions.cpp:12
    #9 0xb772c25e in ?? () from /lib/ld-linux.so.2
    #10 0xb772c35a in ?? () from /lib/ld-linux.so.2
    #11 0xb7730622 in ?? () from /lib/ld-linux.so.2
    #12 0xb772c117 in ?? () from /lib/ld-linux.so.2
    #13 0xb772fdf4 in ?? () from /lib/ld-linux.so.2
    #14 0xb76edcae in ?? () from /lib/libdl.so.2
    #15 0xb772c117 in ?? () from /lib/ld-linux.so.2
    #16 0xb76ee3b6 in ?? () from /lib/libdl.so.2
    #17 0xb76edd61 in dlopen () from /lib/libdl.so.2
    #18 0x0804e102 in main (argc=1, argv=0xbfadc254) at main.cpp:18
It’s odd to me that this only seems to fail in some versions of GCC, and only for 32-bit. I have not tried GCC 5 yet.
I appreciate thoughts/suggestions.
Answer
> If -static-libstdc++ is not used when building for either the shared lib or the app, it works.
In general, you should either avoid using -static-libstdc++, or hide all of its symbols to avoid such problems.
> So I suspect the problem is related to symbol confusion/duplication between the app & .so.
Correct. In particular, the problem is that some symbols are duplicated, while others are not. We’ve had to disable STB_GNU_UNIQUE symbols for that reason.
> if I have the code load the .so first with (RTLD_LAZY | RTLD_DEEPBIND), and then close it and re-load with only RTLD_LAZY, it also works.
That’s because dlclose doesn’t actually unload the library if you used it. From man dlclose:
> The function dlclose() decrements the reference count on the dynamic library handle handle. If the reference count drops to zero and no other loaded libraries use symbols in it, then the dynamic library is unloaded.
You should be able to verify that this is the case by stopping the program in GDB after dlclose and looking at its /proc/$PID/maps: it’s very likely that you’ll find that libfunctions.so is still present in memory.