I have a shared library that occasionally raises SIGSEGV by design. I can find out whether a SIGSEGV is caused by me, and if it is, handle it. However, I ran into some problems when implementing the other branch (i.e. when it isn't my SIGSEGV). My primary problem is what to do if the previous handler was set to SIG_DFL. This is my current code, which I want to make generic (it currently supports only a few signals and relies on the default behaviors of Linux, not just any POSIX system):
    void call_next_sighandler(struct sigaction* act, int signo, siginfo_t* info, void* context)
    {
        if (act->sa_flags & SA_SIGINFO) {
            if (act->sa_sigaction) {
                act->sa_sigaction(signo, info, context);
            }
        } else {
            if (act->sa_handler == SIG_IGN) {
                return;
            } else if (act->sa_handler == SIG_DFL) {
                // we only support a few signals, all of which just dump core:
                //   SIGFPE   P1990  Core  Floating-point exception
                //   SIGSEGV  P1990  Core  Invalid memory reference
                //   SIGTRAP  P2001  Core  Trace/breakpoint trap
                //
                // Therefore we just unregister ourselves and let the process crash
                sigaction(signo, act, nullptr);
                return;
            } else {
                act->sa_handler(signo);
            }
        }
    }

    struct sigaction old_sigsegv;

    void handle_sigsegv(int signo, siginfo_t* info, void* context)
    {
        if (is_my_sigsegv(context))
            handle_my_sigsegv(context);
        else
            call_next_sighandler(&old_sigsegv, signo, info, context);
    }
Another problem I ran into is how I store the old signal handler in my own module. What happens if another module is loaded after me and also decides to handle signals? It will simply store my signal handler and chain to that. However, that means that once I'm unloaded, its signal handler will call into unmapped memory. Alternatively, if I restore the handler that I received as the old one when I unload, I remove the new module's handler as well. The only solution I could come up with is allocating out-of-module executable memory that doesn't go away when I'm unloaded, but is there really no better way?
Answer
After reviewing standards and others’ implementations, I decided to do a self-answer.
Most suitable solution for libraries
Simply don’t deal with the signal registration mess. Just expose a “signal handler” that must be called by the user of the library, and returns whether a signal was handled or not. Signal handlers are process-global, so they can be considered as a resource of the main executable. Libraries shouldn’t deal with others’ resources on their own. While this might cause some headaches to whoever is using your library, it is ultimately the most flexible solution.
I ended up with a rather simple function prototype:
LIB_EXPORT int lib_handle_signal(int signo, siginfo_t* info, void* context);
And documented that the user must call it on several signals.
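As a rough sketch of what the caller's side could look like: the application owns the handler and offers each signal to the library first. The body of lib_handle_signal below is only a stub so the example is self-contained, and the names app_signal_handler and install_app_handler are made up for illustration.

```cpp
#include <signal.h>
#include <string.h>

// Stub standing in for the library's exported function (assumption: the real
// one comes from the library). Returns nonzero if it handled the signal.
extern "C" int lib_handle_signal(int signo, siginfo_t* info, void* context) {
    (void)info;
    (void)context;
    return signo == SIGUSR1;  // pretend only SIGUSR1 belongs to the library
}

static volatile sig_atomic_t g_handled_by_lib = 0;
static volatile sig_atomic_t g_handled_by_app = 0;

// The application's handler: give the library the first chance, then fall
// back to whatever the application itself wants to do with the signal.
static void app_signal_handler(int signo, siginfo_t* info, void* context) {
    if (lib_handle_signal(signo, info, context)) {
        g_handled_by_lib = g_handled_by_lib + 1;
        return;
    }
    g_handled_by_app = g_handled_by_app + 1;  // application-specific handling goes here
}

// Install the application's handler for one signal. Returns 0 on success.
static int install_app_handler(int signo) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = app_signal_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, nullptr);
}
```

The application would call install_app_handler once for every signal the library's documentation lists. Registration and chaining stay entirely under the executable's control.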
Actual answers to the two concerns
Since my library’s primary user is a C# executable (in which you can’t write signal handlers due to the restriction to signal-safe functions) I still had to deal with the issue, except in a separate library that is rather considered to be “part of” the main executable.
Default action
The default actions for POSIX signals are actually specified in POSIX. For signals whose default action is normal or abnormal termination, simply unregistering ourselves and letting the process crash is an appropriate solution, while signals that are ignored by default can simply be ignored.
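A minimal sketch of how this could be made generic, assuming the default actions listed for `<signal.h>` in POSIX. The names DefaultAction, default_action_of and run_default_action are mine, and treating every signal not listed as terminating is an assumption. I also added stop signals, which the text above doesn't cover. Note that this unregisters our handler for good, so after a stop and continue it is no longer installed.

```cpp
#include <signal.h>
#include <string.h>

enum class DefaultAction { Terminate, Ignore, Stop, Continue };

// Default actions per the POSIX <signal.h> table. Anything not named here is
// treated as terminating (an assumption for platform-specific signals).
static DefaultAction default_action_of(int signo) {
    switch (signo) {
        case SIGCHLD:
        case SIGURG:
            return DefaultAction::Ignore;
        case SIGSTOP:
        case SIGTSTP:
        case SIGTTIN:
        case SIGTTOU:
            return DefaultAction::Stop;
        case SIGCONT:
            return DefaultAction::Continue;
        default:
            return DefaultAction::Terminate;
    }
}

// Emulate SIG_DFL from inside our own handler.
static void run_default_action(int signo, const siginfo_t* info) {
    DefaultAction action = default_action_of(signo);
    if (action == DefaultAction::Ignore || action == DefaultAction::Continue)
        return;  // nothing to do; a stopped process is resumed when SIGCONT is generated

    // Unregister ourselves by restoring the default disposition.
    struct sigaction dfl;
    memset(&dfl, 0, sizeof(dfl));
    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);
    sigaction(signo, &dfl, nullptr);

    // A genuine hardware fault re-triggers when the handler returns, because
    // the faulting instruction runs again. Everything else has to be re-raised;
    // if the signal is blocked while we run, it is delivered once we return.
    bool hardware_fault = info != nullptr && info->si_code > 0 &&
        (signo == SIGSEGV || signo == SIGBUS || signo == SIGFPE ||
         signo == SIGILL || signo == SIGTRAP);
    if (!hardware_fault)
        raise(signo);
}
```

The si_code check is a heuristic: POSIX says a value less than or equal to 0 means the signal was generated by a process, so a SIGSEGV sent with kill() is re-raised rather than expected to fault again.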
Chaining and unloading
The simplest way to solve this issue is simply never unloading. While I haven’t found a truly POSIX solution to this, there is a simple one that works on most Unices:
    #include <assert.h>
    #include <dlfcn.h>   // on glibc, dladdr() is only declared with _GNU_SOURCE (g++ defines it by default)
    #include <string.h>

    static void make_permanently_loaded()
    {
        static char a_variable_in_the_module;

        // this is not POSIX but most BSDs, Linux and Mac have it
        Dl_info dl_info;
        memset(&dl_info, 0, sizeof(dl_info));
        int res = dladdr(&a_variable_in_the_module, &dl_info);
        assert(res && dl_info.dli_fname);

        // Leak a reference to ourselves
        void* me = dlopen(dl_info.dli_fname, RTLD_NOW | RTLD_NODELETE);
        assert(me);
    }
Other implementations
While I haven’t really found similar problems here on SO, there are a few implementations that encountered the same problems as I did, and tried their best at handling them.
libsigsegv
libsigsegv simply discards the previous handlers and doesn’t even attempt chaining to whatever was registered before:
sigaction (sig, &action, (struct sigaction *) NULL);
It also does not handle unloading: your process will abort if you unload it and then cause a SIGSEGV, even if you had a SIGSEGV handler registered before loading it.
It handles unhandled signals much like my code in the question does, by unregistering itself and letting the signal happen again, which results in normal or abnormal termination.
OpenJDK / Hotspot
Java ships libjsig, which hooks signal and sigaction. When the JRE installs signal handlers, libjsig backs up the old ones. When someone else later installs a handler for a signal that the JRE has already claimed, libjsig simply saves the new one (and returns the previously saved one). The JVM is expected to implement the actual chaining; the old handlers are only queried from libjsig. This approach has the advantage of being stackable: multiple different versions of libjsig may be loaded and they will work. Unfortunately, a single copy of the library can only be used by a single copy of a JRE (or similar), so as a library implementer you can't use it unless you are sure that no one will attempt to load a JRE into the same process. However, you can "fork" it and simply make a renamed copy of it for your purposes, making it safe to load next to a JRE in the same process.
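To make the bookkeeping concrete, here is a simplified model of that scheme. It is not libjsig's code: there is no symbol interposition, the table size is an assumption, and hooked_sigaction, runtime_install, g_saved and g_owned are hypothetical names.

```cpp
#include <signal.h>
#include <string.h>

static const int kMaxSignals = 128;             // assumption: enough for common platforms
static struct sigaction g_saved[kMaxSignals];   // handlers the runtime should chain to
static bool g_owned[kMaxSignals];               // signals the runtime has claimed
static bool g_runtime_installing = false;       // set while the runtime registers handlers

// Stand-in for the hooked sigaction().
static int hooked_sigaction(int signo, const struct sigaction* act, struct sigaction* oact) {
    if (signo <= 0 || signo >= kMaxSignals)
        return sigaction(signo, act, oact);

    if (g_runtime_installing) {
        // The runtime installs for real; back up whatever was there before.
        struct sigaction old;
        int res = sigaction(signo, act, &old);
        if (res == 0) {
            if (act) {
                g_saved[signo] = old;
                g_owned[signo] = true;
            }
            if (oact) *oact = old;
        }
        return res;
    }

    if (g_owned[signo]) {
        // Someone else: leave the real disposition alone, only swap the saved entry.
        struct sigaction previous = g_saved[signo];
        if (act) g_saved[signo] = *act;
        if (oact) *oact = previous;
        return 0;
    }

    return sigaction(signo, act, oact);  // unclaimed signal: pass through
}

// How the runtime would claim a signal in this model.
static int runtime_install(int signo, const struct sigaction* act) {
    g_runtime_installing = true;
    int res = hooked_sigaction(signo, act, nullptr);
    g_runtime_installing = false;
    return res;
}

// Placeholder handlers for trying the model out.
static void runtime_handler(int) {}
static void app_handler(int) {}
```

After runtime_install(SIGUSR1, ...), a later hooked_sigaction(SIGUSR1, ...) from other code leaves runtime_handler installed and only updates g_saved[SIGUSR1], which is what the runtime would chain to.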
The Hotspot JVM implementation contains signal handling and actually calling (chaining) the handlers saved by libjsig. Unfortunately the default action handling branch is not implemented as it instead decides to throw all unexpected signals as an UnexpectedException. However, the mask handling code is very useful for anyone else implementing chaining.
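For reference, this is roughly what mask handling involves when you call a saved handler yourself. It is a sketch written from the sigaction semantics, not a copy of Hotspot's code. call_chained_handler and probe_handler are hypothetical names, and sigprocmask is used for brevity, so a multi-threaded program should use pthread_sigmask instead.

```cpp
#include <signal.h>

// Call a previously saved handler roughly the way the kernel would have:
// block sa_mask (plus the signal itself unless SA_NODEFER is set) for the
// duration of the call, honor SA_RESETHAND, then restore the old mask.
static void call_chained_handler(struct sigaction* act, int signo,
                                 siginfo_t* info, void* context) {
    bool siginfo_style = (act->sa_flags & SA_SIGINFO) != 0;
    if (!siginfo_style && (act->sa_handler == SIG_DFL || act->sa_handler == SIG_IGN))
        return;  // default/ignore need separate treatment, not a function call

    // Copy the pointers first, because SA_RESETHAND clears the saved entry.
    void (*handler)(int) = siginfo_style ? nullptr : act->sa_handler;
    void (*action)(int, siginfo_t*, void*) = siginfo_style ? act->sa_sigaction : nullptr;

    sigset_t to_block = act->sa_mask;
    if (!(act->sa_flags & SA_NODEFER))
        sigaddset(&to_block, signo);

    if (act->sa_flags & SA_RESETHAND) {
        act->sa_flags &= ~SA_SIGINFO;
        act->sa_handler = SIG_DFL;  // one-shot handler: next time use the default
    }

    sigset_t saved;
    sigprocmask(SIG_BLOCK, &to_block, &saved);
    if (siginfo_style) {
        if (action) action(signo, info, context);
    } else {
        handler(signo);
    }
    sigprocmask(SIG_SETMASK, &saved, nullptr);
}

// Demo handler: records which signals are blocked while it runs.
static volatile sig_atomic_t g_usr2_blocked_inside = -1;
static volatile sig_atomic_t g_self_blocked_inside = -1;

static void probe_handler(int signo) {
    sigset_t current;
    sigprocmask(SIG_BLOCK, nullptr, &current);  // query only, no change
    g_usr2_blocked_inside = sigismember(&current, SIGUSR2);
    g_self_blocked_inside = sigismember(&current, signo);
}
```

If the saved action has SIGUSR2 in its sa_mask and is chained to for SIGUSR1, probe_handler sees both signals blocked, and the original mask is back in place once call_chained_handler returns.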
The unloading problem is not solved by libjsig – it is expected that the library will never be unloaded. You can add the anti-unloading code from the earlier part of the answer to make sure this is the case.
CLR
I did not review this in depth because it has the most complicated handling.
The CLR implements SEH exceptions (the Windows exception handling model) on top of POSIX signals, and a single level of chaining similar to the JRE. It might be possible to register your own SEH unwinding and exception handling information for your ranges of code, so if you don’t mind pulling in a CLR dependency, this might be worth looking into.
Structured Exception Handling is the standard exception handling method on Windows, which specifies an unwinding information format. When a hardware exception is received, the stack is unwound based on the provided information, language specific handlers associated to the code ranges of every return address are invoked, and they may decide to handle an exception or not. This means exceptions (signals) are “resources” belonging to whatever code causes them (as long as a lower frame doesn’t accidentally catch it due to a badly written filter function), unlike the *nix way where they’re process-global. In my personal opinion this is a much more sensible approach.