I’ve been working on some buggy code and wanted to install a SIGSEGV handler to get more information about the crash. However, I noticed that my handler is not invoked.
I looked for the cause, and it seems to be related to a corrupt stack pointer value (the signal is definitely not being masked). Here’s some proof-of-concept code I wrote to verify this:
    static void catch_function(int sig, siginfo_t *info, void *cntxt) {
        puts("handler works");
    }

    void main(int argc, char **argv) {
        struct sigaction sa;

        sa.sa_sigaction = (void *)catch_function;
        sigemptyset (&sa.sa_mask);
        sa.sa_flags = SA_SIGINFO | SA_NODEFER ;
        sigaction(SIGSEGV, &sa, NULL);

        puts("testing handler");
        raise(SIGSEGV);
        puts("back");

        __asm__ ( "xor %rax, %rax\n\t"
                  "mov %rax, %rsp\n\t"
                  "push 0" );
        // never reached...
    }
The idea is to set RSP to 0 (invalid offset) and then use it for something. However, this second SIGSEGV will not be caught by the handler but instead terminates the process.
Apparently, invoking the signal handler needs a sane stack pointer to begin with — but why? Isn’t this against the idea of handling signals? Any chance of getting around this?
I’m running Linux version 3.19.0-25-generic.
Answer
Okay, here is a solution to the above problem following EOF’s comment (using sigaltstack() to provide a signal stack on the heap):
    #define _GNU_SOURCE   /* exposes REG_RSP; must come before any #include */
    #include <stdio.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <ucontext.h>

    static long long int sbase;

    static void catch_function(int sig, siginfo_t *info, void *cntxt) {
        puts("handler works");

        /* reset RSP if invalid */
        ucontext_t *uc_context = (ucontext_t *)cntxt;
        if(!uc_context->uc_mcontext.gregs[REG_RSP]) {
            puts("resetting RSP");
            uc_context->uc_mcontext.gregs[REG_RSP] = sbase;
        }
    }

    int main(int argc, char **argv) {
        /* RSP during main */
        sbase = (long long int)&argv;

        stack_t ss;
        struct sigaction sa;

        /* alternate signal stack, allocated on the heap */
        ss.ss_sp = malloc(SIGSTKSZ);
        ss.ss_size = SIGSTKSZ;
        ss.ss_flags = 0;
        sigaltstack(&ss, NULL);

        sa.sa_sigaction = catch_function;
        sigemptyset (&sa.sa_mask);
        sa.sa_flags = SA_SIGINFO | SA_NODEFER | SA_ONSTACK;
        sigaction(SIGSEGV, &sa, NULL);

        puts("testing handler");
        raise(SIGSEGV);
        puts("back");

        /* zero RSP, then use the stack; push and pop keep the stack height unchanged */
        __asm__ ( "xor %rax, %rax\n\t"
                  "mov %rax, %rsp\n\t"
                  "push %rax\n\t"
                  "pop %rax" );

        puts("exiting.");
        return 0;
    }
The alternative signal stack is allocated on the heap and registered using sigaltstack(&ss, NULL). Also, the SA_ONSTACK flag is set in the sigaction struct to enable the alternative stack use for this specific action.
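To check the alternate-stack setup in isolation, without crashing anything, here is a minimal sketch. It assumes Linux with glibc and uses SIGUSR1 instead of SIGSEGV; the names probe_handler, install_alt_stack_probe and handler_uses_alt_stack are hypothetical and not part of the code above. The handler simply tests whether the address of one of its own locals lies inside the heap block that was registered with sigaltstack().

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, chosen for this sketch only. */
static void *alt_stack_base;              /* start of the heap-allocated signal stack */
static size_t alt_stack_size;             /* its size in bytes */
static volatile sig_atomic_t ran_on_alt;  /* 1 if the handler ran on that stack */

static void probe_handler(int sig)
{
    int local;                            /* lives in the handler's own stack frame */
    uintptr_t p = (uintptr_t)&local;
    uintptr_t lo = (uintptr_t)alt_stack_base;

    (void)sig;
    ran_on_alt = (p >= lo && p < lo + alt_stack_size);
}

/* Registers an alternate stack and a SIGUSR1 handler that requests it.
 * Returns 0 on success, -1 if any call fails. */
static int install_alt_stack_probe(void)
{
    stack_t ss;
    struct sigaction sa;

    alt_stack_size = SIGSTKSZ;
    alt_stack_base = malloc(alt_stack_size);
    if (alt_stack_base == NULL)
        return -1;

    ss.ss_sp = alt_stack_base;
    ss.ss_size = alt_stack_size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;             /* without this flag the normal stack is used */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it did not,
 * and -1 if the setup failed. */
static int handler_uses_alt_stack(void)
{
    if (install_alt_stack_probe() == -1)
        return -1;
    ran_on_alt = 0;
    raise(SIGUSR1);                       /* handler runs before raise() returns */
    return ran_on_alt;
}
```

Calling handler_uses_alt_stack() from main() should return 1 when both sigaltstack() and SA_ONSTACK are in place. Dropping SA_ONSTACK from sa_flags should make it return 0, which is the situation in the question.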
This basically resolves my problem: the handler is now invoked, and we see an endless stream of SIGSEGVs being caught. The stream is endless because the original catch_function() from the question does nothing to fix the invalid stack pointer, so the faulting instruction is retried and faults again. As a solution, I now store a valid stack pointer for main() in sbase and use it in the handler to restore RSP if it is invalid (by manipulating the saved thread context).
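The saved thread context can also be inspected without provoking a crash. The following is only a sketch, assuming x86-64 Linux with glibc; record_sp, saved_sp and saved_sp_is_near_caller_frame are hypothetical names, and the 64 KiB bound is an arbitrary assumption. It reads the interrupted stack pointer from the third handler argument during a harmless SIGUSR2 and checks that it lies close to a local variable of the function that raised the signal.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>

/* Hypothetical name: stack pointer found in the saved context. */
static volatile uintptr_t saved_sp;

static void record_sp(int sig, siginfo_t *info, void *cntxt)
{
    (void)sig;
    (void)info;
#if defined(__x86_64__)
    /* The third argument points to the context that will be restored
     * when the handler returns; gregs[REG_RSP] is the interrupted RSP. */
    ucontext_t *uc = (ucontext_t *)cntxt;
    saved_sp = (uintptr_t)uc->uc_mcontext.gregs[REG_RSP];
#else
    (void)cntxt;                          /* other architectures: not covered here */
#endif
}

/* Returns 1 if the saved stack pointer is within 64 KiB of a local in this
 * function, 0 if it is not, and -1 on failure or on an unsupported platform. */
static int saved_sp_is_near_caller_frame(void)
{
    struct sigaction sa;
    int local;
    uintptr_t here, dist;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_sp;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;             /* needed to receive the context argument */
    if (sigaction(SIGUSR2, &sa, NULL) == -1)
        return -1;

    saved_sp = 0;
    raise(SIGUSR2);                       /* handler runs before raise() returns */
    if (saved_sp == 0)
        return -1;

    here = (uintptr_t)&local;
    dist = here > saved_sp ? here - saved_sp : saved_sp - here;
    return dist < 65536;
}
```

Writing to the same gregs[REG_RSP] slot instead of reading it is what the handler in the answer does: the modified value is loaded into RSP when the handler returns.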
To make all of this work, I also fixed my inline assembly to not just push a value but also pop it back afterwards, so the stack height remains unchanged. For reproducibility, I also added the #include lines this time.