I have been running my application successfully on CentOS 6.6. Recently the hardware (motherboard and RAM) was updated, and now the application is killed at startup for no apparent reason.
    [root@localhost PktBlaster]# ./PktBlaster
    Killed
File and ldd output
    [root@localhost PktBlaster]# file PktBlaster
    PktBlaster: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
    [root@localhost PktBlaster]# ldd PktBlaster
    not a dynamic executable
Output of strace
    [root@localhost PktBlaster]# strace ./PktBlaster
    execve("./PktBlaster", ["./PktBlaster"], [/* 30 vars */] <unfinished ...>
    +++ killed by SIGKILL +++
    Killed
GDB
    [root@localhost PktBlaster]# gdb PktBlaster
    (gdb) break main
    Breakpoint 1 at 0x43d664: file VTP.c, line 544.
    (gdb) run
    Starting program: /root/Veryx/PktBlaster/PktBlaster
    During startup program terminated with signal SIGKILL, Killed.
While debugging, I observed that the bss size is huge (~6GB). The system has 4GB of RAM, and I think this could be the reason for the issue.
    [root@localhost PktBlaster_1Gig]# size build/unix/bin/PktBlaster
       text    data        bss        dec       hex filename
     375551   55936 6747541120 6747972607 19235e3ff build/unix/bin/PktBlaster
The application contains many .h files and many data structures, so it is difficult for me to identify why the BSS has grown to 6GB.
Could anyone please suggest how to identify which file is causing this, or any other easier way to debug it?
Answer
It seems that the problem really is the huge BSS size.
In the comments I asked you to show the output of LD_DEBUG=all /lib64/ld-linux-x86-64.so.2 /path/to/exe.
/lib64/ld-linux-x86-64.so.2 is the runtime linker, which the OS uses to load your binary into process memory during the execve system call. The runtime linker is responsible for parsing the executable format, loading all sections and dependencies into memory, performing all required relocations, and so on.
Setting the environment variable LD_DEBUG to all instructs the runtime linker to generate debug output.
    [root@localhost PktBlaster]# LD_DEBUG=all /lib64/ld-linux-x86-64.so.2 /root/Veryx/PktBlaster/PktBlaster
    851: file=/root/Veryx/PktBlaster/PktBlaster [0]; generating link map
    /root/Veryx/PktBlaster/PktBlaster: error while loading shared libraries: /root/Veryx/PktBlaster/PktBlaster: cannot map zero-fill pages: Cannot allocate memory
Searching for this error message in the source code of the runtime linker (glibc-2.17 elf/dl-load.c, lines ~1400), we see:
    1393   if (zeroend > zeropage)
    1394     {
    1395       /* Map the remaining zero pages in from the zero fill FD.  */
    1396       caddr_t mapat;
    1397       mapat = __mmap ((caddr_t) zeropage, zeroend - zeropage,
    1398                       c->prot, MAP_ANON|MAP_PRIVATE|MAP_FIXED,
    1399                       -1, 0);
    1400       if (__builtin_expect (mapat == MAP_FAILED, 0))
    1401         {
    1402           errstring = N_("cannot map zero-fill pages");
    1403           goto call_lose_errno;
    1404         }
The loader is in the process of loading the BSS segment. As an optimization, BSS is stored in the binary only as a byte count, because its contents must simply be initialized to zero. The loader tries to allocate that zero-initialized memory block through mmap (an anonymous mapping, MAP_ANON) and gets an error from the OS:
    #define ENOMEM 12 /* Out of memory */
From man 2 mmap:
ENOMEM No memory is available, or the process’s maximum number of mappings would have been exceeded.
So it seems that, for whatever reason, the OS cannot fulfill the loader's request for memory. Either some limit is in effect (systemd, a process limit, some security LKM, whatever) or there is simply not enough free memory available to the kernel.
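The failing request can be approximated outside the loader. The following is a minimal sketch, not the loader's actual code: try_zero_fill and address_space_limit are hypothetical helper names, and the mapping omits MAP_FIXED so the kernel picks the address. It asks for the same kind of anonymous zero-filled mapping and reports the errno, and it also reads the RLIMIT_AS soft limit, which is one of the process limits that could be in play.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper: request `len` bytes of zero-filled anonymous memory,
 * the same kind of mapping the loader asks for when it sets up .bss.
 * Returns 0 on success, or the errno value set by mmap on failure. */
int try_zero_fill(size_t len)
{
    assert(len > 0);  /* mmap rejects zero-length requests with EINVAL */

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return errno;

    munmap(p, len);   /* only probing; release the mapping again */
    return 0;
}

/* Hypothetical helper: soft limit on the process address space.
 * Returns RLIM_INFINITY when no limit is set or the query itself fails. */
rlim_t address_space_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return RLIM_INFINITY;
    return rl.rlim_cur;
}
```

Calling try_zero_fill with the bss value printed by size (6747541120 here) on the affected machine should show whether a plain process gets the same ENOMEM (12) as the loader; if it does, the limit is not specific to the loader.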
To determine which object file contributes most of the BSS, run the following over the object files and compare the symbol sizes it prints (in hex):
objdump -j '.bss' -t *.o
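Once objdump points at a symbol, the usual culprit is a file-scope array whose dimensions come from macros in the headers. The sketch below shows how such a footprint can be computed, and guarded at compile time, without defining the array itself. All names, dimensions and the budget are hypothetical, and it assumes a C11 compiler for static_assert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* All names and dimensions below are hypothetical examples. */
#define MAX_PORTS        16
#define MAX_STREAMS      1024
#define MAX_PKT_BYTES    9216
#define BSS_BUDGET_BYTES ((size_t)1 << 30)   /* arbitrary 1 GiB ceiling */

struct stream_cfg {
    unsigned char payload[MAX_PKT_BYTES];
    unsigned int  length;
};

/* Only the array type is declared; no object is defined,
 * so this file by itself adds nothing to .bss. */
typedef struct stream_cfg stream_table_t[MAX_PORTS][MAX_STREAMS];

/* Fails the build if a macro change pushes the table past the budget. */
static_assert(sizeof(stream_table_t) <= BSS_BUDGET_BYTES,
              "stream table exceeds the BSS budget");

/* Bytes a file-scope object of this type would occupy in .bss. */
size_t stream_table_bytes(void)
{
    return sizeof(stream_table_t);
}
```

A guard like this next to each large static table makes a jump in BSS show up as a compile error in the file that caused it, instead of as a SIGKILL at load time.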