Here is a very simple assembly program that just exits with code 12 when executed.
    $ cat a.asm
    global _start

    section .text
    _start:
        mov rax, 60    ; system call for exit
        mov rdi, 12    ; exit code 12
        syscall
It can be built and executed correctly:
    $ nasm -f elf64 a.asm && ld a.o && ./a.out || echo $?
    12
But a.out is big, more than 4 KB:
    $ wc -c a.out
    4664 a.out
I tried to understand why by reading the ELF contents:
    $ readelf -l a.out

    Elf file type is EXEC (Executable file)
    Entry point 0x401000
    There are 2 program headers, starting at offset 64

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                     0x00000000000000b0 0x00000000000000b0  R      0x1000
      LOAD           0x0000000000001000 0x0000000000401000 0x0000000000401000
                     0x000000000000000c 0x000000000000000c  R E    0x1000

     Section to Segment mapping:
      Segment Sections...
       00
       01     .text
This is strange: segment 00 is aligned to 0x1000, which I think means the segment will occupy at least 4096 bytes.
My question is: what is this segment 00?
(nasm version 2.14.02, ld version 2.34, OS is Ubuntu 20.04.1)
Answer
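Since it starts at file offset zero, it is probably a “padding” segment introduced to make loading the ELF more efficient: with it in place, the .text segment is already aligned in the file the same way it must be aligned in memory. Its size, 0xb0 = 176 bytes, is exactly the 64-byte ELF header plus the two 56-byte program headers, so what it maps is the file's own headers.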
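One way to check what each LOAD segment covers is to read the program header table directly. The following is a minimal Python sketch, assuming a little-endian ELF64 file; the function name `load_segments` is made up for illustration and is not part of any tool.

```python
import struct

# Field layouts of the ELF64 structures (little-endian, no padding).
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")           # Elf64_Phdr, 56 bytes
PT_LOAD = 1

def load_segments(data):
    """Return (file_offset, file_size, vaddr) for every PT_LOAD entry."""
    fields = EHDR.unpack_from(data, 0)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    # ident[4] == 2 means 64-bit, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    segments = []
    for i in range(phnum):
        (p_type, _flags, offset, vaddr,
         _paddr, filesz, _memsz, _align) = PHDR.unpack_from(data, phoff + i * phentsize)
        if p_type == PT_LOAD:
            segments.append((offset, filesz, vaddr))
    return segments
```

Called on the bytes of the a.out from the question, this should report one segment starting at file offset 0 with size 0xb0 and one at offset 0x1000 with size 0xc, matching the readelf output. Since `EHDR.size + 2 * PHDR.size` is 176 = 0xb0, the first segment ends exactly where the program header table ends.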
You can force ld not to align sections, both in memory and in the file, with -n. You can also strip the symbols with -s.
This will reduce the size to about 352 bytes.
Now the ELF contains:
- The ELF header (Needed)
- The program header table (Needed)
- The code (Needed)
- The string table (Possibly unneeded)
- The section table (Possibly unneeded)
The string table can be removed, but apparently strip can’t do that.
I’ve removed the .shstrtab section data and all the section headers manually to shrink the size down to 144 bytes.
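The manual removal can be approximated with a short script. This is only a sketch of one possible approach, not the exact procedure used above: it assumes the section header table and .shstrtab sit after everything the program headers refer to (the usual ld layout), and `drop_section_headers` is a hypothetical helper name.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr
PHDR = struct.Struct("<IIQQQQQQ")           # Elf64_Phdr

def drop_section_headers(data):
    """Cut the file after the last byte that any program header refers to."""
    hdr = list(EHDR.unpack_from(data, 0))
    phoff, phentsize, phnum = hdr[5], hdr[9], hdr[10]
    # Always keep the ELF header and the program header table itself.
    end = max(EHDR.size, phoff + phnum * phentsize)
    for i in range(phnum):
        p = PHDR.unpack_from(data, phoff + i * phentsize)
        end = max(end, p[2] + p[5])          # p_offset + p_filesz
    hdr[6] = 0     # e_shoff: no section header table
    hdr[12] = 0    # e_shnum
    hdr[13] = 0    # e_shstrndx
    return EHDR.pack(*hdr) + data[EHDR.size:end]
```

Because it cuts right after the last segment byte, the result may come out a few bytes shorter than a hand-edited file that keeps trailing padding. The kernel only uses the program headers to load the file, so the output should still run, but tools that rely on section headers will no longer show any sections.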
Consider that 64 bytes come from the ELF header, 56 from the single program header and 12 from your code, for a total of 132 bytes.
The extra 12 bytes are padding: 4 bytes at the end of the code section (easy to remove), and the rest at the end of the program header (which requires a bit of patching).
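The size budget without any padding can be checked by packing the two header structures and appending the code. The sketch below is an illustration under assumptions, not the file produced above: the load address 0x400000, the single read+execute LOAD segment covering the whole file, and the name `build_minimal_elf` are all choices made here.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr
PHDR = struct.Struct("<IIQQQQQQ")           # Elf64_Phdr

BASE = 0x400000                             # assumed load address
# mov eax, 60 ; mov edi, 12 ; syscall (12 bytes)
CODE = bytes.fromhex("b83c000000bf0c0000000f05")

def build_minimal_elf(code=CODE):
    """Return ELF header + one program header + code, with no padding."""
    code_off = EHDR.size + PHDR.size        # code starts right after the headers
    total = code_off + len(code)
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)   # ELF64, little-endian
    ehdr = EHDR.pack(ident, 2, 62, 1,       # ET_EXEC, EM_X86_64, EV_CURRENT
                     BASE + code_off,       # e_entry
                     EHDR.size, 0, 0,       # e_phoff, e_shoff, e_flags
                     EHDR.size, PHDR.size, 1,   # e_ehsize, e_phentsize, e_phnum
                     0, 0, 0)               # no section headers
    phdr = PHDR.pack(1, 5,                  # PT_LOAD, flags R+X
                     0, BASE, BASE,         # p_offset, p_vaddr, p_paddr
                     total, total, 0x1000)  # p_filesz, p_memsz, p_align
    return ehdr + phdr + code
```

`len(build_minimal_elf())` is 64 + 56 + 12 = 132. Writing the bytes to a file and marking it executable should give a program that behaves like the original on x86-64 Linux, though that depends on the kernel accepting this layout.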