The following is my C file:
int main() { return 36; }
It contains only a return statement. But when I run the size command, it shows output like this:
mohanraj@ltsp63:~/Development/chap8$ size a.out
   text    data     bss     dec     hex filename
   1056     252       8    1316     524 a.out
mohanraj@ltsp63:~/Development/chap8$
My program does not contain any global variables or uninitialized data, yet the output shows 252 bytes in the data segment and 8 bytes in bss. Why is the output like this? What do the 252 and the 8 refer to?
Answer
Size Command
First, see the definition of each column:
- text – The actual machine instructions that your CPU is going to execute. Linux can share this segment between processes running the same program.
- data – All initialized global and static variables declared in a program (e.g., float salary=123.45; at file scope).
- bss – Uninitialized global and static data, such as arrays you have not assigned any values to or pointers left null. The system zero-fills it when the program is loaded.
As Blue Moon said, on Linux execution starts by calling the _start() function, which does the environment setup. Every C program is linked with hidden startup code and "libraries" that depend on the compiler you are using. They set up global parameters and exit handling, and only after that configuration is complete do they finally call your main() function. AFAIK there's no way to see in your source file how your code looks once it is wrapped in this configuration and _start(). But I can show you that even your code contains more information than you thought, the closer we get to the hardware.
Hint:
Type readelf -a a.out
to see how much information your executable is really carrying.
What is inside?
Do not compare the code in your source file to the size of the executable file; the size depends on the OS, the compiler, and the libraries used.
In my example, with exactly the same code, size returns:
eryk@eryk-pc:~$ gcc a.c
eryk@eryk-pc:~$ size a.out
   text    data     bss     dec     hex filename
   1033     276       4    1313     521 a.out
Let’s see what is inside…
eryk@eryk-pc:~$ gcc -S a.c
This will run the preprocessor over a.c, perform the initial compilation and then stop before the assembler is run.
eryk@eryk-pc:~$ cat a.s
    .file   "a.c"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    movl    $36, %eax
    popl    %ebp
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2"
    .section    .note.GNU-stack,"",@progbits
Then look at the disassembly of the whole executable:
eryk@eryk-pc:~$ objdump -d -M intel -S a.out

a.out:     file format elf32-i386

Disassembly of section .init:

08048294 <_init>:
 8048294:  53                    push ebx
 8048295:  83 ec 08              sub esp,0x8
 8048298:  e8 83 00 00 00        call 8048320 <__x86.get_pc_thunk.bx>
 804829d:  81 c3 63 1d 00 00     add ebx,0x1d63
 80482a3:  8b 83 fc ff ff ff     mov eax,DWORD PTR [ebx-0x4]
 80482a9:  85 c0                 test eax,eax
 80482ab:  74 05                 je 80482b2 <_init+0x1e>
 80482ad:  e8 1e 00 00 00        call 80482d0 <__gmon_start__@plt>
 80482b2:  83 c4 08              add esp,0x8
 80482b5:  5b                    pop ebx
 80482b6:  c3                    ret

Disassembly of section .plt:

080482c0 <__gmon_start__@plt-0x10>:
 80482c0:  ff 35 04 a0 04 08     push DWORD PTR ds:0x804a004
 80482c6:  ff 25 08 a0 04 08     jmp DWORD PTR ds:0x804a008
 80482cc:  00 00                 add BYTE PTR [eax],al
        ...

080482d0 <__gmon_start__@plt>:
 80482d0:  ff 25 0c a0 04 08     jmp DWORD PTR ds:0x804a00c
 80482d6:  68 00 00 00 00        push 0x0
 80482db:  e9 e0 ff ff ff        jmp 80482c0 <_init+0x2c>

080482e0 <__libc_start_main@plt>:
 80482e0:  ff 25 10 a0 04 08     jmp DWORD PTR ds:0x804a010
 80482e6:  68 08 00 00 00        push 0x8
 80482eb:  e9 d0 ff ff ff        jmp 80482c0 <_init+0x2c>

Disassembly of section .text:

080482f0 <_start>:
 80482f0:  31 ed                 xor ebp,ebp
 80482f2:  5e                    pop esi
 80482f3:  89 e1                 mov ecx,esp
 80482f5:  83 e4 f0              and esp,0xfffffff0
 80482f8:  50                    push eax
 80482f9:  54                    push esp
 80482fa:  52                    push edx
 80482fb:  68 70 84 04 08        push 0x8048470
 8048300:  68 00 84 04 08        push 0x8048400
 8048305:  51                    push ecx
 8048306:  56                    push esi
 8048307:  68 ed 83 04 08        push 0x80483ed
 804830c:  e8 cf ff ff ff        call 80482e0 <__libc_start_main@plt>
 8048311:  f4                    hlt
 8048312:  66 90                 xchg ax,ax
 8048314:  66 90                 xchg ax,ax
 8048316:  66 90                 xchg ax,ax
 8048318:  66 90                 xchg ax,ax
 804831a:  66 90                 xchg ax,ax
 804831c:  66 90                 xchg ax,ax
 804831e:  66 90                 xchg ax,ax

08048320 <__x86.get_pc_thunk.bx>:
 8048320:  8b 1c 24              mov ebx,DWORD PTR [esp]
 8048323:  c3                    ret
 8048324:  66 90                 xchg ax,ax
 8048326:  66 90                 xchg ax,ax
 8048328:  66 90                 xchg ax,ax
 804832a:  66 90                 xchg ax,ax
 804832c:  66 90                 xchg ax,ax
 804832e:  66 90                 xchg ax,ax

08048330 <deregister_tm_clones>:
 8048330:  b8 1f a0 04 08        mov eax,0x804a01f
 8048335:  2d 1c a0 04 08        sub eax,0x804a01c
 804833a:  83 f8 06              cmp eax,0x6
 804833d:  77 01                 ja 8048340 <deregister_tm_clones+0x10>
 804833f:  c3                    ret
 8048340:  b8 00 00 00 00        mov eax,0x0
 8048345:  85 c0                 test eax,eax
 8048347:  74 f6                 je 804833f <deregister_tm_clones+0xf>
 8048349:  55                    push ebp
 804834a:  89 e5                 mov ebp,esp
 804834c:  83 ec 18              sub esp,0x18
 804834f:  c7 04 24 1c a0 04 08  mov DWORD PTR [esp],0x804a01c
 8048356:  ff d0                 call eax
 8048358:  c9                    leave
 8048359:  c3                    ret
 804835a:  8d b6 00 00 00 00     lea esi,[esi+0x0]

08048360 <register_tm_clones>:
 8048360:  b8 1c a0 04 08        mov eax,0x804a01c
 8048365:  2d 1c a0 04 08        sub eax,0x804a01c
 804836a:  c1 f8 02              sar eax,0x2
 804836d:  89 c2                 mov edx,eax
 804836f:  c1 ea 1f              shr edx,0x1f
 8048372:  01 d0                 add eax,edx
 8048374:  d1 f8                 sar eax,1
 8048376:  75 01                 jne 8048379 <register_tm_clones+0x19>
 8048378:  c3                    ret
 8048379:  ba 00 00 00 00        mov edx,0x0
 804837e:  85 d2                 test edx,edx
 8048380:  74 f6                 je 8048378 <register_tm_clones+0x18>
 8048382:  55                    push ebp
 8048383:  89 e5                 mov ebp,esp
 8048385:  83 ec 18              sub esp,0x18
 8048388:  89 44 24 04           mov DWORD PTR [esp+0x4],eax
 804838c:  c7 04 24 1c a0 04 08  mov DWORD PTR [esp],0x804a01c
 8048393:  ff d2                 call edx
 8048395:  c9                    leave
 8048396:  c3                    ret
 8048397:  89 f6                 mov esi,esi
 8048399:  8d bc 27 00 00 00 00  lea edi,[edi+eiz*1+0x0]

080483a0 <__do_global_dtors_aux>:
 80483a0:  80 3d 1c a0 04 08 00  cmp BYTE PTR ds:0x804a01c,0x0
 80483a7:  75 13                 jne 80483bc <__do_global_dtors_aux+0x1c>
 80483a9:  55                    push ebp
 80483aa:  89 e5                 mov ebp,esp
 80483ac:  83 ec 08              sub esp,0x8
 80483af:  e8 7c ff ff ff        call 8048330 <deregister_tm_clones>
 80483b4:  c6 05 1c a0 04 08 01  mov BYTE PTR ds:0x804a01c,0x1
 80483bb:  c9                    leave
 80483bc:  f3 c3                 repz ret
 80483be:  66 90                 xchg ax,ax

080483c0 <frame_dummy>:
 80483c0:  a1 10 9f 04 08        mov eax,ds:0x8049f10
 80483c5:  85 c0                 test eax,eax
 80483c7:  74 1f                 je 80483e8 <frame_dummy+0x28>
 80483c9:  b8 00 00 00 00        mov eax,0x0
 80483ce:  85 c0                 test eax,eax
 80483d0:  74 16                 je 80483e8 <frame_dummy+0x28>
 80483d2:  55                    push ebp
 80483d3:  89 e5                 mov ebp,esp
 80483d5:  83 ec 18              sub esp,0x18
 80483d8:  c7 04 24 10 9f 04 08  mov DWORD PTR [esp],0x8049f10
 80483df:  ff d0                 call eax
 80483e1:  c9                    leave
 80483e2:  e9 79 ff ff ff        jmp 8048360 <register_tm_clones>
 80483e7:  90                    nop
 80483e8:  e9 73 ff ff ff        jmp 8048360 <register_tm_clones>

080483ed <main>:
 80483ed:  55                    push ebp
 80483ee:  89 e5                 mov ebp,esp
 80483f0:  b8 24 00 00 00        mov eax,0x24
 80483f5:  5d                    pop ebp
 80483f6:  c3                    ret
 80483f7:  66 90                 xchg ax,ax
 80483f9:  66 90                 xchg ax,ax
 80483fb:  66 90                 xchg ax,ax
 80483fd:  66 90                 xchg ax,ax
 80483ff:  90                    nop

08048400 <__libc_csu_init>:
 8048400:  55                    push ebp
 8048401:  57                    push edi
 8048402:  31 ff                 xor edi,edi
 8048404:  56                    push esi
 8048405:  53                    push ebx
 8048406:  e8 15 ff ff ff        call 8048320 <__x86.get_pc_thunk.bx>
 804840b:  81 c3 f5 1b 00 00     add ebx,0x1bf5
 8048411:  83 ec 1c              sub esp,0x1c
 8048414:  8b 6c 24 30           mov ebp,DWORD PTR [esp+0x30]
 8048418:  8d b3 0c ff ff ff     lea esi,[ebx-0xf4]
 804841e:  e8 71 fe ff ff        call 8048294 <_init>
 8048423:  8d 83 08 ff ff ff     lea eax,[ebx-0xf8]
 8048429:  29 c6                 sub esi,eax
 804842b:  c1 fe 02              sar esi,0x2
 804842e:  85 f6                 test esi,esi
 8048430:  74 27                 je 8048459 <__libc_csu_init+0x59>
 8048432:  8d b6 00 00 00 00     lea esi,[esi+0x0]
 8048438:  8b 44 24 38           mov eax,DWORD PTR [esp+0x38]
 804843c:  89 2c 24              mov DWORD PTR [esp],ebp
 804843f:  89 44 24 08           mov DWORD PTR [esp+0x8],eax
 8048443:  8b 44 24 34           mov eax,DWORD PTR [esp+0x34]
 8048447:  89 44 24 04           mov DWORD PTR [esp+0x4],eax
 804844b:  ff 94 bb 08 ff ff ff  call DWORD PTR [ebx+edi*4-0xf8]
 8048452:  83 c7 01              add edi,0x1
 8048455:  39 f7                 cmp edi,esi
 8048457:  75 df                 jne 8048438 <__libc_csu_init+0x38>
 8048459:  83 c4 1c              add esp,0x1c
 804845c:  5b                    pop ebx
 804845d:  5e                    pop esi
 804845e:  5f                    pop edi
 804845f:  5d                    pop ebp
 8048460:  c3                    ret
 8048461:  eb 0d                 jmp 8048470 <__libc_csu_fini>
 8048463:  90                    nop
 8048464:  90                    nop
 8048465:  90                    nop
 8048466:  90                    nop
 8048467:  90                    nop
 8048468:  90                    nop
 8048469:  90                    nop
 804846a:  90                    nop
 804846b:  90                    nop
 804846c:  90                    nop
 804846d:  90                    nop
 804846e:  90                    nop
 804846f:  90                    nop

08048470 <__libc_csu_fini>:
 8048470:  f3 c3                 repz ret

Disassembly of section .fini:

08048474 <_fini>:
 8048474:  53                    push ebx
 8048475:  83 ec 08              sub esp,0x8
 8048478:  e8 a3 fe ff ff        call 8048320 <__x86.get_pc_thunk.bx>
 804847d:  81 c3 83 1b 00 00     add ebx,0x1b83
 8048483:  83 c4 08              add esp,0x8
 8048486:  5b                    pop ebx
 8048487:  c3                    ret
The next step would be converting the code above to binary (0s and 1s); the hex bytes shown next to each address are that encoding.
As you can see, even a simple C program involves complicated operations the closer your code gets to the hardware. I hope I have explained why the executable file is bigger than you thought. If you have any doubts, feel free to comment on my post. I will edit my answer immediately.