I have this short hello world program:
    #include <stdio.h>

    static const char* msg = "Hello world";

    int main(){
        printf("%s\n", msg);
        return 0;
    }
I compiled it into the following assembly code with gcc:
        .file   "hello_world.c"
        .section        .rodata
.LC0:
        .string "Hello world"
        .data
        .align 4
        .type   msg, @object
        .size   msg, 4
msg:
        .long   .LC0
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        andl    $-16, %esp
        subl    $16, %esp
        movl    msg, %eax
        movl    %eax, (%esp)
        call    puts
        movl    $0, %eax
        leave
        .cfi_restore 5
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
        .section        .note.GNU-stack,"",@progbits
My question is: are all parts of this code essential if I were to write this program in assembly (instead of writing it in C and then compiling to assembly)? I understand the assembly instructions but there are certain pieces I don’t understand. For instance, I don’t know what .cfi* is, and I’m wondering if I would need to include this to write this program in assembly.
Answer
The absolute bare minimum that will work on the platform this appears to be (32-bit x86 Linux) is
        .globl  main
main:
        pushl   $.LC0
        call    puts
        addl    $4, %esp
        xorl    %eax, %eax
        ret
.LC0:
        .string "Hello world"
But this breaks a number of ABI requirements. The minimum for an ABI-compliant program is
        .globl  main
        .type   main, @function
main:
        subl    $24, %esp
        pushl   $.LC0
        call    puts
        xorl    %eax, %eax
        addl    $28, %esp
        ret
        .size   main, .-main
        .section        .rodata
.LC0:
        .string "Hello world"
Everything else in your object file is either the compiler not optimizing the code down as tightly as possible, or optional annotations to be written to the object file.
The .cfi_* directives, in particular, are optional annotations. They are necessary if and only if the function might be on the call stack when a C++ exception is thrown, but they are useful in any program from which you might want to extract a stack trace. If you are going to write nontrivial code by hand in assembly language, it will probably be worth learning how to write them. Unfortunately, they are very poorly documented; I am not currently finding anything that I think is worth linking to.
The line
.section .note.GNU-stack,"",@progbits
is also important to know about if you are writing assembly language by hand; it is another optional annotation, but a valuable one, because what it means is “nothing in this object file requires the stack to be executable.” If all the object files in a program have this annotation, the kernel won’t make the stack executable, which improves security a little bit.
(To indicate that you do need the stack to be executable, you put "x" instead of "". GCC may do this if you use its “nested function” extension. (Don’t do that.))
It is probably worth mentioning that in the “AT&T” assembly syntax used (by default) by GCC and GNU binutils, there are three kinds of lines: A line with a single token on it, ending in a colon, is a label. (I don’t remember the rules for what characters can appear in labels.) A line whose first token begins with a dot, and does not end in a colon, is some kind of directive to the assembler. Anything else is an assembly instruction.
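To make the three rules concrete, here is a minimal sketch in C that applies them to one line of assembler source at a time. The names classify_line and line_kind are made up for this illustration and are not part of GCC or binutils. The sketch follows only the simplified rules stated above, so it ignores comments and does not handle a label and an instruction sharing one line.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names, used only for this illustration. */
enum line_kind { LINE_BLANK, LINE_LABEL, LINE_DIRECTIVE, LINE_INSTRUCTION };

/* Classify one line of AT&T-syntax source using only the three
   simplified rules from the answer. Real assembler syntax has more
   cases (comments, "label: instruction" on one line) that this
   sketch deliberately does not handle. */
enum line_kind classify_line(const char *line)
{
    size_t len, i;
    int single_token = 1;

    /* Skip leading whitespace. */
    while (*line != '\0' && isspace((unsigned char)*line))
        line++;
    if (*line == '\0')
        return LINE_BLANK;

    /* Ignore trailing whitespace (including a newline). */
    len = strlen(line);
    while (len > 0 && isspace((unsigned char)line[len - 1]))
        len--;

    /* A single token has no whitespace inside the trimmed line. */
    for (i = 0; i < len; i++) {
        if (isspace((unsigned char)line[i])) {
            single_token = 0;
            break;
        }
    }

    /* Rule 1: a single token ending in a colon is a label. */
    if (single_token && line[len - 1] == ':')
        return LINE_LABEL;

    /* Rule 2: starts with a dot and the line does not end in a colon. */
    if (line[0] == '.' && line[len - 1] != ':')
        return LINE_DIRECTIVE;

    /* Rule 3: anything else is an instruction. */
    return LINE_INSTRUCTION;
}
```

Under these rules, ".LC0:" and "main:" come out as labels, ".globl main" and ".text" as directives, and "pushl $.LC0" and "ret" as instructions, which matches how the listings above read.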