GCC compiled code: why does an integer declaration need several statements?

I’m learning AT&T assembly. I know arrays/variables can be declared using .int/.long, and that .equ declares a symbol that is replaced by the assembler.

They’re declared inside either the .data section (initialized) or the .bss section (uninitialized).

But when I used gcc to compile a very simple .c file with the ‘-S’ command-line option to check the generated assembly, I noticed that: (1) the .s file does not use both .data and .bss, but only .data; (2) the declaration of an integer (.long) takes several statements, some of which seem redundant or useless to me.

The output is shown below; I’ve added comments containing my questions.

$ cat n.c

int i=23; 
int j; 
int main(){ 
   return 0; 
} 

$ gcc -S n.c
$ cat n.s

     .file    "n.c" 
     .globl    i 
     .data 
     .align 4 
     .type    i, @object #declare i, I think it's useless
     .size    i, 4 #There's '.long 23', we know it's 4 bytes, why need this line?
i: 
     .long    23       #Only this line is needed, I think
     .comm    j,4,4    #Why j is not put inside .bss .section?
     .text 
     .globl    main 
     .type    main, @function 
main:
.LFB0:                 #What does this symbol mean, I don't find it useful.
     .cfi_startproc 
     pushq    %rbp 
     .cfi_def_cfa_offset 16 
     .cfi_offset 6, -16 
     movq    %rsp, %rbp 
     .cfi_def_cfa_register 6 
     movl    $0, %eax 
     popq    %rbp 
     .cfi_def_cfa 7, 8 
     ret 
     .cfi_endproc 
.LFE0:                 #What does this symbol mean, I don't find it useful.
     .size    main, .-main 
     .ident    "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609" 
     .section    .note.GNU-stack,"",@progbits 

All my questions are in the comments above. To re-emphasize:

     .type    i, @object 
     .size    i, 4
i: 
     .long    23

I really think the code above is redundant; it should be as simple as:

i: 
     .long    23

Also, “j” doesn’t have a label of its own, and is not put inside the .bss section.

Did I get anything wrong? Please correct me. Thanks a lot.

Answer

(I am guessing you are using some Linux system)

“They’re declared inside either the .data section (initialized) or the .bss section (uninitialized).”

No, there are many other sections, notably .rodata for read-only data. Note also that .comm is a directive rather than a section: it declares a “common” symbol, i.e. uninitialized data that may be tentatively defined in several object files and that the linker merges into one (usually allocated in .bss). The ELF format is flexible enough to permit many sections and many segments (some of which are not loaded, or more precisely not memory-mapped, at run time).

The description of sections in ELF files is much more complex than what you believe. Take time to read more, e.g. Linkers and Loaders by Levine. Read also the documentation and the scripts of GNU binutils, as well as ld(1) and as(1). Use objdump(1) and readelf(1) to explore existing ELF executables, object files and shared objects. Read also execve(2) and elf(5).

“But when I used gcc to compile a very simple .c file with the -S command-line option”

When examining the assembler file generated by gcc, I strongly recommend passing at least -fverbose-asm to ask gcc to emit some additional and useful comments in the assembler file. I also usually recommend using an optimization flag, e.g. at least -O1 (or perhaps -Og on recent versions of gcc).

“I noticed that: (1) the .s file does not use both .data and .bss, but only .data”

No, your generated code uses the .comm directive to declare j as a common symbol; the linker then allocates zero-filled space for it, normally in .bss.
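To make the .data versus common/.bss distinction concrete, here is a small C sketch built on the question’s n.c. The helper check_initial_values is a name introduced only for this illustration; it is not part of the original file. It relies on the C guarantee that a file-scope object without an initializer starts at zero, which holds whether the assembler sees a .comm directive or a .bss entry. As far as I know, GCC releases before 10 default to -fcommon (which produces the .comm line in the question), while GCC 10 and later default to -fno-common and place j in .bss directly.

```c
#include <assert.h>

int i = 23;   /* initialized definition: emitted in .data */
int j;        /* tentative definition: .comm under -fcommon, .bss under -fno-common */

/* Hypothetical helper for this sketch: confirms the start-up values
   that the linker and loader are required to provide. */
void check_initial_values(void)
{
    assert(i == 23);  /* value stored in the file's .data contents */
    assert(j == 0);   /* zero-filled at load time, no bytes stored in the file */
}
```

Compiling this with gcc -S -fcommon and again with gcc -S -fno-common should show j moving between a .comm directive and an explicit .bss entry, while the program’s behaviour stays the same.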

“(2) The declaration of an integer (.long) takes several statements, some of which seem redundant or useless to me.”

These are mostly not assembler statements (instructions translated into machine code) but assembler directives. They are very useful, and they don’t take up space in the memory segments produced by ld: the information they carry is stored elsewhere in the ELF file. In particular, .size and .type are both needed because the symbol tables in ELF files contain more than addresses (each entry also has a size and a very primitive notion of type).
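As a rough illustration of where .size and .type end up, the following C sketch reads the symbol table of a 64-bit ELF file and reports the size and type stored for one symbol. It assumes a Linux system that provides <elf.h>; find_symbol and print_symbol are hypothetical helper names made up for this example, and the code trusts the offsets it reads from the file instead of bounds-checking them, so treat it as a sketch rather than a robust tool.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: looks up `name` in the .symtab of the ELF64 file
   at `path`. Returns 0 and fills *out on success, -1 otherwise.
   Sketch only: offsets read from the file are trusted, not bounds-checked. */
int find_symbol(const char *path, const char *name, Elf64_Sym *out)
{
    assert(path && name && out);

    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    /* Read the whole file into memory. */
    unsigned char *buf = NULL;
    long len = -1;
    if (fseek(f, 0, SEEK_END) == 0 && (len = ftell(f)) > 0) {
        rewind(f);
        buf = malloc((size_t)len);
        if (buf && fread(buf, 1, (size_t)len, f) != (size_t)len) {
            free(buf);
            buf = NULL;
        }
    }
    fclose(f);
    if (!buf)
        return -1;

    int rc = -1;
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)buf;
    if ((size_t)len >= sizeof *eh &&
        memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
        eh->e_ident[EI_CLASS] == ELFCLASS64) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(buf + eh->e_shoff);
        for (unsigned s = 0; s < eh->e_shnum && rc != 0; s++) {
            if (sh[s].sh_type != SHT_SYMTAB)
                continue;
            const Elf64_Sym *syms = (const Elf64_Sym *)(buf + sh[s].sh_offset);
            /* sh_link of a symbol table names its string table section. */
            const char *strs = (const char *)(buf + sh[sh[s].sh_link].sh_offset);
            size_t count = sh[s].sh_size / sizeof(Elf64_Sym);
            for (size_t k = 0; k < count; k++) {
                if (strcmp(strs + syms[k].st_name, name) == 0) {
                    *out = syms[k];
                    rc = 0;
                    break;
                }
            }
        }
    }
    free(buf);
    return rc;
}

/* Prints what .size and .type recorded for one symbol. */
void print_symbol(const char *path, const char *name)
{
    Elf64_Sym sym;
    if (find_symbol(path, name, &sym) != 0) {
        printf("%s: not found in %s\n", name, path);
        return;
    }
    printf("%s: size=%lu type=%s\n", name, (unsigned long)sym.st_size,
           ELF64_ST_TYPE(sym.st_info) == STT_OBJECT ? "OBJECT" :
           ELF64_ST_TYPE(sym.st_info) == STT_FUNC   ? "FUNC"   : "other");
}
```

For example, after gcc -c n.c, calling print_symbol("n.o", "i") should report a 4-byte OBJECT and print_symbol("n.o", "main") a FUNC, which is the same information readelf -s n.o displays. Without the .size and .type directives the assembler would have nothing to put in those fields.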

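The .LFB0 and .LFE0 labels are generated by gcc (actually by cc1) to mark the beginning and end of the function. GCC does not mind generating labels that end up unused (it keeps the assembler generator in the GCC backend simpler), since labels starting with .L are local to the assembler and don’t appear in the object file’s symbol table.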

“There’s ‘.long 23’, we know it’s 4 bytes”

You might know that .long emits 4 bytes, but that information (the size of i) has to go into the ELF symbol table, so it requires an explicit assembler directive.

(I don’t have space or time to explain the ELF format, you need to read many pages about it, and it is a lot more complex and more complete than what you believe)

BTW, Drepper’s How To Write Shared Libraries is quite long (more than 40 pages) and explains a good deal about ELF files, focusing on shared libraries.
