
How to get the first address of the initialized data segment

My program runs on Linux and is built with gcc. From the manual page I found edata, which represents the first address past the end of the initialized data segment.
But I want to know the first address of the initialized data segment.
How can I get it?

I tried treating etext as the first address of the initialized data segment, but I got a segmentation fault when I incremented the address and accessed the variable stored there. I suspect that some of the address space between etext and edata is not mapped into virtual memory. Is that right?

Answer

That depends on your linker script. On some platforms, for example, the symbol __bss_start marks the beginning of BSS. It is a symbol with no data associated with it; you get a pointer to it by declaring an extern variable with that name, purely so that you can take its address. For example:

#include <stdio.h>

extern char __bss_start;

int main()
{
    printf("%p\n", (void *)&__bss_start);

    return 0;
}

You can find this by looking in the linker script, for example in /usr/lib/ldscripts/elf_x86_64.x:

.data           :
{
  *(.data .data.* .gnu.linkonce.d.*)
  SORT(CONSTRUCTORS)
}
.data1          : { *(.data1) }
_edata = .; PROVIDE (edata = .);
__bss_start = .;  /*  <<<<< this is what you're looking for */
.bss            :
{
 *(.dynbss)
 *(.bss .bss.* .gnu.linkonce.b.*)
 *(COMMON)
 /* Align here to ensure that the .bss section occupies space up to
    _end.  Align after .bss to ensure correct alignment even if the
    .bss section disappears because there are no input sections.
    FIXME: Why do we need it? When there is no .bss section, we don't
    pad the .data section.  */
 . = ALIGN(. != 0 ? 64 / 8 : 1);
}

You can also see the edata you mentioned there. Because edata is not a name reserved for the implementation (PROVIDE defines the symbol only if the program references it without defining it itself), you should probably use _edata instead.
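
As a rough sketch of how these boundary symbols can be read from C, assuming a Linux toolchain whose linker script defines _etext, _edata, __bss_start and _end as in the excerpt above (bss_span and print_boundaries are hypothetical helper names, not part of any library):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Linker-defined symbols: they have an address but no storage,
   so only ever take their address, never read their value. */
extern char _etext;      /* first address past the text section   */
extern char _edata;      /* first address past initialized data   */
extern char __bss_start; /* start of BSS                          */
extern char _end;        /* first address past BSS                */

/* Number of bytes between __bss_start and _end (includes padding). */
size_t bss_span(void)
{
    uintptr_t start = (uintptr_t)&__bss_start;
    uintptr_t end = (uintptr_t)&_end;

    assert(start <= end);
    return (size_t)(end - start);
}

/* Print each boundary so the layout can be inspected by eye. */
void print_boundaries(void)
{
    printf("_etext      = %p\n", (void *)&_etext);
    printf("_edata      = %p\n", (void *)&_edata);
    printf("__bss_start = %p\n", (void *)&__bss_start);
    printf("_end        = %p\n", (void *)&_end);
}
```

Calling print_boundaries() from main prints the four addresses; the exact values depend on the linker script and on whether the executable is position independent.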

If you want the address of the start of the data section, you could modify the linker script:

__data_start = . ;  /* added: marks the start of the data section */
.data           :
{
  *(.data .data.* .gnu.linkonce.d.*)
  SORT(CONSTRUCTORS)
}
.data1          : { *(.data1) }
_edata = .; PROVIDE (edata = .);
__bss_start = .;
.bss            :
{
 *(.dynbss)
 *(.bss .bss.* .gnu.linkonce.b.*)
 *(COMMON)
 /* Align here to ensure that the .bss section occupies space up to
    _end.  Align after .bss to ensure correct alignment even if the
    .bss section disappears because there are no input sections.
    FIXME: Why do we need it? When there is no .bss section, we don't
    pad the .data section.  */
 . = ALIGN(. != 0 ? 64 / 8 : 1);
}

You probably want to make a copy of the linker script (look for the right one in /usr/lib/ldscripts; the scripts differ depending on what kind of output you are targeting) and supply it when you compile:

gcc -o execfile source.c -Wl,-T ldscript
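
Once the program is linked with such a script, the new symbol can be used like the other boundary symbols. The following is only a sketch: it assumes __data_start is defined as in the modified script above, and in_initialized_data and sample_initialized are hypothetical names chosen for illustration. On glibc systems the startup object also happens to define a symbol called __data_start inside .data, so the declaration may link even without the script change, but that is an implementation detail rather than something to rely on.

```c
#include <assert.h>
#include <stdint.h>

extern char __data_start; /* assumed: defined just before .data by the script */
extern char _edata;       /* first address past initialized data              */

/* Hypothetical variable: non-const and initialized, so it lands in .data. */
int sample_initialized = 42;

/* Returns 1 if p lies in [__data_start, _edata), otherwise 0. */
int in_initialized_data(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t start = (uintptr_t)&__data_start;
    uintptr_t end = (uintptr_t)&_edata;

    assert(start <= end);
    return addr >= start && addr < end;
}
```

With this helper, in_initialized_data(&sample_initialized) should report 1, while a string literal (normally placed in rodata) or a local variable on the stack should report 0.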

Another option, if you don’t want to modify the linker script, is to start from __executable_start and parse the ELF headers (hoping that the executable is sufficiently linearly mapped).
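
A minimal sketch of that header-parsing approach follows. It assumes the ELF header and program headers are mapped at __executable_start, that glibc's <link.h> is available for the ElfW() macro, and that the first writable PT_LOAD segment is the one of interest; find_writable_segment, in_writable_segment and sample_counter are hypothetical names. Note that it locates the writable segment, which on typical builds begins with sections such as .init_array or .got before .data, and which also covers BSS.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>   /* ElfW() picks Elf32_* or Elf64_* for this target */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

extern char __executable_start; /* provided by the default GNU ld scripts */

int sample_counter = 7; /* hypothetical initialized variable used for checks */

/* Finds the first writable PT_LOAD segment of the running executable.
   Returns 0 and fills start/size on success, -1 on failure. */
int find_writable_segment(uintptr_t *start, size_t *size)
{
    const unsigned char *base = (const unsigned char *)&__executable_start;
    const ElfW(Ehdr) *ehdr = (const ElfW(Ehdr) *)base;

    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
        return -1; /* no ELF header mapped here */
    assert(ehdr->e_phentsize == sizeof(ElfW(Phdr)));

    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)(base + ehdr->e_phoff);

    /* Load bias: run-time address of file offset 0 minus its link-time
       address. It is 0 for a fixed-address executable and the load base
       for a position-independent one. */
    uintptr_t bias = 0;
    int have_bias = 0;
    for (size_t i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type == PT_LOAD && phdr[i].p_offset == 0) {
            bias = (uintptr_t)base - (uintptr_t)phdr[i].p_vaddr;
            have_bias = 1;
            break;
        }
    }
    if (!have_bias)
        return -1;

    for (size_t i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type == PT_LOAD && (phdr[i].p_flags & PF_W)) {
            *start = bias + (uintptr_t)phdr[i].p_vaddr;
            *size = (size_t)phdr[i].p_memsz; /* includes BSS */
            return 0;
        }
    }
    return -1;
}

/* Returns 1 if p is inside that segment, 0 if not, -1 if lookup failed. */
int in_writable_segment(const void *p)
{
    uintptr_t start;
    size_t size;

    if (find_writable_segment(&start, &size) != 0)
        return -1;
    return (uintptr_t)p >= start && (uintptr_t)p - start < size;
}
```

The bias computation is what lets the same code handle both fixed-address and position-independent executables; if the headers are not mapped where expected, the magic-number check makes the lookup fail instead of reading garbage.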

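As for _etext, it marks the end of the text section (you can read that in the linker script as well, but I didn’t include it in the excerpt). The text section is followed by rodata, so trying to write there will probably segfault. The data segment is also usually aligned to a later page boundary, so some addresses between _etext and the start of the data may not be mapped at all.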

User contributions licensed under: CC BY-SA