I have been implementing, just for fun, a simple operating system for the x86 architecture from scratch. I wrote the assembly code for the bootloader, which loads the kernel from disk and enters 32-bit mode. The kernel that gets loaded is written in C, so in order to execute it the idea is to generate a raw binary from the C code.
Firstly, I used these commands:
$ gcc -ffreestanding -c kernel.c -o kernel.o -m32
$ ld -o kernel.bin -Ttext 0x1000 kernel.o --oformat binary -m elf_i386
However, this did not generate any binary and gave this error instead:
kernel.o: In function 'main':
kernel.c:(.text+0xc): undefined reference to '_GLOBAL_OFFSET_TABLE_'
Just for clarity's sake, the kernel.c code is:
/* kernel.c */
void main ()
{
    char *video_memory = (char *) 0xb8000 ;
    *video_memory = 'X';
}
Then I followed this tutorial: http://wiki.osdev.org/GCC_Cross-Compiler
to build my own cross-compiler for my target. It worked for my purpose; however, disassembling the resulting binary with ndisasm, I obtained this code:
00000000  55                push ebp
00000001  89E5              mov ebp,esp
00000003  83EC10            sub esp,byte +0x10
00000006  C745FC00800B00    mov dword [ebp-0x4],0xb8000
0000000D  8B45FC            mov eax,[ebp-0x4]
00000010  C60058            mov byte [eax],0x58
00000013  90                nop
00000014  C9                leave
00000015  C3                ret
00000016  0000              add [eax],al
00000018  1400              adc al,0x0
0000001A  0000              add [eax],al
0000001C  0000              add [eax],al
0000001E  0000              add [eax],al
00000020  017A52            add [edx+0x52],edi
00000023  0001              add [ecx],al
00000025  7C08              jl 0x2f
00000027  011B              add [ebx],ebx
00000029  0C04              or al,0x4
0000002B  0488              add al,0x88
0000002D  0100              add [eax],eax
0000002F  001C00            add [eax+eax],bl
00000032  0000              add [eax],al
00000034  1C00              sbb al,0x0
00000036  0000              add [eax],al
00000038  C8FFFFFF          enter 0xffff,0xff
0000003C  16                push ss
0000003D  0000              add [eax],al
0000003F  0000              add [eax],al
00000041  41                inc ecx
00000042  0E                push cs
00000043  088502420D05      or [ebp+0x50d4202],al
00000049  52                push edx
0000004A  C50C04            lds ecx,[esp+eax]
0000004D  0400              add al,0x0
0000004F  00                db 0x00
As you can see, the first 9 rows (except for the NOP, which I don’t know why it is inserted) are the assembly translation of my main function. From row 10 to the end, there is a lot of code whose origin I don’t understand.
In the end, I have two questions:
1) Why is that code produced?
2) Is there a way to produce the raw machine code from C without that useless stuff?
Answer
A few hints first:
- Avoid naming your starting routine main. It is confusing (both for the reader and perhaps for the compiler; when you don’t pass -ffreestanding to gcc, it handles main very specifically). Use something else, like start or begin_of_my_kernel.
- Compile with gcc -v to understand what your particular compiler is doing.
- You probably should ask your compiler for some optimizations and all warnings, so pass -O -Wall at least to gcc.
- You may want to look into the produced assembler code, so use gcc -S -O -Wall -fverbose-asm kernel.c to get the kernel.s assembler file and glance into it.
- As commented by Michael Petch, you might want to pass -fno-exceptions.
- You probably need some linker script and/or some hand-written assembler for crt0.
- You should read something about linkers & loaders.
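To make the first hint concrete, here is a minimal sketch of how kernel.c could look with a differently named entry point. The names kernel_start and put_char_at and the attribute value are only illustrative assumptions, not something your bootloader requires. The pointer is volatile so that -O does not optimize the store to video memory away.

```c
/* kernel.c: freestanding entry sketch; every name here is illustrative. */

#define VGA_TEXT_BASE 0xb8000        /* VGA text memory, as in the question */
#define ATTR_WHITE_ON_BLACK 0x0f     /* assumed colour attribute */

/* Forward declaration so that kernel_start can come first in the file. */
static void put_char_at(volatile unsigned char *buffer, unsigned int cell,
                        char c, unsigned char attr);

/* Entry point, deliberately not called main. */
void kernel_start(void)
{
    put_char_at((volatile unsigned char *) VGA_TEXT_BASE, 0, 'X',
                ATTR_WHITE_ON_BLACK);
}

/* Each text cell is two bytes: the character, then its colour attribute.
 * volatile keeps the optimizer from dropping these memory-mapped stores. */
static void put_char_at(volatile unsigned char *buffer, unsigned int cell,
                        char c, unsigned char attr)
{
    buffer[cell * 2u] = (unsigned char) c;
    buffer[cell * 2u + 1u] = attr;
}
```

Note that a flat binary has no entry-point record, so the bootloader simply jumps to the first byte. Putting kernel_start first in the file makes it likely, but does not guarantee, that it ends up first in the output; that is one reason a linker script or a small assembler stub is worth having.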
kernel.c:(.text+0xc): undefined reference to '_GLOBAL_OFFSET_TABLE_'
This smells like something related to position-independent code. My guess: try compiling with an explicit -fno-pic or -fno-pie (on some Linux distributions, gcc might be configured with -fpic or -fpie enabled by default).
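One way to check that guess is to look at the macros GCC predefines when position-independent code generation is active (__PIC__ for -fpic/-fPIC, __PIE__ for -fpie/-fPIE). A small sketch; the function name built_as_pic is made up for this example:

```c
/* Returns 1 when this translation unit was compiled as position-independent
 * code (GCC predefines __PIC__ and/or __PIE__ in that case), 0 otherwise. */
int built_as_pic(void)
{
#if defined(__PIC__) || defined(__PIE__)
    return 1;
#else
    return 0;
#endif
}
```

If this returns 1 when built with plain gcc -m32 and 0 once -fno-pie is added, the distribution default is the likely culprit. In a kernel source file the same condition could guard an #error instead, so that a wrong build fails early.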
PS. Don’t forget to add -m32 to gcc if you want 32-bit x86 binaries.
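On your first question, the bytes after ret do not look like instructions at all. The sequence 01 7A 52 00 at offset 0x20 is a version byte followed by the string "zR", which is how a CIE record in an .eh_frame section (call-frame unwind tables) usually starts; ndisasm just decodes everything as code. If that reading is right, compiling with -fno-asynchronous-unwind-tables, or discarding .eh_frame in a linker script, should get rid of them. Below is a small hosted C sketch that decodes a few fields from bytes copied out of your listing; the array and function names are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Bytes at file offsets 0x18..0x3F, copied from the ndisasm listing. */
#define FIRST_OFFSET 0x18u

static const unsigned char trailing[] = {
    0x14, 0x00, 0x00, 0x00, /* 0x18: CIE length (bytes after this field) */
    0x00, 0x00, 0x00, 0x00, /* 0x1C: CIE id, 0 in .eh_frame */
    0x01, 0x7A, 0x52, 0x00, /* 0x20: version 1, augmentation "zR" */
    0x01, 0x7C, 0x08, 0x01, /* 0x24: alignment factors, return-address
                                     register, augmentation data length */
    0x1B, 0x0C, 0x04, 0x04, /* 0x28: pointer encoding, initial CFA rule */
    0x88, 0x01, 0x00, 0x00, /* 0x2C: return-address rule, padding */
    0x1C, 0x00, 0x00, 0x00, /* 0x30: FDE length */
    0x1C, 0x00, 0x00, 0x00, /* 0x34: distance back to the CIE */
    0xC8, 0xFF, 0xFF, 0xFF, /* 0x38: code start, relative to this field */
    0x16, 0x00, 0x00, 0x00  /* 0x3C: size of the described code */
};

/* Read a little-endian 32-bit value at the given file offset. */
static uint32_t read_le32(uint32_t file_offset)
{
    const unsigned char *p = &trailing[file_offset - FIRST_OFFSET];
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
           ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

uint32_t cie_length(void) { return read_le32(0x18u); }

/* 1 if the augmentation string at 0x21 is "zR", else 0. */
int has_zr_augmentation(void)
{
    const unsigned char *s = &trailing[0x21u - FIRST_OFFSET];
    return s[0] == 'z' && s[1] == 'R' && s[2] == 0;
}

/* The start field is PC-relative: add it to its own offset (wraps mod 2^32). */
uint32_t fde_code_start(void) { return 0x38u + read_le32(0x38u); }

uint32_t fde_code_size(void) { return read_le32(0x3Cu); }

void print_unwind_summary(void)
{
    printf("CIE length: 0x%x, zR augmentation: %d\n",
           (unsigned) cie_length(), has_zr_augmentation());
    printf("described code: start 0x%x, size 0x%x\n",
           (unsigned) fde_code_start(), (unsigned) fde_code_size());
}
```

Called from a hosted test program, print_unwind_summary reports a code range starting at offset 0 with size 0x16, which is exactly the 22 bytes of main in the listing. So the extra bytes describe the function rather than being part of it.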