Let’s say I have a prog1.c which is built as prog1.out. In prog1.out there is linker information that tells where the ELF will be loaded. These addresses are virtual addresses. The loader looks for this information and launches it as a process. Each section, like DS and BSS, is loaded at the virtual address given by the linker. Now suppose I have a prog2.out that also has the same load address, BSS, DS, etc. Will they conflict? I know they will not conflict, but then will there be a performance issue, since the two processes have the same virtual addresses but map to different physical addresses? I am confused about how the system can protect two processes that have the same virtual addresses.
Answer
The thing is that when a process uses a memory address, it is talking about a virtual address, and the same virtual address may map to a different physical address in each process. This means that two processes can refer to the same address and not mix their data, because it ends up in two different physical locations.
Below I describe how a virtual address gets converted to a physical address on typical computers (this varies a little on other architectures, but it’s the same idea).
Understanding the memory translation process on the Intel x86 architecture (2-level paging)
So, on the one hand you have a virtual memory address, and you want to get to a physical memory address (i.e. the actual address in RAM). The workflow is mostly this:
Virtual Address -> [Segmentation Unit] -> [Paging Unit] -> Physical Address
Each operating system may define how the Segmentation Unit and the Paging Unit are used. Linux, for example, uses a flat segmentation model, which means segmentation is effectively ignored, so we will ignore it here too.
Now, our virtual address goes through something called the Paging Unit, and gets converted to a physical address. This is how.
The memory is divided into blocks of a certain size; on Intel this size may be 4KB or 4MB.
Each process defines a set of tables in memory so the computer knows how it should translate memory addresses. These tables are organized in a hierarchical way, and the memory address you want to access actually gets decomposed into indexes for these tables.
I know, it sounds confusing, but stay with me for a few more sentences.
There’s an internal CPU register called CR3 which stores the base address of the first table (we shall call this table the Page Directory, and each one of its entries a Page Directory Entry). When a process is scheduled to run, the base address of its page directory is loaded into CR3 (among other things).
So, now you want to access to, let’s say, memory address 0x00C30404,
The paging unit says “Ok, let’s get the page directory base”, looks at the CR3 register, and now knows where the base of the page directory is. Let’s call this address PDB (Page Directory Base).
Now you want to know which directory entry you should use. As I said before, the address gets decomposed into a bunch of indexes. The most significant 10 bits (bits 22 through 31) correspond to the index into the Page Directory. In this case, 0x00C30404 is 0000 0000 1100 0011 0000 0100 0000 0100 in binary, and its most significant 10 bits are 00 0000 0011, which is 0x3. This means that we want to look at page directory entry 0x3.
What do we do now?
Remember that these tables are hierarchical: each Page Directory Entry has, among other things, the address of the next table, called Page Table. (This table may be different for each Page Directory Entry).
So now we’ve got another table. The next 10 bits of our address tell us which index of this table we shall access (let’s call its entries Page Table Entries).
00 0011 0000 are the next 10 bits, and they are the number 0x30. This means that we have to access Page Table Entry 0x30 (48 in decimal).
And finally, this Page Table Entry holds the physical base address of the desired PAGE FRAME (remember that memory is divided into blocks of 4KB). The least significant 12 bits of our address are the offset into this PAGE FRAME; the frame base plus that offset is the actual physical memory address.
This is called 2-level paging. With PAE there is one more level, and on 64 bits it’s very similar but with four levels of paging.
You may think that it is a real bummer to perform all these memory accesses just to fetch a variable. And it’s true. There are mechanisms in the CPU to avoid all these steps; one of them is the TLB (Translation Lookaside Buffer), which caches recent translations so memory can be fetched quickly.
Also, each entry of these structures has some properties regarding permissions, like “Is this page writable?” or “Is this page executable?”.
So, now that you understand how memory paging works, it is easy to grasp how Linux handles the memory:
- Each process has its own page tables, and when the process is scheduled to run, CR3 is loaded with the base address of its page directory.
- Each process has its own virtual space (the tables may not be full, a process can start with, for example, just a single page of 4kb).
- Each process also has some other pages mapped (operating system memory), so when it gets interrupted, the interrupt handler can start running within the same address space and handle the interruption there, executing the needed kernel code.
Some motivations for this kind of scheme
- You don’t have to keep all the processes’ memory in RAM at the same time; you can store some of it on disk.
- You can safely isolate each process memory, and give it some permissions too.
- A process may use 10MB of RAM, but those pages aren’t required to be contiguous in physical memory.