I am trying to understand the internals of the Linux kernel by reading Robert Love’s Linux Kernel Development.
On page 74 he says the easiest way to pass arguments to a syscall is via registers:
Somehow, user-space must relay the parameters to the kernel during the trap. The easiest way to do this is via the same means that the syscall number is passed: The parameters are stored in registers. On x86-32, the registers ebx, ecx, edx, esi, and edi contain, in order, the first five arguments.
Now this is bothering me for a number of reasons:
- All syscalls are defined with the asmlinkage option, which implies that the arguments are always found on the stack and not in registers. So what is all this business with the registers?
- It may be possible that the values are copied onto the kernel stack before the syscall is performed. I have no idea why that would be efficient, but it might be a possibility.
Answer
(This answer is for 32-bit x86 Linux to match your question; things are slightly different for 64-bit x86 and other architectures.)
The parameters are passed from userspace in registers as Love says.
When userspace invokes a system call with int $0x80, the kernel syscall entry code gets control. This is written in assembly language and can be seen here, for instance. One of the things this code does is to take the parameters from the registers and push them onto the stack, and then call the appropriate kernel sys_XXX() function (which is written in C). So those functions do indeed expect their arguments on the stack.
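To make that hand-off concrete, here is a rough user-space model of it in C. It is only a sketch, not the kernel's actual code: struct user_regs, demo_table, demo_dispatch() and the two demo handlers are names made up for illustration. The dispatcher reads the syscall number from eax and passes ebx, ecx, edx, esi and edi as ordinary C arguments to the handler.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical snapshot of the user-space registers at the time of the trap. */
struct user_regs {
    long eax;                     /* syscall number */
    long ebx, ecx, edx, esi, edi; /* arguments 1 to 5, in order */
};

/* Every handler takes its arguments as plain C parameters. In the real
 * 32-bit kernel, asmlinkage forces these onto the stack; this portable
 * sketch simply uses whatever calling convention the compiler picks. */
typedef long (*syscall_fn)(long, long, long, long, long);

/* Hypothetical handler: uses only its first two arguments. */
static long demo_add(long a, long b, long c, long d, long e)
{
    (void)c; (void)d; (void)e;
    return a + b;
}

/* Hypothetical handler: returns its third argument (the value from edx). */
static long demo_third(long a, long b, long c, long d, long e)
{
    (void)a; (void)b; (void)d; (void)e;
    return c;
}

/* Stand-in for the syscall table, indexed by the number in eax. */
static const syscall_fn demo_table[] = { demo_add, demo_third };

/* Models the entry code: validate the number, then turn the saved
 * registers into C arguments and call the handler. */
long demo_dispatch(const struct user_regs *regs)
{
    size_t count = sizeof demo_table / sizeof demo_table[0];

    if (regs->eax < 0 || (size_t)regs->eax >= count)
        return -ENOSYS; /* unknown syscall number */

    return demo_table[regs->eax](regs->ebx, regs->ecx, regs->edx,
                                 regs->esi, regs->edi);
}
```

With eax set to 0 and ebx, ecx set to 2 and 3, demo_dispatch() ends up calling demo_add(2, 3, ...). The handler never looks at a register itself, which is the same separation the real sys_XXX() functions rely on.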
It wouldn’t work as well to try to pass parameters from userspace to the kernel on the stack. When the system call is made, the CPU switches to a separate kernel stack, so the parameters would have to be copied from the userspace stack to the kernel stack, and this is somewhat complicated. And it would have to be done even for very simple system calls that just take a few numeric arguments and wouldn’t otherwise need to access userspace memory at all (think about close() for instance).
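If you want to see the close() case from the userspace side, the sketch below issues it through glibc's generic syscall() wrapper instead of the usual close() function. The helper names raw_close() and open_scratch_fd() are made up for this example. The wrapper loads the syscall number and the single integer argument into the registers the architecture's syscall ABI expects (eax and ebx on 32-bit x86) before trapping into the kernel, so no userspace memory has to be read to service the call.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: close a descriptor through the raw syscall
 * interface. The only argument is a small integer, so it fits in a
 * single register. Returns 0 on success, or -1 with errno set. */
long raw_close(int fd)
{
    return syscall(SYS_close, fd);
}

/* Hypothetical helper: obtain a descriptor to experiment with. */
int open_scratch_fd(void)
{
    return open("/dev/null", O_RDONLY);
}
```

Calling raw_close() on a descriptor from open_scratch_fd() should succeed once. A second call on the same number fails with EBADF, because the kernel only has to look the integer up in the process's descriptor table.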