I decided to spend some time on Linux shellcoding this weekend, and it was amazing! The last time I did anything similar was in my assembly classes at university, but that was a long time ago, it was about DOS, and it had nothing to do with shellcoding or security. So I decided this was the best time to have some fun and pick up new skills.
OK, what does the word “shellcode” mean?
In short, it is a piece of code that performs the attacker's chosen actions on the target system after a vulnerability has been successfully exploited. The word “shellcode” is usually applied only to binary exploitation (buffer overflows, heap-based attacks, etc.). In other cases, it is just a “payload”.
It is important to note that shellcode is a sequence of bytes containing machine instructions that the CPU of the target system executes directly.
Shellcode has none of the sections that ordinary executables such as ELF or PE use to store data. For example, ELF binaries usually have .data and .bss sections that hold variables. So how can we store, say, the IP address and port for a reverse TCP shell in our shellcode?
Well, I found two possible solutions.
1. EIP-based approach
You surely know the CALL instruction: it pushes the address of the next instruction onto the stack and then transfers execution to its target. It is similar to JMP (and the other jump instructions), except that those do not push anything onto the stack.
What if we place our data right after the CALL instruction? In that case, the address pushed onto the stack is the address of our data, and we can load it into any register with a POP instruction.
This is the part of my ‘Hello, World!’ shellcode that uses this technique to work with data.
As you can see, the MESSAGE part of this code contains a CALL instruction followed by the string that we need in the PAYLOAD part. In my case, ECX has to hold the address of the string to be printed by the SYS_WRITE call, and a single POP instruction loads that address into ECX.
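To make the CALL trick easier to check by hand, here is a small Python model of it. This is only a sketch and not the assembly from this post: the load address and the offsets of the POP and CALL instructions are hypothetical values picked for illustration, and `encode_call` and `pushed_return_address` are helper names invented here. The only facts it relies on are that a 32-bit near CALL is the opcode 0xE8 followed by a 4-byte displacement, and that the displacement is counted from the next instruction.

```python
import struct

CALL_OPCODE = 0xE8  # x86 opcode of `call rel32`
CALL_LENGTH = 5     # 1 opcode byte + 4 displacement bytes


def encode_call(call_addr, target_addr):
    """Encode a 32-bit `call target_addr` located at call_addr."""
    # The displacement is counted from the instruction that follows the CALL.
    displacement = target_addr - (call_addr + CALL_LENGTH)
    return bytes([CALL_OPCODE]) + struct.pack("<i", displacement)


def pushed_return_address(call_addr):
    """Address that CALL pushes: the first byte after the instruction."""
    return call_addr + CALL_LENGTH


# Hypothetical layout, chosen only for illustration.
base = 0x08048000                     # assumed load address of the shellcode
pop_addr = base + 0x02                # assumed location of the POP instruction
call_addr = base + 0x10               # assumed location of the CALL instruction
data_addr = call_addr + CALL_LENGTH   # the string starts right after the CALL

call_bytes = encode_call(call_addr, pop_addr)
print(call_bytes.hex())                                # e8edffffff
print(pushed_return_address(call_addr) == data_addr)   # True
```

The pushed address is exactly the address of the data, which is why one POP is enough. In this layout the CALL jumps backwards, so its displacement is negative and is encoded with 0xFF bytes rather than zero bytes.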
OK, let's look at the next example. I wrote a shellcode that spawns a command shell using this technique.
Here is the code:
In this case, the string “/bin/sh” must end with a “\x00” byte, because SYS_EXECVE expects a null-terminated path. This code works fine and spawns a command shell. But what if we want to use it as part of an exploit? The terminator of “/bin/sh” becomes a bad character, because a “\x00” byte is forbidden in many exploitation scenarios: string functions such as strcpy stop copying at the first null byte.
What should we do in this case? Well, we can write a custom crypter or use a publicly available one to get rid of bad characters. I didn't like this option, because I couldn't implement a crypter of my own and I wanted to do things with my own hands.
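A quick way to spot such bytes is to scan the assembled shellcode before using it. The helper below is a minimal Python sketch; `find_bad_chars` is a name made up for this example, and the default set of bad characters (only “\x00”) is an assumption, since the real set depends on the vulnerable application.

```python
def find_bad_chars(shellcode, bad_chars=b"\x00"):
    """Return the offsets of all bytes in shellcode that occur in bad_chars."""
    return [offset for offset, byte in enumerate(shellcode) if byte in bad_chars]


# Only the data part of the shellcode: the path plus its terminator.
data = b"/bin/sh\x00"
print(find_bad_chars(data))  # [7]
```

The same function can be run over the full byte string of a shellcode, with `bad_chars` extended to whatever the target input filter rejects.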
Or we can find another way to store data in our shellcode!
2. Stack-based approach
The stack is awesome! You should know that if you are reading this article :D We can push any data we want onto it, and ESP will always point to the top of the stack. But how can we use it to pass data to our shellcode?
OK, let's look at the string “/bin/sh”, 0h again. First of all, any string is just a sequence of bytes. We can split this byte array into parts of 4 bytes each and push them onto the stack one by one, starting with the last part, because the stack grows towards lower addresses. After that, the whole string sits on the stack and ESP points to its beginning! The main thing to note here is that x86 is a little-endian architecture, so the bytes of each part must be written in reverse order in the pushed value: “/bin” becomes 0x6e69622f.
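The splitting and byte reversal are easy to get wrong by hand, so here is a minimal Python sketch that computes the values to push. `push_dwords` is a hypothetical helper written for this example, and it assumes the data has already been padded to a multiple of 4 bytes.

```python
import struct


def push_dwords(data):
    """Return the 32-bit values to push, in push order, so ESP ends up at data."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4; pad the data first")
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    # The stack grows downwards, so the last chunk has to be pushed first.
    # Reading each chunk as a little-endian integer gives the value to write
    # in the source code of the push instruction.
    return [struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]


for value in push_dwords(b"/bin/sh\x00"):
    print(f"push 0x{value:08x}")
# push 0x0068732f
# push 0x6e69622f
```

Note the leading 00 in the first value: pushing the terminator as part of an immediate puts a zero byte into the shellcode itself.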
What about “\x00”? Well, there are a billion ways to get this byte onto the stack. For example, we can run ‘xor eax, eax’ and then ‘push eax’.
This code doesn't contain a “\x00” byte itself, yet the string on the stack is still null-terminated. Note that the string is actually “/bin//sh”, 0h, because of padding: a push instruction with an immediate operand takes 5 bytes, 4 of which are the operand, so with a single “/” the last part would have to be padded with an extra “\x00” byte in our shellcode. The doubled slash is harmless, because the kernel treats “//” in a path as a single “/”. So SYS_EXECVE works and we get our shell.
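The effect of the padding can be checked at the byte level. The sketch below builds only the part of the shellcode that places the string on the stack, using the standard 32-bit encodings 31 C0 (xor eax, eax), 50 (push eax) and 68 (push imm32). It is an illustration in Python, not the code from this post, and `push_string` is a name invented for it.

```python
XOR_EAX_EAX = b"\x31\xc0"  # xor eax, eax
PUSH_EAX = b"\x50"         # push eax
PUSH_IMM32 = b"\x68"       # opcode of push imm32, followed by 4 operand bytes


def push_string(data):
    """Machine code that pushes a zero dword and then data (len % 4 == 0)."""
    code = XOR_EAX_EAX + PUSH_EAX            # terminator without a literal zero
    for i in range(len(data) - 4, -1, -4):   # last 4-byte part first
        # The operand bytes are stored in memory order, so the chunk is
        # appended as is; only the value written in assembly looks reversed.
        code += PUSH_IMM32 + data[i:i + 4]
    return code


padded = push_string(b"/bin//sh")
print(padded.hex())             # 31c050682f2f7368682f62696e
print(b"\x00" in padded)        # False

unpadded_push = PUSH_IMM32 + b"/sh\x00"   # what a single "/" would require
print(b"\x00" in unpadded_push)           # True
```

With the extra “/” the 13 bytes contain no zero byte, while pushing “/sh” with its terminator as an immediate would bring the bad character back.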
Both techniques work well under the right conditions. The first one needs almost no stack space, only the 4 bytes of the return address that CALL pushes (useful in environments with limited stack space), and you can specify the whole string as is, without splitting it and without the little-endian magic. The second one gives you the flexibility you need, but only at the cost of readability. In other cases you may also face the need for more advanced padding of your strings: you can't solve every problem with an extra “/”.