Threat Research

The Curious Case Of The Document Exploiting An Unknown Vulnerability – Part 1

By Wayne Chin Yick Low | August 20, 2015


Recently, we came across an unknown document exploit that was mentioned in a blog post by the researcher @ropchain. As part of our daily routine, we decided to take a look to see if there was anything interesting about it. The SHA1 of the sample used in this analysis is FB434BA4F1EAF9F7F20FE6F49C4375E90FA98069. The file we are investigating is a Word document called amendment.doc.

Understanding the vulnerability

The exploit is not widely covered by AV vendors, which makes it harder for a researcher to identify the vulnerability it attempts to leverage. An initial look at the document in a text editor gives no clues either. The purposes of the OLE objects embedded in the document are already explained by @ropchain in his blog post, so we will not discuss them again here. Our main objective in this section is to identify the vulnerability, so we use a primitive but effective way to determine its root cause: we make Microsoft Word crash when it loads the exploit document. Because we know that the third OLE component is responsible for triggering the vulnerability, we strip the other, unnecessary OLE components out of the document. As expected, the resulting file is smaller than the original document:

Figure 1: Stripped down exploit document to find the root cause of the vulnerability


We analyze the vulnerability in the following test environment:

  • Windows 7 X86
  • Microsoft Word 2007 SP3
  • WINWORD and WWLib version 12.0.6718.5000

As expected, Microsoft Word crashes immediately upon opening the document:

Figure 2: Exploit crashed Microsoft Word when opening the exploit document

We reproduce the crash a few times to make sure that this is the code we should look into before diving deeper. Interestingly, ECX contains the constant address 0x7c38bd50 in every crash, so we are fairly convinced that this is the code path of interest. The address seems to belong to a Microsoft DLL component, because the exploit document loads otkloadr.dll, which in turn loads MSVCR71.dll, a module that is used to bypass ASLR. To check this, we attach WinDBG to WINWORD again and set a DLL loading breakpoint on MSVCR71.dll. The following WinDBG output confirms our hypothesis:

Figure 3: Finding the constant address
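
The same check can be scripted once the loaded module list is known. The following is a minimal sketch, not the tooling used in the analysis: the helper owning_module is hypothetical, and the base address and image size listed for MSVCR71.dll are assumed values for illustration that should be replaced with the ones reported by the debugger on the test machine.

```python
from typing import NamedTuple, Optional, Sequence


class Module(NamedTuple):
    name: str
    base: int  # load address of the module
    size: int  # size of the mapped image in bytes


def owning_module(address: int, modules: Sequence[Module]) -> Optional[Module]:
    """Return the module whose mapped range contains `address`, if any."""
    for module in modules:
        if module.base <= address < module.base + module.size:
            return module
    return None


# Assumed values for illustration only; take the real ones from the debugger.
modules = [Module("MSVCR71.dll", 0x7C340000, 0x56000)]

crash_ecx = 0x7C38BD50  # constant value observed in ECX in the article
hit = owning_module(crash_ecx, modules)
if hit is not None:
    print(f"{hit.name}+{crash_ecx - hit.base:#x}")
else:
    print("address is not inside any listed module")
```

Because a module without ASLR is mapped at the same base on every run, an address inside it stays constant across crashes, which matches the behaviour observed for ECX.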

At this point, we suspect that the attacker-controlled constant address is hardcoded in the exploit document. We run RTFScan from OfficeMalScanner to extract the second Word.Document object, where the vulnerability resides, and we locate the constant address in document.xml:

Figure 4: XML smart tag that caused the vulnerability
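
Looking for a hardcoded address in extracted XML is easier when several textual forms of the value are tried at once. The sketch below is only an illustration: the helper names and the list of candidate encodings (hexadecimal, decimal, little-endian byte order, and two 16-bit XML character references) are assumptions, and the article does not state which form the exploit actually uses.

```python
import struct
from typing import Dict


def candidate_encodings(address: int) -> Dict[str, str]:
    """Build textual forms in which a 32-bit address might appear in XML."""
    low, high = address & 0xFFFF, (address >> 16) & 0xFFFF
    return {
        "hex": f"{address:08x}",
        "decimal": str(address),
        "le_bytes_hex": struct.pack("<I", address).hex(),
        # Two UTF-16 code units, low half first (an assumed layout).
        "xml_char_refs": f"&#x{low:04x};&#x{high:04x};",
    }


def find_candidates(text: str, address: int) -> Dict[str, int]:
    """Return {encoding name: first offset} for every candidate found."""
    haystack = text.lower()
    hits = {}
    for label, needle in candidate_encodings(address).items():
        position = haystack.find(needle.lower())
        if position != -1:
            hits[label] = position
    return hits


# Made-up sample text; in practice, pass the contents of document.xml.
sample = '<tag value="%d"/>' % 0x7C38BD50
print(find_candidates(sample, 0x7C38BD50))
```

A hit only shows where the value is stored; the attribute that carries it still has to be tied to the parser code by debugging.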


Based on our analysis, these hardcoded addresses are carefully chosen by the exploit author(s) so that Microsoft Word continues executing without crashing until arbitrary code execution is achieved. The values appear to be a series of DWORDs stored in a structure that is dereferenced in the function sub_31249E2E. Presumably, these particular addresses are picked so that the function sub_31249E0A always returns a positive value. In other words, the exploit author(s) want to ensure that sub_31249E0A always returns true to its caller.

Figure 5: Code path leads to vulnerability

Figure 6: DWORD values from the hardcoded addresses that will be dereferenced in the vulnerable function


After back-tracing further from the vulnerable function, we find that the vulnerability revolves around the SmartTag parser routine. This routine passes a structure of the wrong type, and a value in that structure is then dereferenced incorrectly in sub_31249E0A. That value is defined in the SmartTag element attribute, so the attacker controls it. With this understanding of how the vulnerability works, we verify our findings by crafting different proof-of-concept (POC) documents, and these POC documents crash Microsoft Word reliably.


Figure 7: SmartTag parser passed a wrong type structure to the vulnerable function

A known CVE vulnerability or not

With the vulnerability information obtained above, it is possible to find out which vulnerability the document is exploiting. Fortinet is a partner in the Microsoft Active Protections Program (MAPP), which makes the program a valuable source for verifying our findings. We found a report that aligns with our findings and concluded that the document exploits CVE-2015-1641.

Arbitrary code execution

By leveraging the incorrect structure type passed to the vulnerable function, the exploit author corrupts part of the memory of MSVCR71.dll, a non-ASLR DLL module. A function pointer at MSVCR71!acmdln+0x4 is referenced when the vulnerable function executes, and at some point this function pointer is overwritten with an arbitrary value:

Figure 8: Memory corruption on MSVCR71!acmdln+0x4 leads to arbitrary value overwrite

The first stage shellcode is stored in a heap-spray buffer that is sprayed via the ActiveX component embedded in the document. The heap-spray buffer is allocated with size 0x1ffa00 (2095616 bytes), which is exactly the file size of activeX1.bin. Execution is transferred to the first stage shellcode at 0x08080808 when the corrupted function pointer at MSVCR71!acmdln+0x4 is called:


Figure 9: Arbitrary code execution on heap-spray buffer containing the nop-sled and shellcode

Figure 10: First stage shellcode in heap-spray buffer

Shellcode analysis

In this section, we will briefly discuss the characteristics of the first and second stage shellcode:

  1. First stage shellcode:
    • Enumerated the file handle value until it found the original document exploit handle. This is done by checking the file size of the original document exploit through the file handle passed to GetFileSize.
    • Allocated executable buffer via VirtualAlloc
    • Parsed the document exploit contents and looked for the markers 0xFEFEFEFE, 0xFEFEFEFE and 0xFFFFFFFF in the overlay of the document. The second stage shellcode can be found after the markers.
    • Copied the second stage shellcode content to a previously allocated executable buffer and then transferred execution to it via a single JMP instruction
  2. Second stage shellcode:
    • Decoded the second stage shellcode at offset 0x2E with size 0x3CC using XOR key 0xFC.
    • Obtained the original document exploit file name using ZwQueryVirtualMemory with MemorySectionName as the information class. The file name will be converted to ASCII using WideCharToMultiByte.
    • Obtained the full file path of the payload with the file name svchost.exe in %LOCALAPPDATA% folder
    • Obtained the encoded payload in the original document using the starting buffer marker 0xBABABABA and 0xBABABA. The encoded payload can be decoded using the XOR key 0xCAFEBABE for each DWORD value until the end of buffer marker 0xBBBBBB and 0xBBBBBB is encountered. The decoded payload is then dropped to %LOCALAPPDATA%\svchost.exe
    • Executed the payload using WinExec
    • Obtained the encoded decoy document in the original document using the starting buffer marker 0xBCBCBCBC. The encoded buffer can be decoded using the XOR key 0xBAADF00D. The decoded content will be written to the original document exploit
    • Cleaned up the registry in HKCU\Software\Microsoft\Office\1{2,4,5}.0\Word\Resiliency to avoid the potential warning message that will be displayed when someone re-opens a document that was crashed previously.
    • Re-opened the decoy document using WinExec
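
The decoding steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code recovered from the sample: the function names are hypothetical, DWORDs are assumed to be little-endian, the end marker is assumed to sit DWORD-aligned and unencoded after the payload, and the markers are passed in as parameters because their exact byte sequences should be taken from the sample itself. The demo buffer at the bottom is synthetic and uses demo-only markers.

```python
import struct


def xor_bytes(data: bytes, key: int) -> bytes:
    """Single-byte XOR, as used for the second stage body (key 0xFC)."""
    return bytes(b ^ key for b in data)


def decode_stage2(shellcode: bytes, offset: int = 0x2E, size: int = 0x3CC,
                  key: int = 0xFC) -> bytes:
    """Decode the encoded region in place, leaving the decoder stub as is.
    Assumes `offset` is relative to the start of the second stage."""
    region = xor_bytes(shellcode[offset:offset + size], key)
    return shellcode[:offset] + region + shellcode[offset + size:]


def xor_dwords_until(buf: bytes, key: int, end_marker: bytes) -> bytes:
    """XOR each little-endian DWORD with `key` until `end_marker` is met."""
    out = bytearray()
    for off in range(0, len(buf) - 3, 4):
        if buf.startswith(end_marker, off):
            return bytes(out)
        (word,) = struct.unpack_from("<I", buf, off)
        out += struct.pack("<I", word ^ key)
    raise ValueError("end marker not found")


def extract_payload(doc: bytes, start_marker: bytes, end_marker: bytes,
                    key: int) -> bytes:
    """Locate the start marker and decode the DWORD stream that follows."""
    start = doc.find(start_marker)
    if start == -1:
        raise ValueError("start marker not found")
    return xor_dwords_until(doc[start + len(start_marker):], key, end_marker)


# Synthetic demo data; the markers below are demo-only placeholders.
KEY = 0xCAFEBABE
START = bytes.fromhex("BABABABA")
END = bytes.fromhex("BBBBBBBB")
payload = b"MZ\x90\x00demo"  # 8 bytes, so it is DWORD-aligned
encoded = xor_dwords_until(payload + END, KEY, END)  # XOR is its own inverse
document = b"....header...." + START + encoded + END
print(extract_payload(document, START, END, KEY) == payload)
```

The decoy document would be recovered the same way with its own marker and the key 0xBAADF00D; how the end of that buffer is detected is not described in the list above, so it is left out of the sketch.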

In conclusion, this vulnerability is clearly being actively exploited by attackers. One likely reason is the stability of the exploit, which allows successful exploitation on all vulnerable versions of Microsoft Office. By searching the Internet for the CVE number and the vulnerability description, we could identify some groups that are potentially distributing the exploit. This theory is further supported by the fact that the exploit document includes a calc.exe payload, which is a common POC payload for demonstration purposes. Furthermore, by analyzing the RAT payload we are able to identify a threat actor who is possibly closely related to this exploit. In part 2 of this blog post, we will discuss the possible source of this attack.

Fortinet released AV detection MSOffice/CVE_2015_1641.A!exploit and IPS signature MS.Office.RTF.Array.Out.of.bounds.Memory.Corruption to cover this threat.

-= FortiGuard Lion Team =-