Tuesday, April 8, 2014

Building a Decoder for the CVE-2014-0502 Shellcode

In late February of this year multiple security companies (FireEyeAlientVaultSecPod, Symantec, plus many more) were reporting on a Flash zero-day vulnerability (CVE-2014-0502) being exploited in the wild.  Around this time a friend asked me if I could reverse the exploit and its associated files in order to write a decoder for it. The purpose of the requested decoder was to statically determine the URL from where the backdoor executable (shown later) would be downloaded.

As explained in the referenced links, the exploit found in the wild works by:
  1. Having a victim's browser load a Flash file (cc.swf) that exploited the vulnerable Flash player 
  2. The exploit (shellcode) then downloads a GIF file (logo.gif) from the web server hosting the SWF file
  3. This GIF file contains encrypted/encoded shellcode embedded within it that eventually downloads a backdoor executable from an encrypted URL within the file. Static decryption of this URL is what my friend was after
Note: For simplicity, in the steps above, I skipped some of the details on how the shellcode utilizes ROP, defeats ASLR, and finds the necessary libraries and functions (LoadLibrary, InternetOpenURL, CreateFile, etc.). If you are interested in these details please see the FireEye writeup linked above. If you are interested in how the exploit achieves arbitrary code execution then you should read the writeup from Spider Labs here. (You should probably just read the Spider Labs writeup anyway as its very well done...)

Unfortunately, many of the previous research writeups were not available at the time of my friend's request. To assist with Flash decompilation, I used SoThink SWF decompiler, which is a tool I cannot recommend enough, and that I have used to successfully analyze numerous Flash files. Since my effort, Zscaler has published a nice writeup on the Flash file and how it constructs its payload, although it misses a key part to writing the decoder -- how to determine where the encrypted shellcode starts within the downloaded GIF file.

Through analysis with SoThink's tool I was able to determine that the last four bytes of the GIF file contained a little endian integer that represented the offset of the encrypted payload from the beginning of where the offset is stored (the end of the file minus 4). The following decompiled function shows this process. The decompilation is from SoThink and the comments are mine:

1  public class cc extends Sprite
2  {
3   [snip]
4   _loc_4 = new URLLoader();
5   _loc_4.dataFormat = "binary";
6   _loc_4.addEventListener("complete", Ƿ); // sets mpsc
7   _loc_4.load(new URLRequest("logo.gif")); // get logo.gif from same server as loaded
8   [snip]
9  }
10 
11 public function Ƿ(event:Event) : void
12 {
13  var _loc_3:* = new ByteArray();
14 
15  /* writes logo.gif  to loc_3 */
16  _loc_3.writeBytes(event.target.data as ByteArray, 0, (event.target.data as ByteArray).length);
17 
18  /* move to last 4 bytes */
19  _loc_3.position = _loc_3.length - 4;
20  _loc_3.endian = "littleEndian";
21         
22  /* last four bytes of logo.gif */
23  var _loc_4:* = _loc_3.readUnsignedInt();
24  var _loc_2:* = new ByteArray();
25   
26  /* length of file - integer from last 4 bytes - 4 */
27  _loc_2.writeBytes(_loc_3, _loc_3.length - 4 - _loc_4, _loc_4);
28  _loc_2.position = 0;
29 
30  /* integer read from offset: length of file - integer from last 4 bytes - 4 */
31  Ǵ.setSharedProperty("mpsc", _loc_2);
32  Ǵ.start();
33 
34  return;
35 }// end function

As can be seen on lines 4-7, which are inside the constructor of the Flash file, the logo.gif file is downloaded. The URLLoader instance used to download the file has its complete listener set to the function shown on lines 11 through 35. This function is triggered once the file is finished downloading. On line 16 the file's contents are read into an array named _loc_3 (the third declared local variable) by the decompiler. On lines 19 and 20 the array's position is moved to four bytes before the end of the file and its disposition is set to little endian. On line 23 the integer at these last four bytes is read and a new byte array, _loc_2 is declared. Line 27 is the key one as _loc_2 is filled with the bytes of _loc_3 (logo.gif) starting at the end of the file minus the integer read minus another four bytes. At this point _loc_2 holds the encrypted shellcode and is stored in the mpsc shared property. This buffer will later be executed by the exploit payload.

Now that I knew how to find the shellcode reliably, I then needed to decrypt the shellcode in order to find the instructions that located and decrypted the embedded backdoor URL. When analyzing the shellcode I was working in a native Linux environment so I used a mix of vim, dd, and ndisasm. To start, I figured out the offset of the encrypted shellcode within my file and then extracted a copy of the shellcode.

The following lists the beginning of the shellcode:

 1 $ ndisasm -b32 stage1 
 2 00000000  D9EE            fldz
 3 00000002  D97424F4        fnstenv [esp-0xc]
 4 00000006  5E              pop esi
 5 00000007  83C61F          add esi,byte +0x1f
 6 0000000A  33C9            xor ecx,ecx
 7 0000000C  66B90009        mov cx,0x900
 8 00000010  8A06            mov al,[esi]
 9 00000012  8A6601          mov ah,[esi+0x1]
10 00000015  8826            mov [esi],ah
11 00000017  884601          mov [esi+0x1],al
12 0000001A  83C602          add esi,byte +0x2
13 0000001D  E2F1            loop 0x10
14 0000001F  EE              out dx,al
15 00000020  D974D9F4        fnstenv [ecx+ebx*8-0xc]
16 00000024  2483            and al,0x83
[snip]

For those of you who have never used ndisasm before, it is the disassembler that comes with the nasm assembler. The b option defines the architecture (16, 32, or 64 bit Intel), and ndisasm simply treats the file as raw instructions. This makes it very useful when analyzing shellcode. In the output the first column is the offset of the instruction from the beginning of the file, the second column is the instruction's opcodes, and the third column is the instruction's mnemonic.

As you can see in the ndisasm output, the instructions make sense until line 14 (offset 0x1f) where the out instruction is used. out is used to talk directly with hardware devices, and as such is a privileged operation. Since this shellcode runs in userland, out cannot be used and even the operating system's kernel mode components use it sparingly. Examination of the instructions on lines 2 through 13 reveal a decryptor loop that targets the instructions starting at line 14. To begin, lines 2-4 leverage the floating point unit to determine the runtime address of where the fldz (offset 0) instruction is memory. The floating point internals that enable this are explained in an Symantec paper here and the shellcode trick was first disclosed by noir in 2003 here.

Lines 5-7 then setup the loop. First 0x1f is added to the esi register which moves it to the offset where the out instruction is. ecx is then set to zero using xor and the value 0x900 is moved into the cx (the bottom half of ecx) register. This is the loop counter, so we know that the first layer of decryption will operate on 0x900 (2304) bytes. Lines 8 through 12 then implement the deobfuscation of the bytes beginning at offset 14 with an algorithm that translates to:

1  void stage1(unsigned char *buf)
2  {
3    unsigned char *esi;
4    int ecx;
5    unsigned char ah, al;
6
7    // add esi,byte +0x1f
8    esi = buf + 0x1f;
9
10   // xor ecx,ecx
11   // mov cx,0x900
12   ecx = 0x900;
13
14   while(ecx > 0)
15   {
16     // mov al,[esi]
17     al = *esi;
18
19     // mov ah,[esi+0x1]
20     ah = *(esi + 1);
21
22     // mov [esi],ah
23     *esi = ah;
24
25     // mov [esi+0x1],al
26     *(esi + 1) = al;
27
28     // add esi,byte +0x2
29     esi = esi + 2;
30
31     // loop 0x10
32     ecx = ecx - 1;
33   }
34 }

As you can see from the converted assembly, the purpose of this loop is to flip each byte with the one preceding it in the obfuscated shellcode. This is done by using the ah and al registers which are 1 byte in size each. After running the decoder above, the instructions starting at our original out instruction (offset 0x1f) now make sense and become the second stage of shellcode decryption:

 1 $ ndisasm -b32 stage2 
 2 00000000  D9EE          fldz
 3 00000002  D97424F4      fnstenv [esp-0xc]
 4 00000006  5E            pop esi
 5 00000007  83C621        add esi,byte +0x21
 6 0000000A  56            push esi
 7 0000000B  5F            pop edi
 8 0000000C  33C9          xor ecx,ecx
 9 0000000E  66B9F008      mov cx,0x8f0
10 00000012  90            nop
11 00000013  66AD          lodsw
12 00000015  662D6161      sub ax,0x6161
13 00000019  C0E004        shl al,0x4
14 0000001C  02C4          add al,ah
15 0000001E  AA            stosb
16 0000001F  E2F2          loop 0x13
17 00000021  6E            outsb
18 00000022  6A6F          push byte +0x6f
19 00000024  6F            outsd
[snip]

As can be seen, this decryptor stage is another loop that transforms the code that follows it. Lines 2-4 contain code necessary to place esi at the fldz instruction of the second stage decryptor. Line 5 then adds 0x21 to esi in order to point it to the junk outsb instruction at line 17. The loop counter is initialized to 0x8f0 at line 9 and then lines 11 through 15 perform the transformation. This transformation can be expressed in C as:

1 void stage2(unsigned char *buf)
2 {
3   int ecx;
4   unsigned char *esi;
5   unsigned char *edi;
6   unsigned short ax;
7   unsigned char al, ah;
8
9   // add esi,byte +0x21
10  esi = buf + 0x21;
11
12  // push esi
13  // pop edi
14  edi = esi;
15
16  // xor ecx,ecx
17  // mov cx,0x8f0
18  ecx = 0x8f0;
19
20  while(ecx > 0)
21  {
22    // lodsw
23    ax = *(unsigned short *)esi;
24    esi = esi + 2;
25
26    // sub ax,0x6161
27    ax = ax - 0x6161;
28
29    ah = (ax >> 8) & 0xff;
30    al = ax & 0xff;
31
32    // shl al,0x4
33    al = al << 4;
34
35    // add al,ah
36    al = al + ah;
37
38    // stosb
39    *edi = al;
40    edi = edi + 1;
41
42    ecx = ecx - 1;
43   }
44 }

After the second stage of deobfuscation the outsb (line 17 from the previous ndisasm output) and its following instructions look like:

 1 $ ndisasm -b32 stage3
 2 00000000  D9EE          fldz
 3 00000002  D97424F4      fnstenv [esp-0xc]
 4 00000006  5E            pop esi
 5 00000007  83C61F        add esi,byte +0x1f
 6 0000000A  33C9          xor ecx,ecx
 7 0000000C  66B96804      mov cx,0x468
 8 00000010  8A06          mov al,[esi]
 9 00000012  8A6601        mov ah,[esi+0x1]
10 00000015  8826          mov [esi],ah
11 00000017  884601        mov [esi+0x1],al
12 0000001A  83C602        add esi,byte +0x2
13 0000001D  E2F1          loop 0x10
14 0000001F  EE            out dx,al
15 00000020  D974D9F4      fnstenv [ecx+ebx*8-0xc]
16 00000024  2483          and al,0x83
[snip]

You may notice that this is the same algorithm used in stage 1 for decryption, just with a different loop counter since the decrypting process is moving further down the file. By running the algorithm starting at line 14 (offset 0x1f) we get the fourth level of decryptor shellcode:

 1 $ ndisasm -b32 stage4 
 2 00000000  D9EE          fldz
 3 00000002  D97424F4      fnstenv [esp-0xc]
 4 00000006  5E            pop esi
 5 00000007  83C616        add esi,byte +0x16
 6 0000000A  33C9          xor ecx,ecx
 7 0000000C  66B9BB08      mov cx,0x8bb
 8 00000010  803631        xor byte [esi],0x31
 9 00000013  46            inc esi
10 00000014  E2FA          loop 0x10
[snip]

This decryptor loop decrypts the next 0x8bb bytes using the following algorithm:

1 void stage4(unsigned char *buf)
2 {
3   int ecx;
4   unsigned char *esi;
5
6   // add esi,byte +0x16
7   esi = buf + 0x16;
8
9   // add esi,byte +0x16
10  // mov cx,0x8bb
11  ecx = 0x8bb;
12
13  while (ecx > 0)
14  {
15    // xor byte [esi],0x31
16    *esi = *esi ^ 0x31;
17
18    // inc esi
19    esi = esi + 1;
20
21    // loop 0x10
22    ecx = ecx - 1;
23  }
24 }

After running this algorithm, we finally get to the fully deobfuscated shellcode and can begin analysis:

 1 $ ndisasm -b32 stage5 
 2 00000000  55               push ebp
 3 00000001  8BEC             mov ebp,esp
 4 00000003  81EC90010000     sub esp,0x190
 5 00000009  53               push ebx
 6 0000000A  56               push esi
 7 0000000B  57               push edi
[snip]

Remember that the original purpose of my friend's request was a decoder that could decrypt the encrypted URL used to download the backdoor file. The first relevant instruction that I found related to this task is at offset 0x90 in the deobfuscated function:

 1 00000090  E800000000        call dword 0x95
 2 00000095  5B                pop ebx
 3 00000096  83C350            add ebx,byte +0x50
 4 00000099  899D74FEFFFF      mov [ebp-0x18c],ebx
 5 0000009F  8B8574FEFFFF      mov eax,[ebp-0x18c]
 6 000000A5  813831123112      cmp dword [eax],0x12311231
 7 000000AB  740F              jz 0xbc
 8 000000AD  8B8574FEFFFF      mov eax,[ebp-0x18c]
 9 000000B3  40                inc eax
10 000000B4  898574FEFFFF      mov [ebp-0x18c],eax
11 000000BA  EBE3              jmp short 0x9f
[snip]

On line 1 we see a call instruction being made to the next instruction (pop ebx). This has the effect of placing the runtime address of the pop ebx instruction into ebx. 0x50 is then added to this address and a loop begins that is searching for 0x12311231 (0x31123112 on disk due to little endian) towards the end of the GIF file. This is the special marker used to denote where the encrypted URL begins. Once this marker is found, control is transferred to offset 0xbc (this check and jmp occurs on lines 6-7).

Starting at offset 0xbc, we have the following loop. Note that this disassembly is annoyingly long due to no optimizations being used.

  1 000000BC  8B8574FEFFFF      mov eax,[ebp-0x18c]
  2 000000C2  83C004            add eax,byte +0x4
  3 000000C5  898574FEFFFF      mov [ebp-0x18c],eax
  4 000000CB  8B8574FEFFFF      mov eax,[ebp-0x18c]
  5 000000D1  898580FEFFFF      mov [ebp-0x180],eax
  6 000000D7  83A588FEFFFF00    and dword [ebp-0x178],byte +0x0
  7 000000DE  8B8574FEFFFF      mov eax,[ebp-0x18c]
  8 000000E4  038588FEFFFF      add eax,[ebp-0x178]
  9 000000EA  0FBE00            movsx eax,byte [eax]
 10 000000ED  83F8FF            cmp eax,byte -0x1
 11 000000F0  744F              jz 0x141
 12 000000F2  8B8574FEFFFF      mov eax,[ebp-0x18c]
 13 000000F8  038588FEFFFF      add eax,[ebp-0x178]
 14 000000FE  0FBE00            movsx eax,byte [eax]
 15 00000101  83F012            xor eax,byte +0x12
 16 00000104  8B8D74FEFFFF      mov ecx,[ebp-0x18c]
 17 0000010A  038D88FEFFFF      add ecx,[ebp-0x178]
 18 00000110  8801              mov [ecx],al
 19 00000112  8B8574FEFFFF      mov eax,[ebp-0x18c]
 20 00000118  038588FEFFFF      add eax,[ebp-0x178]
 21 0000011E  0FBE00            movsx eax,byte [eax]
 22 00000121  83E831            sub eax,byte +0x31
 23 00000124  8B8D74FEFFFF      mov ecx,[ebp-0x18c]
 24 0000012A  038D88FEFFFF      add ecx,[ebp-0x178]
 25 00000130  8801              mov [ecx],al
 26 00000132  8B8588FEFFFF      mov eax,[ebp-0x178]
 27 00000138  40                inc eax
 28 00000139  898588FEFFFF      mov [ebp-0x178],eax
 29 0000013F  EB9D              jmp short 0xde

On lines 1-3 the pointer to where the marker was found is incremented by 4 to skip the marker and then placed into [ebp-0x18c]. This value is then also placed into [ebp-0x180] on line 5. The buffer holding the URL is then enumerated until an 0xff marker is found. This is accomplished by the byte comparison of -0x1 on line 10 and the bailout if found on line 11.  Lines 12 through 29 perform the decryption of the URL. The main part of this decryption is on lines 15 and 22 (shown in red) where each byte is transformed by XOR'ing with 0x12 and then subtracting 0x31.

After this analysis, we now finally know how to find and decrypt the URL:
  1. Read in logo.gif
  2. Search the file for 0x31123112 in little endian
  3. Once found, decrypt each byte by XOR it with 0x12 and subtract 0x31
  4. Stop processing when a byte of 0xff is found
After implementing the above steps I was able to give my friend a Python script that could read the decrypted URL from abitrary logo.gif files that he found through his analysis and hunting:

$ python decode.py logo.gif
Found \x31\x12\x31\x12 marker at offset 17764
Found 0xff, breaking URL processing loop
Download URL: http://redacted/redacted.exe

In the end, a simple decryptor loop is all that is needed to decrypt the URL, but it took reversing multiple stages of shellcode and runtime code modifications in order to find and understand this algorithm.

I hope that you found reading this blog post informative and interesting. I would like to thank the other members of the Volatility Team (@iMHLv2, @gleeda, @4tphi) for proof reading this before I hit 'Publish' and for @justdionysus providing the historical references related to the use of the FPU for finding EIP. If you have any questions or comments on the post please leave a comment below, ping me on Twitter (@attrc), or shoot me an email (andrew @@@ memoryanalysis.net).

1 comment:

  1. Excellent, excellent write up. I appreciate the granular details.

    ReplyDelete