Malware analysis with IDA/Radare2 2 - From unpacking to config extraction to full reversing (IceID Loader)


In the previous chapters of the course we mainly focused on unpacking in order to reach the final piece of malware. This is usually the first thing to do when facing an unknown sample, or simply to determine what we are dealing with. Today we are going to take things a little further: after unpacking, we will fully reverse the program to determine how it connects to its command and control servers, as well as its overall algorithm. This matters not only for the general reversing process but also for unpacking itself, since sometimes the malware keeps unraveling by downloading more stages from a command and control server and executing them, and those stages are the real final payload. The sample we are going to deal with is the IcedID loader.


According to Microsoft Security, IcedID is a banking trojan that has evolved to become an entry point for more sophisticated threats, including human-operated ransomware. It connects to a command-and-control server and downloads additional implants and tools that allow attackers to perform hands-on-keyboard attacks, steal credentials, and move laterally across affected networks to deliver additional payloads.

Today we are going to learn about how this malware acts as a downloader to implant additional modules inside the affected systems.


OK, so a sample lands on our system, and after a first look we start seeing signs of potential packing:

PS C:\Users\labo\Desktop> rahash2.exe -a entropy .\new_iced.exe
.\new_iced.exe: 0x00000000-0x000239ff entropy: 6.10309516

Entropy is fairly high on the .text section, which is expected, but also on the data-related sections, so something packed may be there:

[0x004024ee]> iS entropy

nth paddr          size vaddr         vsize perm entropy    name
0   0x00000400  0x10800 0x00401000  0x11000 -r-x 6.74434571 .text
1   0x00010c00   0x7000 0x00412000   0x7000 -r-- 4.97468941 .rdata
2   0x00017c00   0x2200 0x00419000  0x1b000 -rw- 5.28734121 .data
3   0x00019e00   0x8400 0x00434000   0x9000 -r-- 3.72823342 .rsrc
4   0x00022200   0x1800 0x0043d000   0x2000 -r-- 6.41329430 .reloc
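
The value reported above is, as far as we can tell, Shannon entropy measured in bits per byte (0 means constant data, 8 means uniformly random bytes). For scripted triage outside r2, the same measurement can be sketched in a few lines of Python. The function name shannon_entropy is only an illustration and not part of radare2:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how many times each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

if __name__ == "__main__":
    print(shannon_entropy(b"\x00" * 1024))         # constant data
    print(shannon_entropy(bytes(range(256)) * 4))  # every byte value equally likely
```

Feeding it the raw bytes of each section (for example the output of pefile's get_data()) should give numbers comparable to the table above.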


If we check the imports, we also see some allocation and memory-protection functions, as well as potential anti-debug material (IsDebuggerPresent, Sleep, calls that gather information about the machine, etc.):

[0x004024ee]> is

nth paddr      vaddr      bind type size lib          name
1   0x00010c00 0x00412000 NONE FUNC 0    KERNEL32.dll imp.GetWindowsDirectoryA
2   0x00010c04 0x00412004 NONE FUNC 0    KERNEL32.dll imp.Sleep
3   0x00010c08 0x00412008 NONE FUNC 0    KERNEL32.dll imp.RemoveDirectoryA
4   0x00010c0c 0x0041200c NONE FUNC 0    KERNEL32.dll imp.VirtualProtectEx
5   0x00010c10 0x00412010 NONE FUNC 0    KERNEL32.dll imp.LocalAlloc
6   0x00010c14 0x00412014 NONE FUNC 0    KERNEL32.dll imp.GetTempPathA
7   0x00010c18 0x00412018 NONE FUNC 0    KERNEL32.dll imp.LocalFree
8   0x00010c1c 0x0041201c NONE FUNC 0    KERNEL32.dll imp.CreateThread
9   0x00010c20 0x00412020 NONE FUNC 0    KERNEL32.dll imp.CloseHandle
10  0x00010c24 0x00412024 NONE FUNC 0    KERNEL32.dll imp.WriteConsoleW
11  0x00010c28 0x00412028 NONE FUNC 0    KERNEL32.dll imp.SetFilePointerEx
12  0x00010c2c 0x0041202c NONE FUNC 0    KERNEL32.dll imp.GetLastError
13  0x00010c30 0x00412030 NONE FUNC 0    KERNEL32.dll imp.HeapFree
14  0x00010c34 0x00412034 NONE FUNC 0    KERNEL32.dll imp.HeapAlloc
15  0x00010c38 0x00412038 NONE FUNC 0    KERNEL32.dll imp.EncodePointer
16  0x00010c3c 0x0041203c NONE FUNC 0    KERNEL32.dll imp.DecodePointer
17  0x00010c40 0x00412040 NONE FUNC 0    KERNEL32.dll imp.GetCommandLineA
18  0x00010c44 0x00412044 NONE FUNC 0    KERNEL32.dll imp.RaiseException
19  0x00010c48 0x00412048 NONE FUNC 0    KERNEL32.dll imp.RtlUnwind
20  0x00010c4c 0x0041204c NONE FUNC 0    KERNEL32.dll imp.IsDebuggerPresent
21  0x00010c50 0x00412050 NONE FUNC 0    KERNEL32.dll imp.IsProcessorFeaturePresent
22  0x00010c54 0x00412054 NONE FUNC 0    KERNEL32.dll imp.GetProcessHeap
23  0x00010c58 0x00412058 NONE FUNC 0    KERNEL32.dll imp.ExitProcess
24  0x00010c5c 0x0041205c NONE FUNC 0    KERNEL32.dll imp.GetModuleHandleExW
25  0x00010c60 0x00412060 NONE FUNC 0    KERNEL32.dll imp.GetProcAddress
33  0x00010c80 0x00412080 NONE FUNC 0    KERNEL32.dll imp.GetACP
34  0x00010c84 0x00412084 NONE FUNC 0    KERNEL32.dll imp.GetOEMCP
35  0x00010c88 0x00412088 NONE FUNC 0    KERNEL32.dll imp.GetCPInfo

And the strings support our previous theory: they may correspond to an encrypted/packed chunk that will be decoded in memory later during execution:

47  0x000191c5 0x0041a5c5 4   5    .data   ascii   TXKU
48  0x000191e9 0x0041a5e9 4   5    .data   ascii   w#Hk
49  0x00019221 0x0041a621 4   5    .data   ascii   ^WZY
50  0x0001923e 0x0041a63e 6   7    .data   ascii   _\fA\vx\f
51  0x00019254 0x0041a654 6   7    .data   ascii   B(Atwa
52  0x0001926b 0x0041a66b 4   5    .data   ascii   T\b\v}
53  0x0001935d 0x0041a75d 4   5    .data   ascii   p"Vx
54  0x000193d4 0x0041a7d4 8   9    .data   ascii   \ttqk@`(0
55  0x00019409 0x0041a809 6   7    .data   ascii   R SjH@
56  0x0001941b 0x0041a81b 4   5    .data   ascii   T`(Z
57  0x0001943b 0x0041a83b 5   6    .data   ascii   [*ppP
58  0x00019451 0x0041a851 4   5    .data   ascii   <p\bb
59  0x00019464 0x0041a864 4   5    .data   ascii   s\vFH
60  0x0001946e 0x0041a86e 4   5    .data   ascii   jIh"

So at this point we have several options: static analysis, trying some auto-unpacker, detonating the sample and checking whether it drops anything… or just debugging it while watching the API calls related to memory management:

[0x004024ee]> ood
Spawned new process with pid 3912, tid = 2884
= attach 3912 2884
File dbg://C:\\Users\\labo\\Desktop\\new_iced.exe  reopened in read-write mode
[0x77193820]> dcu entry0
Continue until 0x00fc24ee using 1 bpsize
(3912) loading library at 0x0000000077140000 (C:\Windows\System32\ntdll.dll) ntdll.dll
(3912) loading library at 0x0000000077300000 (C:\Windows\SysWOW64\ntdll.dll) ntdll.dll
(3912) loading library at 0x0000000074630000 (C:\Windows\System32\wow64.dll) wow64.dll
(3912) loading library at 0x00000000745D0000 (C:\Windows\System32\wow64win.dll) wow64win.dll
(3912) loading library at 0x00000000745C0000 (C:\Windows\System32\wow64cpu.dll) wow64cpu.dll
[0x771e6fb1]> dcu entry0
Continue until 0x00fc24ee using 1 bpsize
(3912) loading library at 0x0000000077020000 (C:\Windows\System32\kernel32.dll) kernel32.dll
(3912) unloading library at 0x0000000077020000 (C:\Windows\System32\kernel32.dll) kernel32.dll
(3912) loading library at 0x0000000076810000 (C:\Windows\SysWOW64\kernel32.dll) kernel32.dll
(3912) unloading library at 0x0000000076810000 (C:\Windows\SysWOW64\kernel32.dll) kernel32.dll
(3912) loading library at 0x0000000077020000 (C:\Windows\System32\kernel32.dll) kernel32.dll
(3912) unloading library at 0x0000000077020000 (C:\Windows\System32\kernel32.dll) kernel32.dll
(3912) loading library at 0x0000000076F20000 (C:\Windows\System32\user32.dll) user32.dll
(3912) unloading library at 0x0000000076F20000 (C:\Windows\System32\user32.dll) user32.dll
(3912) loading library at 0x0000000076810000 (C:\Windows\SysWOW64\kernel32.dll) kernel32.dll
(3912) loading library at 0x0000000076DF0000 (C:\Windows\SysWOW64\KernelBase.dll) KernelBase.dll
(3912) loading library at 0x0000000076C90000 (C:\Windows\SysWOW64\ole32.dll) ole32.dll
(3912) loading library at 0x00000000759F0000 (C:\Windows\SysWOW64\msvcrt.dll) msvcrt.dll
(3912) loading library at 0x0000000074D10000 (C:\Windows\SysWOW64\gdi32.dll) gdi32.dll
(3912) loading library at 0x0000000074AD0000 (C:\Windows\SysWOW64\user32.dll) user32.dll
(3912) loading library at 0x0000000076740000 (C:\Windows\SysWOW64\advapi32.dll) advapi32.dll
(3912) loading library at 0x0000000074E20000 (C:\Windows\SysWOW64\sechost.dll) sechost.dll
(3912) loading library at 0x0000000076B40000 (C:\Windows\SysWOW64\rpcrt4.dll) rpcrt4.dll
(3912) loading library at 0x0000000074A70000 (C:\Windows\SysWOW64\sspicli.dll) sspicli.dll
(3912) loading library at 0x0000000074A60000 (C:\Windows\SysWOW64\cryptbase.dll) cryptbase.dll
(3912) loading library at 0x0000000076800000 (C:\Windows\SysWOW64\lpk.dll) lpk.dll
(3912) loading library at 0x0000000076E40000 (C:\Windows\SysWOW64\usp10.dll) usp10.dll
[0x773a0fc5]> dcu entry0
Continue until 0x00fc24ee using 1 bpsize
(3912) loading library at 0x0000000076C30000 (C:\Windows\SysWOW64\imm32.dll) imm32.dll
(3912) loading library at 0x0000000074BD0000 (C:\Windows\SysWOW64\msctf.dll) msctf.dll
hit breakpoint at: 0xfc24ee

After hitting the entry point, with the needed libraries loaded, we can place our breakpoints on those well-known API calls.

Unpacking the sample

We search for those API calls, also paying attention to potential (very basic) anti-debugging techniques:

[0x0010001f]> e = dbg.maps
[0x0010001f]> dmi KERNEL32 VirtualProtect
Unknown library, or not found in dm
[0x0010001f]> dmi kernel32 VirtualAlloc

nth  paddr      vaddr      bind   type size lib                               name
1264 0x00011826 0x76821826 GLOBAL FUNC 0    KERNEL32.dll                      VirtualAlloc
4    0x00010908 0x76820908 NONE   FUNC 0    API-MS-Win-Core-Memory-L1-1-0.dll imp.VirtualAlloc
[0x0010001f]> dmi kernel32 VirtualProtect

nth  paddr      vaddr      bind   type size lib                               name
1270 0x000143be 0x768243be GLOBAL FUNC 0    KERNEL32.dll                      VirtualProtect
8    0x00010918 0x76820918 NONE   FUNC 0    API-MS-Win-Core-Memory-L1-1-0.dll imp.VirtualProtect
[0x0010001f]> dmi kernel32 IsDebuggerPresent

nth paddr      vaddr      bind   type size lib                              name
770 0x0001494d 0x7682494d GLOBAL FUNC 0    KERNEL32.dll                     IsDebuggerPresent
4   0x00010d94 0x76820d94 NONE   FUNC 0    API-MS-Win-Core-Debug-L1-1-0.dll imp.IsDebuggerPresent
[0x0010001f]> dmi kernel32 CreateProcessA

nth paddr      vaddr      bind   type size lib          name
167 0x00011072 0x76821072 GLOBAL FUNC 0    KERNEL32.dll CreateProcessA
[0x0010001f]> dmi kernel32 CreateProcessW

nth paddr      vaddr      bind   type size lib          name
171 0x0001103d 0x7682103d GLOBAL FUNC 0    KERNEL32.dll CreateProcessW
[0x0010001f]> dmi kernel32 CreateProcessInternalW

nth paddr      vaddr      bind   type size lib          name
170 0x00023c23 0x76833c23 GLOBAL FUNC 0    KERNEL32.dll CreateProcessInternalW
[0x0010001f]> dmi kernel32 CreateProcessInternalA

nth paddr      vaddr      bind   type size lib          name
169 0x0002a507 0x7683a507 GLOBAL FUNC 0    KERNEL32.dll CreateProcessInternalA
[0x0010001f]> dmi kernel32 CreateRemoteThread

nth paddr      vaddr      bind   type size lib                                       name
172 0x000948e3 0x768a48e3 GLOBAL FUNC 0    KERNEL32.dll                              CreateRemoteThread
36  0x000108b0 0x768208b0 NONE   FUNC 0    API-MS-Win-Core-ProcessThreads-L1-1-0.dll imp.CreateRemoteThread

And we come up with the following breakpoints:

[0x0010001f]> db
0x76821826 - 0x76821827 1 --x sw break enabled valid cmd="" cond="" name="0x76821826" module=""
0x768243be - 0x768243bf 1 --x sw break enabled valid cmd="" cond="" name="0x768243be" module=""
0x7682494d - 0x7682494e 1 --x sw break enabled valid cmd="" cond="" name="0x7682494d" module=""
0x76821072 - 0x76821073 1 --x sw break enabled valid cmd="" cond="" name="0x76821072" module=""
0x7682103d - 0x7682103e 1 --x sw break enabled valid cmd="" cond="" name="0x7682103d" module=""
0x76833c23 - 0x76833c24 1 --x sw break enabled valid cmd="" cond="" name="0x76833c23" module=""
0x7683a507 - 0x7683a508 1 --x sw break enabled valid cmd="" cond="" name="0x7683a507" module=""

From here on the process should be familiar: we note the memory regions being allocated and keep inspecting them as execution goes on:

[0x00fc24ee]> dc
hit breakpoint at: 0x76821826
[0x76821826]> dcr
hit breakpoint at: 0x76dff13d
[0x76dff13e]> dr eax
[0x76dff13e]> dc
hit breakpoint at: 0x76821826
[0x76821826]> dcr
hit breakpoint at: 0x76dff13d
[0x76dff13e]> dr eax
[0x76dff13e]> pxw @ 0x00100000
0x00100000  0x0000e890 0x8d5b0000 0x04bf3143 0xb97b02f6  ......[.C1....{.
0x00100010  0x0000094c 0xdb31fa89 0xe683ce89 0x890a7503  L.....1......u..
0x00100020  0xda0166fb 0x8903cac1 0x401030d7 0xe208cac1  .f.......0.@....
0x00100030  0x04bce9e7 0x68900000 0x892e345c 0xb8000009  .......h\4......
0x00100040  0x0000001a 0xd300001a 0x5000000c 0x1c00017c  ...........P|...
0x00100050  0x3a000000 0x0e6d1cda 0x03050e18 0x060d0802  ...:..m.........
0x00100060  0x00000303 0x07140400 0x07050f08 0xf8000000  ................
0x00100070  0x00000197 0x07000000 0x00000000 0x3c00fc00  ...............<
0x00100080  0xfa0038d1 0x4200348b 0x8e000e31 0xae000718  .8...4.B1.......
0x00100090  0xda000068 0x94003560 0x5501c3fe 0x5651e589  h...`5.....U..QV
0x001000a0  0x0c4d8b57 0x8b10758b 0x36ff147d 0xe80875ff  W.M..u..}..6.u..
0x001000b0  0x00000013 0xc7830789 0x04c68304 0x5e5fece2  .............._^
0x001000c0  0x5dec8959 0x550010c2 0x5653e589 0xff645157  Y..]...U..SVWQd.
0x001000d0  0x00003035 0x408b5800 0x0c488b0c 0x8b90118b  50...X.@..H.....
0x001000e0  0x6a903041 0x087d8b02 0x70e85057 0x85000000  A0.j..}.WP.p....
0x001000f0  0x890474c0 0x8be5ebd1 0x8b501841 0xd8013c58  .t......A.P.X<..
[0x76dff13e]> dc
hit breakpoint at: 0x76821826
[0x76821826]> dcr
hit breakpoint at: 0x76dff13d
[0x76dff13e]> dr eax
[0x76dff13e]> pxw @ 0x00110000
0x00110000  0x0638fc23 0x07010134 0x056700ef 0x06a76352  #.8.4.....g.Rc..
0x00110010  0x01713211 0x00000000 0x90691465 0x00000000  .2q.....e.i.....
0x00110020  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00110030  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00110040  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00110050  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00110060  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00110070  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00110080  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00110090  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x001100a0  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x001100b0  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x001100c0  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x001100d0  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x001100e0  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x001100f0  0x00000000 0x00000000 0x00000000 0x00000000  ................
[0x76dff13e]> dc
hit breakpoint at: 0x768243be
[0x768243be]> pxw @ 0x00120000
0x00120000  0x905a384d 0x02660338 0xff710904 0x91c2b881  M8Z.8.f...q.....
0x00120010  0x15c24001 0x1c09c8c6 0xf8ba1f0e 0xcd09b400  .@..............
0x00120020  0x4c01b821 0x68540ac0 0x0e207369 0x676f7270  !..L..This .prog
0x00120030  0x876d6167 0x1f6e4763 0x62e7744f 0x75cfaf65  gam.cGn.Ot.be..u
0x00120040  0x0669985f 0x537e4f44 0x646f6d03 0x890d2e65  _.i.DO~S.mode...
0x00120050  0x444c240a 0xd89b0189 0xb6facd84 0xbe0458d7  .$LD.........X..
0x00120060  0xd6b7980a 0x7cbc0cc0 0x2b11ee60 0x43d6be9e  .......|`..+...C
0x00120070  0x22b43cc8 0x69520acc 0x21286863 0x4550508c  .<."..Rich(!.PPE
0x00120080  0xa0014c80 0x2b7453c6 0x1c145d9c 0x010207e0  .L...St+.]......
0x00120090  0x0c0e230b 0x1b760a83 0x3d3314a4 0x2b100b16  .#....v...3=...+
0x001200a0  0xa0e62009 0x0502400c 0x41d001e0 0xaea2a608  . ...@.....A....
0x001200b0  0x401f8815 0x2c53d080 0x0fda0891 0x0c20801e  ...@..S,...... .
0x001200c0  0x2d784921 0x8cd79ce9 0x8956012b 0x1f5a94a8  !Ix-....+.V...Z.
0x001200d0  0x65742ec1 0x3222ce78 0x0a91b909 0x4342b84e  ..tex."2....N.BC
0x001200e0  0x722e60c0 0x74726164 0xae046880 0x090665fc  .`.rdart.h...e..
0x001200f0  0x73a32b0e 0x40272e52 0x0bca02ff 0x14654c30  .+.sR.'@....0Le.

After some calls, we see something that looks like a compressed/encoded binary being written. So we move on to see if the execution flow moves there at some point:

[0x768243be]> dcr
hit breakpoint at: 0x76dff0fe
[0x76dff0ff]> pd 10
            ;-- rip:
            0x76dff0ff      c21000         ret 0x10
            0x76dff102      cc             int3
            0x76dff103      cc             int3
            0x76dff104      cc             int3
            0x76dff105      cc             int3
            0x76dff106      cc             int3
            0x76dff107      8bff           mov edi, edi
            0x76dff109      55             push ebp
            0x76dff10a      8bec           mov ebp, esp
            0x76dff10c      ff7510         push dword [ebp + 0x10]
[0x76dff0ff]> ds
[0x001007a3]> pd 10
            ;-- rip:
            0x001007a3      85c0           test eax, eax
        ,=< 0x001007a5      7533           jne 0x1007da
        |   0x001007a7      8bbb7b107000   mov edi, dword [ebx + 0x70107b]
        |   0x001007ad      8bb3b2147000   mov esi, dword [ebx + 0x7014b2]
        |   0x001007b3      01fe           add esi, edi
        |   0x001007b5      8d83d2147000   lea eax, [ebx + 0x7014d2]
        |   0x001007bb      50             push eax
        |   0x001007bc      6a04           push 4                      ; 4
        |   0x001007be      6800100000     push 0x1000
        |   0x001007c3      57             push edi

After returning from the first call to VirtualProtect, we see that execution resumes in a region that holds references to other memory regions and operates on them in some way:

[0x001007a3]> pd 50
            ;-- rip:
            0x001007a3      85c0           test eax, eax
        ,=< 0x001007a5      7533           jne 0x1007da
        |   0x001007a7      8bbb7b107000   mov edi, dword [ebx + 0x70107b]
        |   0x001007ad      8bb3b2147000   mov esi, dword [ebx + 0x7014b2]
        |   0x001007b3      01fe           add esi, edi
       .--> 0x001007b5      8d83d2147000   lea eax, [ebx + 0x7014d2]
       :|   0x001007bb      50             push eax
       :|   0x001007bc      6a04           push 4                      ; 4
       :|   0x001007be      6800100000     push 0x1000
       :|   0x001007c3      57             push edi
       :|   0x001007c4      ff93d6147000   call dword [ebx + 0x7014d6]
       :|   0x001007ca      85c0           test eax, eax
      ,===< 0x001007cc      7502           jne 0x1007d0
      |:|   0x001007ce      cd03           int 3
      `---> 0x001007d0      81c700100000   add edi, 0x1000
       :|   0x001007d6      39f7           cmp edi, esi
       `==< 0x001007d8      72db           jb 0x1007b5
        `-> 0x001007da      8b837b107000   mov eax, dword [ebx + 0x70107b]
            0x001007e0      8983ba147000   mov dword [ebx + 0x7014ba], eax
            0x001007e6      8bbbba147000   mov edi, dword [ebx + 0x7014ba]
            0x001007ec      8b8bb2147000   mov ecx, dword [ebx + 0x7014b2]
            0x001007f2      30c0           xor al, al
            0x001007f4      fc             cld
            0x001007f5      f3aa           rep stosb byte es:[edi], al
            0x001007f7 b    8bb3c2147000   mov esi, dword [ebx + 0x7014c2]
            0x001007fd      89f2           mov edx, esi
            0x001007ff      03563c         add edx, dword [esi + 0x3c]
            0x00100802      8d82f8000000   lea eax, [edx + 0xf8]
            0x00100808      0fb74a06       movzx ecx, word [edx + 6]
            0x0010080c      56             push esi
            0x0010080d      ffb3ba147000   push dword [ebx + 0x7014ba]
            0x00100813      50             push eax
            0x00100814      51             push ecx
            0x00100815      e8f8fbffff     call 0x100412
            0x0010081a      60             pushal
            0x0010081b      ffb3da147000   push dword [ebx + 0x7014da]
            0x00100821      ffb3ea147000   push dword [ebx + 0x7014ea]
            0x00100827      ffb3ba147000   push dword [ebx + 0x7014ba]
            0x0010082d      ffb280000000   push dword [edx + 0x80]
            0x00100833      e86cfbffff     call 0x1003a4
            0x00100838      61             popal
            0x00100839      8b83ce147000   mov eax, dword [ebx + 0x7014ce]
[0x001007f7]> pd 20
            ;-- rip:
            0x001007f7 b    8bb3c2147000   mov esi, dword [ebx + 0x7014c2]
            0x001007fd      89f2           mov edx, esi
            0x001007ff      03563c         add edx, dword [esi + 0x3c]
            0x00100802      8d82f8000000   lea eax, [edx + 0xf8]
            0x00100808      0fb74a06       movzx ecx, word [edx + 6]
            0x0010080c      56             push esi

[0x001007f7]> dr esi
[0x001007f7]> pxw @ 0x00120cd3
0x00120cd3  0x00905a4d 0x00000003 0x00000004 0x0000ffff  MZ..............
0x00120ce3  0x000000b8 0x00000000 0x00000040 0x00000000  ........@.......
0x00120cf3  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00120d03  0x00000000 0x00000000 0x00000000 0x000000c8  ................
0x00120d13  0x0eba1f0e 0xcd09b400 0x4c01b821 0x685421cd  ........!..L.!Th
0x00120d23  0x70207369 0x72676f72 0x63206d61 0x6f6e6e61  is program canno
0x00120d33  0x65622074 0x6e757220 0x206e6920 0x20534f44  t be run in DOS
0x00120d43  0x65646f6d 0x0a0d0d2e 0x00000024 0x00000000  mode....$.......
0x00120d53  0x84d89b89 0xd7b6facd 0xd7b6facd 0xd7b6facd  ................
0x00120d63  0xd6b798be 0xd7b6fac0 0xd7b7facd 0xd7b6faee  ................
0x00120d73  0xd6be9e2b 0xd7b6fac8 0xd6b49e2b 0xd7b6facc  +.......+.......
0x00120d83  0x68636952 0xd7b6facd 0x00000000 0x00000000  Rich............
0x00120d93  0x00000000 0x00000000 0x00004550 0x0004014c  ........PE..L...
0x00120da3  0x5d9c7453 0x00000000 0x00000000 0x010200e0  St.]............
0x00120db3  0x0c0e010b 0x00000a00 0x00000c00 0x00000000  ................
0x00120dc3  0x0000163d 0x00001000 0x00002000 0x00400000  =........ ....@.

After a bit of manual work to inspect where those pointers refer, we come up with a clear reference to an allocated memory region where a binary has been written. As this is a new PE, that is, not the main binary, it looks like the unpacked code that will be loaded/executed later on!
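
The manual check we just did (find an MZ header, follow the pointer at offset 0x3c, confirm the PE signature) can also be scripted against a raw memory dump. The following is a minimal sketch under our own assumptions: the helper names find_embedded_pe and raw_pe_size are hypothetical, and the second helper assumes the PE sits in memory in its on-disk layout, as seems to be the case here. The offsets it reads are the same ones the loader stub touches in the disassembly above (0x3c for the NT headers, +6 for the number of sections, +0xf8 for the section table of a 32-bit PE).

```python
import struct

def find_embedded_pe(buf: bytes) -> int:
    """Return the offset of the first plausible PE image in buf, or -1."""
    pos = buf.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(buf):
            # e_lfanew (offset 0x3c of the DOS header) points to the NT headers
            (e_lfanew,) = struct.unpack_from("<I", buf, pos + 0x3C)
            nt = pos + e_lfanew
            if buf[nt:nt + 4] == b"PE\x00\x00":
                return pos
        pos = buf.find(b"MZ", pos + 1)
    return -1

def raw_pe_size(buf: bytes, base: int) -> int:
    """On-disk size of the PE at base: end of the furthest section's raw data."""
    (e_lfanew,) = struct.unpack_from("<I", buf, base + 0x3C)
    nt = base + e_lfanew
    (num_sections,) = struct.unpack_from("<H", buf, nt + 6)
    (opt_size,) = struct.unpack_from("<H", buf, nt + 20)
    table = nt + 24 + opt_size  # equals nt + 0xf8 when the optional header is 0xe0 bytes
    end = 0
    for i in range(num_sections):
        # SizeOfRawData and PointerToRawData sit at +16 and +20 of each 40-byte entry
        raw_size, raw_ptr = struct.unpack_from("<II", buf, table + i * 40 + 16)
        end = max(end, raw_ptr + raw_size)
    return end
```

Run on a dump of the region starting at 0x00120000, find_embedded_pe would be expected to return 0xcd3, the offset where we saw the clean MZ header.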

In the previous posts we searched for the memory region (dm) where the program was being written and dumped the whole region to disk. This time we can dump just the chunk we need with the wtf command, which writes a given number of bytes, starting at the address we point it to, into a file. The size used here, 0x1a00, is the raw size of the embedded PE as given by its section headers (the last section starts at file offset 0x1800 and is 0x200 bytes long):

[0x001007f7]> wtf dumped.bin 0x1a00 @ 0x00120cd3

And that’s it: at this point we have the unpacked sample. An initial overview shows no signs of further packing, so this should be the final sample:

[0x0040163d]> iS entropy

nth paddr        size vaddr        vsize perm entropy    name
0   0x00000400  0xa00 0x00401000  0x1000 -r-x 6.04376285 .text
1   0x00000e00  0x600 0x00402000  0x1000 -r-- 3.93462298 .rdata
2   0x00001400  0x400 0x00403000  0x1000 -rw- 5.35342197 .data
3   0x00001800  0x200 0x00404000  0x1000 -r-- 2.17364736 .reloc


Extracting the config

So we have the unpacked sample; what do we do now? At this point we could analyze the packed/unpacked samples further to write some YARA rules. Or we could study the behavior of the final sample to look for C2 IP addresses/domains so we can, for example, block them.

So what we’ll do now is fully reverse engineer the sample to extract its config and reverse its algorithm. We can do this with r2 as usual, but for simplicity and visual agility we can also use software like iaito or Cutter, which are GUIs for radare2 and rizin respectively.

Moving on, we open the GUI and see that the program first tries to access the AppData folder. Depending on whether it is present, it builds a path with it or falls back to the other one present in the code (users public):


Moving on we see a very suspicious call. It pushes what seem to be two memory addresses, both located in the .data section, one starting 8 bytes after the other. The sizes are also passed to the call, which may indicate that the first 8 bytes of that chunk, along with the rest of it, are being passed to a function.


By peeking inside of it, we locate another interesting call:


Looking inside we see what looks like the substitution box initialization for the RC4 algorithm.


Here we can inspect its disasm:

0x00401832      mov eax, dword [var_10h]
0x00401836      movzx esi, dl
0x00401839      mov dl, byte [ebx + edi]
0x0040183c      mov al, byte [esi + eax]
0x0040183f      add al, dl
0x00401841      add cl, al
0x00401843      mov byte [var_13h], cl
0x00401847      movzx ecx, cl
0x0040184a      mov al, byte [ecx + edi]
0x0040184d      mov byte [ebx + edi], al
0x00401850      lea eax, [esi + 1]
0x00401853      mov byte [ecx + edi], dl
0x00401856      xor edx, edx
0x00401858      mov cl, byte [var_13h]
0x0040185c      div ebp
0x0040185e      inc ebx
0x0040185f      cmp ebx, 0x100     ; 256
0x00401865      jb 0x401832

And how can we affirm that it belongs to the RC4 algorithm? No mystery, we just need to know about it, as we reverse it and see it more and more in our analyses we’ll learn to quickly identify it. RC4 along with AES for example, are common encryption algorithms we see in malware. So at the end we should be able to quickly recognize and operate with them.
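
To make the pattern easier to recognize, here is the textbook RC4 key-scheduling step written as a short Python sketch. The name rc4_ksa is ours; the comments map each line to the loop above as we read it, so take the mapping as an interpretation rather than a proven equivalence:

```python
def rc4_ksa(key: bytes) -> list:
    """RC4 key scheduling: build the 256-entry permutation (the sbox)."""
    S = list(range(256))
    j = 0
    for i in range(256):                           # cmp ebx, 0x100 / jb
        # add al, dl / add cl, al: j += S[i] + key[i mod keylen]
        # (div ebp keeps the key index inside the key length)
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i], S[j] = S[j], S[i]                    # the two byte stores swap S[i] and S[j]
    return S
```

The telltale signs are the 256-iteration loop, an index kept modulo the key length, and a swap of two bytes of the same table.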

After the initialization of the sbox, moving on through the code we see the full RC4 decryption routine:

And the code we should pay attention to:

0x004018df      inc bl
0x004018e1      movzx ebx, bl
0x004018e4      mov cl, byte [esp + ebx + 0x14]
0x004018e8      movzx edx, cl
0x004018eb      add al, dl
0x004018ed      movzx eax, al
0x004018f0      mov dword [var_10h], eax
0x004018f4      mov al, byte [esp + eax + 0x14]
0x004018f8      mov byte [esp + ebx + 0x14], al
0x004018fc      mov eax, dword [var_10h]
0x00401900      mov byte [esp + eax + 0x14], cl
0x00401904      mov al, byte [esp + ebx + 0x14]
0x00401908      add al, dl
0x0040190a      movzx eax, al
0x0040190d      mov al, byte [esp + eax + 0x14]
0x00401911      xor al, byte [esi + edi]
0x00401914      mov byte [edi], al
0x00401916      inc edi
0x00401917      mov eax, dword [var_10h]
0x0040191b      sub ebp, 1
0x0040191e      jne 0x4018df

As a rule of thumb, when we see a routine this big containing instructions such as movzx, shl and the like, and especially XOR, that is a strong indicator that some encryption/decryption is going on. Of course, turning it into pseudo-code and interpreting it, or even debugging it, will give the final answer!
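
As an example of that pseudo-code step, this is the standard RC4 algorithm (key scheduling plus keystream generation) as a self-contained Python sketch. The function name rc4 is ours and the comments tie lines to the disassembly above as we understand it; it is a reference to compare against, not code recovered from the sample:

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4; encryption and decryption are the same operation."""
    # Key scheduling: build the sbox
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i], S[j] = S[j], S[i]
    # Keystream generation, XORed byte by byte with the input
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF                 # inc bl
        j = (j + S[i]) & 0xFF              # add al, dl
        S[i], S[j] = S[j], S[i]            # swap through cl / al
        out.append(byte ^ S[(S[i] + S[j]) & 0xFF])  # xor al, byte [esi + edi]
    return bytes(out)

if __name__ == "__main__":
    # Well-known public test vector for RC4
    print(rc4(b"Key", b"Plaintext").hex())  # bbf316e8d940af0ad3
```

If a reimplementation like this produces the same output as the sample on the same key and data, we can be reasonably confident the routine is plain RC4 and not a modified variant.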

So knowing that we can rename the function to make our analysis easier, as after renaming it we’ll easily detect that decryption routine somewhere else in the code:


Now the whole thing looks crystal clear:

0x0040157a      mov eax, 0x403008
0x0040157f      mov dword [var_24h], 0x403000
0x00401586      lea ecx, [var_24h]
0x00401589      mov dword [var_20h], 8
0x00401590      mov dword [var_1ch], eax
0x00401593      mov dword [var_18h], 0x248 ; 584
0x0040159a      mov dword [var_14h], eax
0x0040159d      call DECRYPT

As we can now guess, the first 8 bytes (at 0x403000) may correspond to the decryption key, the rest (0x248 bytes starting at 0x403008) being the data.

We can extract that from the .data section:


And use a tool like CyberChef to perform the decryption and extract (in this case) the config for the loader:


The config, as it looks, contains several domains that must be related to the command and control infrastructure. So at this point we have relevant IoCs to share about the malware infrastructure.

Reversing the algorithm

From here we can move on to grasp the full algorithm:

As we see, after decrypting the config, the program performs another call and tries to read a file from disk (photo.png):


Being unable to read bytes from it, it will create it, probably to write content into it later on.


Then the execution moves on to another call that performs what looks like a timestamp count. Depending on the difference between timestamps, it initializes a set of variables to some values (we can see that in the code). What’s interesting is that those values are later formatted into a string that resembles some HTTP parameters.

Here’s the code in charge of generating the full string:

0x004011eb      push    eax
0x004011ec      movzx   eax, byte [var_7h]
0x004011f1      push    eax
0x004011f2      movzx   eax, byte [var_ch_2]
0x004011f7      push    eax
0x004011f8      movzx   eax, byte [var_11h_2]
0x004011fd      push    eax
0x004011fe      movzx   eax, byte [var_1ah]
0x00401203      push    eax
0x00401204      movzx   eax, byte [var_1bh]
0x00401209      push    eax
0x0040120a      push    str.0.2X_0.2X_0.2X_0.2X_0.2X_0.2X_0.8X ; 0x4020b8 ; LPCSTR ARG_1
0x0040120f      push    dword [ARG_0] ; LPSTR ARG_0
0x00401213      call    dword [wsprintfA] ; 0x40205c ; int wsprintfA(LPSTR ARG_0, LPCSTR ARG_1, ...)

That looks like an interesting anti-debug technique. My guess is that it sends that information to the C2 server, and the server decides whether to send a valid or a mock response depending on whether the parameters look normal (ticks corresponding to normal, non-debugged behavior).
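
To get a feeling for what that string looks like on the wire, we can emulate the wsprintfA call in Python. This is only a sketch under assumptions: we assume the format string behind the r2 flag name is six "%0.2X" fields followed by one "%0.8X" field with no separators (the flag name hides the original punctuation), and the function and parameter names are hypothetical.

```python
def format_timing_id(byte_values, dword_value):
    """Emulate wsprintfA with six %0.2X fields followed by one %0.8X field."""
    if len(byte_values) != 6:
        raise ValueError("expected six byte-sized values")
    # %0.2X: uppercase hex, at least two digits (each value is a movzx'd byte)
    head = "".join("{:02X}".format(b & 0xFF) for b in byte_values)
    # %0.8X: uppercase hex, at least eight digits
    return head + "{:08X}".format(dword_value & 0xFFFFFFFF)

if __name__ == "__main__":
    # Placeholder values, not taken from a real run
    print(format_timing_id([1, 2, 3, 4, 5, 6], 0x1E3D33FB))
```

Comparing strings built this way with the ones seen in captured traffic is a quick way to confirm or discard the assumption about the format.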

As we move on we can see that, effectively, the HTTP requests to the C2 servers are performed:


In here we can use “pdc” in r2 to quickly get the general picture:

#include <stdint.h>
uint32_t RETRIEVE_FILE (void) {
    int32_t var_1ch_3;
    int32_t var_1ch_2;
    int32_t var_14h;
    int32_t var_18h;
    int32_t var_28h;
    int32_t var_1ch;
    int32_t var_40h;
    int32_t var_4ch;
    int32_t var_144h;
    int32_t var_158h;
    int32_t var_344h_2;
    int32_t var_344h;
    ebx = ecx;
    ecx = &var_1ch;
    edi = edx;
    ANTI_VM ();
    __asm ("rdtsc");
    eax = &var_4ch;
    uint32_t (*wsprintfA)(void, char*, void, uint32_t*, void, void) (eax, "/photo.png?id=%0.2X%0.8X%0.8X%s", 1, *(0x403008), eax, var_1ch);
    *(ebx) = 0;
    eax = &var_158h;
    *(edi) = 0;
    esi = 0x403050;
    ebp = *(wsprintfW);
    void (*ebp)(void, void, void) (eax, 0x402104, 0x403051);
    while (eax != 0xc8) {
        if (*(ebx) != 0) {
            if (*(edi) == 0) {
                goto label_0;
            eax = uint32_t (*GetProcessHeap)(void, uint32_t*) (0, *(ebx));
            uint32_t (*HeapFree)(void) (eax);
        uint32_t (*Sleep)(void) (0x1388);
        eax = *(esi);
        esi += eax;
        if (*(esi) == 0) {
            esi = 0x403050;
        *(ebx) = 0;
        eax = esi + 1;
        *(edi) = 0;
        eax = &var_144h;
        void (*ebp)(void, void, void) (eax, 0x402104, eax);
        eax = &var_344h;
        void (*ebp)(void, void, void) (eax, 0x402104, var_4ch);
        eax = &var_158h;
        var_28h = 1;
        var_1ch_2 = eax;
        ecx = &var_1ch_2;
        eax = &var_344h;
        var_1ch_2 = eax;
        edx = ebx;
        eax = 0x1bb;
        var_14h = ax;
        eax = fcn_0040164b (edi);
    eax = 0;
    return eax;
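
Read loosely, the pseudo-code above keeps asking until it gets an HTTP 200 (0xc8), sleeping 5 seconds (0x1388 ms) between attempts and walking the domain list, wrapping back to the first entry when it reaches the end. The sketch below models only that control flow, which is handy when writing an emulator for the loader. Everything in it is an assumption for illustration: poll_c2 and its parameters are hypothetical names, fetch is a callback we must supply ourselves, and the exact ordering of sleep and request in the real sample may differ.

```python
import time

def poll_c2(domains, fetch, delay=5.0, max_attempts=None, sleep=time.sleep):
    """Cycle through domains until fetch() reports HTTP 200; return (domain, body)."""
    attempts = 0
    index = 0
    while max_attempts is None or attempts < max_attempts:
        domain = domains[index]
        status, body = fetch(domain)         # hypothetical callback doing the request
        if status == 200:                    # while (eax != 0xc8)
            return domain, body
        attempts += 1
        sleep(delay)                         # Sleep(0x1388), i.e. 5000 ms
        index = (index + 1) % len(domains)   # end of list: back to the first domain
    return None, None
```

Passing a fake fetch that returns canned responses lets us exercise the rest of an analysis pipeline without ever touching the real infrastructure.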

So if everything goes correctly, as we move on, we see that the content retrieved from the C2 is decrypted, using the same routine:


And after having it decrypted, we see a final call to a function that allocates space and sets permissions on it:


We see that after allocating it, permissions are set:


And then after that VirtualProtect is called:


Moving on from there, we see a suspicious call to eax, which may correspond to the execution flow being transferred there. If the code to be loaded / response can be loaded, the program will just exit returning 0 (xor eax, eax).


And that’s it, with this, we have the general idea on how the loader works.

Testing the config

The same thing we did can be done very easily with the debugger, by placing a breakpoint after the call to “rc4 decrypt”:

|           0x00c61590      8945e4         mov dword [ebp - 0x1c], eax
|           0x00c61593      c745e8480200.  mov dword [ebp - 0x18], 0x248 ; 584
|           0x00c6159a      8945ec         mov dword [ebp - 0x14], eax
|           0x00c6159d      e8cc020000     call fcn.0040186e
|           0x00c615a2 b    5f             pop edi
|           0x00c615a3      5e             pop esi
|           0x00c615a4      85c0           test eax, eax
|       ,=< 0x00c615a6      7507           jne 0xc615af
|       |   ; CODE XREFS from fcn.004014f9 @ 0xc615ec, 0xc61605, 0xc6162b
|      .--> 0x00c615a8      33c0           xor eax, eax

Inspecting that memory region will then show the address for the config data:

[0x00c614f9]> dc
(3564) loading library at 0x0000000076C90000 (C:\Windows\SysWOW64\ole32.dll) ole32.dll
hit breakpoint at: 0xc615a2
[0x00c615a2]> pxw @ 0xc63000
0x00c63000  0xa267dce3 0xc4f1f313 0x1e3d33fb 0x00000002  ..g......3=.....
0x00c63010  0x646e692f 0x702e7865 0x00007068 0x00000000  /index.php......
0x00c63020  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00c63030  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00c63040  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00c63050  0x6c6f6213 0x69646964 0x7572746f 0x782e7373  .boldidiotruss.x
0x00c63060  0x0f007a79 0x617a696e 0x6f6c706f 0x79782e76  yz..nizaoplov.xy
0x00c63070  0x310f007a 0x73693335 0x2e6b6168 0x74736562  z..153ishak.best
0x00c63080  0x6c691000 0x70313275 0x656e616c 0x7a79782e  ..ilu21plane.xyz
0x00c63090  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00c630a0  0x00000000 0x00000000 0x00000000 0x00000000  ................
0x00c630b0  0x00000000 0x00000000 0x00000000 0x00000000  ................
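
Looking at the dump, the domain list starting at offset 0x50 seems to be a sequence of length-prefixed entries: one byte with the size of the whole entry (0x13 for the first one, which covers the length byte, the 17 characters of the name and a terminating null), then the name itself, with a zero length byte closing the list. Under that assumption, which comes from this single dump and from the way the request loop advances its pointer, a small parser could look like this. The function name parse_domains is ours:

```python
def parse_domains(blob: bytes, start: int = 0x50):
    """Walk the length-prefixed domain list in the decrypted .data blob."""
    domains = []
    pos = start
    while pos < len(blob) and blob[pos] != 0:
        entry_len = blob[pos]                  # size of the entry, length byte included
        raw = blob[pos + 1:pos + entry_len]    # name followed by its null terminator
        domains.append(raw.split(b"\x00", 1)[0].decode("ascii", "replace"))
        pos += entry_len                       # jump to the next length byte
    return domains
```

The default start of 0x50 is relative to the beginning of .data (key included). When parsing a buffer that starts right after the 8-byte key, the list would begin at 0x48 instead.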

Automating the process

Having gained a full understanding of how the malware works, we can now write an automated config extraction routine. It basically looks at the .data section, takes the first 8 bytes as the key and the rest as the data, and performs the RC4 decryption using the arc4 library.

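from arc4 import ARC4
import pefile

CFG_SIZE = 0x248  # size of the encrypted blob, as passed to the decryption call

def dec_rc4(k, data):
    # RC4 is symmetric: decrypting is the same operation as encrypting
    cipher = ARC4(k)
    return cipher.decrypt(data)

def cfg_extract(path):
    # Return the raw bytes of the .data section, where key and config live
    pe = pefile.PE(path)
    for section in pe.sections:
        # section.Name is a null-padded bytes field
        if section.Name.rstrip(b"\x00") == b".data":
            return section.get_data()
    raise ValueError("no .data section found")

data = cfg_extract("dumped.bin")  # the PE dumped during unpacking
k = data[:8]                      # first 8 bytes: RC4 key
data = data[8:8 + CFG_SIZE]       # next 0x248 bytes: encrypted config
print(dec_rc4(k, data))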

Later on we will move to more advanced samples and use R2/Frida to automate that.
