GULoader Campaigns: A Deep Dive Analysis of a Highly Evasive Shellcode-Based Loader

Authored by: Anandeshwar Unnikrishnan

Stage 1: GULoader Shellcode Deployment 

In recent GULoader campaigns, we are seeing a rise in NSIS-based installers delivered via email as malspam that use plugin libraries to execute the GU shellcode on the victim system. The NSIS scriptable installer is a highly efficient software packaging utility. The installer behavior is dictated by an NSIS script, and users can extend the functionality of the packager by adding custom libraries (DLLs) known as NSIS plugins. Since its inception, adversaries have abused the utility to deliver malware.

NSIS stands for Nullsoft Scriptable Install System. NSIS installer files are self-contained archives, which lets malware authors include malicious assets along with junk data. The junk data is used as an anti-AV / AV-evasion technique. The image below shows the structure of an NSIS GULoader staging executable archive.


The NSIS script, which is a file found in the archive, has the file extension .nsi, as shown in the image above. The deployment strategy employed by the threat actor can be studied by analyzing the NSIS script commands in the script file. The image below is an oversimplified view of the whole shellcode staging process.

The file that holds the encoded GULoader shellcode is dropped onto the victim's disk based on the script configuration, along with other files. Junk is prepended to the encoded shellcode. The encoding style varies from sample to sample, but in almost all cases it is a simple XOR encoding. Because the shellcode follows the junk data, an offset is used to retrieve the encoded GULoader shellcode. In the image, the FileSeek NSIS command is used to seek to the correct offset. Some samples have unprotected GULoader shellcode appended to junk data.
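From an analyst's point of view, recovering the shellcode from the dropped file amounts to skipping the junk and undoing the XOR. The following is a minimal Python sketch under stated assumptions: the function name `extract_shellcode`, the repeating-key scheme, and the sample bytes are all hypothetical, and the real offset and key must be taken from the NSIS script and the sample being analyzed.

```python
from itertools import cycle


def extract_shellcode(blob: bytes, offset: int, key: bytes) -> bytes:
    """Skip `offset` bytes of junk, then XOR-decode the rest with a repeating key.

    Assumption: the sample uses a plain repeating-key XOR; this is an
    illustration, not the exact routine of any specific sample.
    """
    if not key:
        raise ValueError("key must not be empty")
    encoded = blob[offset:]
    return bytes(b ^ k for b, k in zip(encoded, cycle(key)))


# Synthetic demonstration: 16 bytes of junk followed by an encoded stub.
junk = bytes(range(16))
stub = b"\x90\x90\xcc\xc3"            # placeholder bytes, not real shellcode
key = b"\x5a"                         # hypothetical single-byte key
dropped = junk + bytes(b ^ key[0] for b in stub)

recovered = extract_shellcode(dropped, len(junk), key)
```

For samples with unprotected shellcode, the same slice without the XOR step would apply.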


A plugin used by the NSIS installer is nothing but a DLL that is loaded by the installer program at runtime, which then invokes functions exported by the library. Two DLL files are dropped in the user's TEMP directory; in all analyzed samples, one DLL has the consistent name system.dll and the name of the other varies.

The system.dll is responsible for allocating memory for the shellcode and executing it. The following image shows how the NSIS script calls functions in plugin libraries.


The system.dll has the following exports, as shown in the image below. The function named "Call" is used to deploy the shellcode on the victim's system.

  • The Call function exported by system.dll resolves the following functions dynamically and executes them to deploy the shellcode:
  • CreateFile – To read the shellcode dropped onto disk by the installer. As part of installer setup, all the files seen in the installer archive earlier are dropped onto disk in a new directory created on the C: drive.
  • VirtualAlloc – To hold the shellcode in RWX memory.
  • SetFilePointer – To seek to the exact position of the shellcode in the dropped file.
  • ReadFile – To read the shellcode.
  • EnumResourceTypesA – Execution via a callback mechanism. The second parameter is of the type ENUMRESTYPEPROCA, which is simply a pointer to a callback routine. The address where the shellcode is allocated in memory is passed as the second argument to this API, leading to execution of the shellcode. Callback function parameters are a good resource for indirect code execution.

Vectored Exception Handling in GULoader

The way the operating system implements exception handling gives an adversary an opportunity to take over the execution flow. Vectored Exception Handling on Windows lets the user register a custom exception handler, which is simply code that gets executed when an exception occurs. The interesting thing about exception handling is the way in which the system resumes the program's normal execution flow after the exception. Adversaries exploit this mechanism to take ownership of the execution flow: malware can divert the flow to code under its control when the exception occurs. Typically, malware employs it to achieve the following goals:

  • Hooking 
  • Covert code execution and anti-analysis 

GULoader employs VEH mainly to obfuscate the execution flow and to slow down analysis. This section covers the internals of vectored exception handling on Windows and investigates how GULoader abuses the VEH mechanism to thwart analysis efforts.

  • Vectored Exception Handling (VEH) is an extension of Structured Exception Handling (SEH) with which we can add a vectored exception handler that will be called regardless of our position in a call frame; simply put, VEH is not frame-based.
  • VEH is abused by malware either to manipulate the control flow or to covertly execute user functions.
  • Windows provides the AddVectoredExceptionHandler Win32 API to add custom exception handlers. The function signature is shown below.

The handler routine is of the type PVECTORED_EXCEPTION_HANDLER. Checking the documentation further, we can see that the handler function takes a pointer to an _EXCEPTION_POINTERS type as its input, as shown in the image below.


The _EXCEPTION_POINTERS type holds two important structures: PEXCEPTION_RECORD and PCONTEXT. PEXCEPTION_RECORD contains all the information related to the exception raised by the system, such as the exception code. The PCONTEXT structure holds the CPU register values (such as RIP/EIP and the debug registers), that is, the state of the thread captured when the exception occurred.


  • This means the exception handler can access both ExceptionRecord and ContextRecord. From within the handler, one can tamper with the data stored in the ContextRecord, thereby manipulating EIP/RIP to control the execution flow when the user application resumes from exception handling.
  • One interesting detail of exception handling is that execution is handed back to the application via the NtContinue native routine. The exception dispatch routines call the handler, and when the handler returns to the dispatcher, the dispatcher passes the ContextRecord to NtContinue and execution resumes from the EIP/RIP in the record. Note that this is an oversimplified explanation of the whole exception handling process.

Vectored Handler in GULoader 

  • GULoader registers a vectored exception handler via the RtlAddVectoredExceptionHandler native routine. The image below shows the control flow of the handler code. Interestingly, most of the code blocks present here are junk added to thwart analysis efforts.


  • GULoader's handler implementation is as follows (disregarding the junk code):
  • Reads the ExceptionInfo passed to the handler by the system.
  • Reads the ExceptionCode from the ExceptionRecord structure.
  • Checks the value of the ExceptionCode field against the computed exception codes for STATUS_ACCESS_VIOLATION, STATUS_BREAKPOINT and STATUS_SINGLE_STEP.
  • Based on the exception code, the malware takes a branch and executes code that modifies the EIP.



GULoader sets the trap flag to trigger single stepping intentionally in order to detect analysis. The handler code gets executed as discussed before, and a block of code is executed based on the exception code. If the exception is a single step (status code 0x80000004), the following actions take place:

  • GULoader reads the ContextRecord and retrieves the EIP value of the thread.
  • Adds 2 to the current EIP and reads one byte from that location.
  • XORs the byte fetched in the previous step with a static value. The static value changes between samples; in our sample the value is 0x1A.
  • Adds the XORed value to the EIP fetched from the ContextRecord.
  • Finally, stores the modified EIP from the previous step in the ContextRecord and returns control to the system (dispatcher).
  • The malware uses the same logic for the access violation exception.


  • When the shellcode is executed without a debugger, the INT3 instruction invokes the vectored exception handler routine with an exception of EXCEPTION_BREAKPOINT. The handler computes the new EIP by reading the byte at EIP + 1, XORing it with a constant (0x1A in our case), and adding the result to the current EIP value. The logic implemented for handling INT3 exceptions also scans the program code for 0xCC instructions placed by researchers. If such 0xCC bytes are found, the EIP is not calculated correctly.


EIP Calculation Logic Abstract 

Trigger via interrupt instruction (INT3): eip = ((ReadByte(eip+1) ^ 0x1A) + eip)
Trigger via single stepping (PUSHFD/POPFD): eip = ((ReadByte(eip+2) ^ 0x1A) + eip)

*The value 0x1A changes between samples.
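When following the shellcode statically, it can help to emulate this calculation rather than single-step through the handler. The sketch below is a hypothetical helper, not code from the sample: the name `next_eip`, the `memory`/`base` representation, and the example bytes are assumptions, while the offsets and the 0x1A constant come from the summary above.

```python
# Offset of the encoded displacement byte relative to EIP, per exception type.
# The access violation case is described as using the same logic as single step.
DISPLACEMENT_OFFSET = {
    "breakpoint": 1,        # INT3
    "single_step": 2,       # PUSHFD/POPFD trap flag
    "access_violation": 2,
}


def next_eip(memory: bytes, base: int, eip: int, exception: str,
             xor_key: int = 0x1A) -> int:
    """Return the EIP the handler would store in the ContextRecord.

    `memory` is a dump of the shellcode and `base` is the address of its
    first byte; both are modelling assumptions for this sketch.
    """
    encoded = memory[eip + DISPLACEMENT_OFFSET[exception] - base]
    return eip + (encoded ^ xor_key)


# Synthetic example: INT3 at 0x1000, encoded displacement byte 0x1F follows it.
dump = bytes([0xCC, 0x1F, 0x1E, 0x00, 0x00, 0x90])
resumed_at = next_eip(dump, 0x1000, 0x1000, "breakpoint")   # 0x1F ^ 0x1A = 5
```

Because the XOR constant changes between samples, it is passed as a parameter instead of being hard-coded.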

Detecting Abnormal Execution Flow via VEH

  • The shellcode is structured in such a way that the malware can detect abnormal execution flow from the order in which exceptions occur at runtime. The pushfd/popfd instructions are followed by code that throws STATUS_ACCESS_VIOLATION when executed. When the program runs normally, execution never reaches the code that follows the pushfd/popfd instruction block, so only STATUS_SINGLE_STEP is raised. When an analyst accidentally steps over the pushfd/popfd block in a debugger, STATUS_SINGLE_STEP is not delivered to the program: the debugger suppresses it because it is already single stepping through the code. The handler logic detects this when execution reaches the code that follows the pushfd/popfd block, which throws STATUS_ACCESS_VIOLATION. The program then runs into a nested exception situation (the access violation following the suppressed single-step exception raised via the trap flag). For this reason, whenever an access violation occurs, the handler routine checks for nested exception information in the _EXCEPTION_POINTERS structure discussed at the beginning.

The image below shows the carefully laid out code used to detect analysis.


Egg Hunting: VEH-Assisted Runtime Padding

One interesting feature seen in GULoader shellcode in the wild is runtime padding. Runtime padding is an evasive behavior intended to beat automated scanners and other security checks employed at runtime. It delays the malicious activities performed by the malware on the target system.

  • The egg value in the analyzed sample is 0xAE74B61.
  • The loader initiates a search for this value in the shellcode's own data segment.
  • Note that this search is implemented via the VEH handler. The search alone adds about 0.3 million VEH iterations on top of the regular VEH control flow manipulation employed in the code.
  • The loader ends this search when it retrieves the address of the egg value. To make sure the value has not been manipulated by a researcher in any way, it performs two more checks to validate the egg location.
  • If a check fails, the search continues. The process of retrieving the location of the egg is shown in the image below.

  • As mentioned above, the validity of the egg location is checked by retrieving byte values from two offsets: the byte 4 bytes away from the egg location must be 0xB8, and the byte 9 bytes away from the egg location must be 0xC3. These checks must pass for the loader to proceed to the next stage of infection. Core malware activities are performed after this runtime padding loop.

The following images show the egg location validity checks performed by GULoader. The values 0xB8 and 0xC3 are checked using the appropriate offsets from the egg location.
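The search and its two validity checks can be reproduced offline against a memory dump. The sketch below is an illustration under assumptions: the function name `find_egg` is hypothetical, and the egg is assumed to be stored as a little-endian 32-bit value, which should be confirmed against the sample.

```python
import struct

# Egg value from the analyzed sample, assumed to be a little-endian dword.
EGG = struct.pack("<I", 0x0AE74B61)


def find_egg(data: bytes) -> int:
    """Return the offset of the first validated egg, or -1 if none is found.

    A candidate is accepted only if the byte at +4 is 0xB8 and the byte
    at +9 is 0xC3, mirroring the two checks described in the text.
    """
    pos = data.find(EGG)
    while pos != -1:
        in_bounds = pos + 9 < len(data)
        if in_bounds and data[pos + 4] == 0xB8 and data[pos + 9] == 0xC3:
            return pos
        pos = data.find(EGG, pos + 1)   # check failed, keep searching
    return -1


# Synthetic buffer: a decoy egg that fails validation, then a valid one.
decoy = EGG + b"\x00" * 6
valid = EGG + b"\xB8" + b"\x11\x22\x33\x44" + b"\xC3"
buffer = b"\x00" * 8 + decoy + valid
egg_offset = find_egg(buffer)
```

Unlike the loader, this version does not route each comparison through an exception handler, so it completes immediately; the point is only to locate the same offset the loader would accept.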


Stage 2: Environment Check and Code Injection

In the second stage of the infection chain, GULoader performs anti-analysis and code injection. The major anti-analysis vectors are listed below. After making sure that the shellcode is not running in a sandbox, it proceeds to inject code into a newly spawned process, where stage 3 is initiated to download and deploy the actual payload. This payload can be either a commodity stealer or a RAT.

Anti-analysis Techniques

  • Employs runtime padding as discussed before.
  • Scans the entire process memory for analysis-tool-specific strings.
  • Uses DJB2 hashing for string checks and dynamic API address resolution.
  • Strings are decoded at runtime.
  • Checks whether qemu is installed on the system by checking the installation path:
  • C:Program Informationqqaqqa.exe 
  • Patches the following APIs:
  • DbgUIRemoteBreakIn
  • The function's prologue is patched with an ExitProcess call.
  • LdrLoadDll
  • The initial bytes are patched with the instruction "mov edi, edi".
  • DbgBreakPoint
  • Patched with a nop instruction.
  • Clears hooks placed in ntdll.dll by security products or researchers for analysis.
  • Window enumeration via EnumWindows.
  • Hides the shellcode thread from the debugger via ZwSetInformationThread by passing 0x11 (ThreadHideFromDebugger).
  • Device driver enumeration via EnumDeviceDrivers and GetDeviceDriverBaseNameA.
  • Installed software enumeration via MsiEnumProductsA and MsiGetProductInfoA.
  • System service enumeration via OpenSCManagerA and EnumServiceStatusA.
  • Checks for use of debugging ports by passing the ProcessDebugPort (0x7) class to NtQueryInformationProcess.
  • Uses the CPUID and RDTSC instructions to detect virtual environments and instrumentation.
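Hash-based string checks are usually defeated by precomputing hashes for candidate names and matching them against constants found in the shellcode. The sketch below uses the textbook DJB2 algorithm (seed 5381, multiplier 33, 32-bit wraparound); whether a given GULoader sample uses exactly these parameters is an assumption to verify, and the candidate list is just a few API names mentioned in this report.

```python
def djb2(data: bytes, seed: int = 5381) -> int:
    """Textbook DJB2: hash = hash * 33 + byte, truncated to 32 bits."""
    h = seed
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h


# Build a reverse lookup table from candidate names to their hashes.
candidates = ["EnumWindows", "EnumDeviceDrivers", "MsiEnumProductsA",
              "OpenSCManagerA", "EnumServiceStatusA"]
hash_to_name = {djb2(name.encode("ascii")): name for name in candidates}


def resolve(hash_value: int) -> str:
    """Map a hash constant seen in the shellcode back to a known name."""
    return hash_to_name.get(hash_value, "<unknown>")
```

If a sample uses a modified seed or an extra XOR step, only `djb2` needs to change; the lookup approach stays the same.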

Anti-dump Protection

Every time GULoader invokes a Win32 API, the call is sandwiched between two XOR loops, as shown in the image below. The loop prior to the call encodes the active shellcode region where the call is taking place, to prevent the memory from being dumped by security products based on event monitoring or API calls. Following the call, the shellcode region is decoded back to normal and execution resumes. The XOR key used is a word present in the shellcode itself.


String Decoding  

This section covers the process GULoader follows to decode strings at runtime.

  • NtAllocateVirtualMemory is called to allocate a buffer to hold the encoded bytes.
  • The encoded bytes are computed by performing various arithmetic and logical operations on static values embedded as operands of assembly instructions. The image below shows the recovery of encoded bytes via these operations. EAX points to the memory buffer where the computed encoded values are stored.


The first byte/word is reserved to hold the size of the encoded bytes. The image below shows 12 bytes of encoded data being written to memory.

Later, the first word is replaced by the first word of the actual encoded data. The image below shows the buffer after the first word has been replaced.

The encoded data is now fully recovered, and the malware proceeds to decode it. A simple XOR is used for decoding, and the key is present in the shellcode. The assembly routine that does the decoding is shown in the image below. Each byte in the buffer is XORed with the key.


The result of the XOR operation is written to the same memory buffer that holds the encoded data. A final view of the memory buffer with the decoded data is shown below.

The image shows the decoding of the string "psapi.dll". This string is later used in fetching the addresses of various functions used for anti-analysis.


Stage 2 culminates in code injection. To be specific, GULoader employs a variation of the process hollowing technique, in which the malware stager process spawns a benign process in a suspended state and overwrites the original content of the suspended process with malicious content. The state of the thread in the suspended process is then changed by modifying processor register values such as EIP, and finally the process resumes execution. By controlling EIP, the malware can direct the control flow in the spawned process to a desired code location. After a successful hollowing, the malware code runs under the cover of a legitimate application.

The variation of the hollowing technique employed by GULoader does not replace the file contents; instead, it maps memory into the suspended process and injects the same shellcode there. Interestingly, GULoader employs an additional technique if the hollowing attempt fails. More details are covered in the following section.

The native APIs listed below are dynamically resolved at runtime to perform the code injection:

  • NtCreateSection 
  • ZwMapViewOfSection 
  • NtWriteVirtualMemory 
  • ZwGetContextThread
  • NtSetContextThread 
  • NtResumeThread   

Overview of Code Injection 

  • Initially, the image "%windir%\Microsoft.NET\Framework\<version>\CasPol.exe" (on 32-bit systems) is spawned in suspended mode via the CreateProcessInternalW native API.
  • GULoader retrieves a handle to the file "C:\Windows\SysWOW64\iertutil.dll", which is used in section creation. The section object created via NtCreateSection is backed by iertutil.dll.
  • This behavior is mainly to avoid suspicion: a section object that is not backed by any file may draw unwanted attention from security systems.
  • The next phase of the code injection is mapping a view of the section backed by iertutil.dll into the spawned CasPol.exe process. Once the view is successfully mapped into the process, the malware can inject the shellcode into the mapped memory and resume the process, thus initiating stage 3. The native API ZwMapViewOfSection is used to perform this task. Following the execution of this API, the malware checks the result of the function call against the error statuses listed below.
  • 0x40000003 (STATUS_IMAGE_NOT_AT_BASE).
  • If the mapping is unsuccessful and the status code returned by ZwMapViewOfSection matches any of the codes mentioned above, the malware has a backup plan.
  • GULoader calls NtAllocateVirtualMemory by directly invoking the system call stub normally found in the ntdll.dll library, in order to bypass EDR/AV hooks. The memory is allocated in the remote CasPol.exe process with RWX memory protection. The following image shows the direct use of the NtAllocateVirtualMemory system call.

After memory allocation, it writes itself into the remote process via NtWriteVirtualMemory, as discussed above. GULoader shellcodes taken from the field are large; the samples taken for this analysis are all greater than 20 MB. In the samples analyzed, the buffer allocated to hold the shellcode is 2950000 bytes. The image below shows the GULoader shellcode in memory.


Misleading Entry Point

  • GULoader is highly evasive in nature. If abnormal execution flow is detected with the help of the employed anti-analysis vectors, the EIP and EBX fields of the thread context structure (of the CasPol.exe process), which are required for stage 3 of malware execution, are overwritten with a decoy address. The location ebp+4 is used to hold the entry point regardless of whether the program is being debugged or not.
  • GULoader uses the ZwGetContextThread and NtSetContextThread routines to modify the thread state. The CONTEXT structure is retrieved via ZwGetContextThread, and the value [ebp+14C] is used as the entry point address. The current EIP value held in the EIP field of the thread's context structure is changed to a recalculated address based on the value at ebp+4. The image below shows the RVA calculation. The base address of the executing shellcode (stage 2) is subtracted from the virtual address [ebp+4] to obtain the RVA.


The RVA is added to the base address of the newly allocated memory in the CasPol.exe process to obtain a new VA that can be used in the remote process. The new VA is written into the EIP and EBX fields of the thread context structure of the CasPol.exe process retrieved via ZwGetContextThread. The image below shows the modified context structure and the value of EIP.
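The address arithmetic is a plain rebase, which is useful to reproduce when mapping addresses between a stage 2 dump and the injected copy in the remote process. The following sketch uses made-up addresses and a hypothetical helper name `rebase_entry_point`; only the subtract-then-add logic comes from the description above.

```python
def rebase_entry_point(entry_va: int, local_base: int, remote_base: int) -> int:
    """Translate an entry point VA from the stage 2 mapping to the remote one.

    RVA = entry_va - local_base; new VA = remote_base + RVA.
    """
    rva = entry_va - local_base
    if rva < 0:
        raise ValueError("entry point lies below the local base address")
    return remote_base + rva


# Hypothetical addresses chosen for illustration only.
local_base = 0x00400000      # base of the executing stage 2 shellcode
entry_va = 0x00401234        # value read from [ebp+4]
remote_base = 0x02A00000     # base of the allocation in the remote process

new_va = rebase_entry_point(entry_va, local_base, remote_base)

# Model of the two CONTEXT fields that receive the new VA.
context = {"Eip": 0, "Ebx": 0}
context["Eip"] = context["Ebx"] = new_va
```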


Finally, by calling ZwSetContextThread, the changes made to the CONTEXT structure are committed to the target thread of the CasPol.exe process. The thread is resumed by calling NtResumeThread. CasPol.exe resumes execution and performs stage 3 of the infection chain.

Stage 3: Payload Deployment  

The GULoader shellcode resumes execution from within a new host process. In this report, the analyzed samples inject the shellcode either into the same process spawned as a child process or into CasPol.exe. Stage 3 performs all the anti-analysis checks once again to make sure this stage is not being analyzed. After all checks, GULoader proceeds to perform stage 3 activities by decoding the encoded C2 string in memory, as shown in the image below. The decoding method is the same as discussed before.

Later, the addresses of the following functions are resolved dynamically by loading wininet.dll:

  • InternetSetOptionA 
  • InternetOpenUrlA 
  • InternetReadFile 
  • InternetCloseHandle

The image below shows the response from the content delivery network (CDN) server where the final payload is stored. In this analysis, a payload of size 0x2E640 bytes is sent to the loader. Interestingly, the first 40 bytes are ignored by the loader. The actual payload starts at offset 40, which is highlighted in the image.


The CDN server is well protected: it only serves clients that send the correct headers and cookies. If these are not present in the HTTP request, the following message is shown to the user.

Final Payload

Quasi Key Generation

The first step in decoding the downloaded final payload is generating a quasi key, which is later used to decode the actual key embedded in the GULoader shellcode. The encoded embedded key is 371 bytes long in the analyzed sample. The process of quasi key generation is as follows:

  • The 40th and 41st bytes (a word) are retrieved from the download buffer in memory.
  • This word is XORed with the first word of the encoded embedded key and with a counter value.
  • The process is repeated until the word taken from the downloaded data fully decodes to the value 0x4D5A ("MZ").
  • The value in the counter when 0x4D5A gets decoded is taken as the quasi key. This key is shown as "key-1" in the image below. In the analyzed sample, the value of this key is 0x5448.
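The counter search in these steps can be sketched as a small brute force over all 16-bit values. This is an illustration under assumptions: the function name `find_quasi_key` is hypothetical, words are read in memory order (big-endian) so that "MZ" reads as 0x4D5A as written above, and the test data is synthetic.

```python
MZ = 0x4D5A  # "MZ" read in memory order, matching the notation in the text


def find_quasi_key(download: bytes, encoded_key: bytes,
                   payload_offset: int = 40) -> int:
    """Find the counter for which download_word ^ key_word ^ counter == MZ."""
    dl_word = int.from_bytes(download[payload_offset:payload_offset + 2], "big")
    key_word = int.from_bytes(encoded_key[:2], "big")
    for counter in range(0x10000):
        if dl_word ^ key_word ^ counter == MZ:
            return counter
    raise ValueError("no counter value decodes the word to MZ")


# Synthetic inputs constructed so that the expected counter is 0x5448.
key_word = 0x1234
dl_word = MZ ^ key_word ^ 0x5448
download = bytes(40) + dl_word.to_bytes(2, "big")
encoded_key = key_word.to_bytes(2, "big") + bytes(369)   # 371 bytes in total

quasi_key = find_quasi_key(download, encoded_key)
```

Since XOR is its own inverse, the same value could be computed directly as `dl_word ^ key_word ^ MZ`; the loop is kept only to mirror the counter-based search described in the steps.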

Decoding the Actual Key

The embedded key in the GULoader shellcode is 371 bytes long, as discussed before. The quasi key is used to decode the embedded key, as shown in the image below.

  • Each word in the embedded key is XORed with the quasi key (key-1).
  • When the iteration counter exceeds the size value of 371 bytes, the loop stops and the loader proceeds to decode the downloaded payload with this new key.

The decoded 371 bytes of the embedded key are shown in the image below.

Decoding File 

Byte-level decoding takes place after the embedded key has been decoded in memory. Each byte of the downloaded data is XORed with the key to obtain the actual data, which is a PE file. The decoded data is written over the same buffer that was used to download the encoded data.
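Both decoding steps can be reproduced offline once the quasi key is known. The sketch below is a minimal model under assumptions: the function names are hypothetical, the 2-byte quasi key is applied in memory order, the 371-byte key is assumed to repeat cyclically over the payload, and all data is synthetic rather than taken from a sample.

```python
from itertools import cycle


def xor_cyclic(data: bytes, key: bytes) -> bytes:
    """XOR `data` with `key`, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def decode_embedded_key(encoded_key: bytes, quasi_key: int) -> bytes:
    """Word-wise XOR of the embedded key with the 16-bit quasi key."""
    return xor_cyclic(encoded_key, quasi_key.to_bytes(2, "big"))


def decode_payload(download: bytes, key: bytes,
                   payload_offset: int = 40) -> bytes:
    """Skip the ignored leading bytes and XOR the rest with the decoded key."""
    return xor_cyclic(download[payload_offset:], key)


# Synthetic round trip: build an "encoded" download, then decode it again.
real_key = bytes((i * 7 + 3) % 256 for i in range(371))
encoded_key = decode_embedded_key(real_key, 0x5448)      # XOR is symmetric
fake_pe = b"MZ" + bytes(range(200))
download = bytes(40) + xor_cyclic(fake_pe, real_key)

decoded = decode_payload(download, decode_embedded_key(encoded_key, 0x5448))
```

A quick sanity check on real data is that the decoded buffer starts with the "MZ" signature, as in the quasi key search.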

The final decoded PE file residing in memory is shown in the image below:

Finally, the loader loads the PE file by allocating memory with RWX permission in the stage 3 process. Based on analysis of multiple samples, this is either the same process from stage 2 spawned as a child process, or CasPol.exe. The loading involves code relocation and IAT correction, as expected in such a scenario. The final payload resumes execution from within the hollowed stage 3 process. The malware families below are usually seen deployed by GULoader:

  • Vidar (Stealer) 
  • Raccoon (Stealer) 
  • Remcos RAT 

The image below shows the injected memory regions in the stage 3 process, which is CasPol.exe in this report.


The role played by malware loaders, popularly known as "crypters", is significant in the deployment of remote administration tools and stealer malware that target consumer data. The Personally Identifiable Information (PII) exfiltrated from compromised endpoints is largely collected and funneled to various underground data-selling marketplaces. This also affects businesses, because critical information used for authentication is leaked from users' personal systems, leading to initial access to company networks. GULoader is heavily used in mass malware campaigns to infect users with popular stealer malware such as Raccoon, Vidar, and Redline. Commodity RATs such as Remcos are also seen delivered in such campaigns. On the bright side, it is not difficult to fingerprint malware specimens used in mass campaigns given their volume and relevance, and detection rules and systems can be built around this fact.


The following table summarizes all the dynamically resolved Win32 APIs:

Win32 API 













