Using GenAI to Encode Malware and Bypass EDR

In this blog, Neuvik’s Director of Advanced Assessments for the EU, Jean Maes, shares an creative method to used to “bypass” a less discussed heuristic – entropy. To do so, he’ll explore the world of Endpoint Detection and Response (EDR) solutions and their telemetry, highlighting how to leverage Generative Artificial Intelligence (GenAI) to bypass entropy metrics. Ultimately, readers will gain a new AI-based technique for their penetration testing / Red Teaming toolkit.

The approach described in this blog is novel and provides two valuable insights for offensive testers. First, entropy metrics in use by standard cybersecurity tooling, such as Endpoint Detection and Response (EDR), can be bypassed using creative manipulation. Typically, an EDR tool will use entropy to analyze how random the data is, looking for signs of “high” entropy that would suggest encryption and therefore flag an executable as a possible threat.

In this case, Jean will describe how he was able to use Generative Artificial Intelligence (GenAI) to create a list of 256 “words” which, when used to replace traditional 256-bit encryption in shellcode, have low entropy and therefore evade EDR. To provide a concrete example, this means a malicious actor could use this technique to lower the entropy of the shellcode of a malware executable and bypass a target’s EDR.

Second, Jean discovered that this modified shellcode could be located in the executable file after compilation, which is a technique Neuvik has not seen in use elsewhere. Typically, shellcode would be put in another file, an image via steganography, or downloaded directly from a site – yet, in this case, Jean simply hid it in the same executable file, but past where said file would “end”. Without disclosing how EDR tools inspect files, this approach could conceivably bypass inspection methods that include checking for a specific “end” point in shellcode.

So, what does this mean? As offensive testers, this suggests that there is significant opportunity to leverage GenAI for creative methods to bypass traditional metrics used by EDR tools. And, this also means that cybersecurity practitioners should consider methods of tuning EDR (where possible) to prevent bypasses of metrics such as entropy.

Read on for step-by-step instructions in Jean’s own words!

How to get caught – i.e., what not to do

At Neuvik, we have the luxury of trying and utilizing multiple commercial (and sometimes ad hoc, custom built) Command and Control Infrastructure (C2) as part of our offensive testing for clients.

While some of the bigger commercial C2 frameworks support evasion out of the box –making it as simple as generating an .exe or Dynamic Link Library (DLL) from the weaponization module, dropping it to disk and using msiexec (rundll32 is so 2016!) to get an implant up and running – we often find ourselves constrained in tweaking functionality in the core implant itself. The de facto C2 framework that comes to mind when we think about tweaking almost everything imaginable is Cobalt Strike (outside of creating your own custom C2, of course). As such, we’ll use Cobalt Strike as an example for this post, as it’s (arguably?) the most signatured framework on the planet, perhaps even more so than meterpreter.

So, how do Endpoint Detection and Response (EDR) tools detect malware? Most of the time, they use a combination of static signatures (like a file hash, a string, or a series of bytes that don’t change) in combination with behavioral detections (i.e., Word spawning cmd).

To make this more concrete, let’s use Elastic as an example, as they have a public GitHub repository where they publish detection mechanisms.

While these behaviors and static detections already provide a strong foundation, many blogs and talks provide guidance on how to tailor your implants to evade these signatures.

One of my personal favorites is from Fortra: Can I Have Your Signature?

However, another metric often gets overlooked and is not discussed as frequently as static and behavioral signatures: Entropy.

Now don’t get me wrong – entropy is not a new concept and has been explained in various security blogs and topics, including this one from our friends at RedSiege: Evading CrowdStrike Falcon Using Entropy

However, it turns out that entropy is still somewhat mysterious. It wasn’t until I saw a presentation at X33fcon by one of my ex-colleagues at NVISO that I had the idea of doing a fun little experiment with AI.

For several years now, whenever I needed to decrease entropy in my operations, I just appended “Rickroll” lyrics to the payload and that worked well. But what if I didn’t just append code but transform the payload itself into something that has less entropy?

Shellcode typically has pretty high entropy as it’s literally designed to be compact and to include executable instructions, making it unpredictable. Similarly, if you encrypt or pack something, entropy also increases significantly.

So, let’s figure out the entropy of a default Cobalt Strike beacon (RAW .bin format). For reference, 8 is the highest entropy possible. Typically, anything above 6 is considered high, while everything below 5 is low.

*Figure 1 – Example detection on strings*

As seen in the graphic above, this Cobalt Strike beacon has an entropy of 6.65 – which is pretty significant.

Those are rookie numbers – we’ve got to pump those numbers up down

So, how can we use AI to help us reduce entropy?

Because a byte can be represented in 256 ways (00-FF), we can have AI help us develop strings of 256 characters. First, we could use it to come up with 256 random words. GenAI is pretty good at spitting out random nonsense when asked. But, just for fun, you could also ask it to generate 256 Pokémon names.

Despite the fun of generating that list, I specifically wanted to have a python script as well as a C executable that can encode/decode, as our operators switch between Linux and Windows quite often. Further, our loader component is typically written in C, and therefore anything we create in python needs to be able to be parsed through C as well. If you like coding in another language like C++ or C#, the same logic would apply, albeit the syntax and implementation would probably be (slightly) different.

Do note, if you start encoding bytes with words, you will significantly increase the size of your final payload, as the more characters your words have, the larger the output file will be. There are, however, approximately 1300 valid 3 letter words in the English language, so if you don’t like Pokémon or want to reduce the output file size, you can consider those.

Below is the relevant prompt and output that was used to generate the python script:

Create a python script that can encode and decode the bytes of an input file, cover the entire byte range 00-FF (256 different unique values), generate random words to encode with but use Pokémon. For example 00 is Pikachu and FF is Charizard. the script cannot have external dependencies. Make it optimized for speed, no shortcuts, print all names, do not omit anything. The names need to be unique, no duplicates are allowed. The names cannot include any special characters. Only values from a-z are supported. Also include checks in the python script that validate if there are really 256 names (size check) as well as a check to prevent duplicate entries in the names.

The ChatGPT output:

The output (and file size difference) can be seen in the screenshot below:

But now, for the moment of truth: will Pokémon names reduce entropy?

We managed to reduce entropy from 6.65 to 4.65, which is considered “low” entropy! Excellent.

So, we know encoding works, but what about decoding – do we get the same file back? Or will something go wrong? Upon my first iteration testing the output from GenAI, I noticed that there were some byte differences with whitespaces. My first attempts were converting text, which obviously is slightly different than raw bytestreams. So, let’s compare the beacon RAW with the decoded “Pokéfied” version:

Seems like the decoded “Pokéfied” version is identical to the RAW shellcode generated by Cobalt Strike! Excellent.

I see what you mean, now for weaponizing the Pokémon

Now that we have an encoded Cobalt Strike beacon that we can decode back to the original value, there is only one question left to be answered:

How do we actually weaponize this?

To answer this question, we need to go back to Jimmy “the skid” Johnson, the fictional character I created to present fun with shellcode(loaders).

First, you have a decision to make:

Do you want the Pokémon names in your loader (i.e., a stage-less approach)?
Do you want to fetch the encoded file from another location and then dynamically decode (i.e., a staged approach)?

Remember, the whole point of this exercise was to reduce entropy. Fetching it remotely wouldn’t make much sense in this case, as the aim was to lower the artifact’s entropy (via the loader).

So, let’s go for option 1 and create a stageless loader.

The programming language and sometimes even the compiler you use will influence how you weaponize the output we generated.

For instance, I prefer to develop loaders in C for several reasons. However, Visual Studio doesn’t handle long byte arrays in source files well. Long arrays significantly increase compile time, as seen in this blog post: Injecting Shellcode from Executable Resources

Instead of converting the output to an array and embedding it into the source file, you could opt to embed it as a resource. However, I wanted to take another approach and came up with an idea so simple that I was surprised wasn’t frequently referenced. Here it is:

We can create a shellcode loader and write logic to make the loader aware of its own file on disk and enable it to parse bytes from this file
Compile it
Append the “Pokéfied” file to the end of the binary
During execution time, read the appended text from the binary itself

Still following me?

Let’s illustrate with another GenAI conversation:

What is the most efficient way to go to the end of the physical file on disk of a program in C. so basically, I want to have a helloworld.exe that can read the final bytes of its own binary on disk. The bytes are provided as an “Offset” argument, so we need to allocate memory dynamically and clean up afterwards.

Let’s compile the program:

Let’s now open the binary in a hex editor and go to the end of the file:

As an example, let’s also add “hello world!” to the end of the binary:

We should now be able to print “hello world!” by running our binary and specifying an offset of 12 characters.

From there, the rest of the weaponization is left as an exercise to the readers! Note: Neuvik does not condone the use of this method for any unauthorized and/or illegal activities.

Closing words

In conclusion, this blog demonstrates how to manipulate binary files by adding custom data at the end and utilizing offsets to access that data. The example showcased how a simple “hello world!” message could be appended and later retrieved. Notably, we used GenAI to generate the unique 256 words needed for this approach. This technique, while basic in nature, underscores the potential for more sophisticated applications and highlights the power of understanding binary file structures and offsets. We hope that this blog post inspires readers to explore and utilize GenAI in their own (legal) coding endeavors.

Happy coding!

Using GenAI to Encode Malware and Bypass EDR

How to get caught – i.e., what not to do

Those are rookie numbers – we’ve got to pump those numbers up down

I see what you mean, now for weaponizing the Pokémon

Closing words

More Articles ..

Gaps in AI Risk Management Frameworks

Why AI Requires Specialized Penetration Testing

Unpacking AI’s Unique Attack Surface and Attack Types

Common Cybersecurity Issues Stemming from AI

What makes AI Risk Different?