My experience rewriting the implant portion of the diet-C2 in Rust and adding an EarlyBird injection command.

Introducing implant-v2

For the last month I bounced between rewriting the implant in pure C, C++, or Rust (which I have been learning for fun on the side). I eventually made the final switch from C++ to Rust because of a cryptic Visual Studio crash whenever I tried to import the curl library. The crash persisted after reinstalling Visual Studio, reinstalling Windows, and re-downloading curl.

I was initially nervous about writing it in Rust because “offensive Rust” leans heavily on unsafe code, and I expected to be fighting the compiler while learning techniques that would otherwise be easy in C or C++. However, on finishing the rewrite, I think it was actually easier to implement the implant in Rust than in C or C++. This is thanks to 1) the amazing crate ecosystem and 2) a surprising amount of documentation for writing offensive Rust.

What changed?

In addition to doing a full 1:1 rewrite of the previous implant, I was able to introduce a new command as well as work on the obfuscation of the userland APIs I am calling by dynamically resolving function addresses out of kernel32.dll and ntdll.dll.

Hiding Functions from the Import Table

It’s well known that suspicious functions in a binary’s import address table (IAT), things like CreateRemoteThread, VirtualAllocEx etc., can lead to further inspection by an AV/EDR. So keeping them out of the IAT, and keeping their names out of the binary’s strings, can be very helpful when trying to avoid static detection.

Previously I was directly using windows.h and calling the functions in a way that got them included in the IAT. To reproduce that behaviour in Rust, I used the windows crate, which is effectively the windows.h replacement for the language.

Below I compiled a tiny example that just calls MessageBoxA using this library, and you can see that the function is present inside of the IAT from the dumpbin utility output.

To obfuscate the call, we can dynamically resolve the address of the MessageBoxA function by loading user32.dll and parsing the address out of it with GetProcAddress.

To do this, I used the libloading crate, a really useful library that works for both Linux .so files and Windows .dll files. The library loads whatever library file name you pass in, and then resolves functions by their symbols (their function names).

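use std::ffi::c_void;

// MessageBoxA(hWnd, lpText, lpCaption, uType) returns an int.
// Win32 APIs use the "system" ABI (stdcall on 32-bit x86, same as "C" on x64).
type MessageBoxAFn = unsafe extern "system" fn(isize, *const c_void, *const c_void, u32) -> i32;

const MB_OK: u32 = 0x0000_0000;

fn main() {
    unsafe {
        // Load user32.dll at runtime so it never appears in the import table.
        let user32 = libloading::Library::new("user32.dll").unwrap();
        // Resolve MessageBoxA by its NUL-terminated symbol name.
        let message_box_a: libloading::Symbol<MessageBoxAFn> =
            user32.get(b"MessageBoxA\0").unwrap();

        let caption = b"this is my caption\0";
        let contents = b"this is my content\0";
        message_box_a(0, contents.as_ptr().cast(), caption.as_ptr().cast(), MB_OK);
    }
}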

And on inspecting the dumpbin output, the executable doesn’t even say that it imports user32.dll at all, let alone MessageBoxA.

I used this technique to pull all the sketchy functions I need out of kernel32.dll and ntdll.dll without them appearing in the import table. The actual implant is a little more efficient than resolving them on every call: on startup it initializes a struct holding the functions I need (CreateRemoteThread etc.) and stores it as a global variable.
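The resolve-once, store-globally pattern can be sketched with nothing but the standard library. This is a minimal illustration, not the implant's actual code: `ApiTable`, `api()`, and `resolve()` are hypothetical names, and two plain Rust functions stand in for the symbols that a real lookup (such as `Library::get`) would return.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// Signature shared by the stand-in functions. In a real table each field would
// carry the `unsafe extern "system" fn(...)` type of one resolved API.
type BinaryOp = fn(i32, i32) -> i32;

fn add(a: i32, b: i32) -> i32 { a + b }
fn mul(a: i32, b: i32) -> i32 { a * b }

/// Hypothetical stand-in for a symbol lookup: maps a name to a function pointer.
fn resolve(name: &str) -> Option<BinaryOp> {
    let exports: HashMap<&str, BinaryOp> =
        HashMap::from([("add", add as BinaryOp), ("mul", mul as BinaryOp)]);
    exports.get(name).copied()
}

/// Table of resolved functions, filled once and then shared read-only.
struct ApiTable {
    add: BinaryOp,
    mul: BinaryOp,
}

static API: OnceLock<ApiTable> = OnceLock::new();

/// Returns the global table, resolving every entry on the first call only.
fn api() -> &'static ApiTable {
    API.get_or_init(|| ApiTable {
        add: resolve("add").expect("add not found"),
        mul: resolve("mul").expect("mul not found"),
    })
}

fn main() {
    let table = api();
    println!("{}", (table.add)(2, 3));
    println!("{}", (table.mul)(2, 3));
}
```

`OnceLock` keeps the initialization thread-safe and avoids `static mut`; later calls to `api()` only read the cached pointers.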

That being said, this doesn’t help much if the string of the function name is still in the binary.

This is again easily fixed with a Rust crate, litcrypt, which provides a macro that encrypts static strings at compile time so that they are not present in plaintext in the final binary. After initializing it, you can wrap any string literal in the lc!() macro and it will obfuscate it for you.

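use std::ffi::c_void;
use litcrypt::{lc, use_litcrypt};
use_litcrypt!();

// MessageBoxA(hWnd, lpText, lpCaption, uType) returns an int; Win32 APIs use the "system" ABI.
type MessageBoxAFn = unsafe extern "system" fn(isize, *const c_void, *const c_void, u32) -> i32;

const MB_OK: u32 = 0x0000_0000;

fn main() {
    unsafe {
        // lc!() yields a String decrypted at runtime, so the literal is not stored in plaintext.
        let user32 = libloading::Library::new(lc!("user32.dll")).unwrap();
        // Append the NUL terminator by hand to form a C-style symbol name.
        let message_box_a: libloading::Symbol<MessageBoxAFn> =
            user32.get(&[lc!("MessageBoxA").as_bytes(), b"\0"].concat()).unwrap();

        let caption = b"this is my caption\0";
        let contents = b"this is my content\0";
        message_box_a(0, contents.as_ptr().cast(), caption.as_ptr().cast(), MB_OK);
    }
}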

The syntax is a little ugly because the name has to end up as a NUL-terminated C string, but it works! Neither user32.dll nor MessageBoxA shows up in the strings of the binary, and this is much easier than spelling out each string as a u8 array every time.

All of the functions currently used in the implant are implemented in this way, but I still think I can improve on the obfuscation of the Windows API calls.
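One way to tidy up the NUL-termination step is to let `std::ffi::CString` build the terminated bytes. The sketch below is only a suggestion: `to_symbol_bytes` is a hypothetical helper name, and a plain `String` stands in for the value `lc!` would produce. As far as I understand libloading's documentation, `get` also accepts a name without the trailing NUL and appends it itself, so treat the helper as a convenience rather than a requirement.

```rust
use std::ffi::CString;

/// Turn a runtime `String` into a NUL-terminated byte vector suitable for a
/// symbol lookup. Panics if the name contains an interior NUL byte.
fn to_symbol_bytes(name: String) -> Vec<u8> {
    CString::new(name)
        .expect("symbol name contains an interior NUL")
        .into_bytes_with_nul()
}

fn main() {
    // A plain String stands in for the output of an obfuscation macro.
    let bytes = to_symbol_bytes(String::from("MessageBoxA"));
    // The terminator is appended exactly once.
    assert_eq!(bytes, b"MessageBoxA\0");
    println!("{} bytes", bytes.len());
}
```

Compared with concatenating slices by hand, this also rejects names with an embedded NUL instead of silently truncating them at the C boundary.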

EarlyBird Injection

The EarlyBird injection technique is well documented - I first read about it in this blog post from CyberBit and later learned about the technique in depth during the Intermediate Malware Development course from Sektor7, so I figured it was a natural progression for the diet-c2.

The general technique is outlined in the post by CyberBit, but I’ll go over it briefly here:

  1. A new process is spawned in a suspended state, for example svchost.exe so that it blends in
  2. Memory is allocated in the new process
  3. Shellcode is copied into the allocated memory
  4. QueueUserAPC() is called on the new process’ suspended main thread, pointing at the shellcode. This queues an Asynchronous Procedure Call (APC), a function that executes asynchronously in the context of that thread.
  5. The thread is resumed with ResumeThread(). During thread initialization the loader calls an undocumented Nt function, NtTestAlert(), which drains the thread’s APC queue and so executes the shellcode

The technique is special because the queued shellcode runs before the entry point of the suspended process, which is around when many AVs/EDRs place their userland hooks. This prevents (or rather mitigates) hooking of the userland Windows APIs the shellcode relies on.

The technique was first discovered over 5 years ago in 2018, so while it’s a large step up from stock-standard shellcode injection with CreateRemoteThread(), I’m sure it’s been fingerprinted up and down by AVs and especially EDRs.

Nevertheless, Windows Defender doesn’t catch it :)

A Rusty EarlyBird

The full code can be found inside of my GitHub repo, under implants/implant-v2/src/.

The implementation uses only obfuscated functions as shown above, and in my testing it has not been caught by Defender, even when launching meterpreter payloads.

Next steps

I can think of a couple of limitations of my existing function obfuscation. It’s pretty annoying to draft the boilerplate for the C function signature every time I need to call a new Windows API function, but that work is front-loaded, so it should be smooth sailing now. The other drawback is that libloading::Library::new() just wraps the Win32 LoadLibrary family and .get() just wraps GetProcAddress(), so the loading itself is nowhere near completely obfuscated.

In the future I want to locate the kernel32.dll that is already loaded into the process at runtime, get access to GetProcAddress() from it, and then use that function to resolve everything else. I’m thinking I can walk the loaded-module list through the Process Environment Block (PEB) to find kernel32.dll in memory, and then pull GetProcAddress out of its export table.

More interestingly though, I learned of a couple of techniques involving direct and indirect syscalls that would allow you to fully bypass EDR userland hooking (in theory), so I am going to work on developing a diet-c2 dropper in C that uses these techniques (and then port it to Rust?). I would also like to do something involving API unhooking, but I need to find an open source EDR that does inline API hooking first.