Security researchers at SentinelLABS have unveiled MalTerminal, a novel proof-of-concept demonstrating the integration of Large Language Models into malware payloads.
This Windows executable, identified through a year-long retrospective hunt, embeds an OpenAI GPT-4 chat completions API endpoint, which has been deprecated since November 2023, suggesting the sample dates back to late 2023 or early 2024.
SentinelLABS analysts developed YARA rules to detect API key patterns unique to major LLM providers, notably the Base64 substring “T3BlbkFJ” used in OpenAI keys. A comprehensive retrohunt across VirusTotal uncovered over 7,000 samples containing more than 6,000 unique LLM API keys.
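For illustration, the key-pattern logic those YARA rules encode can be expressed as a short Python scanner. The regex below approximates the legacy OpenAI key format, in which the Base64 substring “T3BlbkFJ” (“OpenAI” encoded) sits at a fixed offset; it is a sketch, not SentinelLABS’s actual rule.

```python
# Minimal Python analogue of the YARA hunting logic described above: scan a
# file for the legacy OpenAI API key pattern, whose Base64 body contains the
# fixed substring "T3BlbkFJ". The surrounding character counts approximate
# the legacy key format and are illustrative.
import re
import sys

OPENAI_KEY_RE = re.compile(rb"sk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}")

def find_keys(path: str) -> list[bytes]:
    with open(path, "rb") as f:
        return OPENAI_KEY_RE.findall(f.read())

if __name__ == "__main__":
    for hit in find_keys(sys.argv[1]):
        print(hit.decode())
```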
Manual clustering by key multiplicity isolated the truly malicious candidates, among which MalTerminal emerged as the earliest known example of malware that generates its malicious logic at runtime.
Technical Architecture and Runtime Code Generation
MalTerminal operates by issuing a structured JSON payload to the GPT-4 endpoint, instructing the model to generate either ransomware encryption routines or a reverse shell payload.
Upon execution, the embedded prompt casts the model in the role of a cybersecurity expert and includes explicit guardrails to reduce hallucinations.
The prompt template specifies code segments for recursively enumerating files, applying AES encryption in CBC mode, and uploading encrypted archives via HTTP POST.
Micropatterns within the prompt, such as enforcing consistent byte endianness and restricting file-open modes to “rb+”, reflect adversaries’ efforts to preempt common LLM generation errors.
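A minimal sketch of what such a request could look like follows. The endpoint shown is the current chat-completions URL, the model name and placeholder strings are illustrative, and none of it is code recovered from the sample, which called a now-deprecated GPT-4 endpoint with a hardcoded key.

```python
# Hypothetical sketch of the request pattern described above: a structured
# chat-completions payload carrying the operator-chosen task prompt. The
# endpoint, model name, and placeholders are illustrative, not sample code.
import requests

API_KEY = "sk-..."  # hardcoded in the sample; placeholder here
TASK_PROMPT = ("<system prompt casting the model as a cybersecurity expert, "
               "plus the operator-selected task and guardrail micropatterns>")

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4",
        "messages": [{"role": "user", "content": TASK_PROMPT}],
        "temperature": 0,  # deterministic output to limit generation errors
    },
    timeout=60,
)
generated_code = resp.json()["choices"][0]["message"]["content"]
```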
SentinelLABS also recovered the accompanying Python scripts: two variants of testAPI.py and an enhanced TestMal2.py that replicate the executable’s behavior. These scripts prompt the operator to choose between “Ransomware” and “Reverse Shell,” then execute the returned Python code directly in memory to evade static and behavioral detection, as the sketch below illustrates.
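The in-memory execution pattern reduces, in essence, to compiling and exec-ing the returned source. The sketch below is a defanged illustration with a stub standing in for the API call; it is not code from the recovered scripts.

```python
# Hypothetical sketch of the in-memory execution pattern described above.
# fetch_generated_code() is a stub for the chat-completions call; the
# returned Python source is compiled and run without ever touching disk.

def fetch_generated_code(task: str) -> str:
    # Placeholder: the real scripts send a task-specific prompt to the LLM
    # and return the generated Python source as a string.
    return "print('LLM-generated payload would run here')"

choice = input("Select payload: [1] Ransomware  [2] Reverse Shell ")
task = "ransomware" if choice.strip() == "1" else "reverse_shell"
source = fetch_generated_code(task)
exec(compile(source, "<llm-generated>", "exec"))  # never written to disk
```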
A defensive scanner, TestMal3.py (alias “FalconShield”), was also discovered. This utility extracts embedded prompts from target Python files and submits them to an LLM classifier to flag malicious intent, illustrating the dual-use nature of prompt-based analysis tools.
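A rough sketch of how such a prompt-extraction-plus-classifier utility might be structured appears below. The string-length heuristic, model name, and verdict format are assumptions for illustration, not details recovered from TestMal3.py.

```python
# Hypothetical FalconShield-style scanner: pull long string literals out of
# a target Python file and ask an LLM whether any reads like a prompt for
# generating malicious code. Model name and verdict labels are assumptions.
import ast
import sys
from openai import OpenAI  # assumes openai>=1.0 and OPENAI_API_KEY set

client = OpenAI()

def extract_strings(path: str) -> list[str]:
    tree = ast.parse(open(path, encoding="utf-8").read())
    return [n.value for n in ast.walk(tree)
            if isinstance(n, ast.Constant) and isinstance(n.value, str)
            and len(n.value) > 80]  # long literals are prompt candidates

def classify(text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Does the following embedded prompt ask an LLM to "
                       "generate malicious code? Answer MALICIOUS or BENIGN."
                       "\n\n" + text,
        }],
    )
    return resp.choices[0].message.content.strip()

if __name__ == "__main__":
    for s in extract_strings(sys.argv[1]):
        print(classify(s), "::", s[:60].replace("\n", " "))
```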
Implications for Detection and Threat Hunting
MalTerminal’s runtime code generation undermines traditional signature-based detection, as each execution can yield a unique payload. However, the reliance on hardcoded API keys and prompt structures introduces detectable artifacts.
SentinelLABS advocates a two-pronged hunting approach: wide API key detection using deterministic YARA patterns for known key prefixes, and prompt hunting to extract embedded JSON structures that resemble chat-completions requests.
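As a rough illustration of the prompt-hunting prong, the sketch below scans a file for byte sequences that look like chat-completions request bodies. The regex and thresholds are illustrative heuristics, not SentinelLABS’s actual rules.

```python
# Rough prompt-hunting sketch along the lines described above: look for
# embedded JSON fragments resembling chat-completions request bodies, i.e.
# a "messages" array of {"role": ..., "content": ...} objects. The regex
# and size limits are illustrative heuristics only.
import re
import sys

CANDIDATE_RE = re.compile(
    rb'\{[^{}]*"messages"\s*:\s*\[.{0,4096}?\]\s*[,}]', re.DOTALL)

def hunt(path: str):
    data = open(path, "rb").read()
    for m in CANDIDATE_RE.finditer(data):
        blob = m.group(0)
        if b'"role"' in blob and b'"content"' in blob:
            yield blob[:200]  # preview of the suspected prompt structure

if __name__ == "__main__":
    for hit in hunt(sys.argv[1]):
        print(hit.decode(errors="replace"))
```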
Pairing prompt extraction with lightweight LLM classifiers enables rapid triage of high-risk samples. Network analysis can further distinguish malicious LLM traffic from legitimate usage by correlating endpoints with deprecated or revoked API versions.
Despite the novelty of MalTerminal, no evidence of in-the-wild deployment has been found; all samples remain proof-of-concept or red team utilities.
The brittle dependency on commercial LLM services, which require valid API keys and accessible endpoints, presents a narrow window for defenders to refine detection strategies before adversaries adopt more resilient architectures, such as self-hosted models or bespoke inference APIs.
By focusing threat hunting on LLM integration artifacts, security teams can stay ahead of this emerging class of malware.
As adversaries continue to experiment with generative AI, defenders must adapt, developing YARA rules, prompt-hunting pipelines, and behavior-based analysis to counter the next generation of dynamic, AI-driven threats.