MalTerminal Powered by GPT-4 Generates Sophisticated Ransomware

Security researchers from SentinelLabs have uncovered what may be the earliest known example of LLM-enabled malware, dubbed MalTerminal, capable of dynamically generating ransomware and reverse shell code using OpenAI’s GPT-4 model.

Unlike classical malware, which embeds malicious logic directly in code, this new breed leverages large language models (LLMs) to construct payloads at runtime, posing unprecedented challenges for detection and threat hunting.

MalTerminal surfaced during a broad study of LLM-driven threats, in which researchers identified malicious binaries and Python scripts that hardcode OpenAI API keys and carefully engineered prompts.

These discoveries align with similar offensive tools, including APT28’s LameHug (PROMPTSTEAL) and PromptLock ransomware, which show how adversaries are experimenting with embedding LLM functionality into active payloads.

The pivotal difference lies in runtime adaptability. Traditional malware signatures become unreliable, since generated payloads may differ across executions and environments.

Dynamic detection also suffers, as LLM logic adapts its behavior depending on the contextual system information fed to the model. This ability to build malicious commands or scripts on demand is what makes LLM-enabled malware a “detection engineer’s nightmare.”

MalTerminal: Proof-of-Concept Malware Using GPT-4

Researchers found several components associated with MalTerminal.

The main Windows binary (MalTerminal.exe) was produced by a Python-to-EXE compilation and references an earlier script (C:\Users\Public\Proj\MalTerminal.py). Supporting artifacts such as testAPI.py allowed operators to select either “Ransomware” or “Reverse Shell” functionality, with both payloads generated live via GPT-4’s chat-completion API.

Notably, MalTerminal references a GPT-4 API endpoint that was deprecated in November 2023, indicating active development before that date. This makes MalTerminal the earliest known effort to embed GPT-powered logic directly into malware.

Other variants, such as TestMal2.py, refined the operator menu, while an unexpected branch consisting of TestMal3.py and Defe.py implemented a defensive GPT-based malware scanner named “FalconShield.”

Despite significant engineering effort, SentinelLabs did not find evidence of wide-scale deployment or monetization of MalTerminal, suggesting it may have been a red-team proof-of-concept or research project. Still, its design demonstrates how threat actors could operationalize LLMs for scalable, polymorphic offensive campaigns.

Technically, the malware embeds crafted prompts instructing GPT-4 to generate ransomware capable of file encryption or to produce a functional reverse shell for remote control.

Dependence on API keys and fixed client libraries introduces weaknesses: revoked or blacklisted keys can disable the malware outright. However, embedding multiple leaked or stolen keys, as seen in APT28’s PROMPTSTEAL, remains a viable tactic for extending a sample’s operational lifespan.

For defenders, this presents both new hurdles and new opportunities. Because LLM-enabled malware must embed its prompts and API credentials in its code, those artifacts are exposed to analysis.

Researchers successfully hunted fresh samples using YARA rules targeting provider-specific patterns, such as OpenAI’s “T3BlbkFJ” substring or Anthropic’s “sk-ant-api03” prefix. Prompt hunting, which scans binaries and scripts for structured LLM instructions, further exposed operational intent even before execution.
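Those key patterns also lend themselves to a quick triage pass over a sample set. The snippet below is a minimal sketch of the idea rather than SentinelLabs’ actual rules; the directory layout and the pattern list are illustrative assumptions based on the strings mentioned above.

```python
import os
import sys

# Byte patterns characteristic of provider API keys, per the hunting
# approach described above: "T3BlbkFJ" appears in OpenAI keys, and
# "sk-ant-api03" prefixes Anthropic keys. (Illustrative list only.)
KEY_PATTERNS = [b"T3BlbkFJ", b"sk-ant-api03"]

def scan_file(path):
    """Return the key-like patterns found in a single file, if any."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        return []
    return [p.decode() for p in KEY_PATTERNS if p in data]

def scan_directory(root):
    """Walk a directory of samples and report files embedding key-like strings."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            hits = scan_file(full)
            if hits:
                print(f"{full}: {', '.join(hits)}")

if __name__ == "__main__":
    scan_directory(sys.argv[1] if len(sys.argv) > 1 else ".")
```

A similar pass using regular expressions for instruction-like strings would approximate the prompt-hunting technique, surfacing embedded LLM prompts before a sample ever runs.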

Though still in experimental stages, MalTerminal underscores a strategic shift in attacker tradecraft. By fusing malware with LLMs, adversaries gain adaptability at the cost of brittle dependencies. For defenders, understanding these patterns is vital to staying ahead of the evolving threat landscape.
