A recent security report has revealed the exploitation of Anthropic’s Claude AI models in a new wave of coordinated influence-as-a-service campaigns, highlighting the evolving threat posed by AI-enabled manipulation.
These campaigns illustrate a significant advancement in how adversarial actors are leveraging large language models (LLMs) to orchestrate and automate social media influence operations at scale, manipulating political discourse and conducting other forms of cyber-enabled abuse.
AI-Driven Social Manipulation Emerges as Sophisticated New Threat
At the center of the report is the exposure of a professionally operated influence-as-a-service scheme utilizing Claude for both content generation and operational planning.
This marks a departure from earlier models of AI misuse: actors are no longer content with merely producing persuasive text. Instead, they are leveraging AI as an orchestrator that decides when and how bot accounts should engage with authentic users, based on tailored, politically motivated personas.
The orchestrated actions included liking, commenting, and sharing social media posts across multiple platforms, with bots maintaining consistent, distinct persona narratives in line with the specific political objectives of paying clients across several countries.
The botnet infrastructure controlled by the threat actors managed over 100 social media accounts, primarily on platforms such as Twitter/X and Facebook.
Each account was equipped with a nuanced political alignment and engaged tens of thousands of legitimate users, amplifying political narratives in a manner highly reminiscent of state-affiliated campaigns.
The campaign’s operational focus was on long-term, sustained engagement rather than viral spikes, suggesting a strategic approach favoring gradual narrative shaping over overt mass manipulation.
Beyond influence operations, the report details additional abuse vectors. These include credential stuffing attempts involving Claude-assisted automation for collecting and testing leaked passwords on internet-connected security cameras, as well as recruitment fraud campaigns.
In the latter, scammers used Claude to refine and sanitize their communication in real-time, making false job offers to Eastern European job seekers appear more professional and convincing.
In another case, a novice threat actor leveraged Claude’s capabilities to rapidly build sophisticated malware and doxing tools, effectively flattening the technical learning curve required for cybercrime.
Automation of Multi-Platform Botnets Raises Bar for Influence Operations
Anthropic’s response to these abuses involved account bans and the rapid deployment of enhanced detection methodologies.
Notably, the company utilized advanced analytical frameworks such as hierarchical summarization and conversation data clustering, coupled with robust input/output classifiers, to identify and counter emerging patterns of misuse.
Each discovered case directly informed iterative improvements to the company’s security controls and model guardrails.
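The conversation-clustering step can be illustrated with a toy sketch. Anthropic has not published its actual detection pipeline, so everything here is an assumption for illustration: conversations are reduced to short summaries, and near-duplicate summaries (e.g., repeated persona-generation prompts) are grouped by simple token overlap so that clusters of coordinated activity stand out for review.

```python
# Illustrative sketch only -- Anthropic's real pipeline (hierarchical
# summarization, input/output classifiers) is not public. Features and
# the similarity threshold below are assumptions, not the actual system.

def jaccard(a: set, b: set) -> float:
    """Token-set similarity between two conversation summaries."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_summaries(summaries, threshold=0.5):
    """Greedy single-pass clustering: each summary joins the first
    cluster whose representative token set is similar enough,
    otherwise it starts a new cluster."""
    clusters = []  # list of (representative_tokens, member_indices)
    for i, text in enumerate(summaries):
        tokens = set(text.lower().split())
        for rep, members in clusters:
            if jaccard(tokens, rep) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((tokens, [i]))
    return [members for _, members in clusters]

summaries = [
    "generate persona reply praising candidate X",
    "generate persona reply praising candidate X again",
    "summarize quarterly sales report",
]
# The two near-duplicate persona prompts cluster together;
# the unrelated request forms its own cluster.
print(cluster_summaries(summaries))  # -> [[0, 1], [2]]
```

A production system would use embeddings and scalable clustering rather than raw token overlap, but the design idea is the same: large clusters of highly similar, persona-driven requests are a signal of automated, coordinated use worth escalating to human review.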
The report underscores critical trends: adversaries are increasingly using frontier AI models to semi-autonomously operate complex abuse infrastructures, and generative AI is accelerating the skill acquisition of less technical actors, democratizing access to cyber offensive capabilities.
While there was no confirmation of successful real-world impact in these particular cases, the evolving threat landscape signals a pressing need for continuous innovation in AI safety, cross-sector collaboration, and the deployment of scalable, context-aware detection mechanisms.
Anthropic’s disclosure aims to provide actionable intelligence for the wider AI, security, and research communities as they work to fortify defenses against the growing misuse of generative AI.
The company reiterates its commitment to proactive monitoring and responsible AI deployment, acknowledging that neutralizing adversarial innovation is an ongoing challenge requiring collective vigilance and transparency.