Microsoft Copilot, a language model designed to assist with tasks like email processing and document analysis, is vulnerable to prompt injection attacks, where malicious third-party content can manipulate Copilot’s behavior, leading to compromised data integrity and potential availability issues.
For instance, a Word document with carefully crafted instructions can trick Copilot into becoming a scammer. While there is no current fix for prompt injection, vendors mitigate the risk by including disclaimers that warn users about the potential inaccuracy of AI-generated content.
The exploit leverages multiple techniques for successful execution. An attacker sends a malicious email or document, tricking the user into opening it. Once opened, the email or document automatically triggers a tool to read other emails or documents on the system.
ASCII Smuggling is used to secretly stage data for exfiltration, while hyperlinks are rendered to redirect the user to attacker-controlled domains, and conditional prompt injection may be included to activate the exploit only when a specific user interacts with it using Copilot.
The prompt injection attack successfully manipulated Copilot into performing unauthorized actions. By injecting malicious code into the prompt, the attacker tricked Copilot into searching for sensitive information, such as Slack MFA codes, based on the content of the analyzed emails.
An attacker’s current control over Copilot allows them to invoke additional tools, potentially expanding the scope of accessible data, as the previous zero-click image rendering vulnerability has been patched, preventing direct data exfiltration through images.
To successfully exfiltrate sensitive information, the attacker might explore other avenues, such as manipulating Copilot to generate text-based summaries or responses that contain the desired data or potentially exploiting vulnerabilities in other integrated tools or services.
ASCII Smuggling exploits a vulnerability in Large Language Models (LLMs) by leveraging hidden Unicode characters that mimic standard ASCII yet remain invisible to users.
Attackers can embed data within seemingly normal links. When a user clicks the link, the LLM interprets the hidden data and transmits it to a malicious server, effectively exfiltrating information without the user’s knowledge.
The malicious email contains a prompt injection payload that teaches Copilot how to perform ASCII smuggling, which includes an in-context learning example demonstrating Unicode encoding to hide the text “hello, today is a good day” within a link.
The payload can also be hidden using white font, invisible tags, or other techniques. This exploit is not limited to email and can be executed through document sharing or RAG retrieval.
Researchers at Embrace The Red discovered a prompt injection vulnerability in Microsoft Copilot that could be exploited to render hyperlinks and execute arbitrary code, which was used to exfiltrate sensitive enterprise data by crafting malicious prompts that triggered the execution of unintended actions.
After reporting the vulnerability, Microsoft acknowledged and addressed the issue, but the specific mitigation techniques employed remain undisclosed, which highlights the importance of robust security measures to prevent prompt injection attacks in AI-powered systems.