Weaponizing Microsoft Copilot for Cyberattacks

Aug 27, 2024

2 min read

3

97

0

Security researcher Michael Bargury, co-founder and CTO of Zenity, recently demonstrated at Black Hat USA how threat actors can exploit Microsoft Copilot, an AI-based chatbot, for malicious purposes. Bargury released a "LOLCopilot" ethical hacking module to show how attackers can weaponize Copilot and offered advice for defensive tooling.

Copilot for security — Source - microsoft.com

Prompt Injection Attacks Bargury explained that Copilot, like other chatbots, is susceptible to prompt injections that enable hackers to evade its security controls. There are two types of prompt injections:

Direct prompt injection (jailbreak): The attacker manipulates the LLM prompt to alter its output.
Indirect prompt injection: The attacker modifies the data sources accessed by the model.

Using the LOLCopilot tool, Bargury demonstrated how attackers can add a direct prompt injection to a Copilot, jailbreaking it and modifying parameters or instructions within the model. This allows them to manipulate everything Copilot does on the victim's behalf, including the responses it provides, actions it performs, and taking full control of the conversation undetected.

Remote "Copilot" Execution (RCE) Attacks -

Bargury described Copilot prompt injections as equivalent to remote code execution (RCE) attacks. While Copilots don't run code, they follow instructions, perform operations, and create compositions from those actions. The attacker can enter the conversation from the outside and take full control of all the actions the Copilot does on the victim's behalf.

During the session, Bargury demonstrated RCE attacks where the attacker:

Manipulates a Copilot to alter a victim's vendor banking information to steal funds
Exfiltrates data in advance of an earnings report to trade on that information
Makes Copilot a malicious insider that directs users to a phishing site to harvest credentials

Microsoft's Response and Defensive Tooling -

Microsoft has addressed newly surfaced research about prompt injections and released tools like Prompt Shields, an API designed to detect direct and indirect prompt injection attacks. Other new tools include Groundedness Detection to detect hallucinations in LLM outputs, Safety Evaluation to detect application susceptibility to jailbreak attacks, and PyRIT (Python Risk Identification Toolkit) for generative AI risk discovery.

However, Bargury believes more tools are needed to scan for "promptware" (hidden instructions and untrusted data) in AI models. He notes that existing tools like Microsoft Defender and Purview lack these capabilities, and a surgical payload can evade detection.

While Microsoft is working hard to address AI risks, Bargury's research has found 10 different security mechanisms inside Microsoft Copilot to scan everything that goes into and out of the system. He regularly communicates with Microsoft's red team and believes they are aware of the risks associated with AI and are moving aggressively to address them.

References -