MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits
- URL: http://arxiv.org/abs/2504.03767v2
- Date: Fri, 11 Apr 2025 16:59:05 GMT
- Title: MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits
- Authors: Brandon Radosevich, John Halloran
- Abstract summary: The Model Context Protocol (MCP) is an open protocol that standardizes API calls to large language models (LLMs), data sources, and agentic tools. We show that the current MCP design carries a wide range of security risks for end users. We introduce a safety auditing tool, MCPSafetyScanner, to assess the security of an arbitrary MCP server.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To reduce development overhead and enable seamless integration between potential components comprising any given generative AI application, the Model Context Protocol (MCP) (Anthropic, 2024) has recently been released and subsequently widely adopted. The MCP is an open protocol that standardizes API calls to large language models (LLMs), data sources, and agentic tools. By connecting multiple MCP servers, each defined with a set of tools, resources, and prompts, users are able to define automated workflows fully driven by LLMs. However, we show that the current MCP design carries a wide range of security risks for end users. In particular, we demonstrate that industry-leading LLMs may be coerced into using MCP tools to compromise an AI developer's system through various attacks, such as malicious code execution, remote access control, and credential theft. To proactively mitigate these and related attacks, we introduce a safety auditing tool, MCPSafetyScanner, the first agentic tool to assess the security of an arbitrary MCP server. MCPSafetyScanner uses several agents to (a) automatically determine adversarial samples given an MCP server's tools and resources; (b) search for related vulnerabilities and remediations based on those samples; and (c) generate a security report detailing all findings. Our work highlights serious security issues with general-purpose agentic workflows while also providing a proactive tool to audit MCP server safety and address detected vulnerabilities before deployment. The described MCP server auditing tool, MCPSafetyScanner, is freely available at: https://github.com/johnhalloran321/mcpSafetyScanner
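The three audit stages (a) to (c) named in the abstract can be pictured with a minimal Python sketch. This is not the MCPSafetyScanner implementation: the abstract describes LLM-driven agents, whereas the sketch substitutes a keyword lookup so that it runs offline. Every name in it (`Tool`, `Finding`, `KNOWLEDGE_BASE`, `propose_samples`, `lookup_findings`, `render_report`) is hypothetical, and the remediation strings are illustrative assumptions; only the three risk categories come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    """Hypothetical description of one tool exposed by an MCP server."""
    name: str
    description: str


@dataclass
class Finding:
    tool: str
    sample: str
    risk: str
    remediation: str


# Toy stand-in for the vulnerability/remediation search of stage (b).
# Keys are keywords matched against tool text; values are (risk, remediation).
# Risk names follow the abstract; remediations are illustrative assumptions.
KNOWLEDGE_BASE = {
    "shell": ("malicious code execution", "restrict commands to an allowlist"),
    "file": ("credential theft", "confine access to a dedicated directory"),
    "ssh": ("remote access control", "block writes to SSH key files"),
}


def propose_samples(tools):
    """Stage (a): derive (tool name, keyword) probes from tool metadata."""
    samples = []
    for tool in tools:
        text = f"{tool.name} {tool.description}".lower()
        for keyword in KNOWLEDGE_BASE:
            if keyword in text:
                samples.append((tool.name, keyword))
    return samples


def lookup_findings(samples):
    """Stage (b): map each probe to a known risk and a remediation."""
    findings = []
    for tool_name, keyword in samples:
        risk, remediation = KNOWLEDGE_BASE[keyword]
        sample = f"probe '{tool_name}' for {keyword}-related misuse"
        findings.append(Finding(tool_name, sample, risk, remediation))
    return findings


def render_report(findings):
    """Stage (c): produce a plain-text report listing all findings."""
    lines = [f"Security report: {len(findings)} finding(s)"]
    for f in findings:
        lines.append(f"- {f.tool}: {f.risk} ({f.sample}); fix: {f.remediation}")
    return "\n".join(lines)


if __name__ == "__main__":
    server_tools = [
        Tool("run_shell", "Execute a shell command"),
        Tool("read_file", "Read a file from disk"),
        Tool("add", "Add two numbers"),
    ]
    print(render_report(lookup_findings(propose_samples(server_tools))))
```

Keeping the stages as separate functions mirrors the division of labour the abstract describes, so each stage could in principle be swapped for an agent without touching the others. The substring matching is deliberately crude and would misfire on descriptions such as "profile".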
Related papers
- Securing GenAI Multi-Agent Systems Against Tool Squatting: A Zero Trust Registry-Based Approach [0.0]
This paper analyzes tool squatting threats within the context of emerging interoperability standards.
It introduces a comprehensive Tool Registry system designed to mitigate these risks.
Based on its design principles, the proposed registry framework aims to effectively prevent common tool squatting vectors.
arXiv Detail & Related papers (2025-04-28T16:22:21Z)
- DoomArena: A framework for Testing AI Agents Against Evolving Security Threats [84.94654617852322]
We present DoomArena, a security evaluation framework for AI agents.
It is a plug-in framework and integrates easily into realistic agentic frameworks.
It is modular and decouples the development of attacks from details of the environment in which the agent is deployed.
arXiv Detail & Related papers (2025-04-18T20:36:10Z)
- MCP Guardian: A Security-First Layer for Safeguarding MCP-Based AI System [0.0]
We present MCP Guardian, a framework that strengthens MCP-based communication with authentication, rate-limiting, logging, tracing, and Web Application Firewall (WAF) scanning.
Our approach fosters secure, scalable data access for AI assistants, underscoring the importance of a defense-in-depth approach.
arXiv Detail & Related papers (2025-04-17T08:49:10Z)
- Progent: Programmable Privilege Control for LLM Agents [46.49787947705293]
We introduce Progent, the first privilege control mechanism for LLM agents.
At its core is a domain-specific language for flexibly expressing privilege control policies applied during agent execution.
This enables agent developers and users to craft suitable policies for their specific use cases and enforce them deterministically to guarantee security.
arXiv Detail & Related papers (2025-04-16T01:58:40Z)
- MCP Bridge: A Lightweight, LLM-Agnostic RESTful Proxy for Model Context Protocol Servers [0.5266869303483376]
MCP Bridge is a lightweight proxy that connects to multiple MCP servers and exposes their capabilities through a unified API.
The system implements a risk-based execution model with three security levels (standard execution, confirmation, and Docker isolation) while maintaining backward compatibility with standard MCP clients.
arXiv Detail & Related papers (2025-04-11T22:19:48Z)
- Les Dissonances: Cross-Tool Harvesting and Polluting in Multi-Tool Empowered LLM Agents [15.15485816037418]
We present the first systematic security analysis of task control flows in multi-tool-enabled LLM agents. We identify a novel threat, Cross-Tool Harvesting and Polluting (XTHP), which includes multiple attack vectors. To understand the impact of this threat, we developed Chord, a dynamic scanning tool designed to automatically detect real-world agent tools susceptible to XTHP attacks.
arXiv Detail & Related papers (2025-04-04T01:41:06Z)
- Defeating Prompt Injections by Design [79.00910871948787]
CaMeL is a robust defense that creates a protective system layer around the Large Language Models (LLMs). To operate, CaMeL explicitly extracts the control and data flows from the (trusted) query. We demonstrate the effectiveness of CaMeL by solving 67% of tasks with provable security in AgentDojo [NeurIPS 2024], a recent agentic security benchmark.
arXiv Detail & Related papers (2025-03-24T15:54:10Z)
- Multi-Agent Systems Execute Arbitrary Malicious Code [9.200635465485067]
We show that adversarial content can hijack control and communication within the system to invoke unsafe agents and functionalities. We show that control-flow hijacking attacks succeed even if the individual agents are not susceptible to direct or indirect prompt injection.
arXiv Detail & Related papers (2025-03-15T16:16:08Z)
- Automating Prompt Leakage Attacks on Large Language Models Using Agentic Approach [9.483655213280738]
This paper presents a novel approach to evaluating the security of large language models (LLMs). We define prompt leakage as a critical threat to secure LLM deployment. We implement a multi-agent system where cooperative agents are tasked with probing and exploiting the target LLM to elicit its prompt.
arXiv Detail & Related papers (2025-02-18T08:17:32Z)
- Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks [88.84977282952602]
A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs). In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents. We conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities.
arXiv Detail & Related papers (2025-02-12T17:19:36Z)
- AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents [84.96249955105777]
LLM agents may pose a greater risk if misused, but their robustness remains underexplored.
We propose a new benchmark called AgentHarm to facilitate research on LLM agent misuse.
We find leading LLMs are surprisingly compliant with malicious agent requests without jailbreaking.
arXiv Detail & Related papers (2024-10-11T17:39:22Z)
- Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents [47.219047422240145]
We take the first step to investigate one of the typical safety threats, backdoor attack, to LLM-based agents.
Specifically, compared with traditional backdoor attacks on LLMs that are only able to manipulate the user inputs and model outputs, agent backdoor attacks exhibit more diverse and covert forms.
arXiv Detail & Related papers (2024-02-17T06:48:45Z)
- A Survey and Comparative Analysis of Security Properties of CAN Authentication Protocols [92.81385447582882]
The Controller Area Network (CAN) bus leaves in-vehicle communications inherently non-secure.
This paper reviews and compares the 15 most prominent authentication protocols for the CAN bus.
We evaluate protocols based on essential operational criteria that contribute to ease of implementation.
arXiv Detail & Related papers (2024-01-19T14:52:04Z)
- RatGPT: Turning online LLMs into Proxies for Malware Attacks [0.0]
We present a proof-of-concept where ChatGPT is used for the dissemination of malicious software while evading detection.
We also present the general approach as well as essential elements in order to stay undetected and make the attack a success.
arXiv Detail & Related papers (2023-08-17T20:54:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.