MCPZoo: A Large-Scale Dataset of Runnable Model Context Protocol Servers for AI Agent
- URL: http://arxiv.org/abs/2512.15144v2
- Date: Thu, 18 Dec 2025 04:40:26 GMT
- Authors: Mengying Wu, Pei Chen, Geng Hong, Baichao An, Jinsong Chen, Binwang Wan, Xudong Pan, Jiarun Dai, Min Yang
- Abstract summary: Model Context Protocol (MCP) enables agents to interact with external tools, yet empirical research on MCP is hindered by the lack of large-scale, accessible datasets. We present MCPZoo, the largest and most comprehensive dataset of MCP servers collected from multiple public sources, comprising 95,142 servers.
- Score: 21.609308232244118
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Model Context Protocol (MCP) enables agents to interact with external tools, yet empirical research on MCP is hindered by the lack of large-scale, accessible datasets. We present MCPZoo, the largest and most comprehensive dataset of MCP servers collected from multiple public sources, comprising 95,142 servers. MCPZoo includes over ten thousand server instances that have been deployed and verified as runnable and interactable, supporting realistic experimentation beyond static analysis. The dataset provides unified metadata and access interfaces, enabling systematic exploration and interaction without manual deployment effort. MCPZoo is released as an open and accessible resource to support research on MCP-based security analysis.
Related papers
- MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools [58.5971352939562]
Large Language Models increasingly rely on external tools to perform complex, realistic tasks. Existing MCP research covers few servers, depends on costly manual curation, and lacks training support. We introduce MCP-Flow, an automated web-agent-driven pipeline for large-scale server discovery, data synthesis, and model training.
arXiv Detail & Related papers (2025-10-28T10:42:17Z)
- Securing AI Agent Execution [5.599792629509229]
We introduce AgentBound, the first access control framework for MCP servers. We build a dataset containing the 296 most popular MCP servers, and show that access control policies can be generated automatically from source code with 80.9% accuracy. We also show that AgentBound blocks the majority of security threats in several malicious MCP servers, and that policy enforcement engine introduces negligible overhead.
arXiv Detail & Related papers (2025-10-24T08:10:36Z)
- LLM-based Multi-Agent Blackboard System for Information Discovery in Data Science [69.1690891731311]
We propose a novel multi-agent communication paradigm inspired by the blackboard architecture for traditional AI models. In this framework, a central agent posts requests to a shared blackboard, and autonomous subordinate agents respond based on their capabilities. We evaluate our method on three benchmarks that require explicit data discovery.
arXiv Detail & Related papers (2025-09-30T22:34:23Z)
- MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers [86.00932417210477]
We introduce MCP-Universe, the first comprehensive benchmark specifically designed to evaluate LLMs in realistic and hard tasks through interaction with real-world MCP servers. Our benchmark encompasses 6 core domains spanning 11 different MCP servers: Location Navigation, Repository Management, Financial Analysis, 3D Design, Browser Automation, and Web Searching. We find that even SOTA models such as GPT-5 (43.72%), Grok-4 (33.33%) and Claude-4.0-Sonnet (29.44%) exhibit significant performance limitations.
arXiv Detail & Related papers (2025-08-20T13:28:58Z)
- LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools? [50.60770039016318]
We present LiveMCPBench, the first comprehensive benchmark for benchmarking Model Context Protocol (MCP) agents. LiveMCPBench consists of 95 real-world tasks grounded in the MCP ecosystem. Our evaluation covers 10 leading models, with the best-performing model reaching a 78.95% success rate.
arXiv Detail & Related papers (2025-08-03T14:36:42Z)
- MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models [76.72220653705679]
We introduce MCPEval, an open-source framework that automates end-to-end task generation and deep evaluation of intelligent agents. MCPEval standardizes metrics, seamlessly integrates with native agent tools, and eliminates manual effort in building evaluation pipelines. Empirical results across five real-world domains show its effectiveness in revealing nuanced, domain-specific performance.
arXiv Detail & Related papers (2025-07-17T05:46:27Z)
- We Urgently Need Privilege Management in MCP: A Measurement of API Usage in MCP Ecosystems [28.59170303701817]
We conduct the first large-scale empirical analysis of Model Context Protocol security risks. We examine 2,562 real-world MCP applications spanning 23 functional categories. We propose a detailed taxonomy of MCP resource access, quantify security-relevant API usage, and identify open challenges for building safer MCP ecosystems.
arXiv Detail & Related papers (2025-07-05T03:39:30Z)
- A Large-Scale Evolvable Dataset for Model Context Protocol Ecosystem and Security Analysis [8.943261888363622]
We introduce MCPCorpus, a large-scale dataset containing around 14K MCP servers and 300 MCP clients. Each artifact is annotated with 20+ normalized attributes capturing its identity, interface configuration, GitHub activity, and metadata. MCPCorpus provides a reproducible snapshot of the real-world MCP ecosystem, enabling studies of adoption trends, ecosystem health, and implementation diversity.
arXiv Detail & Related papers (2025-06-30T02:37:27Z)
- MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits [0.0]
The Model Context Protocol (MCP) is an open protocol that standardizes API calls to large language models (LLMs), data sources, and agentic tools. We show that the current MCP design carries a wide range of security risks for end users. We introduce a safety auditing tool, MCPSafetyScanner, to assess the security of an arbitrary MCP server.
arXiv Detail & Related papers (2025-04-02T21:46:02Z)
- Towards Human-Guided, Data-Centric LLM Co-Pilots [53.35493881390917]
CliMB-DC is a human-guided, data-centric framework for machine learning co-pilots. It combines advanced data-centric tools with LLM-driven reasoning to enable robust, context-aware data processing. We show how CliMB-DC can transform uncurated datasets into ML-ready formats.
arXiv Detail & Related papers (2025-01-17T17:51:22Z)