Enhancing Model Context Protocol (MCP) with Context-Aware Server Collaboration
- URL: http://arxiv.org/abs/2601.11595v1
- Date: Tue, 06 Jan 2026 21:34:08 GMT
- Title: Enhancing Model Context Protocol (MCP) with Context-Aware Server Collaboration
- Authors: Meenakshi Amulya Jayanti, X. Y. Han
- Abstract summary: The Model Context Protocol (MCP) has emerged as a widely used framework for enabling agents to communicate with external tools and services. We present experiments showing that the Context-Aware MCP can outperform the traditional MCP by reducing the number of LLM calls required for complex tasks.
- Score: 0.8594140167290097
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The Model Context Protocol (MCP) has emerged as a widely used framework for enabling LLM-based agents to communicate with external tools and services. The most common implementation of MCP, proposed by Anthropic, heavily relies on a Large Language Model (LLM) to decompose tasks and issue instructions to servers, which act as stateless executors. In particular, the agents, models, and servers are stateless and do not have access to a global context. However, in tasks involving LLM-driven coordination, it is natural that a Shared Context Store (SCS) could improve the efficiency and coherence of multi-agent workflows by reducing redundancy and enabling knowledge transfer between servers. Thus, in this work, we design and assess the performance of a Context-Aware MCP (CA-MCP) that offloads execution logic to specialized MCP servers that read from and write to a shared context memory, allowing them to coordinate more autonomously in real time. In this design, context management serves as the central mechanism that maintains continuity across task executions by tracking intermediate states and shared variables, thereby enabling persistent collaboration among agents without repeated prompting. We present experiments showing that the CA-MCP can outperform the traditional MCP by reducing the number of LLM calls required for complex tasks and decreasing the frequency of response failures when task conditions are not satisfied, thereby improving overall efficiency and responsiveness. In particular, we conducted experiments on the TravelPlanner and REALM-Bench benchmark datasets and observed statistically significant results indicating the potential advantages of incorporating a shared context store via CA-MCP in LLM-driven multi-agent systems.
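The abstract's core mechanism, servers that read from and write to a shared context store so a downstream server can reuse state an upstream server produced without another round of LLM prompting, can be sketched as follows. This is an illustrative assumption of how such a store might look, not the paper's actual implementation; all names here (SharedContextStore, FlightServer, HotelServer, the key names) are hypothetical.

```python
# Minimal sketch of a Shared Context Store (SCS): specialized executor
# servers write intermediate state and read state written by other servers,
# enabling coordination without repeated prompting. Names are illustrative.
from threading import Lock
from typing import Any


class SharedContextStore:
    """Thread-safe key-value store holding intermediate task state."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}
        self._lock = Lock()

    def write(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value

    def read(self, key: str, default: Any = None) -> Any:
        with self._lock:
            return self._data.get(key, default)


class FlightServer:
    """Executor that records its result in the shared context."""

    def __init__(self, scs: SharedContextStore) -> None:
        self.scs = scs

    def book_flight(self, destination: str) -> None:
        # Persist intermediate state for other servers to consume.
        self.scs.write("destination", destination)


class HotelServer:
    """Executor that reuses prior state instead of re-querying the LLM."""

    def __init__(self, scs: SharedContextStore) -> None:
        self.scs = scs

    def book_hotel(self) -> str:
        # Read the destination another server already resolved; failing
        # fast here models the "task conditions not satisfied" case.
        dest = self.scs.read("destination")
        if dest is None:
            raise RuntimeError("precondition unmet: no destination in context")
        return f"hotel booked in {dest}"


scs = SharedContextStore()
FlightServer(scs).book_flight("Tokyo")
print(HotelServer(scs).book_hotel())  # hotel booked in Tokyo
```

In the traditional stateless design, the hotel step would require the LLM to restate the destination in a fresh prompt; here the second server recovers it directly from the store, which is the source of the reduced LLM-call counts the abstract reports.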
Related papers
- From Tool Orchestration to Code Execution: A Study of MCP Design Choices [20.817336331051752]
Model Context Protocols (MCPs) provide a unified platform for agent systems to discover, select, and orchestrate tools across heterogeneous execution environments. Recent MCP designs have incorporated code execution as a first-class capability, an approach called Code Execution MCP (CEMCP). This work formalizes the architectural distinction between context-coupled (traditional) and context-decoupled (CEMCP) models, analyzing their fundamental scalability trade-offs.
arXiv Detail & Related papers (2026-02-17T19:03:08Z) - MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers [17.96465932881902]
We present MCP-SafetyBench, a comprehensive benchmark built on real MCP servers. It incorporates a unified taxonomy of 20 MCP attack types spanning server, host, and user sides. Using MCP-SafetyBench, we systematically evaluate leading open- and closed-source LLMs.
arXiv Detail & Related papers (2025-12-17T08:00:32Z) - Multi-agent In-context Coordination via Decentralized Memory Retrieval [39.106914463842685]
Large transformer models, trained on diverse datasets, have demonstrated impressive few-shot performance on previously unseen tasks. In cooperative Multi-Agent Reinforcement Learning (MARL), where agents must coordinate toward a shared goal, decentralized policy deployment can lead to mismatches in task alignment and reward assignment. We introduce Multi-agent In-context Coordination via Decentralized Memory Retrieval (MAICC), a novel approach designed to enhance coordination by fast adaptation.
arXiv Detail & Related papers (2025-11-13T07:08:31Z) - MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools [58.5971352939562]
Large Language Models increasingly rely on external tools to perform complex, realistic tasks. Existing MCP research covers few servers, depends on costly manual curation, and lacks training support. We introduce MCP-Flow, an automated web-agent-driven pipeline for large-scale server discovery, data synthesis, and model training.
arXiv Detail & Related papers (2025-10-28T10:42:17Z) - Network and Systems Performance Characterization of MCP-Enabled LLM Agents [2.952262068394116]
Model Context Protocol (MCP) has recently gained increased attention within the AI community for providing a standardized way for large language models (LLMs) to interact with external tools and services. This paper presents a measurement-based analysis of MCP-enabled interactions with LLMs, revealing trade-offs between capability, performance, and cost.
arXiv Detail & Related papers (2025-10-20T05:13:47Z) - Multi-Agent Tool-Integrated Policy Optimization [67.12841355267678]
Large language models (LLMs) increasingly rely on multi-turn tool-integrated planning for knowledge-intensive and complex reasoning tasks. Existing implementations typically rely on a single agent, but they suffer from limited context length and noisy tool responses. No existing methods support effective reinforcement learning post-training of tool-integrated multi-agent frameworks.
arXiv Detail & Related papers (2025-10-06T10:44:04Z) - ContextNav: Towards Agentic Multimodal In-Context Learning [85.05420047017513]
ContextNav is an agentic framework that integrates the scalability of automated retrieval with the quality and adaptiveness of human-like curation. It builds a resource-aware multimodal embedding pipeline, maintains a retrievable vector database, and applies agentic retrieval and structural alignment to construct noise-resilient contexts. Experimental results demonstrate that ContextNav achieves state-of-the-art performance across various datasets.
arXiv Detail & Related papers (2025-10-06T07:49:52Z) - MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers [24.6512259539754]
MCP-Bench is a benchmark for evaluating large language models (LLMs) on realistic, multi-step tasks. Built on the Model Context Protocol (MCP), MCP-Bench connects LLMs to 28 representative live MCP servers spanning 250 tools across domains such as finance, traveling, scientific computing, and academic search.
arXiv Detail & Related papers (2025-08-28T05:58:57Z) - MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers [86.00932417210477]
We introduce MCP-Universe, the first comprehensive benchmark specifically designed to evaluate LLMs in realistic and hard tasks through interaction with real-world MCP servers. Our benchmark encompasses 6 core domains spanning 11 different MCP servers: Location Navigation, Repository Management, Financial Analysis, 3D Design, Browser Automation, and Web Searching. We find that even SOTA models such as GPT-5 (43.72%), Grok-4 (33.33%) and Claude-4.0-Sonnet (29.44%) exhibit significant performance limitations.
arXiv Detail & Related papers (2025-08-20T13:28:58Z) - RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory [57.449129198822476]
RCR is a role-aware context routing framework for multi-agent large language model (LLM) systems. It dynamically selects semantically relevant memory subsets for each agent based on its role and task stage. A lightweight scoring policy guides memory selection, and agent outputs are integrated into a shared memory store.
arXiv Detail & Related papers (2025-08-06T21:59:34Z) - Cross-Task Experiential Learning on LLM-based Multi-Agent Collaboration [63.90193684394165]
We introduce multi-agent cross-task experiential learning (MAEL), a novel framework that endows LLM-driven agents with explicit cross-task learning and experience accumulation. During the experiential learning phase, we quantify the quality of each step in the task-solving workflow and store the resulting rewards. During inference, agents retrieve high-reward, task-relevant experiences as few-shot examples to enhance the effectiveness of each reasoning step.
arXiv Detail & Related papers (2025-05-29T07:24:37Z) - Modeling Response Consistency in Multi-Agent LLM Systems: A Comparative Analysis of Shared and Separate Context Approaches [0.0]
We introduce the Response Consistency Index (RCI) as a metric to evaluate the effects of context limitations, noise, and inter-agent dependencies on system performance. Our approach differs from existing research by focusing on the interplay between memory constraints and noise management.
arXiv Detail & Related papers (2025-04-09T21:54:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.