M$^3$Prune: Hierarchical Communication Graph Pruning for Efficient Multi-Modal Multi-Agent Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2511.19969v1
- Date: Tue, 25 Nov 2025 06:29:13 GMT
- Title: M$^3$Prune: Hierarchical Communication Graph Pruning for Efficient Multi-Modal Multi-Agent Retrieval-Augmented Generation
- Authors: Weizi Shao, Taolin Zhang, Zijie Zhou, Chen Chen, Chengyu Wang, Xiaofeng He,
- Abstract summary: We propose a novel Multi-Modal Multi-agent hierarchical communication graph PRUNING framework, termed M$3$Prune.<n>Our framework eliminates redundant edges across different modalities, achieving an optimal balance between task performance and token overhead.<n>Our method consistently outperforms both single-agent and robust multi-agent mRAG systems.
- Score: 18.091284320771006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in multi-modal retrieval-augmented generation (mRAG), which enhance multi-modal large language models (MLLMs) with external knowledge, have demonstrated that the collective intelligence of multiple agents can significantly outperform a single model through effective communication. Despite impressive performance, existing multi-agent systems inherently incur substantial token overhead and increased computational costs, posing challenges for large-scale deployment. To address these issues, we propose a novel Multi-Modal Multi-agent hierarchical communication graph PRUNING framework, termed M$^3$Prune. Our framework eliminates redundant edges across different modalities, achieving an optimal balance between task performance and token overhead. Specifically, M$^3$Prune first applies intra-modal graph sparsification to textual and visual modalities, identifying the edges most critical for solving the task. Subsequently, we construct a dynamic communication topology using these key edges for inter-modal graph sparsification. Finally, we progressively prune redundant edges to obtain a more efficient and hierarchical topology. Extensive experiments on both general and domain-specific mRAG benchmarks demonstrate that our method consistently outperforms both single-agent and robust multi-agent mRAG systems while significantly reducing token consumption.
Related papers
- MASPOB: Bandit-Based Prompt Optimization for Multi-Agent Systems with Graph Neural Networks [21.211097851224487]
We introduce MASPOB (Multi-Agent System Prompt Optimization via Bandits), a novel sample-efficient framework based on bandits.<n>To handle topology-induced coupling, MASPOB integrates Graph Neural Networks (GNNs) to capture structural priors, learning topology-aware representations of prompt semantics.
arXiv Detail & Related papers (2026-03-03T05:59:05Z) - Toward Effective Multimodal Graph Foundation Model: A Divide-and-Conquer Based Approach [42.970648490410504]
Multimodal Graph Foundation Models (MGFMs) allow for leveraging the rich multimodal information in Multimodal-Attributed Graphs (MAGs)<n>We propose PLANET, a novel framework employing a Divide-and-Conquer strategy to decouple modality interaction and alignment across distinct granularities.<n>We show that PLANET significantly outperforms state-of-the-art baselines across diverse graph-centric and multimodal generative tasks.
arXiv Detail & Related papers (2026-02-04T01:05:12Z) - OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems.<n>Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm.<n>We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z) - MMRAG-RFT: Two-stage Reinforcement Fine-tuning for Explainable Multi-modal Retrieval-augmented Generation [31.90681057778075]
Multi-modal Retrieval-Augmented Generation (MMRAG) enables highly credible generation by integrating external multi-modal knowledge.<n>Existing MMRAG methods fail to clarify the reasoning logic behind retrieval and response generation.
arXiv Detail & Related papers (2025-12-19T03:19:54Z) - Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models [99.85131798240808]
We introduce a novel generative framework called textitGuided Topology Diffusion (GTD)<n>Inspired by conditional discrete graph diffusion models, GTD formulates topology synthesis as an iterative construction process.<n>At each step, the generation is steered by a lightweight proxy model that predicts multi-objective rewards.<n>Experiments show that GTD can generate highly task-adaptive, sparse, and efficient communication topologies.
arXiv Detail & Related papers (2025-10-09T05:28:28Z) - AgentRouter: A Knowledge-Graph-Guided LLM Router for Collaborative Multi-Agent Question Answering [51.07491603393163]
tAgent is a framework that formulates multi-agent QA as a knowledge-graph-guided routing problem supervised by empirical performance signals.<n>By leveraging soft supervision and weighted aggregation of agent outputs, Agent learns principled collaboration schemes that capture the complementary strengths of diverse agents.
arXiv Detail & Related papers (2025-10-06T23:20:49Z) - Multi-Agent Tool-Integrated Policy Optimization [67.12841355267678]
Large language models (LLMs) increasingly rely on multi-turn tool-integrated planning for knowledge-intensive and complex reasoning tasks.<n>Existing implementations typically rely on a single agent, but they suffer from limited context length and noisy tool responses.<n>No existing methods support effective reinforcement learning post-training of tool-integrated multi-agent frameworks.
arXiv Detail & Related papers (2025-10-06T10:44:04Z) - MAS$^2$: Self-Generative, Self-Configuring, Self-Rectifying Multi-Agent Systems [40.44248136759827]
We introduce MAS$2$, a multi-agent system that autonomously architects bespoke multi-agent systems.<n> MAS$2$ achieves performance gains of up to $19.6%$ over state-of-the-art MAS.
arXiv Detail & Related papers (2025-09-29T06:20:10Z) - Adaptive Graph Pruning for Multi-Agent Communication [14.18447472314079]
Large Language Model (LLM) based multi-agent systems have shown remarkable performance in various tasks.<n>We propose Adaptive Graph Pruning (AGP), a novel task-adaptive multi-agent collaboration framework.
arXiv Detail & Related papers (2025-06-03T14:46:00Z) - Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems [42.137278756052595]
$texttAgentPrune$ can seamlessly integrate into mainstream multi-agent systems.
textbf(I) integrates seamlessly into existing multi-agent frameworks with $28.1%sim72.8%downarrow$ token reduction.
textbf(III) successfully defend against two types of agent-based adversarial attacks with $3.5%sim10.8%uparrow$ performance boost.
arXiv Detail & Related papers (2024-10-03T14:14:31Z) - M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning [90.75075886543404]
Multimodal Large Language Models (MLLMs) demonstrate remarkable performance across a wide range of domains.
In this work, we introduce a novel Multimodal Prompt Tuning (M$2$PT) approach for efficient instruction tuning of MLLMs.
arXiv Detail & Related papers (2024-09-24T01:40:24Z) - Exploiting Modality-Specific Features For Multi-Modal Manipulation
Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.