Related papers: M$^3$Prune: Hierarchical Communication Graph Pruning for Efficient Multi-Modal Multi-Agent Retrieval-Augmented Generation

URL: http://arxiv.org/abs/2511.19969v1
Date: Tue, 25 Nov 2025 06:29:13 GMT
Title: M$^3$Prune: Hierarchical Communication Graph Pruning for Efficient Multi-Modal Multi-Agent Retrieval-Augmented Generation
Authors: Weizi Shao, Taolin Zhang, Zijie Zhou, Chen Chen, Chengyu Wang, Xiaofeng He
Abstract summary: We propose a novel Multi-Modal Multi-agent hierarchical communication graph PRUNING framework, termed M$^3$Prune. Our framework eliminates redundant edges across different modalities, achieving an optimal balance between task performance and token overhead. Our method consistently outperforms both single-agent and robust multi-agent mRAG systems.
Score: 18.091284320771006
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advancements in multi-modal retrieval-augmented generation (mRAG), which enhance multi-modal large language models (MLLMs) with external knowledge, have demonstrated that the collective intelligence of multiple agents can significantly outperform a single model through effective communication. Despite impressive performance, existing multi-agent systems inherently incur substantial token overhead and increased computational costs, posing challenges for large-scale deployment. To address these issues, we propose a novel Multi-Modal Multi-agent hierarchical communication graph PRUNING framework, termed M$^3$Prune. Our framework eliminates redundant edges across different modalities, achieving an optimal balance between task performance and token overhead. Specifically, M$^3$Prune first applies intra-modal graph sparsification to textual and visual modalities, identifying the edges most critical for solving the task. Subsequently, we construct a dynamic communication topology using these key edges for inter-modal graph sparsification. Finally, we progressively prune redundant edges to obtain a more efficient and hierarchical topology. Extensive experiments on both general and domain-specific mRAG benchmarks demonstrate that our method consistently outperforms both single-agent and robust multi-agent mRAG systems while significantly reducing token consumption.
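The three stages named in the abstract (intra-modal sparsification, inter-modal sparsification over the surviving key edges, then progressive pruning) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how edges are scored or selected, so the fixed edge scores, the top-k selection rule, the "drop the weakest edge each round" rule, and every name below (`top_k_edges`, `hierarchical_prune`, `k_intra`, `k_inter`, `rounds`) are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source agent, target agent); agent names are hypothetical


def top_k_edges(scores: Dict[Edge, float], k: int) -> List[Edge]:
    """Return the k highest-scoring edges; ties are broken by edge name."""
    return sorted(scores, key=lambda e: (-scores[e], e))[:k]


def hierarchical_prune(
    text_scores: Dict[Edge, float],
    visual_scores: Dict[Edge, float],
    cross_scores: Dict[Edge, float],
    k_intra: int,
    k_inter: int,
    rounds: int,
) -> List[Edge]:
    # Stage 1 (assumed rule): keep the top-k edges inside each modality.
    key_text = top_k_edges(text_scores, k_intra)
    key_visual = top_k_edges(visual_scores, k_intra)

    # Stage 2 (assumed rule): only cross-modal edges whose endpoints both
    # appear in a surviving intra-modal edge are candidates.
    key_nodes = {node for edge in key_text + key_visual for node in edge}
    candidates = {
        edge: score
        for edge, score in cross_scores.items()
        if edge[0] in key_nodes and edge[1] in key_nodes
    }
    key_cross = top_k_edges(candidates, k_inter)

    # Stage 3 (assumed rule): each round drops the single weakest remaining edge.
    all_scores = {**text_scores, **visual_scores, **cross_scores}
    kept = sorted(
        key_text + key_visual + key_cross,
        key=lambda e: (-all_scores[e], e),
    )
    for _ in range(rounds):
        if len(kept) <= 1:
            break
        kept = kept[:-1]
    return kept


# Made-up scores for three textual agents (t*) and three visual agents (v*).
text_scores = {("t1", "t2"): 0.9, ("t2", "t3"): 0.2, ("t1", "t3"): 0.6}
visual_scores = {("v1", "v2"): 0.8, ("v2", "v3"): 0.1}
cross_scores = {("t1", "v1"): 0.7, ("t3", "v3"): 0.95, ("t2", "v2"): 0.3}

pruned = hierarchical_prune(
    text_scores, visual_scores, cross_scores, k_intra=1, k_inter=2, rounds=1
)
print(pruned)
```

In this toy run the high-scoring cross-modal edge `("t3", "v3")` is never considered, because neither endpoint survives the intra-modal stage. That is the sense in which the sketch is hierarchical: the cross-modal stage is restricted to what the intra-modal stage kept.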

Related papers

MASPOB: Bandit-Based Prompt Optimization for Multi-Agent Systems with Graph Neural Networks [21.211097851224487]
We introduce MASPOB (Multi-Agent System Prompt Optimization via Bandits), a novel sample-efficient framework based on bandits. To handle topology-induced coupling, MASPOB integrates Graph Neural Networks (GNNs) to capture structural priors, learning topology-aware representations of prompt semantics.
arXiv Detail & Related papers (2026-03-03T05:59:05Z)
Toward Effective Multimodal Graph Foundation Model: A Divide-and-Conquer Based Approach [42.970648490410504]
Multimodal Graph Foundation Models (MGFMs) allow for leveraging the rich multimodal information in Multimodal-Attributed Graphs (MAGs). We propose PLANET, a novel framework employing a Divide-and-Conquer strategy to decouple modality interaction and alignment across distinct granularities. We show that PLANET significantly outperforms state-of-the-art baselines across diverse graph-centric and multimodal generative tasks.
arXiv Detail & Related papers (2026-02-04T01:05:12Z)
OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems. Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm. We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z)
MMRAG-RFT: Two-stage Reinforcement Fine-tuning for Explainable Multi-modal Retrieval-augmented Generation [31.90681057778075]
Multi-modal Retrieval-Augmented Generation (MMRAG) enables highly credible generation by integrating external multi-modal knowledge. Existing MMRAG methods fail to clarify the reasoning logic behind retrieval and response generation.
arXiv Detail & Related papers (2025-12-19T03:19:54Z)
Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models [99.85131798240808]
We introduce a novel generative framework called Guided Topology Diffusion (GTD). Inspired by conditional discrete graph diffusion models, GTD formulates topology synthesis as an iterative construction process. At each step, the generation is steered by a lightweight proxy model that predicts multi-objective rewards. Experiments show that GTD can generate highly task-adaptive, sparse, and efficient communication topologies.
arXiv Detail & Related papers (2025-10-09T05:28:28Z)
AgentRouter: A Knowledge-Graph-Guided LLM Router for Collaborative Multi-Agent Question Answering [51.07491603393163]
AgentRouter is a framework that formulates multi-agent QA as a knowledge-graph-guided routing problem supervised by empirical performance signals. By leveraging soft supervision and weighted aggregation of agent outputs, AgentRouter learns principled collaboration schemes that capture the complementary strengths of diverse agents.
arXiv Detail & Related papers (2025-10-06T23:20:49Z)
Multi-Agent Tool-Integrated Policy Optimization [67.12841355267678]
Large language models (LLMs) increasingly rely on multi-turn tool-integrated planning for knowledge-intensive and complex reasoning tasks. Existing implementations typically rely on a single agent, but they suffer from limited context length and noisy tool responses. No existing methods support effective reinforcement learning post-training of tool-integrated multi-agent frameworks.
arXiv Detail & Related papers (2025-10-06T10:44:04Z)
MAS$^2$: Self-Generative, Self-Configuring, Self-Rectifying Multi-Agent Systems [40.44248136759827]
We introduce MAS$^2$, a multi-agent system that autonomously architects bespoke multi-agent systems. MAS$^2$ achieves performance gains of up to 19.6% over state-of-the-art MAS.
arXiv Detail & Related papers (2025-09-29T06:20:10Z)
Adaptive Graph Pruning for Multi-Agent Communication [14.18447472314079]
Large Language Model (LLM) based multi-agent systems have shown remarkable performance in various tasks. We propose Adaptive Graph Pruning (AGP), a novel task-adaptive multi-agent collaboration framework.
arXiv Detail & Related papers (2025-06-03T14:46:00Z)
Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems [42.137278756052595]
AgentPrune can seamlessly integrate into mainstream multi-agent systems. (I) It integrates seamlessly into existing multi-agent frameworks with a 28.1% to 72.8% token reduction. (III) It successfully defends against two types of agent-based adversarial attacks with a 3.5% to 10.8% performance boost.
arXiv Detail & Related papers (2024-10-03T14:14:31Z)
M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning [90.75075886543404]
Multimodal Large Language Models (MLLMs) demonstrate remarkable performance across a wide range of domains. In this work, we introduce a novel Multimodal Prompt Tuning (M$^2$PT) approach for efficient instruction tuning of MLLMs.
arXiv Detail & Related papers (2024-09-24T01:40:24Z)
Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks. Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment. We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.