Related papers: MME-RAG: Multi-Manager-Expert Retrieval-Augmented Generation for Fine-Grained Entity Recognition in Task-Oriented Dialogues

MME-RAG: Multi-Manager-Expert Retrieval-Augmented Generation for Fine-Grained Entity Recognition in Task-Oriented Dialogues

URL: http://arxiv.org/abs/2511.12213v1
Date: Sat, 15 Nov 2025 13:35:55 GMT
Title: MME-RAG: Multi-Manager-Expert Retrieval-Augmented Generation for Fine-Grained Entity Recognition in Task-Oriented Dialogues
Authors: Liang Xue, Haoyu Liu, Yajun Tian, Xinyu Zhong, Yang Liu,
Abstract summary: MME-RAG is a framework that decomposes entity recognition into two coordinated stages: type-level judgment by lightweight managers and span-level extraction by specialized experts.<n>Each expert is supported by a KeyInfo retriever that injects semantically aligned, few-shot exemplars during inference, enabling precise and domain-adaptive extraction without additional training.<n> Experiments on CrossNER, MIT-Movie, MIT-Restaurant, and our newly constructed multi-domain customer-service dataset demonstrate that MME-RAG performs better than recent baselines in most domains.
Score: 15.237473092649843
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Fine-grained entity recognition is crucial for reasoning and decision-making in task-oriented dialogues, yet current large language models (LLMs) continue to face challenges in domain adaptation and retrieval controllability. We introduce MME-RAG, a Multi-Manager-Expert Retrieval-Augmented Generation framework that decomposes entity recognition into two coordinated stages: type-level judgment by lightweight managers and span-level extraction by specialized experts. Each expert is supported by a KeyInfo retriever that injects semantically aligned, few-shot exemplars during inference, enabling precise and domain-adaptive extraction without additional training. Experiments on CrossNER, MIT-Movie, MIT-Restaurant, and our newly constructed multi-domain customer-service dataset demonstrate that MME-RAG performs better than recent baselines in most domains. Ablation studies further show that both the hierarchical decomposition and KeyInfo-guided retrieval are key drivers of robustness and cross-domain generalization, establishing MME-RAG as a scalable and interpretable solution for adaptive dialogue understanding.

Related papers

Reasoning-Driven Multimodal LLM for Domain Generalization [72.00754603114187]
We study the role of reasoning in domain generalization using DomainBed-Reasoning dataset.<n>We propose RD-MLDG, a framework with two components: MTCT (Multi-Task Cross-Training) and SARR (Self-Aligned Reasoning Regularization)<n>Experiments on standard DomainBed datasets demonstrate that RD-MLDG achieves complementary state-of-the-art performances.
arXiv Detail & Related papers (2026-02-27T08:10:06Z)
MiRAGE: A Multiagent Framework for Generating Multimodal Multihop Question-Answer Dataset for RAG Evaluation [0.3499870393443268]
Existing datasets often rely on general-domain corpora or purely textual retrieval.<n>We introduce MiRAGE, a Multiagent framework for RAG systems Evaluation.<n>MiRAGE orchestrates a swarm of specialized agents to generate verified, domain-specific, multimodal, and multi-hop Question-Answer datasets.
arXiv Detail & Related papers (2026-01-21T21:39:09Z)
Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding [61.36285696607487]
Document understanding is critical for applications from financial analysis to scientific discovery.<n>Current approaches, whether OCR-based pipelines feeding Large Language Models (LLMs) or native Multimodal LLMs (MLLMs) face key limitations.<n>Retrieval-Augmented Generation (RAG) helps ground models in external data, but documents' multimodal nature, combining text, tables, charts, and layout, demands a more advanced paradigm: Multimodal RAG.
arXiv Detail & Related papers (2025-10-17T02:33:16Z)
Domain-Specific Data Generation Framework for RAG Adaptation [58.20906914537952]
Retrieval-Augmented Generation (RAG) combines the language understanding and reasoning power of large language models with external retrieval to enable domain-grounded responses.<n>We propose RAGen, a framework for generating domain-grounded question-answer-context (QAC) triples tailored to diverse RAG adaptation approaches.
arXiv Detail & Related papers (2025-10-13T09:59:49Z)
RAG-IGBench: Innovative Evaluation for RAG-based Interleaved Generation in Open-domain Question Answering [50.42577862494645]
We present RAG-IGBench, a benchmark designed to evaluate the task of Interleaved Generation based on Retrieval-Augmented Generation (RAG-IG) in open-domain question answering.<n>RAG-IG integrates multimodal large language models (MLLMs) with retrieval mechanisms, enabling the models to access external image-text information for generating coherent multimodal content.
arXiv Detail & Related papers (2025-10-11T03:06:39Z)
MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning [36.3918410061572]
MA-RAG addresses the inherent ambiguities and reasoning challenges in complex information-seeking tasks.<n>Unlike conventional RAG methods that rely on end-to-end fine-tuning or isolated component enhancements, MA-RAG orchestrates a collaborative set of specialized AI agents.<n>Our results highlight the effectiveness of collaborative, modular reasoning in retrieval-augmented systems.
arXiv Detail & Related papers (2025-05-26T15:05:18Z)
Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning [88.55095746156428]
Retrieval-augmented generation (RAG) is widely utilized to incorporate external knowledge into large language models.<n>A standard RAG pipeline consists of several components, such as query rewriting, document retrieval, document filtering, and answer generation.<n>We propose treating the complex RAG pipeline with multiple components as a multi-agent cooperative task, in which each component can be regarded as an RL agent.
arXiv Detail & Related papers (2025-01-25T14:24:50Z)
R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models [51.468732121824125]
Large language models have achieved remarkable success on general NLP tasks, but they may fall short for domain-specific problems. Existing evaluation tools only provide a few baselines and evaluate them on various domains without mining the depth of domain knowledge. In this paper, we address the challenges of evaluating RALLMs by introducing the R-Eval toolkit, a Python toolkit designed to streamline the evaluation of different RAGs.
arXiv Detail & Related papers (2024-06-17T15:59:49Z)
Advancing Grounded Multimodal Named Entity Recognition via LLM-Based Reformulation and Box-Based Segmentation [48.47565361014847]
Grounded Multimodal Named Entity Recognition (GMNER) task aims to identify named entities, entity types and their corresponding visual regions.<n>We propose RiVEG, a unified framework that reformulates GMNER into a joint MNER-VE-VG task by leveraging large language models.
arXiv Detail & Related papers (2024-06-11T13:52:29Z)
META: Mimicking Embedding via oThers' Aggregation for Generalizable Person Re-identification [68.39849081353704]
Domain generalizable (DG) person re-identification (ReID) aims to test across unseen domains without access to the target domain data at training time. This paper presents a new approach called Mimicking Embedding via oThers' Aggregation (META) for DG ReID.
arXiv Detail & Related papers (2021-12-16T08:06:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.