Fugu-MT Paper Translation (Summary): M-Eval: A Heterogeneity-Based Framework for Multi-evidence Validation in Medical RAG Systems

Paper summary: M-Eval: A Heterogeneity-Based Framework for Multi-evidence Validation in Medical RAG Systems

arxiv url: http://arxiv.org/abs/2510.23995v1
Date: Tue, 28 Oct 2025 01:57:40 GMT
Status: Translation complete
Last updated in system: 2025-10-29 15:35:36.716074
Title: M-Eval: A Heterogeneity-Based Framework for Multi-evidence Validation in Medical RAG Systems
Authors: Mengzhou Sun, Sendong Zhao, Jianyu Chen, Haochun Wang, Bin Qin
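Abstract summary: Retrieval-augmented Generation (RAG) has demonstrated potential in enhancing medical question-answering systems. This work can help detect errors in current RAG-based medical systems. It also makes the applications of LLMs more reliable and reduces diagnostic errors.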
Reference score (independently computed attention score): 21.76710595917909
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Retrieval-augmented Generation (RAG) has demonstrated potential in enhancing medical question-answering systems through the integration of large language models (LLMs) with external medical literature. LLMs can retrieve relevant medical articles to generate more professional responses efficiently. However, current RAG applications still face problems. They generate incorrect information, such as hallucinations, and they fail to use external knowledge correctly. To solve these issues, we propose a new method named M-Eval. This method is inspired by the heterogeneity analysis approach used in Evidence-Based Medicine (EBM). Our approach can check for factual errors in RAG responses using evidence from multiple sources. First, we extract additional medical literature from external knowledge bases. Then, we retrieve the evidence documents generated by the RAG system. We use heterogeneity analysis to check whether the evidence supports different viewpoints in the response. In addition to verifying the accuracy of the response, we also assess the reliability of the evidence provided by the RAG system. Our method shows an improvement of up to 23.31% accuracy across various LLMs. This work can help detect errors in current RAG-based medical systems. It also makes the applications of LLMs more reliable and reduces diagnostic errors.
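Illustrative sketch: the abstract does not say which heterogeneity statistic M-Eval computes. In Evidence-Based Medicine, heterogeneity across studies is commonly quantified with Cochran's Q and the I² statistic, so the Python sketch below shows only that standard calculation. The `Evidence` class, the `heterogeneity` function, and the idea of assigning each evidence document a numeric effect and variance are hypothetical choices made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Evidence:
    """Hypothetical container: one evidence document reduced to a number."""
    effect: float    # assumed effect estimate (or support score) for a claim
    variance: float  # assumed sampling variance of that estimate, must be > 0


def heterogeneity(evidence: List[Evidence]) -> Tuple[float, float, float]:
    """Return (pooled_effect, Q, I2) using fixed-effect inverse-variance weights.

    Q is Cochran's Q; I2 is the share of total variation attributed to
    heterogeneity rather than chance, clipped to the range [0, 1).
    """
    if len(evidence) < 2:
        raise ValueError("need at least two pieces of evidence")
    if any(e.variance <= 0 for e in evidence):
        raise ValueError("variances must be positive")

    weights = [1.0 / e.variance for e in evidence]
    pooled = sum(w * e.effect for w, e in zip(weights, evidence)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e.effect - pooled) ** 2 for w, e in zip(weights, evidence))

    # I^2 = max(0, (Q - df) / Q), with df = k - 1.
    df = len(evidence) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i2


if __name__ == "__main__":
    docs = [Evidence(0.0, 0.25), Evidence(1.0, 0.25)]
    pooled, q, i2 = heterogeneity(docs)
    print(f"pooled={pooled:.2f} Q={q:.2f} I2={i2:.2f}")
```

A high I² would suggest that the evidence documents disagree with one another about a claim, which is the kind of signal a multi-evidence validator could flag for review. How M-Eval turns documents into comparable estimates is not described in the abstract.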


Related papers