Fugu-MT Paper Translation (Summary): MetaRAG: Metamorphic Testing for Hallucination Detection in RAG Systems

Paper summary: MetaRAG: Metamorphic Testing for Hallucination Detection in RAG Systems

arxiv url: http://arxiv.org/abs/2509.09360v1
Date: Thu, 11 Sep 2025 11:18:23 GMT
Status: Translation complete
Last updated in system: 2025-09-12 16:52:24.351798
Title: MetaRAG: Metamorphic Testing for Hallucination Detection in RAG Systems
Title (reference translation): MetaRAG: Metamorphic Testing for Hallucination Detection in RAG Systems
Authors: Channdeth Sok, David Luz, Yacine Haddam
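Abstract summary: This paper proposes MetaRAG, a testing framework for hallucination detection in Retrieval-Augmented Generation (RAG) systems. MetaRAG operates in a real-time, unsupervised, black-box setting, requiring neither ground-truth references nor access to model internals. For identity-aware AI, MetaRAG localizes unsupported claims at the factoid span where they occur.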
Reference score (independently computed attention metric): 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large Language Models (LLMs) are increasingly deployed in enterprise applications, yet their reliability remains limited by hallucinations, i.e., confident but factually incorrect information. Existing detection approaches, such as SelfCheckGPT and MetaQA, primarily target standalone LLMs and do not address the unique challenges of Retrieval-Augmented Generation (RAG) systems, where responses must be consistent with retrieved evidence. We therefore present MetaRAG, a metamorphic testing framework for hallucination detection in Retrieval-Augmented Generation (RAG) systems. MetaRAG operates in a real-time, unsupervised, black-box setting, requiring neither ground-truth references nor access to model internals, making it suitable for proprietary and high-stakes domains. The framework proceeds in four stages: (1) decompose answers into atomic factoids, (2) generate controlled mutations of each factoid using synonym and antonym substitutions, (3) verify each variant against the retrieved context (synonyms are expected to be entailed and antonyms contradicted), and (4) aggregate penalties for inconsistencies into a response-level hallucination score. Crucially for identity-aware AI, MetaRAG localizes unsupported claims at the factoid span where they occur (e.g., pregnancy-specific precautions, LGBTQ+ refugee rights, or labor eligibility), allowing users to see flagged spans and enabling system designers to configure thresholds and guardrails for identity-sensitive queries. Experiments on a proprietary enterprise dataset illustrate the effectiveness of MetaRAG for detecting hallucinations and enabling trustworthy deployment of RAG-based conversational agents. We also outline a topic-based deployment design that translates MetaRAG's span-level scores into identity-aware safeguards; this design is discussed but not evaluated in our experiments.
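Abstract (reference translation): Large language models (LLMs) are increasingly deployed in enterprise applications, but their reliability is limited by hallucinations. Existing detection approaches such as SelfCheckGPT and MetaQA primarily target standalone LLMs and do not address the unique challenges of Retrieval-Augmented Generation (RAG) systems. We therefore propose MetaRAG as a metamorphic testing framework for hallucination detection in RAG systems. MetaRAG operates in a real-time, unsupervised, black-box setting, requires neither ground-truth references nor access to model internals, and is suited to proprietary and high-stakes domains. The framework (1) decomposes answers into atomic factoids, (2) generates controlled mutations of each factoid using synonym and antonym substitutions, (3) verifies each variant against the retrieved context (synonyms are expected to be entailed and antonyms contradicted), and (4) aggregates penalties for inconsistencies into a response-level hallucination score. For identity-aware AI, MetaRAG localizes unsupported claims at the factoid span (e.g., pregnancy-specific precautions, LGBTQ+ refugee rights, or labor eligibility), lets users see flagged spans, and lets system designers configure thresholds and guardrails for identity-sensitive queries. Experiments on a proprietary enterprise dataset illustrate the effectiveness of MetaRAG for detecting hallucinations and enabling trustworthy deployment of RAG-based conversational agents. The paper also outlines a topic-based deployment design that translates MetaRAG's span-level scores into identity-aware safeguards.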
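
The four stages named in the abstract can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the abstract does not give the penalty formula, so the miss-fraction penalty and the mean aggregation are assumptions, as are all names here (`Verifier`, `Lexicon`, `FactoidReport`, `mutate`, `score_response`). Stage 1 (decomposition into factoids) is assumed to have been done by the caller, the synonym/antonym lexicon is hand-written, and the verifier is only an interface that a real system would back with an entailment model or an LLM judge.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

ENTAILED, CONTRADICTED, NEUTRAL = "entailed", "contradicted", "neutral"

# (context, claim) -> one of the three verdicts above. Hypothetical interface;
# a real system might wrap an NLI model or an LLM judge behind it.
Verifier = Callable[[str, str], str]

# word -> (synonym, antonym). A hand-written stand-in for stage 2 (assumption).
Lexicon = Dict[str, Tuple[str, str]]


@dataclass
class FactoidReport:
    factoid: str
    penalty: float  # fraction of variants whose verdict was not the expected one


def substitute(factoid: str, word: str, replacement: str) -> str:
    """Swap whole-token occurrences of `word` in a whitespace-tokenized factoid."""
    return " ".join(replacement if tok == word else tok for tok in factoid.split())


def mutate(factoid: str, lexicon: Lexicon) -> List[Tuple[str, str]]:
    """Stage 2: build (variant, expected verdict) pairs for one factoid."""
    tokens = factoid.split()
    variants = []
    for word, (synonym, antonym) in lexicon.items():
        if word in tokens:
            # A synonym variant should still be entailed by the context,
            # an antonym variant should be contradicted by it.
            variants.append((substitute(factoid, word, synonym), ENTAILED))
            variants.append((substitute(factoid, word, antonym), CONTRADICTED))
    return variants


def score_response(
    factoids: List[str], context: str, verify: Verifier, lexicon: Lexicon
) -> Tuple[float, List[FactoidReport]]:
    """Stages 3-4: verify each variant, then average the per-factoid penalties."""
    reports = []
    for factoid in factoids:
        variants = mutate(factoid, lexicon)
        misses = sum(verify(context, v) != expected for v, expected in variants)
        # Assumption: a factoid with no applicable mutation contributes no penalty.
        penalty = misses / len(variants) if variants else 0.0
        reports.append(FactoidReport(factoid, penalty))
    score = sum(r.penalty for r in reports) / len(reports) if reports else 0.0
    return score, reports
```

The per-factoid reports are what would carry the span-level localization: a caller could flag any factoid whose penalty exceeds a configured threshold. Taking the mean is only one possible aggregation; a stricter deployment might prefer the maximum penalty so that a single unsupported factoid dominates the response-level score.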


Related papers