Fugu-MT paper translation (abstract): Open-Source Reproduction and Explainability Analysis of Corrective Retrieval Augmented Generation

Paper overview: Open-Source Reproduction and Explainability Analysis of Corrective Retrieval Augmented Generation

arxiv url: http://arxiv.org/abs/2603.16169v1
Date: Tue, 17 Mar 2026 06:38:00 GMT
Status: translation complete
Last updated in system: 2026-03-18 17:42:07.134814
Title: Open-Source Reproduction and Explainability Analysis of Corrective Retrieval Augmented Generation
Authors: Surya Vardhan Yalavarthi
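Abstract summary: We present a fully open-source reproduction of Corrective Retrieval Augmented Generation (CRAG). CRAG improves the robustness of RAG systems by evaluating the quality of retrieved documents and triggering corrective actions. We replace the proprietary web search with the Wikipedia API and the original LLaMA-2 generator with Phi-3-mini-4k-instruct.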
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Corrective Retrieval Augmented Generation (CRAG) improves the robustness of RAG systems by evaluating retrieved document quality and triggering corrective actions. However, the original implementation relies on proprietary components including the Google Search API and closed model weights, limiting reproducibility. In this work, we present a fully open-source reproduction of CRAG, replacing proprietary web search with the Wikipedia API and the original LLaMA-2 generator with Phi-3-mini-4k-instruct. We evaluate on PopQA and ARC-Challenge, demonstrating that our open-source pipeline achieves comparable performance to the original system. Furthermore, we contribute the first explainability analysis of CRAG's T5-based retrieval evaluator using SHAP, revealing that the evaluator primarily relies on named entity alignment rather than semantic similarity. Our analysis identifies key failure modes including domain transfer limitations on science questions. All code and results are available at https://github.com/suryayalavarthi/crag-reproduction.
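The evaluate-then-correct loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the three-way routing, the threshold values, the action names, `toy_score` (a word-overlap stand-in for the T5-based retrieval evaluator) and `fallback_search` (a stand-in for a Wikipedia API lookup) are all hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical thresholds; the abstract does not state the values used.
UPPER = 0.5
LOWER = -0.5


@dataclass
class Decision:
    action: str           # "use_retrieved", "fallback_search", or "combine"
    documents: List[str]  # documents handed to the generator


def toy_score(question: str, doc: str) -> float:
    """Stand-in evaluator: word overlap mapped to [-1, 1] (NOT the T5 model)."""
    q_words = set(question.lower().split())
    d_words = set(doc.lower().split())
    if not q_words:
        return -1.0
    return 2 * len(q_words & d_words) / len(q_words) - 1


def route(
    question: str,
    retrieved: List[str],
    score: Callable[[str, str], float],
    fallback_search: Callable[[str], List[str]],
    upper: float = UPPER,
    lower: float = LOWER,
) -> Decision:
    """Score each retrieved document and pick a corrective action."""
    scores = [score(question, d) for d in retrieved]

    # At least one document looks relevant: keep the confident ones.
    confident = [d for d, s in zip(retrieved, scores) if s >= upper]
    if confident:
        return Decision("use_retrieved", confident)

    # Every document looks irrelevant (or nothing was retrieved): search instead.
    if all(s <= lower for s in scores):
        return Decision("fallback_search", fallback_search(question))

    # Otherwise the evidence is mixed: keep the borderline documents
    # and add search results alongside them.
    borderline = [d for d, s in zip(retrieved, scores) if s > lower]
    return Decision("combine", borderline + fallback_search(question))


def fake_search(question: str) -> List[str]:
    """Hypothetical placeholder for an external search such as a Wikipedia lookup."""
    return ["search result for: " + question]
```

In a real pipeline, `score` would wrap the trained evaluator and `fallback_search` would call the external search backend; both are passed in as callables here so that either component can be swapped without touching the routing logic.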

Related papers