Fugu-MT paper translation (abstract): Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation

Paper overview: Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation

arxiv url: http://arxiv.org/abs/2605.12975v1
Date: Wed, 13 May 2026 04:14:13 GMT
Status: translation complete
System update date: 2026-05-14 23:30:27.804139
Title: Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation
Authors: Jiashuo Sun, Jimeng Shi, Yixuan Xie, Saizhuo Wang, Jash Rajesh Parekh, Pengcheng Jiang, Zhiyi Shi, Jiajun Fan, Qinglong Zheng, Peiran Li, Shaowen Wang, Ge Liu, Jiawei Han
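Abstract summary: PyRAG is a framework that reformulates multi-hop RAG as program synthesis and execution. It consistently outperforms strong baselines in both training-free and RL-trained settings.
Reference score (attention metric computed independently by this site): 26.17880287280065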
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Retrieval-Augmented Generation (RAG) has become a standard approach for knowledge-intensive question answering, but existing systems remain brittle on multi-hop questions, where solving the task requires chaining multiple retrieval and reasoning steps. Key challenges are that current methods represent reasoning through free-form natural language, where intermediate states are implicit, retrieval queries can drift from intended entities, and errors are detected by the same model that produces them, making self-reflection an unreliable, ungrounded signal. We observe that multi-hop question answering is a typical form of step-by-step computation, and that this structured process aligns closely with how code-specialized language models are trained to operate. Motivated by this, we introduce PyRAG, a framework that reformulates multi-hop RAG as program synthesis and execution. Instead of free-form reasoning trajectories, PyRAG represents the reasoning process as an executable Python program over retrieval and QA tools, exposing intermediate states as variables, producing deterministic feedback through execution, and yielding an inspectable trace of the entire reasoning process. This formulation further enables compiler-grounded self-repair and execution-driven adaptive retrieval without any additional training. Experiments on five QA benchmarks (PopQA, HotpotQA, 2WikiMultihopQA, MuSiQue, and Bamboogle) show that PyRAG consistently outperforms strong baselines under both training-free and RL-trained settings, with especially large gains on compositional multi-hop datasets. Our code, data and models are publicly available at https://github.com/GasolSun36/PyRAG.
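The idea of expressing a multi-hop question as an executable program, with execution errors fed back for repair, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the PyRAG repository: the two-entry corpus, the `retrieve` and `qa` stand-ins, the `run_with_repair` loop, and the string-replacing `toy_repair` (which stands where a language model would rewrite the program given the error message).

```python
from typing import Callable, Optional

# Hypothetical toy corpus standing in for a real retrieval index.
CORPUS = {
    "Inception": "Inception is a 2010 film directed by Christopher Nolan.",
    "Christopher Nolan": "Christopher Nolan was born in London.",
}

def retrieve(query: str) -> list:
    """Toy retriever: return passages whose title occurs in the query."""
    return [text for title, text in CORPUS.items() if title.lower() in query.lower()]

def qa(passages: list, marker: str) -> str:
    """Toy reader: return the text that follows `marker` in the first matching passage."""
    for passage in passages:
        if marker in passage:
            return passage.split(marker, 1)[1].rstrip(".").strip()
    raise LookupError(f"no passage contains {marker!r}")

# A two-hop question written as a program. Each hop's result is a named
# variable, so intermediate state is explicit. The program deliberately
# contains a bug: `docs_1` is never defined (the variable is `docs1`).
BUGGY_PROGRAM = '''
docs1 = retrieve("Who directed Inception?")
director = qa(docs_1, "directed by")
docs2 = retrieve(f"Where was {director} born?")
answer = qa(docs2, "born in")
'''

def execute(source: str) -> tuple:
    """Run the program; return (variables, error message or None)."""
    env = {"retrieve": retrieve, "qa": qa}
    try:
        exec(source, env)
        return env, None
    except Exception as exc:  # the interpreter's error is the feedback signal
        return env, f"{type(exc).__name__}: {exc}"

def run_with_repair(source: str,
                    repair: Callable[[str, str], str],
                    max_attempts: int = 3) -> tuple:
    """Execute, and on failure pass the error to `repair` for a revised program."""
    errors = []
    for _ in range(max_attempts):
        env, error = execute(source)
        if error is None:
            return env, errors
        errors.append(error)
        source = repair(source, error)
    raise RuntimeError(f"program still failing after {max_attempts} attempts: {errors}")

def toy_repair(source: str, error: str) -> str:
    """Placeholder for a model-driven fix; here it only renames the bad variable."""
    return source.replace("docs_1", "docs1")

result, errors = run_with_repair(BUGGY_PROGRAM, toy_repair)
print(result["director"], "->", result["answer"])
print("errors seen:", errors)
```

In this sketch the first execution fails with a `NameError`, the error string is handed to the repair step, and the second execution completes both hops. The point of the design is that the failure signal comes from the interpreter rather than from the model judging its own output.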

Related papers