Fugu-MT Paper Translation (Abstract): Reflective Paper-to-Code Reproduction Enabled by Fine-Grained Verification

Paper overview: Reflective Paper-to-Code Reproduction Enabled by Fine-Grained Verification

arxiv url: http://arxiv.org/abs/2508.16671v1
Date: Thu, 21 Aug 2025 06:57:44 GMT
Status: Translation complete
System update date: 2025-08-26 18:43:45.106368
Title: Reflective Paper-to-Code Reproduction Enabled by Fine-Grained Verification
Title (reference translation): Reflective Paper-to-Code Reproduction via Fine-Grained Verification
Authors: Mingyang Zhou, Quanming Yao, Lun Du, Lanning Wei, Da Zheng,
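Abstract summary: Inspired by how humans use systematic checklists to debug complex code efficiently, we propose RePro, a Reflective Paper-to-Code Reproduction framework. It automatically extracts a paper's fingerprint, a comprehensive set of accurate and atomic criteria that serve as high-quality supervisory signals. RePro achieves a 13.0% performance gap over baselines and correctly revises complex logical and mathematical criteria during reflection.
Reference score (independently computed attention measure): 46.845133190560375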
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reproducing machine learning papers is essential for scientific progress but remains challenging for both humans and automated agents. Existing agent-based methods often struggle to fully and accurately reproduce implementation details such as mathematical formulas and algorithmic logic. Previous studies show that reflection with explicit feedback improves agent performance. However, current paper reproduction methods fail to adopt this strategy effectively. This gap arises mainly from the diverse paper patterns, complex method modules, and varied configurations encountered in research papers. Motivated by how humans use systematic checklists to debug complex code efficiently, we propose RePro, a Reflective Paper-to-Code Reproduction framework that automatically extracts a paper's fingerprint: a comprehensive set of accurate and atomic criteria that serve as high-quality supervisory signals. The framework first generates code based on the extracted information and then leverages the fingerprint within an iterative verification and refinement loop. This approach systematically detects discrepancies and produces targeted revisions to align the generated code with the paper's implementation details. In extensive experiments on the PaperBench Code-Dev benchmark, RePro achieves a 13.0% performance gap over baselines, and during reflection it correctly revises complex logical and mathematical criteria, where its effectiveness is evident.
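Abstract (reference translation): Reproducing machine learning papers is essential for scientific progress, but it remains difficult for both humans and automated agents. Existing agent-based methods often struggle to reproduce implementation details such as mathematical formulas and algorithmic logic completely and accurately. Prior work has shown that reflection with explicit feedback improves agent performance. However, current paper reproduction methods fail to adopt this strategy effectively. This gap arises mainly from the diverse paper patterns, complex method modules, and varied configurations found in research papers. Inspired by how systematic checklists are used to debug complex code efficiently, we propose RePro, a Reflective Paper-to-Code Reproduction framework that automatically extracts a paper's fingerprint. The framework first generates code from the extracted information and then uses the fingerprint within an iterative verification and refinement loop. This approach systematically detects discrepancies and produces targeted revisions that align the generated code with the implementation details. In extensive experiments on the PaperBench Code-Dev benchmark, RePro achieves a 13.0% performance gap over baselines and correctly revises complex logical and mathematical criteria, where its effectiveness is evident.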
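
The verify-then-refine loop described in the abstract can be illustrated with a minimal Python sketch. This is not RePro's implementation: the abstract does not say how criteria are represented, checked, or acted on, so `Criterion`, `verify`, `reflect_and_refine`, and `toy_revise` are hypothetical names, the string-matching checks are toy stand-ins for real verification, and the round limit is an assumed parameter.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Criterion:
    """One atomic, checkable requirement (hypothetical structure)."""
    description: str
    check: Callable[[str], bool]  # True if the code satisfies the criterion


def verify(code: str, fingerprint: List[Criterion]) -> List[Criterion]:
    """Return the criteria that the current code fails."""
    return [c for c in fingerprint if not c.check(code)]


def reflect_and_refine(
    code: str,
    fingerprint: List[Criterion],
    revise: Callable[[str, List[Criterion]], str],
    max_rounds: int = 3,  # assumed limit, not taken from the paper
) -> Tuple[str, List[Criterion]]:
    """Alternate verification and targeted revision until all criteria pass
    or the round limit is reached. Returns the code and any unmet criteria."""
    for _ in range(max_rounds):
        failed = verify(code, fingerprint)
        if not failed:
            break
        code = revise(code, failed)
    return code, verify(code, fingerprint)


# Toy fingerprint: string checks stand in for real verification.
fingerprint = [
    Criterion("uses Adam", lambda code: "Adam" in code),
    Criterion("lr is 1e-3", lambda code: "lr=1e-3" in code),
]


def toy_revise(code: str, failed: List[Criterion]) -> str:
    """Stand-in reviser: applies a fixed textual patch per failed criterion."""
    fixes = {
        "uses Adam": ("SGD", "Adam"),
        "lr is 1e-3": ("lr=1e-2", "lr=1e-3"),
    }
    for criterion in failed:
        old, new = fixes[criterion.description]
        code = code.replace(old, new)
    return code


draft = "optimizer = SGD(params, lr=1e-2)"
final_code, unmet = reflect_and_refine(draft, fingerprint, toy_revise)
print(final_code)
print([c.description for c in unmet])
```

In this toy run the draft fails both criteria, one revision round patches them, and the second verification pass finds nothing left to fix. The point of the sketch is only that atomic criteria give the revision step a specific list of discrepancies to target.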

Related papers