Fugu-MT Paper Translation (Abstract): Project Prometheus: Bridging the Intent Gap in Agentic Program Repair via Reverse-Engineered Executable Specifications

Paper overview: Project Prometheus: Bridging the Intent Gap in Agentic Program Repair via Reverse-Engineered Executable Specifications

arxiv url: http://arxiv.org/abs/2604.17464v1
Date: Sun, 19 Apr 2026 14:27:27 GMT
Status: Translation complete
Last updated in system: 2026-04-21 21:52:52.539514
Title: Project Prometheus: Bridging the Intent Gap in Agentic Program Repair via Reverse-Engineered Executable Specifications
Authors: Yongchao Wang, Zhiqiu Huang
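Abstract summary: Current solutions rely on natural language summaries or adversarial sampling, but often fail to provide the deterministic constraints required for surgical repairs. We introduce Prometheus, a novel framework that bridges this gap by prioritizing Specification Inference over code generation. Our framework achieved a total correct patch rate of 93.97% (639/680).
Reference score (site-computed attention score): 14.657771106188115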
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The transition from neural machine translation to agentic workflows has revolutionized Automated Program Repair (APR). However, existing agents, despite their advanced reasoning capabilities, frequently suffer from the "Intent Gap": the misalignment between the generated patch and the developer's original intent. Current solutions relying on natural language summaries or adversarial sampling often fail to provide the deterministic constraints required for surgical repairs. In this paper, we introduce Prometheus, a novel framework that bridges this gap by prioritizing Specification Inference over code generation. We employ Behavior-Driven Development (BDD) as an executable contract, utilizing a multi-agent architecture to reverse-engineer Gherkin specifications from runtime failure reports. To resolve the "Hallucination of Intent," we propose a Requirement Quality Assurance (RQA) Loop, a mechanism that leverages ground-truth code as a proxy oracle to validate inferred specifications. We evaluated Prometheus on 680 defects from the Defects4J benchmark. The results are transformative: our framework achieved a total correct patch rate of 93.97% (639/680). More significantly, it demonstrated a Rescue Rate of 74.4%, successfully repairing 119 complex bugs that a strong blind agent failed to resolve. Qualitative analysis reveals that explicit intent guides agents away from structurally invasive over-engineering toward precise, minimal corrections. Our findings suggest that the future of APR lies not in larger models, but in the capability to align code with verified Executable Specifications, whether pre-existing or reverse-engineered.
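
The reported figures can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the abstract gives 639/680 for the patch rate, but it does not state the denominator of the Rescue Rate. The code below assumes the Rescue Rate is the number of rescued bugs divided by the number of bugs the blind agent failed on, and back-computes that count. The variable and function names are hypothetical.

```python
from fractions import Fraction

# Figures quoted in the abstract.
total_defects = 680
correct_patches = 639
rescued_bugs = 119
reported_rescue_rate = Fraction(744, 1000)  # 74.4%


def percent(numerator: int, denominator: int, digits: int) -> float:
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)


# Total correct patch rate, as stated in the abstract (639/680).
patch_rate = percent(correct_patches, total_defects, 2)

# ASSUMPTION (not stated in the abstract): rescue rate = rescued bugs divided by
# the number of bugs the blind agent failed to repair. Under that reading, the
# implied number of blind-agent failures is the nearest integer to 119 / 0.744.
implied_blind_failures = round(rescued_bugs / reported_rescue_rate)
implied_rescue_rate = percent(rescued_bugs, implied_blind_failures, 1)

print(f"patch rate: {patch_rate}%")
print(f"implied blind-agent failures: {implied_blind_failures}")
print(f"implied rescue rate: {implied_rescue_rate}%")
```

Under this assumed definition, the two reported percentages are consistent with the counts given in the abstract; the paper itself should be consulted for the exact definition of the Rescue Rate.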

Related papers