Fugu-MT Paper Translation (Abstract): Summary-Mediated Repair: Can LLMs use code summarisation as a tool for program repair?

Paper abstract: Summary-Mediated Repair: Can LLMs use code summarisation as a tool for program repair?

arxiv url: http://arxiv.org/abs/2511.18782v1
Date: Mon, 24 Nov 2025 05:33:38 GMT
Status: Translation complete
Last updated in system: 2025-11-25 18:34:25.033696
Title: Summary-Mediated Repair: Can LLMs use code summarisation as a tool for program repair?
Authors: Lukas Twist,
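Abstract summary: Large Language Models (LLMs) often produce code with subtle implementation-level bugs despite strong benchmark performance. This paper proposes summary-mediated repair, a prompt-only pipeline for program repair.
Reference score (independently computed attention level): 0.0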
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large Language Models (LLMs) often produce code with subtle implementation-level bugs despite strong benchmark performance. These errors are hard for LLMs to spot and can have large behavioural effects; yet when asked to summarise code, LLMs can frequently surface high-level intent and sometimes overlook this low-level noise. Motivated by this, we propose summary-mediated repair, a prompt-only pipeline for program repair that leverages natural-language code summarisation as an explicit intermediate step, extending previous work that has already shown code summarisation to be a useful intermediary for downstream tasks. We evaluate our method across eight production-grade LLMs on two function-level benchmarks (HumanEvalPack and MBPP), comparing several summary styles against a direct repair baseline. Error-aware diagnostic summaries consistently yield the largest gains, repairing up to 65% of unseen errors (on average 5% more than the baseline), though overall improvements are modest and LLM-dependent. Our results position summaries as a cheap, human-interpretable diagnostic artefact that can be integrated into program-repair pipelines rather than a stand-alone fix-all.
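As a rough illustration of the two-step idea described in the abstract, the sketch below contrasts a direct repair call with a summarise-then-repair call. It is not the paper's implementation: the `LLM` callable type, the function names, and the prompt wording are all hypothetical, chosen only to show where the natural-language summary sits between the buggy code and the repair request.

```python
from typing import Callable, Tuple

# Assumption: an "LLM" is any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


def direct_repair(llm: LLM, buggy_code: str) -> str:
    """Baseline: ask the model to fix the code in a single prompt."""
    prompt = (
        "The following function contains a bug. "
        "Return a corrected version.\n\n" + buggy_code
    )
    return llm(prompt)


def summary_mediated_repair(llm: LLM, buggy_code: str) -> Tuple[str, str]:
    """Two prompts: first summarise the code, then repair it using that summary."""
    # Step 1: request an error-aware diagnostic summary (hypothetical prompt wording).
    summary_prompt = (
        "Summarise what the following function is intended to do, "
        "and note any place where the code deviates from that intent.\n\n"
        + buggy_code
    )
    summary = llm(summary_prompt)

    # Step 2: request a repair, passing the summary as explicit context.
    repair_prompt = (
        "Here is a function and a summary of its intended behaviour.\n\n"
        "Summary:\n" + summary + "\n\n"
        "Code:\n" + buggy_code + "\n\n"
        "Return a corrected version of the function."
    )
    repaired = llm(repair_prompt)

    # The summary is returned too, so a human can inspect it as a diagnostic artefact.
    return summary, repaired
```

Any client that accepts a prompt and returns text could be passed as `llm`. Returning the summary alongside the repaired code reflects the abstract's framing of summaries as a human-interpretable intermediate output.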
