Fugu-MT paper translation (abstract): Automated Repair of C Programs Using Large Language Models

Paper summary: Automated Repair of C Programs Using Large Language Models

arxiv url: http://arxiv.org/abs/2509.01947v1
Date: Tue, 02 Sep 2025 04:34:11 GMT
Status: Translation complete
System update date: 2025-09-04 15:17:03.907531
Title: Automated Repair of C Programs Using Large Language Models
Authors: Mahdi Farzandway, Fatemeh Ghassemi
Abstract summary: This study explores the potential of Large Language Models (LLMs) in automating the repair of C programs. It presents a framework that integrates spectrum-based fault localization (SBFL), runtime feedback, and Chain-of-Thought-structured prompting into an autonomous repair loop. The method achieves 44.93% repair accuracy, a 3.61% absolute improvement over state-of-the-art APR baselines.
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This study explores the potential of Large Language Models (LLMs) in automating the repair of C programs. We present a framework that integrates spectrum-based fault localization (SBFL), runtime feedback, and Chain-of-Thought-structured prompting into an autonomous repair loop. Unlike prior approaches, our method explicitly combines statistical program analysis with LLM reasoning. The iterative repair cycle leverages a structured Chain-of-Thought (CoT) prompting approach, where the model reasons over failing tests, suspicious code regions, and prior patch outcomes, before generating new candidate patches. The model iteratively changes the code, evaluates the results, and incorporates reasoning from previous attempts into subsequent modifications, reducing repeated errors and clarifying why some bugs remain unresolved. Our evaluation spans 3,902 bugs from the Codeflaws benchmark, where our approach achieves 44.93% repair accuracy, representing a 3.61% absolute improvement over strong state-of-the-art APR baselines such as GPT-4 with CoT. This outcome highlights a practical pathway toward integrating statistical program analysis with generative AI in automated debugging.
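
The abstract names spectrum-based fault localization as the source of the "suspicious code regions" fed to the model, but it does not say which suspiciousness formula is used. The following is a minimal sketch that assumes the common Ochiai metric; the function name `ochiai_scores`, the coverage format, and the toy test data are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Set, Tuple


def ochiai_scores(
    coverage: Dict[str, Set[int]],
    failing: Set[str],
) -> List[Tuple[int, float]]:
    """Rank source lines by Ochiai suspiciousness, highest first.

    coverage maps a test name to the set of line numbers it executed;
    failing holds the names of the tests that failed.
    Ochiai (an assumed choice): ef / sqrt(total_failed * (ef + ep)).
    """
    total_failed = len(failing)
    lines: Set[int] = set().union(*coverage.values())
    scores = []
    for line in lines:
        # ef / ep: failing / passing tests that executed this line
        ef = sum(1 for t, cov in coverage.items() if t in failing and line in cov)
        ep = sum(1 for t, cov in coverage.items() if t not in failing and line in cov)
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append((line, ef / denom if denom else 0.0))
    # Sort by descending score, breaking ties by line number
    return sorted(scores, key=lambda s: (-s[1], s[0]))


# Toy spectrum: three tests over a four-line program, one failing test.
coverage = {"t1": {1, 2, 3}, "t2": {1, 2, 4}, "t3": {1, 3}}
ranked = ochiai_scores(coverage, failing={"t3"})
print(ranked)
```

In this toy spectrum, line 3 ranks first because the failing test executes it and only one passing test does. In a repair loop of the kind the abstract describes, the top-ranked lines would be passed to the model together with the failing tests and the outcomes of earlier patches.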
