Fugu-MT Paper Translation (Abstract): AgentSZZ: Teaching the LLM Agent to Play Detective with Bug-Inducing Commits

Paper overview: AgentSZZ: Teaching the LLM Agent to Play Detective with Bug-Inducing Commits

arxiv url: http://arxiv.org/abs/2604.02665v1
Date: Fri, 03 Apr 2026 02:54:02 GMT
Status: Translation complete
Last updated in system: 2026-04-06 17:20:24.292715
Title: AgentSZZ: Teaching the LLM Agent to Play Detective with Bug-Inducing Commits
Title (reference translation): AgentSZZ: Teaching LLM Agents Detection with Bug-Inducing Commits
Authors: Yunbo Lyu, Jieke Shi, Hong Jin Kang, Ratnadira Widyasari, Junda He, Yuqing Niu, Chengran Yang, Junkai Chen, Zhou Yang, Julia Lawall, David Lo
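Abstract summary: AgentSZZ is an agent-based framework for identifying bug-inducing commits. Unlike prior methods, AgentSZZ integrates task-specific tools, domain knowledge, and a ReAct-style loop to enable adaptive and causal tracing of bugs. Experiments show that AgentSZZ consistently outperforms state-of-the-art SZZ algorithms across all settings.
Reference score (independently computed attention score): 14.213358505741105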
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The SZZ algorithm is the dominant technique for identifying bug-inducing commits and underpins many software engineering tasks, such as defect prediction and vulnerability analysis. Despite numerous variants, including recent LLM-based approaches, performance remains limited on developer-annotated datasets (e.g., recall of 0.552 on the Linux kernel). A key limitation is the reliance on git blame, which traces line-level changes within the same file and therefore fails in common scenarios such as ghost and cross-file cases, making nearly one-quarter of bug-inducing commits inherently untraceable. Moreover, current approaches follow fixed pipelines that restrict iterative reasoning and exploration, unlike developers who investigate bugs through an interactive, multi-tool process. To address these challenges, we propose AgentSZZ, an agent-based framework that leverages LLM-driven agents to explore repositories and identify bug-inducing commits. Unlike prior methods, AgentSZZ integrates task-specific tools, domain knowledge, and a ReAct-style loop to enable adaptive and causal tracing of bugs. A structured compression module further improves efficiency by reducing redundant context while preserving key evidence. Extensive experiments on three widely used datasets show that AgentSZZ consistently outperforms state-of-the-art SZZ algorithms across all settings, achieving F1-score gains of up to 27.2% over prior LLM-based approaches. The improvements are especially pronounced in challenging scenarios such as cross-file and ghost commits, with recall gains of up to 300% and 60%, respectively. Ablation studies show that task-specific tools and domain knowledge are critical, while compressing tool outputs reduces token consumption by over 30% with negligible impact. The replication package is available.
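Abstract (reference translation): The SZZ algorithm is the leading technique for identifying bug-inducing commits and supports many software engineering tasks, such as defect prediction and vulnerability analysis. Despite many variants, including recent LLM-based approaches, performance remains limited on developer-annotated datasets (e.g., a recall of 0.552 on the Linux kernel). A key limitation is the reliance on git blame, which traces line-level changes within the same file and fails in common scenarios such as ghost and cross-file cases, leaving nearly one-quarter of bug-inducing commits inherently untraceable. In addition, current approaches follow fixed pipelines that restrict iterative reasoning and exploration, unlike developers, who investigate bugs through an interactive, multi-tool process. To address these challenges, the authors propose AgentSZZ, an agent-based framework that uses LLM-driven agents to explore repositories and identify bug-inducing commits. Unlike prior methods, AgentSZZ integrates task-specific tools, domain knowledge, and a ReAct-style loop to enable adaptive and causal tracing of bugs. A structured compression module further improves efficiency by reducing redundant context while preserving key evidence. Extensive experiments on three widely used datasets show that AgentSZZ consistently outperforms state-of-the-art SZZ algorithms across all settings, achieving F1-score gains of up to 27.2% over prior LLM-based approaches. The improvements are especially pronounced in difficult scenarios such as cross-file and ghost commits, with recall gains of up to 300% and 60%, respectively. Ablation studies show that task-specific tools and domain knowledge are critical, while compressing tool outputs reduces token consumption by over 30% with negligible impact. The replication package is available.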
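
The abstract's point about git blame can be illustrated with a minimal sketch of the classic blame-based SZZ step. This is not the AgentSZZ implementation or any specific SZZ variant from the paper; the function names `deleted_lines` and `blame_commands`, the sample diff, and the commit id `abc123` are all hypothetical and chosen for illustration. The sketch assumes a unified diff of the bug-fixing commit as input: it collects the pre-fix lines that the fix deleted and builds the `git blame` invocations that would trace them on the parent commit. A file where the fix only adds lines yields nothing to blame, which is one way to picture the "ghost" case the abstract describes.

```python
import re

# Hunk header of a unified diff, e.g. "@@ -10,4 +10,5 @@"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def deleted_lines(diff_text):
    """Map each pre-fix file path to the old line numbers the fix deleted."""
    result = {}
    path = None
    old_no = old_left = new_left = 0
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk: classify the line by its first character.
            if line.startswith("-"):
                result.setdefault(path, []).append(old_no)
                old_no += 1
                old_left -= 1
            elif line.startswith("+"):
                new_left -= 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                # Context line: present in both old and new versions.
                old_no += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("--- "):
            target = line[4:].strip()
            if target == "/dev/null":
                path = None  # file did not exist before the fix
            else:
                path = target[2:] if target.startswith("a/") else target
                result.setdefault(path, [])
            continue
        match = HUNK_RE.match(line)
        if match:
            old_no = int(match.group(1))
            old_left = int(match.group(2) or 1)
            new_left = int(match.group(4) or 1)
    return result


def blame_commands(fix_commit, deleted):
    """Build one `git blame` argv per file, run against the fix's parent."""
    commands = []
    for path, lines in deleted.items():
        if not lines:
            continue  # add-only change: nothing for blame to trace
        argv = ["git", "blame", "--porcelain"]
        for number in lines:
            argv += ["-L", "{0},{0}".format(number)]
        argv += [fix_commit + "^", "--", path]
        commands.append(argv)
    return commands


# Hypothetical fix: one modified line in net/core.c, add-only change in lib/util.c
SAMPLE_DIFF = "\n".join([
    "diff --git a/net/core.c b/net/core.c",
    "--- a/net/core.c",
    "+++ b/net/core.c",
    "@@ -10,3 +10,3 @@",
    " int n = read_len(buf);",
    "-if (n > MAX)",
    "+if (n >= MAX)",
    " return -1;",
    "diff --git a/lib/util.c b/lib/util.c",
    "--- a/lib/util.c",
    "+++ b/lib/util.c",
    "@@ -5,2 +5,3 @@",
    " void reset(void) {",
    "+    check_state();",
    " }",
])
```

Calling `blame_commands("abc123", deleted_lines(SAMPLE_DIFF))` produces a command only for `net/core.c`; `lib/util.c` is skipped because the fix deleted nothing there. Each command also stays within the file that the fix touched, so a cause that lives in a different file is out of reach for this step, which is the cross-file limitation the abstract mentions.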

