Fugu-MT Paper Translation (Abstract): Improving LLM-Based Go Code Review through Issue-List Generation and Context Augmentation

Paper overview: Improving LLM-Based Go Code Review through Issue-List Generation and Context Augmentation

arxiv url: http://arxiv.org/abs/2606.01859v1
Date: Mon, 01 Jun 2026 08:11:34 GMT
Status: Translation complete
Last updated in system: 2026-06-02 21:34:31.600384
Title: Improving LLM-Based Go Code Review through Issue-List Generation and Context Augmentation
Authors: Kexin Sun, Yucong Guan, Jiaqi Sun, Hongyu Kuang, Guoping Rong, Dong Shao, He Zhang, Xiaoxing Ma, Christoph Treude
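Abstract summary: This paper proposes an issue-list review paradigm in which LLMs enumerate all potential issues rather than reporting only the single most important one. It then compares three types of code context augmentation: neighboring context, LSP-based semantics, and IR-based similar co-change context. Finally, the approach integrates candidates from no-context and context-enhanced generation to improve review coverage, and introduces refinement-guided pruning to keep the candidate list at a practical size.
Reference score (independently computed attention score): 20.19657859180513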
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: LLMs have shown strong potential for automating code review, yet their practical utility depends heavily on the design of generation and context strategies. In this paper, we investigate how to improve LLM-based code review through generation strategy and contextual augmentation. We first propose an issue-list review paradigm, in which LLMs enumerate all potential issues rather than reporting only the single most important one (i.e., primary-issue review). We then systematically compare three types of code context augmentation -- neighboring, LSP-based semantics, and IR-based similar co-change context -- and study how they influence issue discovery. Finally, we integrate candidates from no-context and context-enhanced generation to improve review coverage, and introduce refinement-guided pruning to keep the candidate list at a practical size. We evaluate our approach on 1,438 Go review instances using downstream code refinement as the main metric, i.e., how often the candidate list contains at least one comment inducing the same code change as the final human revision. For comparison, we evaluate comments by CodeReviewer, a model trained specifically for review comment generation, as well as ground-truth human review comments (as a practical upper bound), under the same refinement-based evaluation. The results show that our best configuration, combining issue-list review, neighboring and similar co-change context, and candidate integration, reaches 28.00% refinement exact match, a statistically significant gain of +10.85 percentage points over primary-issue review without any additional context (17.15%), substantially outperforming CodeReviewer (15.02%) and approaching the human-oracle ceiling of 36.09%. Our refinement-guided pruning reduces the average candidate count from 7.2 to 3.1 at top-5 while retaining nearly the full benefit, making the candidate list easier to inspect.
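The abstract's main metric counts an instance as a success when at least one comment in the candidate list induces the same code change as the final human revision. The sketch below illustrates that list-level check under stated assumptions; it is not the authors' evaluation code. The names `list_level_exact_match`, `normalize`, and `toy_refine`, the dictionary keys, and the whitespace-insensitive comparison are all hypothetical, and `toy_refine` is only a stand-in for whatever refinement model turns a comment into revised code.

```python
from typing import Callable, Sequence


def normalize(code: str) -> str:
    """Collapse whitespace so formatting-only differences do not count (an assumption)."""
    return " ".join(code.split())


def list_level_exact_match(
    instances: Sequence[dict],
    refine: Callable[[str, str], str],
) -> float:
    """Fraction of instances where at least one candidate comment,
    once applied by `refine`, reproduces the human revision."""
    if not instances:
        return 0.0
    hits = 0
    for inst in instances:
        target = normalize(inst["human_revision"])
        if any(
            normalize(refine(inst["original_code"], comment)) == target
            for comment in inst["candidates"]
        ):
            hits += 1
    return hits / len(instances)


def toy_refine(original_code: str, comment: str) -> str:
    """Hypothetical stand-in for a refinement model: a fixed lookup table."""
    rules = {"check err": "if err != nil { return err }"}
    return rules.get(comment, original_code)


instances = [
    {
        "original_code": "f()",
        "human_revision": "if err != nil {  return err }",
        "candidates": ["rename var", "check err"],
    },
    {
        "original_code": "g()",
        "human_revision": "g2()",
        "candidates": ["rename var"],
    },
]

score = list_level_exact_match(instances, toy_refine)
print(f"list-level exact match: {score:.2%}")
```

Because a single matching comment is enough, a longer candidate list can only keep or raise this score, which is why the abstract pairs candidate integration with pruning to keep the list at a practical size.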
