Fugu-MT paper translation (abstract): Human-AI Synergy in Agentic Code Review

Paper overview: Human-AI Synergy in Agentic Code Review

arxiv url: http://arxiv.org/abs/2603.15911v1
Date: Mon, 16 Mar 2026 20:56:18 GMT
Status: translation complete
Last updated in system: 2026-03-18 17:42:06.986493
Title: Human-AI Synergy in Agentic Code Review
Authors: Suzhen Zhong, Shayan Noei, Ying Zou, Bram Adams
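Abstract summary: The authors compare the feedback given by human reviewers with that given by AI agents. Human reviewers exchange 11.8% more rounds when reviewing AI-generated code than when reviewing human-written code. Over half of the unadopted suggestions from AI agents are either incorrect or addressed through alternative fixes by developers.
Reference score (attention metric computed independently by the site): 3.7086626614863984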
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Code review is a critical software engineering practice where developers review code changes before integration to ensure code quality, detect defects, and improve maintainability. In recent years, AI agents that can understand code context, plan review actions, and interact with development environments have been increasingly integrated into the code review process. However, there is limited empirical evidence comparing the effectiveness of AI agents and human reviewers in collaborative workflows. To address this gap, we conduct a large-scale empirical analysis of 278,790 code review conversations across 300 open-source GitHub projects. In our study, we compare the feedback provided by human reviewers with that provided by AI agents. We investigate human-AI collaboration patterns in review conversations to understand how interaction shapes review outcomes. Moreover, we analyze the adoption of code suggestions provided by human reviewers and AI agents into the codebase and how adopted suggestions change code quality. We find that human reviewers provide additional feedback beyond that of AI agents, covering understanding, testing, and knowledge transfer. Human reviewers exchange 11.8% more rounds when reviewing AI-generated code than when reviewing human-written code. Moreover, code suggestions made by AI agents are adopted into the codebase at a significantly lower rate than suggestions proposed by human reviewers. Over half of unadopted suggestions from AI agents are either incorrect or addressed through alternative fixes by developers. When adopted, suggestions provided by AI agents produce significantly larger increases in code complexity and code size than suggestions provided by human reviewers. Our findings suggest that while AI agents can scale defect screening, human oversight remains critical for ensuring suggestion quality and providing contextual feedback that AI agents lack.
