Fugu-MT Paper Translation (Abstract): Can We Trust the AI Pair Programmer? Copilot for API Misuse Detection and Correction

Paper summary: Can We Trust the AI Pair Programmer? Copilot for API Misuse Detection and Correction

arxiv url: http://arxiv.org/abs/2509.16795v1
Date: Sat, 20 Sep 2025 19:58:01 GMT
Status: Translation complete
Last updated in system: 2025-09-30 15:02:20.649698
Title: Can We Trust the AI Pair Programmer? Copilot for API Misuse Detection and Correction
Authors: Saikat Mondal, Chanchal K. Roy, Hong Wang, Juan Arguello, Samantha Mathan,
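Abstract summary: API misuse introduces security vulnerabilities and system failures and increases maintenance costs. Existing detection approaches rely on static analysis or machine learning-based tools that operate post-development. This study evaluates the effectiveness of GitHub Copilot in identifying and correcting API misuse using MUBench.
Reference score (independently calculated attention score): 5.653894423049302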
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: API misuse introduces security vulnerabilities and system failures and increases maintenance costs, all of which remain critical challenges in software development. Existing detection approaches rely on static analysis or machine learning-based tools that operate post-development, which delays defect resolution. Delayed defect resolution can significantly increase the cost and complexity of maintenance and negatively impact software reliability and user trust. AI-powered code assistants, such as GitHub Copilot, offer the potential for real-time API misuse detection within development environments. This study evaluates GitHub Copilot's effectiveness in identifying and correcting API misuse using MUBench, which provides a curated benchmark of misuse cases. We construct 740 misuse examples, both manually and as AI-assisted variants, from correct usage patterns and misuse specifications. These examples and 147 correct usage cases are analyzed using Copilot integrated in Visual Studio Code. Copilot achieved a detection accuracy of 86.2%, precision of 91.2%, and recall of 92.4%. It performed strongly on common misuse types (e.g., missing-call, null-check) but struggled with compound or context-sensitive cases. Notably, Copilot successfully fixed over 95% of the misuses it identified. These findings highlight both the strengths and limitations of AI-driven coding assistants, positioning Copilot as a promising tool for real-time pair programming that detects and fixes API misuses during software development.
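The accuracy, precision, and recall figures quoted in the abstract are presumably the standard confusion-matrix metrics. The following is a minimal sketch of how such figures are computed, assuming "misuse" is treated as the positive class. The names `ConfusionCounts` and `detection_metrics` are hypothetical, and the counts are made up for illustration; they are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary misuse detector (misuse = positive class)."""
    tp: int  # misuses correctly flagged as misuse
    fp: int  # correct usages wrongly flagged as misuse
    fn: int  # misuses the detector missed
    tn: int  # correct usages left unflagged


def detection_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, precision, and recall using the standard definitions."""
    total = c.tp + c.fp + c.fn + c.tn
    if total == 0:
        raise ValueError("no examples to score")
    flagged = c.tp + c.fp          # everything the detector called a misuse
    actual_misuses = c.tp + c.fn   # everything that really was a misuse
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / flagged if flagged else 0.0,
        "recall": c.tp / actual_misuses if actual_misuses else 0.0,
    }


# Illustrative counts only (an assumption, NOT taken from the paper).
example = ConfusionCounts(tp=8, fp=2, fn=1, tn=9)
metrics = detection_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, precision penalizes correct usages that are wrongly flagged, while recall penalizes misuses that go undetected, so the two can diverge when the evaluation set is imbalanced, as it is here (740 misuse examples versus 147 correct usage cases).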
