Fugu-MT Paper Translation (Summary): A Comprehensive Survey on Benchmarks and Solutions in Software Engineering of LLM-Empowered Agentic System

Paper summary: A Comprehensive Survey on Benchmarks and Solutions in Software Engineering of LLM-Empowered Agentic System

arxiv url: http://arxiv.org/abs/2510.09721v1
Date: Fri, 10 Oct 2025 06:56:50 GMT
Status: Translation complete
Last updated in system: 2025-10-14 18:06:29.588226
Title: A Comprehensive Survey on Benchmarks and Solutions in Software Engineering of LLM-Empowered Agentic System
Authors: Jiale Guo, Suizhi Huang, Mei Li, Dong Huang, Xingsheng Chen, Regina Zhang, Zhijiang Guo, Han Yu, Siu-Ming Yiu, Christian Jensen, Pietro Lio, Kwok-Yan Lam
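Abstract summary: This survey is the first to present a holistic analysis of LLM-empowered software engineering. The authors analyze 150+ recent papers and organize them into a comprehensive taxonomy spanning two major dimensions. The analysis reveals how the field has evolved from simple prompt engineering to complex agentic systems.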
Reference score (independently computed attention score): 54.933911409697714
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The integration of LLMs into software engineering has catalyzed a paradigm shift from traditional rule-based systems to sophisticated agentic systems capable of autonomous problem-solving. Despite this transformation, the field lacks a comprehensive understanding of how benchmarks and solutions interconnect, hindering systematic progress and evaluation. This survey presents the first holistic analysis of LLM-empowered software engineering, bridging the critical gap between evaluation and solution approaches. We analyze 150+ recent papers and organize them into a comprehensive taxonomy spanning two major dimensions: (1) Solutions, categorized into prompt-based, fine-tuning-based, and agent-based paradigms, and (2) Benchmarks, covering code generation, translation, repair, and other tasks. Our analysis reveals how the field has evolved from simple prompt engineering to complex agentic systems incorporating planning and decomposition, reasoning and self-refinement, memory mechanisms, and tool augmentation. We present a unified pipeline that illustrates the complete workflow from task specification to final deliverables, demonstrating how different solution paradigms address varying complexity levels across software engineering tasks. Unlike existing surveys that focus on isolated aspects, we provide full-spectrum coverage connecting 50+ benchmarks with their corresponding solution strategies, enabling researchers to identify optimal approaches for specific evaluation criteria. Furthermore, we identify critical research gaps and propose actionable future directions, including multi-agent collaboration frameworks, self-evolving code generation systems, and integration of formal verification with LLM-based methods. This survey serves as a foundational resource for researchers and practitioners seeking to understand, evaluate, and advance LLM-empowered software engineering systems.

Related papers