Fugu-MT Paper Translation (Abstract): Benchmarking and Enhancing LLM Agents in Localizing Linux Kernel Bugs

Paper summary: Benchmarking and Enhancing LLM Agents in Localizing Linux Kernel Bugs

arxiv url: http://arxiv.org/abs/2505.19489v1
Date: Mon, 26 May 2025 04:15:48 GMT
Status: Translation complete
Last updated in system: 2025-05-27 16:58:43.159931
Title: Benchmarking and Enhancing LLM Agents in Localizing Linux Kernel Bugs
Authors: Zhenhao Zhou, Zhuochen Huang, Yike He, Chong Wang, Jiajun Wang, Yijian Wu, Xin Peng, Yiling Lou
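Abstract summary: Fault localization (FL) aims to identify the buggy code elements in software. Recent LLM agents have achieved promising FL accuracy on recent benchmarks such as SWE-bench. The paper introduces LinuxFLBench, an FL benchmark constructed from real-world Linux kernel bugs.
Reference score (attention score computed independently by this site): 9.986455089493779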
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Linux kernel is a critical system, serving as the foundation for numerous systems. Bugs in the Linux kernel can cause serious consequences, affecting billions of users. Fault localization (FL), which aims at identifying the buggy code elements in software, plays an essential role in software quality assurance. While recent LLM agents have achieved promising accuracy in FL on recent benchmarks like SWE-bench, it remains unclear how well these methods perform in the Linux kernel, where FL is much more challenging due to the large-scale code base, limited observability, and diverse impact factors. In this paper, we introduce LinuxFLBench, an FL benchmark constructed from real-world Linux kernel bugs. We conduct an empirical study to assess the performance of state-of-the-art LLM agents on the Linux kernel. Our initial results reveal that existing agents struggle with this task, achieving a best top-1 accuracy of only 41.6% at file level. To address this challenge, we propose LinuxFL$^+$, an enhancement framework designed to improve the FL effectiveness of LLM agents for the Linux kernel. LinuxFL$^+$ substantially improves the FL accuracy of all studied agents (e.g., a 7.2%-11.2% accuracy increase) at minimal cost. Data and code are available at https://github.com/FudanSELab/LinuxFLBench.
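The abstract reports results as file-level top-1 accuracy but does not spell out how the metric is computed. The sketch below assumes the common definition, where a bug counts as localized if at least one of its truly buggy files appears among the top-k ranked files. The function name, bug IDs, and file paths are hypothetical and are not taken from LinuxFLBench.

```python
from typing import Dict, List, Set


def top_k_accuracy(
    rankings: Dict[str, List[str]],
    ground_truth: Dict[str, Set[str]],
    k: int = 1,
) -> float:
    """Fraction of bugs with at least one truly buggy file in the top-k predictions.

    Assumed definition: a bug is a "hit" if any ground-truth file is ranked
    within the first k entries. Bugs with no prediction count as misses.
    """
    if not ground_truth:
        return 0.0
    hits = 0
    for bug_id, buggy_files in ground_truth.items():
        predicted = rankings.get(bug_id, [])[:k]
        if any(path in buggy_files for path in predicted):
            hits += 1
    return hits / len(ground_truth)


# Hypothetical example data: three bugs, each with a ranked list of candidate files.
rankings = {
    "bug-1": ["drivers/a.c", "drivers/b.c"],
    "bug-2": ["fs/c.c", "fs/d.c"],
    "bug-3": ["mm/e.c"],
}
ground_truth = {
    "bug-1": {"drivers/a.c"},  # correct file ranked first
    "bug-2": {"fs/d.c"},       # correct file ranked second
    "bug-3": {"mm/f.c"},       # correct file not ranked at all
}

top1 = top_k_accuracy(rankings, ground_truth, k=1)
top2 = top_k_accuracy(rankings, ground_truth, k=2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

Under this definition, a reported top-1 accuracy of 41.6% would mean that for 41.6% of the benchmark bugs the agent's first-ranked file is one of the files actually changed by the fix.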

