Fugu-MT Paper Translation (Abstract): OmniLLP: Enhancing LLM-based Log Level Prediction with Context-Aware Retrieval

Paper abstract: OmniLLP: Enhancing LLM-based Log Level Prediction with Context-Aware Retrieval

arxiv url: http://arxiv.org/abs/2508.08545v1
Date: Tue, 12 Aug 2025 01:18:56 GMT
Status: translation complete
Last updated in system: 2025-08-13 21:07:34.26479
Title: OmniLLP: Enhancing LLM-based Log Level Prediction with Context-Aware Retrieval
Title (reference translation): OmniLLP: Enhancing LLM-Based Log Level Prediction with Context-Aware Retrieval
Authors: Youssef Esseddiq Ouatiti, Mohammed Sayagh, Bram Adams, Ahmed E. Hassan
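Abstract summary: We propose OmniLLP, a framework that clusters source files based on semantic similarity reflecting the code's functional purpose and on developer ownership cohesion. Our results show that both semantic-aware and ownership-aware clustering statistically significantly improve the accuracy of the evaluated LLPs (by up to 8% AUC). The approach that combines semantic and ownership signals for in-context prediction achieves an impressive 0.88 to 0.96 AUC across the evaluated projects.
Reference score (independently computed attention score): 8.328441582683034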
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Developers insert logging statements in source code to capture relevant runtime information essential for maintenance and debugging activities. Log level choice is an integral, yet tricky part of the logging activity as it controls log verbosity and therefore influences systems' observability and performance. Recent advances in ML-based log level prediction have leveraged large language models (LLMs) to propose log level predictors (LLPs) that demonstrated promising performance improvements (AUC between 0.64 and 0.8). Nevertheless, current LLM-based LLPs rely on randomly selected in-context examples, overlooking the structure and the diverse logging practices within modern software projects. In this paper, we propose OmniLLP, a novel LLP enhancement framework that clusters source files based on (1) semantic similarity reflecting the code's functional purpose, and (2) developer ownership cohesion. By retrieving in-context learning examples exclusively from these semantic and ownership aware clusters, we aim to provide more coherent prompts to LLPs leveraging LLMs, thereby improving their predictive accuracy. Our results show that both semantic and ownership-aware clusterings statistically significantly improve the accuracy (by up to 8% AUC) of the evaluated LLM-based LLPs compared to random predictors (i.e., leveraging randomly selected in-context examples from the whole project). Additionally, our approach that combines the semantic and ownership signal for in-context prediction achieves an impressive 0.88 to 0.96 AUC across our evaluated projects. Our findings highlight the value of integrating software engineering-specific context, such as code semantic and developer ownership signals into LLM-LLPs, offering developers a more accurate, contextually-aware approach to logging and therefore, enhancing system maintainability and observability.
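Abstract (reference translation): Developers insert logging statements into source code to capture the runtime information needed for maintenance and debugging. Choosing a log level is an integral yet tricky part of logging, because it controls log verbosity and therefore affects a system's observability and performance. Recent advances in ML-based log level prediction have used large language models (LLMs) to propose log level predictors (LLPs) that show promising performance improvements (AUC between 0.64 and 0.8). However, current LLM-based LLPs rely on randomly selected in-context examples, overlooking the structure of modern software projects and their diverse logging practices. In this paper, we propose OmniLLP, a new LLP enhancement framework that clusters source files based on (1) semantic similarity reflecting the code's functional purpose and (2) developer ownership cohesion. By retrieving in-context learning examples only from these semantic-aware and ownership-aware clusters, we aim to give LLM-based LLPs more coherent prompts and thereby improve their predictive accuracy. Our results show that both semantic-aware and ownership-aware clustering statistically significantly improve the accuracy of the evaluated LLM-based LLPs (by up to 8% AUC) compared with random predictors (that is, predictors using in-context examples selected at random from the whole project). In addition, the approach that combines semantic and ownership signals for in-context prediction achieves an impressive 0.88 to 0.96 AUC across the evaluated projects. Our findings highlight the value of integrating software engineering-specific context, such as code semantics and developer ownership signals, into LLM-LLPs, offering developers a more accurate, context-aware approach to logging and thereby improving system maintainability and observability.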
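
The retrieval idea described in the abstract, drawing in-context examples only from the cluster that contains the target file instead of from the whole project, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy examples, the precomputed `CLUSTER_OF` mapping, the function names, and the prompt wording are all hypothetical, and the sketch does not cover how the semantic or ownership clusters are actually computed.

```python
import random
from collections import defaultdict

# Hypothetical toy data: logging statements whose level is already known.
EXAMPLES = [
    {"file": "net/Client.java", "code": 'log("retrying connection")', "level": "warn"},
    {"file": "net/Server.java", "code": 'log("socket closed by peer")', "level": "info"},
    {"file": "net/Pool.java", "code": 'log("connection pool exhausted")', "level": "error"},
    {"file": "ui/Panel.java", "code": 'log("panel repainted")', "level": "debug"},
    {"file": "ui/Menu.java", "code": 'log("menu opened")', "level": "trace"},
]

# Assumption: file -> cluster id has been computed beforehand (for example from
# semantic similarity, ownership, or both). The clustering step is out of scope.
CLUSTER_OF = {
    "net/Client.java": 0, "net/Server.java": 0, "net/Pool.java": 0,
    "ui/Panel.java": 1, "ui/Menu.java": 1,
}


def build_cluster_index(examples, cluster_of):
    """Group labelled examples by the cluster of the file they come from."""
    index = defaultdict(list)
    for ex in examples:
        index[cluster_of[ex["file"]]].append(ex)
    return index


def select_shots(target_file, index, cluster_of, k=2, seed=0):
    """Sample up to k in-context examples from the target file's cluster only."""
    pool = [
        ex for ex in index.get(cluster_of.get(target_file), [])
        if ex["file"] != target_file  # never show the target file's own logs
    ]
    return random.Random(seed).sample(pool, min(k, len(pool)))


def build_prompt(target_code, shots):
    """Assemble a few-shot prompt; the wording is illustrative only."""
    parts = ["Predict the log level for the final statement."]
    for ex in shots:
        parts.append(f"Code: {ex['code']}\nLevel: {ex['level']}")
    parts.append(f"Code: {target_code}\nLevel:")
    return "\n\n".join(parts)


index = build_cluster_index(EXAMPLES, CLUSTER_OF)
shots = select_shots("net/Client.java", index, CLUSTER_OF, k=2)
prompt = build_prompt('log("connect timed out")', shots)
print(prompt)
```

The random baseline mentioned in the abstract would correspond to sampling from `EXAMPLES` as a whole; the only change in the sketch is that the sampling pool is restricted to one cluster.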
