Fugu-MT Paper Translation (Abstract): DQE: A Semantic-Aware Evaluation Metric for Time Series Anomaly Detection

Paper overview: DQE: A Semantic-Aware Evaluation Metric for Time Series Anomaly Detection

arxiv url: http://arxiv.org/abs/2603.06131v1
Date: Fri, 06 Mar 2026 10:38:51 GMT
Status: Translation complete
Last updated in system: 2026-03-09 13:17:45.514215
Title: DQE: A Semantic-Aware Evaluation Metric for Time Series Anomaly Detection
Title (reference translation): DQE: A Semantics-Aware Evaluation Metric for Time Series Anomaly Detection
Authors: Yuewei Li, Dalin Zhang, Huan Li, Xinyi Gong, Hongjun Chu, Zhaohui Song
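Abstract summary: This paper revisits the evaluation of time series anomaly detection from the perspective of detection semantics. It introduces a partitioning strategy grounded in detection semantics that decomposes the local temporal region of each anomaly into three functionally distinct subregions. Using this partitioning, it evaluates overall detection behavior across events and designs finer-grained scoring mechanisms for each subregion, enabling more reliable and interpretable assessment.
Reference score (independently computed attention score): 10.700735533120257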
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Time series anomaly detection has achieved remarkable progress in recent years. However, evaluation practices have received comparatively less attention, despite their critical importance. Existing metrics exhibit several limitations: (1) bias toward point-level coverage, (2) insensitivity or inconsistency in near-miss detections, (3) inadequate penalization of false alarms, and (4) inconsistency caused by threshold or threshold-interval selection. These limitations can produce unreliable or counterintuitive results, hindering objective progress. In this work, we revisit the evaluation of time series anomaly detection from the perspective of detection semantics and propose a novel metric for more comprehensive assessment. We first introduce a partitioning strategy grounded in detection semantics, which decomposes the local temporal region of each anomaly into three functionally distinct subregions. Using this partitioning, we evaluate overall detection behavior across events and design finer-grained scoring mechanisms for each subregion, enabling more reliable and interpretable assessment. Through a systematic study of existing metrics, we identify an evaluation bias associated with threshold-interval selection and adopt an approach that aggregates detection qualities across the full threshold spectrum, thereby eliminating evaluation inconsistency. Extensive experiments on synthetic and real-world data demonstrate that our metric provides stable, discriminative, and interpretable evaluation, while achieving robust assessment compared with ten widely used metrics.
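Abstract (reference translation): In recent years, time series anomaly detection has made remarkable progress. Evaluation practice, however, has received comparatively little attention despite its critical importance. Existing metrics have several limitations: (1) bias toward point-level coverage, (2) insensitivity or inconsistency for near-miss detections, (3) insufficient penalization of false alarms, and (4) inconsistency caused by the choice of threshold or threshold interval. These limitations lead to unreliable or counterintuitive results and hinder objective progress. This work revisits the evaluation of time series anomaly detection from the perspective of detection semantics and proposes a new metric for more comprehensive assessment. It first introduces a partitioning strategy grounded in detection semantics that decomposes the local temporal region of each anomaly into three functionally distinct subregions. Using this partitioning, it evaluates overall detection behavior across events and designs finer-grained scoring mechanisms for each subregion, enabling more reliable and interpretable assessment. Through a systematic study of existing metrics, it identifies an evaluation bias associated with threshold-interval selection and adopts an approach that aggregates detection quality across the full threshold spectrum, removing the inconsistency. Extensive experiments on synthetic and real-world data show that the metric provides stable, discriminative, and interpretable evaluation while achieving robust assessment compared with ten widely used metrics.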
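
The abstract mentions aggregating detection quality across the full threshold spectrum instead of committing to one threshold. The following is only a minimal sketch of that general idea, not the DQE metric itself: the abstract does not specify the three subregions or their scoring, so a plain point-wise F1 is used here as a stand-in quality function. The names `pointwise_f1` and `threshold_averaged_quality` are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence


def pointwise_f1(labels: Sequence[int], preds: Sequence[bool]) -> float:
    """Stand-in quality function (an assumption, NOT the paper's scoring)."""
    tp = sum(1 for y, p in zip(labels, preds) if y and p)
    fp = sum(1 for y, p in zip(labels, preds) if not y and p)
    fn = sum(1 for y, p in zip(labels, preds) if y and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def threshold_averaged_quality(
    scores: Sequence[float],
    labels: Sequence[int],
    quality: Callable[[Sequence[int], Sequence[bool]], float] = pointwise_f1,
) -> float:
    """Average a quality function over every distinct score used as a threshold."""
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equally long")
    thresholds: List[float] = sorted(set(scores))
    total = 0.0
    for t in thresholds:
        # A point is flagged as anomalous when its score reaches the threshold.
        preds = [s >= t for s in scores]
        total += quality(labels, preds)
    return total / len(thresholds)


if __name__ == "__main__":
    anomaly_scores = [0.1, 0.9, 0.8, 0.2]
    ground_truth = [0, 1, 1, 0]
    print(threshold_averaged_quality(anomaly_scores, ground_truth))
```

Because every distinct score serves as a threshold, the result does not depend on a hand-picked threshold or threshold interval. Any other quality function with the same signature can be passed in through the `quality` parameter.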

Related papers