Fugu-MT paper translation (abstract): DRTA: Dynamic Reward Scaling for Reinforcement Learning in Time Series Anomaly Detection

Paper overview: DRTA: Dynamic Reward Scaling for Reinforcement Learning in Time Series Anomaly Detection

arxiv url: http://arxiv.org/abs/2508.18474v1
Date: Mon, 25 Aug 2025 20:39:49 GMT
Status: translation complete
Last updated in system: 2025-08-27 17:42:38.588727
Title: DRTA: Dynamic Reward Scaling for Reinforcement Learning in Time Series Anomaly Detection
Title (reference translation): DRTA: Dynamic Reward Scaling for Reinforcement Learning in Time Series Anomaly Detection
Authors: Bahareh Golchin, Banafsheh Rekabdar, Kunpeng Liu
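Abstract summary: Anomaly detection in time series data is important for applications in finance, healthcare, sensor networks, and industrial monitoring. This paper proposes DRTA, a reinforcement learning-based framework that integrates dynamic reward shaping, a Variational Autoencoder (VAE), and active learning. The method uses an adaptive reward mechanism that balances exploration and exploitation by dynamically scaling the effect of VAE-based reconstruction error and classification rewards.
Reference score (independently computed attention score): 7.185726339205792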
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Anomaly detection in time series data is important for applications in finance, healthcare, sensor networks, and industrial monitoring. Traditional methods usually struggle with limited labeled data, high false-positive rates, and difficulty generalizing to novel anomaly types. To overcome these challenges, we propose a reinforcement learning-based framework that integrates dynamic reward shaping, Variational Autoencoder (VAE), and active learning, called DRTA. Our method uses an adaptive reward mechanism that balances exploration and exploitation by dynamically scaling the effect of VAE-based reconstruction error and classification rewards. This approach enables the agent to detect anomalies effectively in low-label systems while maintaining high precision and recall. Our experimental results on the Yahoo A1 and Yahoo A2 benchmark datasets demonstrate that the proposed method consistently outperforms state-of-the-art unsupervised and semi-supervised approaches. These findings show that our framework is a scalable and efficient solution for real-world anomaly detection tasks.
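Abstract (reference translation): Anomaly detection in time series data is important for applications in finance, healthcare, sensor networks, and industrial monitoring. Conventional methods struggle with limited labeled data, high false-positive rates, and difficulty generalizing to new anomaly types. To overcome these challenges, we propose DRTA, a reinforcement learning-based framework that integrates dynamic reward shaping, a variational autoencoder (VAE), and active learning. The proposed method uses an adaptive reward mechanism that balances exploration and exploitation by dynamically scaling the effect of the VAE-based reconstruction error and the classification rewards. This approach lets the agent detect anomalies effectively in low-label settings while maintaining high precision and recall. Experiments on the Yahoo A1 and Yahoo A2 benchmark datasets show that the proposed method consistently outperforms state-of-the-art unsupervised and semi-supervised approaches. These results indicate that the framework is a scalable and efficient solution for real-world anomaly detection tasks.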
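
Illustrative sketch: the abstract describes a reward that combines a classification reward with a VAE-based reconstruction error whose influence is scaled dynamically, but it gives no formula. The Python below is one possible reading, not the authors' implementation. The class name DynamicRewardScaler, the +1/-1 classification reward, the additive combination, the proportional update rule, and every parameter (coeff, target_reward, step_size, min_coeff, max_coeff) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DynamicRewardScaler:
    """Hypothetical reward scaler; names and update rule are assumptions."""

    coeff: float = 1.0          # current weight on the reconstruction-error term
    target_reward: float = 0.0  # assumed target for the mean episode reward
    step_size: float = 0.01     # assumed adaptation rate for the weight
    min_coeff: float = 0.0      # lower clamp for the weight
    max_coeff: float = 10.0     # upper clamp for the weight

    def reward(self, action: int, label: int, recon_error: float) -> float:
        """Combine a classification reward with a scaled reconstruction error."""
        # Assumed classification reward: +1 for a correct decision, -1 otherwise.
        classification = 1.0 if action == label else -1.0
        # Assumed additive combination; the VAE error acts as an intrinsic signal.
        return classification + self.coeff * recon_error

    def update(self, mean_episode_reward: float) -> float:
        """Adjust the weight after an episode and return the new value."""
        # Proportional rule (assumption): rewards below target raise the weight,
        # rewards above target lower it.
        self.coeff += self.step_size * (self.target_reward - mean_episode_reward)
        # Clamp so the weight stays inside [min_coeff, max_coeff].
        self.coeff = min(self.max_coeff, max(self.min_coeff, self.coeff))
        return self.coeff


if __name__ == "__main__":
    scaler = DynamicRewardScaler(coeff=1.0, target_reward=0.5, step_size=0.1)
    # Agent flags an anomaly (action=1) on a truly anomalous window (label=1).
    print(scaler.reward(action=1, label=1, recon_error=0.2))
    # The episode underperformed the target, so the weight increases.
    print(scaler.update(mean_episode_reward=-0.5))
```

In this sketch a larger weight lets the reconstruction error dominate the signal, which loosely corresponds to exploration, while a smaller weight favors the label-driven classification reward. How DRTA actually schedules the scaling is specified in the paper, not in this abstract.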

Related papers