Related papers: Semi-supervised learning via DQN for log anomaly detection

Semi-supervised learning via DQN for log anomaly detection

URL: http://arxiv.org/abs/2401.03151v1
Date: Sat, 6 Jan 2024 08:04:13 GMT
Title: Semi-supervised learning via DQN for log anomaly detection
Authors: Yingying He and Xiaobing Pei and Lihong Shen
Abstract summary: We propose a semi-supervised log anomaly detection method that combines the DQN algorithm from deep reinforcement learning, which is called DQNLog. Our evaluation on three widely-used datasets demonstrates that DQNLog significantly improves recall rate and F1-score while maintaining precision, validating its practicality.
Score: 1.5339370927841764
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Log anomaly detection plays a critical role in ensuring the security and maintenance of modern software systems. At present, the primary approach for detecting anomalies in log data is through supervised anomaly detection. Nonetheless, existing supervised methods heavily rely on labeled data, which can be frequently limited in real-world scenarios. In this paper, we propose a semi-supervised log anomaly detection method that combines the DQN algorithm from deep reinforcement learning, which is called DQNLog. DQNLog leverages a small amount of labeled data and a large-scale unlabeled dataset, effectively addressing the challenges of imbalanced data and limited labeling. This approach not only learns known anomalies by interacting with an environment biased towards anomalies but also discovers unknown anomalies by actively exploring the unlabeled dataset. Additionally, DQNLog incorporates a cross-entropy loss term to prevent model overestimation during Deep Reinforcement Learning (DRL). Our evaluation on three widely-used datasets demonstrates that DQNLog significantly improves recall rate and F1-score while maintaining precision, validating its practicality.

Related papers

DeepHYDRA: Resource-Efficient Time-Series Anomaly Detection in Dynamically-Configured Systems [3.44012349879073]
We present DeepHYDRA (Deep Hybrid DBSCAN/Reduction-Based Anomaly Detection) It combines DBSCAN and learning-based anomaly detection. It is shown to reliably detect different types of anomalies in both large and complex datasets.
arXiv Detail & Related papers (2024-05-13T13:47:15Z)
Unraveling the "Anomaly" in Time Series Anomaly Detection: A Self-supervised Tri-domain Solution [89.16750999704969]
Anomaly labels hinder traditional supervised models in time series anomaly detection. Various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue. We propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD)
arXiv Detail & Related papers (2023-11-19T05:37:18Z)
A Critical Review of Common Log Data Sets Used for Evaluation of Sequence-based Anomaly Detection Techniques [2.5339493426758906]
We analyze six publicly available log data sets with focus on the manifestations of anomalies and simple techniques for their detection. Our findings suggest that most anomalies are not directly related to sequential manifestations and that advanced detection techniques are not required to achieve high detection rates on these data sets.
arXiv Detail & Related papers (2023-09-06T09:31:17Z)
Multivariate Time-Series Anomaly Detection with Contaminated Data [9.46389554092506]
This paper presents a novel and practical end-to-end unsupervised TSAD when the training data are contaminated with anomalies. The introduced approach, called TSAD-C, is devoid of access to abnormality labels during the training phase. Our experiments conducted on three reliable datasets conclusively demonstrate that our approach surpasses existing methodologies.
arXiv Detail & Related papers (2023-08-24T05:10:18Z)
Graph Neural Networks based Log Anomaly Detection and Explanation [19.66344385835598]
Event logs are widely used to record the status of high-tech systems. Most existing log anomaly detection methods take a log event count matrix or log event sequences as input. We propose a graph-based method for unsupervised log anomaly detection, dubbed Logs2Graphs.
arXiv Detail & Related papers (2023-07-02T09:38:43Z)
PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows. Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts. Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect. Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
arXiv Detail & Related papers (2021-11-02T15:16:08Z)
TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs) To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics. To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data [150.9270911031327]
We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset. Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data. We propose here instead a deep reinforcement learning-based approach that enables an end-to-end optimization of the detection of both labeled and unlabeled anomalies.
arXiv Detail & Related papers (2020-09-15T03:05:39Z)
Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations. We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.