Semi-supervised learning via DQN for log anomaly detection
- URL: http://arxiv.org/abs/2401.03151v1
- Date: Sat, 6 Jan 2024 08:04:13 GMT
- Title: Semi-supervised learning via DQN for log anomaly detection
- Authors: Yingying He and Xiaobing Pei and Lihong Shen
- Abstract summary: We propose a semi-supervised log anomaly detection method that combines the DQN algorithm from deep reinforcement learning, which is called DQNLog.
Our evaluation on three widely-used datasets demonstrates that DQNLog significantly improves recall rate and F1-score while maintaining precision, validating its practicality.
- Score: 1.5339370927841764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Log anomaly detection plays a critical role in ensuring the security and
maintenance of modern software systems. At present, the primary approach for
detecting anomalies in log data is through supervised anomaly detection.
Nonetheless, existing supervised methods heavily rely on labeled data, which
can be frequently limited in real-world scenarios. In this paper, we propose a
semi-supervised log anomaly detection method that combines the DQN algorithm
from deep reinforcement learning, which is called DQNLog. DQNLog leverages a
small amount of labeled data and a large-scale unlabeled dataset, effectively
addressing the challenges of imbalanced data and limited labeling. This
approach not only learns known anomalies by interacting with an environment
biased towards anomalies but also discovers unknown anomalies by actively
exploring the unlabeled dataset. Additionally, DQNLog incorporates a
cross-entropy loss term to prevent model overestimation during Deep
Reinforcement Learning (DRL). Our evaluation on three widely-used datasets
demonstrates that DQNLog significantly improves recall rate and F1-score while
maintaining precision, validating its practicality.
Related papers
- DeepHYDRA: Resource-Efficient Time-Series Anomaly Detection in Dynamically-Configured Systems [3.44012349879073]
We present DeepHYDRA (Deep Hybrid DBSCAN/Reduction-Based Anomaly Detection)
It combines DBSCAN and learning-based anomaly detection.
It is shown to reliably detect different types of anomalies in both large and complex datasets.
arXiv Detail & Related papers (2024-05-13T13:47:15Z) - Unraveling the "Anomaly" in Time Series Anomaly Detection: A
Self-supervised Tri-domain Solution [89.16750999704969]
Anomaly labels hinder traditional supervised models in time series anomaly detection.
Various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue.
We propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD)
arXiv Detail & Related papers (2023-11-19T05:37:18Z) - A Critical Review of Common Log Data Sets Used for Evaluation of
Sequence-based Anomaly Detection Techniques [2.5339493426758906]
We analyze six publicly available log data sets with focus on the manifestations of anomalies and simple techniques for their detection.
Our findings suggest that most anomalies are not directly related to sequential manifestations and that advanced detection techniques are not required to achieve high detection rates on these data sets.
arXiv Detail & Related papers (2023-09-06T09:31:17Z) - Multivariate Time-Series Anomaly Detection with Contaminated Data [9.46389554092506]
This paper presents a novel and practical end-to-end unsupervised TSAD when the training data are contaminated with anomalies.
The introduced approach, called TSAD-C, is devoid of access to abnormality labels during the training phase.
Our experiments conducted on three reliable datasets conclusively demonstrate that our approach surpasses existing methodologies.
arXiv Detail & Related papers (2023-08-24T05:10:18Z) - Graph Neural Networks based Log Anomaly Detection and Explanation [19.66344385835598]
Event logs are widely used to record the status of high-tech systems.
Most existing log anomaly detection methods take a log event count matrix or log event sequences as input.
We propose a graph-based method for unsupervised log anomaly detection, dubbed Logs2Graphs.
arXiv Detail & Related papers (2023-07-02T09:38:43Z) - PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows.
Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z) - LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak
Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts.
Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect.
Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
arXiv Detail & Related papers (2021-11-02T15:16:08Z) - TadGAN: Time Series Anomaly Detection Using Generative Adversarial
Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs)
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z) - Toward Deep Supervised Anomaly Detection: Reinforcement Learning from
Partially Labeled Anomaly Data [150.9270911031327]
We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset.
Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data.
We propose here instead a deep reinforcement learning-based approach that enables an end-to-end optimization of the detection of both labeled and unlabeled anomalies.
arXiv Detail & Related papers (2020-09-15T03:05:39Z) - Self-Attentive Classification-Based Anomaly Detection in Unstructured
Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.