Log-based Anomaly Detection with Deep Learning: How Far Are We?
- URL: http://arxiv.org/abs/2202.04301v1
- Date: Wed, 9 Feb 2022 06:27:11 GMT
- Title: Log-based Anomaly Detection with Deep Learning: How Far Are We?
- Authors: Van Hoang Le and Hongyu Zhang
- Abstract summary: We conduct an in-depth analysis of five state-of-the-art deep learning-based models for detecting system anomalies on four public log datasets.
Our results show that the studied models do not always work well.
- Score: 7.967230034960396
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software-intensive systems produce logs for troubleshooting purposes.
Recently, many deep learning models have been proposed to automatically detect
system anomalies based on log data. These models typically claim very high
detection accuracy. For example, most models report an F-measure greater than
0.9 on the commonly-used HDFS dataset. To achieve a profound understanding of
how far we are from solving the problem of log-based anomaly detection, in this
paper, we conduct an in-depth analysis of five state-of-the-art deep
learning-based models for detecting system anomalies on four public log
datasets. Our experiments focus on several aspects of model evaluation,
including training data selection, data grouping, class distribution, data
noise, and early detection ability. Our results show that all these
aspects have a significant impact on the evaluation, and that the studied
models do not always work well. The problem of log-based anomaly detection has
not been solved yet. Based on our findings, we also suggest possible future
work.
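The abstract quotes F-measure as the headline metric and names class distribution as one factor that affects evaluation. The following is a minimal sketch of how the two interact, not the authors' evaluation code: the function name `f_measure` and all counts are hypothetical and chosen only for illustration.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the paper.
# Both cases have recall 0.9 and the same 10 false alarms;
# only the number of true anomalies differs (100 vs. 10).
many_anomalies = f_measure(tp=90, fp=10, fn=10)
few_anomalies = f_measure(tp=9, fp=10, fn=1)

print(f"many anomalies: {many_anomalies:.3f}")
print(f"few anomalies:  {few_anomalies:.3f}")
```

With these assumed counts the score falls from 0.9 to roughly 0.62 even though recall and the number of false alarms are unchanged, which illustrates why a reported F-measure depends on the class distribution of the evaluation data.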
Related papers
- Log-based Anomaly Detection of Enterprise Software: An Empirical Study [0.0]
We evaluate several state-of-the-art anomaly detection models on an industrial dataset from our research partner.
Results show that while all models are capable of detecting anomalies, certain models are better suited for less-structured datasets.
arXiv Detail & Related papers (2023-10-31T14:32:08Z)
- Exploring the Effectiveness of Dataset Synthesis: An application of Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data slightly underperforms a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z)
- PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows.
Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
- Deep Learning for Anomaly Detection in Log Data: A Survey [3.508620069426877]
Self-learning anomaly detection techniques capture patterns in log data and report unexpected log event occurrences.
Deep learning neural networks for this purpose have been presented.
There exist many different architectures for deep learning and it is non-trivial to encode raw and unstructured log data.
arXiv Detail & Related papers (2022-07-08T10:58:28Z)
- Deep vs. Shallow Learning: A Benchmark Study in Low Magnitude Earthquake Detection [0.0]
We build on an existing logistic regression model by adding four further features using elastic net driven data mining.
We evaluate the performance of the augmented logistic regression model relative to a deep (CNN) model, pre-trained on the Groningen data, on progressively increasing noise-to-signal ratios.
We discover that, for each ratio, our logistic regression model correctly detects every earthquake, while the deep model fails to detect nearly 20% of seismic events.
arXiv Detail & Related papers (2022-05-01T17:59:18Z)
- Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
- Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural network (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delays.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)