Related papers: Log-based Anomaly Detection of Enterprise Software: An Empirical Study

URL: http://arxiv.org/abs/2310.20492v1
Date: Tue, 31 Oct 2023 14:32:08 GMT
Title: Log-based Anomaly Detection of Enterprise Software: An Empirical Study
Authors: Nadun Wijesinghe (Calgary, Canada), Hadi Hemmati (Toronto, Canada)
Abstract summary: We evaluate several state-of-the-art anomaly detection models on an industrial dataset from our research partner. Results show that while all models are capable of detecting anomalies, certain models are better suited for less-structured datasets.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Most enterprise applications use logging as a mechanism to diagnose anomalies, which could help with reducing system downtime. Anomaly detection using software execution logs has been explored in several prior studies, using both classical and deep neural network-based machine learning models. In recent years, the research has largely focused on using variations of sequence-based deep neural networks (e.g., Long Short-Term Memory and Transformer-based models) for log-based anomaly detection on open-source data. However, these models have not been applied to industrial datasets as often. In addition, the studied open-source datasets are typically very large, with logging statements that do not change much over time, which may not be the case with a dataset from an industrial service that is relatively new. In this paper, we evaluate several state-of-the-art anomaly detection models on an industrial dataset from our research partner, which is much smaller and more loosely structured than most large-scale open-source benchmark datasets. Results show that while all models are capable of detecting anomalies, certain models are better suited for less-structured datasets. We also see that model effectiveness changes when a common data leak associated with a random train-test split in some prior work is removed. A qualitative study of the defects' characteristics identified by the developers on the industrial dataset further shows strengths and weaknesses of the models in detecting different types of anomalies. Finally, we explore the effect of limited training data by gradually increasing the training set size, to evaluate whether model effectiveness depends on the training set size.
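
The data leak mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the leak is avoided by splitting log sessions chronologically; the abstract does not say how the authors removed it. The session layout (`timestamp`, `events`, `label`) and all function names are hypothetical, chosen only for illustration.

```python
import random


def random_split(sessions, train_frac=0.8, seed=0):
    """Shuffle sessions before splitting; newer sessions may end up in training."""
    shuffled = list(sessions)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def chronological_split(sessions, train_frac=0.8):
    """Train on the earliest sessions and test on the most recent ones."""
    ordered = sorted(sessions, key=lambda s: s["timestamp"])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]


def has_temporal_leak(train, test):
    """Return True if any training session is newer than the oldest test session."""
    if not train or not test:
        return False
    newest_train = max(s["timestamp"] for s in train)
    oldest_test = min(s["timestamp"] for s in test)
    return newest_train > oldest_test


# Hypothetical toy data: ten log sessions with increasing timestamps.
sessions = [
    {"timestamp": t, "events": ["E1", "E2"], "label": 0} for t in range(10)
]

train, test = chronological_split(sessions)
print(has_temporal_leak(train, test))  # chronological split keeps the future out of training

r_train, r_test = random_split(sessions)
print(has_temporal_leak(r_train, r_test))  # a shuffled split usually does not
```

With a shuffled split, a model can see log patterns at training time that only appear later in the system's life, which can inflate the reported effectiveness.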

Related papers

What Information Contributes to Log-based Anomaly Detection? Insights from a Configurable Transformer-Based Approach [12.980238412281471]
We propose a Transformer-based anomaly detection model to capture semantic, sequential, and temporal information in log data. We conduct experiments with different combinations of input features to evaluate the roles of different types of information in anomaly detection. The results indicate that the event occurrence information plays a key role in identifying anomalies, while the impact of the sequential and temporal information is not significant for anomaly detection on the studied public datasets.
arXiv Detail & Related papers (2024-09-30T17:03:13Z)
TAB: Text-Align Anomaly Backbone Model for Industrial Inspection Tasks [12.660226544498023]
We propose a novel framework to adeptly train a backbone model tailored to the manufacturing domain. Our approach concurrently considers visual and text-aligned embedding spaces for normal and abnormal conditions. The resulting pre-trained backbone markedly enhances performance in industrial downstream tasks.
arXiv Detail & Related papers (2023-12-15T01:37:29Z)
An Outlier Exposure Approach to Improve Visual Anomaly Detection Performance for Mobile Robots [76.36017224414523]
We consider the problem of building visual anomaly detection systems for mobile robots. Standard anomaly detection models are trained using large datasets composed only of non-anomalous data. We tackle the problem of exploiting these data to improve the performance of a Real-NVP anomaly detection model.
arXiv Detail & Related papers (2022-09-20T15:18:13Z)
Deep Learning for Anomaly Detection in Log Data: A Survey [3.508620069426877]
Self-learning anomaly detection techniques capture patterns in log data and report unexpected log event occurrences. Deep learning neural networks for this purpose have been presented. There exist many different architectures for deep learning and it is non-trivial to encode raw and unstructured log data.
arXiv Detail & Related papers (2022-07-08T10:58:28Z)
Log-based Anomaly Detection with Deep Learning: How Far Are We? [7.967230034960396]
We conduct an in-depth analysis of five state-of-the-art deep learning-based models for detecting system anomalies on four public log datasets. Our results show that the studied models do not always work well.
arXiv Detail & Related papers (2022-02-09T06:27:11Z)
Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction. We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models. The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability. Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
Deep Visual Anomaly detection with Negative Learning [18.79849041106952]
In this paper, we propose anomaly detection with negative learning (ADNL), which employs the negative learning concept to enhance anomaly detection. The idea is to limit the reconstruction capability of a generative model using a small number of given anomaly examples. This way, the network not only learns to reconstruct normal data but also encloses the normal distribution far from the possible distribution of anomalies.
arXiv Detail & Related papers (2021-05-24T01:48:44Z)
Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks. Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair. A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance. One direction aims at recognizing recurring anomaly types to enable remediation automation. We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users. We propose a framework for anomaly detection in log data, which is a major source of system information for troubleshooting.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
The Effectiveness of Discretization in Forecasting: An Empirical Study on Neural Time Series Models [15.281725756608981]
We investigate the effect of data input and output transformations on the predictive performance of neural forecasting architectures. We find that binning almost always improves performance compared to using normalized real-valued inputs.
arXiv Detail & Related papers (2020-05-20T15:09:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.