Related papers: Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs

URL: http://arxiv.org/abs/2008.09340v1
Date: Fri, 21 Aug 2020 07:26:55 GMT
Title: Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs
Authors: Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, Odej Kao
Abstract summary: We propose Logsy, a classification-based method to learn log representations. We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
Score: 59.04636530383049
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The detection of anomalies is an essential mining task for the security and reliability of computer systems. Logs are a common and major data source for anomaly detection methods in almost every computer system. They collect a range of significant events describing the runtime system status. Recent studies have focused predominantly on one-class deep learning methods on predefined non-learnable numerical log representations. The main limitation is that these models are not able to learn log representations describing the semantic differences between normal and anomaly logs, leading to poor generalization on unseen logs. We propose Logsy, a classification-based method to learn log representations in a way that distinguishes between normal data from the system of interest and anomaly samples from auxiliary log datasets, easily accessible via the internet. The idea behind such an approach to anomaly detection is that the auxiliary dataset is sufficiently informative to enhance the representation of the normal data, yet diverse enough to regularize against overfitting and improve generalization. We propose an attention-based encoder model with a new hyperspherical loss function. This enables learning compact log representations capturing the intrinsic differences between normal and anomaly logs. Empirically, we show an average improvement of 0.25 in the F1 score, compared to the previous methods. To investigate the properties of Logsy, we perform additional experiments including evaluation of the effect of the auxiliary data size, the influence of expert knowledge, and the quality of the learned log representations. The results show that the learned representations boost the performance of previous methods such as PCA, with a relative improvement of 28.2%.

Related papers

Universal Transformation of One-Class Classifiers for Unsupervised Anomaly Detection [51.73001988341294]
Anomaly detection is typically formulated as a one-class classification problem. We present a dataset folding method that transforms an arbitrary one-class classifier-based anomaly detector into a fully unsupervised method.
arXiv Detail & Related papers (2026-02-13T16:54:12Z)
What Information Contributes to Log-based Anomaly Detection? Insights from a Configurable Transformer-Based Approach [12.980238412281471]
We propose a Transformer-based anomaly detection model to capture semantic, sequential, and temporal information in log data. We conduct experiments with different combinations of input features to evaluate the roles of different types of information in anomaly detection. The results indicate that the event occurrence information plays a key role in identifying anomalies, while the impact of the sequential and temporal information is not significant for anomaly detection on the studied public datasets.
arXiv Detail & Related papers (2024-09-30T17:03:13Z)
LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection [73.69399219776315]
We propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains. Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data. Then, we transfer such knowledge to the target domain via shared parameters.
arXiv Detail & Related papers (2024-01-09T12:55:21Z)
RAPID: Training-free Retrieval-based Log Anomaly Detection with PLM considering Token-level information [7.861095039299132]
The need for log anomaly detection is growing, especially in real-world applications. Traditional deep learning-based anomaly detection models require dataset-specific training, leading to corresponding delays. We introduce RAPID, a model that capitalizes on the inherent features of log data to enable anomaly detection without training delays.
arXiv Detail & Related papers (2023-11-09T06:11:44Z)
Impact of Log Parsing on Deep Learning-Based Anomaly Detection [4.0719622481627376]
We show that there is no strong correlation between log parsing accuracy and anomaly detection accuracy. We experimentally confirm existing theoretical results showing that it is a property that we refer to as distinguishability in log parsing results.
arXiv Detail & Related papers (2023-05-25T09:53:02Z)
PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows. Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
LogGD: Detecting Anomalies from System Logs by Graph Neural Networks [14.813971618949068]
We propose a novel graph-based log anomaly detection method, LogGD, to effectively address the issue. We exploit the powerful capability of Graph Transformer Neural Network, which combines graph structure and node semantics for log-based anomaly detection.
arXiv Detail & Related papers (2022-09-16T11:51:58Z)
Log-based Anomaly Detection Without Log Parsing [7.66638994053231]
We propose NeuralLog, a novel log-based anomaly detection approach that does not require log parsing. Our experimental results show that the proposed approach can effectively understand the semantic meaning of log messages. Overall, NeuralLog achieves F1-scores greater than 0.95 on four public datasets, outperforming the existing approaches.
arXiv Detail & Related papers (2021-08-04T10:42:13Z)
Anomalous Sound Detection Using a Binary Classification Model and Class Centroids [47.856367556856554]
We propose a binary classification model that is developed by using not only normal data but also outlier data in the other domains as pseudo-anomalous sound data. We also investigate the effectiveness of additionally using anomalous sound data for further improving the binary classification model.
arXiv Detail & Related papers (2021-06-11T03:35:06Z)
TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance. One direction aims at the recognition of re-occurring anomaly types to enable remediation automation. We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users. We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.