Self-Attentive Classification-Based Anomaly Detection in Unstructured
Logs
- URL: http://arxiv.org/abs/2008.09340v1
- Date: Fri, 21 Aug 2020 07:26:55 GMT
- Title: Self-Attentive Classification-Based Anomaly Detection in Unstructured
Logs
- Authors: Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso,
Odej Kao
- Abstract summary: We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
- Score: 59.04636530383049
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The detection of anomalies is an essential mining task for the security and
reliability in computer systems. Logs are a common and major data source for
anomaly detection methods in almost every computer system. They collect a range
of significant events describing the runtime system status. Recent studies have
focused predominantly on one-class deep learning methods on predefined
non-learnable numerical log representations. The main limitation is that these
models are not able to learn log representations describing the semantic
differences between normal and anomaly logs, leading to poor generalization
to unseen logs. We propose Logsy, a classification-based method to learn log
representations in a way to distinguish between normal data from the system of
interest and anomaly samples from auxiliary log datasets, easily accessible via
the internet. The idea behind such an approach to anomaly detection is that the
auxiliary dataset is sufficiently informative to enhance the representation of
the normal data, yet diverse enough to regularize against overfitting and improve
generalization. We propose an attention-based encoder model with a new
hyperspherical loss function. This enables learning compact log representations
capturing the intrinsic differences between normal and anomaly logs.
Empirically, we show an average improvement of 0.25 in the F1 score, compared
to the previous methods. To investigate the properties of Logsy, we perform
additional experiments including evaluation of the effect of the auxiliary data
size, the influence of expert knowledge, and the quality of the learned log
representations. The results show that the learned representations boost the
performance of the previous methods such as PCA with a relative improvement of
28.2%.
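The abstract describes a hyperspherical loss that pulls normal logs into a compact region and pushes auxiliary anomaly samples away from it, but it does not give the formula. The following is a minimal sketch of one plausible objective of that kind, assuming the hypersphere is centered at the origin and that embeddings are already computed; the function names `hyperspherical_loss` and `anomaly_score`, the `eps` term, and the exact form of the loss are illustrative assumptions, not the authors' implementation.

```python
import math


def anomaly_score(vec):
    """Squared distance of an embedding from the assumed center (the origin)."""
    return sum(v * v for v in vec)


def hyperspherical_loss(embeddings, labels, eps=1e-12):
    """Mean loss over a batch of embeddings (assumed form, not from the paper).

    embeddings: list of vectors (lists of floats) produced by some encoder
    labels: 0 for normal logs of the target system, 1 for auxiliary anomaly logs
    eps: small constant (an assumption) that keeps the logarithm finite
    """
    total = 0.0
    for vec, y in zip(embeddings, labels):
        d = anomaly_score(vec)
        if y == 0:
            # Normal samples are penalized for lying far from the center.
            total += d
        else:
            # Anomaly samples are penalized for lying close to the center.
            total += -math.log(1.0 - math.exp(-d) + eps)
    return total / len(embeddings)


# A normal log near the center and an anomaly far from it give a small loss;
# swapping the labels gives a much larger one.
well_separated = hyperspherical_loss([[0.1, 0.0], [3.0, 4.0]], [0, 1])
badly_separated = hyperspherical_loss([[0.1, 0.0], [3.0, 4.0]], [1, 0])
```

Under this assumed formulation, the same squared distance can serve as the anomaly score at inference time: a log whose embedding lies farther from the center than a chosen threshold would be flagged as anomalous.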
Related papers
- LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection [73.69399219776315]
We propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains.
Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data.
Then, we transfer such knowledge to the target domain via shared parameters.
arXiv Detail & Related papers (2024-01-09T12:55:21Z)
- RAPID: Training-free Retrieval-based Log Anomaly Detection with PLM considering Token-level information [7.861095039299132]
The need for log anomaly detection is growing, especially in real-world applications.
Traditional deep learning-based anomaly detection models require dataset-specific training, leading to corresponding delays.
We introduce RAPID, a model that capitalizes on the inherent features of log data to enable anomaly detection without training delays.
arXiv Detail & Related papers (2023-11-09T06:11:44Z)
- Impact of Log Parsing on Deep Learning-Based Anomaly Detection [4.0719622481627376]
We show that there is no strong correlation between log parsing accuracy and anomaly detection accuracy.
We experimentally confirm existing theoretical results showing that it is a property that we refer to as distinguishability in log parsing results.
arXiv Detail & Related papers (2023-05-25T09:53:02Z)
- PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows.
Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
- LogGD: Detecting Anomalies from System Logs by Graph Neural Networks [14.813971618949068]
We propose a novel graph-based log anomaly detection method, LogGD, to effectively address the issue.
We exploit the powerful capability of Graph Transformer Neural Network, which combines graph structure and node semantics for log-based anomaly detection.
arXiv Detail & Related papers (2022-09-16T11:51:58Z)
- Log-based Anomaly Detection Without Log Parsing [7.66638994053231]
We propose NeuralLog, a novel log-based anomaly detection approach that does not require log parsing.
Our experimental results show that the proposed approach can effectively understand the semantic meaning of log messages.
Overall, NeuralLog achieves F1-scores greater than 0.95 on four public datasets, outperforming the existing approaches.
arXiv Detail & Related papers (2021-08-04T10:42:13Z)
- Anomalous Sound Detection Using a Binary Classification Model and Class Centroids [47.856367556856554]
We propose a binary classification model that is developed by using not only normal data but also outlier data in the other domains as pseudo-anomalous sound data.
We also investigate the effectiveness of additionally using anomalous sound data for further improving the binary classification model.
arXiv Detail & Related papers (2021-06-11T03:35:06Z)
- TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance.
One direction aims at the recognition of re-occurring anomaly types to enable remediation automation.
We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.