Unsupervised Log Anomaly Detection with Few Unique Tokens
- URL: http://arxiv.org/abs/2310.08951v2
- Date: Tue, 23 Jul 2024 10:20:32 GMT
- Title: Unsupervised Log Anomaly Detection with Few Unique Tokens
- Authors: Antonin Sulc, Annika Eichler, Tim Wilksen
- Abstract summary: This article introduces a method to detect anomalies in the log data generated by control system nodes at the European XFEL accelerator.
The primary aim of this proposed method is to provide operators with a comprehensive understanding of the availability, status, and problems specific to each node.
- Score: 1.9389881806157316
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This article introduces a method to detect anomalies in the log data generated by control system nodes at the European XFEL accelerator. The primary aim of this proposed method is to provide operators with a comprehensive understanding of the availability, status, and problems specific to each node. This information is vital for ensuring smooth operation. The sequential nature of logs and the absence of a rich text corpus that is specific to our nodes pose significant limitations for traditional and learning-based approaches for anomaly detection. To overcome these limitations, we propose a method that uses word embeddings and models individual nodes as a sequence of these vectors that commonly co-occur, using a Hidden Markov Model (HMM). We score individual log entries by computing a probability ratio between the probability of the full log sequence including the new entry and the probability of just the previous log entries, without the new entry. This ratio indicates how probable the sequence becomes when the new entry is added. The proposed approach can detect anomalies by scoring and ranking log entries from European XFEL nodes, where entries that receive high scores are potential anomalies that do not fit the routine of the node. This method provides a warning system to alert operators about these irregular log events that may indicate issues.
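The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the isotropic Gaussian emission model, the hand-set parameters, the sign convention (score = -log of the probability ratio, so that unlikely entries score high), and all function names are assumptions made for illustration. The sequence likelihoods come from the standard HMM forward algorithm, computed in log space.

```python
import numpy as np

def logsumexp(a, axis=None):
    """Numerically stable log(sum(exp(a))) along an axis."""
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def log_gaussian(x, means, var):
    """Log-density of x under one isotropic Gaussian per hidden state (assumed emission model)."""
    d = x.shape[0]
    sq_dist = np.sum((means - x) ** 2, axis=1)
    return -0.5 * (d * np.log(2.0 * np.pi * var) + sq_dist / var)

def entry_scores(embeddings, log_pi, log_A, means, var=1.0):
    """Score each entry as -log( P(o_1..o_t) / P(o_1..o_{t-1}) ) using the forward algorithm."""
    scores = []
    log_alpha = None
    prev_loglik = 0.0  # log-probability of the empty sequence
    for x in embeddings:
        log_b = log_gaussian(x, means, var)
        if log_alpha is None:
            log_alpha = log_pi + log_b
        else:
            # log_alpha[j] = log_b[j] + logsumexp_i(log_alpha[i] + log_A[i, j])
            log_alpha = log_b + logsumexp(log_alpha[:, None] + log_A, axis=0)
        loglik = float(logsumexp(log_alpha))
        scores.append(prev_loglik - loglik)  # high score = entry makes the sequence much less probable
        prev_loglik = loglik
    return np.array(scores)

# Toy data: 2-D vectors stand in for log-entry embeddings (hypothetical values).
means = np.array([[0.0, 0.0], [5.0, 5.0]])
log_pi = np.log(np.array([0.5, 0.5]))
log_A = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
entries = np.array([[0.0, 0.0], [0.1, -0.1], [5.0, 5.0], [20.0, -20.0]])

scores = entry_scores(entries, log_pi, log_A, means)
print(scores)           # one score per entry
print(scores.argmax())  # index of the entry that fits the learned routine least
```

In practice the HMM parameters would be fitted to the embedded log history of a node rather than set by hand, and entries would be ranked by score so that operators review the highest-scoring ones first.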
Related papers
- Rethinking Contrastive Learning in Graph Anomaly Detection: A Clean-View Perspective [54.605073936695575]
Graph anomaly detection aims to identify unusual patterns in graph-based data, with wide applications in fields such as web security and financial fraud detection.
Existing methods rely on contrastive learning, assuming that a lower similarity between a node and its local subgraph indicates abnormality.
The presence of interfering edges invalidates this assumption, since it introduces disruptive noise that compromises the contrastive learning process.
We propose a Clean-View Enhanced Graph Anomaly Detection framework (CVGAD), which includes a multi-scale anomaly awareness module to identify key sources of interference in the contrastive learning process.
arXiv Detail & Related papers (2025-05-23T15:05:56Z) - ADALog: Adaptive Unsupervised Anomaly detection in Logs with Self-attention Masked Language Model [2.55347686868565]
ADALog is an adaptive, unsupervised anomaly detection framework.
It operates on individual unstructured logs, extracts intra-log contextual relationships, and performs adaptive thresholding on normal data.
We evaluate ADALog on benchmark datasets BGL, Thunderbird, and Spirit.
arXiv Detail & Related papers (2025-05-15T17:31:40Z) - Multitask Active Learning for Graph Anomaly Detection [48.690169078479116]
We propose a novel MultItask acTIve Graph Anomaly deTEction framework, namely MITIGATE.
By coupling node classification tasks, MITIGATE obtains the capability to detect out-of-distribution nodes without known anomalies.
Empirical studies on four datasets demonstrate that MITIGATE significantly outperforms the state-of-the-art methods for anomaly detection.
arXiv Detail & Related papers (2024-01-24T03:43:45Z) - Semi-supervised learning via DQN for log anomaly detection [1.5339370927841764]
Current methods in log anomaly detection face challenges such as underutilization of unlabeled data, imbalance between normal and anomaly class data, and high rates of false positives and false negatives.
We propose a semi-supervised log anomaly detection method named DQNLog, which integrates deep reinforcement learning to enhance anomaly detection performance.
We evaluate DQNLog on three widely used datasets, demonstrating its ability to effectively utilize large-scale unlabeled data.
arXiv Detail & Related papers (2024-01-06T08:04:13Z) - A Supervised Embedding and Clustering Anomaly Detection method for classification of Mobile Network Faults [0.0]
The paper introduces Supervised Embedding and Clustering Anomaly Detection (SEMC-AD).
It is a method designed to efficiently identify faulty alarm logs in a mobile network and alleviate the challenges of manual monitoring.
SEMC-AD achieves 99% anomaly detection, whereas random forest and XGBoost only detect 86% and 81% of anomalies, respectively.
arXiv Detail & Related papers (2023-10-10T16:54:25Z) - GLAD: Content-aware Dynamic Graphs For Log Anomaly Detection [49.9884374409624]
GLAD is a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
arXiv Detail & Related papers (2023-09-12T04:21:30Z) - BOURNE: Bootstrapped Self-supervised Learning Framework for Unified Graph Anomaly Detection [50.26074811655596]
We propose a novel unified graph anomaly detection framework based on bootstrapped self-supervised learning (named BOURNE).
By swapping the context embeddings between nodes and edges, we enable the mutual detection of node and edge anomalies.
BOURNE can eliminate the need for negative sampling, thereby enhancing its efficiency in handling large graphs.
arXiv Detail & Related papers (2023-07-28T00:44:57Z) - Graph Neural Networks based Log Anomaly Detection and Explanation [19.66344385835598]
Event logs are widely used to record the status of high-tech systems.
Most existing log anomaly detection methods take a log event count matrix or log event sequences as input.
We propose a graph-based method for unsupervised log anomaly detection, dubbed Logs2Graphs.
arXiv Detail & Related papers (2023-07-02T09:38:43Z) - Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations.
Recent advances accomplish this task by leveraging clustering-based pseudo labels.
We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z) - LogGD:Detecting Anomalies from System Logs by Graph Neural Networks [14.813971618949068]
We propose a novel graph-based log anomaly detection method, LogGD, to effectively address the issue.
We exploit the powerful capability of Graph Transformer Neural Network, which combines graph structure and node semantics for log-based anomaly detection.
arXiv Detail & Related papers (2022-09-16T11:51:58Z) - LogLG: Weakly Supervised Log Anomaly Detection via Log-Event Graph Construction [31.31712326361932]
We propose a novel weakly supervised log anomaly detection framework, named LogLG, to explore the semantic connections among keywords from sequences.
Specifically, we design an end-to-end iterative process, where the keywords of unlabeled logs are first extracted to construct a log-event graph.
Then, we build a subgraph annotator to generate pseudo labels for unlabeled log sequences.
arXiv Detail & Related papers (2022-08-23T09:32:19Z) - A2Log: Attentive Augmented Log Anomaly Detection [53.06341151551106]
Anomaly detection becomes increasingly important for the dependability and serviceability of IT services.
Existing unsupervised methods need anomaly examples to obtain a suitable decision boundary.
We develop A2Log, which is an unsupervised anomaly detection method consisting of two steps: Anomaly scoring and anomaly decision.
arXiv Detail & Related papers (2021-09-20T13:40:21Z) - Log-based Anomaly Detection Without Log Parsing [7.66638994053231]
We propose NeuralLog, a novel log-based anomaly detection approach that does not require log parsing.
Our experimental results show that the proposed approach can effectively understand the semantic meaning of log messages.
Overall, NeuralLog achieves F1-scores greater than 0.95 on four public datasets, outperforming the existing approaches.
arXiv Detail & Related papers (2021-08-04T10:42:13Z) - Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z) - Recomposition vs. Prediction: A Novel Anomaly Detection for Discrete Events Based On Autoencoder [5.781280693720236]
One of the most challenging problems in the field of intrusion detection is anomaly detection for discrete event logs.
We propose DabLog, a Deep Autoencoder-Based anomaly detection method for discrete event Logs.
Our approach determines whether a sequence is normal or abnormal by analyzing (encoding) and reconstructing (decoding) the given sequence.
arXiv Detail & Related papers (2020-12-27T16:31:05Z) - ESAD: End-to-end Deep Semi-supervised Anomaly Detection [85.81138474858197]
We propose a new objective function that measures the KL-divergence between normal and anomalous data.
The proposed method significantly outperforms several state-of-the-art methods on multiple benchmark datasets.
arXiv Detail & Related papers (2020-12-09T08:16:35Z) - Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.