Unsupervised Log Anomaly Detection with Few Unique Tokens
- URL: http://arxiv.org/abs/2310.08951v3
- Date: Wed, 21 May 2025 21:21:20 GMT
- Title: Unsupervised Log Anomaly Detection with Few Unique Tokens
- Authors: Antonin Sulc, Annika Eichler, Tim Wilksen
- Abstract summary: This article introduces a novel method for detecting anomalies within log data from control system nodes at the European XFEL accelerator. Anomalies are identified by scoring individual log entries based on a probability ratio. High scores indicate potential anomalies that deviate from the node's routine behavior.
- Score: 1.9389881806157316
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This article introduces a novel method for detecting anomalies within log data from control system nodes at the European XFEL accelerator. Effective anomaly detection is crucial for providing operators with a clear understanding of each node's availability, status, and potential problems, thereby ensuring smooth accelerator operation. Traditional and learning-based anomaly detection methods face significant limitations due to the sequential nature of these logs and the lack of a rich, node-specific text corpus. To address this, we propose an approach utilizing word embeddings to represent log entries and a Hidden Markov Model (HMM) to model the typical sequential patterns of these embeddings for individual nodes. Anomalies are identified by scoring individual log entries based on a probability ratio: this ratio compares the likelihood of the log sequence including the new entry against its likelihood without it, effectively measuring how well the new entry fits the established pattern. High scores indicate potential anomalies that deviate from the node's routine behavior. This method functions as a warning system, alerting operators to irregular log events that may signify underlying issues, thereby facilitating proactive intervention.
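The scoring idea in the abstract can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the HMM parameters are set by hand rather than learned, the "embeddings" are toy 2-D vectors, the emissions are assumed to be isotropic Gaussians, and the sign convention (log-likelihood without the new entry minus log-likelihood with it, so that higher means more unusual) is an assumption. All function and variable names are hypothetical.

```python
import numpy as np

def log_gauss(x, means, var):
    # Log density of x under one isotropic Gaussian per hidden state.
    # x has shape (d,), means has shape (k, d); returns shape (k,).
    d = x.shape[0]
    sq = np.sum((x - means) ** 2, axis=1)
    return -0.5 * (sq / var + d * np.log(2 * np.pi * var))

def logsumexp(a, axis=None):
    # Numerically stable log(sum(exp(a))) along the given axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def sequence_loglik(seq, log_pi, log_A, means, var):
    # Forward algorithm in log space: returns log P(seq) under the HMM.
    if len(seq) == 0:
        return 0.0  # empty sequence has probability 1
    alpha = log_pi + log_gauss(seq[0], means, var)
    for x in seq[1:]:
        # log_A[i, j] = log P(next state j | current state i)
        alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_gauss(x, means, var)
    return float(logsumexp(alpha))

def anomaly_score(history, new_entry, params):
    # Assumed convention: log P(history) - log P(history + new entry).
    # A larger value means the new entry fits the learned pattern worse.
    with_new = sequence_loglik(history + [new_entry], *params)
    without = sequence_loglik(history, *params)
    return without - with_new

# Hand-set toy parameters (hypothetical): two states that tend to alternate.
log_pi = np.log(np.array([0.5, 0.5]))
log_A = np.log(np.array([[0.1, 0.9], [0.9, 0.1]]))
means = np.array([[0.0, 0.0], [1.0, 1.0]])
var = 0.1
params = (log_pi, log_A, means, var)

# Toy "embedded log entries" that follow the alternating pattern.
history = [np.array([0.0, 0.1]), np.array([1.0, 0.9]), np.array([0.1, 0.0])]

score_typical = anomaly_score(history, np.array([0.9, 1.0]), params)
score_outlier = anomaly_score(history, np.array([5.0, -5.0]), params)
print(score_typical, score_outlier)
```

In this toy setup the entry far from both state means receives the larger score. In practice the HMM would be fitted per node on embedded historical logs, and a threshold on the score would decide when to warn an operator.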
Related papers
- Rethinking Contrastive Learning in Graph Anomaly Detection: A Clean-View Perspective [54.605073936695575]
Graph anomaly detection aims to identify unusual patterns in graph-based data, with wide applications in fields such as web security and financial fraud detection. Existing methods rely on contrastive learning, assuming that a lower similarity between a node and its local subgraph indicates abnormality. The presence of interfering edges invalidates this assumption, since it introduces disruptive noise that compromises the contrastive learning process. We propose a Clean-View Enhanced Graph Anomaly Detection framework (CVGAD), which includes a multi-scale anomaly awareness module to identify key sources of interference in the contrastive learning process.
arXiv Detail & Related papers (2025-05-23T15:05:56Z) - ADALog: Adaptive Unsupervised Anomaly detection in Logs with Self-attention Masked Language Model [2.55347686868565]
ADALog is an adaptive, unsupervised anomaly detection framework. It operates on individual unstructured logs, extracts intra-log contextual relationships, and performs adaptive thresholding on normal data. We evaluate ADALog on benchmark datasets BGL, Thunderbird, and Spirit.
arXiv Detail & Related papers (2025-05-15T17:31:40Z) - Multitask Active Learning for Graph Anomaly Detection [48.690169078479116]
We propose a novel MultItask acTIve Graph Anomaly deTEction framework, namely MITIGATE.
By coupling node classification tasks, MITIGATE obtains the capability to detect out-of-distribution nodes without known anomalies.
Empirical studies on four datasets demonstrate that MITIGATE significantly outperforms the state-of-the-art methods for anomaly detection.
arXiv Detail & Related papers (2024-01-24T03:43:45Z) - Semi-supervised learning via DQN for log anomaly detection [1.5339370927841764]
Current methods in log anomaly detection face challenges such as underutilization of unlabeled data, imbalance between normal and anomaly class data, and high rates of false positives and false negatives.
We propose a semi-supervised log anomaly detection method named DQNLog, which integrates deep reinforcement learning to enhance anomaly detection performance.
We evaluate DQNLog on three widely used datasets, demonstrating its ability to effectively utilize large-scale unlabeled data.
arXiv Detail & Related papers (2024-01-06T08:04:13Z) - A Supervised Embedding and Clustering Anomaly Detection method for classification of Mobile Network Faults [0.0]
The paper introduces Supervised Embedding and Clustering Anomaly Detection (SEMC-AD).
It is a method designed to efficiently identify faulty alarm logs in a mobile network and alleviate the challenges of manual monitoring.
SEMC-AD achieves 99% anomaly detection, whereas random forest and XGBoost only detect 86% and 81% of anomalies, respectively.
arXiv Detail & Related papers (2023-10-10T16:54:25Z) - GLAD: Content-aware Dynamic Graphs For Log Anomaly Detection [49.9884374409624]
We introduce GLAD, a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
arXiv Detail & Related papers (2023-09-12T04:21:30Z) - BOURNE: Bootstrapped Self-supervised Learning Framework for Unified Graph Anomaly Detection [50.26074811655596]
We propose a novel unified graph anomaly detection framework based on bootstrapped self-supervised learning (named BOURNE).
By swapping the context embeddings between nodes and edges, we enable the mutual detection of node and edge anomalies.
BOURNE can eliminate the need for negative sampling, thereby enhancing its efficiency in handling large graphs.
arXiv Detail & Related papers (2023-07-28T00:44:57Z) - Graph Neural Networks based Log Anomaly Detection and Explanation [19.66344385835598]
Event logs are widely used to record the status of high-tech systems.
Most existing log anomaly detection methods take a log event count matrix or log event sequences as input.
We propose a graph-based method for unsupervised log anomaly detection, dubbed Logs2Graphs.
arXiv Detail & Related papers (2023-07-02T09:38:43Z) - Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations.
Recent advances accomplish this task by leveraging clustering-based pseudo labels.
We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z) - LogGD: Detecting Anomalies from System Logs by Graph Neural Networks [14.813971618949068]
We propose a novel graph-based log anomaly detection method, LogGD, to effectively address the issue.
We exploit the powerful capability of Graph Transformer Neural Network, which combines graph structure and node semantics for log-based anomaly detection.
arXiv Detail & Related papers (2022-09-16T11:51:58Z) - LogLG: Weakly Supervised Log Anomaly Detection via Log-Event Graph Construction [31.31712326361932]
We propose a novel weakly supervised log anomaly detection framework, named LogLG, to explore the semantic connections among keywords from sequences.
Specifically, we design an end-to-end iterative process, where the keywords of unlabeled logs are first extracted to construct a log-event graph.
Then, we build a subgraph annotator to generate pseudo labels for unlabeled log sequences.
arXiv Detail & Related papers (2022-08-23T09:32:19Z) - A2Log: Attentive Augmented Log Anomaly Detection [53.06341151551106]
Anomaly detection becomes increasingly important for the dependability and serviceability of IT services.
Existing unsupervised methods need anomaly examples to obtain a suitable decision boundary.
We develop A2Log, which is an unsupervised anomaly detection method consisting of two steps: Anomaly scoring and anomaly decision.
arXiv Detail & Related papers (2021-09-20T13:40:21Z) - Log-based Anomaly Detection Without Log Parsing [7.66638994053231]
We propose NeuralLog, a novel log-based anomaly detection approach that does not require log parsing.
Our experimental results show that the proposed approach can effectively understand the semantic meaning of log messages.
Overall, NeuralLog achieves F1-scores greater than 0.95 on four public datasets, outperforming the existing approaches.
arXiv Detail & Related papers (2021-08-04T10:42:13Z) - Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z) - Recomposition vs. Prediction: A Novel Anomaly Detection for Discrete Events Based On Autoencoder [5.781280693720236]
One of the most challenging problems in the field of intrusion detection is anomaly detection for discrete event logs.
We propose DabLog, a Deep Autoencoder-Based anomaly detection method for discrete event Logs.
Our approach determines whether a sequence is normal or abnormal by analyzing (encoding) and reconstructing (decoding) the given sequence.
arXiv Detail & Related papers (2020-12-27T16:31:05Z) - ESAD: End-to-end Deep Semi-supervised Anomaly Detection [85.81138474858197]
We propose a new objective function that measures the KL-divergence between normal and anomalous data.
The proposed method significantly outperforms several state-of-the-art methods on multiple benchmark datasets.
arXiv Detail & Related papers (2020-12-09T08:16:35Z) - Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)