Log Anomaly Detection with Large Language Models via Knowledge-Enriched Fusion
- URL: http://arxiv.org/abs/2512.11997v1
- Date: Fri, 12 Dec 2025 19:24:54 GMT
- Title: Log Anomaly Detection with Large Language Models via Knowledge-Enriched Fusion
- Authors: Anfeng Peng, Ajesh Koyatan Chathoth, Stephen Lee,
- Abstract summary: EnrichLog is a training-free, entry-based anomaly detection framework.<n>It enriches raw log entries with both corpus-specific and sample-specific knowledge.<n>We evaluate EnrichLog on four large-scale system log benchmark datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: System logs are a critical resource for monitoring and managing distributed systems, providing insights into failures and anomalous behavior. Traditional log analysis techniques, including template-based and sequence-driven approaches, often lose important semantic information or struggle with ambiguous log patterns. To address this, we present EnrichLog, a training-free, entry-based anomaly detection framework that enriches raw log entries with both corpus-specific and sample-specific knowledge. EnrichLog incorporates contextual information, including historical examples and reasoning derived from the corpus, to enable more accurate and interpretable anomaly detection. The framework leverages retrieval-augmented generation to integrate relevant contextual knowledge without requiring retraining. We evaluate EnrichLog on four large-scale system log benchmark datasets and compare it against five baseline methods. Our results show that EnrichLog consistently improves anomaly detection performance, effectively handles ambiguous log entries, and maintains efficient inference. Furthermore, incorporating both corpus- and sample-specific knowledge enhances model confidence and detection accuracy, making EnrichLog well-suited for practical deployments.
Related papers
- A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers [0.9558392439655014]
CoLog is a framework that collaboratively encodes logs utilizing various modalities.<n>In detecting both point and collective anomalies, CoLog achieves a mean precision of 99.63%, a mean recall of 99.59%, and a mean F1 score of 99.61%.
arXiv Detail & Related papers (2025-12-29T11:18:34Z) - FusionLog: Cross-System Log-based Anomaly Detection via Fusion of General and Proprietary Knowledge [10.135000927533385]
FusionLog is a novel zero-label cross-system log-based anomaly detection method.<n>It achieves the fusion of general and proprietary knowledge, enabling cross-system generalization without labeled target logs.<n>Experiments show that FusionLog achieves over 90% F1-score under a fully zero-label setting.
arXiv Detail & Related papers (2025-11-08T06:30:50Z) - Revisiting Logit Distributions for Reliable Out-of-Distribution Detection [73.9121001113687]
Out-of-distribution (OOD) detection is critical for ensuring the reliability of deep learning models in open-world applications.<n>LogitGap is a novel post-hoc OOD detection method that exploits the relationship between the maximum logit and the remaining logits.<n>We show that LogitGap consistently achieves state-of-the-art performance across diverse OOD detection scenarios and benchmarks.
arXiv Detail & Related papers (2025-10-23T02:16:45Z) - RationAnomaly: Log Anomaly Detection with Rationality via Chain-of-Thought and Reinforcement Learning [27.235259453535537]
RationAnomaly is a novel framework that enhances log anomaly detection by synergizing Chain-of-Thought fine-tuning with reinforcement learning.<n>We have released the corresponding resources, including code and datasets.
arXiv Detail & Related papers (2025-09-18T07:35:58Z) - Log2graphs: An Unsupervised Framework for Log Anomaly Detection with Efficient Feature Extraction [1.474723404975345]
High cost of manual annotation and dynamic nature of usage scenarios present major challenges to effective log analysis.
This study proposes a novel log feature extraction model called DualGCN-LogAE, designed to adapt to various scenarios.
We also introduce Log2graphs, an unsupervised log anomaly detection method based on the feature extractor.
arXiv Detail & Related papers (2024-09-18T11:35:58Z) - Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z) - GLAD: Content-aware Dynamic Graphs For Log Anomaly Detection [49.9884374409624]
GLAD is a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
We introduce GLAD, a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
arXiv Detail & Related papers (2023-09-12T04:21:30Z) - Log-based Anomaly Detection Without Log Parsing [7.66638994053231]
We propose NeuralLog, a novel log-based anomaly detection approach that does not require log parsing.
Our experimental results show that the proposed approach can effectively understand the semantic meaning of log messages.
Overall, NeuralLog achieves F1-scores greater than 0.95 on four public datasets, outperforming the existing approaches.
arXiv Detail & Related papers (2021-08-04T10:42:13Z) - Robust and Transferable Anomaly Detection in Log Data using Pre-Trained
Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z) - Self-Attentive Classification-Based Anomaly Detection in Unstructured
Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z) - Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records.
Existing approaches rely on log-specifics or manual rule extraction.
We propose NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
arXiv Detail & Related papers (2020-03-17T19:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.