FusionLog: Cross-System Log-based Anomaly Detection via Fusion of General and Proprietary Knowledge
- URL: http://arxiv.org/abs/2511.05878v1
- Date: Sat, 08 Nov 2025 06:30:50 GMT
- Title: FusionLog: Cross-System Log-based Anomaly Detection via Fusion of General and Proprietary Knowledge
- Authors: Xinlong Zhao, Tong Jia, Minghua He, Xixuan Yang, Ying Li,
- Abstract summary: FusionLog is a novel zero-label cross-system log-based anomaly detection method. It achieves the fusion of general and proprietary knowledge, enabling cross-system generalization without labeled target logs. Experiments show that FusionLog achieves over 90% F1-score under a fully zero-label setting.
- Score: 10.135000927533385
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Log-based anomaly detection is critical for ensuring the stability and reliability of web systems. One of the key problems in this task is the lack of sufficient labeled logs, which limits the rapid deployment in new systems. Existing works usually leverage large-scale labeled logs from a mature web system and a small amount of labeled logs from a new system, using transfer learning to extract and generalize general knowledge across both domains. However, these methods focus solely on the transfer of general knowledge and neglect the disparity and potential mismatch between such knowledge and the proprietary knowledge of target system, thus constraining performance. To address this limitation, we propose FusionLog, a novel zero-label cross-system log-based anomaly detection method that effectively achieves the fusion of general and proprietary knowledge, enabling cross-system generalization without any labeled target logs. Specifically, we first design a training-free router based on semantic similarity that dynamically partitions unlabeled target logs into 'general logs' and 'proprietary logs.' For general logs, FusionLog employs a small model based on system-agnostic representation meta-learning for direct training and inference, inheriting the general anomaly patterns shared between the source and target systems. For proprietary logs, we iteratively generate pseudo-labels and fine-tune the small model using multi-round collaborative knowledge distillation and fusion based on large language model (LLM) and small model (SM) to enhance its capability to recognize anomaly patterns specific to the target system. Experimental results on three public log datasets from different systems show that FusionLog achieves over 90% F1-score under a fully zero-label setting, significantly outperforming state-of-the-art cross-system log-based anomaly detection methods.
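The routing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that the router is training-free and based on semantic similarity. The bag-of-words `embed` function stands in for whatever semantic encoder the paper uses, and the names `embed`, `cosine`, `route`, and the `threshold` value of 0.5 are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(message: str) -> Counter:
    """Toy stand-in for a semantic encoder (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", message.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(target_logs, source_logs, threshold=0.5):
    """Split unlabeled target logs into (general, proprietary) lists.

    A target log counts as 'general' when its best similarity to any
    source-system log reaches the (hypothetical) threshold; otherwise
    it is treated as 'proprietary'. No training is involved.
    """
    source_vecs = [embed(s) for s in source_logs]
    general, proprietary = [], []
    for log in target_logs:
        vec = embed(log)
        best = max((cosine(vec, s) for s in source_vecs), default=0.0)
        (general if best >= threshold else proprietary).append(log)
    return general, proprietary
```

Calling `route(target_logs, source_logs)` returns two lists. In the pipeline the abstract describes, the first would go to the meta-learned small model and the second to the LLM and small-model distillation loop; a real system would presumably replace `embed` with a pretrained sentence encoder.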
Related papers
- Log Anomaly Detection with Large Language Models via Knowledge-Enriched Fusion [0.0]
EnrichLog is a training-free, entry-based anomaly detection framework. It enriches raw log entries with both corpus-specific and sample-specific knowledge. We evaluate EnrichLog on four large-scale system log benchmark datasets.
arXiv Detail & Related papers (2025-12-12T19:24:54Z)
- Generality Is Not Enough: Zero-Label Cross-System Log-Based Anomaly Detection via Knowledge-Level Collaboration [10.873294740040912]
GeneralLog is a novel collaborative method for zero-label cross-system log anomaly detection. GeneralLog achieves over 90% F1-score under a fully zero-label setting, significantly outperforming existing methods.
arXiv Detail & Related papers (2025-11-08T06:47:28Z)
- ZeroLog: Zero-Label Generalizable Cross-System Log-based Anomaly Detection [13.441063641941037]
ZeroLog is a system-agnostic representation meta-learning method that enables cross-system log-based anomaly detection under zero-label conditions. We show that ZeroLog reaches over 80% F1-score without labels, comparable to state-of-the-art cross-system methods trained with labeled logs, and outperforms existing methods under zero-label conditions.
arXiv Detail & Related papers (2025-11-08T05:30:02Z)
- From Few-Label to Zero-Label: An Approach for Cross-System Log-Based Anomaly Detection with Meta-Learning [14.506853344375342]
Cross-system transfer has been identified as a key research direction. We propose FreeLog, a system-agnostic representation meta-learning method that eliminates the need for labeled target system logs. FreeLog achieves performance comparable to state-of-the-art methods that rely on a small amount of labeled data from the target system.
arXiv Detail & Related papers (2025-07-26T05:38:51Z)
- LogLLM: Log-based Anomaly Detection Using Large Language Models [7.7704116297749675]
We propose LogLLM, a log-based anomaly detection framework that leverages large language models (LLMs). LogLLM employs BERT for extracting semantic vectors from log messages, while utilizing Llama, a transformer decoder-based model, for classifying log sequences. Our framework is trained through a novel three-stage procedure designed to enhance performance and adaptability.
arXiv Detail & Related papers (2024-11-13T12:18:00Z)
- LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection [73.69399219776315]
We propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains.
Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data.
Then, we transfer such knowledge to the target domain via shared parameters.
arXiv Detail & Related papers (2024-01-09T12:55:21Z)
- GLAD: Content-aware Dynamic Graphs For Log Anomaly Detection [49.9884374409624]
GLAD is a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
arXiv Detail & Related papers (2023-09-12T04:21:30Z)
- PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows.
Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
- Leveraging Log Instructions in Log-based Anomaly Detection [0.5949779668853554]
We propose a method for reliable and practical anomaly detection from system logs.
It overcomes the common disadvantage of related works by building an anomaly detection model with log instructions from the source code of 1000+ GitHub projects.
The proposed method, named ADLILog, combines the log instructions and the data from the system of interest (target system) to learn a deep neural network model.
arXiv Detail & Related papers (2022-07-07T10:22:10Z)
- LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts.
Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect.
Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
arXiv Detail & Related papers (2021-11-02T15:16:08Z)
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
- Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)