LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection
- URL: http://arxiv.org/abs/2401.04749v1
- Date: Tue, 9 Jan 2024 12:55:21 GMT
- Title: LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection
- Authors: Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun
Li, Tieqiao Zheng, Bo Zhang, Junran Peng, Qi Tian
- Abstract summary: We propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains.
Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data.
Then, we transfer such knowledge to the target domain via shared parameters.
- Score: 73.69399219776315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Log anomaly detection is a key component in the field of artificial
intelligence for IT operations (AIOps). Given log data from varied
domains, retraining the whole network for each unknown domain is inefficient in
real industrial scenarios. Moreover, previous deep models focused merely on
extracting the semantics of log sequences within a single domain, leading to poor
generalization on multi-domain logs. To alleviate this issue, we propose a
unified Transformer-based framework for Log anomaly detection (LogFormer) to
improve the generalization ability across different domains, where we establish
a two-stage process including the pre-training and adapter-based tuning stage.
Specifically, our model is first pre-trained on the source domain to obtain
shared semantic knowledge of log data. Then, we transfer such knowledge to the
target domain via shared parameters. In addition, a Log-Attention module is
proposed to supplement the information discarded by log parsing. The proposed
method is evaluated on three public datasets and one real-world dataset. Experimental
results on multiple benchmarks demonstrate the effectiveness of our LogFormer
with fewer trainable parameters and lower training costs.
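The pre-train-then-tune pipeline in the abstract can be sketched in plain Python. This is a hedged illustration of adapter-based tuning in general (a frozen pre-trained layer plus a small trainable bottleneck with a residual connection), not the authors' LogFormer code; all names and shapes here are hypothetical.

```python
# Illustrative sketch of adapter-based tuning (not the authors' code).
# A frozen "pre-trained" linear layer is wrapped with a small bottleneck
# adapter (down-project -> ReLU -> up-project) plus a residual connection;
# only the adapter parameters would be updated on the target domain,
# which is why the approach needs few trainable parameters.

def linear(x, w, b):
    """y = W x + b, with the weight matrix given as a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def relu(v):
    return [max(0.0, u) for u in v]

class AdapterLayer:
    def __init__(self, w_frozen, b_frozen, w_down, b_down, w_up, b_up):
        # Frozen pre-trained parameters (shared semantic knowledge
        # learned on the source domain).
        self.w_frozen, self.b_frozen = w_frozen, b_frozen
        # Trainable adapter parameters (tuned per target domain).
        self.w_down, self.b_down = w_down, b_down
        self.w_up, self.b_up = w_up, b_up

    def forward(self, x):
        h = linear(x, self.w_frozen, self.b_frozen)    # frozen transform
        z = relu(linear(h, self.w_down, self.b_down))  # bottleneck down
        a = linear(z, self.w_up, self.b_up)            # project back up
        return [hi + ai for hi, ai in zip(h, a)]       # residual add

    def trainable_parameter_count(self):
        """Only the adapter weights count as trainable."""
        flat = lambda m: sum(len(row) for row in m)
        return (flat(self.w_down) + len(self.b_down)
                + flat(self.w_up) + len(self.b_up))
```

With a 2-dimensional frozen layer and a 1-dimensional bottleneck, only 7 of the 13 parameters are trainable, which mirrors the paper's claim of lower training cost in the target domain.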
Related papers
- LogLLM: Log-based Anomaly Detection Using Large Language Models [8.03646578793411]
We propose LogLLM, a log-based anomaly detection framework that leverages large language models (LLMs)
LogLLM employs BERT for extracting semantic vectors from log messages, while utilizing Llama, a transformer decoder-based model, for classifying log sequences.
Our framework is trained through a novel three-stage procedure designed to enhance performance and adaptability.
arXiv Detail & Related papers (2024-11-13T12:18:00Z) - HELP: Hierarchical Embeddings-based Log Parsing [0.25112747242081457]
Logs are a first-hand source of information for software maintenance and failure diagnosis.
Log parsing is a prerequisite for automated log analysis tasks such as anomaly detection, troubleshooting, and root cause analysis.
Existing online parsing algorithms are susceptible to log drift, where slight log changes create false positives that drown out real anomalies.
arXiv Detail & Related papers (2024-08-15T17:54:31Z) - Leveraging Log Instructions in Log-based Anomaly Detection [0.5949779668853554]
We propose a method for reliable and practical anomaly detection from system logs.
It overcomes the common disadvantage of related works by building an anomaly detection model with log instructions from the source code of 1000+ GitHub projects.
The proposed method, named ADLILog, combines the log instructions and the data from the system of interest (target system) to learn a deep neural network model.
arXiv Detail & Related papers (2022-07-07T10:22:10Z) - TransLog: A Unified Transformer-based Framework for Log Anomaly
Detection [29.29752871868652]
Our method comprises a pre-training stage and an adapter-based tuning stage.
Our simple yet efficient approach, with fewer trainable parameters and lower training costs in the target domain, achieves state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2021-12-31T10:46:14Z) - Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training
for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with wide ranges of application potentials.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z) - Unsupervised Out-of-Domain Detection via Pre-trained Transformers [56.689635664358256]
Out-of-domain inputs can lead to unpredictable outputs and sometimes catastrophic safety issues.
Our work tackles the problem of detecting out-of-domain samples with only unsupervised in-domain data.
Two domain-specific fine-tuning approaches are further proposed to boost detection accuracy.
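The idea of detecting out-of-domain samples from in-domain data alone can be sketched as a distance score over embeddings. This is a simplified, hypothetical stand-in (Euclidean distance to the in-domain centroid); the paper works with pre-trained Transformer representations and more refined detectors.

```python
# Hedged sketch: distance-to-centroid scoring for out-of-domain detection.
# Inputs far from the centroid of in-domain embeddings get high scores.
# Illustrative only; embeddings and thresholding are assumptions here.
import math

def centroid(vectors):
    """Mean of a list of equal-length embedding vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def ood_score(x, in_domain_vectors):
    """Euclidean distance from x to the in-domain centroid;
    larger scores suggest the input is out-of-domain."""
    c = centroid(in_domain_vectors)
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
```

A threshold on this score (chosen on held-out in-domain data) would flag unseen-domain inputs without requiring any out-of-domain labels.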
arXiv Detail & Related papers (2021-06-02T05:21:25Z) - Log2NS: Enhancing Deep Learning Based Analysis of Logs With Formal to
Prevent Survivorship Bias [0.37943450391498496]
We introduce log to Neuro-symbolic (Log2NS), a framework that combines probabilistic analysis from machine learning (ML) techniques on observational data with certainties derived from symbolic reasoning on an underlying formal model.
Log2NS provides an ability to query from static logs and correlation engines for positive instances, as well as formal reasoning for negative and unseen instances.
arXiv Detail & Related papers (2021-05-29T00:01:08Z) - Robust and Transferable Anomaly Detection in Log Data using Pre-Trained
Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z) - Self-Attentive Classification-Based Anomaly Detection in Unstructured
Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z) - Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records.
Existing approaches rely on log-specifics or manual rule extraction.
We propose NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
arXiv Detail & Related papers (2020-03-17T19:25:25Z)
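The template-vs-parameter distinction that self-supervised parsers like NuLog learn can be illustrated with a toy heuristic: tokens that recur across log lines are treated as template text, rare tokens as variable parameters. This frequency rule is a hypothetical stand-in for the masked-token prediction confidence of the real model, which trains a Transformer with masked language modeling.

```python
# Toy sketch of log parsing into templates (not NuLog itself).
# A token-frequency heuristic stands in for masked-prediction confidence:
# tokens appearing in most lines are kept as template text, the rest
# are replaced by the wildcard '<*>'.
from collections import Counter

def parse_templates(log_lines, threshold=0.5):
    """Keep tokens present in at least `threshold` of the lines;
    replace the remaining (variable) tokens with '<*>'."""
    tokenized = [line.split() for line in log_lines]
    # Count in how many distinct lines each token occurs.
    counts = Counter(tok for toks in tokenized for tok in set(toks))
    n = len(tokenized)
    return [" ".join(tok if counts[tok] / n >= threshold else "<*>"
                     for tok in toks)
            for toks in tokenized]
```

On three connection-log lines differing only in the IP address, all three collapse to the single template "Connection from <*> closed".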
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.