Related papers: EvLog: Identifying Anomalous Logs over Software Evolution

URL: http://arxiv.org/abs/2306.01509v2
Date: Tue, 15 Aug 2023 05:43:30 GMT
Title: EvLog: Identifying Anomalous Logs over Software Evolution
Authors: Yintong Huo, Cheryl Lee, Yuxin Su, Shiwen Shan, Jinyang Liu and Michael R. Lyu
Abstract summary: We propose a novel unsupervised approach named Evolving Log analyzer (EvLog), which uses a multi-level representation extractor to process logs without parsing. EvLog implements an anomaly discriminator with an attention mechanism to identify anomalous logs and avoid the issues caused by unstable log sequences. On two real-world system evolution log datasets, EvLog achieves average F1 scores of 0.955 and 0.847 in the intra-version and inter-version settings, respectively.
Score: 31.46106509190191
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Software logs record system activities, aiding maintainers in identifying the underlying causes of failures and enabling prompt mitigation actions. However, maintainers need to inspect a large volume of daily logs to identify the anomalous logs that reveal failure details for further diagnosis. Thus, automatically distinguishing these anomalous logs from normal logs becomes a critical problem. Existing approaches alleviate the burden on software maintainers, but they are built upon an improper yet critical assumption: logging statements in the software remain unchanged. Software keeps evolving, however, and our empirical study finds that this evolution brings three challenges: log parsing errors, evolving log events, and unstable log sequences. In this paper, we propose a novel unsupervised approach named Evolving Log analyzer (EvLog) to mitigate these challenges. We first build a multi-level representation extractor to process logs without parsing to prevent errors from the parser. The multi-level representations preserve the essential semantics of logs while leaving out insignificant changes in evolving events. EvLog then implements an anomaly discriminator with an attention mechanism to identify the anomalous logs and avoid the issues caused by unstable sequences. EvLog has shown effectiveness on two real-world system evolution log datasets with average F1 scores of 0.955 and 0.847 in the intra-version and inter-version settings, respectively, outperforming other state-of-the-art approaches by a wide margin. To the best of our knowledge, this is the first study on localizing anomalous logs over software evolution. We believe our work sheds new light on the impact of software evolution and offers corresponding solutions for the log analysis community.

Related papers

Cross-System Software Log-based Anomaly Detection Using Meta-Learning [17.39262430769509]
AIOps tools have been developed to automate the process of log-based anomaly detection for software systems. Three practical challenges are widely recognized in this field: high data labeling costs, evolving logs in dynamic systems, and adaptability across different systems. We propose CroSysLog, an AIOps tool for log-event level anomaly detection, specifically designed in response to these challenges.
arXiv Detail & Related papers (2024-12-19T22:55:45Z)
Anomaly Detection on Unstable Logs with GPT Models [1.9713190626298576]
This paper reports on an experimental comparison of a fine-tuned LLM and alternative models for anomaly detection on unstable logs. The pre-training of LLMs on vast datasets may enable a robust understanding of diverse patterns and contextual information. The difference between GPT-3 and other supervised approaches tends to become more significant as the degree of changes in log sequences increases.
arXiv Detail & Related papers (2024-06-11T17:13:18Z)
LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection [73.69399219776315]
We propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains. Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data. Then, we transfer such knowledge to the target domain via shared parameters.
arXiv Detail & Related papers (2024-01-09T12:55:21Z)
GLAD: Content-aware Dynamic Graphs For Log Anomaly Detection [49.9884374409624]
We introduce GLAD, a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
arXiv Detail & Related papers (2023-09-12T04:21:30Z)
MoniLog: An Automated Log-Based Anomaly Detection System for Cloud Computing Infrastructures [3.04585143845864]
MoniLog is a distributed approach to detect real-time anomalies within large-scale environments. It aims to detect sequential and quantitative anomalies within a multi-source log stream.
arXiv Detail & Related papers (2023-04-24T09:21:52Z)
LogGD: Detecting Anomalies from System Logs by Graph Neural Networks [14.813971618949068]
We propose LogGD, a novel graph-based log anomaly detection method. It exploits the Graph Transformer Neural Network, which combines graph structure and node semantics for log-based anomaly detection.
arXiv Detail & Related papers (2022-09-16T11:51:58Z)
Failure Identification from Unstable Log Data using Deep Learning [0.27998963147546146]
We present CLog as a method for failure identification. By representing the log data as sequences of subprocesses instead of sequences of log events, the effect of the unstable log data is reduced. Our experimental results demonstrate that the learned subprocess representations reduce the instability in the input.
arXiv Detail & Related papers (2022-04-06T07:41:48Z)
LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts. Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect. Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
arXiv Detail & Related papers (2021-11-02T15:16:08Z)
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users. We propose a framework for anomaly detection in log data, which is a major source of system information for troubleshooting.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations. We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)
Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records. Existing approaches rely on log-specific heuristics or manual rule extraction. We propose NuLog, which utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
arXiv Detail & Related papers (2020-03-17T19:25:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.