A Large-Scale Evaluation for Log Parsing Techniques: How Far Are We?
- URL: http://arxiv.org/abs/2308.10828v2
- Date: Sat, 23 Mar 2024 03:46:58 GMT
- Title: A Large-Scale Evaluation for Log Parsing Techniques: How Far Are We?
- Authors: Zhihan Jiang, Jinyang Liu, Junjie Huang, Yichen Li, Yintong Huo, Jiazhen Gu, Zhuangbin Chen, Jieming Zhu, Michael R. Lyu
- Abstract summary: We provide a new collection of annotated log datasets, denoted Loghub-2.0, which can better reflect the characteristics of log data in real-world software systems.
We conduct a thorough re-evaluation of 15 state-of-the-art log parsers in a more rigorous and practical setting. Particularly, we introduce a new evaluation metric to mitigate the sensitivity of existing metrics to imbalanced data distributions.
- Score: 42.56249610409624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Log data have facilitated various tasks of software development and maintenance, such as testing, debugging and diagnosing. Due to the unstructured nature of logs, log parsing is typically required to transform log messages into structured data for automated log analysis. Given the abundance of log parsers that employ various techniques, evaluating these tools to comprehend their characteristics and performance becomes imperative. Loghub serves as a commonly used dataset for benchmarking log parsers, but it suffers from limited scale and representativeness, posing significant challenges for studies to comprehensively evaluate existing log parsers or develop new methods. This limitation is particularly pronounced when assessing these log parsers for production use. To address these limitations, we provide a new collection of annotated log datasets, denoted Loghub-2.0, which can better reflect the characteristics of log data in real-world software systems. Loghub-2.0 comprises 14 datasets with an average of 3.6 million log lines in each dataset. Based on Loghub-2.0, we conduct a thorough re-evaluation of 15 state-of-the-art log parsers in a more rigorous and practical setting. Particularly, we introduce a new evaluation metric to mitigate the sensitivity of existing metrics to imbalanced data distributions. We are also the first to investigate the granular performance of log parsers on logs that represent rare system events, offering in-depth details for software diagnosis. Accurately parsing such logs is essential, yet it remains a challenge. We believe this work could shed light on the evaluation and design of log parsers in practical settings, thereby facilitating their deployment in production systems.
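The imbalance issue the abstract points to can be illustrated with a small sketch: message-level accuracy is dominated by a handful of very frequent templates, while a template-level (macro-style) F1 weighs every template equally, so errors on rare but diagnostically important logs remain visible. The functions below are illustrative assumptions, not the paper's exact metric definitions.

```python
from collections import Counter

def message_level_accuracy(gt, pred):
    """Fraction of log messages parsed into their ground-truth template.
    Frequent templates dominate this score, masking errors on rare ones."""
    return sum(g == p for g, p in zip(gt, pred)) / len(gt)

def template_level_f1(gt, pred):
    """Macro-style F1 over distinct templates: each template contributes
    equally, so mistakes on rare templates are no longer hidden."""
    gt_counts, pred_counts = Counter(gt), Counter(pred)
    correct = Counter(g for g, p in zip(gt, pred) if g == p)
    precision = sum(correct[t] / pred_counts[t] for t in pred_counts) / len(pred_counts)
    recall = sum(correct[t] / gt_counts[t] for t in gt_counts) / len(gt_counts)
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# One rare template ("E2") parsed wrongly barely moves message-level accuracy
# but clearly lowers the template-level score.
gt   = ["E1"] * 99 + ["E2"]
pred = ["E1"] * 99 + ["E1"]
print(message_level_accuracy(gt, pred))  # 0.99
print(template_level_f1(gt, pred))       # noticeably lower (about 0.66)
```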
Related papers
- LogParser-LLM: Advancing Efficient Log Parsing with Large Language Models [19.657278472819588]
We introduce LogParser-LLM, a novel log parser integrated with LLM capabilities.
We address the intricate challenge of parsing granularity, proposing a new metric to allow users to calibrate granularity to their specific needs.
Our method's efficacy is empirically demonstrated through evaluations on the Loghub-2k and the large-scale LogPub benchmark.
arXiv Detail & Related papers (2024-08-25T05:34:24Z) - HELP: Hierarchical Embeddings-based Log Parsing [0.25112747242081457]
Logs are a first-hand source of information for software maintenance and failure diagnosis.
Log parsing is a prerequisite for automated log analysis tasks such as anomaly detection, troubleshooting, and root cause analysis.
Existing online parsing algorithms are susceptible to log drift, where slight log changes create false positives that drown out real anomalies.
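As rough intuition for embeddings-based parsing, messages whose vector representations are close can be grouped under one template. The bag-of-tokens vectors and similarity threshold below are simplified stand-ins for the hierarchical embeddings HELP actually learns, not the paper's method.

```python
from collections import Counter
from math import sqrt

def embed(message):
    """Toy stand-in embedding: a bag-of-tokens count vector."""
    return Counter(message.split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_by_similarity(messages, threshold=0.7):
    """Assign each message to the first group whose representative embedding
    is similar enough; otherwise start a new group."""
    groups = []  # list of (representative_embedding, [messages])
    for msg in messages:
        vec = embed(msg)
        for rep, members in groups:
            if cosine(vec, rep) >= threshold:
                members.append(msg)
                break
        else:
            groups.append((vec, [msg]))
    return [members for _, members in groups]

print(group_by_similarity([
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Disk quota exceeded for user alice",
]))
```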
arXiv Detail & Related papers (2024-08-15T17:54:31Z) - Stronger, Cheaper and Demonstration-Free Log Parsing with LLMs [18.240096266464544]
We propose LogBatcher, a cost-effective LLM-based log parser that requires no training process or labeled data.
We have conducted experiments on 16 public log datasets and the results show that LogBatcher is effective for log parsing.
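A demonstration-free, LLM-based parser of this kind typically bundles similar raw logs into one prompt and asks the model to abstract them into a shared template. The prompt wording below is an assumption for illustration, not LogBatcher's actual prompt.

```python
def build_parsing_prompt(log_batch):
    """Bundle a batch of similar raw log messages into a single prompt that
    asks an LLM to produce one shared template, with no labeled demonstrations."""
    lines = "\n".join(f"- {log}" for log in log_batch)
    return ("Abstract the following log messages into one template, "
            "replacing variable parts with <*>:\n" + lines)

print(build_parsing_prompt([
    "Took 10.5 seconds to deallocate network for instance a1b2",
    "Took 11.2 seconds to deallocate network for instance c3d4",
]))
```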
arXiv Detail & Related papers (2024-06-10T10:39:28Z) - LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection [73.69399219776315]
We propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains.
Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data.
Then, we transfer such knowledge to the target domain via shared parameters.
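A minimal sketch of the pre-train-then-tune idea: a shared Transformer encoder carries the semantic knowledge learned on the source domain, and only a small task head is updated on the target domain. The layer sizes and module names are assumptions, not LogFormer's actual architecture.

```python
import torch.nn as nn

class SharedEncoderDetector(nn.Module):
    """Transformer encoder shared across domains plus a small classifier head."""
    def __init__(self, vocab_size=30000, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 2)  # normal vs. anomalous log sequence

    def forward(self, token_ids):
        hidden = self.encoder(self.embed(token_ids))
        return self.head(hidden.mean(dim=1))

model = SharedEncoderDetector()
# Target-domain tuning: keep the shared (pre-trained) encoder frozen and
# update only the lightweight head on target-domain logs.
for p in model.encoder.parameters():
    p.requires_grad = False
```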
arXiv Detail & Related papers (2024-01-09T12:55:21Z) - Log Parsing Evaluation in the Era of Modern Software Systems [47.370291246632114]
We focus on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs.
Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs.
We propose a tool, Logchimera, that enables estimating log parsing performance in industry contexts.
arXiv Detail & Related papers (2023-08-17T14:19:22Z) - On the Effectiveness of Log Representation for Log-based Anomaly Detection [12.980238412281471]
This work investigates and compares the commonly adopted log representation techniques from previous log analysis research.
We select six log representation techniques and evaluate them with seven ML models and four public log datasets.
We also examine the impacts of the log parsing process and the different feature aggregation approaches when they are employed with log representation techniques.
arXiv Detail & Related papers (2023-08-17T02:18:59Z) - Data-Driven Approach for Log Instruction Quality Assessment [59.04636530383049]
There are no widely adopted guidelines on how to write log instructions with good quality properties.
We identify two quality properties: 1) correct log level assignment assessing the correctness of the log level, and 2) sufficient linguistic structure assessing the minimal richness of the static text necessary for verbose event description.
Our approach correctly assesses log level assignments with an accuracy of 0.88, and the sufficient linguistic structure with an F1 score of 0.99, outperforming the baselines.
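As a rough illustration of the two properties, the heuristic below flags a log statement whose static text is too short to describe an event and checks whether its assigned level matches a simple keyword cue. The paper learns these assessments from data, so the rules here are simplified assumptions.

```python
ERROR_CUES = {"fail", "failed", "error", "exception", "cannot"}

def assess_log_instruction(static_text, level):
    """Heuristic stand-in for the two quality properties:
    1) sufficient linguistic structure: enough static words to describe the event;
    2) correct log level assignment: an error-like message should not be INFO."""
    words = [w.lower().strip(":,.") for w in static_text.split()]
    sufficient_structure = len(words) >= 3
    looks_like_error = any(w in ERROR_CUES for w in words)
    correct_level = not (looks_like_error and level.upper() == "INFO")
    return sufficient_structure, correct_level

print(assess_log_instruction("failed", "INFO"))               # (False, False)
print(assess_log_instruction("Failed to open file", "ERROR"))  # (True, True)
```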
arXiv Detail & Related papers (2022-04-06T07:02:23Z) - LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts.
Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect.
Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
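The weak-supervision input can be sketched as follows: monitoring provides estimated failure time windows, and every log message whose timestamp falls inside such a window receives a (noisy) abnormal label, on which the attention-based model is then trained. The structure and names below are illustrative, not LogLAB's implementation.

```python
from datetime import datetime

def weak_labels(log_entries, failure_windows):
    """Label each (timestamp, message) pair 'abnormal' if it falls inside an
    estimated failure time window reported by monitoring, else 'normal'."""
    labeled = []
    for ts, message in log_entries:
        in_window = any(start <= ts <= end for start, end in failure_windows)
        labeled.append((message, "abnormal" if in_window else "normal"))
    return labeled

logs = [
    (datetime(2021, 11, 2, 15, 0), "write to /dev/sda failed"),
    (datetime(2021, 11, 2, 16, 0), "heartbeat ok"),
]
windows = [(datetime(2021, 11, 2, 14, 55), datetime(2021, 11, 2, 15, 10))]
print(weak_labels(logs, windows))
```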
arXiv Detail & Related papers (2021-11-02T15:16:08Z) - Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records.
Existing approaches rely on log-specifics or manual rule extraction.
We propose NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
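The masked-language-modeling formulation can be sketched simply: each token of a log message is masked in turn and predicted from its context, and at parsing time the tokens the model reconstructs reliably are treated as constant template parts while the rest become variable parameters. The code below illustrates only the masking step and is a sketch, not NuLog's implementation.

```python
def mask_each_token(tokens, mask_token="<MASK>"):
    """Create one masked-language-modeling sample per position: the token at
    that position is hidden and becomes the prediction target."""
    return [
        (tokens[:i] + [mask_token] + tokens[i + 1:], target)
        for i, target in enumerate(tokens)
    ]

samples = mask_each_token("Connection from 10.0.0.1 closed".split())
for masked, target in samples:
    print(" ".join(masked), "->", target)
# A trained model would predict 'Connection', 'from', 'closed' reliably
# (constants) but not '10.0.0.1' (a variable parameter).
```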
arXiv Detail & Related papers (2020-03-17T19:25:25Z)