Related papers: A Comprehensive Survey of Logging in Software: From Logging Statements Automation to Log Mining and Analysis

URL: http://arxiv.org/abs/2110.12489v1
Date: Sun, 24 Oct 2021 17:15:06 GMT
Title: A Comprehensive Survey of Logging in Software: From Logging Statements Automation to Log Mining and Analysis
Authors: Sina Gholamian and Paul A. S. Ward
Abstract summary: We study a large number of conference and journal papers that appeared on top-level peer-reviewed venues. We provide a set of challenges and opportunities that will lead the researchers in academia and industry in moving the field forward.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Logs are widely used to record runtime information of software systems, such as the timestamp and the importance of an event, the unique ID of the source of the log, and a part of the state of a task's execution. The rich information of logs enables system developers (and operators) to monitor the runtime behaviors of their systems and further track down system problems and perform analysis on log data in production settings. However, the prior research on utilizing logs is scattered and that limits the ability of new researchers in this field to quickly get to the speed and hampers currently active researchers to advance this field further. Therefore, this paper surveys and provides a systematic literature review of the contemporary logging practices and log statements' mining and monitoring techniques and their applications such as in system failure detection and diagnosis. We study a large number of conference and journal papers that appeared on top-level peer-reviewed venues. Additionally, we draw high-level trends of ongoing research and categorize publications into subdivisions. In the end, and based on our holistic observations during this survey, we provide a set of challenges and opportunities that will lead the researchers in academia and industry in moving the field forward.

Related papers

Query Logs Analytics: A Systematic Literature Review [0.0]
This paper presents a systematic survey of log usage, focusing on Database (DB), Data Warehouse (DW), Web, and KG logs. More than 300 publications were analyzed to address three central questions: do different types of logs share common structural and functional characteristics? The survey reveals a limited number of end-to-end approaches, the absence of standardization across log usage pipelines, and the existence of shared structural elements among different types of logs.
arXiv Detail & Related papers (2025-08-19T15:38:13Z)
Deep Research Agents: A Systematic Examination And Roadmap [79.04813794804377]
Deep Research (DR) agents are designed to tackle complex, multi-turn informational research tasks. In this paper, we conduct a detailed analysis of the foundational technologies and architectural components that constitute DR agents.
arXiv Detail & Related papers (2025-06-22T16:52:48Z)
Out-of-Distribution Detection on Graphs: A Survey [58.47395497985277]
Graph out-of-distribution (GOOD) detection focuses on identifying graph data that deviates from the distribution seen during training. We categorize existing methods into four types: enhancement-based, reconstruction-based, information propagation-based, and classification-based approaches. We discuss practical applications and theoretical foundations, highlighting the unique challenges posed by graph data.
arXiv Detail & Related papers (2025-02-12T04:07:12Z)
LLM-based event log analysis techniques: A survey [1.6180992915701702]
Event logs record key information on activities that occur on computing devices. Researchers have developed automated techniques to improve the event log analysis process. This paper aims to survey LLM-based event log analysis techniques.
arXiv Detail & Related papers (2025-02-02T05:28:17Z)
Log Summarisation for Defect Evolution Analysis [14.055261850785456]
We suggest an online semantic-based clustering approach to error logs. We also introduce a novel metric to evaluate the performance of temporal log clusters.
arXiv Detail & Related papers (2024-03-13T09:18:46Z)
RAPID: Training-free Retrieval-based Log Anomaly Detection with PLM considering Token-level information [7.861095039299132]
The need for log anomaly detection is growing, especially in real-world applications. Traditional deep learning-based anomaly detection models require dataset-specific training, leading to corresponding delays. We introduce RAPID, a model that capitalizes on the inherent features of log data to enable anomaly detection without training delays.
arXiv Detail & Related papers (2023-11-09T06:11:44Z)
Log Parsing Evaluation in the Era of Modern Software Systems [47.370291246632114]
We focus on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We propose a tool, Logchimera, that enables estimating log parsing performance in industry contexts.
arXiv Detail & Related papers (2023-08-17T14:19:22Z)
On the Effectiveness of Log Representation for Log-based Anomaly Detection [12.980238412281471]
This work investigates and compares the commonly adopted log representation techniques from previous log analysis research. We select six log representation techniques and evaluate them with seven ML models and four public log datasets. We also examine the impacts of the log parsing process and the different feature aggregation approaches when they are employed with log representation techniques.
arXiv Detail & Related papers (2023-08-17T02:18:59Z)
PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows. Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts. Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect. Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
arXiv Detail & Related papers (2021-11-02T15:16:08Z)
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users. We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
Loghub: A Large Collection of System Log Datasets for AI-driven Log Analytics [40.96246300489472]
We have collected and released loghub, a large collection of system log datasets. In particular, loghub provides 19 real-world log datasets collected from a wide range of software systems. Up to the time of this paper writing, the loghub datasets have been downloaded for roughly 90,000 times in total by hundreds of organizations from both industry and academia.
arXiv Detail & Related papers (2020-08-14T16:17:54Z)
Improving time use measurement with personal big data collection -- the experience of the European Big Data Hackathon 2019 [62.997667081978825]
This article assesses the experience with i-Log at the European Big Data Hackathon 2019, a satellite event of the New Techniques and Technologies for Statistics (NTTS) conference, organised by Eurostat. i-Log is a system that allows to capture personal big data from smartphones' internal sensors to be used for time use measurement.
arXiv Detail & Related papers (2020-04-24T18:40:08Z)
Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records. Existing approaches rely on log-specifics or manual rule extraction. We propose NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
arXiv Detail & Related papers (2020-03-17T19:25:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.