PDLogger: Automated Logging Framework for Practical Software Development
- URL: http://arxiv.org/abs/2507.19951v1
- Date: Sat, 26 Jul 2025 13:35:57 GMT
- Title: PDLogger: Automated Logging Framework for Practical Software Development
- Authors: Shengcheng Duan, Yihua Xu, Sheng Zhang, Shen Wang, Yue Duan
- Abstract summary: Existing automated logging techniques focus on isolated sub-tasks. PDLogger is the first end-to-end log generation technique expressly designed for practical, multi-log scenarios. It improves log-position precision by 139.0 percent, F1 by 69.2 percent, level accuracy by 82.3 percent, variable precision by 131.8 percent, and message quality (BERTScore) by 65.7 percent.
- Score: 7.860311994179783
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Logging is indispensable for maintaining the reliability and diagnosability of modern software, yet developers still struggle to decide where and how to log effectively. Existing automated logging techniques focus on isolated sub-tasks - predicting a single log position, level, or message - and therefore cannot produce complete, high-quality log statements that reflect real-world practice in which multiple logs often appear inside one method. They also neglect deeper semantic dependencies among methods and consider only a narrow set of candidate variables, leading to superficial or incomplete logs. In this paper, we present PDLogger, the first end-to-end log generation technique expressly designed for practical, multi-log scenarios. PDLogger operates in three phases. (1) Log position prediction: block-type-aware structured prompts guide a large language model (LLM) to suggest candidate positions across all control-flow blocks of a method. (2) Log generation: backward program slicing supplies precise inter-procedural control and data-dependency context, while an expanded variable extractor captures both member and external function expressions; the enriched prompt enables the LLM to emit a full log statement (position, level, message, variables). (3) Log refinement: level correction and context-sensitive deduplication prune false positives and redundant logs. We evaluate PDLogger on 3,113 log statements drawn from two widely used Java projects. Compared with the strongest prior systems, PDLogger improves log-position precision by 139.0 percent, F1 by 69.2 percent, level accuracy by 82.3 percent, variable precision by 131.8 percent, and message quality (BERTScore) by 65.7 percent. The framework consistently performs well with different mainstream LLMs, demonstrating robustness and generality. PDLogger's implementation is available as open source to foster future research and adoption.
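To make the multi-log scenario concrete, the sketch below shows a small, hypothetical Java method (written for this summary, not taken from the paper or its evaluation projects) with the kind of complete log statements that an end-to-end generator such as PDLogger aims to produce: each statement has a position in a control-flow block, a level, a message, and variables. Class, method, and logger names are illustrative assumptions, and the example uses the standard java.util.logging API rather than any framework used in the paper.

```java
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example: a method in which an automated logging tool would need to
// place several complete log statements (position, level, message, variables)
// across different control-flow blocks.
public class OrderProcessor {
    private static final Logger LOG = Logger.getLogger(OrderProcessor.class.getName());

    public double applyDiscounts(List<Double> prices, double rate) {
        // Candidate position 1: method entry (debug-like level, captures the inputs).
        LOG.log(Level.FINE, "applyDiscounts called with {0} prices, rate={1}",
                new Object[]{prices.size(), rate});

        if (rate < 0 || rate > 1) {
            // Candidate position 2: error-handling branch (warning level, captures the bad value).
            LOG.log(Level.WARNING, "Invalid discount rate {0}, falling back to 0", rate);
            rate = 0;
        }

        double total = 0;
        for (double price : prices) {
            total += price * (1 - rate);
        }

        // Candidate position 3: method exit (info level, captures the computed result).
        LOG.log(Level.INFO, "Discounted total computed: {0}", total);
        return total;
    }

    public static void main(String[] args) {
        new OrderProcessor().applyDiscounts(List.of(10.0, 20.0), 0.1);
    }
}
```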
Related papers
- EquiBench: Benchmarking Large Language Models' Understanding of Program Semantics via Equivalence Checking [55.81461218284736]
EquiBench is a new benchmark for evaluating large language models (LLMs). It determines whether two programs produce identical outputs for all possible inputs. We evaluate 19 state-of-the-art LLMs and find that the best accuracies are 63.8% and 76.2%, only modestly above the 50% random baseline.
arXiv Detail & Related papers (2025-02-18T02:54:25Z)
- AL-Bench: A Benchmark for Automatic Logging [3.8293110324859505]
We introduce AL-Bench, a benchmark designed specifically for automatic logging tools. AL-Bench includes a large-scale, high-quality, diverse dataset collected from 10 widely recognized projects. It provides a run-time perspective of logging quality in addition to the traditional static evaluation at source code level.
arXiv Detail & Related papers (2025-02-05T13:32:39Z)
- LogLLM: Log-based Anomaly Detection Using Large Language Models [7.7704116297749675]
We propose LogLLM, a log-based anomaly detection framework that leverages large language models (LLMs). LogLLM employs BERT for extracting semantic vectors from log messages, while utilizing Llama, a transformer decoder-based model, for classifying log sequences. Our framework is trained through a novel three-stage procedure designed to enhance performance and adaptability.
arXiv Detail & Related papers (2024-11-13T12:18:00Z)
- Studying and Benchmarking Large Language Models For Log Level Suggestion [49.176736212364496]
Large Language Models (LLMs) have become a focal point of research across various domains.
This paper investigates the impact of characteristics and learning paradigms on the performance of 12 open-source LLMs in log level suggestion.
arXiv Detail & Related papers (2024-10-11T03:52:17Z)
- LogParser-LLM: Advancing Efficient Log Parsing with Large Language Models [19.657278472819588]
We introduce LogParser-LLM, a novel log parser integrated with LLM capabilities.
We address the intricate challenge of parsing granularity, proposing a new metric to allow users to calibrate granularity to their specific needs.
Our method's efficacy is empirically demonstrated through evaluations on the Loghub-2k and the large-scale LogPub benchmark.
arXiv Detail & Related papers (2024-08-25T05:34:24Z)
- Log Probabilities Are a Reliable Estimate of Semantic Plausibility in Base and Instruction-Tuned Language Models [50.15455336684986]
We evaluate the effectiveness of LogProbs and basic prompting to measure semantic plausibility.
We find that LogProbs offers a more reliable measure of semantic plausibility than direct zero-shot prompting.
We conclude that, even in the era of prompt-based evaluations, LogProbs constitute a useful metric of semantic plausibility.
arXiv Detail & Related papers (2024-03-21T22:08:44Z)
- Go Static: Contextualized Logging Statement Generation [38.15795803230719]
SCLogger is a contextualized logging statement generation approach with inter-method static contexts.
SCLogger surpasses the state-of-the-art approach by 8.7% in logging position accuracy, 32.1% in level accuracy, 19.6% in variable precision, and 138.4% in text BLEU-4 score.
arXiv Detail & Related papers (2024-02-20T12:22:59Z)
- LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection [73.69399219776315]
We propose a unified Transformer-based framework for log anomaly detection (LogFormer) to improve the generalization ability across different domains.
Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data.
Then, we transfer such knowledge to the target domain via shared parameters.
arXiv Detail & Related papers (2024-01-09T12:55:21Z)
- Data-Driven Approach for Log Instruction Quality Assessment [59.04636530383049]
There are no widely adopted guidelines on how to write log instructions with good quality properties.
We identify two quality properties: 1) correct log level assignment assessing the correctness of the log level, and 2) sufficient linguistic structure assessing the minimal richness of the static text necessary for verbose event description.
Our approach correctly assesses log level assignments with an accuracy of 0.88, and the sufficient linguistic structure with an F1 score of 0.99, outperforming the baselines.
arXiv Detail & Related papers (2022-04-06T07:02:23Z)
- Borrowing from Similar Code: A Deep Learning NLP-Based Approach for Log Statement Automation [0.0]
We introduce an updated and improved log-aware code-clone detection method to predict the location of logging statements.
We incorporate natural language processing (NLP) and deep learning methods to automate the log statements' description prediction.
Our analysis shows that our hybrid NLP and code-clone detection approach (NLP CC'd) outperforms conventional clone detectors in finding log statement locations.
arXiv Detail & Related papers (2021-12-02T14:03:49Z)
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
- Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records.
Existing approaches rely on log-specifics or manual rule extraction.
We propose NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
arXiv Detail & Related papers (2020-03-17T19:25:25Z)