Studying and Benchmarking Large Language Models For Log Level Suggestion
- URL: http://arxiv.org/abs/2410.08499v1
- Date: Fri, 11 Oct 2024 03:52:17 GMT
- Title: Studying and Benchmarking Large Language Models For Log Level Suggestion
- Authors: Yi Wen Heng, Zeyang Ma, Zhenhao Li, Dong Jae Kim, Tse-Hsun Chen
- Abstract summary: Large Language Models (LLMs) have become a focal point of research across various domains.
This paper investigates how model characteristics and learning paradigms affect the performance of 12 open-source LLMs in log level suggestion.
- Score: 49.176736212364496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have become a focal point of research across various domains, including software engineering, where their capabilities are increasingly leveraged. Recent studies have explored the integration of LLMs into software development tools and frameworks, revealing their potential to enhance performance in text and code-related tasks. Log level is a key part of a logging statement that allows software developers to control the information recorded during system runtime. Given that log messages often mix natural language with code-like variables, LLMs' language translation abilities could be applied to determine the suitable verbosity level for logging statements. In this paper, we undertake a detailed empirical analysis to investigate the impact of model characteristics and learning paradigms on the performance of 12 open-source LLMs in log level suggestion. We opted for open-source models because they enable us to utilize in-house code while effectively protecting sensitive information and maintaining data security. We examine several prompting strategies, including Zero-shot, Few-shot, and fine-tuning techniques, across different LLMs to identify the most effective combinations for accurate log level suggestions. Our research is supported by experiments conducted on 9 large-scale Java systems. The results indicate that although smaller LLMs can perform effectively with appropriate instruction and suitable techniques, there is still considerable potential for improvement in their ability to suggest log levels.
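To make the learning paradigms concrete, below is a minimal sketch of how zero-shot and few-shot prompts for log level suggestion could be assembled. The prompt wording, the `<LEVEL>` mask convention, and the helper names are illustrative assumptions, not the paper's actual prompts.

```python
# Minimal sketch: building zero-shot and few-shot prompts for log level
# suggestion. Everything here (wording, <LEVEL> mask, helper names) is an
# illustrative assumption, not reproduced from the paper.

LEVELS = ["trace", "debug", "info", "warn", "error", "fatal"]

def zero_shot_prompt(code_context: str) -> str:
    """Ask for a level with no labeled examples (zero-shot)."""
    return (
        f"Choose the most suitable log level ({', '.join(LEVELS)}) for the "
        f"logging statement marked <LEVEL> below.\n\n{code_context}\n\n"
        "Answer with one level only."
    )

def few_shot_prompt(examples: list[tuple[str, str]], code_context: str) -> str:
    """Prepend labeled (code, level) pairs before the query (few-shot)."""
    shots = "\n\n".join(f"Code:\n{c}\nLevel: {lvl}" for c, lvl in examples)
    return f"{shots}\n\nCode:\n{code_context}\nLevel:"

# A Java logging statement whose verbosity level has been masked out.
query = ('catch (IOException e) {\n'
         '    log.<LEVEL>("Failed to read config: {}", e.getMessage());\n'
         '}')
shots = [('log.<LEVEL>("Server started on port {}", port);', "info")]
print(few_shot_prompt(shots, query))
```

In the few-shot setting, the labeled pairs expose the model to a project's existing logging conventions, which is the kind of project-specific signal that zero-shot prompting lacks.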
Related papers
- Crystal: Illuminating LLM Abilities on Language and Code [58.5467653736537]
We propose a pretraining strategy to enhance the integration of natural language and coding capabilities.
The resulting model, Crystal, demonstrates remarkable capabilities in both domains.
arXiv Detail & Related papers (2024-11-06T10:28:46Z)
- LUK: Empowering Log Understanding with Expert Knowledge from Large Language Models [32.938862271579424]
This paper introduces a novel knowledge enhancement framework, called LUK, which acquires expert knowledge from LLMs to empower log understanding on a smaller PLM.
LUK achieves state-of-the-art results on different log analysis tasks, and extensive experiments demonstrate that expert knowledge from LLMs can be utilized more effectively to understand logs.
arXiv Detail & Related papers (2024-09-03T13:58:34Z)
- CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models [68.64605538559312]
In this paper, we analyze the MLLM instruction tuning from both theoretical and empirical perspectives.
Inspired by our findings, we propose a measurement to quantitatively evaluate the learning balance.
In addition, we introduce an auxiliary loss regularization method to promote updating of the generation distribution of MLLMs.
arXiv Detail & Related papers (2024-07-29T23:18:55Z)
- An Empirical Study of Automated Vulnerability Localization with Large Language Models [21.84971967029474]
Large Language Models (LLMs) have shown potential in various domains, yet their effectiveness in vulnerability localization remains underexplored.
Our investigation encompasses 10+ leading LLMs suitable for code analysis, including ChatGPT and various open-source models.
We explore the efficacy of these LLMs using 4 distinct paradigms: zero-shot learning, one-shot learning, discriminative fine-tuning, and generative fine-tuning.
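As a compact illustration of the one-shot paradigm named above, the sketch below places a single labeled example before the query; the prompt text and the code snippets are hypothetical, not drawn from the study.

```python
# Minimal one-shot sketch for vulnerability localization: one labeled example
# precedes the query. Prompt wording and code snippets are illustrative
# assumptions, not taken from the paper.

EXAMPLE = (
    "1: char buf[8];\n"
    "2: strcpy(buf, user_input);  /* no bounds check */\n"
)

def one_shot_prompt(query_code: str) -> str:
    """One labeled (code, answer) pair, then the unlabeled query."""
    return (
        "Identify the vulnerable line number in the code.\n\n"
        f"Code:\n{EXAMPLE}Vulnerable line: 2\n\n"
        f"Code:\n{query_code}Vulnerable line:"
    )

print(one_shot_prompt(
    "1: int n = read_len();\n"
    "2: char *p = malloc(n);\n"
    "3: memcpy(p, src, n + 1);\n"  # off-by-one heap overflow
))
```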
arXiv Detail & Related papers (2024-03-30T08:42:10Z)
- LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
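The roofline model behind this framework caps an operation's attainable throughput at min(peak compute, memory bandwidth × arithmetic intensity). A minimal sketch, using illustrative hardware numbers rather than figures from the paper:

```python
# Minimal roofline-model sketch: an operation is memory-bound when
# bandwidth x arithmetic intensity falls below peak compute.
# All hardware numbers below are illustrative, not from the paper.

def roofline(flops: float, bytes_moved: float,
             peak_flops: float, peak_bw: float) -> tuple[float, str]:
    """Return attainable FLOP/s and which resource bounds the operation."""
    intensity = flops / bytes_moved                   # FLOPs per byte
    attainable = min(peak_flops, peak_bw * intensity)
    bound = "memory-bound" if peak_bw * intensity < peak_flops else "compute-bound"
    return attainable, bound

# Decoding one token reads every weight roughly once, so arithmetic
# intensity is low and decoding is typically memory-bound.
flops_per_token = 2 * 7e9   # ~2 FLOPs per parameter, 7B-parameter model (rough)
bytes_per_token = 2 * 7e9   # fp16 weights (2 bytes each) read once (rough)
print(roofline(flops_per_token, bytes_per_token,
               peak_flops=312e12,   # illustrative GPU fp16 peak
               peak_bw=2.0e12))     # illustrative ~2 TB/s HBM bandwidth
```

This is why the bottleneck analysis matters for deployment: at low arithmetic intensity, faster memory, quantization, or batching helps more than raw compute.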
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- The potential of LLMs for coding with low-resource and domain-specific programming languages [0.0]
This study focuses on the econometric scripting language named hansl of the open-source software gretl.
Our findings suggest that LLMs can be a useful tool for writing, understanding, improving, and documenting gretl code.
arXiv Detail & Related papers (2023-07-24T17:17:13Z)
- Exploring the Effectiveness of LLMs in Automated Logging Generation: An Empirical Study [32.53659676826846]
This paper presents the first study exploring large language models (LLMs) for logging statement generation.
We first build a logging statement generation dataset, LogBench, with two parts: (1) LogBench-O: logging statements collected from GitHub repositories, and (2) LogBench-T: the transformed unseen code from LogBench-O.
arXiv Detail & Related papers (2023-07-12T06:32:51Z)