Large Language Models for Test-Free Fault Localization
- URL: http://arxiv.org/abs/2310.01726v1
- Date: Tue, 3 Oct 2023 01:26:39 GMT
- Title: Large Language Models for Test-Free Fault Localization
- Authors: Aidan Z.H. Yang, Ruben Martins, Claire Le Goues, Vincent J. Hellendoorn
- Abstract summary: We propose a language-model-based fault localization approach that locates buggy lines of code without any test coverage information.
We fine-tune language models with 350 million, 6 billion, and 16 billion parameters on small, manually curated corpora of buggy programs.
Our empirical evaluation shows that LLMAO improves the Top-1 results over the state-of-the-art machine learning fault localization (MLFL) baselines by 2.3%-54.4%, and Top-5 results by 14.4%-35.6%.
- Score: 11.080712737595174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fault Localization (FL) aims to automatically localize buggy lines of code, a
key first step in many manual and automatic debugging tasks. Previous FL
techniques assume the provision of input tests, and often require extensive
program analysis, program instrumentation, or data preprocessing. Prior work on
deep learning for automated program repair (APR) struggles to learn from small datasets and produces
limited results on real-world programs. Inspired by the ability of large
language models (LLMs) of code to adapt to new tasks based on very few
examples, we investigate the applicability of LLMs to line-level fault
localization. Specifically, we propose to overcome the left-to-right nature of
LLMs by fine-tuning a small set of bidirectional adapter layers on top of the
representations learned by LLMs to produce LLMAO, the first language-model-based
fault localization approach that locates buggy lines of code without any
test coverage information. We fine-tune LLMs with 350 million, 6 billion, and
16 billion parameters on small, manually curated corpora of buggy programs such
as the Defects4J corpus. We observe that our technique achieves substantially
more confidence in fault localization when built on the larger models, with bug
localization performance scaling consistently with the LLM size. Our empirical
evaluation shows that LLMAO improves the Top-1 results over the
state-of-the-art machine learning fault localization (MLFL) baselines by
2.3%-54.4%, and Top-5 results by 14.4%-35.6%. LLMAO is also the first FL
technique trained using a language model architecture that can detect security
vulnerabilities down to the code line level.
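To make the adapter idea concrete, below is a minimal sketch of the kind of architecture the abstract describes: a small bidirectional Transformer stacked on frozen LLM hidden states that scores each code line. The dimensions and the pooling of token states into per-line states are assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class BidirectionalLineScorer(nn.Module):
    """Minimal LLMAO-style sketch: a small bidirectional Transformer adapter on
    top of frozen LLM hidden states, emitting one suspiciousness score per code
    line. Dimensions and the line-pooling scheme are assumptions."""

    def __init__(self, llm_hidden_dim: int, adapter_dim: int = 256, num_layers: int = 2):
        super().__init__()
        self.project = nn.Linear(llm_hidden_dim, adapter_dim)
        layer = nn.TransformerEncoderLayer(d_model=adapter_dim, nhead=4, batch_first=True)
        # No causal mask: unlike the left-to-right LLM, the adapter attends in both directions.
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.score = nn.Linear(adapter_dim, 1)

    def forward(self, line_states: torch.Tensor) -> torch.Tensor:
        # line_states: (batch, num_lines, llm_hidden_dim), e.g. the frozen LLM's
        # hidden state at each line's last token.
        h = self.encoder(self.project(line_states))
        return torch.sigmoid(self.score(h)).squeeze(-1)  # (batch, num_lines) per-line suspiciousness

# Training sketch: freeze the LLM, optimize only the adapter with binary
# cross-entropy against buggy-line labels from a corpus such as Defects4J.
```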
Related papers
- Where's the Bug? Attention Probing for Scalable Fault Localization [18.699014321422023]
We present Bug Attention Probe (BAP), a method which learns state-of-the-art fault localization without any direct localization labels.
BAP is significantly more efficient than prompting, outperforming large open-weight models at a small fraction of the computational cost.
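The summary gives no implementation details, but a generic attention probe, a lightweight scorer trained over a frozen model's attention weights, can be sketched as follows; the per-line token spans, layer choice, and head count are assumptions rather than BAP's actual design.

```python
import torch
import torch.nn as nn

def line_attention_features(attn: torch.Tensor, line_spans: list[tuple[int, int]]) -> torch.Tensor:
    """Aggregate the attention mass each code line receives, per head.
    attn: (num_heads, seq_len, seq_len) attention weights from one layer of a
    frozen LLM; line_spans: (start, end) token indices per line (assumed)."""
    per_line = [attn[:, :, s:e].sum(dim=(1, 2)) for s, e in line_spans]
    return torch.stack(per_line)          # (num_lines, num_heads)

# A small probe maps those features to a per-line fault score; only the probe
# is trained, and its training signal would come from weak supervision rather
# than direct line-level localization labels.
probe = nn.Linear(in_features=32, out_features=1)   # 32 heads here is illustrative
```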
arXiv Detail & Related papers (2025-02-19T18:59:32Z)
- LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.
LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.
Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
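The penalty-weighting step translates directly into a weighted Lasso. The sketch below uses the standard column-rescaling trick to impose per-feature penalties with an off-the-shelf solver; the penalty factors and the helper name are illustrative, not the paper's code.

```python
import numpy as np
from sklearn.linear_model import Lasso

def weighted_lasso(X: np.ndarray, y: np.ndarray, penalty_factors: np.ndarray, alpha: float = 0.1):
    """Lasso with per-feature penalties (e.g., LLM-assigned relevance scores;
    lower factor = more relevant). A weighted L1 penalty sum_j w_j |beta_j| is
    equivalent to rescaling each column by 1 / w_j, fitting a standard Lasso,
    and rescaling the coefficients back."""
    w = np.asarray(penalty_factors, dtype=float)
    X_scaled = X / w                      # broadcast over columns
    model = Lasso(alpha=alpha).fit(X_scaled, y)
    beta = model.coef_ / w                # map back to the original feature scale
    return beta

# Hypothetical usage: the LLM judged feature 0 highly relevant (small penalty)
# and feature 2 irrelevant (large penalty).
# beta = weighted_lasso(X, y, penalty_factors=np.array([0.2, 1.0, 5.0]))
```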
arXiv Detail & Related papers (2025-02-15T02:55:22Z)
- Enhancing Code Generation for Low-Resource Languages: No Silver Bullet [55.39571645315926]
Large Language Models (LLMs) rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages.
For low-resource languages, the limited availability of such data hampers the models' ability to generalize effectively.
We present an empirical study investigating the effectiveness of several approaches for boosting LLMs' performance on low-resource languages.
arXiv Detail & Related papers (2025-01-31T12:23:28Z) - Adaptive Pruning for Large Language Models with Structural Importance Awareness [66.2690963378878]
Large language models (LLMs) have significantly improved language understanding and generation capabilities.
However, LLMs are difficult to deploy on resource-constrained edge devices due to their high computational and storage demands.
We propose structurally-aware adaptive pruning (SAAP) to significantly reduce the computational and memory costs while maintaining model performance.
arXiv Detail & Related papers (2024-12-19T18:08:04Z) - Enhancing Fault Localization Through Ordered Code Analysis with LLM Agents and Self-Reflection [8.22737389683156]
Large Language Models (LLMs) offer promising improvements in fault localization by enhancing code comprehension and reasoning.
We introduce LLM4FL, a novel LLM-agent-based fault localization approach that integrates SBFL rankings with a divide-and-conquer strategy.
Our results demonstrate that LLM4FL outperforms AutoFL by 19.27% in Top-1 accuracy and surpasses state-of-the-art supervised techniques such as DeepFL and Grace.
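The SBFL rankings that LLM4FL starts from are typically computed with formulas such as Ochiai; a minimal sketch of that generic scoring step (not LLM4FL's agent orchestration) is:

```python
import math

def ochiai_suspiciousness(failed_cov: int, passed_cov: int, total_failed: int) -> float:
    """Classic Ochiai SBFL score for one code line.
    failed_cov / passed_cov: failing / passing tests covering the line;
    total_failed: total number of failing tests."""
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom > 0 else 0.0

# Rank lines by descending suspiciousness; an LLM agent can then inspect the
# top-ranked candidates. Coverage counts here are hypothetical.
coverage = {"Foo.java:42": (3, 1), "Foo.java:57": (1, 8)}
ranking = sorted(coverage, key=lambda l: -ochiai_suspiciousness(*coverage[l], total_failed=3))
```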
arXiv Detail & Related papers (2024-09-20T16:47:34Z) - Impact of Large Language Models of Code on Fault Localization [2.936007114555107]
We propose a simple but effective sequence generation approach for fine-tuning large language models of code for FL tasks.
Specifically, we fine-tune 13 representative encoder-, encoder-decoder-, and decoder-based LLMCs for FL tasks.
Experimental results show that LLMCs fine-tuned with our approach successfully pinpoint error positions in 50.6%, 64.2%, and 72.3% of 1,291 methods in Defects4J for Top-2/3/5 prediction.
arXiv Detail & Related papers (2024-08-19T02:36:07Z) - What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated than canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
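A schematic of such a self-critique loop, generate, gather compiler/test feedback, critique, regenerate, is sketched below; `query_llm` and `compile_and_test` are hypothetical callables and the prompts are illustrative, not the paper's framework.

```python
def self_critique_repair(task: str, query_llm, compile_and_test, max_rounds: int = 3) -> str:
    """Training-free iterative repair sketch: generate code, collect
    compiler/test feedback, ask the model to critique and fix its own output,
    and repeat. query_llm(prompt) -> str and compile_and_test(code) ->
    (ok, feedback) are hypothetical callables supplied by the caller."""
    code = query_llm(f"Write code for the following task:\n{task}")
    for _ in range(max_rounds):
        ok, feedback = compile_and_test(code)
        if ok:
            break
        critique = query_llm(
            f"The following code fails with this feedback:\n{feedback}\n\n"
            f"Code:\n{code}\n\nExplain the likely bug type and how to fix it."
        )
        code = query_llm(
            f"Rewrite the code so it passes. Critique:\n{critique}\n\nOriginal code:\n{code}"
        )
    return code
```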
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - An empirical study of LLaMA3 quantization: from LLMs to MLLMs [54.91212829143966]
The LLaMA family comprises some of the most powerful open-source large language models (LLMs).
LLaMA3 models have achieved impressive performance across various domains through super-large-scale pre-training on over 15T tokens of data.
We evaluate 10 existing post-training quantization and LoRA fine-tuning (LoRA-FT) methods on LLaMA3 at 1-8 bits across various datasets to reveal its low-bit quantization performance.
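For reference, loading LLaMA3 under one common post-training quantization setting (4-bit NF4 via Hugging Face transformers and bitsandbytes) looks roughly like the sketch below; the checkpoint name and settings are illustrative, and this is not the paper's evaluation harness.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 post-training quantization via bitsandbytes: one of many low-bit
# settings a study like this would compare.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model_id = "meta-llama/Meta-Llama-3-8B"          # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
```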
arXiv Detail & Related papers (2024-04-22T10:03:03Z) - Aligning the Objective of LLM-based Program Repair [14.935596175148586]
This paper investigates a new approach to adapt large language models (LLMs) to program repair.
Our core insight is that LLM's APR capability can be greatly improved by simply aligning the output to their training objective.
Based on this insight, we designed D4C, a straightforward prompting framework for APR.
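The alignment insight can be illustrated with a hedged sketch: prompt the model to regenerate the whole function, which matches its native code-completion objective, instead of requesting a patch format it was never trained to produce. The field names below are assumptions, not D4C's actual prompt.

```python
def build_repair_prompt(buggy_function: str, failing_test: str, error_message: str) -> str:
    """Illustrative prompt that asks the model to rewrite the complete function,
    mirroring its completion-style training objective, rather than to emit a
    diff or edit script. Field names are assumptions."""
    return (
        "The following function is buggy.\n"
        f"```\n{buggy_function}\n```\n"
        f"A failing test:\n```\n{failing_test}\n```\n"
        f"Observed error:\n{error_message}\n\n"
        "Rewrite the complete, corrected function below:\n```\n"
    )
```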
arXiv Detail & Related papers (2024-04-13T02:36:40Z) - Large Language Models in Fault Localisation [32.87044163543427]
This paper investigates the capability of ChatGPT-3.5 and ChatGPT-4, two state-of-the-art LLMs, on fault localisation.
Within a function-level context, ChatGPT-4 outperforms all existing fault localisation methods.
However, when the code context of the Defects4J dataset expands to the class level, ChatGPT-4's performance drops significantly.
arXiv Detail & Related papers (2023-08-29T13:07:27Z) - LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
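A rough sketch of the kind of first-order importance score used to decide which structures to remove is shown below (a generic |weight x gradient| heuristic; LLM-Pruner's dependency grouping of coupled structures and its recovery fine-tuning are not reproduced here).

```python
import torch
import torch.nn as nn

def channel_importance(linear: nn.Linear) -> torch.Tensor:
    """First-order (|weight * grad|) importance per output channel of a linear
    layer, summed over inputs. Real structural pruning must also group coupled
    structures (e.g., attention heads and their projections) so dependent
    dimensions are removed together; that bookkeeping is omitted here."""
    assert linear.weight.grad is not None, "run a backward pass on calibration data first"
    return (linear.weight * linear.weight.grad).abs().sum(dim=1)   # (out_features,)

def keep_mask(importance: torch.Tensor, prune_ratio: float = 0.2) -> torch.Tensor:
    """Boolean mask keeping the most important channels; the least important
    `prune_ratio` fraction would be structurally removed."""
    k = max(int(importance.numel() * prune_ratio), 1)
    threshold = importance.kthvalue(k).values
    return importance > threshold
```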
arXiv Detail & Related papers (2023-05-19T12:10:53Z)