Revisiting the Plastic Surgery Hypothesis via Large Language Models
- URL: http://arxiv.org/abs/2303.10494v1
- Date: Sat, 18 Mar 2023 20:33:46 GMT
- Title: Revisiting the Plastic Surgery Hypothesis via Large Language Models
- Authors: Chunqiu Steven Xia, Yifeng Ding, Lingming Zhang
- Abstract summary: We propose FitRepair, which combines the direct usage of Large Language Models with two domain-specific fine-tuning strategies and one prompting strategy for more powerful APR.
Our experiments on the widely studied Defects4J 1.2 and 2.0 datasets show that FitRepair fixes 89 and 44 bugs, respectively.
- Score: 9.904030364454563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated Program Repair (APR) aspires to automatically generate patches for
an input buggy program. Traditional APR tools typically focus on specific bug
types and fixes through the use of templates, heuristics, and formal
specifications. However, these techniques are limited in terms of the bug types
and patch variety they can produce. As such, researchers have designed various
learning-based APR tools with recent work focused on directly using Large
Language Models (LLMs) for APR. While LLM-based APR tools are able to achieve
state-of-the-art performance on many repair datasets, the LLMs used for direct
repair are not fully aware of the project-specific information such as unique
variable or method names.
The plastic surgery hypothesis is a well-known insight for APR, which states
that the code ingredients to fix the bug usually already exist within the same
project. Traditional APR tools have largely leveraged the plastic surgery
hypothesis by designing manual or heuristic-based approaches to exploit such
existing code ingredients. However, as recent APR research has shifted toward
LLM-based approaches, the plastic surgery hypothesis has been largely ignored.
In this paper, we ask the following question: How useful is the plastic surgery
hypothesis in the era of LLMs? Interestingly, LLM-based APR presents a unique
opportunity to fully automate the plastic surgery hypothesis via fine-tuning
and prompting. To this end, we propose FitRepair, which combines the direct
usage of LLMs with two domain-specific fine-tuning strategies and one prompting
strategy for more powerful APR. Our experiments on the widely studied Defects4J
1.2 and 2.0 datasets show that FitRepair fixes 89 and 44 bugs (substantially
outperforming the best-performing baseline by 15 and 8), respectively,
demonstrating a promising future of the plastic surgery hypothesis in the era
of LLMs.
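To make the idea concrete, the sketch below shows one way the plastic surgery hypothesis could be automated in an LLM prompt: mine identifier "ingredients" from the project's source files and surface the most lexically similar ones as hints in a cloze-style repair prompt. This is a minimal, hypothetical sketch, not FitRepair's actual fine-tuning or prompting strategies; the function names, the bigram-overlap ranking, and the prompt layout are all assumptions for illustration.

```python
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def extract_ingredients(project_files):
    """Collect identifier tokens (candidate 'code ingredients') from the
    project's source files, per the plastic surgery hypothesis.
    `project_files` maps file names to source text."""
    ingredients = set()
    for source in project_files.values():
        ingredients.update(IDENT.findall(source))
    return ingredients

def build_repair_prompt(buggy_line, ingredients, k=5):
    """Rank ingredients by character-bigram overlap with the buggy line
    (a crude stand-in for a real relevance model) and surface the top-k
    as hints in a fill-in-the-blank repair prompt."""
    def bigrams(s):
        return {s[i:i + 2] for i in range(len(s) - 1)}
    line_grams = bigrams(buggy_line)
    ranked = sorted(ingredients,
                    key=lambda ing: len(bigrams(ing) & line_grams),
                    reverse=True)
    hints = ", ".join(ranked[:k])
    return (f"// relevant project identifiers: {hints}\n"
            f"// buggy line:\n{buggy_line}\n"
            f"// fixed line:\n")

# Hypothetical one-file project and buggy line, for illustration only.
project = {
    "Bounds.java": "int clampIndex(int idx, int maxIndex)"
                   " { return Math.min(idx, maxIndex); }",
}
prompt = build_repair_prompt("return values[idx + 1];",
                             extract_ingredients(project))
print(prompt)
```

The ranked hints steer the model toward project-specific names (e.g. `clampIndex`) that a generically trained LLM would be unlikely to produce on its own, which is exactly the gap the hypothesis targets.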
Related papers
- Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing [63.20133320524577]
Large Language Models (LLMs) have demonstrated great potential as generalist assistants.
It is crucial that these models exhibit desirable behavioral traits, such as non-toxicity and resilience against jailbreak attempts.
In this paper, we observe that directly editing a small subset of parameters can effectively modulate specific behaviors of LLMs.
arXiv Detail & Related papers (2024-07-11T17:52:03Z) - Hybrid Automated Program Repair by Combining Large Language Models and Program Analysis [12.7034916462208]
Automated Program Repair (APR) has garnered significant attention due to its potential to streamline the bug repair process for human developers.
This paper introduces an innovative APR approach called GIANTREPAIR.
Based on this insight, GIANTREPAIR first constructs patch skeletons from LLM-generated patches to confine the patch space, and then generates high-quality patches tailored to specific programs.
arXiv Detail & Related papers (2024-06-03T05:05:12Z) - Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models [9.454475517867817]
We propose a patch-naturalness measurement, entropy-delta, to improve the efficiency of template-based repair techniques.
Our proposed method can rank correct patches more effectively than state-of-the-art machine learning tools.
arXiv Detail & Related papers (2024-04-23T17:12:45Z) - Aligning LLMs for FL-free Program Repair [14.935596175148586]
This paper investigates a new approach to adapt large language models (LLMs) to program repair.
Our core insight is that an LLM's APR capability can be greatly improved by simply aligning the output to its training objective.
Based on this insight, we designed D4C, a straightforward prompting framework for APR.
arXiv Detail & Related papers (2024-04-13T02:36:40Z) - LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery [57.358568111574314]
Patient data privacy often restricts the availability of old data when updating the model.
Prior CL studies overlooked two vital problems in the surgical domain.
This paper proposes addressing these problems with a multimodal large language model (LLM) and an adaptive weight assignment methodology.
arXiv Detail & Related papers (2024-02-26T15:35:24Z) - Large Language Model Distilling Medication Recommendation Model [61.89754499292561]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs).
Our research aims to transform existing medication recommendation methodologies using LLMs.
To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z) - A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z) - Evaluating Pre-trained Language Models for Repairing API Misuses [15.17607624946389]
API misuses often lead to software bugs, crashes, and vulnerabilities.
In a recent study, test-suite-based automatic program repair (APR) tools were found to be ineffective in repairing API misuses.
We conduct a comprehensive empirical study on 11 learning-aided APR tools, comprising nine state-of-the-art general-purpose PLMs and two APR tools.
Our results show that PLMs perform better than the studied APR tools in repairing API misuses.
arXiv Detail & Related papers (2023-10-25T06:10:22Z) - Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies [104.32199881187607]
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks.
A promising approach to rectify these flaws is self-correction, where the LLM itself is prompted or guided to fix problems in its own output.
This paper presents a comprehensive review of this emerging class of techniques.
arXiv Detail & Related papers (2023-08-06T18:38:52Z) - Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z) - Conversational Automated Program Repair [10.071615423169902]
We propose a new paradigm for program repair that alternates between patch generation and validation in a conversational manner.
We leverage the long-term context window of Large Pre-Trained Language Models to not only avoid generating previously incorrect patches but also incorporate validation feedback to help the model understand the semantic meaning of the program under test.
arXiv Detail & Related papers (2023-01-30T19:22:36Z)
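The alternating generate-and-validate paradigm described in the Conversational Automated Program Repair entry can be sketched as a simple feedback loop. Here `generate` and `validate` are hypothetical caller-supplied stand-ins (an LLM call and a test-suite run in practice), and all names are assumptions for illustration rather than that paper's actual interface.

```python
def conversational_repair(generate, validate, buggy_code, max_turns=5):
    """Alternate patch generation and validation, feeding each failure
    message back into the conversation history so the model avoids
    repeating previously incorrect patches."""
    history = [("user", f"Fix this bug:\n{buggy_code}")]
    for _ in range(max_turns):
        patch = generate(history)          # LLM proposes a patch
        history.append(("assistant", patch))
        ok, feedback = validate(patch)     # e.g. run the test suite
        if ok:
            return patch
        # Validation feedback becomes the next conversational turn.
        history.append(("user", f"Patch failed: {feedback}. Try again."))
    return None  # no plausible patch within the turn budget

# Toy demonstration with canned patches in place of an LLM.
attempts = iter(["x = 1", "x = 2"])
result = conversational_repair(
    generate=lambda history: next(attempts),
    validate=lambda patch: (patch == "x = 2", "expected x == 2"),
    buggy_code="x = 0",
)
print(result)  # x = 2
```

The key design point is that validation output is folded back into the prompt context, so each retry is conditioned on what already failed, rather than sampling patches independently.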
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.