A Survey of Learning-based Automated Program Repair
- URL: http://arxiv.org/abs/2301.03270v3
- Date: Wed, 1 Nov 2023 06:03:48 GMT
- Title: A Survey of Learning-based Automated Program Repair
- Authors: Quanjun Zhang, Chunrong Fang, Yuxiang Ma, Weisong Sun, Zhenyu Chen
- Abstract summary: Automated program repair (APR) aims to fix software bugs automatically and plays a crucial role in software development and maintenance.
With the recent advances in deep learning (DL), an increasing number of APR techniques have been proposed to leverage neural networks to learn bug-fixing patterns from massive open-source code repositories.
This paper provides a systematic survey to summarize the current state-of-the-art research in the learning-based APR community.
- Score: 12.09968472868107
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated program repair (APR) aims to fix software bugs automatically and
plays a crucial role in software development and maintenance. With the recent
advances in deep learning (DL), an increasing number of APR techniques have
been proposed to leverage neural networks to learn bug-fixing patterns from
massive open-source code repositories. Such learning-based techniques usually
treat APR as a neural machine translation (NMT) task, where buggy code snippets
(i.e., source language) are translated into fixed code snippets (i.e., target
language) automatically. Benefiting from the powerful capability of DL to learn
hidden relationships from previous bug-fixing datasets, learning-based APR
techniques have achieved remarkable performance. In this paper, we provide a
systematic survey to summarize the current state-of-the-art research in the
learning-based APR community. We illustrate the general workflow of
learning-based APR techniques and detail the crucial components, including
fault localization, patch generation, patch ranking, patch validation, and
patch correctness phases. We then discuss the widely-adopted datasets and
evaluation metrics and outline existing empirical studies. We discuss several
critical aspects of learning-based APR techniques, such as repair domains,
industrial deployment, and the open science issue. We highlight several
practical guidelines on applying DL techniques for future APR studies, such as
exploring explainable patch generation and utilizing code features. Overall,
our paper can help researchers gain a comprehensive understanding about the
achievements of the existing learning-based APR techniques and promote the
practical application of these techniques. Our artifacts are publicly available
at \url{https://github.com/QuanjunZhang/AwesomeLearningAPR}.
Related papers
- Machine Learning Innovations in CPR: A Comprehensive Survey on Enhanced Resuscitation Techniques [52.71395121577439]
This survey paper explores the transformative role of Machine Learning (ML) and Artificial Intelligence (AI) in Cardiopulmonary Resuscitation (CPR)
It highlights the impact of predictive modeling, AI-enhanced devices, and real-time data analysis in improving resuscitation outcomes.
The paper provides a comprehensive overview, classification, and critical analysis of current applications, challenges, and future directions in this emerging field.
arXiv Detail & Related papers (2024-11-03T18:01:50Z) - A Case Study of LLM for Automated Vulnerability Repair: Assessing Impact of Reasoning and Patch Validation Feedback [7.742213291781287]
We present VRpilot, a vulnerability repair technique based on reasoning and patch validation feedback.
Our results show that VRpilot generates, on average, 14% and 7.6% more correct patches than the baseline techniques on C and Java.
arXiv Detail & Related papers (2024-05-24T16:29:48Z) - A Closer Look at the Limitations of Instruction Tuning [52.587607091917214]
We show that Instruction Tuning (IT) fails to enhance knowledge or skills in large language models (LLMs)
We also show that popular methods to improve IT do not lead to performance improvements over a simple LoRA fine-tuned model.
Our findings reveal that responses generated solely from pre-trained knowledge consistently outperform responses by models that learn any form of new knowledge from IT on open-source datasets.
arXiv Detail & Related papers (2024-02-03T04:45:25Z) - KGA: A General Machine Unlearning Framework Based on Knowledge Gap
Alignment [51.15802100354848]
We propose a general unlearning framework called KGA to induce forgetfulness.
Experiments on large-scale datasets show that KGA yields comprehensive improvements over baselines.
arXiv Detail & Related papers (2023-05-11T02:44:29Z) - A Survey on Automated Program Repair Techniques [19.8878105453369]
We introduce four different APR patch generation schemes: search-based, constraint-based, template-based, and learning-based.
We propose a uniform set of criteria to review and compare each APR tool, summarize the advantages and disadvantages, and discuss the current state of APR development.
arXiv Detail & Related papers (2023-03-31T16:28:37Z) - Meta Learning for Natural Language Processing: A Survey [88.58260839196019]
Deep learning has been the mainstream technique in natural language processing (NLP) area.
Deep learning requires many labeled data and is less generalizable across domains.
Meta-learning is an arising field in machine learning studying approaches to learn better algorithms.
arXiv Detail & Related papers (2022-05-03T13:58:38Z) - Neural Program Repair: Systems, Challenges and Solutions [20.776565589340265]
Automated Program Repair (APR) aims to automatically fix bugs in the source code.
Recently, there is a rise of Neural Program Repair (NPR) studies.
NPR approaches have a great advantage in applicability because they do not need any specification.
arXiv Detail & Related papers (2022-02-22T12:56:31Z) - CURE: Code-Aware Neural Machine Translation for Automatic Program Repair [11.556110575946631]
We propose CURE, a new NMT-based APR technique with three major novelties.
CURE pre-trains a programming language (PL) model on a large software to learn developer-like source code before the APR task.
Second, CURE designs a new code-aware search strategy that finds more correct fixes by focusing on compilable patches and patches that are close in length to the buggy code.
arXiv Detail & Related papers (2021-02-26T22:30:28Z) - A Systematic Literature Review on the Use of Deep Learning in Software
Engineering Research [22.21817722054742]
An increasingly popular set of techniques adopted by software engineering (SE) researchers to automate development tasks are those rooted in the concept of Deep Learning (DL)
This paper presents a systematic literature review of research at the intersection of SE & DL.
We center our analysis around the components of learning, a set of principles that govern the application of machine learning techniques to a given problem domain.
arXiv Detail & Related papers (2020-09-14T15:28:28Z) - Discovering Reinforcement Learning Algorithms [53.72358280495428]
Reinforcement learning algorithms update an agent's parameters according to one of several possible rules.
This paper introduces a new meta-learning approach that discovers an entire update rule.
It includes both 'what to predict' (e.g. value functions) and 'how to learn from it' by interacting with a set of environments.
arXiv Detail & Related papers (2020-07-17T07:38:39Z) - Bayesian active learning for production, a systematic study and a
reusable library [85.32971950095742]
In this paper, we analyse the main drawbacks of current active learning techniques.
We do a systematic study on the effects of the most common issues of real-world datasets on the deep active learning process.
We derive two techniques that can speed up the active learning loop such as partial uncertainty sampling and larger query size.
arXiv Detail & Related papers (2020-06-17T14:51:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.