A Syntax-Guided Edit Decoder for Neural Program Repair
- URL: http://arxiv.org/abs/2106.08253v1
- Date: Tue, 15 Jun 2021 16:01:51 GMT
- Authors: Qihao Zhu, Zeyu Sun, Yuan-an Xiao, Wenjie Zhang, Kang Yuan, Yingfei
Xiong, Lu Zhang
- Abstract summary: We propose Recoder, a syntax-guided edit decoder with placeholder generation.
We conduct experiments to evaluate Recoder on 395 bugs from Defects4J v1.2, 420 additional bugs from Defects4J v2.0, 297 bugs from IntroClassJava and 40 bugs from QuixBugs.
Our results show that Recoder repairs 53 bugs on Defects4J v1.2, a 26.2% (11 bugs) improvement over the previous state-of-the-art approach for single-hunk bugs (TBar).
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated Program Repair (APR) helps improve the efficiency of software
development and maintenance. Recent APR techniques use deep learning,
particularly the encoder-decoder architecture, to generate patches.
Though existing DL-based APR approaches have proposed different encoder
architectures, the decoder remains the standard one, which generates a
sequence of tokens one by one to replace the faulty statement.
This decoder has several limitations: 1) it can generate syntactically
incorrect programs, 2) it represents small edits inefficiently, and 3) it
cannot generate project-specific identifiers.
In this paper, we propose Recoder, a syntax-guided edit decoder with
placeholder generation. Recoder is novel in multiple aspects: 1) Recoder
generates edits rather than modified code, allowing efficient representation of
small edits; 2) Recoder is syntax-guided, with the novel provider/decider
architecture to ensure the syntactic correctness of the patched program and
accurate generation; 3) Recoder generates placeholders that can later be
instantiated with project-specific identifiers.
We conduct experiments to evaluate Recoder on 395 bugs from Defects4J v1.2,
420 additional bugs from Defects4J v2.0, 297 bugs from IntroClassJava and 40
bugs from QuixBugs. Our results show that Recoder repairs 53 bugs on Defects4J
v1.2, a 26.2% (11 bugs) improvement over the previous
state-of-the-art approach for single-hunk bugs (TBar). Importantly, to our
knowledge, Recoder is the first DL-based APR approach that has outperformed the
traditional APR approaches on this benchmark.
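The edit-plus-placeholder idea can be sketched in a few lines (a hypothetical illustration, not Recoder's actual decoder; `apply_edit`, `instantiate`, and the `<ph0>` token are invented here): an edit replaces a small span of the faulty statement with a placeholder, and a later pass binds the placeholder to an identifier drawn from the enclosing project.

```python
def apply_edit(tokens, edit):
    # An edit is (start, end, replacement): far cheaper to represent
    # than regenerating the whole modified statement token by token.
    start, end, replacement = edit
    return tokens[:start] + replacement + tokens[end:]

def instantiate(tokens, bindings):
    # Bind placeholder tokens (e.g. '<ph0>') to project-specific
    # identifiers chosen in a later pass.
    return [bindings.get(tok, tok) for tok in tokens]

buggy = ["if", "(", "size", ">", "0", ")"]
edit = (2, 3, ["<ph0>"])  # replace 'size' with a placeholder
patched = instantiate(apply_edit(buggy, edit), {"<ph0>": "buffer.length"})
print(" ".join(patched))  # if ( buffer.length > 0 )
```

A sequence-of-tokens decoder would have to emit all six tokens of the patched statement; the edit form only has to emit the three-element tuple and one binding.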
Related papers
- A Novel Approach for Automatic Program Repair using Round-Trip
Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
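The RTT control flow can be sketched as follows (a minimal sketch; `round_trip_repair` and the `toy_translate` stand-in are hypothetical, with whitespace normalization playing the role of the neural translator's regression toward typical code):

```python
def round_trip_repair(code, translate, pivots=("java", "python")):
    # Translate the buggy code to each pivot and back; round trips that
    # change the code are candidate repairs.
    candidates = []
    for pivot in pivots:
        back = translate(translate(code, target=pivot), target="source")
        if back != code and back not in candidates:
            candidates.append(back)
    return candidates or [code]

def toy_translate(text, target):
    # Toy stand-in for a neural translator: any round trip normalizes
    # whitespace, which here plays the role of the "repair".
    return " ".join(text.split())

print(round_trip_repair("x =  1", toy_translate))  # ['x = 1']
```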
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
- RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic
Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen).
RAP-Gen explicitly leverages relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
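Retrieval over previous bug-fix pairs can be illustrated with a simple token-overlap score (a sketch only; RAP-Gen's actual retriever is learned, and `jaccard`, `retrieve_fix_patterns`, and the toy corpus here are invented for illustration):

```python
def jaccard(a, b):
    # Lexical similarity between two whitespace-tokenized code snippets.
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def retrieve_fix_patterns(buggy, corpus, k=1):
    # Rank previous (bug, fix) pairs by similarity to the buggy code and
    # return the top-k as fix patterns to condition generation on.
    ranked = sorted(corpus, key=lambda pair: jaccard(buggy, pair[0]), reverse=True)
    return ranked[:k]

corpus = [
    ("if ( i < = len )", "if ( i < len )"),
    ("return null ;", "return Optional.empty ( ) ;"),
]
print(retrieve_fix_patterns("if ( j < = size )", corpus))
# [('if ( i < = len )', 'if ( i < len )')]
```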
arXiv Detail & Related papers (2023-09-12T08:52:56Z)
- Decoder-Only or Encoder-Decoder? Interpreting Language Model as a
Regularized Encoder-Decoder [75.03283861464365]
The seq2seq task aims at generating the target sequence based on the given input source sequence.
Traditionally, most seq2seq tasks are solved with an encoder that encodes the source sequence and a decoder that generates the target text.
Recently, a number of new approaches have emerged that apply decoder-only language models directly to the seq2seq task.
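The decoder-only formulation can be sketched by putting source and target in one sequence (a hypothetical sketch; `decoder_only_seq2seq`, the separator token, and the toy copy-task `toy_next_token` are all invented here):

```python
def decoder_only_seq2seq(source, next_token, sep="<sep>", eos="<eos>", max_len=16):
    # Source and target live in one sequence; generation simply keeps
    # extending the concatenation until the end token appears.
    seq = list(source) + [sep]
    while len(seq) < max_len:
        tok = next_token(seq)
        if tok == eos:
            break
        seq.append(tok)
    return seq[seq.index(sep) + 1:]

def toy_next_token(seq):
    # Toy LM implementing a copy task: echo the source after the separator.
    i = seq.index("<sep>")
    src, tgt = seq[:i], seq[i + 1:]
    return src[len(tgt)] if len(tgt) < len(src) else "<eos>"

print(decoder_only_seq2seq(["a", "b"], toy_next_token))  # ['a', 'b']
```

An encoder-decoder model would instead encode `source` once and cross-attend to it; in the decoder-only view that cross-attention is replaced by ordinary self-attention over the concatenated sequence.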
arXiv Detail & Related papers (2023-04-08T15:44:29Z)
- KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program
Repair [33.04645845117822]
Automated Program Repair (APR) improves software reliability by generating patches for a buggy program automatically.
Recent APR techniques leverage deep learning (DL) to build models to learn to generate patches from existing patches and code corpora.
We propose a DL-based APR approach, which incorporates domain knowledge to guide patch generation in a direct and comprehensive way.
arXiv Detail & Related papers (2023-02-03T17:02:56Z)
- ASR Error Correction with Constrained Decoding on Operation Prediction [8.701142327932484]
We propose an ASR error correction method utilizing the predictions of correction operations.
Experiments on three public datasets demonstrate the effectiveness of the proposed approach in reducing the latency of the decoding process.
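Operation prediction can be illustrated with a toy applier (a sketch under the assumption that the model emits one KEEP/DEL/SUB operation per hypothesis token; all names here are invented). Since most tokens are KEEP, the corrector only has to generate replacement text for the few tokens it changes, which is where the latency savings come from:

```python
def apply_operations(tokens, ops):
    # Apply per-token operations predicted by the correction model:
    # 'KEEP' copies, 'DEL' drops, ('SUB', w) substitutes the token.
    out = []
    for tok, op in zip(tokens, ops):
        if op == "KEEP":
            out.append(tok)
        elif op != "DEL":  # ('SUB', replacement)
            out.append(op[1])
    return out

hyp = ["eye", "scream", "for", "ice", "cream", "cream"]
ops = [("SUB", "i"), "KEEP", "KEEP", "KEEP", "KEEP", "DEL"]
print(" ".join(apply_operations(hyp, ops)))  # i scream for ice cream
```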
arXiv Detail & Related papers (2022-08-09T09:59:30Z)
- Diffsound: Discrete Diffusion Model for Text-to-sound Generation [78.4128796899781]
We propose a novel text-to-sound generation framework that consists of a text encoder, a Vector Quantized Variational Autoencoder (VQ-VAE), a decoder, and a vocoder.
The framework first uses the decoder to transfer the text features extracted from the text encoder to a mel-spectrogram with the help of VQ-VAE, and then the vocoder is used to transform the generated mel-spectrogram into a waveform.
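The stage composition described above can be sketched with stub stages (everything below is a toy stand-in invented for illustration; the real encoder, decoder, VQ-VAE, and vocoder are neural networks):

```python
def diffsound_pipeline(text, encoder, decoder, vocoder):
    # Stage composition: text features -> mel-spectrogram frames (via the
    # decoder, with the VQ-VAE folded into the toy decoder here) -> waveform.
    return vocoder(decoder(encoder(text)))

# Toy stand-ins for the neural stages.
encoder = lambda t: [ord(c) % 8 for c in t]                    # "text features"
decoder = lambda feats: [[x, x + 1] for x in feats]            # "mel" frames
vocoder = lambda mel: [v / 8 for frame in mel for v in frame]  # "waveform"

wave = diffsound_pipeline("hi", encoder, decoder, vocoder)
print(wave)  # [0.0, 0.125, 0.125, 0.25]
```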
arXiv Detail & Related papers (2022-07-20T15:41:47Z)
- Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired
Speech Data [145.95460945321253]
We introduce two pre-training tasks for the encoder-decoder network using acoustic units, i.e., pseudo codes.
The proposed Speech2C reduces the word error rate (WER) by a relative 19.2% over the method without decoder pre-training.
arXiv Detail & Related papers (2022-03-31T15:33:56Z)
- FastCorrect 2: Fast Error Correction on Multiple Candidates for
Automatic Speech Recognition [92.12910821300034]
We propose FastCorrect 2, an error correction model that takes multiple ASR candidates as input for better correction accuracy.
FastCorrect 2 achieves better performance than the cascaded re-scoring and correction pipeline.
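Why multiple candidates help can be illustrated with position-wise majority voting (a toy sketch only; FastCorrect 2 uses a learned alignment and correction model, and this sketch assumes equal-length candidates):

```python
from collections import Counter

def fuse_candidates(candidates):
    # Position-wise majority vote across ASR candidates: where candidates
    # agree, the token is likely correct; disagreements localize the
    # errors worth correcting.
    assert len({len(c) for c in candidates}) == 1, "sketch assumes equal length"
    return [Counter(col).most_common(1)[0][0] for col in zip(*candidates)]

cands = [
    ["the", "cat", "sat"],
    ["the", "bat", "sat"],
    ["the", "cat", "sad"],
]
print(fuse_candidates(cands))  # ['the', 'cat', 'sat']
```

No single candidate above is fully correct, but the vote recovers the right transcript, which is the intuition behind feeding all candidates to the corrector.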
arXiv Detail & Related papers (2021-09-29T13:48:03Z)
- Improving the List Decoding Version of the Cyclically Equivariant Neural
Decoder [33.63188063525036]
We propose an improved version of the list decoding algorithm for BCH codes and punctured RM codes.
Our new decoder provides up to 2 dB of gain over the previous list decoder when measured by BER.
arXiv Detail & Related papers (2021-06-15T08:37:36Z)
- CURE: Code-Aware Neural Machine Translation for Automatic Program Repair [11.556110575946631]
We propose CURE, a new NMT-based APR technique with three major novelties.
First, CURE pre-trains a programming language (PL) model on a large software codebase to learn developer-like source code before the APR task.
Second, CURE designs a new code-aware search strategy that finds more correct fixes by focusing on compilable patches and patches that are close in length to the buggy code.
arXiv Detail & Related papers (2021-02-26T22:30:28Z)
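The code-aware search heuristics above can be sketched as a filter-and-rank over candidate patches (a hypothetical sketch; `rank_patches` is invented, and the lambda below is a stub standing in for a real compilability check):

```python
def rank_patches(buggy, patches, compiles):
    # Keep only candidates that pass the compilability check, then
    # prefer patches whose token length is closest to the buggy line's.
    valid = [p for p in patches if compiles(p)]
    return sorted(valid, key=lambda p: abs(len(p.split()) - len(buggy.split())))

buggy = "return a - b ;"
patches = ["return a + b + 0 ;", "return a + b ;", "return a +"]
best = rank_patches(buggy, patches, compiles=lambda p: not p.endswith("+"))
print(best[0])  # return a + b ;
```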
This list is automatically generated from the titles and abstracts of the papers in this site.