Learning to Update Natural Language Comments Based on Code Changes
- URL: http://arxiv.org/abs/2004.12169v2
- Date: Tue, 28 Apr 2020 02:53:17 GMT
- Title: Learning to Update Natural Language Comments Based on Code Changes
- Authors: Sheena Panthaplackel, Pengyu Nie, Milos Gligoric, Junyi Jessy Li,
Raymond J. Mooney
- Abstract summary: We formulate the novel task of automatically updating an existing natural language comment based on changes in the body of code it accompanies.
We propose an approach that learns to correlate changes across two distinct language representations, to generate a sequence of edits that are applied to the existing comment to reflect the source code modifications.
- Score: 48.829941738578086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We formulate the novel task of automatically updating an existing natural
language comment based on changes in the body of code it accompanies. We
propose an approach that learns to correlate changes across two distinct
language representations, to generate a sequence of edits that are applied to
the existing comment to reflect the source code modifications. We train and
evaluate our model using a dataset that we collected from commit histories of
open-source software projects, with each example consisting of a concurrent
update to a method and its corresponding comment. We compare our approach
against multiple baselines using both automatic metrics and human evaluation.
Results reflect the challenge of this task and that our model outperforms
baselines with respect to making edits.
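To make the task concrete, below is a minimal sketch of how a predicted sequence of comment edits might be applied to an existing comment. This is an illustration only, not the authors' implementation: the edit operations (keep/delete/insert), the whitespace tokenization, and the apply_edits helper are assumptions introduced here; the paper's actual edit vocabulary and decoding procedure may differ.

```python
# Hypothetical sketch: applying a sequence of comment edits to an old comment.
# Operation names and tokenization are illustrative assumptions, not the
# paper's actual edit representation.

from typing import List, Tuple

# Each edit is (operation, token): "keep" and "delete" consume one token of
# the old comment; "insert" emits a new token without consuming anything.
Edit = Tuple[str, str]

def apply_edits(old_comment: str, edits: List[Edit]) -> str:
    old_tokens = old_comment.split()
    new_tokens: List[str] = []
    i = 0  # position in the old comment
    for op, tok in edits:
        if op == "keep":
            new_tokens.append(old_tokens[i])
            i += 1
        elif op == "delete":
            i += 1
        elif op == "insert":
            new_tokens.append(tok)
        else:
            raise ValueError(f"unknown edit operation: {op}")
    new_tokens.extend(old_tokens[i:])  # copy any remaining tokens unchanged
    return " ".join(new_tokens)

# Example: the method's return value changed from a list to a count, so the
# model edits the comment accordingly.
edits = [
    ("keep", "Returns"),
    ("delete", "the"),
    ("delete", "list"),
    ("delete", "of"),
    ("insert", "the"),
    ("insert", "number"),
    ("insert", "of"),
    ("keep", "active"),
    ("keep", "users."),
]
print(apply_edits("Returns the list of active users.", edits))
# -> "Returns the number of active users."
```

The value of an edit-based formulation, as the abstract suggests, is that most of the existing comment can be carried over unchanged, so the model only has to predict the small span affected by the code modification rather than regenerating the whole comment.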
Related papers
- A Computational Analysis of Vagueness in Revisions of Instructional
Texts [2.2577978123177536]
We extract pairwise versions of an instruction before and after a revision was made.
We investigate the ability of a neural model to distinguish between two versions of an instruction in our data.
arXiv Detail & Related papers (2023-09-21T14:26:04Z) - EditEval: An Instruction-Based Benchmark for Text Improvements [73.5918084416016]
This work presents EditEval: an instruction-based benchmark and evaluation suite for the automatic evaluation of editing capabilities.
We evaluate several pre-trained models, showing that InstructGPT and PEER perform best, but that most baselines fall below the supervised SOTA.
Our analysis shows that commonly used metrics for editing tasks do not always correlate well, and that optimizing for the highest-performing prompts does not necessarily entail the strongest robustness across different models.
arXiv Detail & Related papers (2022-09-27T12:26:05Z) - CoditT5: Pretraining for Source Code and Natural Language Editing [34.77621217370665]
CoditT5 is a large language model for software-related editing tasks that is pretrained on large amounts of source code and natural language comments.
We fine-tune it on various downstream editing tasks, including comment updating, bug fixing, and automated code review.
arXiv Detail & Related papers (2022-08-10T16:59:40Z) - Code Comment Inconsistency Detection with BERT and Longformer [9.378041196272878]
Comments, or natural language descriptions of source code, are standard practice among software developers.
When the code is modified without an accompanying correction to the comment, an inconsistency between the comment and code can arise.
We propose two models to detect such inconsistencies in a natural language inference (NLI) context.
arXiv Detail & Related papers (2022-07-29T02:43:51Z) - Language Anisotropic Cross-Lingual Model Editing [61.51863835749279]
Existing work only studies the monolingual scenario, which lacks the cross-lingual transferability to perform editing simultaneously across languages.
We propose a framework to naturally adapt monolingual model editing approaches to the cross-lingual scenario using a parallel corpus.
We empirically demonstrate the failure of monolingual baselines in propagating the edit to multiple languages and the effectiveness of the proposed language anisotropic model editing.
arXiv Detail & Related papers (2022-05-25T11:38:12Z) - Language Model Evaluation in Open-ended Text Generation [0.76146285961466]
We study different evaluation metrics that have been proposed to evaluate quality, diversity and consistency of machine-generated text.
From there, we propose a practical pipeline for evaluating language models on open-ended generation tasks.
arXiv Detail & Related papers (2021-08-08T06:16:02Z) - TextFlint: Unified Multilingual Robustness Evaluation Toolkit for
Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint).
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z) - Text Editing by Command [82.50904226312451]
A prevailing paradigm in neural text generation is one-shot generation, where text is produced in a single step.
We address this limitation with an interactive text generation setting in which the user interacts with the system by issuing commands to edit existing text.
We show that our Interactive Editor, a transformer-based model trained on this dataset, outperforms baselines and obtains positive results in both automatic and human evaluations.
arXiv Detail & Related papers (2020-10-24T08:00:30Z) - Deep Just-In-Time Inconsistency Detection Between Comments and Source
Code [51.00904399653609]
In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code.
We develop a deep-learning approach that learns to correlate a comment with code changes.
We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
arXiv Detail & Related papers (2020-10-04T16:49:28Z) - On-the-Fly Adaptation of Source Code Models using Meta-Learning [28.98699307030983]
We frame the problem of context adaptation as a meta-learning problem.
We train a base source code model that is best able to learn from information in a file to deliver improved predictions of missing tokens.
We demonstrate improved performance in experiments on a large scale Java GitHub corpus.
arXiv Detail & Related papers (2020-03-26T07:11:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.