Learning to Update Natural Language Comments Based on Code Changes
- URL: http://arxiv.org/abs/2004.12169v2
- Date: Tue, 28 Apr 2020 02:53:17 GMT
- Title: Learning to Update Natural Language Comments Based on Code Changes
- Authors: Sheena Panthaplackel, Pengyu Nie, Milos Gligoric, Junyi Jessy Li,
Raymond J. Mooney
- Abstract summary: We formulate the novel task of automatically updating an existing natural language comment based on changes in the body of code it accompanies.
We propose an approach that learns to correlate changes across two distinct language representations, to generate a sequence of edits that are applied to the existing comment to reflect the source code modifications.
- Score: 48.829941738578086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We formulate the novel task of automatically updating an existing natural
language comment based on changes in the body of code it accompanies. We
propose an approach that learns to correlate changes across two distinct
language representations, to generate a sequence of edits that are applied to
the existing comment to reflect the source code modifications. We train and
evaluate our model using a dataset that we collected from commit histories of
open-source software projects, with each example consisting of a concurrent
update to a method and its corresponding comment. We compare our approach
against multiple baselines using both automatic metrics and human evaluation.
Results reflect the challenge of this task and that our model outperforms
baselines with respect to making edits.
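To make the task concrete, below is a minimal sketch of how a predicted sequence of comment edits might be applied to an existing comment. This is an illustration only, not the authors' implementation: the edit operations (keep/delete/insert), the whitespace tokenization, and the apply_edits helper are assumptions introduced here; the paper's actual edit vocabulary and decoding procedure may differ.

```python
# Hypothetical sketch: applying a sequence of comment edits to an old comment.
# Operation names and tokenization are illustrative assumptions, not the
# paper's actual edit representation.

from typing import List, Tuple

# Each edit is (operation, token): "keep" and "delete" consume one token of
# the old comment; "insert" emits a new token without consuming anything.
Edit = Tuple[str, str]

def apply_edits(old_comment: str, edits: List[Edit]) -> str:
    old_tokens = old_comment.split()
    new_tokens: List[str] = []
    i = 0  # position in the old comment
    for op, tok in edits:
        if op == "keep":
            new_tokens.append(old_tokens[i])
            i += 1
        elif op == "delete":
            i += 1
        elif op == "insert":
            new_tokens.append(tok)
        else:
            raise ValueError(f"unknown edit operation: {op}")
    new_tokens.extend(old_tokens[i:])  # copy any remaining tokens unchanged
    return " ".join(new_tokens)

# Example: the method's return value changed from a list to a count, so the
# model edits the comment accordingly.
edits = [
    ("keep", "Returns"),
    ("delete", "the"),
    ("delete", "list"),
    ("delete", "of"),
    ("insert", "the"),
    ("insert", "number"),
    ("insert", "of"),
    ("keep", "active"),
    ("keep", "users."),
]
print(apply_edits("Returns the list of active users.", edits))
# -> "Returns the number of active users."
```

The value of an edit-based formulation, as the abstract suggests, is that most of the existing comment can be carried over unchanged, so the model only has to predict the small span affected by the code modification rather than regenerating the whole comment.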
Related papers
- A Computational Analysis of Vagueness in Revisions of Instructional
Texts [2.2577978123177536]
We extract pairwise versions of an instruction before and after a revision was made.
We investigate the ability of a neural model to distinguish between two versions of an instruction in our data.
arXiv Detail & Related papers (2023-09-21T14:26:04Z) - EditEval: An Instruction-Based Benchmark for Text Improvements [73.5918084416016]
This work presents EditEval: an instruction-based benchmark and evaluation suite for the automatic evaluation of editing capabilities.
We evaluate several pre-trained models, showing that InstructGPT and PEER perform best, but that most baselines fall below the supervised SOTA.
Our analysis shows that commonly used metrics for editing tasks do not always correlate well, and that optimizing for the highest-performing prompts does not necessarily entail the strongest robustness across different models.
arXiv Detail & Related papers (2022-09-27T12:26:05Z) - CoditT5: Pretraining for Source Code and Natural Language Editing [34.77621217370665]
CoditT5 is a large language model for software-related editing tasks that is pretrained on large amounts of source code and natural language comments.
We fine-tune it on various downstream editing tasks, including comment updating, bug fixing, and automated code review.
arXiv Detail & Related papers (2022-08-10T16:59:40Z) - Code Comment Inconsistency Detection with BERT and Longformer [9.378041196272878]
Comments, or natural language descriptions of source code, are standard practice among software developers.
When the code is modified without an accompanying correction to the comment, an inconsistency between the comment and code can arise.
We propose two models to detect such inconsistencies in a natural language inference (NLI) context.
arXiv Detail & Related papers (2022-07-29T02:43:51Z) - Language Anisotropic Cross-Lingual Model Editing [61.51863835749279]
Existing work only studies the monolingual scenario, which lacks the cross-lingual transferability to perform editing simultaneously across languages.
We propose a framework to naturally adapt monolingual model editing approaches to the cross-lingual scenario using a parallel corpus.
We empirically demonstrate the failure of monolingual baselines in propagating the edit to multiple languages and the effectiveness of the proposed language anisotropic model editing.
arXiv Detail & Related papers (2022-05-25T11:38:12Z) - Language Model Evaluation in Open-ended Text Generation [0.76146285961466]
We study different evaluation metrics that have been proposed to evaluate quality, diversity and consistency of machine-generated text.
From there, we propose a practical pipeline for evaluating language models on open-ended generation tasks.
arXiv Detail & Related papers (2021-08-08T06:16:02Z) - TextFlint: Unified Multilingual Robustness Evaluation Toolkit for
Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint).
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z) - Text Editing by Command [82.50904226312451]
A prevailing paradigm in neural text generation is one-shot generation, where text is produced in a single step.
We address this limitation with an interactive text generation setting in which the user interacts with the system by issuing commands to edit existing text.
We show that our Interactive Editor, a transformer-based model trained on this dataset, outperforms baselines and obtains positive results in both automatic and human evaluations.
arXiv Detail & Related papers (2020-10-24T08:00:30Z) - Deep Just-In-Time Inconsistency Detection Between Comments and Source
Code [51.00904399653609]
In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code.
We develop a deep-learning approach that learns to correlate a comment with code changes.
We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
arXiv Detail & Related papers (2020-10-04T16:49:28Z) - On-the-Fly Adaptation of Source Code Models using Meta-Learning [28.98699307030983]
We frame the problem of context adaptation as a meta-learning problem.
We train a base source code model that is best able to learn from information in a file to deliver improved predictions of missing tokens.
We demonstrate improved performance in experiments on a large scale Java GitHub corpus.
arXiv Detail & Related papers (2020-03-26T07:11:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.