Deep Just-In-Time Inconsistency Detection Between Comments and Source
Code
- URL: http://arxiv.org/abs/2010.01625v2
- Date: Sat, 26 Dec 2020 22:55:17 GMT
- Title: Deep Just-In-Time Inconsistency Detection Between Comments and Source
Code
- Authors: Sheena Panthaplackel, Junyi Jessy Li, Milos Gligoric, Raymond J.
Mooney
- Abstract summary: In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code.
We develop a deep-learning approach that learns to correlate a comment with code changes.
We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
- Score: 51.00904399653609
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural language comments convey key aspects of source code such as
implementation, usage, and pre- and post-conditions. Failure to update comments
accordingly when the corresponding code is modified introduces inconsistencies,
which is known to lead to confusion and software bugs. In this paper, we aim to
detect whether a comment becomes inconsistent as a result of changes to the
corresponding body of code, in order to catch potential inconsistencies
just-in-time, i.e., before they are committed to a code base. To achieve this,
we develop a deep-learning approach that learns to correlate a comment with
code changes. By evaluating on a large corpus of comment/code pairs spanning
various comment types, we show that our model outperforms multiple baselines by
significant margins. For extrinsic evaluation, we show the usefulness of our
approach by combining it with a comment update model to build a more
comprehensive automatic comment maintenance system which can both detect and
resolve inconsistent comments based on code changes.
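The paper's approach is a learned deep model that correlates a comment with code changes; as a rough illustration only, here is a naive lexical-overlap baseline of the kind such models are typically compared against. All names and the flagging rule are assumptions for illustration, not the paper's method.

```python
# Toy lexical-overlap baseline for just-in-time comment/code-change
# inconsistency detection. NOT the paper's deep model: it merely flags a
# comment when the code edit removes tokens that the comment mentions.
import re

def tokens(text: str) -> set[str]:
    """Lowercased word tokens, with camelCase split into separate words."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return {w.lower() for w in re.findall(r"[A-Za-z]+", spaced)}

def likely_inconsistent(comment: str, old_code: str, new_code: str) -> bool:
    """Flag the comment if it references identifiers the edit removed."""
    removed = tokens(old_code) - tokens(new_code)
    return len(tokens(comment) & removed) > 0

old = "def get_user_name(user): return user.name"
new = "def get_user_id(user): return user.id"
comment = "Returns the name of the given user."
print(likely_inconsistent(comment, old, new))  # True: the edit dropped 'name'
```

A baseline like this misses paraphrases and flags benign renames, which is precisely the gap a learned comment/code-change representation is meant to close.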
Related papers
- Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z)
- Investigating the Impact of Code Comment Inconsistency on Bug Introducing [4.027975836739619]
This study investigates the impact of code-comment inconsistency on bug introduction using large language models.
We first compare the performance of the GPT-3.5 model with other state-of-the-art methods in detecting these inconsistencies.
We also analyze the temporal evolution of code-comment inconsistencies and their effect on bug proneness over various timeframes.
arXiv Detail & Related papers (2024-09-16T23:24:29Z)
- Code Documentation and Analysis to Secure Software Development [0.0]
CoDAT is a tool designed to maintain consistency between the various levels of code documentation.
It is implemented in the IntelliJ IDEA IDE.
We use a large language model to check the semantic consistency between a fragment of code and the comments that describe it.
arXiv Detail & Related papers (2024-07-16T17:25:44Z)
- When simplicity meets effectiveness: Detecting code comments coherence with word embeddings and LSTM [6.417777780911223]
Code comments play a crucial role in software development, as they provide programmers with practical information.
Developers tend to leave comments unchanged after updating the code, resulting in a discrepancy between the two artifacts.
It is crucial to identify whether, given a code snippet, its corresponding comment is coherent and accurately reflects the intent behind the code.
arXiv Detail & Related papers (2024-05-25T15:21:27Z)
- Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective [85.48043537327258]
We propose MANGO (comMents As Natural loGic pivOts), including a comment contrastive training strategy and a corresponding logical comment decoding strategy.
Results indicate that MANGO significantly improves the code pass rate based on the strong baselines.
The robustness of the logical comment decoding strategy is notably higher than that of Chain-of-Thought prompting.
arXiv Detail & Related papers (2024-04-11T08:30:46Z)
- Are your comments outdated? Towards automatically detecting code-comment consistency [3.204922482708544]
Outdated comments are harmful and may mislead subsequent developers.
We propose a learning-based method, called CoCC, to detect the consistency between code and comment.
Experiment results show that CoCC can effectively detect outdated comments with precision over 90%.
arXiv Detail & Related papers (2024-03-01T03:30:13Z)
- Code Comment Inconsistency Detection with BERT and Longformer [9.378041196272878]
Comments, or natural language descriptions of source code, are standard practice among software developers.
When the code is modified without an accompanying correction to the comment, an inconsistency between the comment and code can arise.
We propose two models to detect such inconsistencies in a natural language inference (NLI) context.
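The NLI framing treats the code as the premise and the comment as the hypothesis, so that an inconsistent comment corresponds to a non-entailed (contradicted) hypothesis. As a hypothetical sketch of how such examples could be constructed (the field and label names are assumptions, not the paper's dataset schema):

```python
# Hypothetical framing of comment/code inconsistency detection as NLI:
# code is the premise, the comment is the hypothesis, and an inconsistent
# pair maps to the "contradiction" label. Schema is illustrative only.
def to_nli_example(comment: str, code: str, inconsistent: bool) -> dict:
    return {
        "premise": code,
        "hypothesis": comment,
        "label": "contradiction" if inconsistent else "entailment",
    }

example = to_nli_example(
    comment="Returns the user's name.",
    code="def get(user): return user.id",
    inconsistent=True,
)
print(example["label"])  # contradiction
```

Framing the task this way lets off-the-shelf NLI architectures (such as the BERT and Longformer models the entry mentions) be fine-tuned on comment/code pairs without task-specific heads.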
arXiv Detail & Related papers (2022-07-29T02:43:51Z)
- ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval.
We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)
- CodeRetriever: Unimodal and Bimodal Contrastive Learning [128.06072658302165]
We propose the CodeRetriever model, which combines unimodal and bimodal contrastive learning to train function-level code semantic representations.
For unimodal contrastive learning, we design a semantic-guided method to build positive code pairs based on the documentation and function name.
For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build text-code pairs.
arXiv Detail & Related papers (2022-01-26T10:54:30Z)
- CodeBLEU: a Method for Automatic Evaluation of Code Synthesis [57.87741831987889]
In the area of code synthesis, the commonly used evaluation metric is BLEU or perfect accuracy.
We introduce a new automatic evaluation metric, dubbed CodeBLEU.
It absorbs the strength of BLEU in the n-gram match and further injects code syntax via abstract syntax trees (AST) and code semantics via data-flow.
arXiv Detail & Related papers (2020-09-22T03:10:49Z)
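As the CodeBLEU abstract describes, the metric combines n-gram match with AST- and data-flow-based components. The following is a simplified sketch of such a weighted combination; the weights and the scalar stand-ins for the real AST and data-flow matching are assumptions for illustration, not the official implementation.

```python
# Simplified sketch of a CodeBLEU-style weighted score. The real metric uses
# weighted n-grams, AST subtree matching, and data-flow matching; here the
# AST/data-flow components are passed in as precomputed scores in [0, 1].
from collections import Counter

def ngram_precision(cand: list[str], ref: list[str], n: int = 2) -> float:
    """Fraction of candidate n-grams that also appear in the reference."""
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum((cand_ngrams & ref_ngrams).values())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0

def codebleu_like(cand: list[str], ref: list[str],
                  ast_match: float, dataflow_match: float,
                  weights: tuple = (0.5, 0.25, 0.25)) -> float:
    """Weighted sum of n-gram, AST, and data-flow components (weights assumed)."""
    w_ng, w_ast, w_df = weights
    return w_ng * ngram_precision(cand, ref) + w_ast * ast_match + w_df * dataflow_match

tokens = ["return", "x", "+", "1"]
print(codebleu_like(tokens, tokens, ast_match=1.0, dataflow_match=1.0))  # 1.0
```

The design intuition from the abstract is that pure n-gram overlap rewards surface similarity, while the AST and data-flow terms credit candidates that are syntactically and semantically equivalent even when token sequences differ.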
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.