DocChecker: Bootstrapping Code Large Language Model for Detecting and
Resolving Code-Comment Inconsistencies
- URL: http://arxiv.org/abs/2306.06347v3
- Date: Sat, 3 Feb 2024 03:36:33 GMT
- Title: DocChecker: Bootstrapping Code Large Language Model for Detecting and
Resolving Code-Comment Inconsistencies
- Authors: Anh T. V. Dau, Jin L. C. Guo, Nghi D. Q. Bui
- Abstract summary: DocChecker is a tool for detecting and correcting differences between code and its accompanying comments.
It is adept at identifying inconsistencies between code and comments, and it can also generate synthetic comments.
It achieves a new state-of-the-art result of 72.3% accuracy on the Inconsistency Code-Comment Detection task.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Comments within source code are essential for developers to comprehend the
code's purpose and ensure its correct usage. However, as codebases evolve,
maintaining an accurate alignment between the comments and the code becomes
increasingly challenging. Despite growing interest in automated solutions for
detecting and correcting differences between code and its accompanying
comments, current methods rely primarily on heuristic rules. In contrast, this
paper presents DocChecker, a tool powered by deep learning.
DocChecker is adept at identifying inconsistencies between code and comments,
and it can also generate synthetic comments. This capability enables the tool
to detect and correct instances where comments do not accurately reflect their
corresponding code segments. We demonstrate the effectiveness of DocChecker
using the Just-In-Time and CodeXGlue datasets in different settings.
In particular, DocChecker achieves a new state-of-the-art result of 72.3%
accuracy on the Inconsistency Code-Comment Detection (ICCD) task and 33.64
BLEU-4 on the code summarization task compared with other Large Language Models
(LLMs), even surpassing GPT-3.5 and CodeLlama.
DocChecker is available for use and evaluation on GitHub at
https://github.com/FSoft-AI4Code/DocChecker and as an online tool at
http://4.193.50.237:5000/. For a more comprehensive understanding of its
functionality, a demonstration video is available on YouTube
https://youtu.be/FqnPmd531xw.
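To make the ICCD task concrete, the sketch below contrasts the heuristic rules the abstract mentions with the learned approach: it flags a comment as inconsistent when its word overlap with the code's identifiers falls below a threshold. The function names and the 0.2 threshold are illustrative assumptions, not part of DocChecker.

```python
import re


def token_overlap(comment: str, code: str) -> float:
    """Jaccard overlap between words in a comment and identifiers in code."""
    comment_tokens = set(re.findall(r"[a-zA-Z_]+", comment.lower()))
    code_tokens = set(re.findall(r"[a-zA-Z_]+", code.lower()))
    if not comment_tokens or not code_tokens:
        return 0.0
    return len(comment_tokens & code_tokens) / len(comment_tokens | code_tokens)


def flag_inconsistent(comment: str, code: str, threshold: float = 0.2) -> bool:
    """Toy heuristic: a low lexical overlap suggests the comment drifted from the code."""
    return token_overlap(comment, code) < threshold


# A comment that still describes its code shares vocabulary with it ...
flag_inconsistent("# return the sum of a and b", "def add(a, b): return a + b")
# ... while a stale comment does not.
flag_inconsistent("# sort the list in descending order",
                  "def process(items): return max(items)")
```

Such lexical heuristics miss semantic drift (a comment can reuse the code's identifiers yet describe the wrong behavior), which is the gap a learned model like DocChecker targets.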
Related papers
- METAMON: Finding Inconsistencies between Program Documentation and Behavior using Metamorphic LLM Queries [10.9334354663311]
This paper proposes METAMON, which uses an existing search-based test generation technique to capture the current program behavior in the form of test cases.
In this task, METAMON is supported by metamorphic testing and self-consistency.
An empirical evaluation against 9,482 pairs of code documentation and code snippets, generated using five open-source projects from Defects4J v2.0.1, shows that METAMON can classify the code-and-documentation inconsistencies with a precision of 0.72 and a recall of 0.48.
arXiv Detail & Related papers (2025-02-05T00:42:50Z)
- Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z)
- Code Documentation and Analysis to Secure Software Development [0.0]
CoDAT is a tool designed to maintain consistency between the various levels of code documentation.
It is implemented in IntelliJ IDEA.
We use a large language model to check the semantic consistency between a fragment of code and the comments that describe it.
arXiv Detail & Related papers (2024-07-16T17:25:44Z)
- Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions [6.367360745627828]
We introduce a benchmark of code editing tasks and use it to evaluate several cutting-edge LLMs.
Our evaluation exposes a significant gap between the capabilities of state-of-the-art open and closed models.
We introduce a new, carefully curated, permissively licensed training dataset of code editing tasks coupled with natural language instructions.
arXiv Detail & Related papers (2023-12-11T02:27:45Z)
- Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing [57.776971051512234]
In this work, we explore a multi-round code auto-editing setting, aiming to predict edits to a code region based on recent changes within the same codebase.
Our model, Coeditor, is a fine-tuned language model specifically designed for code editing tasks.
In a simplified single-round, single-edit task, Coeditor significantly outperforms GPT-3.5 and SOTA open-source code completion models.
arXiv Detail & Related papers (2023-05-29T19:57:36Z)
- Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study [4.438873396405334]
We aim to answer whether making code easier to understand through using contextual data improves the performance of pre-trained code language models for the task of code completion.
For comments, we find that the models perform better in the presence of multi-line comments.
arXiv Detail & Related papers (2023-04-24T17:09:14Z)
- Augmenting Diffs With Runtime Information [53.22981451758425]
Collector-Sahab is a tool that augments code diffs with runtime difference information.
We run Collector-Sahab on 584 code diffs for Defects4J bugs and find it successfully augments the code diff for 95% (555/584) of them.
arXiv Detail & Related papers (2022-12-20T16:33:51Z)
- DocCoder: Generating Code by Retrieving and Reading Docs [87.88474546826913]
We introduce DocCoder, an approach that explicitly leverages code manuals and documentation.
Our approach is general, can be applied to any programming language, and is agnostic to the underlying neural model.
arXiv Detail & Related papers (2022-07-13T06:47:51Z)
- CodeRetriever: Unimodal and Bimodal Contrastive Learning [128.06072658302165]
We propose the CodeRetriever model, which combines unimodal and bimodal contrastive learning to train function-level code semantic representations.
For unimodal contrastive learning, we design a semantic-guided method to build positive code pairs based on the documentation and function name.
For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build text-code pairs.
arXiv Detail & Related papers (2022-01-26T10:54:30Z)
- Deep Just-In-Time Inconsistency Detection Between Comments and Source Code [51.00904399653609]
In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code.
We develop a deep-learning approach that learns to correlate a comment with code changes.
We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
arXiv Detail & Related papers (2020-10-04T16:49:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.