Augmenting Diffs With Runtime Information
- URL: http://arxiv.org/abs/2212.11077v2
- Date: Fri, 30 Jun 2023 12:27:41 GMT
- Title: Augmenting Diffs With Runtime Information
- Authors: Khashayar Etemadi, Aman Sharma, Fernanda Madeiral and Martin Monperrus
- Abstract summary: Collector-Sahab is a tool that augments code diffs with runtime difference information.
We run Collector-Sahab on 584 code diffs for Defects4J bugs and find it successfully augments the code diff for 95% (555/584) of them.
- Score: 53.22981451758425
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Source code diffs are used on a daily basis as part of code review,
inspection, and auditing. To facilitate understanding, they are typically
accompanied by explanations that describe the essence of what is changed in the
program. As manually crafting high-quality explanations is a cumbersome task,
researchers have proposed automatic techniques to generate code diff
explanations. Existing explanation generation methods solely focus on static
analysis, i.e., they do not take advantage of runtime information to explain
code changes. In this paper, we propose Collector-Sahab, a novel tool that
augments code diffs with runtime difference information. Collector-Sahab
compares the program states of the original (old) and patched (new) versions of
a program to find unique variable values. Then, Collector-Sahab adds this novel
runtime information to the source code diff as shown, for instance, in code
reviewing systems. As an evaluation, we run Collector-Sahab on 584 code diffs
for Defects4J bugs and find it successfully augments the code diff for 95%
(555/584) of them. We also perform a user study and ask eight participants to
score the augmented code diffs generated by Collector-Sahab. Per this user
study, we conclude that developers find the idea of adding runtime data to code
diffs promising and useful. Overall, our experiments show the effectiveness and
usefulness of Collector-Sahab in augmenting code diffs with runtime difference
information. Publicly-available repository:
https://github.com/ASSERT-KTH/collector-sahab.
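
The abstract describes a three-step pipeline: run both program versions, compare their program states to find variable values unique to each version, and splice that runtime information into the diff. The following Python sketch illustrates the idea on toy data; it is a minimal sketch only, with hypothetical function names, state dictionaries, and annotation format. The real Collector-Sahab instruments executions of Java programs (Defects4J), not Python source.

```python
# Minimal sketch of diff augmentation with runtime differences.
# Not the actual Collector-Sahab implementation: the states below are
# hypothetical stand-ins for values collected during instrumented runs.
import difflib

def unique_values(old_state: dict, new_state: dict) -> dict:
    """Map each variable to its (old, new) values where the two versions differ."""
    diffs = {}
    for var in old_state.keys() | new_state.keys():
        old_val, new_val = old_state.get(var), new_state.get(var)
        if old_val != new_val:
            diffs[var] = (old_val, new_val)
    return diffs

def augment_diff(old_src: str, new_src: str, old_state: dict, new_state: dict):
    """Yield unified-diff lines, annotating added lines with runtime differences."""
    runtime = unique_values(old_state, new_state)
    for line in difflib.unified_diff(
        old_src.splitlines(), new_src.splitlines(), lineterm=""
    ):
        yield line
        # After each added line, report variable values unique to each version.
        if line.startswith("+") and not line.startswith("+++"):
            for var, (old_val, new_val) in runtime.items():
                yield f"    // runtime: {var} was {old_val!r}, is now {new_val!r}"

# Hypothetical program states captured while running the same test on both versions.
old_state = {"total": 0}
new_state = {"total": 6}
old_src = "total = 0\nfor x in xs:\n    pass\n"
new_src = "total = 0\nfor x in xs:\n    total += x\n"
print("\n".join(augment_diff(old_src, new_src, old_state, new_state)))
```

The sketch attaches annotations to added lines only, mirroring how an augmented diff would surface the runtime effect of a patch inside a code review view.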
Related papers
- Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z)
- Toward Interactive Optimization of Source Code Differences: An Empirical Study of Its Performance [1.313675711285772]
We propose an interactive approach to optimize source code differences (diffs).
Users can provide feedback on parts of a diff that are matched but should not be, or parts that should be matched but are not.
Results on 23 GitHub projects confirm that 92% of nonoptimal diffs can be addressed with fewer than four feedback actions in the ideal case.
arXiv Detail & Related papers (2024-09-20T15:43:55Z)
- CodeUpdateArena: Benchmarking Knowledge Editing on API Updates [77.81663273436375]
We present CodeUpdateArena, a benchmark for knowledge editing in the code domain.
An instance in our benchmark consists of a synthetic API function update paired with a program synthesis example.
Our benchmark covers updates of various types to 54 functions from seven diverse Python packages.
arXiv Detail & Related papers (2024-07-08T17:55:04Z)
- SparseCoder: Identifier-Aware Sparse Transformer for File-Level Code Summarization [51.67317895094664]
This paper studies file-level code summarization, which can assist programmers in understanding and maintaining large source code projects.
We propose SparseCoder, an identifier-aware sparse transformer for effectively handling long code sequences.
arXiv Detail & Related papers (2024-01-26T09:23:27Z)
- Gitor: Scalable Code Clone Detection by Building Global Sample Graph [11.041017540277558]
We propose Gitor to capture the underlying connections among different code samples.
Gitor achieves higher accuracy in code clone detection and fast execution times on inputs of various sizes.
arXiv Detail & Related papers (2023-11-15T08:48:50Z)
- DocChecker: Bootstrapping Code Large Language Model for Detecting and Resolving Code-Comment Inconsistencies [13.804337643709717]
DocChecker is a tool for detecting and correcting differences between code and its accompanying comments.
It is adept at identifying inconsistencies between code and comments, and it can also generate synthetic comments.
It achieves a new state-of-the-art result of 72.3% accuracy on the Inconsistency Code-Comment Detection task.
arXiv Detail & Related papers (2023-06-10T05:29:09Z)
- CONCORD: Clone-aware Contrastive Learning for Source Code [64.51161487524436]
Self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE tasks.
We argue that it is also essential to factor in how developers code day-to-day for general-purpose representation learning.
In particular, we propose CONCORD, a self-supervised, contrastive learning strategy to place benign clones closer in the representation space while moving deviants further apart.
arXiv Detail & Related papers (2023-06-05T20:39:08Z)
- How is the speed of code review affected by activity, usage and code quality? [0.0]
This paper investigates how the speed of code review is affected by activity, usage, and code quality in the context of extensions.
The median time to merge is compared against several other variables, which are collected using a variety of manual methods and APIs.
arXiv Detail & Related papers (2023-05-09T21:11:17Z)
- RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process.
It incorporates a similarity-based retriever and a pre-trained code language model.
It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
- Empirical Analysis on Effectiveness of NLP Methods for Predicting Code Smell [3.2973778921083357]
A code smell is a surface indicator of an inherent problem in the system.
We use three Extreme Learning Machine kernels over 629 packages to identify eight code smells.
Our findings indicate that the radial basis function kernel performs best of the three kernel methods, with a mean accuracy of 98.52%.
arXiv Detail & Related papers (2021-08-08T12:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.