Related papers: Toward Interactive Optimization of Source Code Differences: An Empirical Study of Its Performance

URL: http://arxiv.org/abs/2409.13590v2
Date: Thu, 26 Sep 2024 14:13:53 GMT
Title: Toward Interactive Optimization of Source Code Differences: An Empirical Study of Its Performance
Authors: Tsukasa Yagi, Shinpei Hayashi
Abstract summary: We propose an interactive approach to optimize source code differences (diffs). Users can give feedback on parts of a diff that are matched but should not be, or that should be matched but are not. Results on 23 GitHub projects confirm that 92% of nonoptimal diffs can be addressed with fewer than four feedback actions in the ideal case.
Score: 1.313675711285772
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A source code difference (diff) shows the changes between the old and new versions of source code, and it can be used in code reviews to help developers understand those changes. Although many diff generation methods have been proposed, existing automatic methods may generate nonoptimal diffs, hindering reviewers from understanding the changes. In this paper, we propose an interactive approach to optimize diffs. Users can give feedback on parts of a diff that are matched but should not be, or that should be matched but are not. The edit graph is updated based on this feedback, enabling users to obtain a more optimal diff. To investigate the potential of this approach, we simulated the proposed method by applying a search algorithm, empirically assessing the number of feedback instances required and the amount of diff optimization resulting from the feedback. Results on 23 GitHub projects confirm that 92% of nonoptimal diffs can be addressed with fewer than four feedback actions in the ideal case.
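
The feedback loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a line-based diff computed as a weighted longest common subsequence, where a "should not be matched" feedback removes a candidate match and a "should be matched" feedback gives a candidate match a large bonus. The function name `constrained_matches` and the parameters `forbidden` and `required` are hypothetical names chosen for illustration.

```python
from typing import AbstractSet, List, Tuple

Pair = Tuple[int, int]  # (line index in old, line index in new)


def constrained_matches(
    old: List[str],
    new: List[str],
    forbidden: AbstractSet[Pair] = frozenset(),
    required: AbstractSet[Pair] = frozenset(),
) -> List[Pair]:
    """Return matched (old, new) line pairs of a weighted LCS.

    Assumption: feedback is modeled as constraints on candidate matches.
    Pairs in `forbidden` are never matched; pairs in `required` receive a
    bonus larger than any number of ordinary matches, so they are kept
    whenever the lines are equal and the pairs do not cross each other.
    """
    n, m = len(old), len(new)
    bonus = min(n, m) + 1  # exceeds the maximum count of ordinary matches

    def weight(i: int, j: int) -> int:
        if old[i] != new[j] or (i, j) in forbidden:
            return 0
        return bonus if (i, j) in required else 1

    # score[i][j] = best total weight for old[i:] versus new[j:]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best = max(score[i + 1][j], score[i][j + 1])
            w = weight(i, j)
            if w:
                best = max(best, w + score[i + 1][j + 1])
            score[i][j] = best

    # Walk the table from the top-left corner to recover the matched pairs.
    pairs: List[Pair] = []
    i = j = 0
    while i < n and j < m:
        w = weight(i, j)
        if w and score[i][j] == w + score[i + 1][j + 1]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif score[i][j] == score[i + 1][j]:
            i += 1
        else:
            j += 1
    return pairs


old_lines = ["A", "B"]
new_lines = ["B", "A", "B"]
initial = constrained_matches(old_lines, new_lines)
after_reject = constrained_matches(old_lines, new_lines, forbidden={(0, 1)})
after_accept = constrained_matches(old_lines, new_lines, required={(1, 0)})
```

Here `initial` is the unconstrained result, `after_reject` is recomputed once the user rejects the match between `old_lines[0]` and `new_lines[1]`, and `after_accept` is recomputed once the user asks for `old_lines[1]` to be matched with `new_lines[0]`. Unmatched lines in `old_lines` would be shown as deletions and unmatched lines in `new_lines` as insertions.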

Related papers

What Happened in This Pipeline? Diffing Build Logs with CiDiff [3.093293209977702]
We introduce a new diff algorithm specifically tailored to build logs called CiDiff. We evaluate CiDiff against several baselines on a novel dataset of 17 906 CI regressions.
arXiv Detail & Related papers (2025-04-25T08:56:21Z)
Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance. We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub. 83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z)
Scattered Forest Search: Smarter Code Space Exploration with LLMs [55.71665969800222]
We propose Scattered Forest Search (SFS), a novel approach that improves solution diversity and better exploits feedback during evolutionary search. Our approach scales more efficiently than existing search techniques, including tree search, line search, and repeated sampling.
arXiv Detail & Related papers (2024-10-22T01:58:29Z)
Advanced Detection of Source Code Clones via an Ensemble of Unsupervised Similarity Measures [0.0]
This research introduces a novel ensemble learning approach for code similarity assessment. The key idea is that the strengths of a diverse set of similarity measures can complement each other and mitigate individual weaknesses.
arXiv Detail & Related papers (2024-05-03T13:42:49Z)
Rethinking Negative Pairs in Code Search [56.23857828689406]
We propose a simple yet effective Soft-InfoNCE loss that inserts weight terms into InfoNCE. We analyze the effects of Soft-InfoNCE on controlling the distribution of learnt code representations and on deducing a more precise mutual information estimation.
arXiv Detail & Related papers (2023-10-12T06:32:42Z)
Performance Evaluation and Comparison of a New Regression Algorithm [4.125187280299247]
We compare the performance of a newly proposed regression algorithm against four conventional machine learning algorithms. The reader is free to replicate our results since we have provided the source code in a GitHub repository.
arXiv Detail & Related papers (2023-06-15T13:01:16Z)
Augmenting Diffs With Runtime Information [53.22981451758425]
Collector-Sahab is a tool that augments code diffs with runtime difference information. We run Collector-Sahab on 584 code diffs for Defects4J bugs and find it successfully augments the code diff for 95% (555/584) of them.
arXiv Detail & Related papers (2022-12-20T16:33:51Z)
Efficient computation of the Knowledge Gradient for Bayesian Optimization [1.0497128347190048]
One-shot Hybrid KG is a new approach that combines several of the previously proposed ideas and is cheap to compute as well as powerful and efficient. All experiments are implemented in BOTorch and show empirically drastically reduced computational overhead with equal or improved performance.
arXiv Detail & Related papers (2022-09-30T10:39:38Z)
MAGPIE: Machine Automated General Performance Improvement via Evolution of Software [19.188864062289433]
MAGPIE is a unified software improvement framework. It provides a common edit sequence based representation that isolates the search process from the specific improvement technique.
arXiv Detail & Related papers (2022-08-04T17:58:43Z)
Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals [72.00815192668193]
Feature importance (FI) estimates are a popular form of explanation, and they are commonly created and evaluated by computing the change in model confidence caused by removing certain input features at test time. We study several under-explored dimensions of FI-based explanations, providing conceptual and empirical improvements for this form of explanation.
arXiv Detail & Related papers (2021-06-01T20:36:48Z)
Neural Non-Rigid Tracking [26.41847163649205]
We introduce a novel, end-to-end learnable, differentiable non-rigid tracker. We employ a convolutional neural network to predict dense correspondences and their confidences. Compared to state-of-the-art approaches, our algorithm shows improved reconstruction performance.
arXiv Detail & Related papers (2020-06-23T18:00:39Z)
Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval. We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing. This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples [67.11669996924671]
We introduce a simple (one line of code) modification to the Generative Adversarial Network (GAN) training algorithm. When updating the generator parameters, we zero out the gradient contributions from the elements of the batch that the critic scores as 'least realistic'. We show that this 'top-k update' procedure is a generally applicable improvement.
arXiv Detail & Related papers (2020-02-14T19:27:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.