Do code refactorings influence the merge effort?
- URL: http://arxiv.org/abs/2305.06129v1
- Date: Wed, 10 May 2023 13:24:59 GMT
- Title: Do code refactorings influence the merge effort?
- Authors: Andre Oliveira, Vania Neves, Alexandre Plastino, Ana Carla Bibiano,
Alessandro Garcia, Leonardo Murta
- Abstract summary: Multiple contributors frequently change the source code in parallel to implement new features, fix bugs, refactor existing code, and make other changes.
These simultaneous changes need to be merged into the same version of the source code.
Studies show that 10 to 20 percent of all merge attempts result in conflicts, which require manual developer intervention to complete the process.
- Score: 80.1936417993664
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In collaborative software development, multiple contributors frequently
change the source code in parallel to implement new features, fix bugs,
refactor existing code, and make other changes. These simultaneous changes need
to be merged into the same version of the source code. However, the merge
operation can fail, and developer intervention is required to resolve the
conflicts. Studies in the literature show that 10 to 20 percent of all merge
attempts result in conflicts, which require manual developer intervention to
complete the process. In this paper, we focus on a specific type of
change that affects the structure of the source code and has the potential to
increase the merge effort: code refactorings. We analyze the relationship
between the occurrence of refactorings and the merge effort. To do so, we
applied a data mining technique called association rule extraction to find
patterns of behavior that allow us to analyze the influence of refactorings on
the merge effort. Our experiments extracted association rules from 40,248 merge
commits that occurred in 28 popular open-source projects. The results indicate
that: (i) the occurrence of refactorings increases the chances of having merge
effort; (ii) the more refactorings, the greater the chances of effort; (iii)
the more refactorings, the greater the effort; and (iv) parallel refactorings
further increase both the chances of having effort and its intensity. These
results may suggest behavioral changes in the way developer teams implement
refactorings. In addition, they can indicate possible ways to improve tools
that support code merging and tools that recommend refactorings, taking into
account the number of refactorings and merge effort attributes.
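As a concrete illustration of the mining step described in the abstract, the sketch below extracts association rules from a toy table of merge commits with the mlxtend library; the boolean attributes (has_refactoring, parallel_refactoring, merge_effort) and the thresholds are hypothetical stand-ins for the paper's actual features, not its pipeline.

```python
# Minimal association-rule extraction over merge-commit data (illustrative only).
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Toy dataset: one row per merge commit. The paper mines 40,248 merge commits
# from 28 open-source projects; these eight rows are invented.
merges = pd.DataFrame({
    "has_refactoring":      [True, True, False, True, False, False, True, False],
    "parallel_refactoring": [True, False, False, True, False, False, False, False],
    "merge_effort":         [True, True, False, True, False, True, True, False],
})

# Frequent itemsets first, then rules such as {has_refactoring} -> {merge_effort},
# each annotated with support, confidence, and lift.
itemsets = apriori(merges, min_support=0.2, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```

A rule like {has_refactoring} -> {merge_effort} with high confidence and lift above 1 is the shape of evidence behind findings (i) and (iv).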
Related papers
- An Empirical Study on the Potential of LLMs in Automated Software Refactoring [9.157968996300417]
We investigate the potential of large language models (LLMs) in automated software refactoring.
We find that 13 out of the 176 refactoring solutions suggested by ChatGPT and 9 out of the 137 solutions suggested by Gemini were unsafe in that they either changed the functionality of the source code or introduced syntax errors.
arXiv Detail & Related papers (2024-11-07T05:35:55Z)
- An Empirical Study on the Code Refactoring Capability of Large Language Models [0.5852077003870416]
This study empirically evaluates StarCoder2, an LLM optimized for code generation, in refactoring code across 30 open-source Java projects.
We compare StarCoder2's performance against human developers, focusing on (1) code quality improvements, (2) types and effectiveness of refactorings, and (3) enhancements through one-shot and chain-of-thought prompting.
arXiv Detail & Related papers (2024-11-04T17:46:20Z)
- In Search of Metrics to Guide Developer-Based Refactoring Recommendations [13.063733696956678]
Refactoring is a well-established approach to improving source code quality without compromising its external behavior.
We propose an empirical study into the metrics that capture developers' willingness to apply refactoring operations.
We will quantify the value of product and process metrics in grasping developers' motivations to perform refactoring.
arXiv Detail & Related papers (2024-07-25T16:32:35Z)
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated compared to canonical solutions.
We develop a taxonomy of bugs for incorrect code that includes three categories and 12 sub-categories, and analyze the root causes of common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
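The training-free self-critique loop summarized in the entry above can be pictured as follows; the llm_generate stub and the use of Python's built-in compile() as a stand-in for compiler feedback are assumptions for illustration, not the paper's actual prompts or models.

```python
# Sketch of an iterative self-critique repair loop driven by compiler feedback.
def llm_generate(prompt: str) -> str:
    """Placeholder for an LLM call that returns candidate code."""
    raise NotImplementedError

def compiler_feedback(code: str) -> str | None:
    """Return an error message if the code fails to compile, else None."""
    try:
        compile(code, "<candidate>", "exec")  # syntax check only
        return None
    except SyntaxError as err:
        return str(err)

def self_critique_repair(task: str, max_rounds: int = 3) -> str:
    code = llm_generate(task)
    for _ in range(max_rounds):
        error = compiler_feedback(code)
        if error is None:
            break  # compiles cleanly; stop iterating
        # Feed the error back so the model can critique and correct its own code.
        code = llm_generate(f"{task}\n\nYour previous code failed with:\n{error}\nPlease fix it.")
    return code
```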
- ReGAL: Refactoring Programs to Discover Generalizable Abstractions [59.05769810380928]
ReGAL (Refactoring for Generalizable Abstraction Learning) is a method for learning a library of reusable functions via code refactorization.
We find that the shared function libraries discovered by ReGAL make programs easier to predict across diverse domains.
For CodeLlama-13B, ReGAL results in absolute accuracy increases of 11.5% on LOGO, 26.1% on date understanding, and 8.1% on TextCraft, outperforming GPT-3.5 in two of three domains.
arXiv Detail & Related papers (2024-01-29T18:45:30Z)
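As a toy picture of what refactoring programs into a shared library means in the ReGAL entry above (the example is invented and far simpler than ReGAL's actual procedure), two programs that repeat the same move/turn sequence are rewritten around one reusable function:

```python
# Before: prog_a and prog_b each inline the same move/turn pattern, e.g.
#   move(10); turn(90); move(10); turn(90); move(10); turn(90); move(10); turn(90)
# After: the repeated pattern becomes a library function both programs call.
def draw_square(move, turn, side: int) -> None:
    """Reusable abstraction extracted from the duplicated move/turn sequence."""
    for _ in range(4):
        move(side)
        turn(90)

trace: list[str] = []
draw_square(lambda d: trace.append(f"move {d}"),
            lambda a: trace.append(f"turn {a}"),
            side=10)  # replaces prog_a's eight inlined instructions
print(trace)
```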
- State of Refactoring Adoption: Better Understanding Developer Perception of Refactoring [5.516979718589074]
We aim to explore how developers document their refactoring activities during the software life cycle.
We call such activity Self-Affirmed Refactoring (SAR), which indicates developers' documentation of their refactoring activities.
We propose an approach to identify whether a commit describes developer-related refactoring events, in order to classify them according to the common quality improvement categories.
arXiv Detail & Related papers (2023-06-09T16:38:20Z)
- CONCORD: Clone-aware Contrastive Learning for Source Code [64.51161487524436]
Self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE tasks.
We argue that it is also essential to factor in how developers code day-to-day for general-purpose representation learning.
In particular, we propose CONCORD, a self-supervised, contrastive learning strategy that places benign clones closer in the representation space while moving deviant clones further apart.
arXiv Detail & Related papers (2023-06-05T20:39:08Z)
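The contrastive objective in the CONCORD entry above can be sketched with a plain InfoNCE-style loss; the toy embeddings and NumPy implementation below are illustrative assumptions, not CONCORD's model or training code.

```python
# InfoNCE-style loss: pull an anchor toward its benign clone (index 0),
# push it away from deviants; lower loss = clones closer, deviants farther.
import numpy as np

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    logits = np.array([anchor @ positive] + [anchor @ n for n in negatives]) / temperature
    logits -= logits.max()                        # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0]))               # cross-entropy with clone as target

rng = np.random.default_rng(0)
anchor = rng.normal(size=8)
clone = anchor + 0.05 * rng.normal(size=8)  # benign clone: near the anchor
deviant = rng.normal(size=8)                # deviant: unrelated code
print(contrastive_loss(anchor, clone, [deviant]))
```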
- RefBERT: A Two-Stage Pre-trained Framework for Automatic Rename Refactoring [57.8069006460087]
We study automatic rename refactoring of variable names, which is considered more challenging than other rename refactoring activities.
We propose RefBERT, a two-stage pre-trained framework for rename refactoring of variable names.
We show that the generated variable names of RefBERT are more accurate and meaningful than those produced by the existing method.
arXiv Detail & Related papers (2023-05-28T12:29:39Z)
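To make concrete what rename refactoring of variable names involves in the RefBERT entry above, the sketch below performs the mechanical rename with Python's standard ast module; picking a meaningful new name, which is the part RefBERT actually learns, is assumed to be given.

```python
# Mechanical variable rename via AST rewriting; the new name is an input here,
# whereas RefBERT's contribution is generating that name automatically.
import ast

class RenameVariable(ast.NodeTransformer):
    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new  # rename every occurrence of the variable
        return node

source = "x = price * qty\nprint(x)"
tree = RenameVariable("x", "total_cost").visit(ast.parse(source))
print(ast.unparse(tree))  # total_cost = price * qty; print(total_cost)
```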
- ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework that leverages both lexical copying and reference to code with similar semantics via retrieval.
We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)
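The retrieval-augmented recipe in the ReACC entry above amounts to: retrieve similar code, then condition the completion model on it. The Jaccard retriever and llm_complete stub below are simplifying assumptions, not ReACC's actual retriever or generator.

```python
# Retrieval-augmented completion: fetch the most similar snippet from a
# codebase and prepend it to the prompt before asking the model to complete.
def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def retrieve(context: str, codebase: list[str]) -> str:
    return max(codebase, key=lambda snippet: jaccard(context, snippet))

def llm_complete(prompt: str) -> str:
    """Placeholder for the code completion model."""
    raise NotImplementedError

def retrieval_augmented_complete(context: str, codebase: list[str]) -> str:
    similar = retrieve(context, codebase)  # lexical stand-in for ReACC's retriever
    return llm_complete(f"# similar code:\n{similar}\n\n{context}")
```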
- How We Refactor and How We Document it? On the Use of Supervised Machine Learning Algorithms to Classify Refactoring Documentation [25.626914797750487]
Refactoring is the art of improving the design of a system without altering its external behavior.
This study categorizes commits into 3 categories, namely, Internal QA, External QA, and Code Smell Resolution, along with the traditional BugFix and Functional categories.
To better understand our classification results, we analyzed commit messages to extract the patterns that developers regularly use to describe code smells.
arXiv Detail & Related papers (2020-10-26T20:33:17Z)
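Both documentation-focused studies above classify commit messages into refactoring-related categories; a minimal supervised sketch of that idea follows, with invented training data and a generic scikit-learn pipeline rather than the papers' actual models.

```python
# Toy supervised classifier for refactoring documentation in commit messages;
# labels follow the categories named above, the six examples are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

commits = [
    ("extract method to simplify parser", "Internal QA"),
    ("improve readability of the config loader", "Internal QA"),
    ("speed up the query path for users", "External QA"),
    ("remove duplicated validation logic", "Code Smell Resolution"),
    ("fix crash when input file is missing", "BugFix"),
    ("add export-to-csv feature", "Functional"),
]
messages, labels = zip(*commits)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(messages, labels)
print(model.predict(["inline redundant helper and rename variables"]))
```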
This list is automatically generated from the titles and abstracts of the papers on this site.