ActRef: Enhancing the Understanding of Python Code Refactoring with Action-Based Analysis
- URL: http://arxiv.org/abs/2505.06553v1
- Date: Sat, 10 May 2025 07:48:50 GMT
- Title: ActRef: Enhancing the Understanding of Python Code Refactoring with Action-Based Analysis
- Authors: Siqi Wang, Xing Hu, Xin Xia, Xinyu Wang
- Abstract summary: This study presents an action-based Refactoring Analysis Framework named ActRef. ActRef mines multiple refactoring types (e.g., move, rename, extract, and inline operations) based on diff actions. By focusing on code change actions, ActRef provides a Python-adaptive solution for detecting intricate refactoring patterns.
- Score: 10.724563250102696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Refactoring, the process of improving the code structure of a software system without altering its behavior, is crucial for managing code evolution in software development. Identifying refactoring actions in source code is essential for understanding software evolution and for guiding developers in maintaining and improving code quality. This study presents an action-based Refactoring Analysis Framework named ActRef, a novel algorithm designed to advance the detection and understanding of Python refactorings through a unique action-based analysis of code changes. ActRef mines multiple refactoring types (e.g., move, rename, extract, and inline operations) based on diff actions, covering multiple granularity levels including the variable, method, class, and module levels. By focusing on code change actions, ActRef provides a Python-adaptive solution for detecting intricate refactoring patterns. We evaluated ActRef on 1,914 manually validated refactoring instances from 136 open-source Python projects. The evaluation results show that ActRef achieves high precision (0.80) and recall (0.92), effectively identifying multiple refactoring types. Compared with leading baselines, including PyRef, PyRef with MLRefScanner, DeepSeek-R1, and ChatGPT-4, ActRef consistently demonstrates superior performance in detecting Python refactorings across various types. While matching PyRef in runtime efficiency, ActRef supports a broader spectrum of refactoring types and more refactoring mining levels. ActRef offers an effective and scalable approach for mining refactorings in dynamic Python codebases and introduces a new perspective on understanding code.
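To make the action-based idea concrete, here is a minimal, illustrative Python sketch. It is not ActRef's actual implementation: it derives delete/insert actions for top-level functions from two versions of a module and pairs a deleted function with an inserted one whose body is identical, reporting the pair as a rename refactoring. All helper names (function_index, detect_renames) are invented for this example.

```python
# Illustrative sketch only -- not ActRef's implementation. It shows the core
# idea of action-based detection: derive delete/insert actions from two
# versions of a module and match them into a higher-level refactoring
# (here, a function rename at module level).
import ast

def function_index(source: str) -> dict[str, str]:
    """Map each top-level function name to a canonical dump of its body."""
    tree = ast.parse(source)
    return {
        node.name: ast.dump(ast.Module(body=node.body, type_ignores=[]))
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }

def detect_renames(old_src: str, new_src: str) -> list[tuple[str, str]]:
    old, new = function_index(old_src), function_index(new_src)
    # A "delete" action (name gone) paired with an "insert" action (new name)
    # over an identical body is treated as a rename. Real tools must also
    # handle moved code, edited bodies, and class/module granularities.
    deleted = {name: body for name, body in old.items() if name not in new}
    inserted = {name: body for name, body in new.items() if name not in old}
    return [
        (old_name, new_name)
        for old_name, old_body in deleted.items()
        for new_name, new_body in inserted.items()
        if old_body == new_body
    ]

before = "def calc(x):\n    return x * 2\n"
after = "def double(x):\n    return x * 2\n"
print(detect_renames(before, after))  # [('calc', 'double')]
```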
Related papers
- RefModel: Detecting Refactorings using Foundation Models [2.2670483018110366]
We investigate the viability of using foundation models for refactoring detection, implemented in a tool named RefModel. We evaluate Phi4-14B and Claude 3.5 Sonnet on a dataset of 858 single-operation transformations applied to artificially generated Java programs. In real-world settings, Claude 3.5 Sonnet and Gemini 2.5 Pro jointly identified 97% of all transformations, surpassing the best-performing static-analysis-based tools.
arXiv Detail & Related papers (2025-07-15T14:20:56Z) - Turning the Tide: Repository-based Code Reflection [52.13709676656648]
We introduce LiveRepoReflection, a benchmark for evaluating code understanding and generation in multi-file repository contexts. It contains 1,888 rigorously filtered test cases across 6 programming languages to ensure diversity, correctness, and high difficulty. We also create RepoReflection-Instruct, a large-scale, quality-filtered instruction-tuning dataset derived from diverse sources.
arXiv Detail & Related papers (2025-07-14T02:36:27Z) - Bugs in the Shadows: Static Detection of Faulty Python Refactorings [44.115219601924856]
Python's dynamic type system poses significant challenges for automated code transformations (see the illustrative sketch after this list). Our analysis uncovered 29 bugs across four refactoring types from a total of 1,152 refactoring attempts. These results highlight the need to improve the robustness of current Python refactoring tools to ensure the correctness of automated code transformations.
arXiv Detail & Related papers (2025-07-01T18:03:56Z) - Assessing the Bug-Proneness of Refactored Code: A Longitudinal Multi-Project Study [43.65862440745159]
Refactoring is a common practice in software development, aimed at improving the internal code structure in order to make it easier to understand and modify. It is often assumed that refactoring makes the code less prone to bugs. However, in practice, refactoring is a complex task and is applied in different ways; therefore, certain refactorings can inadvertently make the code more prone to bugs.
arXiv Detail & Related papers (2025-05-12T19:12:30Z) - Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs [54.309127753635366]
We present the results of a replication study in which we investigate GPT-4's effectiveness in recommending and suggesting idiomatic refactoring actions. Our findings underscore the potential of LLMs to achieve tasks that, in the past, required implementing recommenders based on complex code analyses.
arXiv Detail & Related papers (2025-01-28T15:41:54Z) - A Survey of Deep Learning Based Software Refactoring [5.716522445049744]
Dozens of deep learning-based approaches have been proposed for refactoring software.
However, there is a lack of comprehensive reviews of such works, as well as a taxonomy of deep learning-based approaches.
Most of the deep learning techniques have been used for the detection of code smells and the recommendation of refactoring solutions.
arXiv Detail & Related papers (2024-04-30T03:07:11Z) - Detecting Refactoring Commits in Machine Learning Python Projects: A Machine Learning-Based Approach [3.000496428347787]
MLRefScanner identifies refactoring commits involving both ML-specific and general refactoring operations.
Our study highlights the potential of ML-driven approaches in detecting refactoring across diverse programming languages and technical domains.
arXiv Detail & Related papers (2024-04-09T18:46:56Z) - ReGAL: Refactoring Programs to Discover Generalizable Abstractions [59.05769810380928]
Refactoring for Generalizable Abstraction Learning (ReGAL) is a method for learning a library of reusable functions via code refactorization.
We find that the shared function libraries discovered by ReGAL make programs easier to predict across diverse domains.
For CodeLlama-13B, ReGAL results in absolute accuracy increases of 11.5% on LOGO, 26.1% on date understanding, and 8.1% on TextCraft, outperforming GPT-3.5 in two of three domains.
arXiv Detail & Related papers (2024-01-29T18:45:30Z) - State of Refactoring Adoption: Better Understanding Developer Perception of Refactoring [5.516979718589074]
We aim to explore how developers document their refactoring activities during the software life cycle.
We call such activity Self-Affirmed Refactoring (SAR), which indicates developers' documentation of their refactoring activities.
We propose an approach to identify whether a commit describes developer-related refactoring events and to classify them according to common quality improvement categories.
arXiv Detail & Related papers (2023-06-09T16:38:20Z) - RefBERT: A Two-Stage Pre-trained Framework for Automatic Rename Refactoring [57.8069006460087]
We study automatic rename refactoring on variable names, which is considered more challenging than other rename refactoring activities.
We propose RefBERT, a two-stage pre-trained framework for rename refactoring on variable names.
We show that the generated variable names of RefBERT are more accurate and meaningful than those produced by the existing method.
arXiv Detail & Related papers (2023-05-28T12:29:39Z) - Do code refactorings influence the merge effort? [80.1936417993664]
Multiple contributors frequently change the source code in parallel to implement new features, fix bugs, refactor existing code, and make other changes.
These simultaneous changes need to be merged into the same version of the source code.
Studies show that 10 to 20 percent of all merge attempts result in conflicts, which require manual intervention from developers to complete the process.
arXiv Detail & Related papers (2023-05-10T13:24:59Z) - ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval (a minimal lexical-retrieval sketch appears after this list).
We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z) - How We Refactor and How We Document it? On the Use of Supervised Machine Learning Algorithms to Classify Refactoring Documentation [25.626914797750487]
Refactoring is the art of improving the design of a system without altering its external behavior.
This study categorizes refactoring commits into three categories, namely Internal QA, External QA, and Code Smell Resolution, along with the traditional BugFix and Functional categories.
To better understand our classification results, we analyzed commit messages to extract patterns that developers regularly use to describe their refactoring activities.
arXiv Detail & Related papers (2020-10-26T20:33:17Z)
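As referenced in the "Bugs in the Shadows" entry above, here is a hypothetical example (not taken from that paper) of why Python's dynamic type system makes automated refactoring risky: a purely static rename of `Report.render` to `Report.render_html` would miss the string-based lookup below and fail at runtime. The class and function names are invented for illustration.

```python
# Hypothetical example of a refactoring hazard caused by dynamic dispatch.
class Report:
    def render(self) -> str:
        return "<html>...</html>"

def run(obj, method_name: str = "render"):
    # The method is resolved from a string at runtime, so no static
    # reference to `Report.render` exists here. Renaming the method
    # without updating this string raises AttributeError at runtime.
    return getattr(obj, method_name)()

print(run(Report()))
```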
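And the lexical-retrieval sketch referenced in the ReACC entry: a minimal, stdlib-only illustration of retrieval-augmented completion that retrieves the most lexically similar snippet from a retrieval database and prepends it as context. ReACC itself also uses a semantic (dense) retriever and a trained completion model; everything here (`retrieve`, `build_prompt`, the toy database) is an assumption made for illustration.

```python
# Minimal sketch (not ReACC's implementation) of the lexical half of
# retrieval-augmented code completion.
import difflib

def retrieve(unfinished: str, database: list[str]) -> str:
    """Return the database snippet most lexically similar to the query."""
    scored = [
        (difflib.SequenceMatcher(None, unfinished, snippet).ratio(), snippet)
        for snippet in database
    ]
    return max(scored)[1]

def build_prompt(unfinished: str, database: list[str]) -> str:
    # The retrieved snippet acts as a hint the completion model can copy
    # from -- the "lexical copying" signal described in the abstract.
    hint = retrieve(unfinished, database)
    return f"# similar code:\n{hint}\n# complete:\n{unfinished}"

db = [
    "def mean(xs):\n    return sum(xs) / len(xs)\n",
    "def read_lines(path):\n    with open(path) as f:\n        return f.readlines()\n",
]
print(build_prompt("def average(values):\n    return", db))
```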