Detecting Refactoring Commits in Machine Learning Python Projects: A Machine Learning-Based Approach
- URL: http://arxiv.org/abs/2404.06572v1
- Date: Tue, 9 Apr 2024 18:46:56 GMT
- Title: Detecting Refactoring Commits in Machine Learning Python Projects: A Machine Learning-Based Approach
- Authors: Shayan Noei, Heng Li, Ying Zou
- Abstract summary: MLRefScanner identifies commits with both ML-specific and general refactoring operations.
Our study highlights the potential of ML-driven approaches in detecting refactoring across diverse programming languages and technical domains.
- Score: 3.000496428347787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Refactoring enhances software quality without altering its functional behaviors. Understanding the refactoring activities of developers is crucial to improving software maintainability. With the increasing use of machine learning (ML) libraries and frameworks, maximizing their maintainability is crucial. Due to the data-driven nature of ML projects, they often undergo different refactoring operations (e.g., data manipulation), for which existing refactoring tools lack ML-specific detection capabilities. Furthermore, a large number of ML libraries are written in Python, which has limited tools for refactoring detection. PyRef, a rule-based and state-of-the-art tool for Python refactoring detection, can identify 11 types of refactoring operations. In comparison, Rminer can detect 99 types of refactoring for Java projects. We introduce MLRefScanner, a prototype tool that applies machine-learning techniques to detect refactoring commits in ML Python projects. MLRefScanner identifies commits with both ML-specific and general refactoring operations. Evaluating MLRefScanner on 199 ML projects demonstrates its superior performance compared to state-of-the-art approaches, achieving an overall 94% precision and 82% recall. Combining it with PyRef further boosts performance to 95% precision and 99% recall. Our study highlights the potential of ML-driven approaches in detecting refactoring across diverse programming languages and technical domains, addressing the limitations of rule-based detection methods.
Related papers
- Distributed Approach to Haskell Based Applications Refactoring with LLMs Based Multi-Agent Systems [3.972203967261693]
A large language model (LLM)-based multi-agent system is used to automate refactoring of Haskell applications.
The system consists of specialized agents performing tasks such as context analysis, validation, and testing.
Refactoring improvements are assessed using metrics such as cyclomatic complexity, run-time, and memory allocation.
arXiv Detail & Related papers (2025-02-11T20:04:15Z) - An Empirical Study on the Impact of Code Duplication-aware Refactoring Practices on Quality Metrics [5.516979718589074]
We extract a corpus of 332 commits applied and documented by developers during their daily changes from 128 open-source Java projects.
We empirically analyze the impact of these operations on a set of common state-of-the-art design quality metrics.
arXiv Detail & Related papers (2025-02-06T13:34:25Z) - Testing Refactoring Engine via Historical Bug Report driven LLM [6.852749659993347]
Refactoring is the process of restructuring existing code without changing its external behavior.
We propose RETESTER, a framework for automated refactoring engine testing.
arXiv Detail & Related papers (2025-01-16T23:31:49Z) - Automated Unit Test Refactoring [10.847400457238423]
Test smells arise from poor design practices and insufficient domain knowledge.
We propose UTRefactor, a context-enhanced, LLM-based framework for automatic test refactoring in Java projects.
We evaluate UTRefactor on 879 tests from six open-source Java projects, reducing the number of test smells from 2,375 to 265, achieving an 89% reduction.
arXiv Detail & Related papers (2024-09-25T08:42:29Z) - Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML).
VML constrains the parameter space to be human-interpretable natural language.
We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z) - ReGAL: Refactoring Programs to Discover Generalizable Abstractions [59.05769810380928]
Refactoring for Generalizable Abstraction Learning (ReGAL) is a method for learning a library of reusable functions via code refactorization.
We find that the shared function libraries discovered by ReGAL make programs easier to predict across diverse domains.
For CodeLlama-13B, ReGAL results in absolute accuracy increases of 11.5% on LOGO, 26.1% on date understanding, and 8.1% on TextCraft, outperforming GPT-3.5 in two of three domains.
arXiv Detail & Related papers (2024-01-29T18:45:30Z) - ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in an end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z) - Large Language Model-Aware In-Context Learning for Code Generation [75.68709482932903]
Large language models (LLMs) have shown impressive in-context learning (ICL) ability in code generation.
We propose a novel learning-based selection approach named LAIL (LLM-Aware In-context Learning) for code generation.
arXiv Detail & Related papers (2023-10-15T06:12:58Z) - CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z) - Practical Machine Learning Safety: A Survey and Primer [81.73857913779534]
Open-world deployment of Machine Learning algorithms in safety-critical applications such as autonomous vehicles needs to address a variety of ML vulnerabilities.
New models and training techniques to reduce generalization error, achieve domain adaptation, and detect outlier examples and adversarial attacks.
Our organization maps state-of-the-art ML techniques to safety strategies in order to enhance the dependability of the ML algorithm from different aspects.
arXiv Detail & Related papers (2021-06-09T05:56:42Z) - The Prevalence of Code Smells in Machine Learning projects [9.722159563454436]
Static code analysis can be used to find potential defects in the source code, refactoring opportunities, and violations of common coding standards.
We gathered a dataset of 74 open-source Machine Learning projects, installed their dependencies and ran Pylint on them.
This resulted in a top 20 of all detected code smells, per category.
arXiv Detail & Related papers (2021-03-06T16:01:54Z)