Related papers: Detecting Refactoring Commits in Machine Learning Python Projects: A Machine Learning-Based Approach

URL: http://arxiv.org/abs/2404.06572v1
Date: Tue, 9 Apr 2024 18:46:56 GMT
Title: Detecting Refactoring Commits in Machine Learning Python Projects: A Machine Learning-Based Approach
Authors: Shayan Noei, Heng Li, Ying Zou
Abstract summary: MLRefScanner identifies commits with both ML-specific and general refactoring operations. Our study highlights the potential of ML-driven approaches in detecting refactoring across diverse programming languages and technical domains.
Score: 3.000496428347787
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Refactoring enhances software quality without altering its functional behaviors. Understanding the refactoring activities of developers is crucial to improving software maintainability. With the increasing use of machine learning (ML) libraries and frameworks, maximizing their maintainability is crucial. Due to the data-driven nature of ML projects, they often undergo different refactoring operations (e.g., data manipulation), for which existing refactoring tools lack ML-specific detection capabilities. Furthermore, a large number of ML libraries are written in Python, which has limited tools for refactoring detection. PyRef, a rule-based and state-of-the-art tool for Python refactoring detection, can identify 11 types of refactoring operations. In comparison, Rminer can detect 99 types of refactoring for Java projects. We introduce MLRefScanner, a prototype tool that applies machine-learning techniques to detect refactoring commits in ML Python projects. MLRefScanner identifies commits with both ML-specific and general refactoring operations. Evaluating MLRefScanner on 199 ML projects demonstrates its superior performance compared to state-of-the-art approaches, achieving an overall 94% precision and 82% recall. Combining it with PyRef further boosts performance to 95% precision and 99% recall. Our study highlights the potential of ML-driven approaches in detecting refactoring across diverse programming languages and technical domains, addressing the limitations of rule-based detection methods.
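The abstract above quotes precision and recall but no combined score. As a small worked example, the sketch below derives the F1 score (the harmonic mean of precision and recall) from those two pairs of figures. The F1 values are plain arithmetic on the quoted numbers, not results reported in this listing, and the helper name `f1_score` and the dictionary labels are illustrative assumptions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as quoted in the abstract above.
# The labels are illustrative names for the two configurations.
reported = {
    "MLRefScanner": (0.94, 0.82),
    "MLRefScanner + PyRef": (0.95, 0.99),
}

# Derived values: these are computed here, not taken from the paper.
f1_by_tool = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, f1 in f1_by_tool.items():
    print(f"{name}: F1 = {f1:.3f}")
# MLRefScanner: F1 = 0.876
# MLRefScanner + PyRef: F1 = 0.970
```

Under this reading, the gain from combining the two tools comes almost entirely from recall (82% to 99%), while precision changes by a single point.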

Related papers

MANTRA: Enhancing Automated Method-Level Refactoring with Contextual RAG and Multi-Agent LLM Collaboration [44.75848695076576]
We introduce MANTRA, a comprehensive Large Language Model (LLM) agent-based framework. MANTRA integrates Context-Aware Retrieval-Augmented Generation, coordinated Multi-Agent Collaboration, and Verbal Reinforcement Learning. Experimental results demonstrate that MANTRA substantially surpasses a baseline LLM.
arXiv Detail & Related papers (2025-03-18T15:16:51Z)
Evaluating the Effectiveness of Small Language Models in Detecting Refactoring Bugs [0.6133301815445301]
This study evaluates the effectiveness of Small Language Models (SLMs) in detecting two types of refactoring bugs in Java and Python. The study covers 16 refactoring types and employs zero-shot prompting on consumer-grade hardware to evaluate the models' ability to reason about correctness without explicit prior training. The proprietary o3-mini-high model achieved the highest detection rate, identifying 84.3% of Type I bugs.
arXiv Detail & Related papers (2025-02-25T18:52:28Z)
Refactoring Detection in C++ Programs with RefactoringMiner++ [45.045206894182776]
We present RefactoringMiner++, a refactoring detection tool based on the current state of the art: RefactoringMiner 3. While the latter focuses exclusively on Java, our tool is, to the best of our knowledge, the first publicly available refactoring detection tool for C++ projects.
arXiv Detail & Related papers (2025-02-24T23:17:35Z)
Context-Enhanced LLM-Based Framework for Automatic Test Refactoring [10.847400457238423]
Test smells arise from poor design practices and insufficient domain knowledge. We propose UTRefactor, a context-enhanced, LLM-based framework for automatic test refactoring in Java projects. We evaluate UTRefactor on 879 tests from six open-source Java projects, reducing the number of test smells from 2,375 to 265, an 89% reduction.
arXiv Detail & Related papers (2024-09-25T08:42:29Z)
Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML). VML constrains the parameter space to be human-interpretable natural language. We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z)
ReGAL: Refactoring Programs to Discover Generalizable Abstractions [59.05769810380928]
Refactoring for Generalizable Abstraction Learning (ReGAL) is a method for learning a library of reusable functions via code refactorization. We find that the shared function libraries discovered by ReGAL make programs easier to predict across diverse domains. For CodeLlama-13B, ReGAL results in absolute accuracy increases of 11.5% on LOGO, 26.1% on date understanding, and 8.1% on TextCraft, outperforming GPT-3.5 in two of three domains.
arXiv Detail & Related papers (2024-01-29T18:45:30Z)
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks. To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in an end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z)
Julearn: an easy-to-use library for leakage-free evaluation and inspection of ML models [0.23301643766310373]
We present the rationale behind julearn's design and its core features, and showcase three examples of previously published research projects. Julearn aims to simplify entry into the machine learning world by providing an easy-to-use environment with built-in guards against some of the most common ML pitfalls.
arXiv Detail & Related papers (2023-10-19T08:21:12Z)
GEVO-ML: Optimizing Machine Learning Code with Evolutionary Computation [6.525197444717069]
GEVO-ML is a tool for discovering optimization opportunities and tuning the performance of Machine Learning kernels. We demonstrate GEVO-ML on two different ML workloads for both model training and prediction. GEVO-ML finds significant improvements for these models, achieving 90.43% performance improvement when model accuracy is relaxed by 2%.
arXiv Detail & Related papers (2023-10-16T09:24:20Z)
Large Language Model-Aware In-Context Learning for Code Generation [75.68709482932903]
Large language models (LLMs) have shown impressive in-context learning (ICL) ability in code generation. We propose a novel learning-based selection approach named LAIL (LLM-Aware In-context Learning) for code generation.
arXiv Detail & Related papers (2023-10-15T06:12:58Z)
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs). It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks. Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
Practical Machine Learning Safety: A Survey and Primer [81.73857913779534]
Open-world deployment of Machine Learning algorithms in safety-critical applications such as autonomous vehicles needs to address a variety of ML vulnerabilities. Research proposes new models and training techniques to reduce generalization error, achieve domain adaptation, and detect outlier examples and adversarial attacks. Our organization maps state-of-the-art ML techniques to safety strategies in order to enhance the dependability of the ML algorithm from different aspects.
arXiv Detail & Related papers (2021-06-09T05:56:42Z)
The Prevalence of Code Smells in Machine Learning projects [9.722159563454436]
Static code analysis can be used to find potential defects in the source code, refactoring opportunities, and violations of common coding standards. We gathered a dataset of 74 open-source Machine Learning projects, installed their dependencies and ran Pylint on them. This resulted in a top 20 of all detected code smells, per category.
arXiv Detail & Related papers (2021-03-06T16:01:54Z)
MLGO: a Machine Learning Guided Compiler Optimizations Framework [0.0]
This work is the first full integration of machine learning in a complex compiler pass in a real-world setting. We use two different ML algorithms to train the inlining-for-size model, and achieve up to 7% size reduction. The same model generalizes well to a diversity of real-world targets, as well as to the same set of targets after months of active development.
arXiv Detail & Related papers (2021-01-13T00:02:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.