How do Machine Learning Models Change?
- URL: http://arxiv.org/abs/2411.09645v1
- Date: Thu, 14 Nov 2024 18:14:32 GMT
- Title: How do Machine Learning Models Change?
- Authors: Joel Castaño, Rafael Cabañas, Antonio Salmerón, David Lo, Silverio Martínez-Fernández
- Abstract summary: This study utilizes both repository mining and longitudinal analysis methods to examine over 200,000 commits and 1,200 releases from over 50,000 models on Hugging Face (HF).
We replicate and extend an ML change taxonomy for classifying commits and utilize Bayesian networks to uncover patterns in commit and release activities over time.
Our findings indicate that commit activities align with established data science methodologies, such as CRISP-DM, emphasizing iterative refinement and continuous improvement.
Additionally, release patterns tend to consolidate significant updates, particularly in documentation, distinguishing between granular changes and milestone-based releases.
- Score: 7.543685248926161
- Abstract: The proliferation of Machine Learning (ML) models and their open-source implementations has transformed Artificial Intelligence research and applications. Platforms like Hugging Face (HF) enable the development, sharing, and deployment of these models, fostering an evolving ecosystem. While previous studies have examined aspects of models hosted on platforms like HF, a comprehensive longitudinal study of how these models change remains underexplored. This study addresses this gap by utilizing both repository mining and longitudinal analysis methods to examine over 200,000 commits and 1,200 releases from over 50,000 models on HF. We replicate and extend an ML change taxonomy for classifying commits and utilize Bayesian networks to uncover patterns in commit and release activities over time. Our findings indicate that commit activities align with established data science methodologies, such as CRISP-DM, emphasizing iterative refinement and continuous improvement. Additionally, release patterns tend to consolidate significant updates, particularly in documentation, distinguishing between granular changes and milestone-based releases. Furthermore, projects with higher popularity prioritize infrastructure enhancements early in their lifecycle, and those with intensive collaboration practices exhibit improved documentation standards. These and other insights enhance the understanding of model changes on community platforms and provide valuable guidance for best practices in model maintenance.
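The paper does not reproduce its mining pipeline here; as a rough illustration of the repository-mining step described in the abstract, the sketch below collects per-model commit histories with the official `huggingface_hub` client. `list_models` and `list_repo_commits` are real `huggingface_hub` calls; the keyword-based `classify_commit` helper is a hypothetical stand-in for the paper's ML change taxonomy, not its actual classifier.

```python
# Sketch: collecting commit histories from the Hugging Face Hub for change
# analysis. `classify_commit` is a toy stand-in for the paper's taxonomy.
from huggingface_hub import HfApi

api = HfApi()

def classify_commit(title: str) -> str:
    """Toy classifier mapping a commit title to a coarse change category."""
    title = title.lower()
    if "readme" in title or "model card" in title:
        return "documentation"
    if "config" in title or "tokenizer" in title:
        return "infrastructure"
    return "other"

# Tally change categories over a small sample of public models.
for model in api.list_models(limit=10):
    try:
        commits = api.list_repo_commits(model.id)
    except Exception:
        continue  # gated or deleted repos may be inaccessible
    counts: dict[str, int] = {}
    for commit in commits:
        category = classify_commit(commit.title)
        counts[category] = counts.get(category, 0) + 1
    print(model.id, counts)
```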
Related papers
- Transfer Learning with Foundational Models for Time Series Forecasting using Low-Rank Adaptations [0.0]
This study proposes LLIAM, the Llama LoRA-Integrated Autoregressive Model.
Low-Rank Adaptations are used to enhance the knowledge of the model with diverse time series datasets during the fine-tuning phase (a minimal LoRA sketch follows this entry).
arXiv Detail & Related papers (2024-10-15T12:14:01Z)
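LLIAM's exact training recipe is in the cited paper; the following is only a minimal sketch of the Low-Rank Adaptation fine-tuning step it mentions, using the `peft` and `transformers` libraries. The base checkpoint and `target_modules` choices are assumptions, not the paper's configuration.

```python
# Minimal LoRA fine-tuning sketch with `peft` (assumed setup, not LLIAM's
# actual configuration).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # assumed base model

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (assumed)
    lora_dropout=0.05,
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small LoRA adapters are trainable
# ...fine-tune as usual on the time-series training data...
```

Because only the adapter matrices are updated, the fine-tuning phase touches a small fraction of the parameters, which is the efficiency argument behind LoRA.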
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient technique in the machine learning community for combining the strengths of multiple trained models.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques; the sketch after this entry shows the simplest merging baseline, uniform weight averaging.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
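The survey covers many merging methods; the sketch below shows only the simplest baseline, uniform weight averaging of two same-architecture checkpoints, to make the idea concrete. The checkpoint paths are hypothetical.

```python
# Sketch: uniform weight averaging, the simplest model-merging baseline.
# Assumes both checkpoints share an architecture and parameter names.
import torch

def average_state_dicts(sd_a: dict, sd_b: dict) -> dict:
    """Return a state dict whose floating-point tensors are the elementwise
    mean of the two inputs."""
    merged = {}
    for name, tensor_a in sd_a.items():
        tensor_b = sd_b[name]
        if tensor_a.dtype.is_floating_point:
            merged[name] = (tensor_a + tensor_b) / 2
        else:
            merged[name] = tensor_a  # integer buffers (e.g. step counters): keep one copy
    return merged

sd_a = torch.load("model_a.pt")  # hypothetical checkpoint paths
sd_b = torch.load("model_b.pt")
torch.save(average_state_dicts(sd_a, sd_b), "model_merged.pt")
```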
- Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations [52.11801730860999]
In recent years, the robot learning community has shown increasing interest in using deep generative models to capture the complexity of large datasets.
We present the different types of models that the community has explored, such as energy-based models, diffusion models, action value maps, or generative adversarial networks.
We also present the different types of applications in which deep generative models have been used, from grasp generation to trajectory generation or cost learning.
arXiv Detail & Related papers (2024-08-08T11:34:31Z)
- Data Augmentation for Multivariate Time Series Classification: An Experimental Study [1.5390962520179197]
Despite the limited size of the studied datasets, we achieved classification accuracy improvements in 10 out of 13 datasets using the Rocket and InceptionTime models.
This highlights the essential role of sufficient data in training effective models, paralleling the advancements seen in computer vision (a sketch of two basic augmentations follows this entry).
arXiv Detail & Related papers (2024-06-10T17:58:02Z)
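The study evaluates many augmentation families; as a flavor of the simplest ones, here is a sketch of jittering and per-channel magnitude scaling for multivariate series. The sigma defaults are illustrative, not the paper's tuned settings.

```python
# Sketch: two basic multivariate time-series augmentations (jitter, scaling).
# Arrays follow a (timesteps, channels) convention.
import numpy as np

def jitter(x: np.ndarray, sigma: float = 0.03) -> np.ndarray:
    """Add independent Gaussian noise to every observation."""
    return x + np.random.normal(0.0, sigma, size=x.shape)

def scale(x: np.ndarray, sigma: float = 0.1) -> np.ndarray:
    """Multiply each channel by a random factor drawn around 1."""
    factors = np.random.normal(1.0, sigma, size=(1, x.shape[1]))
    return x * factors

series = np.random.randn(128, 6)   # 128 timesteps, 6 channels
augmented = scale(jitter(series))  # compose augmentations
```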
- TSPP: A Unified Benchmarking Tool for Time-series Forecasting [3.5415344166235534]
We propose a unified benchmarking framework that exposes the crucial modelling and machine learning decisions involved in developing time series forecasting models.
This framework fosters seamless integration of models and datasets, aiding both practitioners and researchers in their development efforts.
We benchmark recently proposed models within this framework, demonstrating that carefully implemented deep learning models can rival gradient-boosted decision trees with minimal effort.
arXiv Detail & Related papers (2023-12-28T16:23:58Z)
- Analyzing the Evolution and Maintenance of ML Models on Hugging Face [8.409033836300761]
Hugging Face (HF) has established itself as a crucial platform for the development and sharing of machine learning (ML) models.
This repository mining study, which delves into more than 380,000 models using data gathered via the HF Hub API, aims to explore community engagement, evolution, and maintenance around models hosted on HF.
arXiv Detail & Related papers (2023-11-22T13:20:25Z)
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with a PTM, target model tuning with a PTM, and PTM-based inference.
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
- Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We conduct empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks (a loose distillation sketch follows this entry).
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
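The details of the paper's parameter-free CLIP distillation are in the cited work; the sketch below only illustrates the general idea of distilling zero-shot CLIP predictions into a student model, using the `transformers` CLIP API. The class prompts and KL-based loss wiring are assumptions, not the paper's method.

```python
# Sketch: zero-shot CLIP predictions as soft targets for a student model.
# Loosely inspired by, not identical to, the paper's CLIP distillation.
import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
prompts = [f"a photo of a {c}" for c in ["dog", "cat", "car"]]  # assumed class names

def soft_targets(images) -> torch.Tensor:
    """Zero-shot CLIP class probabilities over the prompts, used as targets."""
    inputs = processor(text=prompts, images=images, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = clip(**inputs).logits_per_image  # (batch, num_classes)
    return logits.softmax(dim=-1)

def distill_loss(student_logits: torch.Tensor, images) -> torch.Tensor:
    """KL divergence between student predictions and CLIP's soft targets."""
    return F.kl_div(student_logits.log_softmax(dim=-1),
                    soft_targets(images), reduction="batchmean")
```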
- Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute and performance when scaling vision-language models (a generic MoE layer sketch follows this entry).
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
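The cited work builds MoE into vision-language models; the sketch below is only a generic top-1-gated mixture-of-experts feed-forward layer in PyTorch, to make the sparse-routing mechanism concrete.

```python
# Sketch: a generic top-1-gated sparse mixture-of-experts feed-forward layer.
# Illustrates the routing mechanism only, not the paper's VLM architecture.
import torch
import torch.nn as nn

class MoEFeedForward(nn.Module):
    def __init__(self, dim: int, hidden: int, num_experts: int = 4):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # router over experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Each token is processed by its single best expert,
        # so compute per token stays constant as experts are added.
        gate_probs = self.gate(x).softmax(dim=-1)  # (tokens, num_experts)
        weight, index = gate_probs.max(dim=-1)     # top-1 routing decision
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = index == e
            if mask.any():
                out[mask] = weight[mask, None] * expert(x[mask])
        return out

layer = MoEFeedForward(dim=64, hidden=256)
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```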
- Learning to Augment via Implicit Differentiation for Domain Generalization [107.9666735637355]
Domain generalization (DG) aims to overcome the domain shift problem by leveraging multiple source domains to learn a domain-generalizable model.
In this paper, we propose a novel augmentation-based DG approach, dubbed AugLearn.
AugLearn shows its effectiveness on three standard DG benchmarks: PACS, Office-Home, and Digits-DG.
arXiv Detail & Related papers (2022-10-25T18:51:51Z)
- Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.