An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep
Learning Model Registry
- URL: http://arxiv.org/abs/2303.02552v1
- Date: Sun, 5 Mar 2023 02:28:15 GMT
- Title: An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep
Learning Model Registry
- Authors: Wenxin Jiang, Nicholas Synovic, Matt Hyatt, Taylor R. Schorlemmer,
Rohan Sethi, Yung-Hsiang Lu, George K. Thiruvathukal, James C. Davis
- Abstract summary: Machine learning engineers have begun to reuse large-scale pre-trained models (PTMs).
We interviewed 12 practitioners from the most popular PTM ecosystem, Hugging Face, to learn the practices and challenges of PTM reuse.
Three challenges for PTM reuse are missing attributes, discrepancies between claimed and actual performance, and model risks.
- Score: 2.1346819928536687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are being adopted as components in software
systems. Creating and specializing DNNs from scratch has grown increasingly
difficult as state-of-the-art architectures grow more complex. Following the
path of traditional software engineering, machine learning engineers have begun
to reuse large-scale pre-trained models (PTMs) and fine-tune these models for
downstream tasks. Prior works have studied reuse practices for traditional
software packages to guide software engineers towards better package
maintenance and dependency management. We lack a similar foundation of
knowledge to guide behaviors in pre-trained model ecosystems.
In this work, we present the first empirical investigation of PTM reuse. We
interviewed 12 practitioners from the most popular PTM ecosystem, Hugging Face,
to learn the practices and challenges of PTM reuse. From this data, we model
the decision-making process for PTM reuse. Based on the identified practices,
we describe useful attributes for model reuse, including provenance,
reproducibility, and portability. Three challenges for PTM reuse are missing
attributes, discrepancies between claimed and actual performance, and model
risks. We substantiate these identified challenges with systematic measurements
in the Hugging Face ecosystem. Our work informs future directions for optimizing
deep learning ecosystems through automated measurement of useful attributes and
potential attacks, and envisions future research on infrastructure and
standardization for model registries.
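The reuse-and-fine-tune workflow the abstract describes can be sketched with the Hugging Face transformers API. This is a minimal sketch, not the study's setup: the bert-base-uncased checkpoint, the IMDB dataset, and the hyperparameters are illustrative assumptions.

```python
# Minimal sketch of PTM reuse: pick a pre-trained checkpoint from the
# Hugging Face registry and fine-tune it for a downstream task.
# Checkpoint, dataset, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "bert-base-uncased"  # assumed PTM; any compatible model card works
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Tokenize a small sentiment dataset (illustrative choice).
dataset = load_dataset("imdb")
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)
encoded = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ptm-reuse-demo", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=encoded["train"].shuffle(seed=0).select(range(2000)),
    eval_dataset=encoded["test"].select(range(500)),
)
trainer.train()
```

In practice, the choice of checkpoint is where the challenges identified above (missing attributes, gaps between claimed and actual performance, and model risks) enter this workflow.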
Related papers
- PeaTMOSS: Mining Pre-Trained Models in Open-Source Software [6.243303627949341]
We present the PeaTMOSS dataset: Pre-Trained Models in Open-Source Software.
PeaTMOSS has three parts: (1) a snapshot of 281,638 PTMs, (2) 27,270 open-source software repositories that use PTMs, and (3) a mapping between PTMs and the projects that use them.
arXiv Detail & Related papers (2023-10-05T15:58:45Z)
- Naming Practices of Pre-Trained Models in Hugging Face [4.956536094440504]
Pre-Trained Models (PTMs) are increasingly adopted as components of computer systems.
Researchers publish PTMs, which engineers adapt for quality or performance prior to deployment.
Prior research has reported that model names are not always well chosen, and are sometimes erroneous.
In this paper, we frame and conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry.
arXiv Detail & Related papers (2023-10-02T21:13:32Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with PTM, tuning target model with PTM, and PTM-based inference.
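The three reuse perspectives ZhiJian unifies can be illustrated in plain PyTorch. This generic sketch does not use ZhiJian's own API; the ResNet-18 backbone and the 10-class head are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

# Generic PyTorch illustration of three PTM-reuse modes (not ZhiJian's API).
backbone = resnet18(weights=ResNet18_Weights.DEFAULT)  # assumed PTM

# 1) Target architecture construction with a PTM: reuse the backbone and
#    replace the head for a hypothetical 10-class downstream task.
backbone.fc = nn.Linear(backbone.fc.in_features, 10)

# 2) Tuning the target model with the PTM: freeze early layers, train the rest.
for name, param in backbone.named_parameters():
    param.requires_grad = name.startswith(("layer4", "fc"))

# 3) PTM-based inference: run the (fine-tuned) model directly on new inputs.
backbone.eval()
with torch.no_grad():
    logits = backbone(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 10])
```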
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
- Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey [66.18478838828231]
Multi-modal pre-trained big models have drawn increasing attention in recent years.
This paper introduces the background of multi-modal pre-training by reviewing conventional deep learning and pre-training work in natural language processing, computer vision, and speech.
Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network, and knowledge enhanced pre-training.
arXiv Detail & Related papers (2023-02-20T15:34:03Z)
- Great Truths are Always Simple: A Rather Simple Knowledge Encoder for Enhancing the Commonsense Reasoning Capacity of Pre-Trained Models [89.98762327725112]
Commonsense reasoning in natural language is a desired ability of artificial intelligence systems.
For solving complex commonsense reasoning tasks, a typical solution is to enhance pre-trained language models(PTMs) with a knowledge-aware graph neural network(GNN) encoder.
Despite their effectiveness, these approaches are built on heavy architectures and cannot clearly explain how external knowledge resources improve the reasoning capacity of PTMs.
arXiv Detail & Related papers (2022-05-04T01:27:36Z)
- A Model-Driven Engineering Approach to Machine Learning and Software Modeling [0.5156484100374059]
Models are used in both the Software Engineering (SE) and Artificial Intelligence (AI) communities.
The paper's main focus is on Internet of Things (IoT) and smart Cyber-Physical Systems (CPS) use cases, where both ML and model-driven SE play a key role.
arXiv Detail & Related papers (2021-07-06T15:50:50Z)
- Pre-Trained Models: Past, Present and Future [126.21572378910746]
Large-scale pre-trained models (PTMs) have recently achieved great success and become a milestone in the field of artificial intelligence (AI).
By storing knowledge in huge numbers of parameters and fine-tuning on specific tasks, the rich knowledge implicitly encoded in those parameters can benefit a variety of downstream tasks.
It is now the consensus of the AI community to adopt PTMs as the backbone for downstream tasks rather than learning models from scratch.
arXiv Detail & Related papers (2021-06-14T02:40:32Z)
- Do we need to go Deep? Knowledge Tracing with Big Data [5.218882272051637]
We use EdNet, the largest publicly available student interaction dataset in the education domain, to understand how accurately both deep and traditional models predict future student performance.
Through extensive experimentation, we observe that logistic regression models with carefully engineered features outperform deep models.
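As a rough illustration of that finding, the following is a minimal sketch of a "logistic regression over engineered interaction features" baseline; the feature names and the synthetic data are assumptions, not the EdNet pipeline.

```python
# Hypothetical sketch of a logistic-regression knowledge-tracing baseline
# with hand-engineered features; the features and data below are synthetic
# assumptions, not the paper's EdNet setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.uniform(0, 1, n),      # prior success rate on the skill (engineered)
    rng.integers(1, 50, n),    # number of prior attempts (engineered)
    rng.exponential(1.0, n),   # hours since last attempt (engineered)
])
# Synthetic "answered correctly" label loosely tied to the features.
y = (X[:, 0] + 0.01 * X[:, 1] - 0.1 * X[:, 2] + rng.normal(0, 0.3, n) > 0.6).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```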
arXiv Detail & Related papers (2021-01-20T22:40:38Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.