Naming Practices of Pre-Trained Models in Hugging Face
- URL: http://arxiv.org/abs/2310.01642v2
- Date: Thu, 28 Mar 2024 20:12:16 GMT
- Title: Naming Practices of Pre-Trained Models in Hugging Face
- Authors: Wenxin Jiang, Chingwo Cheung, Mingyu Kim, Heesoo Kim, George K. Thiruvathukal, James C. Davis
- Abstract summary: Pre-Trained Models (PTMs) are increasingly adopted as components in computer systems.
Researchers publish PTMs, which engineers adapt for quality or performance prior to deployment.
Prior research has reported that model names are not always well chosen - and are sometimes erroneous.
In this paper, we frame and conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry.
- Score: 4.956536094440504
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As innovation in deep learning continues, many engineers seek to adopt Pre-Trained Models (PTMs) as components in computer systems. Researchers publish PTMs, which engineers adapt for quality or performance prior to deployment. PTM authors should choose appropriate names for their PTMs to facilitate model discovery and reuse. However, prior research has reported that model names are not always well chosen - and are sometimes erroneous. The naming of PTM packages has not been systematically studied. In this paper, we frame and conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry. We initiated our study with a survey of 108 Hugging Face users to understand their PTM naming practices. From our survey analysis, we highlight discrepancies from traditional software package naming and present findings on naming practices. Our findings indicate a substantial mismatch between engineers' naming preferences and current PTM naming practices. We also present approaches to detecting naming anomalies and introduce DARA, a novel automated DNN ARchitecture Assessment technique capable of detecting PTM naming anomalies. We envision future work on leveraging meta-features of PTMs to improve model reuse and trustworthiness.
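As a rough, hedged illustration of what checking a model name against its declared architecture could look like (a simplified heuristic, not the paper's DARA technique, which assesses the DNN architecture itself), the Python sketch below downloads a repository's config.json from the Hugging Face Hub and flags repositories whose names do not mention the architecture family declared in the config. The repository IDs and the name-splitting heuristic are examples only.

```python
import json
import re

from huggingface_hub import hf_hub_download  # pip install huggingface_hub


def architecture_families(config_path: str) -> set:
    """Extract lower-cased architecture family names (e.g. 'bert') from a config.json."""
    with open(config_path) as f:
        config = json.load(f)
    families = set()
    for arch in config.get("architectures", []):
        # "BertForMaskedLM" -> "bert", "GPT2LMHeadModel" -> "gpt2"
        head = re.split(r"For|LMHead|Model", arch, maxsplit=1)[0]
        if head:
            families.add(head.lower())
    return families


def name_mentions_architecture(repo_id: str) -> bool:
    """Heuristic: does the repository name mention the architecture declared in its config?"""
    config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
    repo_name = repo_id.split("/")[-1].lower()
    return any(family in repo_name for family in architecture_families(config_path))


if __name__ == "__main__":
    for repo in ["bert-base-uncased", "distilbert-base-uncased"]:  # example repos only
        verdict = "ok" if name_mentions_architecture(repo) else "possible naming anomaly"
        print(f"{repo}: {verdict}")
```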
Related papers
- Automated categorization of pre-trained models for software engineering: A case study with a Hugging Face dataset [9.218130273952383]
Software engineering activities have been revolutionized by the advent of pre-trained models (PTMs).
The Hugging Face (HF) platform simplifies the use of PTMs by collecting, storing, and curating numerous models.
This paper introduces an approach to enable the automatic classification of PTMs for SE tasks.
arXiv Detail & Related papers (2024-05-21T20:26:17Z)
- PeaTMOSS: Mining Pre-Trained Models in Open-Source Software [6.243303627949341]
We present the PeaTMOSS dataset: Pre-Trained Models in Open-Source Software.
PeaTMOSS has three parts: (1) a snapshot of 281,638 PTMs, (2) 27,270 open-source software repositories that use PTMs, and (3) a mapping between PTMs and the projects that use them.
arXiv Detail & Related papers (2023-10-05T15:58:45Z)
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with a PTM, tuning a target model with a PTM, and PTM-based inference.
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
- Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need [84.3507610522086]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring.
arXiv Detail & Related papers (2023-03-13T17:59:02Z)
- An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry [2.1346819928536687]
Machine learning engineers have begun to reuse large-scale pre-trained models (PTMs).
We interviewed 12 practitioners from the most popular PTM ecosystem, Hugging Face, to learn the practices and challenges of PTM reuse.
Three challenges for PTM reuse are missing attributes, discrepancies between claimed and actual performance, and model risks.
arXiv Detail & Related papers (2023-03-05T02:28:15Z)
- Black-Box Tuning for Language-Model-as-a-Service [85.2210372920386]
This work proposes Black-Box Tuning to optimize PTMs through derivative-free algorithms.
In particular, we invoke CMA-ES to optimize the continuous prompt prepended to the input text by iteratively calling PTM inference APIs.
Our experimental results demonstrate that black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompts and GPT-3's in-context learning, but also surpasses the gradient-based counterparts.
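To make the idea concrete, here is a minimal, hedged sketch of derivative-free prompt tuning in this spirit: CMA-ES (via the `cma` package) searches a continuous prompt vector, and the only feedback is a scalar loss from `query_ptm_api`, a purely hypothetical stand-in for a real PTM inference API (the paper's actual subspace projection, task setup, and loss are not reproduced here).

```python
import numpy as np
import cma  # pip install cma

PROMPT_DIM = 50  # dimensionality of the continuous prompt being searched


def query_ptm_api(prompt_vector: np.ndarray) -> float:
    """Hypothetical stand-in for a black-box PTM inference API.

    In the real setting the prompt vector would be prepended to the input,
    the service would return predictions on a few labeled samples, and this
    function would return the resulting loss (lower is better). Here a toy
    quadratic objective keeps the sketch self-contained.
    """
    target = np.linspace(-1.0, 1.0, PROMPT_DIM)
    return float(np.sum((prompt_vector - target) ** 2))


def black_box_tune(iterations: int = 50) -> np.ndarray:
    """Optimize the prompt with CMA-ES using only scalar loss feedback."""
    es = cma.CMAEvolutionStrategy(PROMPT_DIM * [0.0], 0.5)  # initial mean 0, step size 0.5
    for _ in range(iterations):
        candidates = es.ask()                                # sample candidate prompts
        losses = [query_ptm_api(np.asarray(c)) for c in candidates]
        es.tell(candidates, losses)                          # update the search distribution
    return np.asarray(es.result.xbest)


if __name__ == "__main__":
    best_prompt = black_box_tune()
    print("best loss:", query_ptm_api(best_prompt))
```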
arXiv Detail & Related papers (2022-01-10T18:17:05Z)
- Ranking and Tuning Pre-trained Models: A New Paradigm of Exploiting Model Hubs [136.4492678691406]
We propose a new paradigm of exploiting model hubs by ranking and tuning pre-trained models.
The best ranked PTM can be fine-tuned and deployed if we have no preference for the model's architecture.
The tuning part introduces a novel method for tuning multiple PTMs, which surpasses dedicated methods.
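As a hedged sketch of the "rank, then tune the best" workflow (using a simple cross-validated linear probe as the ranking signal, not the paper's own ranking or tuning method), the following Python snippet scores candidate PTMs by how well a linear classifier separates a small labeled probe set using each model's features; the model names and random features are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def rank_ptms_by_linear_probe(features_by_model: dict, labels: np.ndarray) -> list:
    """Rank candidate PTMs by cross-validated linear-probe accuracy on their features.

    `features_by_model` maps a model name to the (n_samples, dim) feature matrix it
    produces for a small labeled probe set; feature extraction itself is omitted here.
    """
    scores = {}
    for name, feats in features_by_model.items():
        probe = LogisticRegression(max_iter=1000)
        scores[name] = float(cross_val_score(probe, feats, labels, cv=3).mean())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=60)
    candidates = {  # toy random features standing in for real PTM embeddings
        "ptm-a": rng.normal(size=(60, 16)),
        "ptm-b": rng.normal(size=(60, 16)),
    }
    ranking = rank_ptms_by_linear_probe(candidates, labels)
    print("ranking (best first):", ranking)
    print("fine-tune the top-ranked model next:", ranking[0][0])
```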
arXiv Detail & Related papers (2021-10-20T12:59:23Z)
- Pre-Trained Models: Past, Present and Future [126.21572378910746]
Large-scale pre-trained models (PTMs) have recently achieved great success and become a milestone in the field of artificial intelligence (AI).
By storing knowledge in huge numbers of parameters and fine-tuning on specific tasks, the rich knowledge implicitly encoded in those parameters can benefit a variety of downstream tasks.
It is now the consensus of the AI community to adopt PTMs as the backbone for downstream tasks rather than learning models from scratch.
arXiv Detail & Related papers (2021-06-14T02:40:32Z)
- Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews [2.66512000865131]
We study the accuracy and time efficiency of pre-trained neural language models (PTMs) for app review classification.
We set up different studies to evaluate PTMs in multiple settings.
In all cases, micro and macro precision, recall, and F1-scores are used.
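For reference, micro- and macro-averaged precision, recall, and F1 can be computed as in the short sketch below (scikit-learn shown; the toy labels are invented for illustration and are not from the study).

```python
from sklearn.metrics import precision_recall_fscore_support

# Toy app-review labels (invented): 0 = bug report, 1 = feature request, 2 = other.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 2, 1, 1, 2, 0, 2, 1]

for average in ("micro", "macro"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=average)
    print(f"{average}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```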
arXiv Detail & Related papers (2021-04-12T23:23:45Z)
- Pre-trained Models for Natural Language Processing: A Survey [75.95500552357429]
The emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era.
This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
arXiv Detail & Related papers (2020-03-18T15:22:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.