"I see models being a whole other thing": An Empirical Study of Pre-Trained Model Naming Conventions and A Tool for Enhancing Naming Consistency
- URL: http://arxiv.org/abs/2310.01642v3
- Date: Mon, 18 Aug 2025 22:56:45 GMT
- Title: "I see models being a whole other thing": An Empirical Study of Pre-Trained Model Naming Conventions and A Tool for Enhancing Naming Consistency
- Authors: Wenxin Jiang, Mingyu Kim, Chingwo Cheung, Heesoo Kim, George K. Thiruvathukal, James C. Davis
- Abstract summary: We conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry. We introduce DARA, the first automated DNN ARchitecture Assessment technique designed to detect PTM naming inconsistencies.
- Score: 4.956536094440504
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As innovation in deep learning continues, many engineers are incorporating Pre-Trained Models (PTMs) as components in computer systems. Some PTMs are foundation models, and others are fine-tuned variations adapted to different needs. When these PTMs are named well, it facilitates model discovery and reuse. However, prior research has shown that model names are not always well chosen and can sometimes be inaccurate and misleading. The naming practices for PTM packages have not been systematically studied, which hampers engineers' ability to efficiently search for and reliably reuse these models. In this paper, we conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry. We begin by reporting on a survey of 108 Hugging Face users, highlighting differences from traditional software package naming and presenting findings on PTM naming practices. The survey results indicate a mismatch between engineers' preferences and current practices in PTM naming. We then introduce DARA, the first automated DNN ARchitecture Assessment technique designed to detect PTM naming inconsistencies. Our results demonstrate that architectural information alone is sufficient to detect these inconsistencies, achieving an accuracy of 94% in identifying model types and promising performance (over 70%) in other architectural metadata as well. We also highlight potential use cases for automated naming tools, such as model validation, PTM metadata generation and verification, and plagiarism detection. Our study provides a foundation for automating naming inconsistency detection. Finally, we envision future work focusing on automated tools for standardizing package naming, improving model selection and reuse, and strengthening the security of the PTM supply chain.
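The abstract gives no implementation detail for DARA, but the underlying idea — checking whether a model's name is consistent with metadata recovered from its actual architecture — can be sketched. The heuristic below is a hypothetical illustration, not DARA itself; it compares the `model_type` recorded in a Hugging Face config against the architecture family a reader would infer from the repository name.

```python
# Minimal sketch of architecture-vs-name consistency checking.
# NOT the DARA technique from the paper, only an illustration: it
# compares the model_type stored in a Hugging Face config with the
# family name implied by the repository identifier.
from transformers import AutoConfig

# Hypothetical mapping from name substrings to expected model types.
NAME_HINTS = {
    "bert": "bert",
    "roberta": "roberta",
    "gpt2": "gpt2",
    "t5": "t5",
}

def claimed_type(repo_id: str) -> str | None:
    """Guess the architecture family a model name claims to be."""
    lowered = repo_id.lower()
    # Check longer hints first so "roberta" is not matched as "bert".
    for hint in sorted(NAME_HINTS, key=len, reverse=True):
        if hint in lowered:
            return NAME_HINTS[hint]
    return None

def check_name_consistency(repo_id: str) -> str:
    """Compare the name-implied type with the config's actual model_type."""
    actual = AutoConfig.from_pretrained(repo_id).model_type
    claimed = claimed_type(repo_id)
    if claimed is None:
        return f"{repo_id}: no architecture claim in name (actual: {actual})"
    if claimed == actual:
        return f"{repo_id}: consistent ({actual})"
    return f"{repo_id}: INCONSISTENT (name says {claimed}, config says {actual})"

if __name__ == "__main__":
    print(check_name_consistency("roberta-base"))
```

A real detector would rely on the computation graph rather than config metadata, since the config itself can be mislabeled; the sketch only shows where the name/architecture comparison happens.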
Related papers
- Software Dependencies 2.0: An Empirical Study of Reuse and Integration of Pre-Trained Models in Open-Source Projects [9.22889135297242]
Pre-trained models (PTMs) are machine learning models that have been trained in advance, often on large-scale data, and can be reused for new tasks. Reusing PTMs introduces a new class of software dependency, which we term Software Dependencies 2.0.
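As an illustration of what such a dependency looks like in practice, the sketch below scans a project's Python files for Hugging Face `from_pretrained` calls with literal model identifiers. The regex and file walk are illustrative assumptions, not the paper's mining pipeline.

```python
# Minimal sketch: surface "Software Dependencies 2.0" candidates by
# scanning a project's Python files for from_pretrained() calls with a
# literal model identifier. Illustrative only; the paper's actual
# mining pipeline is not described in the snippet above.
import re
from pathlib import Path

PTM_CALL = re.compile(r"""\.from_pretrained\(\s*["']([^"']+)["']""")

def find_ptm_dependencies(project_root: str) -> set[str]:
    deps: set[str] = set()
    for path in Path(project_root).rglob("*.py"):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # skip unreadable files
        deps.update(PTM_CALL.findall(text))
    return deps

if __name__ == "__main__":
    for model_id in sorted(find_ptm_dependencies(".")):
        print(model_id)
```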
arXiv Detail & Related papers (2025-09-07T15:00:22Z) - How do Pre-Trained Models Support Software Engineering? An Empirical Study in Hugging Face [52.257764273141184]
Open-Source Pre-Trained Models (PTMs) provide extensive resources for various Machine Learning (ML) tasks. These resources lack a classification tailored to Software Engineering (SE) needs. We derive a taxonomy encompassing 147 SE tasks and apply an SE-oriented classification to PTMs in a popular open-source ML repository, Hugging Face (HF). We find that code generation is the most common SE task among PTMs, while requirements engineering and software design activities receive limited attention.
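A minimal sketch of what an SE-oriented classification could look like, assuming simple keyword matching over model-card text; the three tasks and their keywords below are hypothetical stand-ins for the paper's 147-task taxonomy.

```python
# Toy sketch of an SE-oriented classification of PTMs: match model-card
# text against keywords for a few SE tasks. The paper's taxonomy covers
# 147 tasks; the tasks and keywords here are hypothetical.
SE_TASK_KEYWORDS = {
    "code generation": ["code generation", "text-to-code", "program synthesis"],
    "defect prediction": ["bug", "defect", "vulnerability"],
    "code summarization": ["code summarization", "comment generation"],
}

def classify_se_tasks(model_card_text: str) -> list[str]:
    lowered = model_card_text.lower()
    return [
        task
        for task, keywords in SE_TASK_KEYWORDS.items()
        if any(kw in lowered for kw in keywords)
    ]

print(classify_se_tasks("A T5 model fine-tuned for text-to-code generation."))
# -> ['code generation']
```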
arXiv Detail & Related papers (2025-06-03T15:51:17Z) - Towards a Classification of Open-Source ML Models and Datasets for Software Engineering [52.257764273141184]
Open-source Pre-Trained Models (PTMs) and datasets provide extensive resources for various Machine Learning (ML) tasks.
These resources lack a classification tailored to Software Engineering (SE) needs.
We apply an SE-oriented classification to PTMs and datasets on a popular open-source ML repository, Hugging Face (HF), and analyze the evolution of PTMs over time.
arXiv Detail & Related papers (2024-11-14T18:52:05Z) - Automated categorization of pre-trained models for software engineering: A case study with a Hugging Face dataset [9.218130273952383]
Software engineering activities have been revolutionized by the advent of pre-trained models (PTMs).
The Hugging Face (HF) platform simplifies the use of PTMs by collecting, storing, and curating several models.
This paper introduces an approach to enable the automatic classification of PTMs for SE tasks.
arXiv Detail & Related papers (2024-05-21T20:26:17Z) - PeaTMOSS: Mining Pre-Trained Models in Open-Source Software [6.243303627949341]
We present the PeaTMOSS dataset: Pre-Trained Models in Open-Source Software.
PeaTMOSS has three parts: (1) a snapshot of 281,638 PTMs, (2) 27,270 open-source software repositories that use PTMs, and (3) a mapping between PTMs and the projects that use them.
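A sketch of how those three parts might be joined in practice, assuming the dataset is exposed as a relational database; the SQLite table and column names below are hypothetical, not the dataset's actual schema.

```python
# Sketch of joining the three parts of a PeaTMOSS-style dataset:
# PTMs, downstream repositories, and the mapping between them.
# The schema (table/column names) is hypothetical.
import sqlite3

QUERY = """
SELECT m.name AS ptm, r.url AS repository
FROM model AS m
JOIN model_to_repo AS link ON link.model_id = m.id
JOIN repo AS r ON r.id = link.repo_id
WHERE m.name LIKE ?
"""

def projects_using(db_path: str, name_pattern: str) -> list[tuple[str, str]]:
    """Return (PTM, repository) pairs for models matching the pattern."""
    with sqlite3.connect(db_path) as conn:
        return conn.execute(QUERY, (name_pattern,)).fetchall()

# e.g. projects_using("peatmoss.db", "%bert%")
```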
arXiv Detail & Related papers (2023-10-05T15:58:45Z) - ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with PTM, tuning target model with PTM, and PTM-based inference.
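The snippet above names the paradigm without showing it; the plain-PyTorch sketch below illustrates one of the listed perspectives — constructing a target model around a PTM backbone and tuning it. This is not the ZhiJian API, just the underlying idea.

```python
# Plain-PyTorch sketch of the reuse paradigm described above: build a
# target architecture around a frozen PTM backbone, then tune only the
# new head. Not the ZhiJian API.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

class ReusedModel(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        self.backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
        self.backbone.fc = nn.Identity()  # drop the original classifier
        for p in self.backbone.parameters():
            p.requires_grad = False       # freeze pre-trained weights
        self.head = nn.Linear(512, num_classes)  # new tunable head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))

model = ReusedModel(num_classes=10)
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)
```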
arXiv Detail & Related papers (2023-08-17T19:12:13Z) - Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need [84.3507610522086]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring.
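One way to picture the "generalizability" half of that argument is a classifier that never updates the PTM at all: each new class is represented by the mean of its frozen-PTM embeddings. The sketch below is a simplified illustration of this idea, not the paper's full method.

```python
# Minimal sketch: class-incremental learning on top of a frozen PTM.
# Each new class is added by storing a prototype (the mean of its
# embeddings); no weights are updated, so nothing is forgotten.
import numpy as np

class PrototypeCIL:
    def __init__(self):
        self.prototypes: dict[int, np.ndarray] = {}

    def add_class(self, label: int, embeddings: np.ndarray) -> None:
        """embeddings: (n_samples, dim) features from a frozen PTM."""
        self.prototypes[label] = embeddings.mean(axis=0)

    def predict(self, embedding: np.ndarray) -> int:
        """Nearest-class-mean classification over all classes seen so far."""
        return min(
            self.prototypes,
            key=lambda c: np.linalg.norm(embedding - self.prototypes[c]),
        )

cil = PrototypeCIL()
rng = np.random.default_rng(0)
cil.add_class(0, rng.normal(0.0, 1.0, (20, 8)))  # stand-ins for PTM features
cil.add_class(1, rng.normal(3.0, 1.0, (20, 8)))
print(cil.predict(rng.normal(3.0, 1.0, 8)))  # -> 1
```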
arXiv Detail & Related papers (2023-03-13T17:59:02Z) - An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry [2.1346819928536687]
Machine learning engineers have begun to reuse large-scale pre-trained models (PTMs).
We interviewed 12 practitioners from the most popular PTM ecosystem, Hugging Face, to learn the practices and challenges of PTM reuse.
Three challenges for PTM reuse are missing attributes, discrepancies between claimed and actual performance, and model risks.
arXiv Detail & Related papers (2023-03-05T02:28:15Z) - Ranking and Tuning Pre-trained Models: A New Paradigm of Exploiting Model Hubs [136.4492678691406]
We propose a new paradigm of exploiting model hubs by ranking and tuning pre-trained models.
The best-ranked PTM can be fine-tuned and deployed if we have no preference for the model's architecture. The tuning part introduces a novel method for tuning multiple PTMs, which surpasses dedicated methods.
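A sketch of the rank-then-tune workflow, with a cross-validated linear probe standing in for the paper's transferability estimator (the ranking metric here is an assumption, not the paper's method):

```python
# Sketch of rank-then-tune: score each candidate PTM's features on the
# target task, then fine-tune the best-ranked one. A linear probe
# stands in for the paper's transferability estimator.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def rank_ptms(features_by_ptm: dict[str, np.ndarray], y: np.ndarray) -> list[str]:
    """Rank PTMs by cross-validated linear-probe accuracy on target labels."""
    scores = {
        name: cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=3).mean()
        for name, X in features_by_ptm.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical pre-extracted features from two candidate PTMs:
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 100)
candidates = {
    "ptm_a": rng.normal(size=(100, 16)),               # uninformative
    "ptm_b": rng.normal(size=(100, 16)) + y[:, None],  # label-correlated
}
print(rank_ptms(candidates, y))  # 'ptm_b' should rank first
```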
arXiv Detail & Related papers (2021-10-20T12:59:23Z) - Pre-Trained Models: Past, Present and Future [126.21572378910746]
Large-scale pre-trained models (PTMs) have recently achieved great success and become a milestone in the field of artificial intelligence (AI).
By storing knowledge in huge parameter sets and fine-tuning on specific tasks, the rich knowledge implicitly encoded in those parameters can benefit a variety of downstream tasks.
It is now the consensus of the AI community to adopt PTMs as the backbone for downstream tasks rather than learning models from scratch.
arXiv Detail & Related papers (2021-06-14T02:40:32Z) - Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews [2.66512000865131]
We study the accuracy and time efficiency of pre-trained neural language models (PTMs) for app review classification.
We set up different studies to evaluate PTMs in multiple settings.
In all cases, Micro and Macro Precision, Recall, and F1-scores are used.
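For reference, these averaged metrics can be computed directly with scikit-learn; the labels below are made up for illustration.

```python
# Computing the metrics named above: micro- and macro-averaged
# precision, recall, and F1 for a multi-class classifier.
from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 1, 2, 2, 1, 0, 2]  # hypothetical gold labels
y_pred = [0, 1, 2, 1, 1, 0, 0]  # hypothetical predictions

for average in ("micro", "macro"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=average)
    print(f"{average}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```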
arXiv Detail & Related papers (2021-04-12T23:23:45Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Pre-trained Models for Natural Language Processing: A Survey [75.95500552357429]
The emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era.
This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
arXiv Detail & Related papers (2020-03-18T15:22:51Z)