Model Spider: Learning to Rank Pre-Trained Models Efficiently
- URL: http://arxiv.org/abs/2306.03900v1
- Date: Tue, 6 Jun 2023 17:58:12 GMT
- Title: Model Spider: Learning to Rank Pre-Trained Models Efficiently
- Authors: Yi-Kai Zhang, Ting-Ji Huang, Yao-Xiang Ding, De-Chuan Zhan, Han-Jia Ye
- Abstract summary: Model Spider learns to construct tokens and measure the fitness score between a model-task pair via their tokens.
Model Spider balances efficiency and selection ability, making PTM selection like a spider preying on a web.
- Score: 42.56392378060269
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Figuring out which Pre-Trained Model (PTM) from a model zoo fits the target
task is essential to take advantage of plentiful model resources. With the
availability of numerous heterogeneous PTMs from diverse fields, efficiently
selecting the most suitable PTM is challenging due to the time-consuming costs
of carrying out forward or backward passes over all PTMs. In this paper, we
propose Model Spider, which tokenizes both PTMs and tasks by summarizing their
characteristics into vectors to enable efficient PTM selection. By leveraging
the approximated performance of PTMs on a separate set of training tasks, Model
Spider learns to construct tokens and measure the fitness score between a
model-task pair via their tokens. The ability to rank relevant PTMs higher than
others generalizes to new tasks. With the top-ranked PTM candidates, we further
learn to enrich task tokens with their PTM-specific semantics to re-rank the
PTMs for better selection. Model Spider balances efficiency and selection
ability, making PTM selection like a spider preying on a web. Model Spider
demonstrates promising performance in various configurations of model zoos.
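As a hedged illustration of the token-and-score idea described in the abstract, the sketch below shows one way such a ranker could look in PyTorch: each PTM gets a learned token, the target task is summarized into a token, and a learned head scores model-task fitness so the zoo can be ranked without running every PTM. The class name, dimensions, scoring head, and training note are illustrative assumptions, not the paper's actual architecture or recipe.

```python
# Minimal sketch (assumption-laden) of tokenizing PTMs and tasks and scoring
# their fitness, so PTMs can be ranked without forward/backward passes over
# the whole zoo. Names and architecture are illustrative, not the paper's.
import torch
import torch.nn as nn


class ModelTaskRanker(nn.Module):
    def __init__(self, num_ptms: int, feat_dim: int, token_dim: int = 128):
        super().__init__()
        # One learnable token per PTM in the zoo.
        self.ptm_tokens = nn.Embedding(num_ptms, token_dim)
        # Summarize pooled target-task features into a task token.
        self.task_encoder = nn.Sequential(
            nn.Linear(feat_dim, token_dim), nn.ReLU(), nn.Linear(token_dim, token_dim)
        )
        # Fitness head over each (model token, task token) pair.
        self.scorer = nn.Sequential(
            nn.Linear(2 * token_dim, token_dim), nn.ReLU(), nn.Linear(token_dim, 1)
        )

    def forward(self, task_feats: torch.Tensor) -> torch.Tensor:
        """task_feats: (feat_dim,) summary of the target task; returns (num_ptms,) scores."""
        task_token = self.task_encoder(task_feats)          # (token_dim,)
        model_tokens = self.ptm_tokens.weight                # (num_ptms, token_dim)
        pairs = torch.cat([model_tokens, task_token.expand_as(model_tokens)], dim=-1)
        return self.scorer(pairs).squeeze(-1)                # fitness score per PTM


# Training would fit these scores to approximated PTM performance on a separate
# set of training tasks (e.g., with a ranking loss); at selection time the
# top-ranked candidates can be re-scored with richer, PTM-specific task
# features, mirroring the re-ranking step mentioned in the abstract.
ranker = ModelTaskRanker(num_ptms=10, feat_dim=512)
scores = ranker(torch.randn(512))
top_k = torch.topk(scores, k=3).indices  # candidate PTMs to re-rank / fine-tune
```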
Related papers
- MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction [65.33218256339151]
Post-translational modifications (PTMs) profoundly expand the complexity and functionality of the proteome.
Existing computational approaches predominantly focus on protein sequences to predict PTM sites, driven by the recognition of sequence-dependent motifs.
We introduce the MeToken model, which tokenizes the micro-environment of each amino acid, integrating both sequence and structural information into unified discrete tokens.
arXiv Detail & Related papers (2024-11-04T07:14:28Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
Elect, Mask & Rescale-Merging (EMR-Merging) shows outstanding performance compared to existing merging methods (see the illustrative sketch after this list).
EMR-Merging is tuning-free, requiring no data availability or additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Automated categorization of pre-trained models for software engineering: A case study with a Hugging Face dataset [9.218130273952383]
Software engineering activities have been revolutionized by the advent of pre-trained models (PTMs).
The Hugging Face (HF) platform simplifies the use of PTMs by collecting, storing, and curating several models.
This paper introduces an approach to enable the automatic classification of PTMs for SE tasks.
arXiv Detail & Related papers (2024-05-21T20:26:17Z)
- Rethinking Class-incremental Learning in the Era of Large Pre-trained Models via Test-Time Adaptation [20.62749699589017]
Class-incremental learning (CIL) is a challenging task that involves sequentially learning to categorize classes from new tasks.
We propose Test-Time Adaptation for Class-Incremental Learning (TTACIL) that first fine-tunes PTMs using Adapters on the first task.
Our TTACIL does not undergo any forgetting, while benefiting each task with the rich PTM features.
arXiv Detail & Related papers (2023-10-17T13:06:39Z)
- Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need [84.3507610522086]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring.
arXiv Detail & Related papers (2023-03-13T17:59:02Z)
- ZooD: Exploiting Model Zoo for Out-of-Distribution Generalization [65.58562481279023]
We propose ZooD, a paradigm for PTM ranking and ensembling with feature selection.
We evaluate our paradigm on a diverse model zoo consisting of 35 models for various Out-of-Distribution (OoD) tasks.
arXiv Detail & Related papers (2022-10-17T16:31:57Z)
- Ranking and Tuning Pre-trained Models: A New Paradigm of Exploiting Model Hubs [136.4492678691406]
We propose a new paradigm of exploiting model hubs by ranking and tuning pre-trained models.
The best ranked PTM can be fine-tuned and deployed if we have no preference for the model's architecture.
The tuning part introduces a novel method for tuning multiple PTMs, which surpasses dedicated methods.
arXiv Detail & Related papers (2021-10-20T12:59:23Z)
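
The elect, mask and rescale idea named in the EMR-Merging entry above can be made concrete with a rough sketch. It assumes fine-tuned checkpoints that share a common base model (i.e., task vectors); the sign-election rule, the unified-magnitude choice, and the rescaler formula below follow common task-arithmetic conventions chosen for illustration and may differ from the paper's exact procedure.

```python
# Rough, hedged sketch of an elect-mask-rescale style merge over task vectors.
# Not the paper's exact algorithm; details here are illustrative assumptions.
import torch


def emr_style_merge(base: dict, finetuned: list):
    unified = {}
    masks = [dict() for _ in finetuned]
    rescalers = [dict() for _ in finetuned]
    for name, w0 in base.items():
        task_vecs = [ft[name] - w0 for ft in finetuned]     # per-task deltas
        stacked = torch.stack(task_vecs)                     # (num_tasks, ...)
        sign = torch.sign(stacked.sum(dim=0))                # elected per-parameter sign
        agree = torch.sign(stacked) == sign                  # where each task agrees
        # Unified magnitude: largest agreeing delta per parameter (one simple choice).
        unified[name] = sign * (stacked.abs() * agree).max(dim=0).values
        for t, tv in enumerate(task_vecs):
            masks[t][name] = agree[t]                        # lightweight task-specific mask
            denom = (agree[t] * unified[name].abs()).sum().clamp_min(1e-8)
            rescalers[t][name] = (agree[t] * tv.abs()).sum() / denom  # scalar rescaler
    return unified, masks, rescalers


def reconstruct(base, unified, mask, rescaler):
    # Task-specific weights ~ base + rescaler * (mask * unified task vector);
    # no extra training or data is needed, matching the "tuning-free" claim.
    return {n: base[n] + rescaler[n] * mask[n] * unified[n] for n in base}
```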
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.