Model Reuse with Reduced Kernel Mean Embedding Specification
- URL: http://arxiv.org/abs/2001.07135v1
- Date: Mon, 20 Jan 2020 15:15:07 GMT
- Title: Model Reuse with Reduced Kernel Mean Embedding Specification
- Authors: Xi-Zhu Wu, Wenkai Xu, Song Liu, and Zhi-Hua Zhou
- Abstract summary: We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then, in the deployment phase, the relatedness between the current task and the pre-trained models is measured based on the value of the RKME specification.
- Score: 70.044322798187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a publicly available pool of machine learning models constructed for
various tasks, when a user plans to build a model for her own machine learning
application, is it possible to build upon models in the pool such that the
previous efforts on these existing models can be reused rather than starting
from scratch? Here, a grand challenge is how to find models that are helpful
for the current application, without accessing the raw training data for the
models in the pool. In this paper, we present a two-phase framework. In the
upload phase, when a model is uploaded into the pool, we construct a reduced
kernel mean embedding (RKME) as a specification for the model. Then, in the
deployment phase, the relatedness between the current task and the pre-trained
models is measured based on the value of the RKME specification. Theoretical
results and extensive experiments validate the effectiveness of our approach.
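As a rough illustration of the two phases, the sketch below builds an RKME-style specification for an uploaded model's training data and then scores task relatedness at deployment time. It is a minimal sketch under simplifying assumptions (an RBF kernel, reduced points picked by k-means, least-squares weights); the function names build_rkme and rkme_task_distance are illustrative placeholders, not the authors' implementation, which optimizes the reduced set and handles further details.

```python
import numpy as np
from sklearn.cluster import KMeans

def rbf_kernel(A, B, gamma=1.0):
    """Pairwise RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.clip(sq, 0.0, None))

def build_rkme(X, m=10, gamma=1.0):
    """Upload phase: summarize training data X (n x d) by m weighted points
    (Z, beta) whose weighted kernel mean approximates the empirical KME of X."""
    Z = KMeans(n_clusters=m, n_init=10).fit(X).cluster_centers_
    K_zz = rbf_kernel(Z, Z, gamma)
    K_zx = rbf_kernel(Z, X, gamma)
    # Least-squares weights: minimize || (1/n) sum_i k(x_i,.) - sum_j beta_j k(z_j,.) ||_H^2
    beta = np.linalg.solve(K_zz + 1e-8 * np.eye(m), K_zx.mean(axis=1))
    return Z, beta

def rkme_task_distance(X_new, Z, beta, gamma=1.0):
    """Deployment phase: squared RKHS distance between the empirical KME of the
    current task's data and a stored RKME specification (smaller = more related)."""
    t1 = rbf_kernel(X_new, X_new, gamma).mean()
    t2 = beta @ rbf_kernel(Z, Z, gamma) @ beta
    t3 = 2.0 * beta @ rbf_kernel(Z, X_new, gamma).mean(axis=1)
    return t1 + t2 - t3

# Deployment-time selection over a pool of (Z, beta) specifications:
# best_spec = min(pool, key=lambda spec: rkme_task_distance(X_new, spec[0], spec[1]))
```

In this simplified form the specification is just the small weighted set (Z, beta), so it can be published without exposing the raw training data, and selecting a helpful model reduces to picking the pool entry whose specification is closest, in RKHS distance, to the new task's data.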
Related papers
- WAVE: Weight Template for Adaptive Initialization of Variable-sized Models [37.97945436202779]
WAVE achieves state-of-the-art performance when initializing models of various depths and widths.
WAVE simultaneously achieves the most efficient knowledge transfer across a series of datasets.
arXiv Detail & Related papers (2024-06-25T12:43:33Z)
- Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, requiring no data or additional training, while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- MGE: A Training-Free and Efficient Model Generation and Enhancement Scheme [10.48591131837771]
This paper proposes a Training-Free and Efficient Model Generation and Enhancement Scheme (MGE).
It considers two aspects during the model generation process: the distribution of model parameters and model performance.
Experimental results show that the generated models are comparable to models obtained through normal training, and even superior in some cases.
arXiv Detail & Related papers (2024-02-27T13:12:00Z)
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with a PTM, tuning the target model with a PTM, and PTM-based inference.
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
- Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models [11.57282859281814]
We consider different knowledge levels and attribution strategies, and find that we can correctly trace back 8 out of the 10 fine-tuned models with our best method.
arXiv Detail & Related papers (2023-06-15T17:42:48Z)
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This poses a critical challenge for the real-world application of foundation models: the knowledge of a foundation model has to be transferred to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models, but the training data behind individual fine-tuned models is often unavailable.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space (a simplified weight-averaging sketch appears after this list).
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Assemble Foundation Models for Automatic Code Summarization [9.53949558569201]
We propose a flexible and robust approach for automatic code summarization based on neural networks.
We assemble available foundation models, such as CodeBERT and GPT-2, into a single model named AdaMo.
We introduce two adaptive schemes from the perspective of knowledge transfer, namely continuous pretraining and intermediate finetuning.
arXiv Detail & Related papers (2022-01-13T21:38:33Z)
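Two of the entries above (EMR-Merging and Dataless Knowledge Fusion) revolve around combining fine-tuned models directly in parameter space. The snippet below is a minimal sketch of that general idea, using plain (weighted) parameter averaging of checkpoints that share one architecture; both papers' actual methods go well beyond simple averaging, and the checkpoint file names shown are hypothetical.

```python
import torch

def average_state_dicts(state_dicts, weights=None):
    """Merge models that share an architecture by (weighted) averaging of
    their parameters; a simplified stand-in for parameter-space fusion."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
    return merged

# Hypothetical usage with two checkpoints fine-tuned from the same pre-trained model:
# merged = average_state_dicts([torch.load("model_a.pt"), torch.load("model_b.pt")])
# model.load_state_dict(merged)
```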
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.