Building a Winning Team: Selecting Source Model Ensembles using a
Submodular Transferability Estimation Approach
- URL: http://arxiv.org/abs/2309.02429v1
- Date: Tue, 5 Sep 2023 17:57:31 GMT
- Authors: Vimal K B, Saketh Bachu, Tanmay Garg, Niveditha Lakshmi Narasimhan,
Raghavan Konuru and Vineeth N Balasubramanian
- Abstract summary: Estimating the transferability of publicly available pretrained models to a target task has assumed an important place for transfer learning tasks.
We propose a novel Optimal tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the transferability of an ensemble of models to a downstream task.
- Score: 20.86345962679122
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating the transferability of publicly available pretrained models to a
target task has assumed an important place for transfer learning tasks in
recent years. Existing efforts propose metrics that allow a user to choose one
model from a pool of pre-trained models without having to fine-tune each model
individually to identify the best one. With the growth in the number of
available pre-trained models and the popularity of model ensembles, it also
becomes essential to study the transferability of multiple source models for a
given target task. The few existing efforts study transferability in such
multi-source ensemble settings using just the outputs of the classification
layer and neglect possible domain or task mismatch. Moreover, they overlook a
key factor in selecting the source models, namely the cohesiveness between
them, which can impact both the performance and the prediction confidence of
the ensemble. To address these gaps, we propose a novel Optimal
tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the
transferability of an ensemble of models to a downstream task. OSBORN
collectively accounts for image domain difference, task difference, and
cohesiveness of models in the ensemble to provide reliable estimates of
transferability. We gauge the performance of OSBORN on both image
classification and semantic segmentation tasks. Our setup includes 28 source
datasets, 11 target datasets, 5 model architectures, and 2 pre-training
methods. We benchmark OSBORN against the current state-of-the-art metrics
MS-LEEP and E-LEEP and consistently outperform both.
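To make the selection procedure concrete, here is a minimal Python sketch of greedy ensemble selection under a submodular-style set score. The three terms mirror the factors the abstract names (domain difference, task difference, cohesiveness), but their definitions below are illustrative stand-ins, not OSBORN's optimal-transport formulation.

```python
# Minimal sketch: greedy source-ensemble selection with a submodular-style
# set score. The unary and pairwise terms are assumed to be precomputed
# lookups; in OSBORN they would come from optimal transport between source
# and target representations.

def ensemble_score(ensemble, domain_diff, task_diff, cohesion):
    """Higher is better: low domain/task difference, high cohesiveness."""
    if not ensemble:
        return 0.0
    fit = -sum(domain_diff[m] + task_diff[m] for m in ensemble)
    coh = sum(cohesion[a][b] for a in ensemble for b in ensemble if a < b)
    return fit + coh

def greedy_select(models, k, domain_diff, task_diff, cohesion):
    """Pick k source models by greedy marginal gain, the standard greedy
    routine for (approximately) maximizing submodular set objectives."""
    chosen = set()
    for _ in range(k):
        best = max(models - chosen,
                   key=lambda m: ensemble_score(chosen | {m}, domain_diff,
                                                task_diff, cohesion))
        chosen.add(best)
    return chosen
```

Called as `greedy_select({"resnet", "vit", "swin"}, 2, ...)` with the three lookup tables (`models` given as a set of ids), this returns the pair whose combined fit and mutual cohesiveness score highest; greedy selection is what makes submodularity attractive here, since it carries a constant-factor approximation guarantee for monotone submodular objectives.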
Related papers
- A Two-Phase Recall-and-Select Framework for Fast Model Selection [13.385915962994806]
We propose a two-phase (coarse-recall and fine-selection) model selection framework.
It aims to enhance the efficiency of selecting a robust model by leveraging the models' training performances on benchmark datasets.
The proposed methodology is demonstrated to select a high-performing model about 3x faster than conventional baseline methods.
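A schematic of that two-phase pattern, with hypothetical names (the paper's actual ranking features and selection procedure are richer):

```python
# Schematic of the two-phase selection: cheap coarse recall on recorded
# benchmark scores, then a costlier fine selection on target data.
# All names are illustrative, not the paper's API.

def coarse_recall(model_zoo, benchmark_scores, top_k=10):
    """Phase 1: shortlist models by recorded benchmark performance."""
    ranked = sorted(model_zoo, key=lambda m: benchmark_scores[m], reverse=True)
    return ranked[:top_k]

def fine_select(shortlist, proxy_eval):
    """Phase 2: run a costlier proxy (e.g., linear probing on a small
    target subset) only on the shortlist and keep the best model."""
    return max(shortlist, key=proxy_eval)
```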
arXiv Detail & Related papers (2024-03-28T14:44:44Z)
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
However, transferring the pretrained models to downstream tasks may encounter task discrepancy, since pretraining is typically formulated as an image classification or object discrimination task.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
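The multi-task supervised pretraining pattern described here amounts to a shared backbone with per-task heads trained under a summed loss; a minimal sketch with placeholder modules (nothing below is the paper's architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of multi-task supervised pretraining: one shared backbone,
# one head per task, losses summed. Modules and shapes are placeholders.

class MultiTaskModel(nn.Module):
    def __init__(self, feat_dim=64, n_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(feat_dim, n_classes)  # stand-ins for real
        self.box_head = nn.Linear(feat_dim, 4)          # seg/detection heads

    def forward(self, x):
        f = self.backbone(x)
        return self.cls_head(f), self.box_head(f)

model = MultiTaskModel()
x = torch.randn(2, 3, 32, 32)
logits, boxes = model(x)
loss = (F.cross_entropy(logits, torch.randint(0, 10, (2,)))
        + F.smooth_l1_loss(boxes, torch.randn(2, 4)))
loss.backward()  # one optimizer step then updates backbone + all heads
```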
arXiv Detail & Related papers (2024-03-20T09:17:22Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
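For reference, the generic distillation step that such model-to-model transfer builds on is sketched below; the paper's contribution is making transfer work between arbitrary pretrained pairs without degradation, which this plain version does not guarantee.

```python
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic knowledge-distillation loss: KL between temperature-softened
    teacher and student predictions. Shown only as the building block, not
    the paper's refined transfer method."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)
```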
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Universal Semi-supervised Model Adaptation via Collaborative Consistency Training [92.52892510093037]
We introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA).
We propose a collaborative consistency training framework that regularizes the prediction consistency between two models.
Experimental results demonstrate the effectiveness of our method on several benchmark datasets.
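One plausible instantiation of the prediction-consistency regularizer is a symmetric KL term between the two models' outputs on the same batch; a sketch (the paper's exact formulation may differ):

```python
import torch.nn.functional as F

def consistency_loss(logits_a, logits_b):
    """Symmetric KL between two models' predictions on the same batch:
    one plausible form of a prediction-consistency regularizer."""
    log_pa = F.log_softmax(logits_a, dim=-1)
    log_pb = F.log_softmax(logits_b, dim=-1)
    pa, pb = log_pa.exp(), log_pb.exp()
    return 0.5 * (F.kl_div(log_pa, pb, reduction="batchmean")
                  + F.kl_div(log_pb, pa, reduction="batchmean"))
```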
arXiv Detail & Related papers (2023-07-07T08:19:40Z)
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z)
- Towards Estimating Transferability using Hard Subsets [25.86053764521497]
We propose HASTE, a new strategy to estimate the transferability of a source model to a particular target task using only a harder subset of target data.
We show that HASTE can be used with any existing transferability metric to improve their reliability.
Our experimental results across multiple source model architectures, target datasets, and transfer learning tasks show that HASTE-modified metrics are consistently better than, or on par with, state-of-the-art transferability metrics.
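A sketch of the idea, assuming hardness is scored by the source model's top-1 softmax margin (an assumption; the paper's hardness criterion may differ); any base transferability metric is then evaluated on the hard subset only:

```python
import numpy as np

def hard_subset_metric(probs, labels, base_metric, frac=0.5):
    """Evaluate a base transferability metric on the hardest fraction of
    target data. Hardness here = small top-1 softmax margin under the
    source model (an illustrative choice). probs: (N, C); labels: (N,)."""
    top2 = np.sort(probs, axis=1)[:, -2:]      # two largest probabilities
    margin = top2[:, 1] - top2[:, 0]           # small margin = hard example
    hard_idx = np.argsort(margin)[: int(frac * len(margin))]
    return base_metric(probs[hard_idx], labels[hard_idx])
```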
arXiv Detail & Related papers (2023-01-17T14:50:18Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
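The basic parameter-space operation underlying such merging is a weighted average of same-architecture checkpoints, sketched below; the paper's contribution is choosing the weights without data, which plain uniform averaging does not capture.

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Weighted average of same-architecture checkpoints in parameter
    space. Uniform weights give plain model averaging; the paper derives
    better, data-free per-model weights."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    return {key: sum(w * sd[key].float()
                     for w, sd in zip(weights, state_dicts))
            for key in state_dicts[0]}
```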
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Transferability Estimation using Bhattacharyya Class Separability [37.52588126267552]
Transfer learning is a popular method for leveraging pre-trained models in computer vision.
It is difficult to quantify which pre-trained source models are suitable for a specific target task.
We propose a novel method for quantifying transferability between a source model and a target dataset.
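Under a Gaussian model of per-class features, class separability of this kind rests on the standard Bhattacharyya distance between two classes' feature distributions $\mathcal{N}(\mu_1, \Sigma_1)$ and $\mathcal{N}(\mu_2, \Sigma_2)$ (how the paper aggregates pairwise distances into a score is not shown here):

```latex
D_B = \frac{1}{8}\,(\mu_1 - \mu_2)^\top \Sigma^{-1} (\mu_1 - \mu_2)
    + \frac{1}{2}\,\ln\!\frac{\det \Sigma}{\sqrt{\det \Sigma_1 \,\det \Sigma_2}},
\qquad \Sigma = \frac{\Sigma_1 + \Sigma_2}{2}.
```

Larger pairwise distances in the source model's feature space indicate better class separation on the target data, and hence better expected transferability.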
arXiv Detail & Related papers (2021-11-24T20:22:28Z)
- Transferring model structure in Bayesian transfer learning for Gaussian process regression [1.370633147306388]
This paper defines the task of conditioning a target probability distribution on a transferred source distribution.
Fully probabilistic design is adopted to solve this optimal decision-making problem in the target.
By successfully transferring higher moments of the source, the target can reject unreliable source knowledge.
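Schematically, fully probabilistic design selects the target's knowledge-conditioned model as the feasible distribution closest in Kullback-Leibler divergence to an ideal distribution $M^I$ encoding the transferred source knowledge (notation ours, not the paper's):

```latex
M^{o} \;=\; \arg\min_{M \in \mathcal{M}} \; D_{\mathrm{KL}}\left( M \,\middle\|\, M^{I} \right).
```

Because $M^I$ carries the source's higher moments rather than point estimates alone, a high-variance source predictive contributes little, which is how unreliable source knowledge ends up rejected.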
arXiv Detail & Related papers (2021-01-18T05:28:02Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
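The underlying prototype update is a confidence-weighted mean over unlabeled queries; a minimal sketch in which the per-query, per-class confidence is given as input (the paper's point is to meta-learn it):

```python
import torch

def refine_prototypes(prototypes, query_feats, confidence):
    """Transductive prototype refinement: fold every query into every class
    prototype, weighted by a per-query, per-class confidence (given here;
    the paper meta-learns it). prototypes: (C, D); query_feats: (Q, D);
    confidence: (Q, C) with rows summing to 1. The original prototype is
    treated as one unit of mass per class."""
    weighted_sum = confidence.t() @ query_feats        # (C, D)
    mass = confidence.sum(dim=0, keepdim=True).t()     # (C, 1)
    return (prototypes + weighted_sum) / (1.0 + mass)
```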
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.