Transferability Estimation using Bhattacharyya Class Separability
- URL: http://arxiv.org/abs/2111.12780v1
- Date: Wed, 24 Nov 2021 20:22:28 GMT
- Title: Transferability Estimation using Bhattacharyya Class Separability
- Authors: Michal P\'andy and Andrea Agostinelli and Jasper Uijlings and Vittorio
Ferrari and Thomas Mensink
- Abstract summary: Transfer learning is a popular method for leveraging pre-trained models in computer vision.
It is difficult to quantify which pre-trained source models are suitable for a specific target task.
We propose a novel method for quantifying transferability between a source model and a target dataset.
- Score: 37.52588126267552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning has become a popular method for leveraging pre-trained
models in computer vision. However, without performing computationally
expensive fine-tuning, it is difficult to quantify which pre-trained source
models are suitable for a specific target task, or, conversely, to which tasks
a pre-trained source model can be easily adapted to. In this work, we propose
Gaussian Bhattacharyya Coefficient (GBC), a novel method for quantifying
transferability between a source model and a target dataset. In a first step we
embed all target images in the feature space defined by the source model, and
represent them with per-class Gaussians. Then, we estimate their pairwise class
separability using the Bhattacharyya coefficient, yielding a simple and
effective measure of how well the source model transfers to the target task. We
evaluate GBC on image classification tasks in the context of dataset and
architecture selection. Further, we also perform experiments on the more
complex semantic segmentation transferability estimation task. We demonstrate
that GBC outperforms state-of-the-art transferability metrics on most
evaluation criteria in the semantic segmentation settings, matches the
performance of top methods for dataset transferability in image classification,
and performs best on architecture selection problems for image classification.
Related papers
- Efficient Transferability Assessment for Selection of Pre-trained Detectors [63.21514888618542]
This paper studies the efficient transferability assessment of pre-trained object detectors.
We build up a detector transferability benchmark which contains a large and diverse zoo of pre-trained detectors.
Experimental results demonstrate that our method outperforms other state-of-the-art approaches in assessing transferability.
arXiv Detail & Related papers (2024-03-14T14:23:23Z) - Building a Winning Team: Selecting Source Model Ensembles using a
Submodular Transferability Estimation Approach [20.86345962679122]
Estimating the transferability of publicly available pretrained models to a target task has assumed an important place for transfer learning tasks.
We propose a novel Optimal tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the transferability of an ensemble of models to a downstream task.
arXiv Detail & Related papers (2023-09-05T17:57:31Z) - Fast and Accurate Transferability Measurement by Evaluating Intra-class
Feature Variance [20.732095457775138]
Transferability measurement is to quantify how transferable is a pre-trained model learned on a source task to a target task.
We propose TMI (TRANSFERABILITY MEASUREMENT WITH INTRA-CLASS FEATURE VARIANCE), a fast and accurate algorithm to measure transferability.
arXiv Detail & Related papers (2023-08-11T07:50:40Z) - Transferability Metrics for Object Detection [0.0]
Transfer learning aims to make the most of existing pre-trained models to achieve better performance on a new task in limited data scenarios.
We extend transferability metrics to object detection using ROI-Align and TLogME.
We show that TLogME provides a robust correlation with transfer performance and outperforms other transferability metrics on local and global level features.
arXiv Detail & Related papers (2023-06-27T08:49:31Z) - Towards Estimating Transferability using Hard Subsets [25.86053764521497]
We propose HASTE, a new strategy to estimate the transferability of a source model to a particular target task using only a harder subset of target data.
We show that HASTE can be used with any existing transferability metric to improve their reliability.
Our experimental results across multiple source model architectures, target datasets, and transfer learning tasks show that HASTE modified metrics are consistently better or on par with the state of the art transferability metrics.
arXiv Detail & Related papers (2023-01-17T14:50:18Z) - Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language
Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z) - A Trainable Optimal Transport Embedding for Feature Aggregation and its
Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z) - Pairwise Similarity Knowledge Transfer for Weakly Supervised Object
Localization [53.99850033746663]
We study the problem of learning localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z) - Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words"
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.