ETran: Energy-Based Transferability Estimation
- URL: http://arxiv.org/abs/2308.02027v1
- Date: Thu, 3 Aug 2023 20:41:08 GMT
- Title: ETran: Energy-Based Transferability Estimation
- Authors: Mohsen Gholami, Mohammad Akbari, Xinglu Wang, Behnam Kamranian, Yong Zhang
- Abstract summary: We argue that quantifying whether the target dataset is in-distribution (IND) or out-of-distribution (OOD) for the pre-trained model is an important factor in the transferability estimation.
We use energy-based models to determine whether the target dataset is OOD or IND for the pre-trained model.
This is the first work to propose transferability estimation for the object detection task.
- Score: 8.15331116331861
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the problem of ranking pre-trained models for object
detection and image classification. Selecting the best pre-trained model by
fine-tuning is an expensive and time-consuming task. Previous works have
proposed transferability estimation based on features extracted by the
pre-trained models. We argue that quantifying whether the target dataset is
in-distribution (IND) or out-of-distribution (OOD) for the pre-trained model is
an important factor in the transferability estimation. To this end, we propose
ETran, an energy-based transferability assessment metric, which includes three
scores: 1) energy score, 2) classification score, and 3) regression score. We
use energy-based models to determine whether the target dataset is OOD or IND
for the pre-trained model. In contrast to the prior works, ETran is applicable
to a wide range of tasks including classification, regression, and object
detection (classification+regression). This is the first work to propose
transferability estimation for the object detection task. Our extensive experiments
on four benchmarks and two tasks show that ETran outperforms previous works on
object detection and classification benchmarks by an average of 21% and 12%,
respectively, and achieves SOTA in transferability assessment.
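As a rough illustration of the energy score ETran builds on: in energy-based OOD detection, the free energy of an input is the negative temperature-scaled log-sum-exp of the model's logits, and lower energy suggests the input is in-distribution for the model. A minimal NumPy sketch (the function name and temperature default are illustrative; the paper combines this with classification and regression scores in its own way):

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Free energy E(x) = -T * logsumexp(f(x) / T).

    Lower energy suggests the sample looks in-distribution (IND)
    to the pre-trained model; higher energy suggests OOD.
    """
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max(axis=-1, keepdims=True)  # shift for a stable logsumexp
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

# Confidently peaked logits yield lower (more negative) energy
# than flat, uncertain logits on the same input.
scores = energy_score([[8.0, 0.5, 0.2],   # peaked -> low energy
                       [1.0, 0.9, 0.8]])  # flat   -> high energy
```

Averaging such per-sample energies over a target dataset gives one signal for how in-distribution that dataset is for a candidate pre-trained model.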
Related papers
- KITE: A Kernel-based Improved Transferability Estimation Method [7.859384515308456]
We introduce Kite, a Kernel-based Improved Transferability Estimation method.
Kite is based on the key observations that the separability of the pre-trained features and the similarity of the pre-trained features to random features are two important factors for estimating transferability.
We evaluate the performance of Kite on a recently introduced large-scale model selection benchmark.
arXiv Detail & Related papers (2024-05-01T21:58:04Z)
- Efficient Transferability Assessment for Selection of Pre-trained Detectors [63.21514888618542]
This paper studies the efficient transferability assessment of pre-trained object detectors.
We build up a detector transferability benchmark which contains a large and diverse zoo of pre-trained detectors.
Experimental results demonstrate that our method outperforms other state-of-the-art approaches in assessing transferability.
arXiv Detail & Related papers (2024-03-14T14:23:23Z)
- Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach [20.86345962679122]
Estimating the transferability of publicly available pretrained models to a target task has become an important problem in transfer learning.
We propose a novel Optimal tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the transferability of an ensemble of models to a downstream task.
arXiv Detail & Related papers (2023-09-05T17:57:31Z)
- Fast and Accurate Transferability Measurement by Evaluating Intra-class Feature Variance [20.732095457775138]
Transferability measurement quantifies how transferable a model pre-trained on a source task is to a target task.
We propose TMI (TRANSFERABILITY MEASUREMENT WITH INTRA-CLASS FEATURE VARIANCE), a fast and accurate algorithm to measure transferability.
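In the spirit of the intra-class feature variance idea above, one could score the target features extracted by a pre-trained model by averaging per-class feature variance. This is a hypothetical simplification for illustration, not TMI's exact formula:

```python
import numpy as np

def intra_class_variance(features, labels):
    """Mean per-class variance of extracted features.

    A rough proxy in the spirit of TMI (hypothetical simplification):
    for each class, compute the variance of its feature vectors along
    every dimension, average over dimensions, then over classes.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    per_class = [features[labels == c].var(axis=0).mean()
                 for c in np.unique(labels)]
    return float(np.mean(per_class))
```

Because it needs only one forward pass per target sample, such a score is cheap to compute compared with fine-tuning every candidate model.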
arXiv Detail & Related papers (2023-08-11T07:50:40Z)
- How to Estimate Model Transferability of Pre-Trained Speech Models? [84.11085139766108]
We propose a "score-based assessment" framework for estimating the transferability of pre-trained speech models (PSMs).
We leverage upon two representation theories, Bayesian likelihood estimation and optimal transport, to generate rank scores for the PSM candidates.
Our framework efficiently computes transferability scores without actual fine-tuning of candidate models or layers.
arXiv Detail & Related papers (2023-06-01T04:52:26Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Transferability Estimation Based On Principal Gradient Expectation [68.97403769157117]
A reasonable cross-task transferability metric should be consistent with the transferred results while remaining self-consistent.
Existing transferability metrics are estimated for a particular model by conversing the source and target tasks.
We propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks.
arXiv Detail & Related papers (2022-11-29T15:33:02Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
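A minimal sketch of the ATC idea, under the assumption that the threshold is chosen so that the fraction of labeled source-domain confidences above it matches the source accuracy (the function name and quantile-based threshold search here are illustrative, not the paper's exact procedure):

```python
import numpy as np

def atc_predict_accuracy(source_conf, source_correct, target_conf):
    """Average Thresholded Confidence (ATC), simplified sketch.

    Pick a threshold t on labeled source data so that the fraction of
    source confidences above t equals the source accuracy, then predict
    target accuracy as the fraction of unlabeled target confidences
    that exceed t.
    """
    source_conf = np.asarray(source_conf, dtype=float)
    source_acc = float(np.mean(source_correct))
    # The (1 - acc)-quantile leaves a fraction acc of confidences above it.
    t = np.quantile(source_conf, 1.0 - source_acc)
    return float(np.mean(np.asarray(target_conf, dtype=float) > t))
```

The appeal is that only labeled source data and unlabeled target data are needed, so the estimate comes essentially for free at deployment time.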
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Estimating and Evaluating Regression Predictive Uncertainty in Deep Object Detectors [9.273998041238224]
We show that training variance networks with negative log likelihood (NLL) can lead to high entropy predictive distributions.
We propose to use the energy score as a non-local proper scoring rule and find that when used for training, the energy score leads to better calibrated and lower entropy predictive distributions.
arXiv Detail & Related papers (2021-01-13T12:53:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.