PACTran: PAC-Bayesian Metrics for Estimating the Transferability of
Pretrained Models to Classification Tasks
- URL: http://arxiv.org/abs/2203.05126v1
- Date: Thu, 10 Mar 2022 02:54:56 GMT
- Title: PACTran: PAC-Bayesian Metrics for Estimating the Transferability of
Pretrained Models to Classification Tasks
- Authors: Nan Ding, Xi Chen, Tomer Levinboim, Beer Changpinyo, Radu Soricut
- Abstract summary: PACTran is a theoretically grounded family of metrics for pretrained model selection and transferability measurement.
An analysis of the results shows PACTran is a more consistent and effective transferability measure compared to existing selection methods.
- Score: 22.41824478940036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the increasing abundance of pretrained models in recent years, the
problem of selecting the best pretrained checkpoint for a particular downstream
classification task has been gaining increased attention. Although several
methods have recently been proposed to tackle the selection problem (e.g. LEEP,
H-score), these methods resort to applying heuristics that are not well
motivated by learning theory. In this paper we present PACTran, a theoretically
grounded family of metrics for pretrained model selection and transferability
measurement. We first show how to derive PACTran metrics from the optimal
PAC-Bayesian bound under the transfer learning setting. We then empirically
evaluate three metric instantiations of PACTran on a number of vision tasks
(VTAB) as well as a language-and-vision (OKVQA) task. An analysis of the
results shows PACTran is a more consistent and effective transferability
measure compared to existing selection methods.
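A minimal sketch of the kind of computation such a metric involves (not the exact PACTran instantiations evaluated in the paper): fit a cheap linear probe on frozen features from a candidate checkpoint and combine the resulting training cross-entropy with an L2 complexity penalty, in the PAC-Bayesian spirit of trading off data fit against model complexity. The feature-extraction step, the penalty weight `lam`, and the optimizer settings below are illustrative assumptions.
```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the class dimension
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def transferability_score(features, labels, lam=1.0, lr=0.1, steps=500):
    """Lower is better: cross-entropy of a linear probe plus an L2 complexity penalty.

    An illustrative 'fit + complexity' score, not the exact PACTran metric.
    """
    n, d = features.shape
    k = int(labels.max()) + 1
    w = np.zeros((d, k))
    y = np.eye(k)[labels]                           # one-hot targets
    for _ in range(steps):                          # plain gradient descent on the probe
        p = softmax(features @ w)
        grad = features.T @ (p - y) / n + lam * w / n
        w -= lr * grad
    p = softmax(features @ w)
    nll = -np.mean(np.log(p[np.arange(n), labels] + 1e-12))
    complexity = lam * np.sum(w ** 2) / (2.0 * n)
    return nll + complexity

# Usage sketch: rank candidate checkpoints by the score of their frozen features on
# the target data; extract_features() is a hypothetical helper, not a real API.
# scores = {name: transferability_score(extract_features(ckpt, x_target), y_target)
#           for name, ckpt in checkpoints.items()}
# best = min(scores, key=scores.get)
```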
Related papers
- A Bayesian Approach to Data Point Selection [24.98069363998565]
Data point selection (DPS) is becoming a critical topic in deep learning.
Existing approaches to DPS are predominantly based on a bi-level optimisation (BLO) formulation.
We propose a novel Bayesian approach to DPS.
arXiv Detail & Related papers (2024-11-06T09:04:13Z)
- Leveraging Estimated Transferability Over Human Intuition for Model Selection in Text Ranking [17.475727043819635]
Transferability Estimation (TE) has emerged as an effective approach to model selection.
We propose to compute the expected rank as transferability, explicitly reflecting the model's ranking capability.
Our resulting method, Adaptive Ranking Transferability (AiRTran), can effectively capture subtle differences between models.
arXiv Detail & Related papers (2024-09-24T15:48:03Z)
- The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis [27.310894780313618]
This paper undertakes a comprehensive comparison of model capabilities at various pretraining intermediate checkpoints.
We confirm that specific downstream metrics exhibit similar training dynamics across models of different sizes.
In addition to our core findings, we've reproduced Amber and OpenLLaMA, releasing their intermediate checkpoints.
arXiv Detail & Related papers (2024-04-01T16:00:01Z)
- Efficient Transferability Assessment for Selection of Pre-trained Detectors [63.21514888618542]
This paper studies the efficient transferability assessment of pre-trained object detectors.
We build up a detector transferability benchmark which contains a large and diverse zoo of pre-trained detectors.
Experimental results demonstrate that our method outperforms other state-of-the-art approaches in assessing transferability.
arXiv Detail & Related papers (2024-03-14T14:23:23Z)
- DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection [72.25697820290502]
This work introduces a straightforward and efficient strategy to identify potential novel classes through zero-shot classification.
We refer to this approach as the self-training strategy, which enhances recall and accuracy for novel classes without requiring extra annotations, datasets, or re-training.
Empirical evaluations on three datasets, including LVIS, V3Det, and COCO, demonstrate significant improvements over the baseline performance.
arXiv Detail & Related papers (2023-10-02T17:52:24Z)
- Fast and Accurate Transferability Measurement by Evaluating Intra-class Feature Variance [20.732095457775138]
Transferability measurement quantifies how well a pre-trained model learned on a source task transfers to a target task.
We propose TMI (TRANSFERABILITY MEASUREMENT WITH INTRA-CLASS FEATURE VARIANCE), a fast and accurate algorithm to measure transferability (a minimal sketch of the intra-class variance quantity appears after this list).
arXiv Detail & Related papers (2023-08-11T07:50:40Z)
- On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
- A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [143.14128737978342]
Test-time adaptation, an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions.
Recent progress in this paradigm highlights the significant benefits of utilizing unlabeled data for training self-adapted models prior to inference.
arXiv Detail & Related papers (2023-03-27T16:32:21Z)
- Adaptive Distribution Calibration for Few-Shot Learning with Hierarchical Optimal Transport [78.9167477093745]
We propose a novel distribution calibration method by learning the adaptive weight matrix between novel samples and base classes.
Experimental results on standard benchmarks demonstrate that our proposed plug-and-play model outperforms competing approaches.
arXiv Detail & Related papers (2022-10-09T02:32:57Z)
- SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities [76.97949110580703]
We introduce SUPERB-SG, a new benchmark to evaluate pre-trained models across various speech tasks.
We use a lightweight methodology to test the robustness of representations learned by pre-trained models under shifts in data domain.
We also show that the task diversity of SUPERB-SG coupled with limited task supervision is an effective recipe for evaluating the generalizability of model representation.
arXiv Detail & Related papers (2022-03-14T04:26:40Z)
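Since the TMI entry above names its core quantity explicitly, a minimal sketch of computing intra-class feature variance on frozen features is given below. How TMI normalizes and combines this quantity into its final transferability score follows that paper and is not reproduced here; the function name and the averaging choices are assumptions for illustration.
```python
import numpy as np

def mean_intra_class_variance(features, labels):
    """Average, over classes, of the per-class feature variance
    (mean squared deviation from the class centroid, averaged over feature dimensions)."""
    per_class = []
    for c in np.unique(labels):
        f = features[labels == c]                     # features belonging to class c
        centered = f - f.mean(axis=0, keepdims=True)  # deviation from the class centroid
        per_class.append(np.mean(centered ** 2))
    return float(np.mean(per_class))

# Usage sketch: compute the quantity from each candidate checkpoint's frozen features
# on the target data and compare checkpoints.
# v = mean_intra_class_variance(feats_target, y_target)
```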
This list is automatically generated from the titles and abstracts of the papers on this site.