Analysis of Task Transferability in Large Pre-trained Classifiers
- URL: http://arxiv.org/abs/2307.00823v1
- Date: Mon, 3 Jul 2023 08:06:22 GMT
- Title: Analysis of Task Transferability in Large Pre-trained Classifiers
- Authors: Akshay Mehra, Yunbei Zhang, and Jihun Hamm
- Abstract summary: We analyze the transfer of performance for classification tasks, when only the last linear layer of the source model is fine-tuned on the target task.
We propose a novel Task Transfer Analysis approach that transforms the source distribution (and classifier) by changing the class prior distribution, label, and feature spaces.
We perform a large-scale empirical study by using state-of-the-art pre-trained models and demonstrate the effectiveness of our bound and optimization at predicting transferability.
- Score: 11.517862889784293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning transfers the knowledge acquired by a model from a source
task to multiple downstream target tasks with minimal fine-tuning. The success
of transfer learning at improving performance, especially with the use of large
pre-trained models has made transfer learning an essential tool in the machine
learning toolbox. However, the conditions under which the performance is
transferable to downstream tasks are not understood very well. In this work, we
analyze the transfer of performance for classification tasks, when only the
last linear layer of the source model is fine-tuned on the target task. We
propose a novel Task Transfer Analysis approach that transforms the source
distribution (and classifier) by changing the class prior distribution, label,
and feature spaces to produce a new source distribution (and classifier) and
allows us to relate the loss of the downstream task (i.e., transferability) to
that of the source task. Concretely, our bound explains transferability in
terms of the Wasserstein distance between the transformed source and downstream
task's distribution, conditional entropy between the label distributions of the
two tasks, and weighted loss of the source classifier on the source task.
Moreover, we propose an optimization problem for learning the transforms of the
source task to minimize the upper bound on transferability. We perform a
large-scale empirical study by using state-of-the-art pre-trained models and
demonstrate the effectiveness of our bound and optimization at predicting
transferability. The results of our experiments demonstrate how factors such as
task relatedness, pretraining method, and model architecture affect
transferability.
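The bound above combines three measurable quantities. As an illustration only (not the paper's exact bound or optimization), the sketch below estimates each term with plain NumPy: a 2-Wasserstein distance under a Gaussian approximation of the two feature distributions, the conditional entropy between label distributions from a joint label matrix, and a caller-supplied weighted source loss. The function names and the simple additive combination are assumptions for illustration.

```python
import numpy as np

def _sqrtm_psd(a):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T

def gaussian_w2(mu1, cov1, mu2, cov2):
    """2-Wasserstein distance between Gaussian approximations of two feature sets."""
    s1 = _sqrtm_psd(cov1)
    cross = _sqrtm_psd(s1 @ cov2 @ s1)
    val = np.sum((mu1 - mu2) ** 2) + np.trace(cov1 + cov2 - 2.0 * cross)
    return float(np.sqrt(max(val, 0.0)))

def conditional_entropy(joint):
    """H(Y_t | Y_s) in nats from a joint label distribution P(y_s, y_t)."""
    joint = joint / joint.sum()
    p_s = joint.sum(axis=1, keepdims=True)
    cond = np.divide(joint, p_s, out=np.zeros_like(joint), where=p_s > 0)
    log_cond = np.where(cond > 0, np.log(np.where(cond > 0, cond, 1.0)), 0.0)
    return float(-(joint * log_cond).sum())

def transferability_proxy(src_feats, tgt_feats, joint_labels, weighted_src_loss):
    """Illustrative additive proxy: source loss + W2 distance + H(Y_t | Y_s)."""
    w2 = gaussian_w2(src_feats.mean(0), np.cov(src_feats, rowvar=False),
                     tgt_feats.mean(0), np.cov(tgt_feats, rowvar=False))
    return weighted_src_loss + w2 + conditional_entropy(joint_labels)
```

When the transformed source features coincide with the target features and the labels map deterministically, both distance terms vanish and the proxy reduces to the weighted source loss.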
Related papers
- Robust Transfer Learning with Unreliable Source Data [13.276850367115333]
We introduce a novel quantity called the "ambiguity level" that measures the discrepancy between the target and source regression functions.
We propose a simple transfer learning procedure, and establish a general theorem that shows how this new quantity is related to the transferability of learning.
arXiv Detail & Related papers (2023-10-06T21:50:21Z)
- Optimal transfer protocol by incremental layer defrosting [66.76153955485584]
Transfer learning is a powerful tool enabling model training with limited amounts of data.
The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task.
We show that this protocol is often sub-optimal and the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen.
arXiv Detail & Related papers (2023-03-02T17:32:11Z)
- Towards Estimating Transferability using Hard Subsets [25.86053764521497]
We propose HASTE, a new strategy to estimate the transferability of a source model to a particular target task using only a harder subset of target data.
We show that HASTE can be used with any existing transferability metric to improve their reliability.
Our experimental results across multiple source model architectures, target datasets, and transfer learning tasks show that HASTE-modified metrics are consistently better or on par with the state-of-the-art transferability metrics.
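A minimal sketch of the hard-subset idea, assuming hardness is ranked by the source model's max-softmax confidence on target examples (HASTE's actual selection criterion may differ); any transferability metric can then be evaluated on the selected subset:

```python
import numpy as np

def hard_subset(probs, frac=0.5):
    """Return indices of the hardest `frac` of target examples, ranked by
    the source model's max-softmax confidence (an assumed hardness criterion)."""
    confidence = probs.max(axis=1)  # predicted-class probability per example
    k = max(1, int(frac * len(probs)))
    return np.argsort(confidence)[:k]  # least confident examples first
```

A metric such as H-score or TransRate would then be computed on `features[idx], labels[idx]` instead of the full target set.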
arXiv Detail & Related papers (2023-01-17T14:50:18Z)
- An Information-Theoretic Approach to Transferability in Task Transfer Learning [16.05523977032659]
Task transfer learning is a popular technique in image processing applications that uses pre-trained models to reduce the supervision cost of related tasks.
We present a novel metric, H-score, that estimates the performance of transferred representations from one task to another in classification problems.
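A sketch of the H-score computation, assuming the published definition tr(cov(f)^{-1} cov(E[f|y])), with the inverse replaced by a pseudo-inverse for numerical stability:

```python
import numpy as np

def h_score(features, labels):
    """H-score: tr(pinv(cov(f)) @ cov(E[f | y])) on centered features."""
    f = features - features.mean(axis=0)
    cov_f = np.cov(f, rowvar=False)
    g = np.zeros_like(f)  # each row replaced by its class-conditional mean
    for c in np.unique(labels):
        mask = labels == c
        g[mask] = f[mask].mean(axis=0)
    cov_g = np.cov(g, rowvar=False)
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_g))
```

Higher values indicate representations whose class-conditional means are well separated relative to the overall feature variance.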
arXiv Detail & Related papers (2022-12-20T08:47:17Z)
- Transferability Estimation Based On Principal Gradient Expectation [68.97403769157117]
A reliable measure of cross-task transferability should be consistent with actual transfer results while remaining self-consistent.
Existing transferability metrics are estimated on a particular model by comparing source and target tasks.
We propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks.
arXiv Detail & Related papers (2022-11-29T15:33:02Z)
- An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding [65.75873687351553]
This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Counterintuitively, our data shows that the target task training data size often has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning.
arXiv Detail & Related papers (2022-10-21T04:36:46Z)
- Frustratingly Easy Transferability Estimation [64.42879325144439]
We propose a simple, efficient, and effective transferability measure named TransRate.
TransRate measures transferability as the mutual information between the features of target examples extracted by a pre-trained model and their labels.
Despite its extraordinary simplicity (about 10 lines of code), TransRate performs remarkably well in extensive evaluations on 22 pre-trained models and 16 downstream tasks.
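A sketch following the coding-rate formulation of this mutual-information estimate, R(Z, ε) minus the frequency-weighted class-conditional rates; the value of ε and the exact normalization here are assumptions, not necessarily the paper's:

```python
import numpy as np

def coding_rate(Z, eps=1e-4):
    """Rate-distortion coding rate of a feature matrix Z (n x d)."""
    n, d = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps)) * (Z.T @ Z))
    return 0.5 * logdet

def transrate(Z, y, eps=1e-4):
    """Mutual-information-style proxy: total coding rate minus the
    class-conditional coding rates, weighted by class frequency."""
    Z = Z - Z.mean(axis=0)
    r_all = coding_rate(Z, eps)
    r_cond = sum((np.sum(y == c) / len(Z)) * coding_rate(Z[y == c], eps)
                 for c in np.unique(y))
    return float(r_all - r_cond)
```

With this weighting the identity I + (d/(nε))ZᵀZ = Σ_c (n_c/n)[I + (d/(n_c ε))Z_cᵀZ_c] holds on the data, so the gap is nonnegative by concavity of log-det and larger when features are more predictive of labels.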
arXiv Detail & Related papers (2021-06-17T10:27:52Z)
- Unsupervised Transfer Learning for Spatiotemporal Predictive Networks [90.67309545798224]
We study how to transfer knowledge from a zoo of models learned without supervision to another network.
Our motivation is that models are expected to understand complex dynamics from different sources.
Our approach yields significant improvements on three benchmarks for spatiotemporal prediction, and benefits the target task even from less relevant models.
arXiv Detail & Related papers (2020-09-24T15:40:55Z)
- Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.