Transferability-Guided Cross-Domain Cross-Task Transfer Learning
- URL: http://arxiv.org/abs/2207.05510v2
- Date: Thu, 29 Feb 2024 06:53:00 GMT
- Title: Transferability-Guided Cross-Domain Cross-Task Transfer Learning
- Authors: Yang Tan, Enming Zhang, Yang Li, Shao-Lun Huang, Xiao-Ping Zhang
- Abstract summary: We propose two novel transferability metrics F-OTCE and JC-OTCE.
F-OTCE estimates transferability by first solving an Optimal Transport problem between source and target distributions.
JC-OTCE improves the transferability robustness of F-OTCE by including label distances in the OT problem.
- Score: 21.812715282796255
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose two novel transferability metrics F-OTCE (Fast Optimal Transport
based Conditional Entropy) and JC-OTCE (Joint Correspondence OTCE) to evaluate
how much the source model (task) can benefit the learning of the target task
and to learn more transferable representations for cross-domain cross-task
transfer learning. Unlike the existing metric, which requires evaluating empirical
transferability on auxiliary tasks, our metrics are auxiliary-free and can
therefore be computed much more efficiently. Specifically, F-OTCE estimates
transferability by first solving an Optimal Transport (OT) problem between
source and target distributions, and then using the optimal coupling to
compute the Negative Conditional Entropy between source and target labels. It
can also serve as a loss function to maximize the transferability of the source
model before finetuning on the target task. Meanwhile, JC-OTCE improves the
transferability robustness of F-OTCE by including label distances in the OT
problem, though it may incur additional computation cost. Extensive experiments
demonstrate that F-OTCE and JC-OTCE outperform state-of-the-art auxiliary-free
metrics by 18.85% and 28.88%, respectively, in correlation coefficient with the
ground-truth transfer accuracy. By eliminating the training cost of auxiliary
tasks, the two metrics reduce the total computation time of the previous
method from 43 minutes to 9.32s and 10.78s, respectively, for a pair of tasks.
When used as a loss function, F-OTCE shows consistent improvements on the
transfer accuracy of the source model in few-shot classification experiments,
with up to 4.41% accuracy gain.
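To make the two-step recipe above concrete, here is a minimal sketch of F-OTCE built on the POT optimal-transport library. It assumes features have already been extracted by the source model for both datasets and arrive as NumPy arrays; the function name `f_otce`, the uniform marginals, and the entropic regularization value `reg=0.1` are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)


def f_otce(src_feats, src_labels, tgt_feats, tgt_labels, reg=0.1):
    """Sketch of F-OTCE: negative conditional entropy -H(Y_t | Y_s)
    induced by an optimal-transport coupling of the two feature sets."""
    ns, nt = len(src_feats), len(tgt_feats)

    # 1) Entropic OT between the two empirical feature distributions,
    #    with uniform marginals and squared-Euclidean ground cost.
    cost = ot.dist(src_feats, tgt_feats, metric="sqeuclidean")
    coupling = ot.sinkhorn(np.full(ns, 1.0 / ns),
                           np.full(nt, 1.0 / nt), cost, reg)

    # 2) Collapse the coupling into a joint label distribution P(y_s, y_t).
    src_classes = np.unique(src_labels)
    tgt_classes = np.unique(tgt_labels)
    p_joint = np.zeros((len(src_classes), len(tgt_classes)))
    for i, c_s in enumerate(src_classes):
        for j, c_t in enumerate(tgt_classes):
            p_joint[i, j] = coupling[np.ix_(src_labels == c_s,
                                            tgt_labels == c_t)].sum()

    # 3) Score = -H(Y_t | Y_s) = sum P(ys, yt) * log P(yt | ys); higher
    #    (closer to zero) means source labels better predict target labels.
    p_src = p_joint.sum(axis=1, keepdims=True)
    cond = p_joint / np.maximum(p_src, 1e-12)  # P(y_t | y_s)
    mask = p_joint > 0
    return float(np.sum(p_joint[mask] * np.log(cond[mask])))
```

JC-OTCE, per the abstract, additionally folds a label distance into the OT ground cost (i.e., a term depending on (y_s, y_t) added to `cost` before step 1), which improves robustness at extra computational expense; the specific label distance is defined in the paper and is not reproduced here.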
Related papers
- Transferability Estimation Based On Principal Gradient Expectation [68.97403769157117]
A qualified cross-task transferability metric should be compatible with actual transfer results while keeping self-consistency.
Existing transferability metrics are estimated on a particular model by relating source and target tasks.
We propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks.
arXiv Detail & Related papers (2022-11-29T15:33:02Z) - Identifying Suitable Tasks for Inductive Transfer Through the Analysis of Feature Attributions [78.55044112903148]
We use explainability techniques to predict whether task pairs will be complementary, by comparing neural network activations between single-task models.
Our results show that, through this approach, it is possible to reduce training time by up to 83.5% at a cost of only a 0.034 reduction in positive-class F1 on the TREC-IS 2020-A dataset.
arXiv Detail & Related papers (2022-02-02T15:51:07Z) - On Transferability of Prompt Tuning for Natural Language Understanding [63.29235426932978]
We investigate the transferability of soft prompts across different tasks and models.
We find that trained soft prompts transfer well to similar tasks and can initialize prompt tuning (PT) for them to accelerate training and improve performance.
Our findings show that improving PT with knowledge transfer is possible and promising, and that prompts' cross-task transferability is generally better than their cross-model transferability.
arXiv Detail & Related papers (2021-11-12T13:39:28Z) - Transferability Estimation for Semantic Segmentation Task [20.07223947190349]
We extend the recent transferability metric OTCE score to the semantic segmentation task.
The challenge in applying the OTCE score is the high-dimensional segmentation output, which makes it difficult to find the optimal coupling between so many pixels at an acceptable computation cost.
Experimental evaluation on Cityscapes, BDD100K and GTA5 datasets demonstrates that the OTCE score highly correlates with the transfer performance.
arXiv Detail & Related papers (2021-09-30T16:21:17Z) - Practical Transferability Estimation for Image Classification Tasks [20.07223947190349]
A major challenge is how to make transferability estimation robust under cross-domain cross-task settings.
The recently proposed OTCE score solves this problem by considering both domain and task differences.
We propose a practical transferability metric called JC-NCE score that dramatically improves the robustness of the task difference estimation.
arXiv Detail & Related papers (2021-06-19T11:59:11Z) - Frustratingly Easy Transferability Estimation [64.42879325144439]
We propose a simple, efficient, and effective transferability measure named TransRate.
TransRate measures transferability as the mutual information between the features of target examples extracted by a pre-trained model and their labels.
Despite its extraordinary simplicity, implementable in 10 lines of code, TransRate performs remarkably well in extensive evaluations on 22 pre-trained models and 16 downstream tasks (a minimal sketch of the computation appears after this list).
arXiv Detail & Related papers (2021-06-17T10:27:52Z) - OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations [6.730043708859326]
We propose a transferability metric called Optimal Transport based Conditional Entropy (OTCE).
OTCE characterizes transferability as a combination of domain difference and task difference, and explicitly evaluates them from data in a unified framework.
Experiments on DomainNet, the largest cross-domain dataset, and Office31 demonstrate that OTCE shows an average 21% gain in correlation with the ground-truth transfer accuracy.
arXiv Detail & Related papers (2021-03-25T13:51:33Z) - Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves over standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z) - Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z)
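As referenced in the TransRate entry above, here is a minimal sketch of that measure. The paper describes TransRate as the mutual information between extracted features and labels; following that description, this sketch estimates it with a coding-rate surrogate of H(Z) - H(Z|Y). The `eps` value and the per-class averaging are illustrative assumptions, not the authors' released code.

```python
import numpy as np


def coding_rate(Z, eps=1e-4):
    """Rate-distortion estimate of the entropy of features Z (n x d)."""
    n, d = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (1.0 / (n * eps)) * Z.T @ Z)
    return 0.5 * logdet


def transrate(Z, y, eps=1e-4):
    """TransRate ~ H(Z) - H(Z|Y): high when features are diverse overall
    but compact within each class, suggesting easy transfer."""
    Z = Z - Z.mean(axis=0, keepdims=True)  # center the features
    rate_all = coding_rate(Z, eps)
    classes = np.unique(y)
    rate_cond = sum(coding_rate(Z[y == c], eps) for c in classes)
    return rate_all - rate_cond / len(classes)
```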