Understanding the Transferability of Representations via Task-Relatedness
- URL: http://arxiv.org/abs/2307.00823v2
- Date: Mon, 28 Oct 2024 21:21:02 GMT
- Title: Understanding the Transferability of Representations via Task-Relatedness
- Authors: Akshay Mehra, Yunbei Zhang, Jihun Hamm,
- Abstract summary: We propose a novel analysis that analyzes the transferability of the representations of pre-trained models to downstream tasks in terms of their relatedness to a given reference task.
Our experiments using state-of-the-art pre-trained models show the effectiveness of task-relatedness in explaining transferability on various vision and language tasks.
- Score: 8.425690424016986
- License:
- Abstract: The growing popularity of transfer learning, due to the availability of models pre-trained on vast amounts of data, makes it imperative to understand when the knowledge of these pre-trained models can be transferred to obtain high-performing models on downstream target tasks. However, the exact conditions under which transfer learning succeeds in a cross-domain cross-task setting are still poorly understood. To bridge this gap, we propose a novel analysis that analyzes the transferability of the representations of pre-trained models to downstream tasks in terms of their relatedness to a given reference task. Our analysis leads to an upper bound on transferability in terms of task-relatedness, quantified using the difference between the class priors, label sets, and features of the two tasks. Our experiments using state-of-the-art pre-trained models show the effectiveness of task-relatedness in explaining transferability on various vision and language tasks. The efficient computability of task-relatedness even without labels of the target task and its high correlation with the model's accuracy after end-to-end fine-tuning on the target task makes it a useful metric for transferability estimation. Our empirical results of using task-relatedness to select the best pre-trained model from a model zoo for a target task highlight its utility for practical problems.
Related papers
- Exploring the Effectiveness and Consistency of Task Selection in Intermediate-Task Transfer Learning [21.652389166495407]
We show that the transfer performance exhibits severe variance across different source tasks and training seeds.
Compared to embedding-free methods and text embeddings, task embeddings constructed from fine-tuned weights can better estimate task transferability.
We introduce a novel method that measures pairwise token similarity using maximum inner product search, leading to the highest performance in task prediction.
arXiv Detail & Related papers (2024-07-23T07:31:43Z) - Towards Estimating Transferability using Hard Subsets [25.86053764521497]
We propose HASTE, a new strategy to estimate the transferability of a source model to a particular target task using only a harder subset of target data.
We show that HASTE can be used with any existing transferability metric to improve their reliability.
Our experimental results across multiple source model architectures, target datasets, and transfer learning tasks show that HASTE modified metrics are consistently better or on par with the state of the art transferability metrics.
arXiv Detail & Related papers (2023-01-17T14:50:18Z) - An Information-Theoretic Approach to Transferability in Task Transfer
Learning [16.05523977032659]
Task transfer learning is a popular technique in image processing applications that uses pre-trained models to reduce the supervision cost of related tasks.
We present a novel metric, H-score, that estimates the performance of transferred representations from one task to another in classification problems.
arXiv Detail & Related papers (2022-12-20T08:47:17Z) - An Exploration of Data Efficiency in Intra-Dataset Task Transfer for
Dialog Understanding [65.75873687351553]
This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Unintuitively, our data shows that often target task training data size has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning.
arXiv Detail & Related papers (2022-10-21T04:36:46Z) - Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z) - SynBench: Task-Agnostic Benchmarking of Pretrained Representations using
Synthetic Data [78.21197488065177]
Recent success in fine-tuning large models, that are pretrained on broad data at scale, on downstream tasks has led to a significant paradigm shift in deep learning.
This paper proposes a new task-agnostic framework, textitSynBench, to measure the quality of pretrained representations using synthetic data.
arXiv Detail & Related papers (2022-10-06T15:25:00Z) - Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z) - Intermediate-Task Transfer Learning with Pretrained Models for Natural
Language Understanding: When and Why Does It Work? [44.88358841370665]
It is poorly understood when and why intermediate-task training is beneficial for a given target task.
We perform a large-scale study on the pretrained RoBERTa model with 110 intermediate-target task combinations.
We observe that intermediate tasks requiring high-level inference and reasoning abilities tend to work best.
arXiv Detail & Related papers (2020-05-01T21:49:34Z) - Task-Feature Collaborative Learning with Application to Personalized
Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL)
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.