Analysis of Task Transferability in Large Pre-trained Classifiers
- URL: http://arxiv.org/abs/2307.00823v1
- Date: Mon, 3 Jul 2023 08:06:22 GMT
- Title: Analysis of Task Transferability in Large Pre-trained Classifiers
- Authors: Akshay Mehra, Yunbei Zhang, and Jihun Hamm
- Abstract summary: We analyze the transfer of performance for classification tasks when only the last linear layer of the source model is fine-tuned on the target task.
We propose a novel Task Transfer Analysis approach that transforms the source distribution (and classifier) by changing the class prior distribution, label, and feature spaces.
We perform a large-scale empirical study by using state-of-the-art pre-trained models and demonstrate the effectiveness of our bound and optimization at predicting transferability.
- Score: 11.517862889784293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning transfers the knowledge acquired by a model from a source
task to multiple downstream target tasks with minimal fine-tuning. The success
of transfer learning at improving performance, especially with the use of large
pre-trained models, has made it an essential tool in the machine learning
toolbox. However, the conditions under which performance transfers to
downstream tasks are not well understood. In this work, we analyze the transfer
of performance for classification tasks when only the last linear layer of the
source model is fine-tuned on the target task. We propose a novel Task Transfer
Analysis approach that transforms the source distribution (and classifier) by
changing the class prior distribution, label space, and feature space,
producing a new source distribution (and classifier) and allowing us to relate
the loss of the downstream task (i.e., transferability) to that of the source
task. Concretely, our bound explains transferability in terms of the
Wasserstein distance between the transformed source distribution and the
downstream task's distribution, the conditional entropy between the label
distributions of the two tasks, and the weighted loss of the source classifier
on the source task. Moreover, we propose an optimization problem for learning
the transforms of the source task that minimizes this upper bound on
transferability. We perform a large-scale empirical study using
state-of-the-art pre-trained models and demonstrate the effectiveness of our
bound and optimization at predicting transferability. Our experiments show how
factors such as task relatedness, pretraining method, and model architecture
affect transferability.
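
Each term in the bound can be estimated from samples. The sketch below illustrates one way to compute the three quantities in plain NumPy; the sliced (random-projection) approximation of the Wasserstein distance, the pairing of source and target labels for the conditional entropy, and all function names and toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sliced_wasserstein(x_src, x_tgt, n_proj=64, seed=0):
    """Approximate the Wasserstein-1 distance between two empirical feature
    clouds by averaging 1-D distances over random projections (a cheap
    stand-in for the exact Wasserstein distance used in the paper)."""
    rng = np.random.default_rng(seed)
    dists = []
    for _ in range(n_proj):
        v = rng.normal(size=x_src.shape[1])
        v /= np.linalg.norm(v)
        a, b = np.sort(x_src @ v), np.sort(x_tgt @ v)
        n = min(len(a), len(b))
        dists.append(np.mean(np.abs(a[:n] - b[:n])))  # 1-D W1 via sorted samples
    return float(np.mean(dists))

def conditional_entropy(y_src, y_tgt):
    """H(Y_tgt | Y_src) from an empirical joint distribution over paired
    label draws (the pairing is an illustrative assumption)."""
    joint = np.zeros((y_src.max() + 1, y_tgt.max() + 1))
    np.add.at(joint, (y_src, y_tgt), 1.0)
    joint /= joint.sum()
    p_src = joint.sum(axis=1, keepdims=True)   # marginal of the source labels
    ratio = joint / np.where(p_src > 0, p_src, 1.0)
    mask = joint > 0
    terms = np.zeros_like(joint)
    terms[mask] = joint[mask] * np.log(ratio[mask])
    return float(-terms.sum())

def weighted_source_loss(per_example_loss, y_src, class_weights):
    """Class-reweighted average of per-example source losses, mirroring the
    change of class prior in the transformed source task."""
    w = class_weights[y_src]
    return float(np.sum(w * per_example_loss) / np.sum(w))

# Toy usage: a bound-style score simply adds the three terms (the learned
# transforms and constants of the actual bound are omitted here).
rng = np.random.default_rng(1)
x_s, x_t = rng.normal(0.0, 1.0, (200, 16)), rng.normal(0.5, 1.0, (200, 16))
y_s, y_t = rng.integers(0, 3, 200), rng.integers(0, 2, 200)
loss_s = rng.uniform(0.0, 1.0, 200)  # stand-in per-example source losses
score = (sliced_wasserstein(x_s, x_t)
         + conditional_entropy(y_s, y_t)
         + weighted_source_loss(loss_s, y_s, np.ones(3)))
print(f"bound-style transferability score: {score:.3f}")
```

In the paper, the source transforms themselves are learned by minimizing the resulting upper bound; the snippet above only shows how the individual terms can be estimated once those transforms are fixed.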
Related papers
- Enhancing Cross-task Transfer of Large Language Models via Activation Steering [75.41750053623298]
Cross-task in-context learning offers a direct solution for transferring knowledge across tasks.
We investigate whether cross-task transfer can be achieved via latent space steering without parameter updates or input expansion.
We propose a novel Cross-task Activation Steering Transfer framework that enables effective transfer by manipulating the model's internal activation states (a toy sketch of the steering mechanism appears after this list).
arXiv Detail & Related papers (2025-07-17T15:47:22Z)
- Exploring the Effectiveness and Consistency of Task Selection in Intermediate-Task Transfer Learning [21.652389166495407]
We show that the transfer performance exhibits severe variance across different source tasks and training seeds.
Compared to embedding-free methods and text embeddings, task embeddings constructed from fine-tuned weights can better estimate task transferability.
We introduce a novel method that measures pairwise token similarity using maximum inner product search, leading to the highest performance in task prediction (a toy sketch of this similarity appears after this list).
arXiv Detail & Related papers (2024-07-23T07:31:43Z)
- Towards Estimating Transferability using Hard Subsets [25.86053764521497]
We propose HASTE, a new strategy to estimate the transferability of a source model to a particular target task using only a harder subset of target data.
We show that HASTE can be used with any existing transferability metric to improve their reliability.
Our experimental results across multiple source model architectures, target datasets, and transfer learning tasks show that HASTE-modified metrics are consistently better than or on par with state-of-the-art transferability metrics.
arXiv Detail & Related papers (2023-01-17T14:50:18Z)
- An Information-Theoretic Approach to Transferability in Task Transfer Learning [16.05523977032659]
Task transfer learning is a popular technique in image processing applications that uses pre-trained models to reduce the supervision cost of related tasks.
We present a novel metric, H-score, that estimates the performance of transferred representations from one task to another in classification problems (a minimal sketch of computing H-score appears after this list).
arXiv Detail & Related papers (2022-12-20T08:47:17Z)
- An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding [65.75873687351553]
This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Counterintuitively, our data shows that the size of the target task training data often has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning.
arXiv Detail & Related papers (2022-10-21T04:36:46Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data [78.21197488065177]
The recent success of fine-tuning large models, pretrained on broad data at scale, on downstream tasks has led to a significant paradigm shift in deep learning.
This paper proposes a new task-agnostic framework, SynBench, to measure the quality of pretrained representations using synthetic data.
arXiv Detail & Related papers (2022-10-06T15:25:00Z)
- Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z)
- Intermediate-Task Transfer Learning with Pretrained Models for Natural Language Understanding: When and Why Does It Work? [44.88358841370665]
It is poorly understood when and why intermediate-task training is beneficial for a given target task.
We perform a large-scale study on the pretrained RoBERTa model with 110 intermediate-target task combinations.
We observe that intermediate tasks requiring high-level inference and reasoning abilities tend to work best.
arXiv Detail & Related papers (2020-05-01T21:49:34Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
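
Following up on the activation-steering entry above: a toy sketch of the general mechanism, in which a vector derived from source-task activations is added to a hidden state at inference time, with no parameter updates. The tiny stand-in network, the mean-difference construction of the steering vector, and the steering strength are all illustrative assumptions; the paper's Cross-task Activation Steering Transfer framework is more elaborate.

```python
import torch
import torch.nn as nn

# Tiny stand-in network; in practice this would be a layer of a pretrained LLM.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Capture hidden activations at the first layer via a forward hook.
hidden = {}
def capture(module, inputs, output):
    hidden["h"] = output.detach()
cap_handle = model[0].register_forward_hook(capture)

src_x, tgt_x = torch.randn(32, 8), torch.randn(32, 8)
model(src_x); h_src = hidden["h"].mean(dim=0)   # mean source-task activation
model(tgt_x); h_tgt = hidden["h"].mean(dim=0)   # mean target-task activation
cap_handle.remove()

# Steering vector: mean difference between the two tasks' activations
# (one common construction; an assumption, not necessarily the paper's).
steer = h_src - h_tgt

def steer_hook(module, inputs, output):
    return output + 0.5 * steer  # 0.5 is an assumed steering strength

handle = model[0].register_forward_hook(steer_hook)
print(model(tgt_x[:2]))  # forward pass with steered internal activations
handle.remove()
```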
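Likewise, for the task-selection entry: a toy sketch of pairwise token similarity via maximum inner product search, where each token representation from one task is matched to its best counterpart in the other and the match scores are averaged. The brute-force matrix product stands in for a real MIPS index, and the names and data are illustrative assumptions.

```python
import numpy as np

def mips_task_similarity(tokens_a, tokens_b):
    """tokens_a: (n, d) and tokens_b: (m, d) token representations of two
    tasks. For each token in task A, take the maximum inner product over
    task B's tokens, then average the best-match scores."""
    sims = tokens_a @ tokens_b.T          # (n, m) pairwise inner products
    return float(sims.max(axis=1).mean())

rng = np.random.default_rng(0)
toks_a, toks_b = rng.normal(size=(50, 32)), rng.normal(size=(60, 32))
print(f"MIPS-style task similarity: {mips_task_similarity(toks_a, toks_b):.3f}")
```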
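Finally, for the H-score entry: the metric has the closed form H(f) = tr(cov(f)^{-1} cov(E[f|Y])), computable directly from pretrained features and target labels. A minimal sketch follows; the small ridge term added to the feature covariance is our own numerical-stability assumption.

```python
import numpy as np

def h_score(features, labels, ridge=1e-6):
    """H(f) = tr(cov(f)^{-1} cov(E[f|Y])) for features (n, d) and integer
    labels (n,). Higher values predict better transfer of the representation."""
    f = features - features.mean(axis=0)      # centered features
    cov_f = (f.T @ f) / len(f)                # cov(f)
    g = np.zeros_like(f)
    for c in np.unique(labels):
        idx = labels == c
        g[idx] = f[idx].mean(axis=0)          # class-conditional mean E[f|Y=c]
    cov_g = (g.T @ g) / len(g)                # cov(E[f|Y])
    eye = ridge * np.eye(cov_f.shape[0])
    return float(np.trace(np.linalg.solve(cov_f + eye, cov_g)))

rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 24))
labs = rng.integers(0, 5, 300)
print(f"H-score: {h_score(feats, labs):.3f}")
```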