Generalization Performance of Transfer Learning: Overparameterized and
Underparameterized Regimes
- URL: http://arxiv.org/abs/2306.04901v2
- Date: Fri, 9 Jun 2023 00:45:06 GMT
- Title: Generalization Performance of Transfer Learning: Overparameterized and
Underparameterized Regimes
- Authors: Peizhong Ju, Sen Lin, Mark S. Squillante, Yingbin Liang, Ness B.
Shroff
- Abstract summary: In real-world applications, tasks often exhibit partial similarity, where certain aspects are similar while others are different or irrelevant.
Our study explores various types of transfer learning, encompassing two options for parameter transfer.
We provide practical guidelines for determining the number of features in the common and task-specific parts for improved generalization performance.
- Score: 61.22448274621503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning is a useful technique for achieving improved performance
and reducing training costs by leveraging the knowledge gained from source
tasks and applying it to target tasks. Assessing the effectiveness of transfer
learning relies on understanding the similarity between the ground truth of the
source and target tasks. In real-world applications, tasks often exhibit
partial similarity, where certain aspects are similar while others are
different or irrelevant. To investigate the impact of partial similarity on
transfer learning performance, we focus on a linear regression model with two
distinct sets of features: a common part shared across tasks and a
task-specific part. Our study explores various types of transfer learning,
encompassing two options for parameter transfer. By establishing a theoretical
characterization of the error of the learned model, we compare these transfer
learning options, particularly examining how generalization performance changes
with the number of features/parameters in both underparameterized and
overparameterized regimes. Furthermore, we provide practical guidelines for
determining the number of features in the common and task-specific parts for
improved generalization performance. For example, when the total number of
features in the source task's learning model is fixed, we show that it is more
advantageous to allocate a greater number of redundant features to the
task-specific part rather than the common part. Moreover, in specific
scenarios, particularly those characterized by high noise levels and true
parameters of small magnitude, sacrificing certain true features in the common part in favor of
employing more redundant features in the task-specific part can yield notable
benefits.
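As a rough illustration of the setup described in this abstract (a minimal sketch under assumed dimensions and noise levels, not the paper's analysis or experiments), the snippet below fits a linear model whose features are split into a common part and a task-specific part, transfers the common-part parameters learned on the source task to the target task, and refits only the task-specific part. The minimum-norm least-squares solution covers both the underparameterized and overparameterized regimes. The particular transfer rule shown is one plausible reading of parameter transfer, not necessarily one of the paper's two options.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and noise level (assumptions, not the paper's settings)
p_common, p_specific = 20, 30      # features in the common / task-specific parts
n_source, n_target = 40, 25        # n_target < p_common + p_specific: overparameterized target fit
noise = 0.5

# Ground-truth parameters: the common part is shared, task-specific parts differ
beta_common = rng.normal(size=p_common)
beta_src_spec = rng.normal(size=p_specific)
beta_tgt_spec = rng.normal(size=p_specific)

def make_data(n, beta_spec):
    X = rng.normal(size=(n, p_common + p_specific))
    y = X[:, :p_common] @ beta_common + X[:, p_common:] @ beta_spec + noise * rng.normal(size=n)
    return X, y

def min_norm_lsq(X, y):
    # Minimum-norm least-squares solution; pinv handles both n > p and n < p
    return np.linalg.pinv(X) @ y

# 1) Learn on the source task
X_s, y_s = make_data(n_source, beta_src_spec)
beta_s = min_norm_lsq(X_s, y_s)

# 2) Parameter transfer: keep the learned common part fixed, refit only the task-specific part
X_t, y_t = make_data(n_target, beta_tgt_spec)
residual = y_t - X_t[:, :p_common] @ beta_s[:p_common]
beta_t_spec = min_norm_lsq(X_t[:, p_common:], residual)
beta_transfer = np.concatenate([beta_s[:p_common], beta_t_spec])

# 3) Baseline: train the target task from scratch
beta_scratch = min_norm_lsq(X_t, y_t)

# Generalization error on fresh target data
X_te, y_te = make_data(10_000, beta_tgt_spec)
for name, b in [("transfer", beta_transfer), ("scratch", beta_scratch)]:
    print(name, np.mean((X_te @ b - y_te) ** 2))
```

Varying p_common and p_specific while holding their sum fixed mimics the feature-allocation question addressed by the guidelines above.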
Related papers
- Bridging Domains with Approximately Shared Features [26.096779584142986]
Multi-source domain adaptation aims to reduce performance degradation when applying machine learning models to unseen domains.
Some advocate for learning invariant features from source domains, while others favor more diverse features.
We propose a statistical framework that distinguishes the utilities of features based on the variance of their correlation with the label $y$ across domains.
arXiv Detail & Related papers (2024-03-11T04:25:41Z)
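A minimal sketch of the kind of statistic the Bridging Domains entry above is built around (the exact definition here is an assumption, not the paper's): each feature's correlation with the label is computed separately in every source domain, and features are then ranked by the variance of that correlation across domains, so that low-variance features act as approximately shared.

```python
import numpy as np

def correlation_variance(domains):
    """domains: list of (X, y) pairs, one per source domain.
    Returns, per feature, the variance across domains of corr(X[:, j], y)."""
    corrs = []
    for X, y in domains:
        # Pearson correlation of each feature column with the label
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        corr = (Xc * yc[:, None]).mean(axis=0) / (Xc.std(axis=0) * yc.std() + 1e-12)
        corrs.append(corr)
    return np.var(np.stack(corrs), axis=0)

# Toy usage: feature 0 is consistently predictive, feature 1 flips sign across domains
rng = np.random.default_rng(1)
domains = []
for sign in (+1.0, -1.0, +1.0):
    X = rng.normal(size=(200, 2))
    y = 1.0 * X[:, 0] + sign * X[:, 1] + 0.1 * rng.normal(size=200)
    domains.append((X, y))
print(correlation_variance(domains))  # feature 1 should show a much larger variance
```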
- Multi-task Bias-Variance Trade-off Through Functional Constraints [102.64082402388192]
Multi-task learning aims to acquire a set of functions that perform well for diverse tasks.
In this paper we draw intuition from the two extreme learning scenarios -- a single function for all tasks, and a task-specific function that ignores the other tasks.
We introduce a constrained learning formulation that keeps domain-specific solutions close to a central function.
arXiv Detail & Related papers (2022-10-27T16:06:47Z)
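One simple way to realize the constrained formulation sketched in the entry above (an assumed objective and solver, not necessarily the paper's exact method): each task keeps its own weight vector but is penalized for drifting from a central one, with the penalty weight interpolating between the two extreme scenarios mentioned there.

```python
import numpy as np

def coupled_multitask(tasks, lam=1.0, iters=50):
    """tasks: list of (X, y). Returns per-task weights w_i and a central w_bar,
    minimizing sum_i ||X_i w_i - y_i||^2 + lam * ||w_i - w_bar||^2 (illustrative objective)."""
    d = tasks[0][0].shape[1]
    w = [np.zeros(d) for _ in tasks]
    w_bar = np.zeros(d)
    for _ in range(iters):
        # With w_bar fixed, each task solves a ridge-like problem centered at w_bar
        for i, (X, y) in enumerate(tasks):
            A = X.T @ X + lam * np.eye(d)
            w[i] = np.linalg.solve(A, X.T @ y + lam * w_bar)
        # With the w_i fixed, the optimal central function is their mean
        w_bar = np.mean(w, axis=0)
    return w, w_bar

# Toy usage: two related regression tasks
rng = np.random.default_rng(2)
ws_true = [np.ones(5), np.ones(5) + 0.3 * rng.normal(size=5)]
tasks = [(X, X @ w_true) for X, w_true in
         ((rng.normal(size=(50, 5)), w) for w in ws_true)]
w_tasks, w_central = coupled_multitask(tasks, lam=5.0)
print(np.round(w_central, 2))
```

Taking lam toward 0 recovers fully task-specific solutions, while a very large lam collapses every task onto the single central function.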
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider multi-task learning: fitting a set of regression functions intended to solve different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains help improve the learning performance on each task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
The first is enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
The second is eliminating adverse interactions amongst tasks, which have been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
arXiv Detail & Related papers (2020-06-30T04:39:36Z)
- Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks [30.075430694663293]
We study the transfer learning process between two linear regression problems.
We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task.
arXiv Detail & Related papers (2020-06-12T08:42:14Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we allow overlapping features and differentiate the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
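As a toy illustration of what a block-diagonal structure penalty can look like when the task and feature grouping is assumed to be known in advance (TFCL itself learns this grouping, so the function below is only a simplified stand-in):

```python
import numpy as np

def block_diagonal_penalty(W, task_groups, feature_groups):
    """W: (num_tasks, num_features) coefficient matrix.
    task_groups[t] / feature_groups[j] give the group index of task t / feature j.
    Penalizes squared coefficients that connect a task to features outside its own group,
    encouraging a block-diagonal task-feature structure."""
    T, F = W.shape
    mask = np.array([[task_groups[t] != feature_groups[j] for j in range(F)] for t in range(T)])
    return np.sum((W * mask) ** 2)

# Toy usage: two groups of tasks, each meant to use its own block of features
W = np.arange(12, dtype=float).reshape(3, 4)
print(block_diagonal_penalty(W, task_groups=[0, 0, 1], feature_groups=[0, 0, 1, 1]))
```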