Double Double Descent: On Generalization Errors in Transfer Learning
between Linear Regression Tasks
- URL: http://arxiv.org/abs/2006.07002v8
- Date: Wed, 28 Sep 2022 15:24:12 GMT
- Title: Double Double Descent: On Generalization Errors in Transfer Learning
between Linear Regression Tasks
- Authors: Yehuda Dar and Richard G. Baraniuk
- Abstract summary: We study the transfer learning process between two linear regression problems.
We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task.
- Score: 30.075430694663293
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the transfer learning process between two linear regression
problems. An important and timely special case is when the regressors are
overparameterized and perfectly interpolate their training data. We examine a
parameter transfer mechanism whereby a subset of the parameters of the target
task solution are constrained to the values learned for a related source task.
We analytically characterize the generalization error of the target task in
terms of the salient factors in the transfer learning architecture, i.e., the
number of examples available, the number of (free) parameters in each of the
tasks, the number of parameters transferred from the source to target task, and
the relation between the two tasks. Our non-asymptotic analysis shows that the
generalization error of the target task follows a two-dimensional double
descent trend (with respect to the number of free parameters in each of the
tasks) that is controlled by the transfer learning factors. Our analysis points
to specific cases where the transfer of parameters is beneficial as a
substitute for extra overparameterization (i.e., additional free parameters in
the target task). Specifically, we show that the usefulness of a transfer
learning setting is fragile and depends on a delicate interplay among the set
of transferred parameters, the relation between the tasks, and the true
solution. We also demonstrate that overparameterized transfer learning is not
necessarily more beneficial when the source task is closer or identical to the
target task.
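
As a concrete illustration of the parameter-transfer mechanism described in the abstract, here is a minimal numerical sketch (an illustrative toy example, not the paper's code: the dimensions, the synthetic data model, and the use of numpy's minimum-norm least-squares routine are all assumptions). The source task is solved by the minimum-norm interpolating solution, a chosen subset of the target coordinates is constrained to those source values, and only the remaining free coordinates are fit to the target data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions): overparameterized tasks, p > n.
p, n_src, n_tgt = 50, 20, 20
k = 15  # number of parameters transferred from source to target

# Synthetic source and target data for two related tasks (hypothetical setup).
beta_src = rng.normal(size=p)
beta_tgt = beta_src + 0.1 * rng.normal(size=p)
X_src = rng.normal(size=(n_src, p))
y_src = X_src @ beta_src
X_tgt = rng.normal(size=(n_tgt, p))
y_tgt = X_tgt @ beta_tgt

# Source task: minimum-norm least squares (interpolates the data when p > n).
theta_src = np.linalg.lstsq(X_src, y_src, rcond=None)[0]

# Target task: the first k coordinates are frozen to the source values,
# and only the remaining p - k coordinates are free.
transfer_idx = np.arange(k)
free_idx = np.arange(k, p)

# Remove the contribution of the transferred (fixed) coordinates ...
residual = y_tgt - X_tgt[:, transfer_idx] @ theta_src[transfer_idx]
# ... and fit the free coordinates by minimum-norm least squares on the rest.
theta_free = np.linalg.lstsq(X_tgt[:, free_idx], residual, rcond=None)[0]

theta_tgt = np.empty(p)
theta_tgt[transfer_idx] = theta_src[transfer_idx]
theta_tgt[free_idx] = theta_free

# Out-of-sample generalization error of the target solution (test MSE).
X_test = rng.normal(size=(1000, p))
test_mse = np.mean((X_test @ (theta_tgt - beta_tgt)) ** 2)
print(f"target test MSE: {test_mse:.4f}")
```

Sweeping k (the number of transferred parameters) or p - k (the number of free target parameters) in such a setup is one way to trace out the kind of double descent trend the paper analyzes.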
Related papers
- Generalization Performance of Transfer Learning: Overparameterized and
Underparameterized Regimes [61.22448274621503]
In real-world applications, tasks often exhibit partial similarity, where certain aspects are similar while others are different or irrelevant.
Our study explores various types of transfer learning, encompassing two options for parameter transfer.
We provide practical guidelines for determining the number of features in the common and task-specific parts for improved generalization performance.
arXiv Detail & Related papers (2023-06-08T03:08:40Z) - Task Difficulty Aware Parameter Allocation & Regularization for Lifelong
Learning [20.177260510548535]
We propose Parameter Allocation & Regularization (PAR), which adaptively selects an appropriate strategy for each task, either parameter allocation or regularization, based on its learning difficulty.
Our method is scalable and significantly reduces the model's redundancy while improving the model's performance.
arXiv Detail & Related papers (2023-04-11T15:38:21Z) - ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches over varying task weights.
arXiv Detail & Related papers (2023-01-30T02:27:02Z) - Transferability Estimation Based On Principal Gradient Expectation [68.97403769157117]
Cross-task transferability estimates should be consistent with the actual transfer results while remaining self-consistent.
Existing transferability metrics are estimated for a particular model by relating the source and target tasks.
We propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks.
arXiv Detail & Related papers (2022-11-29T15:33:02Z) - Multi-task Bias-Variance Trade-off Through Functional Constraints [102.64082402388192]
Multi-task learning aims to acquire a set of functions that perform well for diverse tasks.
In this paper we draw intuition from the two extreme learning scenarios -- a single function for all tasks, and a task-specific function that ignores the other tasks.
We introduce a constrained learning formulation that enforces domain-specific solutions to remain close to a central function.
arXiv Detail & Related papers (2022-10-27T16:06:47Z) - The Common Intuition to Transfer Learning Can Win or Lose: Case Studies for Linear Regression [26.5147705530439]
We define a transfer learning approach to the target task as a linear regression optimization with a regularization on the distance between the to-be-learned target parameters and the already-learned source parameters.
We show that, for sufficiently related tasks, the optimally tuned transfer learning approach can outperform the optimally tuned ridge regression method (a minimal sketch of this regularized formulation appears after this list).
arXiv Detail & Related papers (2021-03-09T18:46:01Z) - Efficient Continual Adaptation for Generative Adversarial Networks [97.20244383723853]
We present a continual learning approach for generative adversarial networks (GANs).
Our approach is based on learning a set of global and task-specific parameters.
We show that the feature-map transformation based approach outperforms state-of-the-art continual GAN methods.
arXiv Detail & Related papers (2021-03-06T05:09:37Z) - Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task-specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains helps improve the learning performance of each task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
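
The "Common Intuition to Transfer Learning" entry above describes transfer learning as least squares with a penalty on the distance to the already-learned source parameters. Below is a minimal sketch of that formulation (illustrative only, not that paper's code; the function name and the synthetic usage are assumptions).

```python
import numpy as np

def transfer_ridge(X, y, theta_src, lam):
    """Minimize ||y - X @ theta||^2 + lam * ||theta - theta_src||^2.

    Closed form: theta = (X^T X + lam * I)^{-1} (X^T y + lam * theta_src).
    lam -> 0 recovers ordinary least squares; a large lam pins the target
    solution to the source parameters.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * theta_src)

# Toy usage (hypothetical data): a target task related to a known source solution.
rng = np.random.default_rng(1)
p, n = 10, 40
theta_src = rng.normal(size=p)
theta_true = theta_src + 0.05 * rng.normal(size=p)
X = rng.normal(size=(n, p))
y = X @ theta_true + 0.1 * rng.normal(size=n)
theta_hat = transfer_ridge(X, y, theta_src, lam=1.0)
print(np.linalg.norm(theta_hat - theta_true))
```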