Phase Transitions in Transfer Learning for High-Dimensional Perceptrons
- URL: http://arxiv.org/abs/2101.01918v1
- Date: Wed, 6 Jan 2021 08:29:22 GMT
- Title: Phase Transitions in Transfer Learning for High-Dimensional Perceptrons
- Authors: Oussama Dhifallah and Yue M. Lu
- Abstract summary: Transfer learning seeks to improve the generalization performance of a target task by exploiting knowledge learned from a related source task.
The question of when transfer is beneficial is related to the so-called negative transfer phenomenon, where the transferred source information actually reduces the generalization performance of the target task.
We present a theoretical analysis of transfer learning by studying a pair of related perceptron learning tasks.
- Score: 12.614901374282868
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning seeks to improve the generalization performance of a target
task by exploiting the knowledge learned from a related source task. Central
questions include deciding what information one should transfer and when
transfer can be beneficial. The latter question is related to the so-called
negative transfer phenomenon, where the transferred source information actually
reduces the generalization performance of the target task. This happens when
the two tasks are sufficiently dissimilar. In this paper, we present a
theoretical analysis of transfer learning by studying a pair of related
perceptron learning tasks. Despite the simplicity of our model, it reproduces
several key phenomena observed in practice. Specifically, our asymptotic
analysis reveals a phase transition from negative transfer to positive transfer
as the similarity of the two tasks moves past a well-defined threshold.
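
As a concrete illustration of the setup, here is a minimal simulation sketch. It is not the paper's exact perceptron formulation: the teachers are linear rather than sign-valued, transfer is modeled as an L2 anchor toward the source solution, and rho, lam, and the sample sizes are illustrative assumptions.

```python
# Minimal numerical sketch (an illustration, not the paper's exact formulation):
# source and target tasks are linear "teachers" with correlation rho, and
# transfer is modeled as ridge regression anchored at the source estimate.
import numpy as np

rng = np.random.default_rng(0)
d, n_src, n_tgt, lam = 200, 2000, 100, 1.0

def ridge(X, y, lam, w_anchor):
    # Solves min_w ||Xw - y||^2 + lam * ||w - w_anchor||^2 ; transfer = anchor at source.
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y + lam * w_anchor)

def gen_error(w_hat, w_true):
    return float(np.sum((w_hat - w_true) ** 2) / d)

for rho in [0.0, 0.25, 0.5, 0.75, 0.95]:
    # Source and target teacher vectors with similarity (correlation) rho.
    w_src = rng.standard_normal(d)
    w_tgt = rho * w_src + np.sqrt(1 - rho ** 2) * rng.standard_normal(d)

    X_s, X_t = rng.standard_normal((n_src, d)), rng.standard_normal((n_tgt, d))
    y_s, y_t = X_s @ w_src, X_t @ w_tgt

    w_src_hat = ridge(X_s, y_s, lam, np.zeros(d))            # learn the source task
    err_transfer = gen_error(ridge(X_t, y_t, lam, w_src_hat), w_tgt)
    err_scratch = gen_error(ridge(X_t, y_t, lam, np.zeros(d)), w_tgt)
    print(f"rho={rho:.2f}  transfer={err_transfer:.3f}  scratch={err_scratch:.3f}")
```

In runs of this sketch, the anchored estimator typically does worse than training from scratch at low rho and better at high rho, mirroring the negative-to-positive transition described above; the sharp threshold itself is a property of the paper's asymptotic analysis, not of this toy simulation.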
Related papers
- Generalization Performance of Transfer Learning: Overparameterized and Underparameterized Regimes [61.22448274621503]
In real-world applications, tasks often exhibit partial similarity, where certain aspects are similar while others are different or irrelevant.
Our study explores various types of transfer learning, encompassing two options for parameter transfer.
We provide practical guidelines for determining the number of features in the common and task-specific parts for improved generalization performance.
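
As a rough illustration of the common-versus-task-specific split discussed here (my own toy construction, not the paper's analysis), the sketch below pools source and target data for a chosen number of shared coefficients and fits the remaining coefficients on target data alone; the dimensions, noise level, and k values are placeholders.

```python
# Two linear tasks whose teachers share their first k_true coordinates.
# The estimator pools both datasets for the shared part and fits the
# task-specific part on target data only; sweeping k_common shows the trade-off.
import numpy as np

rng = np.random.default_rng(0)
d, n_src, n_tgt, k_true = 30, 300, 30, 20
shared = rng.standard_normal(d)
w_src = shared.copy()
w_tgt = shared.copy()
w_tgt[k_true:] = rng.standard_normal(d - k_true)   # tasks differ beyond k_true

def data(w, n):
    X = rng.standard_normal((n, d))
    return X, X @ w + 0.1 * rng.standard_normal(n)

X_s, y_s = data(w_src, n_src)
X_t, y_t = data(w_tgt, n_tgt)
X_test, y_test = data(w_tgt, 1000)

for k in [0, 10, 20, 30]:
    w_hat = np.zeros(d)
    if k > 0:
        # Shared part: least squares on pooled data restricted to the first k features.
        Xp = np.vstack([X_s[:, :k], X_t[:, :k]])
        yp = np.concatenate([y_s, y_t])
        w_hat[:k] = np.linalg.lstsq(Xp, yp, rcond=None)[0]
    if k < d:
        # Task-specific part: fit the residual on target data only.
        resid = y_t - X_t[:, :k] @ w_hat[:k]
        w_hat[k:] = np.linalg.lstsq(X_t[:, k:], resid, rcond=None)[0]
    mse = np.mean((X_test @ w_hat - y_test) ** 2)
    print(f"k_common={k:2d}  test MSE={mse:.3f}")
```

Sweeping k_common usually shows the test error improving as the shared block grows toward the true overlap and degrading once genuinely task-specific coordinates are forced to be shared.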
arXiv Detail & Related papers (2023-06-08T03:08:40Z) - ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches over varying task weights.
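
A schematic sketch of the fork-and-merge idea as summarized above, under strong simplifying assumptions (tiny linear models, synthetic data, and interpolation toward the best-validating branch as the merge step); this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 20
w_star = rng.standard_normal(d)                     # target-task teacher
w_aux = w_star + 0.5 * rng.standard_normal(d)       # related auxiliary teacher

def make_data(w, n):
    X = rng.standard_normal((n, d))
    return X, X @ w + 0.1 * rng.standard_normal(n)

X_t, y_t = make_data(w_star, 50)     # small target training set
X_a, y_a = make_data(w_aux, 500)     # large auxiliary set
X_v, y_v = make_data(w_star, 200)    # target validation set

def grad(w, X, y):
    return 2 * X.T @ (X @ w - y) / len(y)

def val_loss(w):
    return float(np.mean((X_v @ w - y_v) ** 2))

w = np.zeros(d)
for epoch in range(20):
    # Fork: each branch trains with a different auxiliary-task weight.
    branches = []
    for task_weight in [0.0, 0.5, 1.0]:
        wb = w.copy()
        for _ in range(10):
            wb -= 0.05 * (grad(wb, X_t, y_t) + task_weight * grad(wb, X_a, y_a))
        branches.append(wb)
    # Merge: interpolate toward the branch with the lowest target validation
    # loss (a crude stand-in for the paper's task-weight search).
    best = min(branches, key=val_loss)
    w = 0.5 * w + 0.5 * best
print("validation MSE after fork-merge schedule:", val_loss(w))
```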
arXiv Detail & Related papers (2023-01-30T02:27:02Z) - An Information-Theoretic Approach to Transferability in Task Transfer Learning [16.05523977032659]
Task transfer learning is a popular technique in image processing applications that uses pre-trained models to reduce the supervision cost of related tasks.
We present a novel metric, H-score, that estimates the performance of transferred representations from one task to another in classification problems.
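
A small sketch of an H-score-style computation, assuming the commonly cited form H(f) = tr(cov(f(X))^{-1} cov(E[f(X)|Y])); treat it as an illustration and check the paper for the exact definition.

```python
import numpy as np

def h_score(features, labels, eps=1e-8):
    """features: (n, k) array of representations, labels: (n,) integer classes."""
    features = features - features.mean(axis=0, keepdims=True)
    cov_f = np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])
    # Class-conditional means weighted by class frequency give cov(E[f|Y])
    # because the features are globally centered.
    classes = np.unique(labels)
    class_means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    weights = np.array([(labels == c).mean() for c in classes])
    cov_cond = (class_means * weights[:, None]).T @ class_means
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_cond))

# Toy usage: features that separate the classes well score higher.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=600)
good = np.eye(3)[y] + 0.3 * rng.standard_normal((600, 3))
bad = rng.standard_normal((600, 3))
print(h_score(good, y), ">", h_score(bad, y))
```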
arXiv Detail & Related papers (2022-12-20T08:47:17Z) - Frozen Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural Networks [27.17697714584768]
We study the generalization behavior of transfer learning of deep neural networks (DNNs).
We show that the test error evolution during the target training has a more significant double descent effect when the target training dataset is sufficiently large.
Also, we show that the double descent phenomenon may make a transfer from a less related source task better than a transfer from a more related source task.
arXiv Detail & Related papers (2022-11-20T20:26:23Z) - Identifying Suitable Tasks for Inductive Transfer Through the Analysis of Feature Attributions [78.55044112903148]
We use explainability techniques to predict whether task pairs will be complementary, by comparing neural network activations between single-task models.
Our results show that, through this approach, it is possible to reduce training time by up to 83.5% at a cost of only 0.034 reduction in positive-class F1 on the TREC-IS 2020-A dataset.
arXiv Detail & Related papers (2022-02-02T15:51:07Z) - Frustratingly Easy Transferability Estimation [64.42879325144439]
We propose a simple, efficient, and effective transferability measure named TransRate.
TransRate measures transferability as the mutual information between the features of target examples extracted by a pre-trained model and their labels.
Despite its extraordinary simplicity (about 10 lines of code), TransRate performs remarkably well in extensive evaluations on 22 pre-trained models and 16 downstream tasks.
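
A hedged sketch of a coding-rate style estimate of the dependence between pre-trained features and labels, in the spirit of the summary above; the paper's exact estimator and normalization may differ, so treat this as an illustration rather than a reference implementation.

```python
import numpy as np

def coding_rate(Z, eps=1e-4):
    # Log-det coding rate of the feature matrix Z (n samples, d dimensions).
    n, d = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps)) * Z.T @ Z)
    return 0.5 * logdet

def transrate_like(Z, y, eps=1e-4):
    Z = Z - Z.mean(axis=0, keepdims=True)
    rate_all = coding_rate(Z, eps)
    # Conditional rate: class-frequency-weighted rate within each class.
    rate_cond = sum((y == c).mean() * coding_rate(Z[y == c], eps)
                    for c in np.unique(y))
    return rate_all - rate_cond   # higher = features more predictive of labels

# Toy usage: label-informative features score higher than random ones.
rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=800)
informative = np.eye(4)[y] + 0.2 * rng.standard_normal((800, 4))
uninformative = rng.standard_normal((800, 4))
print(transrate_like(informative, y), ">", transrate_like(uninformative, y))
```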
arXiv Detail & Related papers (2021-06-17T10:27:52Z) - Unsupervised Transfer Learning for Spatiotemporal Predictive Networks [90.67309545798224]
We study how to transfer knowledge from a zoo of models learned without supervision towards another network.
Our motivation is that models are expected to understand complex dynamics from different sources.
Our approach yields significant improvements on three benchmarks for spatiotemporal prediction, and benefits the target task even from less relevant sources.
arXiv Detail & Related papers (2020-09-24T15:40:55Z) - What is being transferred in transfer learning? [51.6991244438545]
We show that when training from pre-trained weights, the model stays in the same basin in the loss landscape, and different instances of such a model are similar in feature space and close in parameter space.
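
A toy check in the spirit of this finding (my own construction, not the authors' experiments): two runs are fine-tuned from the same "pre-trained" starting point with different mini-batch noise, and the loss is evaluated along the straight line between them; a flat, low path is consistent with both runs staying in one basin.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 400
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = (X @ w_true > 0).astype(float)

def loss(w):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return float(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))

def finetune(w0, seed, steps=200, lr=0.5):
    rng_b = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(steps):
        idx = rng_b.choice(n, size=64, replace=False)   # different SGD noise per run
        p = 1.0 / (1.0 + np.exp(-X[idx] @ w))
        w -= lr * X[idx].T @ (p - y[idx]) / len(idx)
    return w

w_pre = 0.5 * w_true + 0.1 * rng.standard_normal(d)     # stand-in for pre-trained weights
w_a, w_b = finetune(w_pre, seed=1), finetune(w_pre, seed=2)
for t in np.linspace(0, 1, 5):
    print(f"t={t:.2f}  loss along path={loss((1 - t) * w_a + t * w_b):.3f}")
```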
arXiv Detail & Related papers (2020-08-26T17:23:40Z) - Learning Boost by Exploiting the Auxiliary Task in Multi-task Domain [1.2183405753834562]
Learning two tasks with a single shared function has benefits: it encourages the function to generalize by exploiting information that is applicable to both tasks.
However, in real environments the tasks inevitably conflict with each other during learning, a phenomenon called negative transfer.
We introduce a novel approach that can drive positive transfer and suppress negative transfer by leveraging class-wise weights in the learning process.
arXiv Detail & Related papers (2020-08-05T10:56:56Z) - Uncovering the Connections Between Adversarial Transferability and Knowledge Transferability [27.65302656389911]
We analyze and demonstrate the connections between knowledge transferability and adversarial transferability.
Our theoretical studies show that adversarial transferability indicates knowledge transferability and vice versa.
We conduct extensive experiments for different scenarios on diverse datasets, showing a positive correlation between adversarial transferability and knowledge transferability.
arXiv Detail & Related papers (2020-06-25T16:04:47Z) - Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition [2.320417845168326]
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models.
The technique of pre-training on one task and then retraining on a new one is called transfer learning.
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
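
A generic sketch of the pre-train-then-retrain recipe described in this entry, using torchvision's ImageNet-pretrained ResNet-18 with a fresh head for a hypothetical character-recognition task; the class count, data, and hyperparameters are placeholders, and loading the pretrained weights requires a download.

```python
import torch
import torch.nn as nn
from torchvision import models

num_classes = 62                      # e.g. digits plus upper/lower-case letters (assumption)
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():          # freeze the pre-trained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new task-specific head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Toy batch standing in for a character-recognition dataloader.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```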
arXiv Detail & Related papers (2020-01-02T14:18:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.