Online Transfer Learning: Negative Transfer and Effect of Prior
Knowledge
- URL: http://arxiv.org/abs/2105.01445v1
- Date: Tue, 4 May 2021 12:12:14 GMT
- Title: Online Transfer Learning: Negative Transfer and Effect of Prior
Knowledge
- Authors: Xuetong Wu, Jonathan H. Manton, Uwe Aickelin, Jingge Zhu
- Abstract summary: We study the online transfer learning problems where the source samples are given in an offline way while the target samples arrive sequentially.
We define the expected regret of the online transfer learning problem and provide upper bounds on the regret using information-theoretic quantities.
Examples show that the derived bounds are accurate even for small sample sizes.
- Score: 6.193838300896449
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning is a machine learning paradigm where the knowledge from one
task is utilized to resolve the problem in a related task. On the one hand, it
is conceivable that knowledge from one task could be useful for solving a
related problem. On the other hand, it is also recognized that if not executed
properly, transfer learning algorithms could in fact impair the learning
performance instead of improving it - commonly known as "negative transfer". In
this paper, we study the online transfer learning problems where the source
samples are given in an offline way while the target samples arrive
sequentially. We define the expected regret of the online transfer learning
problem and provide upper bounds on the regret using information-theoretic
quantities. We also obtain exact expressions for the bounds when the sample
size becomes large. Examples show that the derived bounds are accurate even for
small sample sizes. Furthermore, the obtained bounds give valuable insight on
the effect of prior knowledge for transfer learning in our formulation. In
particular, we formally characterize the conditions under which negative
transfer occurs.
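As a rough illustration of the formulation (the paper's exact definitions, loss function, and notation may differ), the expected regret in an online prediction setting of this kind typically takes the following form, where the m source samples S are available offline, target samples Z_1, Z_2, ... arrive one at a time, and \hat{Z}_t denotes the prediction made from S and the first t-1 target samples:

```latex
% Illustrative only: hypothetical notation, not necessarily the paper's
% exact definition. The learner's cumulative expected loss is compared with
% that of an oracle predictor that knows the true target distribution P_T.
\[
  \mathrm{Regret}_n
  \;=\;
  \mathbb{E}\!\left[\sum_{t=1}^{n} \ell\bigl(\hat{Z}_t,\, Z_t\bigr)\right]
  -
  \mathbb{E}\!\left[\sum_{t=1}^{n} \ell\bigl(Z_t^{\star},\, Z_t\bigr)\right],
  \qquad
  Z_t^{\star} = \text{oracle prediction under } P_T .
\]
% In this notation, negative transfer means that conditioning on the source
% samples S makes Regret_n larger than it would be when learning from the
% target samples alone.
```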
Related papers
- Feasibility of Transfer Learning: A Mathematical Framework [4.530876736231948]
It begins by establishing the necessary mathematical concepts and constructing a mathematical framework for transfer learning.
It then identifies and formulates the three-step transfer learning procedure as an optimization problem, allowing for the resolution of the feasibility issue.
arXiv Detail & Related papers (2023-05-22T12:44:38Z)
- Transferability Estimation Based On Principal Gradient Expectation [68.97403769157117]
A useful cross-task transferability metric should be compatible with the transfer results while keeping self-consistency.
Existing transferability metrics are estimated on a particular model by relating the source and target tasks.
We propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks.
arXiv Detail & Related papers (2022-11-29T15:33:02Z)
- An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding [65.75873687351553]
This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Counterintuitively, our data shows that the size of the target-task training data often has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning.
arXiv Detail & Related papers (2022-10-21T04:36:46Z)
- A Data-Based Perspective on Transfer Learning [76.30206800557411]
We take a closer look at the role of the source dataset's composition in transfer learning.
Our framework gives rise to new capabilities such as pinpointing transfer learning brittleness.
arXiv Detail & Related papers (2022-07-12T17:58:28Z)
- A Bayesian Approach to (Online) Transfer Learning: Theory and Algorithms [6.193838300896449]
We study transfer learning from a Bayesian perspective, where a parametric statistical model is used.
Specifically, we study three variants of the transfer learning problem: instantaneous, online, and time-variant transfer learning.
For each problem, we define an appropriate objective function, and provide either exact expressions or upper bounds on the learning performance.
Examples show that the derived bounds are accurate even for small sample sizes; a toy numerical sketch of the online Bayesian setting is given after this list.
arXiv Detail & Related papers (2021-09-03T08:43:29Z)
- Reducing Representation Drift in Online Continual Learning [87.71558506591937]
We study the online continual learning paradigm, where agents must learn from a changing distribution with constrained memory and compute.
In this work we instead focus on the change in representations of previously observed data due to the introduction of previously unobserved class samples in the incoming data stream.
arXiv Detail & Related papers (2021-04-11T15:19:30Z)
- Phase Transitions in Transfer Learning for High-Dimensional Perceptrons [12.614901374282868]
Transfer learning seeks to improve the generalization performance of a target task by exploiting knowledge learned from a related source task.
A central question is when transfer helps and when it hurts; the latter case is the so-called negative transfer phenomenon, where the transferred source information actually reduces the generalization performance on the target task.
We present a theoretical analysis of transfer learning by studying a pair of related perceptron learning tasks.
arXiv Detail & Related papers (2021-01-06T08:29:22Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- What is being transferred in transfer learning? [51.6991244438545]
We show that when training from pre-trained weights, the model stays in the same basin of the loss landscape, and different instances of such models are similar in feature space and close in parameter space.
arXiv Detail & Related papers (2020-08-26T17:23:40Z)
- Limits of Transfer Learning [0.0]
We show the need to carefully select which sets of information to transfer and the need for dependence between transferred information and target problems.
These results build on the algorithmic search framework for machine learning, allowing the results to apply to a wide range of learning problems using transfer.
arXiv Detail & Related papers (2020-06-23T01:48:23Z)
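To make the online setting of the main paper and of the Bayesian entry above concrete, the sketch below runs a toy experiment. It is not the algorithm of any listed paper: it assumes a Bernoulli target task with a Beta prior, uses the offline source samples only to set the prior's pseudo-counts, and measures regret as excess log-loss relative to a predictor that knows the true target parameter. A source parameter close to the target lowers the cumulative regret, while a mismatched source raises it, which is the negative-transfer effect.

```python
"""Toy sketch of online transfer learning with a source-informed prior.

NOT the algorithm from any of the papers above: a minimal Bernoulli/Beta
illustration of the setting in which source samples are available offline,
target samples arrive one at a time, and regret is the excess log-loss paid
relative to a predictor that knows the true target parameter.
"""
import math
import random


def cumulative_regret(theta_src, theta_tgt, m_src=200, n_rounds=300,
                      transfer=True, seed=0):
    rng = random.Random(seed)
    # Offline source samples; only their counts matter for a Beta prior.
    src_ones = sum(1 if rng.random() < theta_src else 0 for _ in range(m_src))
    # Prior pseudo-counts: carried over from the source, or uniform Beta(1, 1).
    alpha, beta = (1 + src_ones, 1 + m_src - src_ones) if transfer else (1.0, 1.0)

    regret = 0.0
    for _ in range(n_rounds):
        z = 1 if rng.random() < theta_tgt else 0       # next target sample
        p_hat = alpha / (alpha + beta)                  # posterior predictive
        p_true = theta_tgt if z == 1 else 1.0 - theta_tgt
        p_pred = p_hat if z == 1 else 1.0 - p_hat
        regret += math.log(p_true) - math.log(p_pred)   # excess log-loss
        alpha, beta = alpha + z, beta + (1 - z)         # online Bayesian update

    return regret


if __name__ == "__main__":
    for theta_src in (0.72, 0.30):  # source close to vs. far from the target
        with_transfer = cumulative_regret(theta_src, theta_tgt=0.7, transfer=True)
        no_transfer = cumulative_regret(theta_src, theta_tgt=0.7, transfer=False)
        print(f"source theta={theta_src}: regret with transfer={with_transfer:.2f}, "
              f"without transfer={no_transfer:.2f}")
```

The crossover between helpful and harmful transfer in this toy depends on how many source pseudo-counts are carried over and on how many target rounds are observed; swapping in a different model class or loss changes the numbers but not the qualitative picture.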