A Review of Deep Transfer Learning and Recent Advancements
- URL: http://arxiv.org/abs/2201.09679v1
- Date: Wed, 19 Jan 2022 04:19:36 GMT
- Title: A Review of Deep Transfer Learning and Recent Advancements
- Authors: Mohammadreza Iman, Khaled Rasheed, Hamid R. Arabnia
- Abstract summary: Deep transfer learning (DTL) methods offer a way to tackle such limitations.
DTLs handle limited target-data concerns and drastically reduce training costs.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A successful deep learning model depends on extensive training data and
on processing power and time (known as training costs). Many tasks lack enough
labeled data to train a deep learning model. Further, demand is rising for
running deep learning models on edge devices with limited processing capacity
and training time. Deep transfer learning (DTL) methods offer a way to tackle
such limitations; e.g., fine-tuning a model pre-trained on a massive
semi-related dataset has proved to be a simple and effective method for many
problems. DTLs handle limited target-data concerns as well as drastically
reduce training costs. In this paper, the definition and taxonomy of deep
transfer learning are reviewed. Then we focus on the sub-category of
network-based DTLs, since it is the most common type of DTL applied to
various applications in the last decade.
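The fine-tuning idea described in the abstract can be sketched with a toy, framework-free example (all names, weights, and data below are illustrative assumptions, not taken from the paper): a "pre-trained" feature extractor is frozen, and only a small classification head is trained on the limited target dataset, which is the essence of network-based deep transfer learning.

```python
import math
import random

random.seed(0)

# Frozen "pre-trained" feature extractor: a fixed random projection standing
# in for layers learned on a large source dataset. In real DTL these weights
# would come from a model trained on the semi-related source data.
PRETRAINED_W = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(4)]

def extract_features(x):
    """Frozen forward pass: tanh of a fixed linear projection (not updated)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune_head(data, epochs=200, lr=0.5):
    """Train only the new head (weights + bias) on the small target dataset."""
    w = [0.0] * len(PRETRAINED_W)
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            f = extract_features(x)
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y  # gradient of the log loss w.r.t. the pre-activation
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    f = extract_features(x)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)

# Tiny target dataset: far too small to train a full network from scratch,
# but enough to fit the head on top of the frozen features.
target_data = [([1.0, 1.0], 1), ([0.9, 1.2], 1),
               ([-1.0, -1.0], 0), ([-1.2, -0.8], 0)]
w, b = fine_tune_head(target_data)
print([round(predict(w, b, x)) for x, _ in target_data])
```

Because only the head's few parameters are updated, training is cheap and the limited target data suffices; this mirrors, in miniature, why fine-tuning a pre-trained network reduces both the data and the compute the abstract calls "training costs".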
Related papers
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence to demonstrate that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - Deep Transfer Learning for Automatic Speech Recognition: Towards Better
Generalization [3.6393183544320236]
Speech recognition has become an important challenge when using deep learning (DL).
It requires large-scale training datasets and high computational and storage resources.
Deep transfer learning (DTL) has been introduced to overcome these issues.
arXiv Detail & Related papers (2023-04-27T21:08:05Z) - Dataset Distillation: A Comprehensive Review [76.26276286545284]
Dataset distillation (DD) aims to derive a much smaller dataset containing synthetic samples, based on which the trained models yield performance comparable with those trained on the original dataset.
This paper gives a comprehensive review and summary of recent advances in DD and its application.
arXiv Detail & Related papers (2023-01-17T17:03:28Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - Continual Learning with Transformers for Image Classification [12.028617058465333]
In computer vision, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past.
We develop a solution called Adaptive Distillation of Adapters (ADA) to perform continual learning.
We empirically demonstrate on different classification tasks that this method maintains a good predictive performance without retraining the model.
arXiv Detail & Related papers (2022-06-28T15:30:10Z) - EXPANSE: A Deep Continual / Progressive Learning System for Deep
Transfer Learning [1.1024591739346294]
Current DTL techniques suffer from either the catastrophic forgetting dilemma or overly biased pre-trained models.
We propose a new continual/progressive learning approach for deep transfer learning to tackle these limitations.
We offer a new way of training deep learning models inspired by the human education system.
arXiv Detail & Related papers (2022-05-19T03:54:58Z) - Knowledge Distillation as Efficient Pre-training: Faster Convergence,
Higher Data-efficiency, and Better Transferability [53.27240222619834]
Knowledge Distillation as Efficient Pre-training aims to efficiently transfer the learned feature representation from pre-trained models to new student models for future downstream tasks.
Our method performs comparably with supervised pre-training counterparts on 3 downstream tasks and 9 downstream datasets, while requiring 10x less data and 5x less pre-training time.
arXiv Detail & Related papers (2022-03-10T06:23:41Z) - Training Deep Networks from Zero to Hero: avoiding pitfalls and going
beyond [59.94347858883343]
This tutorial covers the basic steps as well as more recent options to improve models.
It can be particularly useful in datasets that are not as well-prepared as those in challenges.
arXiv Detail & Related papers (2021-09-06T21:31:42Z) - A Survey on Transfer Learning in Natural Language Processing [8.396202730857942]
The demand for transfer learning is increasing as many large models are emerging.
In this survey, we feature the recent transfer learning advances in the field of NLP.
arXiv Detail & Related papers (2020-05-31T21:52:31Z) - Exploring the Efficacy of Transfer Learning in Mining Image-Based
Software Artifacts [1.5285292154680243]
Transfer learning allows us to train deep architectures requiring a large number of learned parameters, even if the amount of available data is limited.
Here we explore the applicability of transfer learning utilizing models pre-trained on non-software engineering data applied to the problem of classifying software diagrams.
arXiv Detail & Related papers (2020-03-03T16:41:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.