A Review of Deep Transfer Learning and Recent Advancements
- URL: http://arxiv.org/abs/2201.09679v1
- Date: Wed, 19 Jan 2022 04:19:36 GMT
- Title: A Review of Deep Transfer Learning and Recent Advancements
- Authors: Mohammadreza Iman, Khaled Rasheed, Hamid R. Arabnia
- Abstract summary: Deep transfer learning (DTL) methods tackle the limitations of scarce labeled data and high training costs.
DTL handles limited target data and drastically reduces training costs.
- Score: 1.3535770763481905
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A successful deep learning model depends on extensive training data
and on processing power and time (known as training costs). Many tasks do not
have enough labeled data to train a deep learning model. Further, the demand is
rising for running deep learning models on edge devices with limited processing
capacity and training time. Deep transfer learning (DTL) methods address these
limitations; for example, fine-tuning a model pre-trained on a massive,
semi-related dataset has proved to be a simple and effective method for many
problems. DTL handles limited target data and drastically reduces training
costs. In this paper, the definition and taxonomy of deep transfer learning are
reviewed. We then focus on the sub-category of network-based DTL, since it is
the most common type of DTL applied to various applications over the last
decade.
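As a concrete illustration of the network-based DTL (fine-tuning) setting described in the abstract, the following is a minimal sketch, assuming PyTorch/torchvision, an ImageNet-pre-trained ResNet-18, and a hypothetical small labeled target dataset exposed as `target_loader`: the transferred layers are frozen and only a new classification head is trained.

```python
# Minimal sketch of network-based deep transfer learning (fine-tuning).
# Assumptions: PyTorch + torchvision are available; `target_loader` is a
# hypothetical DataLoader over a small labeled target dataset.
import torch
import torch.nn as nn
from torchvision import models

# 1. Load a model pre-trained on a massive, semi-related dataset (ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# 2. Freeze the transferred feature-extraction layers to cut training cost.
for param in model.parameters():
    param.requires_grad = False

# 3. Replace the classification head to match the (small) target task.
num_target_classes = 10  # hypothetical target task
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

# 4. Fine-tune: only the new head's parameters are updated.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def fine_tune(target_loader, epochs=5):
    model.train()
    for _ in range(epochs):
        for images, labels in target_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```

A common variant, when somewhat more target data is available, is to unfreeze the last few pre-trained blocks and fine-tune them with a lower learning rate.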
Related papers
- A Practical Guide to Fine-tuning Language Models with Limited Data [9.413178499853156]
Employing pre-trained Large Language Models (LLMs) has become the de facto standard in Natural Language Processing (NLP) despite their extensive data requirements.
Motivated by the recent surge in research focused on training LLMs with limited data, this paper surveys recent transfer learning approaches to optimize model performance in downstream tasks where data is scarce.
arXiv Detail & Related papers (2024-11-14T15:55:37Z) - Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data [54.934578742209716]
In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets.
LLKD is an adaptive sample selection method that incorporates signals from both the teacher and student.
Our comprehensive experiments show that LLKD achieves superior performance across various datasets with higher data efficiency.
arXiv Detail & Related papers (2024-11-12T18:57:59Z) - Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Large Language Model (LLM) pretraining traditionally relies on autoregressive language modeling on randomly sampled data blocks from web-scale datasets.
We take inspiration from human learning techniques like spaced repetition and hypothesize that random data sampling for LLMs leads to high training cost and low-quality models that tend to forget data.
In order to effectively commit web-scale information to long-term memory, we propose the LFR (Learn, Focus, and Review) pedagogy.
arXiv Detail & Related papers (2024-09-10T00:59:18Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization [3.6393183544320236]
Speech recognition has become an important challenge when using deep learning (DL).
It requires large-scale training datasets and high computational and storage resources.
Deep transfer learning (DTL) has been introduced to overcome these issues.
arXiv Detail & Related papers (2023-04-27T21:08:05Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - Continual Learning with Transformers for Image Classification [12.028617058465333]
In computer vision, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past.
We develop a solution called Adaptive Distillation of Adapters (ADA) to perform continual learning.
We empirically demonstrate on different classification tasks that this method maintains a good predictive performance without retraining the model.
arXiv Detail & Related papers (2022-06-28T15:30:10Z) - EXPANSE: A Deep Continual / Progressive Learning System for Deep Transfer Learning [1.1024591739346294]
Current DTL techniques suffer from either the catastrophic forgetting dilemma or overly biased pre-trained models.
We propose a new continual/progressive learning approach for deep transfer learning to tackle these limitations.
We offer a new way of training deep learning models inspired by the human education system.
arXiv Detail & Related papers (2022-05-19T03:54:58Z) - Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability [53.27240222619834]
Knowledge Distillation as Efficient Pre-training aims to efficiently transfer the learned feature representation from pre-trained models to new student models for future downstream tasks.
Our method performs comparably to its supervised pre-training counterparts on 3 downstream tasks and 9 downstream datasets, requiring 10x less data and 5x less pre-training time (a generic distillation sketch appears after this list).
arXiv Detail & Related papers (2022-03-10T06:23:41Z) - Training Deep Networks from Zero to Hero: avoiding pitfalls and going beyond [59.94347858883343]
This tutorial covers the basic steps as well as more recent options to improve models.
It can be particularly useful in datasets that are not as well-prepared as those in challenges.
arXiv Detail & Related papers (2021-09-06T21:31:42Z) - A Survey on Transfer Learning in Natural Language Processing [8.396202730857942]
The demand for transfer learning is increasing as many large models emerge.
In this survey, we highlight recent transfer learning advances in the field of NLP.
arXiv Detail & Related papers (2020-05-31T21:52:31Z)
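Several entries above (LLKD and Knowledge Distillation as Efficient Pre-training) build on teacher-student knowledge distillation. The sketch below is a minimal, generic illustration of that idea, assuming PyTorch and hypothetical `teacher`, `student`, and `unlabeled_loader` objects; it is not the specific algorithm of either paper: the student is trained to match the teacher's temperature-softened output distribution on unlabeled inputs.

```python
# Generic teacher-student knowledge distillation sketch (not the exact
# LLKD or KD-as-pre-training algorithms; names below are hypothetical).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2

def distill(teacher, student, unlabeled_loader, epochs=3, lr=1e-4):
    teacher.eval()  # the teacher only provides fixed soft targets
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        student.train()
        for inputs in unlabeled_loader:  # no labels needed
            with torch.no_grad():
                teacher_logits = teacher(inputs)
            loss = distillation_loss(student(inputs), teacher_logits)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

LLKD's adaptive sample selection would then weight or filter the batches in this loop using teacher-confidence and student-uncertainty signals.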