Guided Transfer Learning
- URL: http://arxiv.org/abs/2303.16154v1
- Date: Sun, 26 Mar 2023 18:21:24 GMT
- Title: Guided Transfer Learning
- Authors: Danko Nikolić, Davor Andrić, Vjekoslav Nikolić
- Abstract summary: In some applications, guided transfer learning enables the network to learn from a small amount of data.
In other cases, a network with a smaller number of parameters can learn a task which otherwise only a larger network could learn.
Guided transfer learning potentially has many applications when the amount of data, model size, or the availability of computational resources reach their limits.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine learning requires exorbitant amounts of data and computation,
and models demand a correspondingly excessive number of parameters. It is,
therefore, sensible to look for technologies that reduce these demands on
resources. Here, we propose an approach called guided transfer learning. Each
weight and bias in the network has its own guiding parameter that indicates how
much this parameter is allowed to change while learning a new task. Guiding
parameters are learned during an initial scouting process. Guided transfer
learning can result in a reduction in resources needed to train a network. In
some applications, guided transfer learning enables the network to learn from a
small amount of data. In other cases, a network with a smaller number of
parameters can learn a task which otherwise only a larger network could learn.
Guided transfer learning potentially has many applications when the amount of
data, model size, or the availability of computational resources reach their
limits.
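
The mechanism is simple enough to sketch. Below is a minimal, hypothetical PyTorch illustration of the two phases the abstract describes: a scouting phase that turns the observed movement of each weight across scouting tasks into a per-parameter guide value, and a guided training step that scales each gradient by its guide. The particular guide formula, the normalization, and the train_fn helper are assumptions for illustration, not the authors' exact procedure.

import copy
import torch

def scout_guides(model, scout_tasks, train_fn):
    """Estimate a guide value in [0, 1] for every parameter.

    Parameters that move a lot across scouting tasks get guides near 1
    (free to adapt later); stable parameters get guides near 0 (nearly
    frozen). train_fn(model, task) is an assumed helper that trains a
    throwaway copy of the model on a single scouting task.
    """
    base = {n: p.detach().clone() for n, p in model.named_parameters()}
    change = {n: torch.zeros_like(p) for n, p in base.items()}
    for task in scout_tasks:
        scout = copy.deepcopy(model)
        train_fn(scout, task)
        for n, p in scout.named_parameters():
            change[n] += (p.detach() - base[n]).abs()
    # Normalize the accumulated change to [0, 1] to obtain guide values.
    peak = max(c.max() for c in change.values()).clamp(min=1e-12)
    return {n: c / peak for n, c in change.items()}

def guided_step(model, loss, optimizer, guides):
    """One optimization step on the new task, gradients scaled by guides."""
    optimizer.zero_grad()
    loss.backward()
    with torch.no_grad():
        for n, p in model.named_parameters():
            if p.grad is not None:
                p.grad.mul_(guides[n])  # guide caps how much p may change
    optimizer.step()

With guides computed once, fine-tuning on the new task proceeds as usual except that each call to guided_step attenuates updates to parameters the scouting phase marked as stable.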
Related papers
- Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning
in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning, and contrastive learning against reference tree-based Machine Learning (ML) and monolithic DL models.
We show that (i) using large datasets we can obtain more general representations, (ii) contrastive learning is the best methodology.
While tree-based ML models cannot handle large tasks but fit small tasks well, DL methods, by reusing learned representations, match the performance of tree-based models on small tasks as well.
arXiv Detail & Related papers (2023-05-21T11:20:49Z) - Task-Attentive Transformer Architecture for Continual Learning of
Vision-and-Language Tasks Using Knowledge Distillation [18.345183818638475]
Continual learning (CL) can serve as a remedy through enabling knowledge-transfer across sequentially arriving tasks.
We develop a transformer-based CL architecture for learning bimodal vision-and-language tasks.
Our approach scales to a large number of tasks because it requires little memory and time overhead.
arXiv Detail & Related papers (2023-03-25T10:16:53Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - Continual Learning with Transformers for Image Classification [12.028617058465333]
In computer vision, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past.
We develop a solution called Adaptive Distillation of Adapters (ADA) to perform continual learning.
We empirically demonstrate on different classification tasks that this method maintains a good predictive performance without retraining the model.
arXiv Detail & Related papers (2022-06-28T15:30:10Z) - Training Deep Networks from Zero to Hero: avoiding pitfalls and going
beyond [59.94347858883343]
This tutorial covers the basic steps as well as more recent options to improve models.
It can be particularly useful for datasets that are not as well prepared as those used in challenges.
arXiv Detail & Related papers (2021-09-06T21:31:42Z) - Fractional Transfer Learning for Deep Model-Based Reinforcement Learning [0.966840768820136]
Reinforcement learning (RL) is well known for requiring large amounts of data in order for RL agents to learn to perform complex tasks.
Recent progress in model-based RL allows agents to be much more data-efficient.
We present a simple alternative approach: fractional transfer learning.
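
A minimal sketch of the idea, under the assumption that "fractional" means adding a fraction of the pre-trained source weights to a freshly initialized target network rather than copying or freezing them wholesale; the helper below is illustrative, not the authors' exact formulation.

import torch

def fractional_transfer(target_model, source_model, fraction=0.2):
    """Blend a fraction of source weights into a randomly initialized target.

    fraction=0.0 reproduces training from scratch; fraction=1.0 approximates
    full weight transfer; values in between transfer partial knowledge.
    """
    with torch.no_grad():
        src = dict(source_model.named_parameters())
        for name, p in target_model.named_parameters():
            if name in src and src[name].shape == p.shape:
                p.add_(fraction * src[name])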
arXiv Detail & Related papers (2021-08-14T12:44:42Z) - What is being transferred in transfer learning? [51.6991244438545]
We show that when training from pre-trained weights, the model stays in the same basin in the loss landscape, and different instances of such a model are similar in feature space and close in parameter space.
arXiv Detail & Related papers (2020-08-26T17:23:40Z) - Exploring the Efficacy of Transfer Learning in Mining Image-Based
Software Artifacts [1.5285292154680243]
Transfer learning allows us to train deep architectures requiring a large number of learned parameters, even if the amount of available data is limited.
Here we explore the applicability of transfer learning utilizing models pre-trained on non-software engineering data applied to the problem of classifying software diagrams.
arXiv Detail & Related papers (2020-03-03T16:41:45Z) - Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G
Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC and review the open problems these approaches face.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z) - Parameter-Efficient Transfer from Sequential Behaviors for User Modeling
and Recommendation [111.44445634272235]
In this paper, we develop a parameter-efficient transfer learning architecture, termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
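
The injection idea can be sketched generically: freeze a pre-trained layer and train only a small residual network placed beside it. The class below is a hypothetical illustration of that pattern (the paper's actual model patches are small residual blocks inside a convolutional sequential recommender); the names and sizes are chosen for illustration.

import torch.nn as nn

class ModelPatch(nn.Module):
    """Wrap a frozen pre-trained layer with a small trainable patch."""

    def __init__(self, layer: nn.Module, dim: int, bottleneck: int = 16):
        super().__init__()
        self.layer = layer
        for p in self.layer.parameters():
            p.requires_grad = False  # pre-trained weights stay unaltered
        self.patch = nn.Sequential(  # the only part that is fine-tuned
            nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim)
        )

    def forward(self, x):
        h = self.layer(x)
        return h + self.patch(h)  # re-learned residual correction

Wrapping each block of a pre-trained backbone this way leaves all original parameters untouched while only the bottleneck patches receive gradient updates, which is what makes the transfer parameter-efficient.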
arXiv Detail & Related papers (2020-01-13T14:09:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.