Self-Supervised Knowledge Transfer via Loosely Supervised Auxiliary
Tasks
- URL: http://arxiv.org/abs/2110.12696v1
- Date: Mon, 25 Oct 2021 07:18:26 GMT
- Title: Self-Supervised Knowledge Transfer via Loosely Supervised Auxiliary
Tasks
- Authors: Seungbum Hong, Jihun Yoon, Junmo Kim, Min-Kook Choi
- Abstract summary: Knowledge transfer using convolutional neural networks (CNNs) can help efficiently train a CNN with fewer parameters or maximize the generalization performance under limited supervision.
We propose a simple yet powerful knowledge transfer methodology without any restrictions regarding the network structure or dataset used.
We devise a training methodology that transfers previously learned knowledge to the current training process as an auxiliary task for the target task through self-supervision using a soft label.
- Score: 24.041268664220294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge transfer using convolutional neural networks (CNNs) can help
efficiently train a CNN with fewer parameters or maximize the generalization
performance under limited supervision. To enable a more efficient transfer of
pretrained knowledge under relaxed conditions, we propose a simple yet powerful
knowledge transfer methodology without any restrictions regarding the network
structure or dataset used, namely self-supervised knowledge transfer (SSKT),
via loosely supervised auxiliary tasks. For this, we devise a training
methodology that transfers previously learned knowledge to the current training
process as an auxiliary task for the target task through self-supervision using
a soft label. The SSKT is independent of the network structure and dataset, and
is trained differently from existing knowledge transfer methods; hence, it has
an advantage in that the prior knowledge acquired from various tasks can be
naturally transferred during the training process to the target task.
Furthermore, it can improve the generalization performance on most datasets
through the proposed knowledge transfer between different problem domains from
multiple source networks. SSKT outperforms the other transfer learning methods
(KD, DML, and MAXL) through experiments under various knowledge transfer
settings. The source code will be made available to the public.
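To make the described training recipe concrete, here is a minimal sketch of the soft-label auxiliary-task idea, assuming a PyTorch/torchvision setup with a ResNet-18 target backbone and an ImageNet-pretrained ResNet-18 as the source network; the auxiliary head design, the KD-style temperature, and the loss weight are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch of the SSKT idea as described in the abstract: the target
# network learns its own task while an auxiliary head is self-supervised by
# soft labels produced by a frozen, pretrained source network.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

class SSKTTarget(nn.Module):
    def __init__(self, num_target_classes, num_source_classes, feat_dim=512):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.target_head = nn.Linear(feat_dim, num_target_classes)  # primary task
        self.aux_head = nn.Linear(feat_dim, num_source_classes)     # auxiliary task

    def forward(self, x):
        f = self.features(x).flatten(1)
        return self.target_head(f), self.aux_head(f)

def sskt_loss(target_logits, aux_logits, labels, source_soft_labels,
              temperature=4.0, aux_weight=1.0):
    # Supervised loss on the target task.
    task_loss = F.cross_entropy(target_logits, labels)
    # Auxiliary loss: match the source network's softened predictions.
    aux_loss = F.kl_div(
        F.log_softmax(aux_logits / temperature, dim=1),
        F.softmax(source_soft_labels / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return task_loss + aux_weight * aux_loss

# One training step: the source network only produces soft labels, is never
# updated, and places no constraint on the target architecture.
source_net = models.resnet18(weights="IMAGENET1K_V1").eval()
target_net = SSKTTarget(num_target_classes=100, num_source_classes=1000)
optimizer = torch.optim.SGD(target_net.parameters(), lr=0.1, momentum=0.9)

images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 100, (8,))
with torch.no_grad():
    soft = source_net(images)
target_logits, aux_logits = target_net(images)
loss = sskt_loss(target_logits, aux_logits, labels, soft)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```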
Related papers
- Adaptive Intellect Unleashed: The Feasibility of Knowledge Transfer in
Large Language Models [25.23472658127685]
We conduct the first empirical study on using knowledge transfer to improve the generalization ability of large language models (LLMs).
Our proposed general knowledge transfer approach guides the LLM towards a similar and familiar API or code snippet it has encountered before, improving the model's generalization ability for unseen knowledge.
We apply this approach to three software engineering tasks: API inference, code example generation, and FQN inference, and find that transfer span, transfer strategy, and transfer architecture are key factors affecting the method.
arXiv Detail & Related papers (2023-08-09T08:26:22Z)
- Evaluating the structure of cognitive tasks with transfer learning [67.22168759751541]
This study investigates the transferability of deep learning representations between different EEG decoding tasks.
We conduct extensive experiments using state-of-the-art decoding models on two recently released EEG datasets.
arXiv Detail & Related papers (2023-07-28T14:51:09Z)
- Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning [0.6445605125467572]
A crucial challenge in reinforcement learning is to reduce the number of interactions with the environment that an agent requires to master a given task.
Transfer learning proposes to address this issue by re-using knowledge from previously learned tasks.
The goal of this paper is to address these issues with modular multi-source transfer learning techniques.
arXiv Detail & Related papers (2022-05-28T12:04:52Z)
- Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation [51.21190751266442]
Domain adaptation (DA) addresses scenarios in which the test data does not fully follow the same distribution as the training data.
By learning from large-scale unlabeled samples, self-supervised learning has now become a new trend in deep learning.
We propose a novel Self-Supervised Graph Neural Network (SSG) to enable more effective inter-task information exchange and knowledge sharing.
arXiv Detail & Related papers (2022-04-08T03:37:56Z)
- Hierarchical Self-supervised Augmented Knowledge Distillation [1.9355744690301404]
We propose an alternative self-supervised augmented task to guide the network to learn the joint distribution of the original recognition task and self-supervised auxiliary task.
This is demonstrated to provide richer knowledge that improves representation power without losing the normal classification capability.
Our method significantly surpasses the previous SOTA SSKD with an average improvement of 2.56% on CIFAR-100 and an improvement of 0.77% on ImageNet.
arXiv Detail & Related papers (2021-07-29T02:57:21Z)
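As a rough illustration of the joint-distribution idea in the entry above, the sketch below builds a single augmented head over (class, rotation) pairs, assuming rotation as the self-supervised transform (a common choice, not stated in the summary); the hierarchical distillation part of the method is omitted.

```python
# Hedged sketch: one augmented head classifies the joint label (class, rotation),
# so the network models p(class, rotation | x) rather than p(class | x) alone.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES, NUM_ROTATIONS = 100, 4

def rotate_batch(x, y):
    """Stack the batch under 0/90/180/270 degree rotations and build joint labels y*4+r."""
    rotated = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(NUM_ROTATIONS)], dim=0)
    rot_ids = torch.arange(NUM_ROTATIONS).repeat_interleave(x.size(0))
    joint_labels = y.repeat(NUM_ROTATIONS) * NUM_ROTATIONS + rot_ids
    return rotated, joint_labels

class JointClassifier(nn.Module):
    """A toy backbone with one augmented head over the joint (class, rotation) space."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.joint_head = nn.Linear(32, NUM_CLASSES * NUM_ROTATIONS)

    def forward(self, x):
        return self.joint_head(self.backbone(x))

images, labels = torch.randn(8, 3, 32, 32), torch.randint(0, NUM_CLASSES, (8,))
rotated, joint_labels = rotate_batch(images, labels)
loss = F.cross_entropy(JointClassifier()(rotated), joint_labels)
```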
- Network-Agnostic Knowledge Transfer for Medical Image Segmentation [2.25146058725705]
We propose a knowledge transfer approach from a teacher to a student network wherein we train the student on an independent transferal dataset.
We studied knowledge transfer from a single teacher, combination of knowledge transfer and fine-tuning, and knowledge transfer from multiple teachers.
The proposed algorithm is effective for knowledge transfer and easily tunable.
arXiv Detail & Related papers (2021-01-23T19:06:14Z)
- CosSGD: Nonlinear Quantization for Communication-efficient Federated Learning [62.65937719264881]
Federated learning facilitates learning across clients without transferring local data on these clients to a central server.
We propose a nonlinear quantization for compressed gradient descent, which can be easily utilized in federated learning.
Our system significantly reduces the communication cost by up to three orders of magnitude, while maintaining convergence and accuracy of the training process.
arXiv Detail & Related papers (2020-12-15T12:20:28Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED)
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Unsupervised Transfer Learning for Spatiotemporal Predictive Networks [90.67309545798224]
We study how to transfer knowledge from a zoo of models trained without supervision to another network.
Our motivation is that models are expected to understand complex dynamics from different sources.
Our approach yields significant improvements on three benchmarks for spatiotemporal prediction, and benefits the target task even from less relevant source models.
arXiv Detail & Related papers (2020-09-24T15:40:55Z)
- Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate this uniformity regularization on its ability to facilitate adaptation to unseen tasks and data.
arXiv Detail & Related papers (2020-06-30T04:39:36Z)
- Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition [2.320417845168326]
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models.
The technique of pre-training on one task and then retraining on a new one is called transfer learning.
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
arXiv Detail & Related papers (2020-01-02T14:18:25Z)
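The pre-train-then-retrain recipe described in the entry above can be sketched as follows; the ResNet-18 backbone, the frozen-feature policy, and the 62-class character label space are illustrative assumptions rather than the paper's setup.

```python
# Hedged sketch of transfer learning by fine-tuning: start from an
# ImageNet-pretrained backbone, swap the classifier for the character
# label space, and retrain only the new head.
import torch
import torch.nn as nn
import torchvision.models as models

NUM_CHAR_CLASSES = 62  # e.g., digits plus upper/lower-case letters (assumption)

model = models.resnet18(weights="IMAGENET1K_V1")       # knowledge from ImageNet
for param in model.parameters():                       # freeze the pretrained backbone
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CHAR_CLASSES)  # new task head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
images = torch.randn(8, 3, 224, 224)                   # placeholder character images
labels = torch.randint(0, NUM_CHAR_CLASSES, (8,))

logits = model(images)
loss = nn.functional.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```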