Lifelong Teacher-Student Network Learning
- URL: http://arxiv.org/abs/2107.04689v1
- Date: Fri, 9 Jul 2021 21:25:56 GMT
- Title: Lifelong Teacher-Student Network Learning
- Authors: Fei Ye and Adrian G. Bors
- Abstract summary: We propose a novel lifelong learning methodology by employing a Teacher-Student network framework.
The Teacher is trained to preserve and replay past knowledge corresponding to the probabilistic representations of previously learnt databases.
The Student module is trained to capture both continuous and discrete underlying data representations across different domains.
- Score: 15.350366047108103
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: A unique cognitive capability of humans consists in their ability to acquire
new knowledge and skills from a sequence of experiences. Meanwhile, artificial
intelligence systems are good at learning only the last given task without
being able to remember the databases learnt in the past. We propose a novel
lifelong learning methodology by employing a Teacher-Student network framework.
While the Student module is trained with a new given database, the Teacher
module would remind the Student about the information learnt in the past. The
Teacher, implemented by a Generative Adversarial Network (GAN), is trained to
preserve and replay past knowledge corresponding to the probabilistic
representations of previously learnt databases. Meanwhile, the Student module is
implemented by a Variational Autoencoder (VAE) which infers its latent variable
representation from both the output of the Teacher module as well as from the
newly available database. Moreover, the Student module is trained to capture
both continuous and discrete underlying data representations across different
domains. The proposed lifelong learning framework is applied in supervised,
semi-supervised and unsupervised training. The code is available at:
https://github.com/dtuzi123/Lifelong-Teacher-Student-Network-Learning
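For intuition, below is a minimal PyTorch sketch of the Teacher-Student replay idea described above: a GAN Teacher replays samples resembling previously learnt databases, while a VAE Student with both a continuous (Gaussian) and a discrete (categorical, Gumbel-Softmax relaxed) latent code is trained on a mixture of replayed and newly available data. The network sizes, architectures, hyper-parameters and the helper train_on_new_task are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed (illustrative) dimensions; the paper does not prescribe these values.
LATENT_CONT, LATENT_DISC, DATA_DIM = 32, 10, 784

class Teacher(nn.Module):
    """GAN-style generator that replays samples resembling previously learnt databases."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_CONT, 256), nn.ReLU(),
                                 nn.Linear(256, DATA_DIM), nn.Sigmoid())

    def forward(self, n):
        # Sample n replay examples from random noise.
        return self.net(torch.randn(n, LATENT_CONT))

class Student(nn.Module):
    """VAE with a continuous (Gaussian) and a discrete (categorical) latent code."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(DATA_DIM, 256)
        self.mu = nn.Linear(256, LATENT_CONT)
        self.logvar = nn.Linear(256, LATENT_CONT)
        self.logits = nn.Linear(256, LATENT_DISC)
        self.dec = nn.Sequential(nn.Linear(LATENT_CONT + LATENT_DISC, 256), nn.ReLU(),
                                 nn.Linear(256, DATA_DIM), nn.Sigmoid())

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z_cont = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        z_disc = F.gumbel_softmax(self.logits(h), tau=0.5)             # relaxed discrete code
        x_rec = self.dec(torch.cat([z_cont, z_disc], dim=-1))
        # Gaussian KL term only; the KL term of the discrete code is omitted for brevity.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
        return x_rec, kl

def train_on_new_task(student, teacher, new_data, replay_ratio=1.0):
    """One simplified lifelong-learning step: the Teacher 'reminds' the Student of past
    databases via generated replay, and the Student is fitted on the mixed data.
    new_data is assumed to be a tensor of shape (N, DATA_DIM) with values in [0, 1]."""
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    with torch.no_grad():
        replay = teacher(int(len(new_data) * replay_ratio))   # replayed past knowledge
    batch = torch.cat([new_data, replay], dim=0)
    for _ in range(100):                                       # illustrative number of updates
        x_rec, kl = student(batch)
        loss = F.binary_cross_entropy(x_rec, batch) + 1e-3 * kl
        opt.zero_grad()
        loss.backward()
        opt.step()
    # In the full method the Teacher (GAN) would also be updated on this mixture, so that
    # it can replay the newly learnt database when later tasks arrive.

The design point the sketch is meant to convey is that past knowledge is replayed generatively rather than stored, so no raw data from previous databases needs to be kept as new tasks arrive.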
Related papers
- Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning [70.64617500380287]
Continual learning allows models to learn from new data while retaining previously learned knowledge.
The label information of the images carries semantic knowledge that can be related to previously acquired knowledge of semantic classes.
We propose integrating semantic guidance within and across tasks by capturing semantic similarity using text embeddings.
arXiv Detail & Related papers (2024-08-02T07:51:44Z) - Self-Regulated Data-Free Knowledge Amalgamation for Text Classification [9.169836450935724]
We develop a lightweight student network that can learn from multiple teacher models without accessing their original training data.
To accomplish this, we propose STRATANET, a modeling framework that produces text data tailored to each teacher.
We evaluate our method on three benchmark text classification datasets with varying labels or domains.
arXiv Detail & Related papers (2024-06-16T21:13:30Z) - Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning [63.850451635362425]
Continual learning requires a model to adapt to ongoing changes in the data distribution.
We show that combining a large language model with an image generation model can provide useful premonitions of future data changes.
We find that the backbone of our pre-trained networks can learn representations useful for the downstream continual learning problem.
arXiv Detail & Related papers (2024-03-12T06:29:54Z) - YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data generated by YODA yields a significant performance gain over standard SFT.
arXiv Detail & Related papers (2024-01-28T14:32:15Z) - Semi-Supervised Lifelong Language Learning [81.0685290973989]
We explore a novel setting, semi-supervised lifelong language learning (SSLL), where a model learns sequentially arriving language tasks with both labeled and unlabeled data.
Specifically, we dedicate task-specific modules to alleviate catastrophic forgetting and design two modules to exploit unlabeled data.
Experimental results on various language tasks demonstrate our model's effectiveness and superiority over competitive baselines.
arXiv Detail & Related papers (2022-11-23T15:51:33Z) - Can Bad Teaching Induce Forgetting? Unlearning in Deep Networks using an Incompetent Teacher [6.884272840652062]
We propose a novel machine unlearning method by exploring the utility of competent and incompetent teachers in a student-teacher framework to induce forgetfulness.
The knowledge from the competent and incompetent teachers is selectively transferred to the student to obtain a model that doesn't contain any information about the forget data.
We introduce the Zero Retrain Forgetting (ZRF) metric to evaluate any unlearning method.
arXiv Detail & Related papers (2022-05-17T05:13:17Z) - Self-supervised Text-to-SQL Learning with Header Alignment Training [4.518012967046983]
Self-supervised learning is a de facto component of the recent success of deep learning in various fields.
We propose a novel self-supervised learning framework to tackle the discrepancy between a self-supervised learning objective and a task-specific objective.
Our method is effective for training the model with scarce labeled data.
arXiv Detail & Related papers (2021-03-11T01:09:59Z) - SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both labels and pseudo labels to generate final feature embeddings.
arXiv Detail & Related papers (2020-11-20T08:26:10Z) - Teaching to Learn: Sequential Teaching of Agents with Inner States [20.556373950863247]
We introduce a multi-agent formulation in which learners' inner state may change with the teaching interaction.
In order to teach such learners, we propose an optimal control approach that takes the future performance of the learner after teaching into account.
arXiv Detail & Related papers (2020-09-14T07:03:15Z) - Point Adversarial Self Mining: A Simple Method for Facial Expression Recognition [79.75964372862279]
We propose Point Adversarial Self Mining (PASM) to improve the recognition accuracy in facial expression recognition.
PASM uses a point adversarial attack method and a trained teacher network to locate the most informative position related to the target task.
The adaptive generation of learning materials and the teacher/student updates can be repeated multiple times, iteratively improving the network's capability.
arXiv Detail & Related papers (2020-08-26T06:39:24Z) - A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks [1.1802674324027231]
Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art results using unlabeled data.
We provide an overview of the taxonomy for self-supervised learning and transfer learning, and highlight some prominent methods for designing pre-training tasks across different domains.
arXiv Detail & Related papers (2020-07-01T22:55:48Z) - Role-Wise Data Augmentation for Knowledge Distillation [48.115719640111394]
Knowledge Distillation (KD) is a common method for transferring the "knowledge" learned by one machine learning model into another.
We design data augmentation agents with distinct roles to facilitate knowledge distillation.
We find empirically that specially tailored data points enable the teacher's knowledge to be demonstrated more effectively to the student.
arXiv Detail & Related papers (2020-04-19T14:22:17Z)