Continual Learning with Dirichlet Generative-based Rehearsal
- URL: http://arxiv.org/abs/2309.06917v1
- Date: Wed, 13 Sep 2023 12:30:03 GMT
- Title: Continual Learning with Dirichlet Generative-based Rehearsal
- Authors: Min Zeng, Wei Xue, Qifeng Liu, Yike Guo
- Abstract summary: We present Dirichlet Continual Learning, a novel generative-based rehearsal strategy for task-oriented dialogue systems.
We also introduce Jensen-Shannon Knowledge Distillation (JSKD), a robust logit-based knowledge distillation method.
Our experiments confirm the efficacy of our approach in both intent detection and slot-filling tasks, outperforming state-of-the-art methods.
- Score: 22.314195832409755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent data-driven task-oriented dialogue systems (ToDs) struggle
with incremental learning due to computational constraints and time-consuming
retraining. Continual Learning (CL) attempts to solve this by
avoiding intensive pre-training, but it faces the problem of catastrophic
forgetting (CF). While generative-based rehearsal CL methods have made
significant strides, generating pseudo samples that accurately reflect the
underlying task-specific distribution is still a challenge. In this paper, we
present Dirichlet Continual Learning (DCL), a novel generative-based rehearsal
strategy for CL. Unlike the traditionally used Gaussian latent variable in the
Conditional Variational Autoencoder (CVAE), DCL leverages the flexibility and
versatility of the Dirichlet distribution to model the latent prior variable.
This enables it to efficiently capture sentence-level features of previous
tasks and effectively guide the generation of pseudo samples. In addition, we
introduce Jensen-Shannon Knowledge Distillation (JSKD), a robust logit-based
knowledge distillation method that enhances knowledge transfer during pseudo
sample generation. Our experiments confirm the efficacy of our approach in both
intent detection and slot-filling tasks, outperforming state-of-the-art
methods.
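To make the two components concrete, the following minimal PyTorch sketch shows (a) how a latent variable can be drawn from a Dirichlet prior with the reparameterization trick, in place of the Gaussian latent of a standard CVAE, and (b) a Jensen-Shannon divergence between softened teacher and student logits as a logit-based distillation loss in the spirit of JSKD. The function names, the softplus link for the concentration parameters, and the temperature value are illustrative assumptions rather than details taken from the paper's implementation.

```python
# Illustrative sketch only: names and hyperparameters are assumptions,
# not reproduced from the paper's code.
import torch
import torch.nn.functional as F
from torch.distributions import Dirichlet


def sample_dirichlet_latent(alpha_logits: torch.Tensor) -> torch.Tensor:
    """Draw a reparameterized latent z ~ Dirichlet(alpha) from encoder outputs.

    `alpha_logits` would come from the CVAE encoder; softplus keeps the
    concentration parameters strictly positive.
    """
    alpha = F.softplus(alpha_logits) + 1e-4
    return Dirichlet(alpha).rsample()  # differentiable sample on the simplex


def js_distillation_loss(student_logits: torch.Tensor,
                         teacher_logits: torch.Tensor,
                         temperature: float = 2.0) -> torch.Tensor:
    """Jensen-Shannon divergence between softened teacher/student logits."""
    p = F.softmax(teacher_logits / temperature, dim=-1)
    q = F.softmax(student_logits / temperature, dim=-1)
    m = 0.5 * (p + q)
    kl_pm = F.kl_div(m.log(), p, reduction="batchmean")  # KL(p || m)
    kl_qm = F.kl_div(m.log(), q, reduction="batchmean")  # KL(q || m)
    return 0.5 * (kl_pm + kl_qm)
```

In a generative-rehearsal loop, pseudo samples decoded from such Dirichlet latents would be replayed alongside new-task data, with the Jensen-Shannon loss aligning the current model's logits to those of the model trained on previous tasks.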
Related papers
- Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial capability of Machine Learning models in real-world applications is the ability to continuously learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
- ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models [103.45785408116146]
Continual learning (CL) aims to train a model that can solve multiple tasks presented sequentially.
Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks.
However, such methods lack theoretical guarantees, making them prone to unexpected failures.
We bridge this gap by integrating an empirically strong approach into a principled framework, designed to prevent forgetting.
arXiv Detail & Related papers (2024-10-01T12:58:37Z)
- Overcoming Domain Drift in Online Continual Learning [24.86094018430407]
Online Continual Learning (OCL) empowers machine learning models to acquire new knowledge online across a sequence of tasks.
OCL faces a significant challenge: catastrophic forgetting, wherein knowledge learned on previous tasks is substantially overwritten upon encountering new tasks.
We propose a novel rehearsal strategy, Drift-Reducing Rehearsal (DRR), to anchor the domain of old tasks and reduce the negative transfer effects.
arXiv Detail & Related papers (2024-05-15T06:57:18Z)
- Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning [52.046037471678005]
We focus on semi-supervised continual learning (SSCL), where the model progressively learns from partially labeled data with unknown categories.
We propose a novel approach called Dynamic Sub-Graph Distillation (DSGD) for semi-supervised continual learning.
arXiv Detail & Related papers (2023-12-27T04:40:12Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Detecting Morphing Attacks via Continual Incremental Training [10.796380524798744]
The recent Continual Learning (CL) paradigm may represent an effective solution for enabling incremental training, even across multiple sites.
We investigate the performance of different Continual Learning methods in this scenario, simulating a learning model that is updated every time a new chunk of data, even of variable size, is available.
Experimental results reveal that a particular CL method, namely Learning without Forgetting (LwF), is one of the best-performing algorithms.
arXiv Detail & Related papers (2023-07-27T17:48:29Z)
- Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a hybrid generative-discriminative approach to continual learning for classification.
A normalizing flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting.
We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
- Ask-n-Learn: Active Learning via Reliable Gradient Representations for Image Classification [29.43017692274488]
Deep predictive models rely on human supervision in the form of labeled training data.
We propose Ask-n-Learn, an active learning approach based on gradient embeddings obtained using the pseudo-labels estimated in each iteration of the algorithm.
arXiv Detail & Related papers (2020-09-30T05:19:56Z)
- Online Continual Learning under Extreme Memory Constraints [40.80045285324969]
We introduce the novel problem of Memory-Constrained Online Continual Learning (MC-OCL).
MC-OCL imposes strict constraints on the memory overhead that a possible algorithm can use to avoid catastrophic forgetting.
We propose an algorithmic solution to MC-OCL: Batch-level Distillation (BLD), a regularization-based CL approach.
arXiv Detail & Related papers (2020-08-04T13:25:26Z)