Continual Learning for Natural Language Generation in Task-oriented
Dialog Systems
- URL: http://arxiv.org/abs/2010.00910v1
- Date: Fri, 2 Oct 2020 10:32:29 GMT
- Title: Continual Learning for Natural Language Generation in Task-oriented
Dialog Systems
- Authors: Fei Mi, Liangwei Chen, Mengjie Zhao, Minlie Huang and Boi Faltings
- Abstract summary: Natural language generation (NLG) is an essential component of task-oriented dialog systems.
We study NLG in a "continual learning" setting to expand its knowledge to new domains or functionalities incrementally.
The major challenge towards this goal is catastrophic forgetting, meaning that a continually trained model tends to forget the knowledge it has learned before.
- Score: 72.92029584113676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural language generation (NLG) is an essential component of task-oriented
dialog systems. Despite the recent success of neural approaches for NLG, they
are typically developed in an offline manner for particular domains. To better
fit real-life applications where new data come in a stream, we study NLG in a
"continual learning" setting to expand its knowledge to new domains or
functionalities incrementally. The major challenge towards this goal is
catastrophic forgetting, meaning that a continually trained model tends to
forget the knowledge it has learned before. To this end, we propose a method
called ARPER (Adaptively Regularized Prioritized Exemplar Replay) by replaying
prioritized historical exemplars, together with an adaptive regularization
technique based on Elastic Weight Consolidation. Extensive experiments to
continually learn new domains and intents are conducted on MultiWoZ-2.0 to
benchmark ARPER with a wide range of techniques. Empirical results demonstrate
that ARPER significantly outperforms other methods by effectively mitigating
the detrimental catastrophic forgetting issue.
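The abstract names the two ingredients of ARPER, replaying prioritized historical exemplars and an adaptive regularization term based on Elastic Weight Consolidation, but gives no implementation details. The snippet below is only a minimal sketch of how such a combined objective could look; the names nlg_loss, select_exemplars, score_fn, fisher, and lambda_reg are illustrative assumptions, and the paper's actual exemplar selection and adaptive weighting scheme are more elaborate.

```python
# Hedged sketch of an ARPER-style update: replay a small set of prioritized
# exemplars from earlier domains and add an EWC-style quadratic penalty that
# discourages moving parameters that mattered for previously learned domains.
# All names below are illustrative assumptions, not the authors' code.
import torch


def select_exemplars(old_examples, model, budget, score_fn):
    """Keep the `budget` historical examples with the highest priority score."""
    scored = sorted(old_examples, key=lambda ex: score_fn(model, ex), reverse=True)
    return scored[:budget]


def arper_style_loss(model, new_batch, exemplar_batch, nlg_loss,
                     old_params, fisher, lambda_reg):
    # Standard NLG loss on the new domain plus the replayed exemplars.
    task_loss = nlg_loss(model, new_batch) + nlg_loss(model, exemplar_batch)

    # EWC-style regularization: penalize movement of parameters with high
    # (diagonal) Fisher information estimated on earlier domains.
    reg = 0.0
    for name, p in model.named_parameters():
        if name in old_params:
            reg = reg + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return task_loss + lambda_reg * reg
```

Note that the constant lambda_reg here stands in for the adaptive regularization weight described in the abstract.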
Related papers
- Preserving Generalization of Language models in Few-shot Continual Relation Extraction [34.68364639170838]
Few-shot Continual Relation Extraction (FCRE) is an emerging and dynamic area of study.
We introduce a novel method that leverages often-discarded language model heads.
Our experimental results underscore the efficacy of the proposed method and offer valuable insights for future work.
arXiv Detail & Related papers (2024-10-01T02:22:34Z)
- P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task [94.08478298711789]
Embodied Everyday Task is a popular task in the embodied AI community.
Natural language instructions often lack explicit task planning.
Extensive training is required to equip models with knowledge of the task environment.
arXiv Detail & Related papers (2024-09-17T15:29:34Z)
- Sequential Editing for Lifelong Training of Speech Recognition Models [10.770491329674401]
Fine-tuning solely on the new domain risks Catastrophic Forgetting (CF).
We propose Sequential Model Editing as a novel method to continually learn new domains in ASR systems.
Our study demonstrates up to 15% Word Error Rate Reduction (WERR) over the fine-tuning baseline, and superior efficiency over other LLL techniques on the CommonVoice English multi-accent dataset.
arXiv Detail & Related papers (2024-06-25T20:52:09Z)
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- A Unified and General Framework for Continual Learning [58.72671755989431]
Continual Learning (CL) focuses on learning from dynamic and changing data distributions while retaining previously acquired knowledge.
Various methods have been developed to address the challenge of catastrophic forgetting, including regularization-based, Bayesian-based, and memory-replay-based techniques.
This research aims to bridge this gap by introducing a comprehensive and overarching framework that encompasses and reconciles these existing methodologies.
arXiv Detail & Related papers (2024-03-20T02:21:44Z)
- Adaptive Explainable Continual Learning Framework for Regression Problems with Focus on Power Forecasts [0.0]
Two continual learning scenarios will be proposed to describe the potential challenges in this context.
As the amount of data in applications keeps increasing, deep neural networks have to learn new tasks while overcoming forgetting of the knowledge obtained from old tasks.
Research topics are related but not limited to developing continual deep learning algorithms, strategies for non-stationarity detection in data streams, explainable and visualizable artificial intelligence, etc.
arXiv Detail & Related papers (2021-08-24T14:59:10Z)
- DRILL: Dynamic Representations for Imbalanced Lifelong Learning [15.606651610221416]
Continual or lifelong learning has been a long-standing challenge in machine learning.
We introduce DRILL, a novel continual learning architecture for open-domain text classification.
arXiv Detail & Related papers (2021-05-18T11:36:37Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Continual Deep Learning by Functional Regularisation of Memorable Past [95.97578574330934]
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
We propose a new functional-regularisation approach that utilises a few memorable past examples that are crucial to avoid forgetting (see the sketch after this list).
Our method achieves state-of-the-art performance on standard benchmarks and opens a new direction for life-long learning where regularisation and memory-based methods are naturally combined.
arXiv Detail & Related papers (2020-04-29T10:47:54Z)
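The last entry above states its idea only at a high level. One simplified reading of functional regularisation with memorable past examples, sketched below under assumptions, is to keep a small memory and penalise drift of the current model's predictions on it relative to a frozen snapshot of the previous model; the paper's actual formulation is more involved than this distillation-style simplification, and the names task_loss_fn, tau, and alpha are hypothetical.

```python
import torch
import torch.nn.functional as F


def functional_reg_loss(model, old_model, new_batch, memorable_batch,
                        task_loss_fn, tau=1.0, alpha=1.0):
    """Hedged sketch: regularize the function (predictions) on a few
    memorable past examples rather than the weights themselves."""
    loss = task_loss_fn(model, new_batch)

    with torch.no_grad():
        old_logits = old_model(memorable_batch)  # frozen snapshot of the past model

    new_logits = model(memorable_batch)

    # Penalize drift of the predictive distribution on the memorable points.
    func_penalty = F.kl_div(F.log_softmax(new_logits / tau, dim=-1),
                            F.softmax(old_logits / tau, dim=-1),
                            reduction="batchmean")
    return loss + alpha * func_penalty
```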
This list is automatically generated from the titles and abstracts of the papers in this site.