Total Recall: a Customized Continual Learning Method for Neural Semantic
Parsers
- URL: http://arxiv.org/abs/2109.05186v2
- Date: Wed, 15 Sep 2021 04:16:04 GMT
- Title: Total Recall: a Customized Continual Learning Method for Neural Semantic
Parsers
- Authors: Zhuang Li, Lizhen Qu, Gholamreza Haffari
- Abstract summary: A neural semantic parser learns tasks sequentially without accessing full training data from previous tasks.
We propose TotalRecall, a continual learning method designed for neural semantic parsers from two aspects.
We demonstrate that a neural semantic parser trained with TotalRecall achieves superior performance to one trained directly with the SOTA continual learning algorithms and achieves a 3-6 times speedup compared to re-training from scratch.
- Score: 38.035925090154024
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates continual learning for semantic parsing. In this
setting, a neural semantic parser learns tasks sequentially without accessing
full training data from previous tasks. Direct application of the SOTA
continual learning algorithms to this problem fails to achieve comparable
performance with re-training models with all seen tasks because they have not
considered the special properties of structured outputs yielded by semantic
parsers. Therefore, we propose TotalRecall, a continual learning method
designed for neural semantic parsers from two aspects: i) a sampling method for
memory replay that diversifies logical form templates and balances
distributions of parse actions in a memory; ii) a two-stage training method
that significantly improves generalization capability of the parsers across
tasks. We conduct extensive experiments to study the research problems involved
in continual semantic parsing and demonstrate that a neural semantic parser
trained with TotalRecall achieves superior performance to the one trained
directly with the SOTA continual learning algorithms and achieves a 3-6 times
speedup compared to re-training from scratch. Code and datasets are available
at: https://github.com/zhuang-li/cl_nsp.
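To make the first component concrete, below is a minimal sketch of template-diversified, action-balanced replay sampling in Python. The helper names (`template_of`, `actions_of`) and the greedy round-robin heuristic are assumptions for illustration only; the authors' actual algorithm lives in the linked repository.

```python
import random
from collections import Counter, defaultdict

def sample_replay_memory(examples, memory_size, template_of, actions_of, seed=0):
    """Illustrative replay sampling for continual semantic parsing (a sketch,
    not the TotalRecall implementation): keep logical-form templates diverse
    and parse-action frequencies balanced inside the memory.

    examples    : training instances of the current task
    template_of : fn(example) -> hashable logical-form template
    actions_of  : fn(example) -> list of parse actions in its derivation
    """
    rng = random.Random(seed)
    by_template = defaultdict(list)
    for ex in examples:
        by_template[template_of(ex)].append(ex)

    memory, action_counts = [], Counter()
    templates = list(by_template)
    rng.shuffle(templates)

    # Round-robin over templates so rare templates are still represented;
    # within each template, pick the example whose parse actions are
    # currently least frequent in the memory (greedy balancing).
    while len(memory) < memory_size and any(by_template.values()):
        for tmpl in templates:
            pool = by_template[tmpl]
            if not pool or len(memory) >= memory_size:
                continue
            best = min(pool, key=lambda ex: sum(action_counts[a] for a in actions_of(ex)))
            pool.remove(best)
            memory.append(best)
            action_counts.update(actions_of(best))
    return memory
```

The second component, two-stage training, would then interleave training on the new task with fine-tuning on this memory; that loop is omitted here.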
Related papers
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Active Learning Guided by Efficient Surrogate Learners [25.52920030051264]
Re-training a deep learning model each time a single data point receives a new label is impractical.
We introduce a new active learning algorithm that harnesses the power of a Gaussian process surrogate in conjunction with the neural network principal learner.
Our proposed model adeptly updates the surrogate learner for every new data instance, enabling it to emulate and capitalize on the continuous learning dynamics of the neural network.
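As a rough illustration of this surrogate idea (a generic uncertainty-sampling sketch, not this paper's specific algorithm), a cheap Gaussian process can be refit on the labeled pool after every new label and used to pick the unlabeled point it is least certain about, so the expensive neural learner only needs occasional retraining. The scikit-learn calls are standard; the data and usage are assumed.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def select_next_query(X_labeled, y_labeled, X_pool):
    """Uncertainty sampling with a GP surrogate standing in for the
    neural principal learner between its (expensive) retrainings."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
    gp.fit(X_labeled, y_labeled)                  # cheap per-label update
    _, std = gp.predict(X_pool, return_std=True)  # predictive uncertainty
    return int(np.argmax(std))                    # most uncertain candidate

# Toy usage:
# X_lab, y_lab = np.random.rand(10, 4), np.random.rand(10)
# idx = select_next_query(X_lab, y_lab, np.random.rand(100, 4))
```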
arXiv Detail & Related papers (2023-01-07T01:35:25Z)
- The impact of memory on learning sequence-to-sequence tasks [6.603326895384289]
Recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks.
We propose a model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences.
arXiv Detail & Related papers (2022-05-29T14:57:33Z)
- Lifelong Learning Without a Task Oracle [13.331659934508764]
Supervised deep neural networks are known to undergo a sharp decline in the accuracy of older tasks when new tasks are learned.
We propose and compare several candidate task-assigning mappers which require very little memory overhead.
Best-performing variants only impose an average cost of 1.7% parameter memory increase.
arXiv Detail & Related papers (2020-11-09T21:30:31Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization in which data collected across different domains help improving the learning performance at each other task.
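One plausible way to write down such a coupling (an assumption for illustration, not necessarily the paper's exact formulation) is a proximity-regularized multi-task objective:

```latex
% Hypothetical proximity-coupled multi-task objective: each task t fits its
% own parameters w_t on its own data, while a penalty keeps every w_t close
% to a shared centroid \bar{w}; larger \lambda means tighter coupling.
\min_{w_1,\dots,w_T,\;\bar{w}}\;
  \sum_{t=1}^{T} \frac{1}{n_t} \sum_{i=1}^{n_t}
    \ell\!\bigl(f_{w_t}(x_i^{(t)}),\, y_i^{(t)}\bigr)
  \;+\; \frac{\lambda}{2} \sum_{t=1}^{T} \bigl\lVert w_t - \bar{w} \bigr\rVert_2^2
```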
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
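The general retrieve-then-generate pattern behind such models can be sketched in a few lines. The token-overlap retriever and stub generator below are deliberately simplistic stand-ins for the paper's dense retriever and pre-trained seq2seq generator.

```python
def retrieve(query, corpus, k=3):
    """Toy non-parametric memory: rank passages by token overlap with the query."""
    q_tokens = set(query.lower().split())
    ranked = sorted(corpus, key=lambda p: len(q_tokens & set(p.lower().split())), reverse=True)
    return ranked[:k]

def generate(query, passages):
    """Stub for the parametric generator: a real model would condition on the
    query concatenated with the retrieved passages."""
    context = " ".join(passages)
    return f"[answer conditioned on: {query} | {context[:80]}...]"

def rag_answer(query, corpus, k=3):
    return generate(query, retrieve(query, corpus, k))

# corpus = ["Paris is the capital of France.", "The Nile flows through Egypt."]
# print(rag_answer("What is the capital of France?", corpus))
```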
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm which directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)