Total Recall: a Customized Continual Learning Method for Neural Semantic
Parsers
- URL: http://arxiv.org/abs/2109.05186v2
- Date: Wed, 15 Sep 2021 04:16:04 GMT
- Title: Total Recall: a Customized Continual Learning Method for Neural Semantic
Parsers
- Authors: Zhuang Li, Lizhen Qu, Gholamreza Haffari
- Abstract summary: A neural semantic parser learns tasks sequentially without accessing full training data from previous tasks.
We propose TotalRecall, a continual learning method designed for neural semantic parsers from two aspects.
We demonstrate that a neural semantic parser trained with TotalRecall achieves superior performance to one trained directly with the SOTA continual learning algorithms and achieves a 3-6 times speedup compared to re-training from scratch.
- Score: 38.035925090154024
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates continual learning for semantic parsing. In this
setting, a neural semantic parser learns tasks sequentially without accessing
full training data from previous tasks. Direct application of the SOTA
continual learning algorithms to this problem fails to achieve comparable
performance with re-training models with all seen tasks because they have not
considered the special properties of structured outputs yielded by semantic
parsers. Therefore, we propose TotalRecall, a continual learning method
designed for neural semantic parsers from two aspects: i) a sampling method for
memory replay that diversifies logical form templates and balances
distributions of parse actions in a memory; ii) a two-stage training method
that significantly improves generalization capability of the parsers across
tasks. We conduct extensive experiments to study the research problems involved
in continual semantic parsing and demonstrate that a neural semantic parser
trained with TotalRecall achieves superior performance to the one trained
directly with the SOTA continual learning algorithms and achieves a 3-6 times
speedup compared to re-training from scratch. Code and datasets are available
at: https://github.com/zhuang-li/cl_nsp.
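To make the first component concrete, below is a minimal sketch of template-diversified, action-balanced replay sampling in Python. The helper names (`template_of`, `actions_of`) and the greedy round-robin heuristic are assumptions for illustration only; the authors' actual algorithm lives in the linked repository.

```python
import random
from collections import Counter, defaultdict

def sample_replay_memory(examples, memory_size, template_of, actions_of, seed=0):
    """Illustrative replay sampling for continual semantic parsing (a sketch,
    not the TotalRecall implementation): keep logical-form templates diverse
    and parse-action frequencies balanced inside the memory.

    examples    : training instances of the current task
    template_of : fn(example) -> hashable logical-form template
    actions_of  : fn(example) -> list of parse actions in its derivation
    """
    rng = random.Random(seed)
    by_template = defaultdict(list)
    for ex in examples:
        by_template[template_of(ex)].append(ex)

    memory, action_counts = [], Counter()
    templates = list(by_template)
    rng.shuffle(templates)

    # Round-robin over templates so rare templates are still represented;
    # within each template, pick the example whose parse actions are
    # currently least frequent in the memory (greedy balancing).
    while len(memory) < memory_size and any(by_template.values()):
        for tmpl in templates:
            pool = by_template[tmpl]
            if not pool or len(memory) >= memory_size:
                continue
            best = min(pool, key=lambda ex: sum(action_counts[a] for a in actions_of(ex)))
            pool.remove(best)
            memory.append(best)
            action_counts.update(actions_of(best))
    return memory
```

The second component, two-stage training, would then interleave training on the new task with fine-tuning on this memory; that loop is omitted here.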
Related papers
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Active Learning Guided by Efficient Surrogate Learners [25.52920030051264]
Re-training a deep learning model each time a single data point receives a new label is impractical.
We introduce a new active learning algorithm that harnesses the power of a Gaussian process surrogate in conjunction with the neural network principal learner.
Our proposed model adeptly updates the surrogate learner for every new data instance, enabling it to emulate and capitalize on the continuous learning dynamics of the neural network.
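As a rough illustration of this surrogate idea (a generic uncertainty-sampling sketch, not this paper's specific algorithm), a cheap Gaussian process can be refit on the labeled pool after every new label and used to pick the unlabeled point it is least certain about, so the expensive neural learner only needs occasional retraining. The scikit-learn calls are standard; the data and usage are assumed.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def select_next_query(X_labeled, y_labeled, X_pool):
    """Uncertainty sampling with a GP surrogate standing in for the
    neural principal learner between its (expensive) retrainings."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
    gp.fit(X_labeled, y_labeled)                  # cheap per-label update
    _, std = gp.predict(X_pool, return_std=True)  # predictive uncertainty
    return int(np.argmax(std))                    # most uncertain candidate

# Toy usage:
# X_lab, y_lab = np.random.rand(10, 4), np.random.rand(10)
# idx = select_next_query(X_lab, y_lab, np.random.rand(100, 4))
```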
arXiv Detail & Related papers (2023-01-07T01:35:25Z)
- The impact of memory on learning sequence-to-sequence tasks [6.603326895384289]
Recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks.
We propose a model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences.
arXiv Detail & Related papers (2022-05-29T14:57:33Z)
- Lifelong Learning Without a Task Oracle [13.331659934508764]
Supervised deep neural networks are known to undergo a sharp decline in the accuracy of older tasks when new tasks are learned.
We propose and compare several candidate task-assigning mappers which require very little memory overhead.
Best-performing variants only impose an average cost of 1.7% parameter memory increase.
arXiv Detail & Related papers (2020-11-09T21:30:31Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization in which data collected across different domains help improving the learning performance at each other task.
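One plausible way to write down such a coupling (an assumption for illustration, not necessarily the paper's exact formulation) is a proximity-regularized multi-task objective:

```latex
% Hypothetical proximity-coupled multi-task objective: each task t fits its
% own parameters w_t on its own data, while a penalty keeps every w_t close
% to a shared centroid \bar{w}; larger \lambda means tighter coupling.
\min_{w_1,\dots,w_T,\;\bar{w}}\;
  \sum_{t=1}^{T} \frac{1}{n_t} \sum_{i=1}^{n_t}
    \ell\!\bigl(f_{w_t}(x_i^{(t)}),\, y_i^{(t)}\bigr)
  \;+\; \frac{\lambda}{2} \sum_{t=1}^{T} \bigl\lVert w_t - \bar{w} \bigr\rVert_2^2
```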
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
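The general retrieve-then-generate pattern behind such models can be sketched in a few lines. The token-overlap retriever and stub generator below are deliberately simplistic stand-ins for the paper's dense retriever and pre-trained seq2seq generator.

```python
def retrieve(query, corpus, k=3):
    """Toy non-parametric memory: rank passages by token overlap with the query."""
    q_tokens = set(query.lower().split())
    ranked = sorted(corpus, key=lambda p: len(q_tokens & set(p.lower().split())), reverse=True)
    return ranked[:k]

def generate(query, passages):
    """Stub for the parametric generator: a real model would condition on the
    query concatenated with the retrieved passages."""
    context = " ".join(passages)
    return f"[answer conditioned on: {query} | {context[:80]}...]"

def rag_answer(query, corpus, k=3):
    return generate(query, retrieve(query, corpus, k))

# corpus = ["Paris is the capital of France.", "The Nile flows through Egypt."]
# print(rag_answer("What is the capital of France?", corpus))
```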
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm which directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)