Continuous QA Learning with Structured Prompts
- URL: http://arxiv.org/abs/2208.14602v3
- Date: Fri, 15 Mar 2024 01:53:58 GMT
- Title: Continuous QA Learning with Structured Prompts
- Authors: Yinhe Zheng,
- Abstract summary: Diana is a dynamic architecture-based lifelong QA model that tries to learn a sequence of QA tasks.
Four types of hierarchically organized prompts are used in Diana to capture QA knowledge from different granularities.
In experiments, Diana outperforms state-of-the-art lifelong QA models, especially in handling unseen tasks.
- Score: 20.246786740364133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: QA models with lifelong learning (LL) abilities are important for practical QA applications, and architecture-based LL methods are reported to be an effective implementation for these models. However, it is non-trivial to extend previous approaches to QA tasks since they either require access to task identities in the testing phase or do not explicitly model samples from unseen tasks. In this paper, we propose Diana: a dynamic architecture-based lifelong QA model that tries to learn a sequence of QA tasks with a prompt enhanced language model. Four types of hierarchically organized prompts are used in Diana to capture QA knowledge from different granularities. Specifically, we dedicate task-level prompts to capture task-specific knowledge to retain high LL performances and maintain instance-level prompts to learn knowledge shared across different input samples to improve the model's generalization performance. Moreover, we dedicate separate prompts to explicitly model unseen tasks and introduce a set of prompt key vectors to facilitate knowledge sharing between tasks. Extensive experiments demonstrate that Diana outperforms state-of-the-art lifelong QA models, especially in handling unseen tasks.
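The hierarchical prompt design can be made concrete with a small sketch. The code below is an illustrative approximation rather than Diana's released implementation: it collapses the four prompt levels into a general prompt, task-level prompts (with a dedicated slot for unseen tasks), and an instance-level prompt pool whose learnable key vectors are matched against the encoded input; every name, shape, and the top-k matching rule is an assumption.
```python
# A minimal sketch of hierarchically organized prompts with key vectors.
# Illustrative approximation only, NOT Diana's released implementation:
# the four prompt levels are collapsed into general / task / instance prompts,
# and all shapes, names, and the top-k key matching rule are assumptions.
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class HierarchicalPromptPool(nn.Module):
    def __init__(self, n_tasks: int, pool_size: int, prompt_len: int,
                 dim: int, top_k: int = 3):
        super().__init__()
        self.top_k = top_k
        # General prompt: always prepended, shared across every QA task.
        self.general_prompt = nn.Parameter(0.02 * torch.randn(prompt_len, dim))
        # Task-level prompts: one per seen task plus a dedicated slot that
        # explicitly models unseen tasks.
        self.task_prompts = nn.Parameter(0.02 * torch.randn(n_tasks + 1, prompt_len, dim))
        # Instance-level prompt pool with learnable key vectors; keys are
        # matched against the encoded input so knowledge is shared across samples.
        self.inst_prompts = nn.Parameter(0.02 * torch.randn(pool_size, prompt_len, dim))
        self.inst_keys = nn.Parameter(0.02 * torch.randn(pool_size, dim))

    def forward(self, query_emb: torch.Tensor, task_id: Optional[int] = None) -> torch.Tensor:
        # query_emb: (batch, dim) encoding of the input question/context.
        sim = F.normalize(query_emb, dim=-1) @ F.normalize(self.inst_keys, dim=-1).T
        _, idx = sim.topk(self.top_k, dim=-1)        # (batch, top_k)
        inst = self.inst_prompts[idx].flatten(1, 2)  # (batch, top_k*prompt_len, dim)
        # An unknown task identity at test time falls back to the "unseen" slot.
        tid = task_id if task_id is not None else self.task_prompts.size(0) - 1
        batch = query_emb.size(0)
        task = self.task_prompts[tid].unsqueeze(0).expand(batch, -1, -1)
        general = self.general_prompt.unsqueeze(0).expand(batch, -1, -1)
        # Prepend [general; task; instance] prompts to the model's input embeddings.
        return torch.cat([general, task, inst], dim=1)


# Example: build the prompt prefix for a batch of two inputs whose task
# identity is not known at test time.
pool = HierarchicalPromptPool(n_tasks=8, pool_size=20, prompt_len=5, dim=768)
prefix = pool(torch.randn(2, 768), task_id=None)  # shape: (2, 5 + 5 + 3*5, 768)
```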
Related papers
- Gotta: Generative Few-shot Question Answering by Prompt-based Cloze Data Augmentation [18.531941086922256]
Few-shot question answering (QA) aims at precisely discovering answers to a set of questions from context passages.
We develop Gotta, a Generative prOmpT-based daTa Augmentation framework.
Inspired by the human reasoning process, we propose to integrate the cloze task to enhance few-shot QA learning.
arXiv Detail & Related papers (2023-06-07T01:44:43Z)
- Few-shot Unified Question Answering: Tuning Models or Prompts? [23.885286975673644]
The paper explores the potential of two tuning paradigms, model tuning and prompt tuning, for unified QA under a low-resource setting.
The research offers insights into the advantages and limitations of prompt tuning for unified QA in a few-shot setting.
arXiv Detail & Related papers (2023-05-23T23:14:38Z)
- Long-Tailed Question Answering in an Open World [46.67715607552547]
We define Open Long-Tailed QA (OLTQA) as learning from long-tailed distributed data.
We propose an OLTQA model that encourages knowledge sharing between head, tail and unseen tasks.
On a large-scale OLTQA dataset, our model consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2023-05-11T04:28:58Z)
- Domain Incremental Lifelong Learning in an Open World [45.704746275089555]
We propose Diana: a dynamic architecture-based lifelong learning model.
Four types of hierarchically organized prompts are used in Diana to capture knowledge from different granularities.
arXiv Detail & Related papers (2023-05-11T04:19:08Z)
- ProQA: Structural Prompt-based Pre-training for Unified Question Answering [84.59636806421204]
ProQA is a unified QA paradigm that solves various tasks through a single model.
It concurrently models the knowledge generalization for all QA tasks while keeping the knowledge customization for every specific QA task.
ProQA consistently boosts performance in full-data fine-tuning, few-shot learning, and zero-shot testing scenarios.
arXiv Detail & Related papers (2022-05-09T04:59:26Z)
- Continual Object Detection via Prototypical Task Correlation Guided Gating Mechanism [120.1998866178014]
We present a flexible framework for continual object detection via a prototypical task correlation guided gating mechanism (ROSETTA).
Concretely, a unified framework is shared by all tasks while task-aware gates are introduced to automatically select sub-models for specific tasks.
Experiments on COCO-VOC, KITTI-Kitchen, class-incremental detection on VOC and sequential learning of four tasks show that ROSETTA yields state-of-the-art performance.
arXiv Detail & Related papers (2022-05-06T07:31:28Z)
- Improved and Efficient Conversational Slot Labeling through Question Answering [48.670822631047635]
Transformer-based pretrained language models (PLMs) offer unmatched performance across the majority of natural language understanding (NLU) tasks.
We focus on modeling and studying slot labeling (SL), a crucial component of NLU for dialog, through the QA optics.
We demonstrate how QA-tuned PLMs can be applied to the SL task, reaching new state-of-the-art performance.
arXiv Detail & Related papers (2022-04-05T11:34:35Z)
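As a toy illustration of the slot-labeling-as-QA reformulation summarized above (not the authors' actual setup), each slot can be phrased as a natural-language question and answered by an off-the-shelf extractive QA model; the checkpoint and slot questions below are arbitrary assumptions.
```python
# Toy sketch: casting conversational slot labeling as extractive QA.
# The checkpoint and slot questions are illustrative assumptions, not the
# paper's configuration.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

utterance = "Book me a table for four at an Italian place in Soho tomorrow evening."
slot_questions = {
    "cuisine": "What type of cuisine does the user want?",
    "location": "Where should the restaurant be located?",
    "party_size": "How many people is the booking for?",
}

for slot, question in slot_questions.items():
    pred = qa(question=question, context=utterance)
    # Each slot value is the extracted answer span; a confidence threshold
    # could be used to decide that a slot is not mentioned at all.
    print(f"{slot}: {pred['answer']!r} (score={pred['score']:.2f})")
```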
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
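For reference, the sketch below shows one way to run the publicly released RAG checkpoints through the Hugging Face transformers interface; exact class and argument names may differ across library versions, and a dummy retrieval index is used only to keep the example self-contained.
```python
# Minimal retrieval-augmented generation sketch: a retriever supplies
# non-parametric memory (supporting passages) and a seq2seq generator
# (parametric memory) conditions on them to produce the answer.
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

inputs = tokenizer("who wrote the origin of species", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```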
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences of its use.