Chain-of-Skills: A Configurable Model for Open-domain Question Answering
- URL: http://arxiv.org/abs/2305.03130v2
- Date: Fri, 26 May 2023 17:19:58 GMT
- Title: Chain-of-Skills: A Configurable Model for Open-domain Question Answering
- Authors: Kaixin Ma, Hao Cheng, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
- Abstract summary: The retrieval model is an indispensable component for real-world knowledge-intensive tasks.
Recent work focuses on customized methods, limiting the model transferability and scalability.
We propose a modular retriever where individual modules correspond to key skills that can be reused across datasets.
- Score: 79.8644260578301
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The retrieval model is an indispensable component for real-world
knowledge-intensive tasks, e.g., open-domain question answering (ODQA). As
separate retrieval skills are annotated for different datasets, recent work
focuses on customized methods, limiting the model transferability and
scalability. In this work, we propose a modular retriever where individual
modules correspond to key skills that can be reused across datasets. Our
approach supports flexible skill configurations based on the target domain to
boost performance. To mitigate task interference, we design a novel
modularization parameterization inspired by sparse Transformer. We demonstrate
that our model can benefit from self-supervised pretraining on Wikipedia and
fine-tuning using multiple ODQA datasets, both in a multi-task fashion. Our
approach outperforms recent self-supervised retrievers in zero-shot evaluations
and achieves state-of-the-art fine-tuned retrieval performance on NQ, HotpotQA
and OTT-QA.
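The idea of reusable skill modules with a per-dataset skill configuration can be pictured with a small toy sketch. This is not the authors' implementation: the skill names (`retrieve`, `expand_query`, `rerank`), the configuration names, and the tiny corpus are all hypothetical, and a simple word-overlap score stands in for the learned Transformer modules described in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy corpus; the passages are invented purely for illustration.
CORPUS = [
    "Paris is the capital of France.",
    "The Eiffel Tower is located in Paris.",
    "Berlin is the capital of Germany.",
]


@dataclass
class State:
    """What is passed from one skill to the next in a chain."""
    query: str
    candidates: List[str]


def tokens(text: str) -> set:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}


def overlap(query: str, passage: str) -> int:
    """Word-overlap score; an assumed stand-in for a learned relevance model."""
    return len(tokens(query) & tokens(passage))


def retrieve(state: State) -> State:
    """Hypothetical retrieval skill: keep the top-2 corpus passages."""
    ranked = sorted(CORPUS, key=lambda p: overlap(state.query, p), reverse=True)
    return State(state.query, ranked[:2])


def expand_query(state: State) -> State:
    """Hypothetical expansion skill: append the top candidate to the query."""
    if not state.candidates:
        return state
    return State(state.query + " " + state.candidates[0], state.candidates)


def rerank(state: State) -> State:
    """Hypothetical reranking skill: reorder candidates by the current query."""
    ranked = sorted(state.candidates,
                    key=lambda p: overlap(state.query, p), reverse=True)
    return State(state.query, ranked)


# Each skill is defined once and reused by every configuration.
SKILLS: Dict[str, Callable[[State], State]] = {
    "retrieve": retrieve,
    "expand_query": expand_query,
    "rerank": rerank,
}

# Assumed per-domain configurations: the same skills, chained differently.
CONFIGS: Dict[str, List[str]] = {
    "single_hop": ["retrieve", "rerank"],
    "multi_hop": ["retrieve", "expand_query", "retrieve", "rerank"],
}


def run_chain(question: str, config_name: str) -> List[str]:
    """Apply the skills named in the chosen configuration, in order."""
    state = State(question, [])
    for skill_name in CONFIGS[config_name]:
        state = SKILLS[skill_name](state)
    return state.candidates


if __name__ == "__main__":
    print(run_chain("What is the capital of France?", "single_hop"))
    print(run_chain("Where is the tower in the capital of France?", "multi_hop"))
```

The point of the sketch is only the structure: changing the target domain means editing a list of skill names rather than writing a new retriever.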
Related papers
- Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer [80.50327229467993]
We show that a single model trained end-to-end can achieve both competitive retrieval and QA performance.
We show that end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings.
arXiv Detail & Related papers (2022-12-05T04:51:21Z)
- You Only Need One Model for Open-domain Question Answering [26.582284346491686]
Recent works for Open-domain Question Answering refer to an external knowledge base using a retriever model.
We propose casting the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture.
We evaluate our model on Natural Questions and TriviaQA open datasets and our model outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match scores.
arXiv Detail & Related papers (2021-12-14T13:21:11Z)
- MetaQA: Combining Expert Agents for Multi-Skill Question Answering [49.35261724460689]
We argue that despite the promising results of multi-dataset models, some domains or QA formats might require specific architectures.
We propose to combine expert agents with a novel, flexible, and training-efficient architecture that considers questions, answer predictions, and answer-prediction confidence scores.
arXiv Detail & Related papers (2021-12-03T14:05:52Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
- Robust Question Answering Through Sub-part Alignment [53.94003466761305]
We model question answering as an alignment problem.
We train our model on SQuAD v1.1 and test it on several adversarial and out-of-domain datasets.
arXiv Detail & Related papers (2020-04-30T09:10:57Z)
- ManyModalQA: Modality Disambiguation and QA over Diverse Inputs [73.93607719921945]
We present a new multimodal question answering challenge, ManyModalQA, in which an agent must answer a question by considering three distinct modalities.
We collect our data by scraping Wikipedia and then utilize crowdsourcing to collect question-answer pairs.
arXiv Detail & Related papers (2020-01-22T14:39:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.