Task-Aware Specialization for Efficient and Robust Dense Retrieval for
Open-Domain Question Answering
- URL: http://arxiv.org/abs/2210.05156v2
- Date: Mon, 22 May 2023 20:38:56 GMT
- Title: Task-Aware Specialization for Efficient and Robust Dense Retrieval for
Open-Domain Question Answering
- Authors: Hao Cheng, Hao Fang, Xiaodong Liu, Jianfeng Gao
- Abstract summary: We propose a new architecture, Task-aware Specialization for dense Retrieval (TASER)
TASER enables parameter sharing by interleaving shared and specialized blocks in a single encoder.
Our experiments show that TASER can achieve superior accuracy, surpassing BM25, while using about 60% as many parameters as bi-encoder dense retrievers.
- Score: 85.08146789409354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given their effectiveness on knowledge-intensive natural language processing
tasks, dense retrieval models have become increasingly popular. Specifically,
the de-facto architecture for open-domain question answering uses two
isomorphic encoders that are initialized from the same pretrained model but
separately parameterized for questions and passages. This bi-encoder
architecture is parameter-inefficient in that there is no parameter sharing
between encoders. Further, recent studies show that such dense retrievers
underperform BM25 in various settings. We thus propose a new architecture,
Task-aware Specialization for dense Retrieval (TASER), which enables parameter
sharing by interleaving shared and specialized blocks in a single encoder. Our
experiments on five question answering datasets show that TASER can achieve
superior accuracy, surpassing BM25, while using about 60% as many parameters as
bi-encoder dense retrievers. In out-of-domain evaluations, TASER is also
empirically more robust than bi-encoder dense retrievers. Our code is available
at https://github.com/microsoft/taser.
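The abstract describes a single encoder that interleaves shared and specialized blocks instead of keeping two separately parameterized encoders. Below is a minimal sketch of that idea, assuming an even/odd alternation pattern, per-task feed-forward sub-layers, and toy dimensions; these choices are illustrative assumptions, not the paper's configuration (see the released code linked above for the actual architecture).
```python
# Minimal sketch, NOT the authors' implementation: one encoder that interleaves
# fully shared transformer blocks with task-specialized blocks. The alternation
# pattern, per-task feed-forward experts, and sizes are assumptions.
import torch
import torch.nn as nn


class SpecializedBlock(nn.Module):
    """Encoder block with shared self-attention and per-task feed-forward sub-layers."""

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.shared_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        # Separate feed-forward parameters for question vs. passage inputs.
        self.ffn = nn.ModuleDict({
            "question": nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)),
            "passage": nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)),
        })

    def forward(self, x: torch.Tensor, task: str) -> torch.Tensor:
        attn_out, _ = self.shared_attn(x, x, x)
        x = self.norm1(x + attn_out)
        return self.norm2(x + self.ffn[task](x))


class TaskAwareEncoder(nn.Module):
    """Single encoder whose layers alternate between shared and specialized blocks."""

    def __init__(self, dim: int = 256, num_layers: int = 6):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)  # shared block
            if i % 2 == 0
            else SpecializedBlock(dim)                                  # specialized block
            for i in range(num_layers)
        ])

    def forward(self, x: torch.Tensor, task: str) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x, task) if isinstance(layer, SpecializedBlock) else layer(x)
        return x[:, 0]  # first-token vector as the dense representation


encoder = TaskAwareEncoder()
q_vec = encoder(torch.randn(2, 16, 256), task="question")  # encode questions
p_vec = encoder(torch.randn(2, 32, 256), task="passage")   # encode passages
scores = q_vec @ p_vec.T  # inner-product relevance scores, as in dense retrieval
```
Because the attention and the even-numbered blocks are shared across questions and passages, only the specialized feed-forward sub-layers are duplicated, which is how this kind of design saves parameters relative to two fully separate encoders.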
Related papers
- Mixture of Parrots: Experts improve memorization more than reasoning [72.445819694797]
We show that as we increase the number of experts, the memorization performance consistently increases while the reasoning capabilities saturate.
We find that increasing the number of experts helps solve knowledge-intensive tasks, but fails to yield the same benefits for reasoning tasks.
arXiv Detail & Related papers (2024-10-24T17:54:41Z) - MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are
Better Dense Retrievers [140.0479479231558]
In this work, we aim to unify a variety of pre-training tasks into a multi-task pre-trained model, namely MASTER.
MASTER utilizes a shared-encoder multi-decoder architecture that can construct a representation bottleneck to compress the abundant semantic information across tasks into dense vectors.
arXiv Detail & Related papers (2022-12-15T13:57:07Z) - In Defense of Cross-Encoders for Zero-Shot Retrieval [4.712097135437801]
Bi-encoders and cross-encoders are widely used in many state-of-the-art retrieval pipelines.
We find that the number of parameters and early query-document interactions of cross-encoders play a significant role in the generalization ability of retrieval models.
arXiv Detail & Related papers (2022-12-12T18:50:03Z) - UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question
Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for the multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z) - An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP
Tasks [40.81306982129298]
Parametric and retrieval-augmented models have complementary strengths in terms of computational efficiency and predictive accuracy.
We propose the Efficient Memory-Augmented Transformer (EMAT).
It encodes external knowledge into a key-value memory and exploits fast maximum inner product search for memory querying (a minimal sketch of this lookup step appears after this list).
arXiv Detail & Related papers (2022-10-30T08:34:49Z) - Large Dual Encoders Are Generalizable Retrievers [26.42937314291077]
We show that scaling up the model size brings significant improvement on a variety of retrieval tasks.
Our dual encoders, Generalizable T5-based dense Retrievers (GTR), outperform existing sparse and dense retrievers.
arXiv Detail & Related papers (2021-12-15T05:33:27Z) - Encoder Adaptation of Dense Passage Retrieval for Open-Domain Question
Answering [44.853870854321066]
We study how an in-distribution (IND) question/passage encoder would generalize if paired with an out-of-distribution (OOD) passage/question encoder from another domain.
We find that the passage encoder has more influence on the lower bound of generalization while the question encoder seems to affect the upper bound in general.
arXiv Detail & Related papers (2021-10-04T17:51:07Z) - Efficient Retrieval Optimized Multi-task Learning [16.189136169520424]
We propose a novel Retrieval Optimized Multi-task (ROM) framework for jointly training self-supervised tasks, knowledge retrieval, and extractive question answering.
Our ROM approach presents a unified and generalizable framework that enables scaling efficiently to multiple tasks.
Using our framework, we achieve comparable or better performance than recent methods on QA, while drastically reducing the number of parameters.
arXiv Detail & Related papers (2021-04-20T17:16:34Z) - ClarQ: A large-scale and diverse dataset for Clarification Question
Generation [67.1162903046619]
We devise a novel bootstrapping framework that assists in the creation of a diverse, large-scale dataset of clarification questions based on post-comment tuples extracted from StackExchange.
We quantitatively demonstrate the utility of the newly created dataset by applying it to the downstream task of question-answering.
We release this dataset in order to foster research into the field of clarification question generation with the larger goal of enhancing dialog and question answering systems.
arXiv Detail & Related papers (2020-06-10T17:56:50Z) - Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-Augmented Generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
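The EMAT entry above mentions querying a key-value memory with maximum inner product search. The following is a minimal, assumption-laden sketch of that lookup step, using brute-force search over random toy embeddings rather than EMAT's actual Transformer integration or an approximate-search index.
```python
# Minimal sketch, NOT EMAT itself: external knowledge stored as (key, value)
# embedding pairs and queried by brute-force maximum inner product search.
# Random toy data and dimensions are illustrative assumptions; a real system
# would use trained embeddings and an approximate MIPS index for speed.
import numpy as np

rng = np.random.default_rng(0)
dim, num_entries = 64, 10_000

keys = rng.standard_normal((num_entries, dim)).astype(np.float32)    # address the memory
values = rng.standard_normal((num_entries, dim)).astype(np.float32)  # what gets read out


def query_memory(query: np.ndarray, top_k: int = 4) -> np.ndarray:
    """Return the value vectors whose keys have the largest inner product with the query."""
    scores = keys @ query                        # inner products against all keys
    top = np.argpartition(-scores, top_k)[:top_k]
    top = top[np.argsort(-scores[top])]          # order the selected entries by score
    return values[top]


retrieved = query_memory(rng.standard_normal(dim).astype(np.float32))
print(retrieved.shape)  # (4, 64): memory slots to be fused into the downstream model
```
The same inner-product scoring underlies bi-encoder dense retrieval in general: passages play the role of keys, and retrieval reduces to a maximum inner product search over their precomputed embeddings.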