Tackling Multi-Answer Open-Domain Questions via a Recall-then-Verify Framework
- URL: http://arxiv.org/abs/2110.08544v1
- Date: Sat, 16 Oct 2021 10:48:10 GMT
- Title: Tackling Multi-Answer Open-Domain Questions via a Recall-then-Verify Framework
- Authors: Zhihong Shao and Minlie Huang
- Abstract summary: Open domain questions are likely to be open-ended and ambiguous, leading to multiple valid answers.
Existing approaches typically adopt the rerank-then-read framework, where a reader reads top-ranking evidence to predict answers.
Our framework achieves new state-of-the-art results on two multi-answer datasets, and predicts significantly more gold answers than a rerank-then-read system with an oracle reranker.
- Score: 38.807388762378444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open domain questions are likely to be open-ended and ambiguous, leading to
multiple valid answers. Existing approaches typically adopt the
rerank-then-read framework, where a reader reads top-ranking evidence to
predict answers. According to our empirical analyses, this framework faces
three problems. First, to leverage the power of a large reader, the reranker
is forced to select only a few relevant passages that cover diverse answers,
which is non-trivial because the effect of passage selection on the reader's
performance is unknown. Second, the small reading budget prevents the reader
from making use of valuable retrieved evidence that the reranker filters out.
Third, because the reader generates all predictions at once from all selected
evidence, it may learn pathological dependencies among answers, i.e., whether
to predict an answer may also depend on the evidence for other answers. To
avoid these problems, we
propose to tackle multi-answer open-domain questions with a recall-then-verify
framework, which separates the reasoning process of each answer so that we can
make better use of retrieved evidence while also leveraging the power of large
models under the same memory constraint. Our framework achieves new
state-of-the-art results on two multi-answer datasets, and predicts
significantly more gold answers than a rerank-then-read system with an oracle
reranker.
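
At its core, the framework recalls a large pool of candidate answers and then verifies each candidate independently against its own evidence. Below is a minimal sketch of that recall-then-verify loop; the component names (`recall_candidates`, `retrieve_evidence`, `verify`) are illustrative placeholders, not the paper's actual modules:

```python
# Hedged sketch of a recall-then-verify pipeline for multi-answer QA.
# All component names are hypothetical placeholders standing in for
# whatever recaller, retriever, and verifier models one plugs in.
from typing import Callable, List


def recall_then_verify(
    question: str,
    recall_candidates: Callable[[str], List[str]],       # high-recall answer recaller
    retrieve_evidence: Callable[[str, str], List[str]],  # evidence per candidate
    verify: Callable[[str, str, List[str]], float],      # verifier score in [0, 1]
    threshold: float = 0.5,
) -> List[str]:
    """Recall many candidate answers, then verify each one in isolation."""
    answers = []
    for candidate in recall_candidates(question):
        # Each candidate gets its own focused evidence set, so the
        # verifier can be a large model without a large reading budget.
        evidence = retrieve_evidence(question, candidate)
        # Verifying candidates independently avoids the pathological
        # cross-answer dependencies of predicting all answers at once.
        if verify(question, candidate, evidence) >= threshold:
            answers.append(candidate)
    return answers
```

Because each candidate is judged only on its own evidence, accepting one answer cannot hinge on another answer's evidence, and more of the retrieved evidence can be used overall under the same memory constraint.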
Related papers
- Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage [74.70255719194819]
We introduce a novel framework based on sub-question coverage, which measures how well a RAG system addresses different facets of a question.
We use this framework to evaluate three commercial generative answer engines: You.com, Perplexity AI, and Bing Chat.
We find that while all answer engines cover core sub-questions more often than background or follow-up ones, they still miss around 50% of core sub-questions.
arXiv Detail & Related papers (2024-10-20T22:59:34Z)
- Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering [45.82437926569949]
Multi-Hop Question Answering tasks present a significant challenge for large language models.
We introduce a novel generate-then-ground (GenGround) framework to solve a multi-hop question.
arXiv Detail & Related papers (2024-06-21T06:26:38Z)
- RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation [35.981443744108255]
We propose a novel RAG framework, namely RichRAG.
It includes a sub-aspect explorer to identify potential sub-aspects of input questions, a retriever to build a candidate pool of diverse external documents related to these sub-aspects, and a generative list-wise ranker.
Experimental results on two publicly available datasets show that our framework effectively and efficiently provides comprehensive and satisfying responses to users.
arXiv Detail & Related papers (2024-06-18T12:52:51Z)
- Multifaceted Improvements for Conversational Open-Domain Question Answering [54.913313912927045]
We propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA).
First, the proposed KL-divergence-based regularization leads to better question understanding for retrieval and answer reading.
Second, the added post-ranker module pushes more relevant passages to the top placements so that they can be selected for the reader under a two-aspect constraint.
Third, the well-designed curriculum learning strategy effectively narrows the gap between the golden-passage settings of training and inference, and encourages the reader to find the true answer without golden-passage assistance.
arXiv Detail & Related papers (2022-04-01T07:54:27Z)
- Joint Passage Ranking for Diverse Multi-Answer Retrieval [56.43443577137929]
We study multi-answer retrieval, an under-explored problem that requires retrieving passages to cover multiple distinct answers for a question.
This task requires joint modeling of retrieved passages, as models should not repeatedly retrieve passages containing the same answer at the cost of missing a different valid answer.
In this paper, we introduce JPR, a joint passage retrieval model focusing on reranking. To model the joint probability of the retrieved passages, JPR makes use of an autoregressive reranker that selects a sequence of passages, equipped with novel training and decoding algorithms (a sketch of the underlying coverage objective appears after this list).
arXiv Detail & Related papers (2021-04-17T04:48:36Z)
- Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction [46.38201136570501]
We present a model that aggregates and combines evidence from multiple passages to adaptively predict a single answer or a set of question-answer pairs for ambiguous questions.
Our model, named Refuel, achieves new state-of-the-art performance on the AmbigQA dataset, and shows competitive performance on NQ-Open and TriviaQA.
arXiv Detail & Related papers (2020-11-26T05:48:55Z)
- Answering Any-hop Open-domain Questions with Iterative Document Reranking [62.76025579681472]
We propose a unified QA framework to answer any-hop open-domain questions.
Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)
- Open-Domain Question Answering with Pre-Constructed Question Spaces [70.13619499853756]
Open-domain question answering aims at solving the task of locating the answers to user-generated questions in massive collections of documents.
There are two families of solutions available: retriever-readers and knowledge-graph-based approaches.
We propose a novel algorithm with a reader-retriever structure that differs from both families.
arXiv Detail & Related papers (2020-06-02T04:31:09Z)
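
A recurring idea across these papers, stated most explicitly in the JPR entry, is that passage selection for multi-answer questions should maximize coverage of distinct answers rather than repeatedly retrieving the same one. The sketch below illustrates that coverage objective with a simple greedy set-cover heuristic; it is not JPR's trained autoregressive reranker, and the `passage_answers` annotation it consumes would not be available at inference time:

```python
# Illustration of the diverse multi-answer retrieval objective: greedily
# pick passages that cover answers not yet covered. A set-cover heuristic
# for exposition only, not JPR's learned autoregressive reranker.
from typing import Dict, List, Set


def select_diverse_passages(
    passage_answers: Dict[str, Set[str]],  # passage id -> gold answers it contains
    budget: int,
) -> List[str]:
    """Pick up to `budget` passages, maximizing distinct answers covered."""
    covered: Set[str] = set()
    selected: List[str] = []
    for _ in range(budget):
        # Choose the passage contributing the most new answers.
        best = max(
            (p for p in passage_answers if p not in selected),
            key=lambda p: len(passage_answers[p] - covered),
            default=None,
        )
        if best is None or not (passage_answers[best] - covered):
            break  # nothing new left to cover
        selected.append(best)
        covered |= passage_answers[best]
    return selected


if __name__ == "__main__":
    passages = {
        "p1": {"Mercury", "Venus"},
        "p2": {"Venus"},  # redundant with p1
        "p3": {"Mars"},
    }
    print(select_diverse_passages(passages, budget=2))  # ['p1', 'p3']
```

JPR learns this selection behavior autoregressively from data, since gold answer annotations are unavailable at inference time.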