RealTime QA: What's the Answer Right Now?
- URL: http://arxiv.org/abs/2207.13332v2
- Date: Wed, 28 Feb 2024 07:45:37 GMT
- Title: RealTime QA: What's the Answer Right Now?
- Authors: Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari
Asai, Xinyan Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, Kentaro Inui
- Abstract summary: We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis.
We build strong baseline models upon large pretrained language models, including GPT-3 and T5.
GPT-3 tends to return outdated answers when retrieved documents do not provide sufficient information to find an answer.
- Score: 137.04039209995932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce REALTIME QA, a dynamic question answering (QA) platform that
announces questions and evaluates systems on a regular basis (weekly in this
version). REALTIME QA inquires about the current world, and QA systems need to
answer questions about novel events or information. It therefore challenges
static, conventional assumptions in open-domain QA datasets and pursues
instantaneous applications. We build strong baseline models upon large
pretrained language models, including GPT-3 and T5. Our benchmark is an ongoing
effort, and this paper presents real-time evaluation results over the past
year. Our experimental results show that GPT-3 can often properly update its
generation results, based on newly-retrieved documents, highlighting the
importance of up-to-date information retrieval. Nonetheless, we find that GPT-3
tends to return outdated answers when retrieved documents do not provide
sufficient information to find an answer. This suggests an important avenue for
future research: can an open-domain QA system identify such unanswerable cases
and communicate with the user or even the retrieval module to modify the
retrieval results? We hope that REALTIME QA will spur progress in instantaneous
applications of question answering and beyond.
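The abstract's failure mode suggests a simple control loop: retrieve fresh documents, answer only from them, and abstain (or re-query) when the evidence is insufficient. Below is a minimal sketch of such a loop, not the paper's actual baseline; `search_news` and `llm_answer` are hypothetical placeholders for a news retriever and a large language model such as GPT-3.
```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    published: str  # ISO date, e.g. "2024-02-28"; ISO strings sort chronologically

def search_news(question: str) -> list[Document]:
    """Hypothetical retriever over a continuously updated news index."""
    raise NotImplementedError

def llm_answer(question: str, context: str) -> str:
    """Hypothetical LLM call that answers `question` from `context` only."""
    raise NotImplementedError

def realtime_answer(question: str, asked_on: str) -> str:
    docs = search_news(question)
    # Keep only documents published on or before the question date, newest first.
    fresh = sorted(
        (d for d in docs if d.published <= asked_on),
        key=lambda d: d.published,
        reverse=True,
    )
    if not fresh:
        # The failure mode highlighted in the abstract: without sufficient
        # retrieved evidence the model tends to fall back on stale parametric
        # knowledge, so a system could abstain or re-query the retriever instead.
        return "unanswerable"
    context = "\n\n".join(d.text for d in fresh[:5])
    return llm_answer(question, context)
```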
Related papers
- Event Extraction as Question Generation and Answering [72.04433206754489]
Recent work on Event Extraction has reframed the task as Question Answering (QA).
We propose QGA-EE, which enables a Question Generation (QG) model to generate questions that incorporate rich contextual information instead of using fixed templates (the two styles are contrasted in the toy example after this entry).
Experiments show that QGA-EE outperforms all prior single-task-based models on the ACE05 English dataset.
arXiv Detail & Related papers (2023-07-10T01:46:15Z)
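As a toy illustration of the QA reframing and of why context-rich questions help, consider the contrast below. The sentence, the questions, and the hard-coded "QA model" are all invented for illustration; the paper trains a neural QG model rather than using hand-written strings.
```python
SENTENCE = "Police detained the protest organizer in Rome on Friday."

# Fixed template: ignores the sentence entirely.
fixed_q = "Who is the Person argument of the Justice.Arrest event?"

# Context-rich question: reuses words from the sentence, which a learned
# QG model can do and a static template cannot.
contextual_q = "Who did the police detain in Rome on Friday?"

def extractive_qa(question: str, context: str) -> str:
    """Stand-in for an extractive QA model; returns a hard-coded span."""
    return "the protest organizer"

for q in (fixed_q, contextual_q):
    print(f"{q} -> {extractive_qa(q, SENTENCE)}")
```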
- IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions [54.23087908182134]
We introduce the first large-scale counterfactual open-domain question-answering (QA) benchmark, named IfQA.
The IfQA dataset contains over 3,800 questions that were annotated by crowdworkers on relevant Wikipedia passages.
The unique challenges posed by the IfQA benchmark will push open-domain QA research on both retrieval and counterfactual reasoning fronts.
arXiv Detail & Related papers (2023-05-23T12:43:19Z)
- QUADRo: Dataset and Models for QUestion-Answer Database Retrieval [97.84448420852854]
Given a database (DB) of question/answer (q/a) pairs, it is possible to answer a target question by scanning the DB for similar questions.
We build a large-scale DB of 6.3M q/a pairs, using public questions, and design a new system based on neural IR and a q/a pair reranker.
We show that our DB-based approach is competitive with Web-based methods, i.e., a QA system built on top of the BING search engine (a toy sketch of the retrieve-and-rerank setup follows this entry).
arXiv Detail & Related papers (2023-03-30T00:42:07Z)
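A minimal sketch of the retrieve-and-rerank setup described above, assuming two stand-in models: `embed` for the neural IR encoder and `rerank_score` for the q/a pair reranker. Both are hypothetical placeholders, and a real system would precompute and index the DB embeddings rather than re-encode per query.
```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical sentence encoder returning a unit-normalized vector."""
    raise NotImplementedError

def rerank_score(target_q: str, stored_q: str, stored_a: str) -> float:
    """Hypothetical joint scorer over (target question, stored q, stored a)."""
    raise NotImplementedError

def answer_from_db(target_q: str, db: list[tuple[str, str]], k: int = 50) -> str:
    """Answer `target_q` by retrieving and reranking stored q/a pairs."""
    q_vec = embed(target_q)
    # Stage 1: dense retrieval over stored questions. With unit-normalized
    # vectors, the dot product equals the cosine similarity.
    scored = sorted(
        ((float(q_vec @ embed(q)), q, a) for q, a in db), reverse=True
    )
    # Stage 2: rerank the top-k candidates with the joint q/a scorer and
    # return the answer of the best-scoring pair.
    best = max(scored[:k], key=lambda c: rerank_score(target_q, c[1], c[2]))
    return best[2]
```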
- ForecastTKGQuestions: A Benchmark for Temporal Question Answering and Forecasting over Temporal Knowledge Graphs [28.434829347176233]
Question answering over temporal knowledge graphs (TKGQA) has recently attracted increasing interest.
TKGQA requires temporal reasoning techniques to extract the relevant information from temporal knowledge bases.
We propose a novel task: forecasting question answering over temporal knowledge graphs (the evidence-cutoff constraint is illustrated after this entry).
arXiv Detail & Related papers (2022-08-12T21:02:35Z)
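What distinguishes the forecasting setting is the evidence cutoff: the model may only condition on facts time-stamped before the question is asked. A self-contained toy follows; the quadruples are invented for illustration.
```python
from datetime import date

# A temporal KG as (subject, relation, object, timestamp) quadruples.
TKG = [
    ("team_a", "beat", "team_b", date(2021, 3, 1)),
    ("team_b", "beat", "team_c", date(2021, 4, 1)),
    ("team_a", "beat", "team_c", date(2021, 6, 1)),
]

def visible_facts(question_time: date):
    """Facts a forecasting model is allowed to condition on: everything
    time-stamped strictly before the question is asked."""
    return [quad for quad in TKG if quad[3] < question_time]

# A question asked on 2021-05-01 about a later match can only see the
# first two facts; the June outcome must be predicted, not looked up.
print(visible_facts(date(2021, 5, 1)))
```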
- Relation-Guided Pre-Training for Open-Domain Question Answering [67.86958978322188]
We propose a Relation-Guided Pre-Training (RGPT-QA) framework to solve complex open-domain questions.
We show that RGPT-QA achieves 2.2%, 2.4%, and 6.3% absolute improvements in Exact Match accuracy on Natural Questions, TriviaQA, and WebQuestions, respectively.
arXiv Detail & Related papers (2021-09-21T17:59:31Z)
- Improving Unsupervised Question Answering via Summarization-Informed Question Generation [47.96911338198302]
Question Generation (QG) is the task of generating a plausible question for a ⟨passage, answer⟩ pair.
We make use of freely available news summary data, transforming declarative sentences into appropriate questions using dependency parsing, named entity recognition, and semantic role labeling (a toy version of the heuristic is sketched after this entry).
The resulting questions are then combined with the original news articles to train an end-to-end neural QG model.
arXiv Detail & Related papers (2021-09-16T13:08:43Z)
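A toy version of this heuristic question generation, using only named entity recognition (the full pipeline above also uses dependency parsing and semantic role labeling). It assumes spaCy and its `en_core_web_sm` model are installed; the wh-word mapping is invented and only yields fluent questions when the entity opens the sentence.
```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the model is installed

# Invented mapping from entity types to wh-phrases.
WH = {"PERSON": "Who", "ORG": "What organization", "GPE": "Where", "DATE": "When"}

def sentence_to_qa(sentence: str):
    """Turn a declarative sentence into a (question, answer) pair by
    replacing the first mapped named entity with a wh-phrase."""
    doc = nlp(sentence)
    for ent in doc.ents:
        wh = WH.get(ent.label_)
        if wh is None:
            continue
        question = sentence[:ent.start_char] + wh + sentence[ent.end_char:]
        return question.rstrip(". ") + "?", ent.text
    return None  # no usable entity found

print(sentence_to_qa("Acme Corp opened a new factory in Berlin."))
# e.g. ('What organization opened a new factory in Berlin?', 'Acme Corp')
```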
- SituatedQA: Incorporating Extra-Linguistic Contexts into QA [7.495151447459443]
We introduce SituatedQA, an open-retrieval QA dataset where systems must produce the correct answer to a question given the temporal or geographical context.
We find that a significant proportion of information seeking questions have context-dependent answers.
Our study shows that existing models struggle with producing answers that are frequently updated or from uncommon locations.
arXiv Detail & Related papers (2021-09-13T17:53:21Z)
- Hurdles to Progress in Long-form Question Answering [34.805039943215284]
We first design a new system that relies on sparse attention and contrastive retriever learning to achieve state-of-the-art performance.
We then show that the task formulation raises fundamental challenges regarding evaluation and dataset creation.
arXiv Detail & Related papers (2021-03-10T20:32:30Z)
- Summary-Oriented Question Generation for Informational Queries [23.72999724312676]
We aim to produce self-explanatory questions (SQs) that focus on main document topics and are answerable with variable-length passages as appropriate.
Our model shows SOTA performance on SQ generation on the NQ dataset (20.1 BLEU-4).
We further apply our model to out-of-domain news articles, evaluating with a QA system due to the lack of gold questions, and demonstrate that it produces better SQs for news articles, with further confirmation via a human evaluation.
arXiv Detail & Related papers (2020-10-19T17:30:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.