Open-WikiTable: Dataset for Open Domain Question Answering with Complex
Reasoning over Table
- URL: http://arxiv.org/abs/2305.07288v1
- Date: Fri, 12 May 2023 07:24:16 GMT
- Title: Open-WikiTable: Dataset for Open Domain Question Answering with Complex
Reasoning over Table
- Authors: Sunjun Kweon, Yeonsu Kwon, Seonhee Cho, Yohan Jo, Edward Choi
- Abstract summary: Open-WikiTable is the first ODQA dataset that requires complex reasoning over tables.
Open-WikiTable is built upon WikiSQL and WikiTableQuestions.
- Score: 6.436886630398141
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Despite recent interest in open domain question answering (ODQA) over tables,
many studies still rely on datasets that are not truly optimal for the task
with respect to utilizing the structural nature of tables. These datasets assume
answers reside in a single cell and do not require operations over
multiple cells, such as aggregation, comparison, and sorting. Thus, we release
Open-WikiTable, the first ODQA dataset that requires complex reasoning over
tables. Open-WikiTable is built upon WikiSQL and WikiTableQuestions to be
applicable in the open-domain setting. As each question is coupled with both
textual answers and SQL queries, Open-WikiTable opens up a wide range of
possibilities for future research, as both reader and parser methods can be
applied. The dataset and code are publicly available.
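The distinction the abstract draws between single-cell answers and multi-cell reasoning (aggregation, comparison, sorting) can be illustrated with a minimal sketch. The table, column names, and questions below are hypothetical and are not taken from Open-WikiTable; the sketch only shows how a question paired with a SQL query yields a textual answer once the query is executed against a table.

```python
import sqlite3

# Hypothetical toy table for illustration only; not from Open-WikiTable.
ROWS = [
    ("Alice", "Norway", 3),
    ("Bob", "Kenya", 5),
    ("Chen", "Norway", 4),
]


def build_table():
    """Create an in-memory SQLite database holding the toy table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE medals (athlete TEXT, country TEXT, gold INTEGER)")
    conn.executemany("INSERT INTO medals VALUES (?, ?, ?)", ROWS)
    return conn


def answer(conn, sql):
    """Execute a SQL query and flatten the result rows into a list of values."""
    return [value for row in conn.execute(sql) for value in row]


conn = build_table()

# Single-cell lookup: the answer is one cell of the table.
single_cell = answer(conn, "SELECT gold FROM medals WHERE athlete = 'Bob'")

# Aggregation: the answer is computed from several cells.
aggregation = answer(conn, "SELECT SUM(gold) FROM medals WHERE country = 'Norway'")

# Comparison: rows are filtered by comparing cell values to a threshold.
comparison = answer(conn, "SELECT athlete FROM medals WHERE gold > 3 ORDER BY athlete")

# Sorting: the answer depends on the ordering of all rows.
sorting = answer(conn, "SELECT athlete FROM medals ORDER BY gold DESC LIMIT 1")

print(single_cell, aggregation, comparison, sorting)
```

Only the first query can be answered by reading a single cell; the other three need values from several rows, which is the kind of reasoning the abstract says earlier datasets do not require.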
Related papers
- TANQ: An open domain dataset of table answered questions [15.323690523538572]
TANQ is the first open domain question answering dataset where the answers require building tables from information across multiple sources.
We release the full source attribution for every cell in the resulting table and benchmark state-of-the-art language models in open, oracle, and closed book setups.
Our best-performing baseline, GPT4 reaches an overall F1 score of 29.1, lagging behind human performance by 19.7 points.
arXiv Detail & Related papers (2024-05-13T14:07:20Z)
- Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval [52.592071689901196]
We introduce a method that uncovers useful join relations for any query and database during table retrieval.
Our method outperforms the state-of-the-art approaches for table retrieval by up to 9.3% in F1 score and for end-to-end QA by up to 5.4% in accuracy.
arXiv Detail & Related papers (2024-04-15T15:55:01Z)
- MURRE: Multi-Hop Table Retrieval with Removal for Open-Domain Text-to-SQL [51.48239006107272]
Multi-hop table retrieval with removal (MURRE) removes previously retrieved information from the question to guide towards unretrieved relevant tables.
Experiments on two open-domain text-to-SQL datasets demonstrate an average improvement of 5.7% over the previous state-of-the-art results.
arXiv Detail & Related papers (2024-02-16T13:14:35Z)
- Augment before You Try: Knowledge-Enhanced Table Question Answering via Table Expansion [57.53174887650989]
Table question answering is a popular task that assesses a model's ability to understand and interact with structured data.
Some existing methods convert both the table and external knowledge into text, which neglects the structured nature of the table.
We propose a simple yet effective method to integrate external information in a given table.
arXiv Detail & Related papers (2024-01-28T03:37:11Z)
- MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering [61.48881995121938]
Real-world queries are complex in nature, often over multiple tables in a relational database or web page.
Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers.
arXiv Detail & Related papers (2023-05-22T08:25:15Z)
- Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means of encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
- Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering [78.9863753810787]
A large amount of the world's knowledge is stored in structured databases.
Query languages can answer questions that require complex reasoning, as well as offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z)
- AIT-QA: Question Answering Dataset over Complex Tables in the Airline Industry [30.330772077451048]
We introduce the domain-specific Table QA dataset AIT-QA (Airline Industry Table QA).
The dataset consists of 515 questions authored by human annotators on 116 tables extracted from public U.S. SEC filings.
We also provide annotations pertaining to the nature of questions, marking those that require hierarchical headers, domain-specific terminology, and paraphrased forms.
arXiv Detail & Related papers (2021-06-24T12:14:18Z)
- Open Domain Question Answering Using Web Tables [8.25461115955717]
We develop an open-domain QA approach using web tables that works for both factoid and non-factoid queries.
Our solution is used in production in a major commercial web search engine and serves direct answers for tens of millions of real user queries per month.
arXiv Detail & Related papers (2020-01-10T01:25:04Z)