Topic Transferable Table Question Answering
- URL: http://arxiv.org/abs/2109.07377v1
- Date: Wed, 15 Sep 2021 15:34:39 GMT
- Title: Topic Transferable Table Question Answering
- Authors: Saneem Ahmed Chemmengath, Vishwajeet Kumar, Samarth Bharadwaj, Jaydeep
Sen, Mustafa Canim, Soumen Chakrabarti, Alfio Gliozzo, Karthik
Sankaranarayanan
- Abstract summary: Weakly-supervised table question answering (TableQA) models have achieved state-of-the-art performance by using a pre-trained BERT transformer to jointly encode a question and a table and produce a structured query for the question.
In practical settings, TableQA systems are deployed over table corpora whose topic and word distributions are quite distinct from BERT's pretraining corpus.
We propose T3QA (Topic Transferable Table Question Answering) as a pragmatic adaptation framework for TableQA.
- Score: 33.54533181098762
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weakly-supervised table question answering (TableQA) models have achieved
state-of-the-art performance by using a pre-trained BERT transformer to jointly
encode a question and a table and produce a structured query for the question.
However, in practical settings TableQA systems are deployed over table corpora
whose topic and word distributions are quite distinct from BERT's pretraining
corpus. In this work we simulate this practical topic-shift scenario by
designing novel challenge benchmarks, WikiSQL-TS and WikiTQ-TS, consisting of
train-dev-test splits over five distinct topic groups, based on the popular
WikiSQL and WikiTableQuestions datasets. We empirically show that, despite
pre-training on large open-domain text, model performance degrades
significantly when evaluated on unseen topics. In response, we propose
T3QA (Topic Transferable Table Question Answering), a pragmatic adaptation
framework for TableQA comprising: (1) topic-specific vocabulary injection
into BERT, (2) a natural language question generation pipeline based on a
text-to-text transformer generator (such as T5 or GPT2), focused on producing
topic-specific training data, and (3) a logical form reranker. We show that
T3QA provides a reasonably good baseline for our topic-shift benchmarks. We
believe our topic-split benchmarks will lead to robust TableQA solutions that
are better suited for practical deployment.
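
Component (1) extends BERT's subword vocabulary with terms from the target topic before any topic adaptation or fine-tuning. The snippet below is a minimal sketch of that idea using the Hugging Face `transformers` library; the library choice, the checkpoint name, and the example topic terms are assumptions made for illustration, not the authors' exact implementation.

```python
# Minimal sketch of topic-specific vocabulary injection into BERT, assuming the
# Hugging Face `transformers` library. The topic term list is purely illustrative;
# T3QA's actual vocabulary-selection procedure is described in the paper.
from transformers import BertTokenizerFast, BertForMaskedLM

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical topic-specific terms mined from the target-topic table corpus.
topic_terms = ["quarterback", "touchdowns", "playoff"]

num_added = tokenizer.add_tokens(topic_terms)   # extend the subword vocabulary
model.resize_token_embeddings(len(tokenizer))   # grow the embedding matrix to match

# The new embedding rows are randomly initialized, so further training on
# topic-specific text is needed before the injected tokens become useful.
print(f"Injected {num_added} topic-specific tokens into the tokenizer")
```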
Related papers
- Large Language Models are Complex Table Parsers [26.66460264175336] (2023-12-13)
We propose to incorporate GPT-3.5 to address the challenges posed by complex table QA.
Specifically, we encode each cell's hierarchical structure, position information, and content as datasets.
By enhancing the prompt template with an explanatory description of each task, we effectively improve the model's awareness of hierarchical structure.
- QTSumm: Query-Focused Summarization over Tabular Data [58.62152746690958] (2023-05-23)
People primarily consult tables to conduct data analysis or answer specific questions.
We define a new query-focused table summarization task, where text generation models have to perform human-like reasoning.
We introduce a new benchmark named QTSumm for this task, which contains 7,111 human-annotated query-summary pairs over 2,934 tables.
- MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering [61.48881995121938] (2023-05-22)
Real-world queries are complex in nature, often spanning multiple tables in a relational database or web page.
Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers.
- Bridge the Gap between Language models and Tabular Understanding [99.88470271644894] (2023-02-16)
The table pretrain-then-finetune paradigm has been proposed and adopted at a rapid pace after the success of pre-training in the natural language domain.
Despite the promising findings, there is an input gap between the pre-training and fine-tuning phases.
We propose UTP, an approach that dynamically supports three types of multi-modal inputs: table-text, table, and text.
- ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples [15.212332890570869] (2022-10-22)
We develop ReasTAP to show that high-level table reasoning skills can be injected into models during pre-training without a complex table-specific architecture design.
ReasTAP achieves new state-of-the-art performance on all benchmarks and delivers a significant improvement in the low-resource setting.
- Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning [150.17907456113537] (2022-09-29)
We present Tabular Math Word Problems (TabMWP), a new dataset containing 38,431 grade-level problems that require mathematical reasoning.
We evaluate different pre-trained models on TabMWP, including the GPT-3 model in a few-shot setting.
We propose a novel approach, PromptPG, which utilizes policy gradient to learn to select in-context examples from a small amount of training data.
- OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering [106.73213656603453] (2022-07-08)
We develop a simple table-based QA model with minimal annotation effort.
We propose an omnivorous pretraining approach that consumes both natural and synthetic data.
- End-to-End Table Question Answering via Retrieval-Augmented Generation [19.89730342792824] (2022-03-30)
We introduce T-RAG, an end-to-end Table QA model in which a non-parametric dense vector index is fine-tuned jointly with BART, a parametric sequence-to-sequence model, to generate answer tokens.
Given any natural language question, T-RAG uses a unified pipeline to automatically search a table corpus and directly locate the correct answer among the table cells.
- Multi-Row, Multi-Span Distant Supervision For Table+Text Question Answering [33.809732338627136] (2021-12-14)
Question answering (QA) over tables and linked text, also called TextTableQA, has witnessed significant research in recent years.
We present MITQA, a transformer-based TextTableQA system that is explicitly designed to cope with distant supervision along both these axes.
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877] (2020-09-29)
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
- TAPAS: Weakly Supervised Table Parsing via Pre-training [16.661382998729067] (2020-04-05)
We present TAPAS, an approach to question answering over tables without generating logical forms.
We experiment with three different semantic parsing datasets.
We find that TAPAS outperforms or rivals semantic parsing models by improving state-of-the-art accuracy.
A minimal inference sketch for TAPAS follows this list.
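
Because TAPAS has publicly released checkpoints, the sketch below shows how such a logical-form-free table QA model can be queried. It assumes the Hugging Face `transformers` library and the `google/tapas-base-finetuned-wtq` checkpoint; the table and question are made up for illustration and are not taken from any of the papers above.

```python
# Minimal TAPAS inference sketch, assuming the Hugging Face `transformers` library
# and the publicly released `google/tapas-base-finetuned-wtq` checkpoint.
import pandas as pd
from transformers import TapasTokenizer, TapasForQuestionAnswering

model_name = "google/tapas-base-finetuned-wtq"
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# A made-up table; TapasTokenizer expects all cells as strings.
table = pd.DataFrame(
    {"City": ["Paris", "Tokyo", "Lagos"],
     "Population (millions)": ["2.1", "13.9", "15.4"]}
)
queries = ["Which city has the largest population?"]

inputs = tokenizer(table=table, queries=queries, padding="max_length", return_tensors="pt")
outputs = model(**inputs)

# Decode cell-selection and aggregation logits back to table coordinates.
coords, agg_indices = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
answers = [table.iat[row, col] for row, col in coords[0]]
print(answers, agg_indices)
```

For this WTQ-style checkpoint the aggregation index chooses among NONE, SUM, COUNT, and AVERAGE over the selected cells, which is how TAPAS answers without producing an explicit logical form.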
This list is automatically generated from the titles and abstracts of the papers on this site.