TableQuery: Querying tabular data with natural language
- URL: http://arxiv.org/abs/2202.00454v1
- Date: Thu, 27 Jan 2022 17:26:25 GMT
- Title: TableQuery: Querying tabular data with natural language
- Authors: Abhijith Neil Abraham, Fariz Rahman, Damanpreet Kaur
- Abstract summary: In TableQuery, we use deep learning models pre-trained for question answering on free text to convert natural language queries to structured queries.
Deep learning models pre-trained for question answering on free text are readily available on platforms such as HuggingFace Model Hub.
TableQuery does not require re-training; when a newly trained model for question answering with better performance is available, it can replace the existing model in TableQuery.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents TableQuery, a novel tool for querying tabular data using
deep learning models pre-trained to answer questions on free text. Existing
deep learning methods for question answering on tabular data have various
limitations, such as having to feed the entire table as input into a neural
network model, making them unsuitable for most real-world applications. Since
real-world data might contain millions of rows, it may not fit entirely into
memory. Moreover, data could be stored in live databases, which are updated
in real-time, and it is impractical to serialize an entire database to a neural
network-friendly format each time it is updated. In TableQuery, we use deep
learning models pre-trained for question answering on free text to convert
natural language queries to structured queries, which can be run against a
database or a spreadsheet. This method eliminates the need for fitting the
entire data into memory as well as serializing databases. Furthermore, deep
learning models pre-trained for question answering on free text are readily
available on platforms such as HuggingFace Model Hub (7). TableQuery does not
require re-training; when a newly trained model for question answering with
better performance is available, it can replace the existing model in
TableQuery.
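The idea in the abstract (a free-text question answering model fills in the parts of a structured query, which is then run against the database) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `make_stub_qa`, `build_query`, and `run_demo`, the "What is the <column>?" prompt, and the equality-only `WHERE` clause are all assumptions made for illustration. The QA model is represented as a plain callable so that it can be swapped without touching the rest of the code; here a keyword-matching stub stands in for a real pre-trained model.

```python
import sqlite3
from typing import Callable, Dict, List, Sequence, Tuple

# Any callable (question, context) -> answer span can play the QA model role.
QAModel = Callable[[str, str], str]


def make_stub_qa(known_values: Dict[str, List[str]]) -> QAModel:
    """Hypothetical stand-in for a pre-trained extractive QA model.

    It answers a question about a column by returning the first known value
    of that column that appears in the context, or "" if none does.
    """
    def qa(question: str, context: str) -> str:
        for column, values in known_values.items():
            if column not in question:
                continue
            for value in values:
                if value.lower() in context.lower():
                    return value
        return ""
    return qa


def build_query(nl_query: str, table: str, columns: Sequence[str],
                qa: QAModel) -> Tuple[str, List[str]]:
    """Turn a natural language query into a parameterized SQL query.

    Assumption: each column is probed with one question, using the user's
    query as the context; a non-empty answer becomes an equality filter.
    Table and column names are assumed to come from a trusted schema.
    """
    conditions: List[str] = []
    params: List[str] = []
    for column in columns:
        answer = qa(f"What is the {column}?", nl_query)
        if answer:
            conditions.append(f'"{column}" = ?')
            params.append(answer)
    sql = f'SELECT * FROM "{table}"'
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


def run_demo() -> Tuple[str, List[str], list]:
    """Build a tiny in-memory table and query it with natural language."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (city TEXT, product TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("Paris", "tea", 10), ("Paris", "coffee", 7), ("Oslo", "tea", 3)],
    )
    qa = make_stub_qa({"city": ["Paris", "Oslo"], "product": ["tea", "coffee"]})
    sql, params = build_query(
        "How much tea was sold in Paris?", "sales", ["city", "product"], qa
    )
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return sql, params, rows


if __name__ == "__main__":
    print(run_demo())
```

Only the generated query touches the data, so the table never has to be serialized or passed to the model. Replacing the stub with a real QA model would only require wrapping it in a function with the same `(question, context)` signature.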
Related papers
- Relational Deep Learning: Graph Representation Learning on Relational
Databases [69.7008152388055]
We introduce an end-to-end representation approach to learn on data laid out across multiple tables.
Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all data input.
arXiv Detail & Related papers (2023-12-07T18:51:41Z)
- Retrieval-Based Transformer for Table Augmentation [14.460363647772745]
We introduce a novel approach toward automatic data wrangling.
We aim to address table augmentation tasks, including row/column population and data imputation.
Our model consistently and substantially outperforms both supervised statistical methods and the current state-of-the-art transformer-based models.
arXiv Detail & Related papers (2023-06-20T18:51:21Z)
- Towards Table-to-Text Generation with Pretrained Language Model: A Table
Structure Understanding and Text Deliberating Approach [60.03002572791552]
We propose a table structure understanding and text deliberating approach, namely TASD.
Specifically, we devise a three-layered multi-head attention network to realize the table-structure-aware text generation model.
Our approach can generate faithful and fluent descriptive texts for different types of tables.
arXiv Detail & Related papers (2023-01-05T14:03:26Z)
- OmniTab: Pretraining with Natural and Synthetic Data for Few-shot
Table-based Question Answering [106.73213656603453]
We develop a simple table-based QA model with minimal annotation effort.
We propose an omnivorous pretraining approach that consumes both natural and synthetic data.
arXiv Detail & Related papers (2022-07-08T01:23:45Z)
- Table Retrieval May Not Necessitate Table-specific Model Design [83.27735758203089]
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?"
Based on an analysis on a table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases.
We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases.
None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
arXiv Detail & Related papers (2022-05-19T20:35:23Z)
- Korean-Specific Dataset for Table Question Answering [3.7056358801102682]
We build Korean-specific datasets for table question answering written in English.
The Korean table question answering corpus consists of 70k question-answer pairs created by crowd-sourced workers.
We make our datasets publicly available via our GitHub repository.
arXiv Detail & Related papers (2022-01-17T05:47:44Z)
- Leveraging Table Content for Zero-shot Text-to-SQL with Meta-Learning [25.69875174742935]
Single-table text-to-SQL aims to transform a natural language question into a SQL query according to one single table.
We propose a new approach for the zero-shot text-to-SQL task which does not rely on any additional manual annotations.
We conduct extensive experiments on a public open-domain text-to-SQL dataset and a domain-specific dataset E.
arXiv Detail & Related papers (2021-09-12T01:01:28Z)
- Data Agnostic RoBERTa-based Natural Language to SQL Query Generation [0.0]
The NL2SQL task aims at finding deep learning approaches to solve the problem of converting natural language questions into valid SQL queries.
We have presented an approach with data privacy at its core.
Although we have not achieved state of the art results, we have eliminated the need for the table right from the training of the model.
arXiv Detail & Related papers (2020-10-11T13:18:46Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
- TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data [113.29476656550342]
We present TaBERT, a pretrained LM that jointly learns representations for NL sentences and tables.
TaBERT is trained on a large corpus of 26 million tables and their English contexts.
Implementation of the model will be available at http://fburl.com/TaBERT.
arXiv Detail & Related papers (2020-05-17T17:26:40Z)