API-Assisted Code Generation for Question Answering on Varied Table
Structures
- URL: http://arxiv.org/abs/2310.14687v1
- Date: Mon, 23 Oct 2023 08:26:28 GMT
- Title: API-Assisted Code Generation for Question Answering on Varied Table
Structures
- Authors: Yihan Cao, Shuyi Chen, Ryan Liu, Zhiruo Wang, Daniel Fried
- Abstract summary: A persistent challenge to table question answering (TableQA) by generating executable programs has been adapting to varied table structures.
This paper introduces a unified TableQA framework that provides a unified representation for structured tables as multi-index Pandas data frames.
To answer complex relational questions with extended program functionality and external knowledge, our framework allows customized APIs that Python programs can call.
- Score: 18.65003956496509
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A persistent challenge to table question answering (TableQA) by generating
executable programs has been adapting to varied table structures, typically
requiring domain-specific logical forms. In response, this paper introduces a
unified TableQA framework that: (1) provides a unified representation for
structured tables as multi-index Pandas data frames, (2) uses Python as a
powerful querying language, and (3) uses few-shot prompting to translate NL
questions into Python programs, which are executable on Pandas data frames.
Furthermore, to answer complex relational questions with extended program
functionality and external knowledge, our framework allows customized APIs that
Python programs can call. We experiment with four TableQA datasets that involve
tables of different structures -- relational, multi-table, and hierarchical
matrix shapes -- and achieve prominent improvements over past state-of-the-art
systems. In ablation studies, we (1) show benefits from our multi-index
representation and APIs over baselines that use only an LLM, and (2)
demonstrate that our approach is modular and can incorporate additional APIs.
Related papers
- HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies [9.09415727445941]
We propose a cooperative game dubbed "HiddenTables" as a potential resolution to this challenge.
"HiddenTables" is played between the code-generating "r" and the "Oracle windows" which evaluates the ability of the agents to solve Table QA tasks.
We provide evidential experiments on a diverse set of tables that demonstrate an LLM's collective inability to generalize and perform on complex queries.
arXiv Detail & Related papers (2024-06-16T04:53:29Z) - PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning [54.912520425218496]
We present PyTorch Frame, a PyTorch-based framework for deep learning over multi-modal tabular data.
We demonstrate the usefulness of PyTorch Frame by implementing diverse models in a modular way.
We integrate PyTorch Frame with PyTorch Geometric, a PyTorch library for Graph Neural Networks (GNNs), to perform end-to-end learning over relational databases.
arXiv Detail & Related papers (2024-03-31T19:15:09Z) - Augment before You Try: Knowledge-Enhanced Table Question Answering via
Table Expansion [57.53174887650989]
Table question answering is a popular task that assesses a model's ability to understand and interact with structured data.
Existing methods either convert both the table and external knowledge into text, which neglects the structured nature of the table.
We propose a simple yet effective method to integrate external information in a given table.
arXiv Detail & Related papers (2024-01-28T03:37:11Z) - TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively.
It covers several distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding.
arXiv Detail & Related papers (2023-12-14T15:37:04Z) - TableQAKit: A Comprehensive and Practical Toolkit for Table-based
Question Answering [23.412691101965414]
TableQAKit is the first comprehensive toolkit designed specifically for TableQA.
TableQAKit is open-source with an interactive interface that includes visual operations, and comprehensive data for ease of use.
arXiv Detail & Related papers (2023-10-23T16:33:23Z) - MultiTabQA: Generating Tabular Answers for Multi-Table Question
Answering [61.48881995121938]
Real-world queries are complex in nature, often over multiple tables in a relational database or web page.
Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers.
arXiv Detail & Related papers (2023-05-22T08:25:15Z) - Table Retrieval May Not Necessitate Table-specific Model Design [83.27735758203089]
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?"
Based on an analysis on a table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases.
We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases.
None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
arXiv Detail & Related papers (2022-05-19T20:35:23Z) - Retrieving Complex Tables with Multi-Granular Graph Representation
Learning [20.72341939868327]
The task of natural language table retrieval seeks to retrieve semantically relevant tables based on natural language queries.
Existing learning systems treat tables as plain text based on the assumption that tables are structured as dataframes.
We propose Graph-based Table Retrieval (GTR), a generalizable NLTR framework with multi-granular graph representation learning.
arXiv Detail & Related papers (2021-05-04T20:19:03Z) - "What Do You Mean by That?" A Parser-Independent Interactive Approach
for Enhancing Text-to-SQL [49.85635994436742]
We include human in the loop and present a novel-independent interactive approach (PIIA) that interacts with users using multi-choice questions.
PIIA is capable of enhancing the text-to-domain performance with limited interaction turns by using both simulation and human evaluation.
arXiv Detail & Related papers (2020-11-09T02:14:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.