TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
- URL: http://arxiv.org/abs/2404.10150v1
- Date: Mon, 15 Apr 2024 21:42:20 GMT
- Title: TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
- Authors: Md Mahadi Hasan Nahid, Davood Rafiei
- Abstract summary: Table reasoning is a challenging task that requires understanding both natural language questions and structured data.
We propose TabSQLify, a novel method that leverages text-to-SQL generation to decompose tables into smaller and relevant sub-tables.
Our method performs remarkably well on the WikiTQ benchmark, achieving an accuracy of 64.7%.
- Score: 6.253771639590562
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Table reasoning is a challenging task that requires understanding both natural language questions and structured tabular data. Large language models (LLMs) have shown impressive capabilities in natural language understanding and generation, but they often struggle with large tables due to their limited input length. In this paper, we propose TabSQLify, a novel method that leverages text-to-SQL generation to decompose tables into smaller and relevant sub-tables, containing only essential information for answering questions or verifying statements, before performing the reasoning task. In our comprehensive evaluation on four challenging datasets, our approach demonstrates comparable or superior performance compared to prevailing methods reliant on full tables as input. Moreover, our method can reduce the input context length significantly, making it more scalable and efficient for large-scale table reasoning applications. Our method performs remarkably well on the WikiTQ benchmark, achieving an accuracy of 64.7%. Additionally, on the TabFact benchmark, it achieves a high accuracy of 79.5%. These results surpass other LLM-based baseline models on gpt-3.5-turbo (chatgpt). TabSQLify can reduce the table size significantly alleviating the computational load on LLMs when handling large tables without compromising performance.
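The decomposition step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the table contents, the question, the helper names (`build_table`, `extract_subtable`, `linearize`), and the SQL string are all hypothetical, and the SQL is hand-written here as a stand-in for what a text-to-SQL model would generate. The sketch only shows how running a query first yields a smaller sub-table to place in the reasoning prompt.

```python
import sqlite3


def build_table(conn, name, columns, rows):
    """Load an in-memory table (hypothetical helper for this sketch)."""
    cols = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f'CREATE TABLE "{name}" ({cols})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', rows)


def extract_subtable(conn, sql):
    """Run the query and return (header, rows) of the resulting sub-table."""
    cur = conn.execute(sql)
    header = [d[0] for d in cur.description]
    return header, cur.fetchall()


def linearize(header, rows):
    """Serialize a table as pipe-separated text for use in a prompt."""
    lines = [" | ".join(header)]
    lines += [" | ".join(str(v) for v in row) for row in rows]
    return "\n".join(lines)


# Toy data, invented purely for illustration.
columns = ["team", "season", "wins"]
rows = [
    ("Ants", 2020, 7),
    ("Bees", 2020, 9),
    ("Ants", 2021, 11),
    ("Bees", 2021, 8),
    ("Crows", 2022, 5),
]
question = "Which team had the most wins in 2021?"

conn = sqlite3.connect(":memory:")
build_table(conn, "results", columns, rows)

# Assumption: in a real pipeline an LLM would generate this SQL from the
# question and the table schema; it is hard-coded here.
generated_sql = "SELECT team, wins FROM results WHERE season = 2021"

sub_header, sub_rows = extract_subtable(conn, generated_sql)
full_text = linearize(columns, rows)
sub_text = linearize(sub_header, sub_rows)

# Only the sub-table goes into the reasoning prompt, shortening the context.
prompt = f"Table:\n{sub_text}\n\nQuestion: {question}\nAnswer:"
print(prompt)
print(f"full table: {len(full_text)} chars, sub-table: {len(sub_text)} chars")
```

The final reasoning call is left out on purpose, since the model and prompt format would depend on the setup; the point of the sketch is that the query selects only the rows and columns relevant to the question before any reasoning happens.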
Related papers
- Generating Tables from the Parametric Knowledge of Language Models [6.316194671269148]
We explore generating tables from the parametric knowledge of large language models (LLMs).
We examine the table generation abilities of four state-of-the-art LLMs: GPT-3.5, GPT-4, Llama2-13B, and Llama2-70B.
For evaluation, we introduce a novel benchmark, WikiTabGen which contains 100 curated Wikipedia tables.
arXiv Detail & Related papers (2024-06-16T12:55:55Z) - Uncovering Limitations of Large Language Models in Information Seeking from Tables [28.19697259795014]
This paper introduces a more reliable benchmark for Table Information Seeking (TabIS).
To avoid the unreliable evaluation caused by text similarity-based metrics, TabIS adopts a single-choice question format (with two options per question) instead of a text generation format.
arXiv Detail & Related papers (2024-06-06T14:30:59Z) - OpenTab: Advancing Large Language Models as Open-domain Table Reasoners [38.29047314758911]
OpenTab is an open-domain table reasoning framework powered by Large Language Models (LLMs).
OpenTab significantly outperforms baselines in both open- and closed-domain settings, achieving up to 21.5% higher accuracy.
arXiv Detail & Related papers (2024-02-22T08:01:01Z) - Chain-of-Table: Evolving Tables in the Reasoning Chain for Table
Understanding [79.9461269253121]
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.
Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks.
arXiv Detail & Related papers (2024-01-09T07:46:26Z) - TAP4LLM: Table Provider on Sampling, Augmenting, and Packing
Semi-structured Data for Large Language Model Reasoning [58.11442663694328]
We propose TAP4LLM as a versatile pre-processing toolbox to generate table prompts.
In each module, we collect and design several common methods for usage in various scenarios.
arXiv Detail & Related papers (2023-12-14T15:37:04Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z) - Large Language Models are Versatile Decomposers: Decompose Evidence and
Questions for Table-based Reasoning [45.013230888670435]
We exploit large language models (LLMs) as decomposers for effective table-based reasoning.
We decompose huge evidence (a huge table) into sub-evidence (a small table) to mitigate the interference of useless information.
We propose a "parsing-execution-filling" strategy to alleviate the dilemma of the chain of thought.
arXiv Detail & Related papers (2023-01-31T17:51:45Z) - OmniTab: Pretraining with Natural and Synthetic Data for Few-shot
Table-based Question Answering [106.73213656603453]
We develop a simple table-based QA model with minimal annotation effort.
We propose an omnivorous pretraining approach that consumes both natural and synthetic data.
arXiv Detail & Related papers (2022-07-08T01:23:45Z) - Table Retrieval May Not Necessitate Table-specific Model Design [83.27735758203089]
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?"
Based on an analysis on a table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases.
We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases.
None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
arXiv Detail & Related papers (2022-05-19T20:35:23Z) - GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.