TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
- URL: http://arxiv.org/abs/2404.10150v1
- Date: Mon, 15 Apr 2024 21:42:20 GMT
- Title: TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
- Authors: Md Mahadi Hasan Nahid, Davood Rafiei
- Abstract summary: Table reasoning is a challenging task that requires understanding both natural language questions and structured data.
We propose TabSQLify, a novel method that leverages text-to-SQL generation to decompose tables into smaller and relevant sub-tables.
Our method performs remarkably well on the WikiTQ benchmark, achieving an accuracy of 64.7%.
- Score: 6.253771639590562
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Table reasoning is a challenging task that requires understanding both natural language questions and structured tabular data. Large language models (LLMs) have shown impressive capabilities in natural language understanding and generation, but they often struggle with large tables due to their limited input length. In this paper, we propose TabSQLify, a novel method that leverages text-to-SQL generation to decompose tables into smaller and relevant sub-tables, containing only essential information for answering questions or verifying statements, before performing the reasoning task. In our comprehensive evaluation on four challenging datasets, our approach demonstrates comparable or superior performance compared to prevailing methods reliant on full tables as input. Moreover, our method can reduce the input context length significantly, making it more scalable and efficient for large-scale table reasoning applications. Our method performs remarkably well on the WikiTQ benchmark, achieving an accuracy of 64.7%. Additionally, on the TabFact benchmark, it achieves a high accuracy of 79.5%. These results surpass other LLM-based baseline models using gpt-3.5-turbo (ChatGPT). TabSQLify can reduce the table size significantly, alleviating the computational load on LLMs when handling large tables without compromising performance.
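The abstract describes a two-stage decompose-then-reason pipeline. Below is a minimal sketch of that idea, assuming a hypothetical `ask_llm` helper and a SQLite-backed table; the prompts and helper names are illustrative, not the paper's implementation:

```python
import sqlite3

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM such as gpt-3.5-turbo."""
    raise NotImplementedError

def tabsqlify_answer(conn: sqlite3.Connection, table: str, question: str) -> str:
    # Stage 1: text-to-SQL decomposition -- have the LLM write a query that
    # keeps only the columns and rows relevant to the question.
    cursor = conn.execute(f"SELECT * FROM {table} LIMIT 1")
    columns = [col[0] for col in cursor.description]
    sql = ask_llm(
        f"Table {table} has columns {columns}. "
        f"Write one SQLite query that selects only what is needed to answer: {question}"
    )
    sub_table = conn.execute(sql).fetchall()

    # Stage 2: reason over the much smaller sub-table instead of the full table,
    # which is what keeps the input context short for large tables.
    return ask_llm(f"Given these rows: {sub_table}\nAnswer the question: {question}")
```

Because only the executed sub-table reaches the second prompt, the LLM's context grows with the answer-relevant slice of the table rather than with the full table.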
Related papers
- TabSD: Large Free-Form Table Question Answering with SQL-Based Table Decomposition [29.384514074911955]
Question answering on free-form tables (TableQA) is challenging due to the absence of predefined schemas and the presence of noise in large tables.
We propose TabSD, an SQL-based decomposition model that enhances Large Language Models' ability to process large free-form tables.
We introduce two TableQA datasets, SLQA and SEQA, which consist solely of large free-form tables.
arXiv Detail & Related papers (2025-02-19T04:45:05Z) - Interpretable LLM-based Table Question Answering [5.940265173828534]
Plan-of-SQLs (POS) is an interpretable, effective, and efficient approach to Table QA.
We show that POS is most preferred among explanation methods, helps human users understand model decision boundaries, and facilitates model success and error identification.
arXiv Detail & Related papers (2024-12-16T22:44:31Z) - RSL-SQL: Robust Schema Linking in Text-to-SQL Generation [51.00761167842468]
We propose a novel framework called RSL-SQL that combines bidirectional schema linking, contextual information augmentation, binary selection strategy, and multi-turn self-correction.
Benchmarks demonstrate that our approach achieves SOTA execution accuracy among open-source solutions, with 67.2% on BIRD and 87.9% on Spider using GPT-4o.
Our approach outperforms a series of GPT-4 based Text-to-SQL systems when adopting DeepSeek (much cheaper) with the same intact prompts.
arXiv Detail & Related papers (2024-10-31T16:22:26Z) - Accurate and Regret-aware Numerical Problem Solver for Tabular Question Answering [29.384514074911955]
We propose a model named TabLaP that uses a Large Language Model as a planner rather than an answer generator.
We show that TabLaP is substantially more accurate than the state-of-the-art models, improving the answer accuracy by 5.7% and 5.8% on the two datasets.
arXiv Detail & Related papers (2024-10-10T05:34:00Z) - TableRAG: Million-Token Table Understanding with Language Models [53.039560091592215]
TableRAG is a Retrieval-Augmented Generation (RAG) framework specifically designed for LM-based table understanding.
TableRAG leverages query expansion combined with schema and cell retrieval to pinpoint crucial information before providing it to the LMs (see the sketch after this list).
Our results demonstrate that TableRAG achieves the highest retrieval quality, leading to the new state-of-the-art performance on large-scale table understanding.
arXiv Detail & Related papers (2024-10-07T04:15:02Z) - TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning [61.14586098005874]
Current Large Language Models (LLMs) exhibit limited ability to understand table structures and to apply precise numerical reasoning.
We introduce our Tool-Augmented Reasoning framework for Tables (TART), which integrates LLMs with specialized tools.
TART contains three key components: a table formatter to ensure accurate data representation, a tool maker to develop specific computational tools, and an explanation generator to maintain explainability.
arXiv Detail & Related papers (2024-09-18T06:19:59Z) - OpenTab: Advancing Large Language Models as Open-domain Table Reasoners [38.29047314758911]
OpenTab is an open-domain table reasoning framework powered by Large Language Models (LLMs).
OpenTab significantly outperforms baselines in both open- and closed-domain settings, achieving up to 21.5% higher accuracy.
arXiv Detail & Related papers (2024-02-22T08:01:01Z) - Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding [79.9461269253121]
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.
Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks.
arXiv Detail & Related papers (2024-01-09T07:46:26Z) - TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively.
It covers several distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding (an illustrative sketch of these three components follows this list).
arXiv Detail & Related papers (2023-12-14T15:37:04Z) - GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
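As referenced in the TableRAG entry above, its retrieval step can be pictured as query expansion followed by schema and cell retrieval. The following is a deliberately naive sketch of that flow; the keyword-overlap scoring and all function names are stand-ins, not the paper's actual encoders or retrievers:

```python
from typing import List

def expand_query(question: str) -> List[str]:
    # Stand-in for LM-based query expansion: here, just the raw tokens.
    return question.lower().split()

def retrieve(candidates: List[str], terms: List[str], k: int) -> List[str]:
    # Rank candidates by how many expanded-query terms they contain.
    scored = sorted(((sum(t in c.lower() for t in terms), c) for c in candidates),
                    reverse=True)
    return [c for score, c in scored[:k] if score > 0]

def build_prompt(columns: List[str], cells: List[str], question: str) -> str:
    terms = expand_query(question)
    top_columns = retrieve(columns, terms, k=5)   # schema retrieval
    top_cells = retrieve(cells, terms, k=20)      # cell retrieval
    # Only the retrieved fragments, not the million-token table, reach the LM.
    return f"Columns: {top_columns}\nCells: {top_cells}\nQuestion: {question}"
```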
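The TAP4LLM entry above names three pre-processing components: sampling, augmentation, and packing. The sketch below, assuming rows stored as dictionaries and a simple external-knowledge lookup, shows how such a pipeline could be wired together; every function body is a simplified stand-in rather than the paper's implementation:

```python
from typing import Dict, List

Row = Dict[str, str]

def sample_rows(rows: List[Row], query: str, k: int = 10) -> List[Row]:
    # (1) Table sampling: keep rows sharing any term with the query.
    terms = set(query.lower().split())
    hits = [r for r in rows if terms & set(" ".join(r.values()).lower().split())]
    return (hits or rows)[:k]

def augment(rows: List[Row], knowledge: Dict[str, str]) -> List[Row]:
    # (2) Table augmentation: attach an external note keyed by a cell value.
    return [{**r, "note": knowledge.get(r.get("name", ""), "")} for r in rows]

def pack_markdown(rows: List[Row]) -> str:
    # (3) Packing & serialization: render the sub-table in a format LLMs parse well.
    if not rows:
        return ""
    header = list(rows[0])
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(r.get(h, "") for h in header) + " |" for r in rows]
    return "\n".join(lines)
```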
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.