TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
- URL: http://arxiv.org/abs/2404.10150v1
- Date: Mon, 15 Apr 2024 21:42:20 GMT
- Title: TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
- Authors: Md Mahadi Hasan Nahid, Davood Rafiei
- Abstract summary: Table reasoning is a challenging task that requires understanding both natural language questions and structured data.
We propose TabSQLify, a novel method that leverages text-to-SQL generation to decompose tables into smaller and relevant sub-tables.
Our method performs remarkably well on the WikiTQ benchmark, achieving an accuracy of 64.7%.
- Score: 6.253771639590562
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Table reasoning is a challenging task that requires understanding both natural language questions and structured tabular data. Large language models (LLMs) have shown impressive capabilities in natural language understanding and generation, but they often struggle with large tables due to their limited input length. In this paper, we propose TabSQLify, a novel method that leverages text-to-SQL generation to decompose tables into smaller, relevant sub-tables containing only the information essential for answering questions or verifying statements, before performing the reasoning task. In a comprehensive evaluation on four challenging datasets, our approach demonstrates comparable or superior performance to prevailing methods that rely on full tables as input. Moreover, our method reduces the input context length significantly, making it more scalable and efficient for large-scale table reasoning applications. Our method performs remarkably well on the WikiTQ benchmark, achieving an accuracy of 64.7%, and on the TabFact benchmark it achieves a high accuracy of 79.5%. These results surpass other LLM-based baselines using gpt-3.5-turbo (ChatGPT). By shrinking tables significantly, TabSQLify alleviates the computational load on LLMs when handling large tables without compromising performance.
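As a rough illustration of the two-stage pipeline described in the abstract, the sketch below (my own simplification, not the authors' code; `call_llm` is a hypothetical stand-in for a gpt-3.5-turbo call) first asks the model to write a SQL query that keeps only the columns and rows needed for the question, executes it with sqlite3 to obtain a sub-table, and then asks the model to answer from that sub-table alone.
```python
import sqlite3

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call (e.g., gpt-3.5-turbo); returns model text."""
    raise NotImplementedError("plug in an LLM client here")

def decompose_and_answer(columns, rows, question):
    # Stage 1: ask the LLM for a SQL query that keeps only the columns and
    # rows needed to answer the question (the table decomposition step).
    schema = "CREATE TABLE T ({})".format(", ".join(f'"{c}" TEXT' for c in columns))
    sql = call_llm(
        f"{schema}\nWrite a SQLite query over table T that selects only the "
        f"columns and rows needed to answer: {question}\nSQL:"
    ).strip()

    # Execute the query on an in-memory copy of the table to get a sub-table.
    con = sqlite3.connect(":memory:")
    con.execute(schema)
    con.executemany(
        "INSERT INTO T VALUES ({})".format(", ".join("?" for _ in columns)), rows)
    sub_table = con.execute(sql).fetchall()

    # Stage 2: the answering prompt contains only the (much smaller) sub-table,
    # which is how the input context length is reduced.
    return call_llm(f"Sub-table: {sub_table}\nQuestion: {question}\nAnswer:")
```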
Related papers
- Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL [83.99974309930072]
Knowledge distillation (KD) is a common approach that aims to distill a larger teacher model into a smaller student model.
We propose to improve KD with Imperfect Data, namely KID, which effectively boosts performance without introducing much training budget.
KID not only achieves consistent and significant performance gains across all model types and sizes, but also effectively improves training efficiency.
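KD itself is standard; as a generic, hedged sketch (not the paper's KID method), the PyTorch snippet below shows the usual token-level distillation loss in which the student is trained to match the teacher's softened next-token distribution.
```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Generic token-level KD loss: KL(teacher || student) over the vocabulary.

    Both logit tensors have shape (batch, seq_len, vocab_size).
    """
    t = temperature
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    teacher_prob = F.softmax(teacher_logits / t, dim=-1)
    # Scale by t**2 so gradients keep the same magnitude as the hard-label loss.
    return F.kl_div(student_logp, teacher_prob, reduction="batchmean") * (t * t)
```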
arXiv Detail & Related papers (2024-10-15T07:51:00Z)
- Accurate and Regret-aware Numerical Problem Solver for Tabular Question Answering [29.384514074911955]
We propose a model named TabLaP that uses Large Language Models as a planner rather than an answer generator.
We show that TabLaP is substantially more accurate than the state-of-the-art models, improving the answer accuracy by 5.7% and 5.8% on the two datasets.
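A minimal sketch of the planner-vs-answerer idea, under my own assumptions rather than the paper's implementation (`call_llm` is a hypothetical placeholder): the LLM only plans the arithmetic as a Python expression over values copied from the table, and the numeric result is computed deterministically instead of being generated token by token.
```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; returns model text."""
    raise NotImplementedError

def plan_then_solve(table_text: str, question: str) -> str:
    # The LLM acts as a planner: it emits the calculation needed, not the answer.
    plan = call_llm(
        f"Table:\n{table_text}\nQuestion: {question}\n"
        "Write the calculation as a single Python expression using numbers "
        "from the table. Output only the expression."
    )
    # The plan is executed deterministically, so the numerical result does not
    # depend on the LLM getting the arithmetic right in free-form text.
    return str(eval(plan, {"__builtins__": {}}, {}))  # illustration only
```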
arXiv Detail & Related papers (2024-10-10T05:34:00Z)
- TableRAG: Million-Token Table Understanding with Language Models [53.039560091592215]
TableRAG is a Retrieval-Augmented Generation (RAG) framework specifically designed for LM-based table understanding.
TableRAG leverages query expansion combined with schema and cell retrieval to pinpoint crucial information before providing it to the LMs.
Our results demonstrate that TableRAG achieves the highest retrieval quality, leading to the new state-of-the-art performance on large-scale table understanding.
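As a toy sketch of that retrieve-then-read flow (simple word-overlap scoring stands in for the paper's retrievers; `call_llm` is a hypothetical LLM call), the snippet expands the query, retrieves the most relevant column names and individual cells, and passes only those to the model rather than the full table.
```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; returns model text."""
    raise NotImplementedError

def overlap(a: str, b: str) -> int:
    return len(set(a.lower().split()) & set(b.lower().split()))

def table_rag_style_answer(columns, rows, question, k=3):
    # Query expansion: get extra search phrases from the LLM.
    expansions = [question] + call_llm(
        f"List short search phrases for: {question}").splitlines()

    # Schema retrieval: keep only the k column names most relevant to any phrase.
    keep_cols = sorted(columns,
                       key=lambda c: max(overlap(c, q) for q in expansions),
                       reverse=True)[:k]

    # Cell retrieval: keep the k individual cells most relevant to any phrase.
    cells = [(c, row[i]) for row in rows for i, c in enumerate(columns)]
    cells.sort(key=lambda cell: max(overlap(str(cell[1]), q) for q in expansions),
               reverse=True)

    # Only the retrieved schema and cells reach the LM, never the full table.
    return call_llm(f"Columns: {keep_cols}\nRelevant cells: {cells[:k]}\n"
                    f"Question: {question}\nAnswer:")
```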
arXiv Detail & Related papers (2024-10-07T04:15:02Z)
- TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning [61.14586098005874]
Current Large Language Models (LLMs) exhibit limited ability to understand table structures and to apply precise numerical reasoning.
We introduce our Tool-Augmented Reasoning framework for Tables (TART), which integrates LLMs with specialized tools.
TART contains three key components: a table formatter to ensure accurate data representation, a tool maker to develop specific computational tools, and an explanation generator to maintain explainability.
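A hedged sketch of how those three components could fit together (all function names are mine, not TART's API; `call_llm` is a hypothetical LLM call): a formatter serializes the table, a tool maker has the LLM write and compile a small Python function for the numerical part, and an explanation generator turns the computed value into a grounded answer.
```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; returns model text."""
    raise NotImplementedError

def format_table(columns, rows) -> str:
    # Table formatter: serialize the table so values stay aligned with headers.
    header = " | ".join(columns)
    body = "\n".join(" | ".join(str(v) for v in row) for row in rows)
    return f"{header}\n{body}"

def make_tool(table_text: str, question: str):
    # Tool maker: the LLM writes a small Python function for the numerical
    # part of the question, which is then compiled for execution.
    code = call_llm(
        f"{table_text}\nWrite a Python function solve(rows) that computes the "
        f"number needed for: {question}\nOutput only code.")
    namespace = {}
    exec(code, namespace)  # illustration only; sandbox this in practice
    return namespace["solve"]

def answer_with_explanation(columns, rows, question) -> str:
    table_text = format_table(columns, rows)
    value = make_tool(table_text, question)(rows)
    # Explanation generator: wrap the computed value in a grounded explanation.
    return call_llm(f"{table_text}\nQuestion: {question}\nComputed value: {value}\n"
                    "Explain the answer, citing the relevant cells.")
```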
arXiv Detail & Related papers (2024-09-18T06:19:59Z)
- Generating Tables from the Parametric Knowledge of Language Models [6.316194671269148]
We explore generating tables from the parametric knowledge of large language models (LLMs).
We examine the table generation abilities of four state-of-the-art LLMs: GPT-3.5, GPT-4, Llama2-13B, and Llama2-70B.
For evaluation, we introduce a novel benchmark, WikiTabGen which contains 100 curated Wikipedia tables.
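A minimal sketch of the task setup as I read it (illustrative names, not the paper's code; `call_llm` is a hypothetical placeholder): the model is prompted to reproduce a named Wikipedia table with no source text, so every cell has to come from its parametric knowledge.
```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; returns model text."""
    raise NotImplementedError

def generate_table_from_memory(title: str, columns: list) -> list:
    # No input table or document is given: the rows must come entirely from
    # what the model memorized during pre-training.
    text = call_llm(
        f"Reproduce the Wikipedia table '{title}' with columns {columns}. "
        "Output one row per line, cells separated by ' | '."
    )
    return [line.split(" | ") for line in text.splitlines() if line.strip()]
```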
arXiv Detail & Related papers (2024-06-16T12:55:55Z)
- OpenTab: Advancing Large Language Models as Open-domain Table Reasoners [38.29047314758911]
OpenTab is an open-domain table reasoning framework powered by Large Language Models (LLMs).
OpenTab significantly outperforms baselines in both open- and closed-domain settings, achieving up to 21.5% higher accuracy.
arXiv Detail & Related papers (2024-02-22T08:01:01Z)
- Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding [79.9461269253121]
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.
Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks.
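A rough sketch of the evolving-table idea (a tiny operation pool of my own choosing, not the framework's actual operation set; `call_llm` is a hypothetical LLM call): at each step the model sees the current table, picks an atomic operation or stops, and the transformed table itself carries the intermediate reasoning state forward.
```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; returns model text."""
    raise NotImplementedError

# A tiny pool of atomic table operations; rows are dicts keyed by column name.
OPERATIONS = {
    "select_columns": lambda t, arg: [{c: r[c] for c in arg} for r in t],
    "select_rows":    lambda t, arg: [r for i, r in enumerate(t) if i in arg],
    "sort_by":        lambda t, arg: sorted(t, key=lambda r: r[arg]),
}

def chain_of_table_style(table, question, max_steps=5):
    for _ in range(max_steps):
        # The model sees the *current* table, so the evolving table itself
        # serves as the intermediate "thought" between steps.
        step = call_llm(
            f"Table: {table}\nQuestion: {question}\n"
            f"Reply 'ANSWER', or one of {list(OPERATIONS)} followed by its "
            "argument as a Python literal.")
        if step.startswith("ANSWER"):
            break
        name, arg = step.split(maxsplit=1)
        table = OPERATIONS[name](table, eval(arg, {"__builtins__": {}}, {}))
    return call_llm(f"Table: {table}\nQuestion: {question}\nAnswer:")
```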
arXiv Detail & Related papers (2024-01-09T07:46:26Z)
- TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively.
It covers several distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding.
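As shown in the sketch below (a simplified stand-in using keyword overlap and markdown serialization; the names and heuristics are mine, not TAP4LLM's), the three components can be read as a small prompt-building pipeline: sample question-relevant rows, augment the schema with external notes, and pack everything into a serialized table.
```python
def sample_rows(rows, question, k=10):
    # (1) Table sampling: keep rows that share tokens with the question.
    q_tokens = set(question.lower().split())
    hits = [r for r in rows
            if q_tokens & set(" ".join(map(str, r)).lower().split())]
    return (hits or rows)[:k]

def augment_schema(columns, external_notes):
    # (2) Table augmentation: attach extra knowledge (e.g., column descriptions
    # from an external source or model) to the schema.
    return {c: external_notes.get(c, "") for c in columns}

def pack_prompt(columns, rows, notes, question):
    # (3) Packing & serialization: emit a compact markdown table plus notes.
    lines = ["| " + " | ".join(columns) + " |",
             "| " + " | ".join("---" for _ in columns) + " |"]
    lines += ["| " + " | ".join(map(str, r)) + " |" for r in rows]
    lines += [f"{c}: {n}" for c, n in notes.items() if n]
    return "\n".join(lines) + f"\nQuestion: {question}\nAnswer:"

def build_prompt(columns, rows, question, external_notes=None):
    rows = sample_rows(rows, question)
    notes = augment_schema(columns, external_notes or {})
    return pack_prompt(columns, rows, notes, question)
```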
arXiv Detail & Related papers (2023-12-14T15:37:04Z)
- Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning [45.013230888670435]
We exploit large language models (LLMs) as decomposers for effective table-based reasoning.
We decompose huge evidence (a huge table) into sub-evidence (a small table) to mitigate the interference of useless information.
We propose a "parsing-execution-filling" strategy to alleviate the dilemma of the chain of thought.
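A hedged sketch of the decompose-then-fill flow as I understand it (hypothetical `call_llm` and `run_sql` placeholders, not the paper's code): the evidence is shrunk to a sub-table, the numerical part of the question is rewritten as a cloze whose blank is computed by executing SQL, and the filled sub-answer is handed to the final reasoning step.
```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; returns model text."""
    raise NotImplementedError

def run_sql(query: str) -> str:
    """Hypothetical SQL executor over the evidence table; returns a value."""
    raise NotImplementedError

def decompose_then_answer(table_text: str, question: str) -> str:
    # Decompose the evidence: keep only the rows and columns that matter.
    sub_table = call_llm(
        f"{table_text}\nKeep only the rows and columns needed for: {question}")

    # Parsing-execution-filling: the numerical sub-question becomes a cloze
    # whose blank is computed by SQL, not by free-form chain-of-thought.
    cloze, sql = call_llm(
        f"{sub_table}\nRewrite the numerical part of '{question}' as a cloze "
        "sentence with [BLANK], then give SQL computing the blank.\n"
        "Format: <cloze> || <sql>").split("||", 1)
    filled = cloze.strip().replace("[BLANK]", run_sql(sql.strip()))

    # The final reasoning step sees the sub-table plus the filled sub-answer.
    return call_llm(f"{sub_table}\n{filled}\nQuestion: {question}\nAnswer:")
```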
arXiv Detail & Related papers (2023-01-31T17:51:45Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
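To make the synchronous-grammar idea concrete, here is a toy sketch (two hand-written rules, nothing like the grammar induced in the paper): each rule pairs a question template with a SQL template, and expanding both sides with the same substitutions yields aligned question-SQL training pairs.
```python
import random

# A toy synchronous grammar: each rule pairs a question template with a SQL
# template that share the same slots (TABLE, COLUMN, VALUE).
RULES = [
    ("how many rows in TABLE have COLUMN equal to VALUE ?",
     "SELECT COUNT(*) FROM TABLE WHERE COLUMN = 'VALUE'"),
    ("what is the maximum COLUMN in TABLE ?",
     "SELECT MAX(COLUMN) FROM TABLE"),
]

def synthesize_pairs(table, columns, values, n=5, seed=0):
    """Build synthetic (question, SQL) pairs by expanding both sides of a rule
    with the same substitutions, as in a synchronous grammar."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        q, s = rng.choice(RULES)
        subs = {"TABLE": table, "COLUMN": rng.choice(columns),
                "VALUE": rng.choice(values)}
        for slot, filler in subs.items():
            q, s = q.replace(slot, str(filler)), s.replace(slot, str(filler))
        pairs.append((q, s))
    return pairs

# e.g. synthesize_pairs("matches", ["venue", "score"], ["London", "3"])
```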
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.