Rethinking Tabular Data Understanding with Large Language Models
- URL: http://arxiv.org/abs/2312.16702v1
- Date: Wed, 27 Dec 2023 19:58:52 GMT
- Title: Rethinking Tabular Data Understanding with Large Language Models
- Authors: Tianyang Liu, Fei Wang, Muhao Chen
- Abstract summary: This study investigates the robustness of Large Language Models (LLMs) to structural perturbations in tables.
We show that structural variance among tables presenting the same content leads to a notable performance decline, particularly in symbolic reasoning tasks.
We conclude that aggregating textual and symbolic reasoning pathways, bolstered by a mix self-consistency mechanism, achieves SOTA performance, with an accuracy of 73.6% on WikiTableQuestions.
- Score: 39.38132513255292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have shown to be capable of various tasks, yet
their capability in interpreting and reasoning over tabular data remains an
underexplored area. In this context, this study investigates tabular data understanding from three core
perspectives: the robustness of LLMs to structural perturbations in tables, the
comparative analysis of textual and symbolic reasoning on tables, and the
potential of boosting model performance through the aggregation of multiple
reasoning pathways. We discover that structural variance among tables presenting
the same content leads to a notable performance decline, particularly in
symbolic reasoning tasks, which motivates a method for table
structure normalization. Moreover, textual reasoning slightly edges out
symbolic reasoning, and a detailed error analysis reveals that each exhibits
different strengths depending on the specific tasks. Notably, aggregating
textual and symbolic reasoning pathways, bolstered by a mix self-consistency
mechanism, achieves SOTA performance, with an accuracy of 73.6% on
WikiTableQuestions, a substantial advancement over previous table processing
paradigms for LLMs.
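The table structure normalization idea can be pictured with a small heuristic: detect whether a table's records run down the rows or across the columns, and transpose it into a canonical row-major layout before reasoning. The Python sketch below is illustrative only; the type-consistency heuristic and every function name are assumptions, not the paper's actual implementation.

```python
from typing import List

Table = List[List[str]]  # a table as a list of rows of cell strings

def is_numeric(cell: str) -> bool:
    """Heuristic type check: does the cell parse as a number?"""
    try:
        float(cell.replace(",", ""))
        return True
    except ValueError:
        return False

def type_consistency(lines: Table) -> float:
    """Average type homogeneity of each line, skipping its leading
    header/label cell. Canonical columns score higher than rows."""
    scores = []
    for line in lines:
        body = line[1:]
        if not body:
            continue
        numeric = sum(is_numeric(c) for c in body)
        scores.append(max(numeric, len(body) - numeric) / len(body))
    return sum(scores) / max(len(scores), 1)

def normalize(table: Table) -> Table:
    """Transpose the table when its rows look more type-consistent than
    its columns, i.e., when records appear to run across the columns."""
    columns = [list(col) for col in zip(*table)]
    return columns if type_consistency(table) > type_consistency(columns) else table

# A transposed table whose records run across the columns:
t = [["Player", "Alice", "Bob", "Carol"],
     ["Team", "Red", "Blue", "Red"],
     ["Score", "12", "7", "9"]]
print(normalize(t))  # records restored as rows, header as the first row
```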
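Likewise, mix self-consistency can be sketched as pooling sampled answers from the textual and symbolic pathways and taking a majority vote. In this minimal sketch the two callables are hypothetical stand-ins for a chain-of-thought LLM call and an executed generated program; the aggregation logic, not the stubs, is the point.

```python
from collections import Counter
from typing import Callable, List, Optional

def mix_self_consistency(
    textual_answer: Callable[[], Optional[str]],
    symbolic_answer: Callable[[], Optional[str]],
    samples_per_path: int = 5,
) -> Optional[str]:
    """Sample both reasoning pathways and majority-vote over the pool."""
    pool: List[str] = []
    for _ in range(samples_per_path):
        for answer in (textual_answer(), symbolic_answer()):
            if answer is not None:  # e.g., a generated program failed to run
                pool.append(answer.strip().lower())
    if not pool:
        return None
    # Self-consistency: the most frequent normalized answer wins.
    return Counter(pool).most_common(1)[0][0]
```

Pooling both pathways lets whichever one is more reliable for a given question dominate the vote, matching the paper's observation that textual and symbolic reasoning have complementary strengths.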
Related papers
- GRS-QA -- Graph Reasoning-Structured Question Answering Dataset [50.223851616680754]
We introduce the Graph Reasoning-Structured Question Answering dataset (GRS-QA), which includes both semantic contexts and reasoning structures for QA pairs.
Unlike existing M-QA datasets, GRS-QA explicitly captures intricate reasoning pathways by constructing reasoning graphs.
Our empirical analysis reveals that LLMs perform differently when handling questions with varying reasoning structures.
arXiv Detail & Related papers (2024-11-01T05:14:03Z) - Knowledge-Aware Reasoning over Multimodal Semi-structured Tables [85.24395216111462]
This study investigates whether current AI models can perform knowledge-aware reasoning on multimodal structured data.
We introduce MMTabQA, a new dataset designed for this purpose.
Our experiments highlight substantial challenges for current AI models in effectively integrating and interpreting multiple text and image inputs.
arXiv Detail & Related papers (2024-08-25T15:17:43Z) - ALTER: Augmentation for Large-Table-Based Reasoning [5.164923314261229]
ALTER (Augmentation for Large-Table-Based Reasoning) is a framework designed to harness the latent augmentation potential of both free-form natural language (NL) questions and semi-structured tabular data.
By utilizing only a small subset of relevant data from the table, ALTER achieves outstanding performance on table-based reasoning benchmarks.
arXiv Detail & Related papers (2024-07-03T12:34:45Z) - H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables [56.73919743039263]
This paper introduces a novel algorithm that integrates both symbolic and semantic (textual) approaches in a two-stage process to address the limitations of relying on either approach alone.
Our experiments demonstrate that H-STAR significantly outperforms state-of-the-art methods across three question-answering (QA) and fact-verification datasets.
arXiv Detail & Related papers (2024-06-29T21:24:19Z) - NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization [6.253771639590562]
We introduce NormTab, a framework aimed at enhancing the symbolic reasoning performance of Large Language Models (LLMs) by normalizing web tables.
We study table normalization as a stand-alone, one-time preprocessing step using LLMs to support symbolic reasoning on tabular data.
Our experimental evaluation, conducted on challenging web table datasets such as WikiTableQuestions and TabFact, demonstrates that NormTab significantly improves symbolic reasoning performance.
arXiv Detail & Related papers (2024-06-25T22:40:03Z) - On the Robustness of Language Models for Tabular Question Answering [7.486549276995143]
Large Language Models (LLMs) have been shown to tackle table comprehension tasks without specific training.
We evaluate the robustness of LLMs on the Wikipedia-based WTQ and the financial report-based TAT-QA TQA datasets.
arXiv Detail & Related papers (2024-06-18T15:41:15Z) - Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering [53.56653281752486]
This study explores Large Language Models' mathematical reasoning on four financial question-answering datasets.
We focus on sensitivity to table complexity and performance variations with an increasing number of arithmetic reasoning steps.
We introduce a novel prompting technique tailored to semi-structured documents, matching or outperforming other baselines.
arXiv Detail & Related papers (2024-02-17T05:10:18Z) - Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformer Language Models (TLMs) were tested on a benchmark for the dynamic estimation of thematic fit.
Our results show that TLMs can reach performance comparable to that achieved by SDM.
However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z)