Rethinking Tabular Data Understanding with Large Language Models
- URL: http://arxiv.org/abs/2312.16702v1
- Date: Wed, 27 Dec 2023 19:58:52 GMT
- Title: Rethinking Tabular Data Understanding with Large Language Models
- Authors: Tianyang Liu, Fei Wang, Muhao Chen
- Abstract summary: This study investigates the robustness of Large Language Models (LLMs) to structural perturbations in tables.
We show that structural variance among tables presenting the same content leads to a notable performance decline, particularly in symbolic reasoning tasks.
We conclude that aggregating textual and symbolic reasoning pathways, bolstered by a mix self-consistency mechanism, achieves SOTA performance, with an accuracy of 73.6% on WikiTableQuestions.
- Score: 39.38132513255292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have shown to be capable of various tasks, yet
their capability in interpreting and reasoning over tabular data remains an
underexplored area. In this context, this study investigates tabular data understanding from three core
perspectives: the robustness of LLMs to structural perturbations in tables, the
comparative analysis of textual and symbolic reasoning on tables, and the
potential of boosting model performance through the aggregation of multiple
reasoning pathways. We discover that structural variance among tables presenting
the same content leads to a notable performance decline, particularly in
symbolic reasoning tasks, which motivates a method for table
structure normalization. Moreover, textual reasoning slightly edges out
symbolic reasoning, and a detailed error analysis reveals that each exhibits
different strengths depending on the specific tasks. Notably, aggregating
textual and symbolic reasoning pathways, bolstered by a mix self-consistency
mechanism, achieves SOTA performance, with an accuracy of 73.6% on
WikiTableQuestions, a substantial advancement over previous table processing
paradigms for LLMs.
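The table structure normalization idea can be pictured with a small heuristic: detect whether a table's records run down the rows or across the columns, and transpose it into a canonical row-major layout before reasoning. The Python sketch below is illustrative only; the type-consistency heuristic and every function name are assumptions, not the paper's actual implementation.

```python
from typing import List

Table = List[List[str]]  # a table as a list of rows of cell strings

def is_numeric(cell: str) -> bool:
    """Heuristic type check: does the cell parse as a number?"""
    try:
        float(cell.replace(",", ""))
        return True
    except ValueError:
        return False

def type_consistency(lines: Table) -> float:
    """Average type homogeneity of each line, skipping its leading
    header/label cell. Canonical columns score higher than rows."""
    scores = []
    for line in lines:
        body = line[1:]
        if not body:
            continue
        numeric = sum(is_numeric(c) for c in body)
        scores.append(max(numeric, len(body) - numeric) / len(body))
    return sum(scores) / max(len(scores), 1)

def normalize(table: Table) -> Table:
    """Transpose the table when its rows look more type-consistent than
    its columns, i.e., when records appear to run across the columns."""
    columns = [list(col) for col in zip(*table)]
    return columns if type_consistency(table) > type_consistency(columns) else table

# A transposed table whose records run across the columns:
t = [["Player", "Alice", "Bob", "Carol"],
     ["Team", "Red", "Blue", "Red"],
     ["Score", "12", "7", "9"]]
print(normalize(t))  # records restored as rows, header as the first row
```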
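Likewise, mix self-consistency can be sketched as pooling sampled answers from the textual and symbolic pathways and taking a majority vote. In this minimal sketch the two callables are hypothetical stand-ins for a chain-of-thought LLM call and an executed generated program; the aggregation logic, not the stubs, is the point.

```python
from collections import Counter
from typing import Callable, List, Optional

def mix_self_consistency(
    textual_answer: Callable[[], Optional[str]],
    symbolic_answer: Callable[[], Optional[str]],
    samples_per_path: int = 5,
) -> Optional[str]:
    """Sample both reasoning pathways and majority-vote over the pool."""
    pool: List[str] = []
    for _ in range(samples_per_path):
        for answer in (textual_answer(), symbolic_answer()):
            if answer is not None:  # e.g., a generated program failed to run
                pool.append(answer.strip().lower())
    if not pool:
        return None
    # Self-consistency: the most frequent normalized answer wins.
    return Counter(pool).most_common(1)[0][0]
```

Pooling both pathways lets whichever one is more reliable for a given question dominate the vote, matching the paper's observation that textual and symbolic reasoning have complementary strengths.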
Related papers
- GRS-QA -- Graph Reasoning-Structured Question Answering Dataset [50.223851616680754]
We introduce the Graph Reasoning-Structured Question Answering dataset (GRS-QA), which includes both semantic contexts and reasoning structures for QA pairs.
Unlike existing M-QA datasets, GRS-QA explicitly captures intricate reasoning pathways by constructing reasoning graphs.
Our empirical analysis reveals that LLMs perform differently when handling questions with varying reasoning structures.
arXiv Detail & Related papers (2024-11-01T05:14:03Z) - Knowledge-Aware Reasoning over Multimodal Semi-structured Tables [85.24395216111462]
This study investigates whether current AI models can perform knowledge-aware reasoning on multimodal structured data.
We introduce MMTabQA, a new dataset designed for this purpose.
Our experiments highlight substantial challenges for current AI models in effectively integrating and interpreting multiple text and image inputs.
arXiv Detail & Related papers (2024-08-25T15:17:43Z) - ALTER: Augmentation for Large-Table-Based Reasoning [5.164923314261229]
ALTER (Augmentation for Large-Table-Based Reasoning) is a framework designed to harness the latent augmentation potential of both free-form natural language (NL) questions and semi-structured tabular data.
By utilizing only a small subset of relevant data from the table, ALTER achieves outstanding performance on table-based reasoning benchmarks.
arXiv Detail & Related papers (2024-07-03T12:34:45Z) - H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables [56.73919743039263]
This paper introduces a novel algorithm that integrates both symbolic and semantic (textual) approaches in a two-stage process to address the limitations of relying on either approach alone.
Our experiments demonstrate that H-STAR significantly outperforms state-of-the-art methods across three question-answering (QA) and fact-verification datasets.
arXiv Detail & Related papers (2024-06-29T21:24:19Z) - NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization [6.253771639590562]
We introduce NormTab, a framework aimed at enhancing the symbolic reasoning performance of Large Language Models (LLMs) by normalizing web tables.
We study table normalization as a stand-alone, one-time preprocessing step using LLMs to support symbolic reasoning on tabular data.
Our experimental evaluation, conducted on challenging web table datasets such as WikiTableQuestions and TabFact, demonstrates that NormTab significantly improves symbolic reasoning performance.
arXiv Detail & Related papers (2024-06-25T22:40:03Z) - On the Robustness of Language Models for Tabular Question Answering [7.486549276995143]
Large Language Models (LLMs) have been shown to tackle table comprehension tasks without specific training.
We evaluate the robustness of LLMs on the Wikipedia-based WTQ and the financial report-based TAT-QA TQA datasets.
arXiv Detail & Related papers (2024-06-18T15:41:15Z) - Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering [53.56653281752486]
This study explores Large Language Models' mathematical reasoning on four financial question-answering datasets.
We focus on sensitivity to table complexity and performance variations with an increasing number of arithmetic reasoning steps.
We introduce a novel prompting technique tailored to semi-structured documents, matching or outperforming other baselines.
arXiv Detail & Related papers (2024-02-17T05:10:18Z) - Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformer Language Models (TLMs) were tested on a benchmark for the dynamic estimation of thematic fit.
Our results show that TLMs can reach performance comparable to that achieved by SDM.
However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z)