Large Language Models are Versatile Decomposers: Decompose Evidence and
Questions for Table-based Reasoning
- URL: http://arxiv.org/abs/2301.13808v3
- Date: Thu, 27 Apr 2023 11:24:10 GMT
- Title: Large Language Models are Versatile Decomposers: Decompose Evidence and
Questions for Table-based Reasoning
- Authors: Yunhu Ye, Binyuan Hui, Min Yang, Binhua Li, Fei Huang, Yongbin Li
- Abstract summary: We exploit large language models (LLMs) as decomposers for effective table-based reasoning.
We decompose huge evidence (a huge table) into sub-evidence (a small table) to mitigate the interference of useless information.
We propose a "parsing-execution-filling" strategy to alleviate the hallucination dilemma of the chain of thought.
- Score: 45.013230888670435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Table-based reasoning has shown remarkable progress in combining deep models
with discrete reasoning, which requires reasoning over both free-form natural
language (NL) questions and structured tabular data. However, previous
table-based reasoning solutions usually suffer from significant performance
degradation on huge evidence (tables). In addition, most existing methods
struggle to reason over complex questions since the required information is
scattered in different places. To alleviate the above challenges, we exploit
large language models (LLMs) as decomposers for effective table-based
reasoning, which (i) decompose huge evidence (a huge table) into sub-evidence
(a small table) to mitigate the interference of useless information for table
reasoning; and (ii) decompose complex questions into simpler sub-questions for
text reasoning. Specifically, we first use the LLMs to break down the evidence
(tables) involved in the current question, retaining the relevant evidence and
excluding the remaining irrelevant evidence from the huge table. In addition,
we propose a "parsing-execution-filling" strategy to alleviate the
hallucination dilemma of the chain of thought by decoupling logic and numerical
computation in each step. Extensive experiments show that our method can
effectively leverage decomposed evidence and questions and outperforms the
strong baselines on the TabFact, WikiTableQuestions, and FeTaQA datasets. Notably,
our model outperforms human performance for the first time on the TabFact
dataset.
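To make the two decompositions above concrete, here is a minimal sketch: huge evidence is first shrunk to a question-relevant sub-table, and the "parsing-execution-filling" step then has the LLM parse the question into a cloze statement plus a SQL query while sqlite3 executes the arithmetic. The llm() stub, prompt wording, and expected output formats are illustrative assumptions, not the authors' released implementation.
```python
# Sketch only: llm() is a stand-in for any chat-completion client, and the
# prompts/output formats below are assumptions, not the paper's exact ones.
import sqlite3


def llm(prompt: str) -> str:
    """Stand-in for a call to a large language model."""
    raise NotImplementedError


def decompose_evidence(table_text: str, question: str) -> str:
    """(i) Shrink huge evidence to the sub-table relevant to the question."""
    return llm(
        "Keep only the rows and columns needed to answer the question and "
        f"return the resulting sub-table.\nQuestion: {question}\n"
        f"Table:\n{table_text}"
    )


def parse_execute_fill(question: str, conn: sqlite3.Connection) -> str:
    """(ii) Decouple logic from numerical computation in each step.

    The LLM only *parses* the question into a cloze statement plus a SQL
    query; the database *executes* the arithmetic, and the result *fills*
    the placeholder, so numbers are never produced by free-form generation.
    """
    parsed = llm(
        "Rewrite the question as a statement containing one [MASK], then on "
        "a new line starting with 'SQL:' give a query that computes the "
        f"masked value.\nQuestion: {question}"
    )
    cloze, sql = parsed.split("\nSQL:", 1)      # assumed output format
    value = conn.execute(sql).fetchone()[0]     # execution, outside the LLM
    return cloze.replace("[MASK]", str(value))  # filling
```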
Related papers
- TableRAG: Million-Token Table Understanding with Language Models [53.039560091592215]
TableRAG is a Retrieval-Augmented Generation (RAG) framework specifically designed for LM-based table understanding.
TableRAG leverages query expansion combined with schema and cell retrieval to pinpoint crucial information before providing it to the LMs.
Our results demonstrate that TableRAG achieves the highest retrieval quality, leading to new state-of-the-art performance on large-scale table understanding.
arXiv Detail & Related papers (2024-10-07T04:15:02Z)
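As a rough illustration of the schema and cell retrieval described above, the toy sketch below scores column names and cell strings against the query and keeps only the top-k hits for the prompt; TableRAG itself uses query expansion and learned embeddings, so the word-overlap scorer here is an assumption made for brevity.
```python
# Toy retrieval sketch: real TableRAG uses query expansion plus embedding
# search; word overlap is a simple stand-in used here for illustration.
def overlap(query: str, text: str) -> float:
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)


def top_k(query: str, candidates: list[str], k: int) -> list[str]:
    return sorted(candidates, key=lambda c: overlap(query, c), reverse=True)[:k]


def build_prompt(query: str, columns: list[str], cells: list[str]) -> str:
    schema_hits = top_k(query, columns, k=3)     # schema retrieval
    cell_hits = top_k(query, cells, k=5)         # cell retrieval
    return (f"Relevant columns: {schema_hits}\n"
            f"Relevant cells: {cell_hits}\n"
            f"Question: {query}")
```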
- ALTER: Augmentation for Large-Table-Based Reasoning [5.164923314261229]
ALTER (Augmentation for Large-Table-Based Reasoning) is a framework designed to harness the latent augmentation potential of both free-form natural language (NL) questions and semi-structured tabular data.
By utilizing only a small subset of relevant data from the table, ALTER achieves outstanding performance on table-based reasoning benchmarks.
arXiv Detail & Related papers (2024-07-03T12:34:45Z)
- H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables [56.73919743039263]
This paper introduces a novel algorithm that integrates symbolic (SQL) and semantic (textual) reasoning in a two-stage process, addressing the limitations of using either approach alone.
Our experiments demonstrate that H-STAR significantly outperforms state-of-the-art methods across three question-answering (QA) and fact-verification datasets.
arXiv Detail & Related papers (2024-06-29T21:24:19Z)
- Optimizing Language Model's Reasoning Abilities with Weak Supervision [48.60598455782159]
We present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales.
A unique aspect of our dataset is the inclusion of 10,000 unannotated questions, enabling us to explore using less supervised data to boost LLMs' inference capabilities.
arXiv Detail & Related papers (2024-05-07T07:39:15Z)
- TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition [6.253771639590562]
Table reasoning is a challenging task that requires understanding both natural language questions and structured data.
We propose TabSQLify, a novel method that leverages text-to-SQL generation to decompose tables into smaller, relevant sub-tables.
Our method performs remarkably well on the WikiTQ benchmark, achieving an accuracy of 64.7%.
arXiv Detail & Related papers (2024-04-15T21:42:20Z)
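A small sketch of this decomposition idea, assuming an llm() stub and illustrative prompt text: the model writes one SELECT over the full table, and executing it materializes the smaller sub-table that is then reasoned over.
```python
# Sketch: the LLM writes SQL, sqlite3 executes it, and the result is the
# question-relevant sub-table. llm() and the prompt are assumptions.
import sqlite3


def llm(prompt: str) -> str:
    raise NotImplementedError  # plug in any chat-completion client


def extract_sub_table(conn: sqlite3.Connection, question: str) -> list[tuple]:
    schema = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchone()[0]                              # the CREATE TABLE statement
    query = llm(f"Schema: {schema}\nWrite one SELECT statement that keeps "
                f"only the rows and columns needed for: {question}")
    return conn.execute(query).fetchall()        # smaller, relevant sub-table
```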
- Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding [79.9461269253121]
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.
Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks.
arXiv Detail & Related papers (2024-01-09T07:46:26Z)
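The loop below loosely sketches the evolving-table idea: the LLM picks one atomic table operation at a time, the operation is applied, and the evolved table is fed back until an answer is emitted. The two operations, the (name, args) step format, and the ANSWER convention are assumptions for illustration, not the paper's released code.
```python
# Loose sketch of an evolving-table loop; not the paper's implementation.
import ast

OPS = {
    # each op maps (table, argument) -> evolved table; row 0 is the header
    "select_rows": lambda t, idx: [t[0]] + [t[i] for i in idx],
    "select_columns": lambda t, idx: [[row[j] for j in idx] for row in t],
}


def llm(prompt: str) -> str:
    raise NotImplementedError  # stand-in for a chat-completion client


def chain_of_table(table: list[list[str]], question: str) -> str:
    while True:
        step = llm(f"Table: {table}\nQuestion: {question}\n"
                   "Next operation as (name, args), or 'ANSWER: ...':")
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER:").strip()
        name, args = ast.literal_eval(step)   # e.g. ("select_columns", [0, 2])
        table = OPS[name](table, args)        # one reasoning step, one table
```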
- TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively.
It covers several distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding.
arXiv Detail & Related papers (2023-12-14T15:37:04Z)
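As an illustration of the packing & serialization component, the sketch below renders a sampled sub-table as markdown text for the prompt; markdown is one of several serialization formats the paper compares, and truncating to max_rows is a crude stand-in for its query-aware table sampling.
```python
# Sketch of table packing: serialize a (sampled) sub-table to markdown.
def serialize_markdown(header: list[str], rows: list[list[str]],
                       max_rows: int = 20) -> str:
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    for row in rows[:max_rows]:                  # naive table sampling
        lines.append("| " + " | ".join(str(c) for c in row) + " |")
    return "\n".join(lines)
```
For example, serialize_markdown(["year", "points"], [[2021, 10], [2022, 12]]) yields a two-row markdown table ready to embed in a prompt.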
- HeLM: Highlighted Evidence augmented Language Model for Enhanced Table-to-Text Generation [7.69801337810352]
We conduct parameter-efficient fine-tuning on the LLaMA2 model.
Our approach involves injecting reasoning information into the input by emphasizing table-specific row data.
On both the FetaQA and QTSumm datasets, our approach achieved state-of-the-art results.
arXiv Detail & Related papers (2023-11-15T12:02:52Z)
- INFOTABS: Inference on Tables as Semi-structured Data [39.84930221015755]
We introduce a new dataset called INFOTABS, comprising human-written textual hypotheses based on premises that are tables extracted from Wikipedia infoboxes.
Our analysis shows that the semi-structured, multi-domain and heterogeneous nature of the premises admits complex, multi-faceted reasoning.
Experiments reveal that, while human annotators agree on the relationships between a table-hypothesis pair, several standard modeling strategies are unsuccessful at the task.
arXiv Detail & Related papers (2020-05-13T02:07:54Z)