Utilizing Training Data to Improve LLM Reasoning for Tabular Understanding
- URL: http://arxiv.org/abs/2508.18676v1
- Date: Tue, 26 Aug 2025 04:46:54 GMT
- Title: Utilizing Training Data to Improve LLM Reasoning for Tabular Understanding
- Authors: Chufan Gao, Jintai Chen, Jimeng Sun
- Abstract summary: We propose a novel prompting-based reasoning approach, Learn then Retrieve: LRTab. We first use prompting to obtain CoT responses over the training data. For incorrect CoTs, we prompt the LLM to predict Prompt Conditions to avoid the error, learning insights from the data. Finally, at inference time, we retrieve the most relevant Prompt Conditions for additional context for table understanding.
- Score: 28.832331050993464
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated tabular understanding and reasoning are essential tasks for data scientists. Recently, large language models (LLMs) have become increasingly prevalent in tabular reasoning tasks. Previous work focuses on (1) finetuning LLMs using labeled data or (2) training-free prompting of LLM agents using chain-of-thought (CoT). Finetuning offers dataset-specific learning at the cost of generalizability. Training-free prompting is highly generalizable but does not take full advantage of training data. In this paper, we propose a novel prompting-based reasoning approach, Learn then Retrieve: LRTab, which integrates the benefits of both by retrieving relevant information learned from training data. We first use prompting to obtain CoT responses over the training data. For incorrect CoTs, we prompt the LLM to predict Prompt Conditions to avoid the error, learning insights from the data. We validate the effectiveness of Prompt Conditions using validation data. Finally, at inference time, we retrieve the most relevant Prompt Conditions for additional context for table understanding. We provide comprehensive experiments on WikiTQ and TabFact, showing that LRTab is interpretable, cost-efficient, and can outperform previous baselines in tabular reasoning.
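The inference-time step of the abstract can be pictured as a retrieval loop: given a new table question, look up the stored Prompt Conditions most similar to it and prepend them to the prompt. A minimal sketch follows, assuming a bag-of-words cosine similarity as the retrieval measure (the paper does not specify the retriever here) and hypothetical example conditions; it is an illustration, not the authors' implementation.

```python
import math
import re
from collections import Counter

def _tokens(text: str) -> Counter:
    """Lowercased alphanumeric token counts, used as a crude embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_conditions(question: str, conditions: list[str], k: int = 2) -> list[str]:
    """Return the k stored conditions most similar to the question."""
    q = _tokens(question)
    return sorted(conditions, key=lambda c: _cosine(q, _tokens(c)), reverse=True)[:k]

# Hypothetical Prompt Conditions distilled from failed CoTs on training data.
conditions = [
    "Parse date columns before comparing chronological order.",
    "Strip commas from attendance numbers before summing.",
    "Exclude header rows when counting totals.",
]
context = retrieve_conditions("What is the sum of the attendance numbers?", conditions, k=1)
prompt = "\n".join(context) + "\nTable: ...\nQuestion: What is the sum of the attendance numbers?"
```

The retrieved conditions act as extra context lines above the serialized table, so the LLM sees error-avoidance advice learned from training mistakes without any weight update.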
Related papers
- Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs [31.64130018833542]
Large Language Models (LLMs) are widely used for temporal prediction, but their reliance on pretraining data raises contamination concerns. We investigate the capability of prompting to simulate earlier knowledge cutoffs in LLMs. Results demonstrate that while prompt-based simulated knowledge cutoffs show effectiveness when directly queried with information after that date, they struggle to induce forgetting when the forgotten content is not directly asked but causally related to the query.
arXiv Detail & Related papers (2025-09-26T20:37:44Z) - Logical Reasoning with Outcome Reward Models for Test-Time Scaling [10.795521518273214]
We present a set of Outcome Reward Models (ORMs) for deductive reasoning. To train the ORMs, we mainly generate data using Chain-of-Thought (CoT) with single and multiple samples. We also propose a novel tactic to further expand the type of errors covered in the training dataset of the ORM.
arXiv Detail & Related papers (2025-08-27T14:08:43Z) - Aligning Instruction Tuning with Pre-training [61.50161961371844]
We propose Aligning Instruction Tuning with Pre-training (AITP) to align instruction tuning with pre-training distributions. We show consistent performance improvements with AITP on three fully open large language models (LLMs) across eight benchmarks.
arXiv Detail & Related papers (2025-01-16T08:27:40Z) - Tabular Transfer Learning via Prompting LLMs [52.96022335067357]
We propose a novel framework, Prompt to Transfer (P2T), that utilizes unlabeled (or heterogeneous) source data with large language models (LLMs).
P2T identifies a column feature in a source dataset that is strongly correlated with a target task feature to create examples relevant to the target task, thus creating pseudo-demonstrations for prompts.
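The column-matching step described for P2T can be sketched concretely: score each source column against the target-task feature and keep the strongest match as the basis for pseudo-demonstrations. The sketch below assumes numeric columns and Pearson correlation as the relevance measure; the column names, data, and demonstration format are hypothetical illustrations, not taken from the paper.

```python
import math

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def best_column(source_table: dict[str, list[float]], target_values: list[float]) -> str:
    """Pick the source column whose values correlate most strongly
    (in absolute value) with the target-task feature."""
    return max(source_table, key=lambda c: abs(pearson(source_table[c], target_values)))

# Hypothetical source table and target-feature proxy values.
source = {
    "age": [25, 32, 47, 51],
    "height_cm": [170, 180, 165, 175],
}
target = [24, 33, 46, 52]

col = best_column(source, target)
# Format the matched column as pseudo-demonstrations for the prompt.
demos = [f"{col}={v} -> label={t}" for v, t in zip(source[col], target)]
```

Each formatted line then serves as an in-context example for the target task, standing in for labeled demonstrations that the target dataset lacks.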
arXiv Detail & Related papers (2024-08-09T11:30:52Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models [21.10890310571397]
Large Language Models (LLMs) can be applied to a diverse set of tasks, but the critical issues of data contamination and memorization are often glossed over. This work introduces a variety of different techniques to assess whether a language model has seen a dataset during training. We then compare the few-shot learning performance of LLMs on datasets that were seen during training to the performance on datasets released after training.
arXiv Detail & Related papers (2024-04-09T10:58:21Z) - Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science [17.282770819829913]
This research applies Large Language Models (LLMs) to these predictive tasks. Our research aims to mitigate this gap by compiling a comprehensive corpus of tables annotated with instructions and executing large-scale training of Llama-2.
arXiv Detail & Related papers (2024-03-29T14:41:21Z) - TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios [51.66718740300016]
TableLLM is a robust large language model (LLM) with 8 billion parameters. TableLLM is purpose-built for proficiently handling data manipulation tasks. We have released the model checkpoint, source code, benchmarks, and a web application for user interaction.
arXiv Detail & Related papers (2024-03-28T11:21:12Z) - Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding [79.9461269253121]
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.
Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks.
arXiv Detail & Related papers (2024-01-09T07:46:26Z) - TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively.
It covers several distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding.
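Two of the TAP4LLM components listed above, query-based table sampling and packing/serialization, can be illustrated in miniature. The sketch below uses token overlap with the query as a crude stand-in for "query-semantic" sampling and markdown as one serialization format; the table data and function names are illustrative assumptions, not the suite's actual API.

```python
import re

def sample_rows(rows: list[list], query: str, max_rows: int = 3) -> list[list]:
    """Keep the rows sharing the most tokens with the query, a crude
    proxy for decomposing a large table by query semantics."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    def overlap(row):
        row_tokens = set(re.findall(r"[a-z0-9]+", " ".join(map(str, row)).lower()))
        return len(q & row_tokens)
    return sorted(rows, key=overlap, reverse=True)[:max_rows]

def serialize_markdown(header: list[str], rows: list[list]) -> str:
    """Pack the sampled sub-table into a markdown table, one common
    LLM-friendly serialization."""
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(map(str, r)) + " |" for r in rows]
    return "\n".join(lines)

header = ["city", "year", "attendance"]
rows = [["Paris", 1998, 80000], ["Berlin", 2006, 69000], ["London", 1966, 96924]]
sub = sample_rows(rows, "attendance in Berlin", max_rows=1)
table_text = serialize_markdown(header, sub)
```

A real pre-processor would use semantic rather than lexical matching and support multiple serialization formats, but the shape of the pipeline, sample then pack then prompt, is the same.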
arXiv Detail & Related papers (2023-12-14T15:37:04Z) - TABLET: Learning From Instructions For Tabular Data [46.62140500101618]
We introduce TABLET, a benchmark of 20 diverse datasets annotated with instructions that vary in their phrasing, granularity, and technicality.
We find in-context instructions increase zero-shot F1 performance for Flan-T5 11b by 44% on average and 13% for ChatGPT on TABLET.
arXiv Detail & Related papers (2023-04-25T23:07:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.