TableLoRA: Low-rank Adaptation on Table Structure Understanding for Large Language Models
- URL: http://arxiv.org/abs/2503.04396v1
- Date: Thu, 06 Mar 2025 12:50:14 GMT
- Authors: Xinyi He, Yihao Liu, Mengyu Zhou, Yeye He, Haoyu Dong, Shi Han, Zejian Yuan, Dongmei Zhang
- Abstract summary: TableLoRA is a module designed to improve LLMs' understanding of table structure during PEFT. It incorporates special tokens for serializing tables with a special token encoder and uses 2D LoRA to encode low-rank information on cell positions.
- Score: 57.005158277893194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tabular data are crucial in many fields, and their understanding by large language models (LLMs) under a high-parameter-efficiency paradigm is important. However, directly applying parameter-efficient fine-tuning (PEFT) techniques to tabular tasks presents significant challenges, particularly in terms of better table serialization and the representation of two-dimensional structured information within a one-dimensional sequence. To address this, we propose TableLoRA, a module designed to improve LLMs' understanding of table structure during PEFT. It incorporates special tokens for serializing tables with a special token encoder and uses 2D LoRA to encode low-rank information on cell positions. Experiments on four tabular-related datasets demonstrate that TableLoRA consistently outperforms vanilla LoRA and surpasses various table encoding methods tested in control experiments. These findings reveal that TableLoRA, as a table-specific LoRA, enhances the ability of LLMs to process tabular data effectively, especially in low-parameter settings, demonstrating its potential as a robust solution for handling table-related tasks.
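The "2D LoRA" mechanism in the abstract can be pictured with a short sketch: keep the frozen base weight, apply the usual low-rank update, and add each token's cell coordinates into the low-rank bottleneck. This is a minimal, hypothetical reading of the description above, not the paper's implementation; TwoDLoRALinear, max_rows, and max_cols are invented names.

```python
# Hypothetical sketch of a LoRA linear layer that injects 2D cell positions
# into the low-rank bottleneck, loosely following the abstract's description.
import torch
import torch.nn as nn

class TwoDLoRALinear(nn.Module):
    def __init__(self, in_dim, out_dim, rank=8, max_rows=64, max_cols=64):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim, bias=False)
        self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
        # Standard LoRA factors: effective weight is W + B @ A
        self.lora_A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_dim, rank))
        # Low-rank tables for the row/column index of each token's cell
        self.row_emb = nn.Embedding(max_rows, rank)
        self.col_emb = nn.Embedding(max_cols, rank)

    def forward(self, x, row_ids, col_ids):
        # x: (batch, seq, in_dim); row_ids, col_ids: (batch, seq) cell coordinates
        h = self.base(x)
        z = x @ self.lora_A.T                                  # (batch, seq, rank)
        z = z + self.row_emb(row_ids) + self.col_emb(col_ids)  # add 2D position info
        return h + z @ self.lora_B.T

layer = TwoDLoRALinear(16, 16)
x = torch.randn(2, 5, 16)
rows = torch.randint(0, 64, (2, 5))
cols = torch.randint(0, 64, (2, 5))
print(layer(x, rows, cols).shape)  # torch.Size([2, 5, 16])
```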
Related papers
- GTR: Graph-Table-RAG for Cross-Table Question Answering [53.11230952572134]
We propose the first Graph-Table-RAG framework, namely GTR, which reorganizes table corpora into a heterogeneous graph.
GTR exhibits superior cross-table question-answering performance while maintaining high deployment efficiency, demonstrating its practical applicability in real-world settings.
arXiv Detail & Related papers (2025-04-02T04:24:41Z)
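As a rough picture of what "reorganizes table corpora into a heterogeneous graph" might look like, here is an invented sketch with table, column, and value nodes joined by typed edges; GTR's actual graph schema is richer than this.

```python
# Invented sketch of a heterogeneous graph over a table corpus; not GTR's schema.
from collections import defaultdict

graph = defaultdict(list)  # node -> list of (edge_type, neighbor) pairs

def add_table(name, header, rows):
    for j, col in enumerate(header):
        col_node = ("column", f"{name}.{col}")
        graph[("table", name)].append(("has_column", col_node))
        for row in rows:
            graph[col_node].append(("has_value", ("value", row[j])))

add_table("cities", ["city", "country"], [["Paris", "France"], ["Rome", "Italy"]])
for node, edges in graph.items():
    print(node, "->", edges)
```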
- TableRAG: Million-Token Table Understanding with Language Models [53.039560091592215]
TableRAG is a Retrieval-Augmented Generation (RAG) framework specifically designed for LM-based table understanding.
TableRAG leverages query expansion combined with schema and cell retrieval to pinpoint crucial information before providing it to the LMs.
Our results demonstrate that TableRAG achieves the highest retrieval quality, leading to new state-of-the-art performance on large-scale table understanding.
arXiv Detail & Related papers (2024-10-07T04:15:02Z)
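The schema-and-cell retrieval step that TableRAG describes can be illustrated with a toy ranking function. Token-overlap scoring is a deliberately naive stand-in for the paper's retrieval, query expansion is omitted, and retrieve() is an invented name.

```python
# Toy sketch of schema-plus-cell retrieval in the spirit of TableRAG.
import re

def tokens(s):
    return set(re.findall(r"\w+", s.lower()))

def overlap(a, b):
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / max(1, len(ta | tb))

def retrieve(query, schema, cells, k=2):
    # Query expansion is omitted here; rank schema entries and cells directly.
    top = lambda items: sorted(items, key=lambda s: -overlap(query, s))[:k]
    return top(schema), top(cells)

schema = ["country (text)", "population (millions)", "capital (text)"]
cells = ["France", "67.8", "Paris", "Japan", "125.7", "Tokyo"]
print(retrieve("What is the population of Japan?", schema, cells))
```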
- 2D-TPE: Two-Dimensional Positional Encoding Enhances Table Understanding for Large Language Models [48.123582712115336]
Existing methods often flatten the two-dimensional (2D) table structure into a sequence of tokens.
2D-TPE effectively mitigates the risk of losing essential spatial information while preserving computational efficiency.
Extensive experiments across five benchmarks demonstrate that 2D-TPE outperforms strong baselines.
arXiv Detail & Related papers (2024-09-29T13:16:37Z)
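Encoding a token's (row, column) cell coordinate rather than its flat index, as 2D-TPE proposes, can be sketched with generic sinusoidal codes. The actual 2D-TPE design differs; table_pos_encoding is a made-up helper.

```python
# Generic sinusoidal illustration of 2D (row, column) position codes for
# table tokens; for illustration only, not the 2D-TPE method itself.
import torch

def sinusoid(pos, dim):
    # pos: (n,) integer positions -> (n, dim) sinusoidal code
    i = torch.arange(dim // 2, dtype=torch.float32)
    freq = 1.0 / (10000 ** (2 * i / dim))
    ang = pos.float().unsqueeze(1) * freq
    return torch.cat([torch.sin(ang), torch.cos(ang)], dim=1)

def table_pos_encoding(row_ids, col_ids, dim):
    # Sum separate codes for the row index and the column index of each cell.
    return sinusoid(row_ids, dim) + sinusoid(col_ids, dim)

rows = torch.tensor([0, 0, 1, 1])
cols = torch.tensor([0, 1, 0, 1])
print(table_pos_encoding(rows, cols, 8).shape)  # torch.Size([4, 8])
```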
- TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning [61.14586098005874]
Current Large Language Models (LLMs) exhibit limited ability to understand table structures and to apply precise numerical reasoning.
We introduce our Tool-Augmented Reasoning framework for Tables (TART), which integrates LLMs with specialized tools.
TART contains three key components: a table formatter to ensure accurate data representation, a tool maker to develop specific computational tools, and an explanation generator to maintain explainability.
arXiv Detail & Related papers (2024-09-18T06:19:59Z)
- NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization [6.253771639590562]
We introduce NormTab, a framework aimed at enhancing the symbolic reasoning performance of Large Language Models (LLMs) by normalizing web tables.
We study table normalization as a stand-alone, one-time preprocessing step using LLMs to support symbolic reasoning on tabular data.
Our experimental evaluation, conducted on challenging web table datasets such as WikiTableQuestions and TabFact, demonstrates that leveraging NormTab significantly improves symbolic reasoning performance.
arXiv Detail & Related papers (2024-06-25T22:40:03Z)
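A trivial example of what such normalization buys: once formatting noise is stripped from cells, symbolic steps like SQL or arithmetic can run directly. NormTab does this with an LLM; the regex below is only a stand-in to show the intent, and normalize_cell is an invented helper.

```python
# Regex stand-in for one-time table normalization before symbolic reasoning.
import re

def normalize_cell(value):
    # Strip thousands separators, currency signs, and percent marks so that
    # downstream SQL or arithmetic can treat the cell as a plain number.
    cleaned = value.replace(",", "").strip()
    m = re.fullmatch(r"[$€£]?\s*(-?\d+(?:\.\d+)?)\s*%?", cleaned)
    return m.group(1) if m else value.strip()

row = ["1,234", "$56.7", "12%", "n/a"]
print([normalize_cell(v) for v in row])  # ['1234', '56.7', '12', 'n/a']
```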
- TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively.
It covers three distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding.
arXiv Detail & Related papers (2023-12-14T15:37:04Z)
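The packing-and-serialization component is the most mechanical of the three: render the selected sub-table as text the LLM can read. A minimal markdown serializer illustrates one such format; to_markdown is an illustrative helper, not TAP4LLM's API.

```python
# Minimal markdown serializer illustrating table packing & serialization;
# the sampling and augmentation stages are omitted.
def to_markdown(header, rows):
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(str(v) for v in row) + " |" for row in rows]
    return "\n".join(lines)

print(to_markdown(["city", "population"], [["Paris", 2.1], ["Rome", 2.8]]))
```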
- HeLM: Highlighted Evidence augmented Language Model for Enhanced Table-to-Text Generation [7.69801337810352]
We conduct parameter-efficient fine-tuning on the LLaMA2 model.
Our approach involves injecting reasoning information into the input by emphasizing table-specific row data.
On both the FetaQA and QTSumm datasets, our approach achieved state-of-the-art results.
arXiv Detail & Related papers (2023-11-15T12:02:52Z)
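One possible reading of "emphasizing table-specific row data" is to tag evidence rows in the serialized table before prompting; the sketch below is purely illustrative and may not match HeLM's actual mechanism.

```python
# Invented illustration of prompt-side row highlighting: mark evidence rows
# so the model attends to them; not HeLM's implementation.
def highlight_rows(rows, evidence):
    lines = []
    for i, row in enumerate(rows):
        mark = ">> " if i in evidence else "   "
        lines.append(mark + " | ".join(row))
    return "\n".join(lines)

table = [["year", "sales"], ["2021", "10"], ["2022", "15"]]
print(highlight_rows(table, evidence={2}))  # row 2 is tagged as evidence
```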