Related papers: Piece of Table: A Divide-and-Conquer Approach for Selecting Subtables in Table Question Answering

URL: http://arxiv.org/abs/2412.07629v4
Date: Wed, 19 Feb 2025 11:56:57 GMT
Title: Piece of Table: A Divide-and-Conquer Approach for Selecting Subtables in Table Question Answering
Authors: Wonjin Lee, Kyumin Kim, Sungjae Lee, Jihun Lee, Kwang In Kim
Abstract summary: PieTa is a new framework for subtable-based question answering (QA). It operates through an iterative process of dividing tables into smaller windows, using LMs to select relevant cells within each window, and merging these cells into a subtable. It demonstrates improved performance over previous subtable-based QA approaches.
Score: 20.926770550682964
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Applying language models (LMs) to tables is challenging due to the inherent structural differences between two-dimensional tables and one-dimensional text for which the LMs were originally designed. Furthermore, when applying linearized tables to LMs, the maximum token lengths often imposed in self-attention calculations make it difficult to comprehensively understand the context spread across large tables. To address these challenges, we present PieTa (Piece of Table), a new framework for subtable-based question answering (QA). PieTa operates through an iterative process of dividing tables into smaller windows, using LMs to select relevant cells within each window, and merging these cells into a subtable. This multi-resolution approach captures dependencies across multiple rows and columns while avoiding the limitations caused by long context inputs. Instantiated as a simple iterative subtable union algorithm, PieTa demonstrates improved performance over previous subtable-based QA approaches.
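The divide, select, and merge loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`select_subtable`, `keyword_selector`), the non-overlapping window layout, the stopping rule, and the choice to merge selected cells by keeping every row and column that contains one are all assumptions made for illustration. A toy keyword matcher stands in for the LM that PieTa uses to pick relevant cells.

```python
from typing import Callable, Iterator, List, Set, Tuple

Table = List[List[str]]
Cell = Tuple[int, int]  # (row, column) index, local to a window
Selector = Callable[[Table, str], Set[Cell]]


def window_corners(n_rows: int, n_cols: int, h: int, w: int) -> Iterator[Cell]:
    """Yield the (top, left) corner of each non-overlapping h x w window."""
    for top in range(0, n_rows, h):
        for left in range(0, n_cols, w):
            yield top, left


def select_subtable(table: Table, question: str, select_cells: Selector,
                    h: int = 2, w: int = 2, max_iters: int = 10) -> Table:
    """Hypothetical iterative subtable selection (illustrative only)."""
    for _ in range(max_iters):
        n_rows = len(table)
        n_cols = len(table[0]) if table else 0
        keep_rows: Set[int] = set()
        keep_cols: Set[int] = set()
        # Divide: visit each window. Select: ask the selector for relevant cells.
        for top, left in window_corners(n_rows, n_cols, h, w):
            window = [row[left:left + w] for row in table[top:top + h]]
            for r, c in select_cells(window, question):
                keep_rows.add(top + r)
                keep_cols.add(left + c)
        # Stop when nothing was selected or the table no longer shrinks.
        if not keep_rows or (len(keep_rows) == n_rows and len(keep_cols) == n_cols):
            break
        # Merge: keep every row and column that holds a selected cell.
        table = [[table[r][c] for c in sorted(keep_cols)] for r in sorted(keep_rows)]
    return table


def keyword_selector(window: Table, question: str) -> Set[Cell]:
    """Toy stand-in for the LM: pick cells whose text occurs in the question."""
    q = question.lower()
    return {(r, c) for r, row in enumerate(window)
            for c, cell in enumerate(row) if cell and cell.lower() in q}


table = [
    ["city", "country", "population"],
    ["Seoul", "Korea", "9.7M"],
    ["Paris", "France", "2.1M"],
    ["Tokyo", "Japan", "14M"],
]
result = select_subtable(table, "What is the population of Paris?", keyword_selector)
print(result)  # [['city', 'population'], ['Paris', '2.1M']]
```

In this toy run, the first pass keeps the header row and the Paris row along with the first and third columns; the second pass selects cells in every remaining row and column, so the loop stops. If the selector returns nothing on a pass, the sketch falls back to the current table rather than returning an empty one.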

Related papers

TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding [52.59372043981724]
TableDART is a training-efficient framework that integrates multimodal views by reusing pretrained single-modality models. In addition, we propose a novel agent for cross-modal knowledge integration, which analyzes outputs from text- and image-based models.
arXiv Detail & Related papers (2025-09-18T07:00:13Z)
Improving Table Understanding with LLMs and Entity-Oriented Search [24.3302301035859]
We introduce an entity-oriented search method to improve table understanding with large language models (LLMs). This approach effectively leverages the semantic similarities between questions and table data, as well as the implicit relationships between table cells. It focuses on table entities, ensuring that table cells are semantically tightly bound, thereby enhancing contextual clarity.
arXiv Detail & Related papers (2025-08-23T14:02:45Z)
Improving Table Retrieval with Question Generation from Partial Tables [2.2169618382995764]
We propose QGpT, a simple yet effective method that uses an LLM to generate synthetic questions based on small portions of a table. The generated questions are then jointly embedded with the partial table segments used for generation, enhancing semantic alignment with user queries.
arXiv Detail & Related papers (2025-08-08T09:35:56Z)
Multimodal Tabular Reasoning with Privileged Structured Information [67.40011423365712]
We introduce TabUlar Reasoning with Bridged infOrmation (Turbo). Turbo benefits from a structure-aware reasoning trace generator based on DeepSeek-R1. Turbo achieves state-of-the-art performance (+7.2% vs. previous SOTA) across multiple datasets.
arXiv Detail & Related papers (2025-06-04T15:46:30Z)
Texts or Images? A Fine-grained Analysis on the Effectiveness of Input Representations and Models for Table Question Answering [16.790216473975146]
We conduct the first controlled study on the effectiveness of several combinations of table representations and models from two perspectives. We find that the best combination of table representation and model varies across setups. We propose FRES, a method selecting table representations dynamically, and observe a 10% average performance improvement.
arXiv Detail & Related papers (2025-05-20T09:36:17Z)
General Table Question Answering via Answer-Formula Joint Generation [27.599437384914186]
Advanced table question answering (TableQA) methods prompt large language models (LLMs) to generate answer text. These methods lack the versatility to cope with specific question types or table structures. We propose TabAF, a general table answering framework to solve multiple types of tasks over multiple types of tables simultaneously.
arXiv Detail & Related papers (2025-03-16T03:51:06Z)
Towards Question Answering over Large Semi-structured Tables [29.384514074911955]
TaDRe is a model that incorporates both pre- and post-table decomposition refinements to ensure table decomposition quality. TaDRe achieves state-of-the-art performance on large-table TableQA tasks.
arXiv Detail & Related papers (2025-02-19T04:45:05Z)
HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies [9.09415727445941]
We propose a cooperative game dubbed "HiddenTables" as a potential resolution to this challenge. "HiddenTables" is played between the code-generating "r" and the "Oracle windows" which evaluates the ability of the agents to solve Table QA tasks. We provide evidential experiments on a diverse set of tables that demonstrate an LLM's collective inability to generalize and perform on complex queries.
arXiv Detail & Related papers (2024-06-16T04:53:29Z)
Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval [52.592071689901196]
We introduce a method that uncovers useful join relations for any query and database during table retrieval. Our method outperforms the state-of-the-art approaches for table retrieval by up to 9.3% in F1 score and for end-to-end QA by up to 5.4% in accuracy.
arXiv Detail & Related papers (2024-04-15T15:55:01Z)
Augment before You Try: Knowledge-Enhanced Table Question Answering via Table Expansion [57.53174887650989]
Table question answering is a popular task that assesses a model's ability to understand and interact with structured data. Some existing methods convert both the table and external knowledge into text, which neglects the structured nature of the table. We propose a simple yet effective method to integrate external information in a given table.
arXiv Detail & Related papers (2024-01-28T03:37:11Z)
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding [79.9461269253121]
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts. Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks.
arXiv Detail & Related papers (2024-01-09T07:46:26Z)
TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively. It covers several distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding.
arXiv Detail & Related papers (2023-12-14T15:37:04Z)
MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering [61.48881995121938]
Real-world queries are complex in nature, often over multiple tables in a relational database or web page. Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers.
arXiv Detail & Related papers (2023-05-22T08:25:15Z)
SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge). We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution. To comprehensively evaluate SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
Retrieving Complex Tables with Multi-Granular Graph Representation Learning [20.72341939868327]
The task of natural language table retrieval seeks to retrieve semantically relevant tables based on natural language queries. Existing learning systems treat tables as plain text based on the assumption that tables are structured as dataframes. We propose Graph-based Table Retrieval (GTR), a generalizable NLTR framework with multi-granular graph representation learning.
arXiv Detail & Related papers (2021-05-04T20:19:03Z)