TableFormer: Robust Transformer Modeling for Table-Text Encoding
- URL: http://arxiv.org/abs/2203.00274v1
- Date: Tue, 1 Mar 2022 07:23:06 GMT
- Title: TableFormer: Robust Transformer Modeling for Table-Text Encoding
- Authors: Jingfeng Yang, Aditya Gupta, Shyam Upadhyay, Luheng He, Rahul Goel,
Shachi Paul
- Abstract summary: Existing models for table understanding require linearization of the table structure, where row or column order is encoded as an unwanted bias.
In this work, we propose a robust and structurally aware table-text encoding architecture TableFormer.
- Score: 18.00127368618485
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding tables is an important aspect of natural language
understanding. Existing models for table understanding require linearization of
the table structure, where row or column order is encoded as an unwanted bias.
Such spurious biases make the model vulnerable to row and column order
perturbations. Additionally, prior work has not thoroughly modeled the table
structures or table-text alignments, hindering the table-text understanding
ability. In this work, we propose a robust and structurally aware table-text
encoding architecture TableFormer, where tabular structural biases are
incorporated completely through learnable attention biases. TableFormer is (1)
strictly invariant to row and column order and (2) better able to understand
tables due to its tabular inductive biases. Our evaluations show that
TableFormer outperforms strong baselines in all settings on the SQA, WTQ, and
TabFact table reasoning datasets, and achieves state-of-the-art performance on
SQA, especially under answer-invariant row and column order perturbations (a 6%
improvement over the best baseline): previous SOTA models' performance drops by
4%-6% under such perturbations, while TableFormer is unaffected.
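To make the mechanism concrete, the following is a minimal sketch (not the released TableFormer implementation) of a PyTorch self-attention layer in which learnable, relation-dependent scalar biases are added to the attention logits in place of any row or column position embeddings. The relation ids (same row, same column, same cell) and all module names here are illustrative assumptions; the point is that the biases depend only on structural relations, not on absolute row/column indices, so permuting rows or columns leaves the encoding unchanged.

# Minimal sketch, assuming a PyTorch setting; relation vocabulary is hypothetical.
import torch
import torch.nn as nn

NUM_RELATIONS = 4  # e.g. 0 = other, 1 = same row, 2 = same column, 3 = same cell

class BiasedSelfAttention(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # one learnable scalar bias per (relation, head)
        self.rel_bias = nn.Embedding(NUM_RELATIONS, n_heads)

    def forward(self, x, rel_ids):
        # x: (batch, seq, d_model); rel_ids: (batch, seq, seq) integer relation ids
        b, s, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        logits = q @ k.transpose(-2, -1) / self.d_head ** 0.5    # (b, h, s, s)
        bias = self.rel_bias(rel_ids).permute(0, 3, 1, 2)        # (b, h, s, s)
        attn = torch.softmax(logits + bias, dim=-1)
        y = (attn @ v).transpose(1, 2).reshape(b, s, -1)
        return self.out(y)

# Toy usage: six tokens from a flattened 2x2 table plus headers; rel_ids would
# encode which token pairs share a row, a column, or a cell.
x = torch.randn(1, 6, 64)
rel_ids = torch.zeros(1, 6, 6, dtype=torch.long)
out = BiasedSelfAttention()(x, rel_ids)

Because reordering rows or columns only permutes the token sequence and the rel_ids tensor consistently, the per-token outputs are permuted but otherwise identical, which is the order-invariance property the abstract describes.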
Related papers
- Is This a Bad Table? A Closer Look at the Evaluation of Table Generation from Text [21.699434525769586]
Existing measures for table quality evaluation fail to capture the overall semantics of the tables.
We propose TabEval, a novel table evaluation strategy that captures table semantics.
To validate our approach, we curate a dataset comprising text descriptions of 1,250 diverse Wikipedia tables.
arXiv Detail & Related papers (2024-06-21T02:18:03Z) - KET-QA: A Dataset for Knowledge Enhanced Table Question Answering [63.56707527868466]
We propose to use a knowledge base (KB) as the external knowledge source for TableQA.
Answering every question requires integrating information from both the table and the sub-graph.
We design a retriever-reasoner structured pipeline model to extract pertinent information from the vast knowledge sub-graph.
arXiv Detail & Related papers (2024-05-13T18:26:32Z) - Chain-of-Table: Evolving Tables in the Reasoning Chain for Table
Understanding [79.9461269253121]
We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.
Chain-of-Table achieves new state-of-the-art performance on WikiTQ, FeTaQA, and TabFact benchmarks.
arXiv Detail & Related papers (2024-01-09T07:46:26Z) - ReasTAP: Injecting Table Reasoning Skills During Pre-training via
Synthetic Reasoning Examples [15.212332890570869]
We develop ReasTAP to show that high-level table reasoning skills can be injected into models during pre-training without a complex table-specific architecture design.
ReasTAP achieves new state-of-the-art performance on all benchmarks and delivers a significant improvement in the low-resource setting.
arXiv Detail & Related papers (2022-10-22T07:04:02Z) - OmniTab: Pretraining with Natural and Synthetic Data for Few-shot
Table-based Question Answering [106.73213656603453]
We develop a simple table-based QA model with minimal annotation effort.
We propose an omnivorous pretraining approach that consumes both natural and synthetic data.
arXiv Detail & Related papers (2022-07-08T01:23:45Z) - Table Retrieval May Not Necessitate Table-specific Model Design [83.27735758203089]
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?"
Based on an analysis of the table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases.
We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases.
None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
arXiv Detail & Related papers (2022-05-19T20:35:23Z) - Robust (Controlled) Table-to-Text Generation with Structure-Aware
Equivariance Learning [24.233552674892906]
Controlled table-to-text generation seeks to generate natural language descriptions for highlighted subparts of a table.
We propose an equivariance learning framework, which encodes tables with a structure-aware self-attention mechanism.
Our method can be plugged into existing table-to-text generation models, and it improves T5-based models, yielding better performance on ToTTo and HiTab.
arXiv Detail & Related papers (2022-05-08T23:37:27Z) - TGRNet: A Table Graph Reconstruction Network for Table Structure
Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z) - TABBIE: Pretrained Representations of Tabular Data [22.444607481407633]
We devise a simple pretraining objective that learns exclusively from tabular data.
Unlike competing approaches, our model (TABBIE) provides embeddings of all table substructures.
A qualitative analysis of our model's learned cell, column, and row representations shows that it understands complex table semantics and numerical trends.
arXiv Detail & Related papers (2021-05-06T11:15:16Z) - GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.