MATE: Multi-view Attention for Table Transformer Efficiency
- URL: http://arxiv.org/abs/2109.04312v1
- Date: Thu, 9 Sep 2021 14:39:30 GMT
- Title: MATE: Multi-view Attention for Table Transformer Efficiency
- Authors: Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Abstract summary: More than 20% of relational tables on the web have 20 or more rows.
Current Transformer models are typically limited to 512 tokens.
We propose MATE, a novel Transformer architecture designed to model the structure of web tables.
- Score: 21.547074431324024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents a sparse-attention Transformer architecture for modeling
documents that contain large tables. Tables are ubiquitous on the web, and are
rich in information. However, more than 20% of relational tables on the web
have 20 or more rows (Cafarella et al., 2008), and these large tables present a
challenge for current Transformer models, which are typically limited to 512
tokens. Here we propose MATE, a novel Transformer architecture designed to
model the structure of web tables. MATE uses sparse attention in a way that
allows heads to efficiently attend to either rows or columns in a table. This
architecture scales linearly with respect to speed and memory, and can handle
documents containing more than 8000 tokens with current accelerators. MATE also
has a more appropriate inductive bias for tabular data, and sets a new
state-of-the-art for three table reasoning datasets. For HybridQA (Chen et al.,
2020b), a dataset that involves large documents containing tables, we improve
the best prior result by 19 points.
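
To make the row- and column-wise attention pattern concrete, here is a minimal NumPy sketch of the idea. It is not the authors' implementation: the token coordinates, the -1 convention for global question tokens, and the dense boolean masks are assumptions for exposition, whereas MATE reorders tokens per attention head so the full attention matrix is never materialised, which is what gives the linear scaling.

```python
import numpy as np

def sparse_head_mask(row_ids, col_ids, head_type):
    """Boolean [seq, seq] mask: True where a token may attend to another."""
    ids = row_ids if head_type == "row" else col_ids
    same = ids[:, None] == ids[None, :]   # tokens sharing a row (or column)
    is_global = ids < 0                   # question tokens, marked -1 here
    return same | is_global[:, None] | is_global[None, :]

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention restricted by a boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -1e9)            # block disallowed pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy input: one question token followed by a 2x2 table flattened row-major.
row_ids = np.array([-1, 0, 0, 1, 1])
col_ids = np.array([-1, 0, 1, 0, 1])
q = k = v = np.random.randn(5, 8)
out_row_head = masked_attention(q, k, v, sparse_head_mask(row_ids, col_ids, "row"))
out_col_head = masked_attention(q, k, v, sparse_head_mask(row_ids, col_ids, "col"))
```
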
Related papers
- TableRAG: Million-Token Table Understanding with Language Models [53.039560091592215]
TableRAG is a Retrieval-Augmented Generation (RAG) framework specifically designed for LM-based table understanding.
TableRAG leverages query expansion combined with schema and cell retrieval to pinpoint crucial information before providing it to the LMs.
Our results demonstrate that TableRAG achieves the highest retrieval quality, leading to new state-of-the-art performance on large-scale table understanding.
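
As a rough illustration of that retrieve-then-read flow, here is a toy sketch under stated assumptions: the lexical-overlap scorer, the snippet formats, and the example table are stand-ins for TableRAG's trained retrievers, not its actual API.

```python
def score(query, snippet):
    """Toy lexical-overlap score; a real system would use dense embeddings."""
    q, s = set(query.lower().split()), set(snippet.lower().split())
    return len(q & s) / (len(q) or 1)

def retrieve(queries, snippets, k=2):
    """Top-k snippets by their best score over all expanded queries."""
    ranked = sorted(snippets,
                    key=lambda s: max(score(q, s) for q in queries),
                    reverse=True)
    return ranked[:k]

# Expanded queries plus schema/cell snippets from a hypothetical sales table.
queries = ["total revenue 2023", "revenue by year"]
schema = ["column: year (int)", "column: revenue (usd)", "column: region (text)"]
cells = ["year=2023 revenue=1.2M", "year=2022 revenue=0.9M", "region=EMEA"]
prompt_context = retrieve(queries, schema) + retrieve(queries, cells)
print(prompt_context)  # top schema and cell hits to place in the LM prompt
```
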
arXiv Detail & Related papers (2024-10-07T04:15:02Z)
- Multimodal Table Understanding [26.652797853893233]
Understanding tables directly from intuitive visual information is a crucial and urgent challenge for developing more practical applications.
We propose a new problem, multimodal table understanding, where the model needs to generate correct responses to various table-related requests.
We develop Table-LLaVA, a generalist multimodal large language model (MLLM), which significantly outperforms recent open-source MLLM baselines on 23 benchmarks.
arXiv Detail & Related papers (2024-06-12T11:27:03Z)
- Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples [24.208275772387683]
Auto-Tables can automatically transform non-relational tables into standard relational forms for downstream analytics.
Our evaluation suggests that Auto-Tables can successfully synthesize transformations for over 70% of test cases at interactive speeds.
arXiv Detail & Related papers (2023-07-27T00:55:54Z)
- MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering [61.48881995121938]
Real-world queries are complex in nature, often spanning multiple tables in a relational database or web page.
Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers.
arXiv Detail & Related papers (2023-05-22T08:25:15Z)
- OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering [106.73213656603453]
We develop a simple table-based QA model with minimal annotation effort.
We propose an omnivorous pretraining approach that consumes both natural and synthetic data.
arXiv Detail & Related papers (2022-07-08T01:23:45Z)
- Table Retrieval May Not Necessitate Table-specific Model Design [83.27735758203089]
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?"
Based on an analysis on a table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases.
We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases.
None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
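
For concreteness, here is a minimal sketch of the third module, a soft relation-based attention bias; the relation set and the bias values below are chosen purely for illustration (in the paper they are learned).

```python
import numpy as np

SAME_ROW, SAME_COL, OTHER = 0, 1, 2

def relation_ids(row_ids, col_ids):
    """[seq, seq] matrix giving the relation type of each token pair."""
    n = len(row_ids)
    rel = np.full((n, n), OTHER)
    rel[col_ids[:, None] == col_ids[None, :]] = SAME_COL
    rel[row_ids[:, None] == row_ids[None, :]] = SAME_ROW  # row wins ties
    return rel

def biased_scores(q, k, rel, bias_table):
    """Attention logits with a learned per-relation scalar added."""
    return q @ k.T / np.sqrt(q.shape[-1]) + bias_table[rel]

row_ids = np.array([0, 0, 1, 1])        # 2x2 table, flattened row-major
col_ids = np.array([0, 1, 0, 1])
bias_table = np.array([0.5, 0.3, 0.0])  # one scalar per relation type
q = k = np.random.randn(4, 8)
scores = biased_scores(q, k, relation_ids(row_ids, col_ids), bias_table)
```

The hard-attention-mask variant is the same construction with the scalar bias replaced by a binary allow/deny mask on the logits.
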
arXiv Detail & Related papers (2022-05-19T20:35:23Z)
- TableFormer: Table Structure Understanding with Transformers [2.121963121603413]
We present a new table-structure identification model.
First, we introduce a new object-detection decoder for table cells.
Second, we replace the LSTM decoders with transformer-based decoders.
arXiv Detail & Related papers (2022-03-02T10:46:24Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
- Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition.
Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition.
We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z)
- Capturing Row and Column Semantics in Transformer Based Question Answering over Tables [9.347393642549806]
We show that one can achieve superior performance on the table QA task without using any of these specialized pre-training techniques.
Experiments on recent benchmarks show that the proposed methods can effectively locate cell values in tables (up to 98% Hit@1 accuracy on Wiki lookup questions).
arXiv Detail & Related papers (2021-04-16T18:22:30Z)
- Identifying Table Structure in Documents using Conditional Generative Adversarial Networks [0.0]
In many industries and in academic research, information is primarily transmitted in the form of unstructured documents.
We propose a top-down approach, first using a conditional generative adversarial network to map a table image into a standardised 'skeleton' table form.
We then derive the latent table structure using xy-cut projection and Genetic Algorithm optimisation (the xy-cut step is sketched below).
arXiv Detail & Related papers (2020-01-13T20:42:40Z)
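
The xy-cut step lends itself to a rough toy version (my own simplification, omitting the GAN skeleton generation and the Genetic Algorithm refinement): project the binary image onto each axis and split at whitespace gaps.

```python
import numpy as np

def split_runs(profile):
    """(start, end) index pairs of the nonzero runs in a 1-D projection."""
    padded = np.concatenate(([0], (profile > 0).astype(int), [0]))
    edges = np.flatnonzero(np.diff(padded))
    return list(zip(edges[::2], edges[1::2]))

def table_cells(img):
    """Cell boxes (top, bottom, left, right) from a binary table image,
    assuming content blocks are separated by clean whitespace gutters."""
    rows = split_runs(img.sum(axis=1))  # horizontal projection -> row bands
    cols = split_runs(img.sum(axis=0))  # vertical projection -> column bands
    return [(t, b, l, r) for (t, b) in rows for (l, r) in cols]

# Toy image: a 2x2 grid of ink blocks separated by empty gutters.
img = np.zeros((7, 7), dtype=int)
img[1:3, 1:3] = img[1:3, 4:6] = 1
img[4:6, 1:3] = img[4:6, 4:6] = 1
print(table_cells(img))  # four cell bounding boxes
```
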
This list is automatically generated from the titles and abstracts of the papers on this site.