Related papers: Optimized Table Tokenization for Table Structure Recognition

URL: http://arxiv.org/abs/2305.03393v1
Date: Fri, 5 May 2023 09:38:47 GMT
Title: Optimized Table Tokenization for Table Structure Recognition
Authors: Maksym Lysak, Ahmed Nassar, Nikolaos Livathinos, Christoph Auer, Peter Staar
Abstract summary: Transformer-based models have demonstrated that table structure can be recognized with impressive accuracy using Image-to-Markup-Sequence approaches. Taking only the image of a table, such models predict a sequence of tokens which represent the structure of the table. We propose a new, optimised table-structure language (OTSL) with a minimized vocabulary and specific rules.
Score: 2.9398911304923447
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Extracting tables from documents is a crucial task in any document conversion pipeline. Recently, transformer-based models have demonstrated that table-structure can be recognized with impressive accuracy using Image-to-Markup-Sequence (Im2Seq) approaches. Taking only the image of a table, such models predict a sequence of tokens (e.g. in HTML, LaTeX) which represent the structure of the table. Since the token representation of the table structure has a significant impact on the accuracy and run-time performance of any Im2Seq model, we investigate in this paper how table-structure representation can be optimised. We propose a new, optimised table-structure language (OTSL) with a minimized vocabulary and specific rules. The benefits of OTSL are that it reduces the number of tokens to 5 (HTML needs 28+) and shortens the sequence length to half of HTML on average. Consequently, model accuracy improves significantly, inference time is halved compared to HTML-based models, and the predicted table structures are always syntactically correct. This in turn eliminates most post-processing needs.
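As a rough illustration of what a five-token structure language looks like, the sketch below converts a grid of cells with row and column spans into an OTSL-style token sequence. The abstract does not list the tokens. The names used here (C for a cell, L for a left-merged cell, U for an up-merged cell, X for a cell merged both ways, NL for a new line) follow the paper's definitions as recalled and should be treated as an assumption. The function name to_otsl and the (row, col, rowspan, colspan) input format are hypothetical, chosen only for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical input format: the top-left anchor of each cell plus its spans,
# given as (row, col, rowspan, colspan).
Cell = Tuple[int, int, int, int]


def to_otsl(n_rows: int, n_cols: int, cells: List[Cell]) -> List[str]:
    """Emit an OTSL-style token sequence for a table grid.

    Token names are an assumption based on the paper's description:
    C = cell, L = merge with left neighbour, U = merge with upper neighbour,
    X = merge with both, NL = end of row.
    """
    grid: List[List[Optional[str]]] = [[None] * n_cols for _ in range(n_rows)]
    for r, c, rowspan, colspan in cells:
        for dr in range(rowspan):
            for dc in range(colspan):
                if dr == 0 and dc == 0:
                    tok = "C"   # anchor position of the cell
                elif dr == 0:
                    tok = "L"   # same row, continues a column span
                elif dc == 0:
                    tok = "U"   # same column, continues a row span
                else:
                    tok = "X"   # interior of a 2D span
                if grid[r + dr][c + dc] is not None:
                    raise ValueError("overlapping cells")
                grid[r + dr][c + dc] = tok

    tokens: List[str] = []
    for row in grid:
        if None in row:
            raise ValueError("grid position not covered by any cell")
        tokens.extend(tok for tok in row if tok is not None)
        tokens.append("NL")     # every row ends with a new-line token
    return tokens


# A 2x3 table whose first header cell spans two columns.
example = [(0, 0, 1, 2), (0, 2, 1, 1), (1, 0, 1, 1), (1, 1, 1, 1), (1, 2, 1, 1)]
print(" ".join(to_otsl(2, 3, example)))
```

Because every grid position yields exactly one token and every row ends in NL, the sequence length is fixed by the table's dimensions. This is one way to see why such a representation can be shorter and easier to validate than HTML markup with opening and closing tags.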

Related papers

Texts or Images? A Fine-grained Analysis on the Effectiveness of Input Representations and Models for Table Question Answering [16.790216473975146]
We conduct the first controlled study on the effectiveness of several combinations of table representations and models from two perspectives. We find that the best combination of table representation and model varies across setups. We propose FRES, a method selecting table representations dynamically, and observe a 10% average performance improvement.
arXiv Detail & Related papers (2025-05-20T09:36:17Z)
SPRINT: Script-agnostic Structure Recognition in Tables [20.394597266150534]
Table Structure Recognition (TSR) is vital for various downstream tasks like information retrieval, table reconstruction, and document understanding. We propose TSR as a language-agnostic cell arrangement prediction and introduce SPRINT, Script-agnostic Structure Recognition in Tables. We experimentally evaluate our performance across benchmark TSR datasets including PubTabNet, FinTabNet, and PubTables-1M.
arXiv Detail & Related papers (2025-03-15T00:43:53Z)
UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition [55.153629718464565]
We introduce UniTabNet, a novel framework for table structure parsing based on the image-to-text model. UniTabNet employs a "divide-and-conquer" strategy, utilizing an image-to-text model to decouple table cells and integrating both physical and logical decoders to reconstruct the complete table structure.
arXiv Detail & Related papers (2024-09-20T01:26:32Z)
SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge). We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution. To comprehensively evaluate SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples [15.212332890570869]
We develop ReasTAP to show that high-level table reasoning skills can be injected into models during pre-training without a complex table-specific architecture design. ReasTAP achieves new state-of-the-art performance on all benchmarks and delivers a significant improvement in the low-resource setting.
arXiv Detail & Related papers (2022-10-22T07:04:02Z)
Table Retrieval May Not Necessitate Table-specific Model Design [83.27735758203089]
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?" Based on an analysis on a table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases. We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases. None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
arXiv Detail & Related papers (2022-05-19T20:35:23Z)
TableFormer: Table Structure Understanding with Transformers [2.121963121603413]
We present a new table-structure identification model. First, we introduce a new object detection decoder for table cells. Second, we replace the LSTM decoders with transformer-based decoders.
arXiv Detail & Related papers (2022-03-02T10:46:24Z)
TableFormer: Robust Transformer Modeling for Table-Text Encoding [18.00127368618485]
Existing models for table understanding require linearization of the table structure, where row or column order is encoded as an unwanted bias. In this work, we propose a robust and structurally aware table-text encoding architecture TableFormer.
arXiv Detail & Related papers (2022-03-01T07:23:06Z)
Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition. Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition. We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z)
A Graph Representation of Semi-structured Data for Web Question Answering [96.46484690047491]
We propose a novel graph representation of Web tables and lists based on a systematic categorization of the components in semi-structured data as well as their relations. Our method improves F1 score by 3.90 points over the state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-14T04:01:54Z)
Understanding tables with intermediate pre-training [11.96734018295146]
We adapt TAPAS, a table-based BERT model, to recognize entailment. We evaluate table pruning techniques as a pre-processing step to drastically improve the training and prediction efficiency.
arXiv Detail & Related papers (2020-10-01T17:43:27Z)
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing. We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar. To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
Table Structure Extraction with Bi-directional Gated Recurrent Unit Networks [5.350788087718877]
This paper proposes a robust deep learning based approach to extract rows and columns from a detected table in document images with a high precision. We have benchmarked our system on publicly available UNLV as well as ICDAR 2013 datasets on which it outperformed the state-of-the-art table structure extraction systems by a significant margin.
arXiv Detail & Related papers (2020-01-08T13:17:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.