Related papers: TSRFormer: Table Structure Recognition with Transformers

TSRFormer: Table Structure Recognition with Transformers

URL: http://arxiv.org/abs/2208.04921v1
Date: Tue, 9 Aug 2022 17:36:13 GMT
Title: TSRFormer: Table Structure Recognition with Transformers
Authors: Weihong Lin, Zheng Sun, Chixiang Ma, Mingze Li, Jiawei Wang, Lei Sun, Qiang Huo
Abstract summary: We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. We propose a new two-stage DETR based separator prediction approach, dubbed textbfSeparator textbfREgression textbfTRansformer (SepRETR) We achieve state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW.
Score: 15.708108572696064
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognizing the structures of complex tables with geometrical distortions from various table images. Unlike previous methods, we formulate table separation line prediction as a line regression problem instead of an image segmentation problem and propose a new two-stage DETR based separator prediction approach, dubbed \textbf{Sep}arator \textbf{RE}gression \textbf{TR}ansformer (SepRETR), to predict separation lines from table images directly. To make the two-stage DETR framework work efficiently and effectively for the separation line prediction task, we propose two improvements: 1) A prior-enhanced matching strategy to solve the slow convergence issue of DETR; 2) A new cross attention module to sample features from a high-resolution convolutional feature map directly so that high localization accuracy is achieved with low computational cost. After separation line prediction, a simple relation network based cell merging module is used to recover spanning cells. With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW. Furthermore, we have validated the robustness of our approach to tables with complex structures, borderless cells, large blank spaces, empty or spanning cells as well as distorted or even curved shapes on a more challenging real-world in-house dataset.

Related papers

Towards Generalizable Trajectory Prediction Using Dual-Level Representation Learning And Adaptive Prompting [107.4034346788744]
Existing vehicle trajectory prediction models struggle with generalizability, prediction uncertainties, and handling complex interactions. We propose Perceiver with Register queries (PerReg+), a novel trajectory prediction framework that introduces: (1) Dual-Level Representation Learning via Self-Distillation (SD) and Masked Reconstruction (MR), capturing global context and fine-grained details; (2) Enhanced Multimodality using register-based queries and pretraining, eliminating the need for clustering and suppression; and (3) Adaptive Prompt Tuning during fine-tuning, freezing the main architecture and optimizing a small number of prompts for efficient adaptation.
arXiv Detail & Related papers (2025-01-08T20:11:09Z)
SEMv3: A Fast and Robust Approach to Table Separation Line Detection [48.75713662571455]
Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. "Split-and-merge" paradigm is a pivotal approach to parse table structure, where the table separation line detection is crucial. We propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines.
arXiv Detail & Related papers (2024-05-20T08:13:46Z)
LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training [45.80561537971478]
Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats. We model TSR as a logical location regression problem and propose a new TSR framework called LORE. Our proposed LORE is conceptually simpler, easier to train, and more accurate than other paradigms of TSR.
arXiv Detail & Related papers (2024-01-03T03:14:55Z)
TRACE: Table Reconstruction Aligned to Corner and Edges [7.536220920052911]
We analyze the natural characteristics of a table, where a table is composed of cells and each cell is made up of borders consisting of edges. We propose a novel method to reconstruct the table in a bottom-up manner. A simple design makes the model easier to train and requires less computation than previous two-stage methods.
arXiv Detail & Related papers (2023-05-01T02:26:15Z)
Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer [15.708108572696062]
We present a new table structure recognition approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet, WTW and FinTabNet.
arXiv Detail & Related papers (2023-03-21T06:20:49Z)
SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge) We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution. To comprehensively evaluate the SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
LORE: Logical Location Regression Network for Table Structure Recognition [24.45544796305824]
Table structure recognition aims at extracting tables in images into machine-understandable formats. Recent methods solve this problem by predicting the adjacency relations of detected cell boxes. We propose a new TSR framework called LORE, standing for LOgical location REgression network.
arXiv Detail & Related papers (2023-03-07T08:42:46Z)
TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST. Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation. We conduct experiments on several popular benchmarks including PubTabNet and SynthTable, our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
Robust Table Detection and Structure Recognition from Heterogeneous Document Images [6.961470641696773]
We introduce RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of the table from heterogeneous document images. For table detection, we propose to use CornerNet as a new region proposal network to generate higher quality table proposals for Faster R-CNN. Our table structure recognition approach achieves state-of-the-art performance on three public benchmarks, including SciTSR, PubTabNet and cTDaR TrackB.
arXiv Detail & Related papers (2022-03-17T03:35:12Z)
CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning. The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery. The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
Semantic Correspondence with Transformers [68.37049687360705]
We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images. We include appearance affinity modelling to disambiguate the initial correlation maps and multi-level aggregation. We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
arXiv Detail & Related papers (2021-06-04T14:39:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.