TSRFormer: Table Structure Recognition with Transformers
- URL: http://arxiv.org/abs/2208.04921v1
- Date: Tue, 9 Aug 2022 17:36:13 GMT
- Title: TSRFormer: Table Structure Recognition with Transformers
- Authors: Weihong Lin, Zheng Sun, Chixiang Ma, Mingze Li, Jiawei Wang, Lei Sun,
Qiang Huo
- Abstract summary: We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images.
We propose a new two-stage DETR based separator prediction approach, dubbed Separator REgression TRansformer (SepRETR).
We achieve state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW.
- Score: 15.708108572696064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new table structure recognition (TSR) approach, called
TSRFormer, to robustly recognize the structures of complex tables with
geometrical distortions from various table images. Unlike previous methods, we
formulate table separation line prediction as a line regression problem instead
of an image segmentation problem and propose a new two-stage DETR based
separator prediction approach, dubbed Separator REgression
TRansformer (SepRETR), to predict separation lines from table images
directly. To make the two-stage DETR framework work efficiently and effectively
for the separation line prediction task, we propose two improvements: 1) A
prior-enhanced matching strategy to solve the slow convergence issue of DETR;
2) A new cross attention module to sample features from a high-resolution
convolutional feature map directly so that high localization accuracy is
achieved with low computational cost. After separation line prediction, a
simple relation network based cell merging module is used to recover spanning
cells. With these new techniques, our TSRFormer achieves state-of-the-art
performance on several benchmark datasets, including SciTSR, PubTabNet and WTW.
Furthermore, we have validated the robustness of our approach to tables with
complex structures, borderless cells, large blank spaces, empty or spanning
cells as well as distorted or even curved shapes on a more challenging
real-world in-house dataset.
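
The split-then-merge flow described in the abstract (predict separation lines, then merge cells to recover spanning cells) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes straight, axis-aligned separators given as coordinates, whereas TSRFormer regresses lines that may be curved, and it takes the merge decisions as an input list, whereas the paper predicts them with a relation network. The names `split_into_cells` and `merge_cells` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)
Cell = Tuple[int, int]                   # (row index, column index)


def split_into_cells(xs: List[float], ys: List[float]) -> Dict[Cell, Box]:
    """Build the grid of basic cells from separator positions.

    xs: x-positions of vertical lines (left edge, column separators, right edge).
    ys: y-positions of horizontal lines (top edge, row separators, bottom edge).
    Assumption: separators are straight and axis-aligned.
    """
    xs, ys = sorted(xs), sorted(ys)
    return {
        (r, c): (xs[c], ys[r], xs[c + 1], ys[r + 1])
        for r in range(len(ys) - 1)
        for c in range(len(xs) - 1)
    }


def merge_cells(cells: Dict[Cell, Box],
                merge_pairs: List[Tuple[Cell, Cell]]) -> List[Box]:
    """Group cells with union-find and return one bounding box per group.

    merge_pairs stands in for the output of a learned merging module.
    """
    parent = {k: k for k in cells}

    def find(k: Cell) -> Cell:
        # Walk to the root, halving the path along the way.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in merge_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups: Dict[Cell, List[Cell]] = {}
    for k in cells:
        groups.setdefault(find(k), []).append(k)

    merged = []
    for members in groups.values():
        boxes = [cells[m] for m in members]
        merged.append((
            min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes),
        ))
    # Order boxes top-to-bottom, then left-to-right.
    return sorted(merged, key=lambda b: (b[1], b[0]))
```

For example, a 2x2 grid built with `split_into_cells([0, 10, 20], [0, 5, 10])` and merged with `merge_cells(cells, [((0, 0), (0, 1))])` yields three boxes, the first of which spans the whole top row.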
Related papers
- SEMv3: A Fast and Robust Approach to Table Separation Line Detection [48.75713662571455]
Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image.
The "split-and-merge" paradigm is a pivotal approach to parsing table structure, in which table separation line detection is crucial.
We propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines.
arXiv Detail & Related papers (2024-05-20T08:13:46Z)
- LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training [45.80561537971478]
Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats.
We model TSR as a logical location regression problem and propose a new TSR framework called LORE.
Our proposed LORE is conceptually simpler, easier to train, and more accurate than other paradigms of TSR.
arXiv Detail & Related papers (2024-01-03T03:14:55Z)
- TRACE: Table Reconstruction Aligned to Corner and Edges [7.536220920052911]
We analyze the natural characteristics of a table, where a table is composed of cells and each cell is made up of borders consisting of edges.
We propose a novel method to reconstruct the table in a bottom-up manner.
A simple design makes the model easier to train and requires less computation than previous two-stage methods.
arXiv Detail & Related papers (2023-05-01T02:26:15Z)
- Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer [15.708108572696062]
We present a new table structure recognition approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images.
With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet, WTW and FinTabNet.
arXiv Detail & Related papers (2023-03-21T06:20:49Z)
- SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge).
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate the SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
- LORE: Logical Location Regression Network for Table Structure Recognition [24.45544796305824]
Table structure recognition aims at extracting tables in images into machine-understandable formats.
Recent methods solve this problem by predicting the adjacency relations of detected cell boxes.
We propose a new TSR framework called LORE, standing for LOgical location REgression network.
arXiv Detail & Related papers (2023-03-07T08:42:46Z)
- TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST.
Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation.
We conduct experiments on several popular benchmarks, including PubTabNet and SynthTable, and our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
- Robust Table Detection and Structure Recognition from Heterogeneous Document Images [6.961470641696773]
We introduce RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of the table from heterogeneous document images.
For table detection, we propose to use CornerNet as a new region proposal network to generate higher quality table proposals for Faster R-CNN.
Our table structure recognition approach achieves state-of-the-art performance on three public benchmarks, including SciTSR, PubTabNet and cTDaR TrackB.
arXiv Detail & Related papers (2022-03-17T03:35:12Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Semantic Correspondence with Transformers [68.37049687360705]
We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images.
We include appearance affinity modelling to disambiguate the initial correlation maps and multi-level aggregation.
We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
arXiv Detail & Related papers (2021-06-04T14:39:03Z)