Robust Table Detection and Structure Recognition from Heterogeneous
Document Images
- URL: http://arxiv.org/abs/2203.09056v1
- Date: Thu, 17 Mar 2022 03:35:12 GMT
- Title: Robust Table Detection and Structure Recognition from Heterogeneous
Document Images
- Authors: Chixiang Ma, Weihong Lin, Lei Sun, Qiang Huo
- Abstract summary: We introduce RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of the table from heterogeneous document images.
For table detection, we propose to use CornerNet as a new region proposal network to generate higher quality table proposals for Faster R-CNN.
Our table structure recognition approach achieves state-of-the-art performance on three public benchmarks, including SciTSR, PubTabNet and cTDaR TrackB.
- Score: 6.961470641696773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a new table detection and structure recognition approach named
RobusTabNet to detect the boundaries of tables and reconstruct the cellular
structure of the table from heterogeneous document images. For table detection,
we propose to use CornerNet as a new region proposal network to generate higher
quality table proposals for Faster R-CNN, which has significantly improved the
localization accuracy of Faster R-CNN for table detection. Consequently, our
table detection approach achieves state-of-the-art performance on three public
table detection benchmarks, namely cTDaR TrackA, PubLayNet and IIIT-AR-13K, by
only using a lightweight ResNet-18 backbone network. Furthermore, we propose a
new split-and-merge based table structure recognition approach, in which a
novel spatial CNN based separation line prediction module is proposed to split
each detected table into a grid of cells, and a Grid CNN based cell merging
module is applied to recover the spanning cells. As the spatial CNN module can
effectively propagate contextual information across the whole table image, our
table structure recognizer can robustly recognize tables with large blank
spaces and geometrically distorted (even curved) tables. Thanks to these two
techniques, our table structure recognition approach achieves state-of-the-art
performance on three public benchmarks, including SciTSR, PubTabNet and cTDaR
TrackB. Moreover, we have further demonstrated the advantages of our approach
in recognizing tables with complex structures, large blank spaces, empty or
spanning cells as well as geometrically distorted or even curved tables on a
more challenging in-house dataset.
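The split-and-merge pipeline described in the abstract can be pictured with a simplified sketch. The code below is an illustrative toy, not the paper's implementation: 1-D separator probabilities stand in for the spatial-CNN separation-line predictions, and boolean merge grids stand in for the Grid CNN's cell-merging output; all names are hypothetical.

```python
import numpy as np

def runs(mask):
    """Return maximal runs of True in a boolean array as (start, end) pairs."""
    out, start = [], None
    for i, v in enumerate(mask):
        if v and start is None:
            start = i
        elif not v and start is not None:
            out.append((start, i)); start = None
    if start is not None:
        out.append((start, len(mask)))
    return out

def split_and_merge(row_sep, col_sep, merge_right, merge_down, thresh=0.5):
    """Toy split-and-merge: separator probabilities define a base grid,
    then merge flags recover spanning cells via union-find.
    Returns cells as (row_start, row_end, col_start, col_end) grid spans."""
    # --- split: non-separator runs define the base grid rows/columns ---
    row_cells = runs(np.asarray(row_sep) <= thresh)
    col_cells = runs(np.asarray(col_sep) <= thresh)
    R, C = len(row_cells), len(col_cells)

    # --- merge: union-find over base cells using the merge flags ---
    parent = list(range(R * C))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]; x = parent[x]
        return x
    for r in range(R):
        for c in range(C):
            if c + 1 < C and merge_right[r][c]:
                parent[find(r * C + c)] = find(r * C + c + 1)
            if r + 1 < R and merge_down[r][c]:
                parent[find(r * C + c)] = find((r + 1) * C + c)

    # each component's bounding box is one logical (possibly spanning) cell
    boxes = {}
    for r in range(R):
        for c in range(C):
            root = find(r * C + c)
            r0, r1, c0, c1 = boxes.get(root, (r, r, c, c))
            boxes[root] = (min(r0, r), max(r1, r), min(c0, c), max(c1, c))
    return sorted(boxes.values())
```

For example, two row intervals, three column intervals, and a single merge-right flag yield five logical cells, one of which spans two columns.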
Related papers
- UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition [55.153629718464565]
We introduce UniTabNet, a novel framework for table structure parsing based on the image-to-text model.
UniTabNet employs a "divide-and-conquer" strategy, utilizing an image-to-text model to decouple table cells and integrating both physical and logical decoders to reconstruct the complete table structure.
arXiv Detail & Related papers (2024-09-20T01:26:32Z)
- ClusterTabNet: Supervised clustering method for table detection and table structure recognition [0.0]
We present a novel deep-learning-based method to cluster words in documents which we apply to detect and recognize tables given the OCR output.
We interpret table structure bottom-up as a graph of relations between pairs of words and use a transformer encoder model to predict its adjacency matrix.
Compared to the current state-of-the-art detection methods such as DETR and Faster R-CNN, our method achieves similar or better accuracy, while requiring a significantly smaller model.
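The bottom-up reading of a predicted adjacency matrix can be illustrated with a small sketch: threshold the pairwise scores and take connected components. This is a generic recovery step under an assumed thresholding scheme, not ClusterTabNet's actual post-processing; names are illustrative.

```python
import numpy as np

def clusters_from_adjacency(adj, thresh=0.5):
    """Recover word clusters (e.g. rows, columns, or cells) from a soft
    n x n adjacency matrix of pairwise relation scores: threshold,
    symmetrize, then collect connected components via DFS."""
    a = np.asarray(adj)
    edges = (a > thresh) | (a.T > thresh)   # symmetrize the thresholded graph
    n = len(a)
    seen, comps = set(), []
    for s in range(n):
        if s in seen:
            continue
        stack, comp = [s], []
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u); comp.append(u)
            stack.extend(v for v in range(n) if edges[u, v] and v not in seen)
        comps.append(sorted(comp))
    return comps
```

With three words where only the first pair is predicted related, this yields the clusters `[[0, 1], [2]]`.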
arXiv Detail & Related papers (2024-02-12T09:10:24Z)
- SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge).
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate the SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
- EGRC-Net: Embedding-induced Graph Refinement Clustering Network [66.44293190793294]
We propose a novel graph clustering network called Embedding-Induced Graph Refinement Clustering Network (EGRC-Net).
EGRC-Net effectively utilizes the learned embedding to adaptively refine the initial graph and enhance the clustering performance.
Our proposed methods consistently outperform several state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-19T09:08:43Z)
- TRUST: An Accurate and End-to-End Table Structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST.
Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation.
We conduct experiments on several popular benchmarks, including PubTabNet and SynthTable; our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
- TSRFormer: Table Structure Recognition with Transformers [15.708108572696064]
We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images.
We propose a new two-stage DETR-based separator prediction approach, dubbed Separator REgression TRansformer (SepRETR).
We achieve state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW.
arXiv Detail & Related papers (2022-08-09T17:36:13Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
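The two-branch design can be pictured with a minimal stand-in: one head regresses a cell's spatial box while another classifies its logical row and column indices. The weights below are random and every name is hypothetical; this only shows the joint spatial/logical output shapes, not TGRNet's actual architecture or training.

```python
import numpy as np

rng = np.random.default_rng(0)

class TinyTwoBranchHead:
    """Illustrative stand-in for a dual-branch cell predictor:
    a box-regression head plus row/column index-classification heads."""
    def __init__(self, feat_dim=16, max_rows=10, max_cols=10):
        self.w_box = rng.normal(size=(feat_dim, 4))          # x, y, w, h
        self.w_row = rng.normal(size=(feat_dim, max_rows))   # row-index logits
        self.w_col = rng.normal(size=(feat_dim, max_cols))   # col-index logits

    def __call__(self, feats):
        box = feats @ self.w_box                 # spatial location per cell
        row = (feats @ self.w_row).argmax(-1)    # logical row per cell
        col = (feats @ self.w_col).argmax(-1)    # logical column per cell
        return box, row, col

head = TinyTwoBranchHead()
boxes, rows, cols = head(rng.normal(size=(5, 16)))   # 5 candidate cells
```

Jointly predicting both quantities lets a shared feature drive the spatial and logical views of the same cell, which is the gist of the two-branch idea.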
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
- Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition.
Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition.
We present a multi-stage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution to the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z)
- Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context [11.99452212008243]
We present a vision-guided systematic framework for joint table detection and cell structure recognition.
With GTE-Table, we invent a new penalty based on the natural cell containment constraint of tables to train our table network.
We use this to enhance PubTabNet with cell labels and to create FinTabNet, a real-world dataset of complex scientific and financial tables.
arXiv Detail & Related papers (2020-05-01T20:14:49Z)
- CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents [4.199844472131922]
We present an improved deep-learning-based end-to-end approach for solving both the table detection and structure recognition problems.
We propose CascadeTabNet: a Cascade mask Region-based CNN High-Resolution Network (Cascade mask R-CNN HRNet) based model.
We attain the highest accuracy results on the ICDAR 2019 table structure recognition dataset.
arXiv Detail & Related papers (2020-04-27T08:12:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.