Related papers: CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents

URL: http://arxiv.org/abs/2004.12629v2
Date: Thu, 28 May 2020 08:02:43 GMT
Title: CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents
Authors: Devashish Prasad, Ayan Gadpal, Kshitij Kapadni, Manish Visave and Kavita Sultanpure
Abstract summary: We present an improved deep learning-based end-to-end approach for solving both problems of table detection and structure recognition. We propose CascadeTabNet: a Cascade mask Region-based CNN High-Resolution Network (Cascade mask R-CNN HRNet) based model. We attain the highest accuracy results on the ICDAR 2019 table structure recognition dataset.
Score: 4.199844472131922
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: An automatic table recognition method for interpreting tabular data in document images mainly involves solving two problems: table detection and table structure recognition. Prior work solved the two problems independently using two separate approaches. More recent works point to the use of deep learning-based solutions while also attempting to design an end-to-end solution. In this paper, we present an improved deep learning-based end-to-end approach for solving both problems of table detection and structure recognition using a single Convolutional Neural Network (CNN) model. We propose CascadeTabNet: a Cascade mask Region-based CNN High-Resolution Network (Cascade mask R-CNN HRNet) based model that detects the regions of tables and recognizes the structural body cells from the detected tables at the same time. We evaluate our results on the ICDAR 2013, ICDAR 2019 and TableBank public datasets. We achieved 3rd rank in the ICDAR 2019 post-competition results for table detection while attaining the best accuracy results for the ICDAR 2013 and TableBank datasets. We also attain the highest accuracy results on the ICDAR 2019 table structure recognition dataset. Additionally, we demonstrate effective transfer learning and image augmentation techniques that enable CNNs to achieve very accurate table detection results. Code and dataset have been made available at: https://github.com/DevashishPrasad/CascadeTabNet

Related papers

TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content [39.34067105360439]
We propose an end-to-end pipeline that integrates deep learning models, including DETR, CascadeTabNet, and PP OCR v2, to achieve comprehensive image-based table recognition. Our system achieves simultaneous table detection (TD), table structure recognition (TSR), and table content recognition (TCR). Our proposed approach achieves an IoU of 0.96 and an OCR accuracy of 78%, an improvement of approximately 25% in OCR accuracy compared to the previous Table Transformer approach.
arXiv Detail & Related papers (2024-04-16T06:24:53Z)
ClusterTabNet: Supervised clustering method for table detection and table structure recognition [0.0]
We present a novel deep-learning-based method to cluster words in documents which we apply to detect and recognize tables given the OCR output. We interpret table structure bottom-up as a graph of relations between pairs of words and use a transformer encoder model to predict its adjacency matrix. Compared to the current state-of-the-art detection methods such as DETR and Faster R-CNN, our method achieves similar or better accuracy, while requiring a significantly smaller model.
arXiv Detail & Related papers (2024-02-12T09:10:24Z)
Semi-Supervised and Long-Tailed Object Detection with CascadeMatch [91.86787064083012]
We propose a novel pseudo-labeling-based detector called CascadeMatch. Our detector features a cascade network architecture, which has multi-stage detection heads with progressive confidence thresholds. We show that CascadeMatch surpasses existing state-of-the-art semi-supervised approaches in handling long-tailed object detection.
arXiv Detail & Related papers (2023-05-24T07:09:25Z)
TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST. Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation. In experiments on several popular benchmarks, including PubTabNet and SynthTable, our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
Interpolation-based Correlation Reduction Network for Semi-Supervised Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN). In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries. By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
Robust Table Detection and Structure Recognition from Heterogeneous Document Images [6.961470641696773]
We introduce RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of the table from heterogeneous document images. For table detection, we propose to use CornerNet as a new region proposal network to generate higher quality table proposals for Faster R-CNN. Our table structure recognition approach achieves state-of-the-art performance on three public benchmarks, including SciTSR, PubTabNet and cTDaR TrackB.
arXiv Detail & Related papers (2022-03-17T03:35:12Z)
TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition. Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
TNCR: Table Net Detection and Classification Dataset [62.997667081978825]
TNCR dataset can be used for table detection in scanned document images and their classification into 5 different classes. We have implemented state-of-the-art deep learning-based methods for table detection to create several strong baselines. We have made TNCR open source in the hope of encouraging more deep learning approaches to table detection, classification, and structure recognition.
arXiv Detail & Related papers (2021-06-19T10:48:58Z)
Tab.IAIS: Flexible Table Recognition and Semantic Interpretation System [84.39812458417246]
We develop two rule-based algorithms that perform the complete table recognition process and support the most frequent table formats. To incorporate the extraction of semantic information into the table recognition process, we develop a graph-based table interpretation method. Our table recognition approach achieves results competitive with state-of-the-art approaches.
arXiv Detail & Related papers (2021-05-25T12:31:02Z)
Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition. Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition. We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z)
CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images [30.48863304419383]
We propose a novel end-to-end trainable deep network (CDeC-Net) for detecting tables present in documents. The proposed network consists of a multistage extension of Mask R-CNN with a dual backbone having deformable convolution for detecting tables varying in scale. We empirically evaluate CDeC-Net on all the publicly available benchmark datasets.
arXiv Detail & Related papers (2020-08-25T05:53:59Z)
Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context [11.99452212008243]
We present a vision-guided systematic framework for joint table detection and cell structure recognition. With GTE-Table, we invent a new penalty based on the natural cell containment constraint of tables to train our table network. We use this to enhance PubTabNet with cell labels and create FinTabNet, real-world and complex scientific and financial datasets.
arXiv Detail & Related papers (2020-05-01T20:14:49Z)
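
Note on the IoU figure quoted in the TC-OCR entry above: intersection over union measures how well a predicted table box overlaps a ground-truth box. The following is a minimal sketch of the standard metric for two axis-aligned boxes, not code from any of the listed papers. The function name `box_iou` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    This format is an illustrative assumption, not taken from the papers.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Degenerate (zero-area) inputs yield 0.0 rather than a division error.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 10, 10)
    ground_truth = (5, 0, 15, 10)
    print(box_iou(predicted, ground_truth))
```

A detection is typically counted as correct when its IoU with a ground-truth table exceeds a chosen threshold; the threshold varies by benchmark and is not specified in the summaries above.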

This list is automatically generated from the titles and abstracts of the papers on this site.