CDeC-Net: Composite Deformable Cascade Network for Table Detection in
Document Images
- URL: http://arxiv.org/abs/2008.10831v1
- Date: Tue, 25 Aug 2020 05:53:59 GMT
- Title: CDeC-Net: Composite Deformable Cascade Network for Table Detection in
Document Images
- Authors: Madhav Agarwal and Ajoy Mondal and C. V. Jawahar
- Abstract summary: We propose a novel end-to-end trainable deep network (CDeC-Net) for detecting tables present in documents.
The proposed network consists of a multistage extension of Mask R-CNN with a dual backbone that uses deformable convolution to detect tables varying in scale.
We empirically evaluate CDeC-Net on all the publicly available benchmark datasets.
- Score: 30.48863304419383
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Localizing page elements/objects such as tables, figures, equations, etc. is
the primary step in extracting information from document images. We propose a
novel end-to-end trainable deep network (CDeC-Net) for detecting tables
present in the documents. The proposed network consists of a multistage
extension of Mask R-CNN with a dual backbone having deformable convolution for
detecting tables varying in scale with high detection accuracy at higher IoU
thresholds. We empirically evaluate CDeC-Net on all the publicly available
benchmark datasets - ICDAR-2013, ICDAR-2017, ICDAR-2019, UNLV, Marmot,
PubLayNet, and TableBank - with extensive experiments.
Our solution has three important properties: (i) a single trained model
CDeC-Net‡ performs well across all the popular benchmark datasets; (ii)
we report excellent performances across multiple, including higher, thresholds
of IoU; (iii) by following the same protocol of the recent papers for each of
the benchmarks, we consistently demonstrate the superior quantitative
performance. Our code and models will be publicly released for enabling the
reproducibility of the results.
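The abstract reports detection accuracy at several IoU (intersection over union) thresholds. As a minimal sketch of what such a threshold means, the snippet below computes the IoU of two axis-aligned boxes and checks whether a predicted table box counts as a match. The box coordinates, the `(x_min, y_min, x_max, y_max)` layout, and the helper names `iou` and `is_match` are illustrative assumptions, not taken from the paper or its released code.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes have zero area."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float) -> bool:
    """A prediction counts as correct when its IoU reaches the threshold."""
    return iou(pred, gt) >= threshold


# Hypothetical ground-truth table box and a slightly too small prediction.
gt = (100.0, 100.0, 500.0, 300.0)
pred = (120.0, 110.0, 500.0, 300.0)

print(f"IoU = {iou(pred, gt):.4f}")
for t in (0.6, 0.8, 0.95):
    print(f"threshold {t}: match = {is_match(pred, gt, t)}")
```

With these made-up boxes the same prediction is accepted at the lower thresholds and rejected at 0.95, which is why results at higher IoU thresholds are a stricter test of localization quality.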
Related papers
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization [16.830963601598242]
We propose PRIMER, a pre-trained model for multi-document representation with focus on summarization.
Specifically, we adopt the Longformer architecture with proper input transformation and global attention to fit for multi-document inputs.
Our model, PRIMER, outperforms current state-of-the-art models on most of these settings with large margins.
arXiv Detail & Related papers (2021-10-16T07:22:24Z)
- End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net [0.9137554315375922]
We propose a novel deep learning architecture for end-to-end information extraction on the 2D character-grid embedding of the document.
We show that our model outperforms the baseline U-Net architecture by a large margin while using 40% fewer parameters.
arXiv Detail & Related papers (2021-06-02T05:42:51Z)
- Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition.
Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition.
We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z)
- CREPO: An Open Repository to Benchmark Credal Network Algorithms [78.79752265884109]
Credal networks are imprecise probabilistic graphical models based on, so-called credal, sets of probability mass functions.
A Java library called CREMA has been recently released to model, process and query credal networks.
We present CREPO, an open repository of synthetic credal networks, provided together with the exact results of inference tasks on these models.
arXiv Detail & Related papers (2021-05-10T07:31:59Z)
- Data Augmentation for Abstractive Query-Focused Multi-Document Summarization [129.96147867496205]
We present two QMDS training datasets, which we construct using two data augmentation methods.
These two datasets have complementary properties, i.e., QMDSCNN has real summaries but queries are simulated, while QMDSIR has real queries but simulated summaries.
We build end-to-end neural network models on the combined datasets that yield new state-of-the-art transfer results on DUC datasets.
arXiv Detail & Related papers (2021-03-02T16:57:01Z)
- SCNet: Training Inference Sample Consistency for Instance Segmentation [15.963615360741356]
This paper proposes an architecture referred to as Sample Consistency Network (SCNet) to ensure that the IoU distribution of the samples at training time is close to that at inference time.
Experiments on the standard dataset reveal the effectiveness of the proposed method over multiple evaluation metrics, including box AP, mask AP, and inference speed.
arXiv Detail & Related papers (2020-12-18T10:26:54Z)
- Regularized Densely-connected Pyramid Network for Salient Instance Segmentation [73.17802158095813]
We propose a new pipeline for end-to-end salient instance segmentation (SIS).
To better use the rich feature hierarchies in deep networks, we propose the regularized dense connections.
A novel multi-level RoIAlign based decoder is introduced to adaptively aggregate multi-level features for better mask predictions.
arXiv Detail & Related papers (2020-08-28T00:13:30Z)
- CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents [4.199844472131922]
We present an improved deep learning-based end to end approach for solving both problems of table detection and structure recognition.
We propose CascadeTabNet: a Cascade mask Region-based CNN High-Resolution Network (Cascade mask R-CNN HRNet) based model.
We attain the highest accuracy results on the ICDAR 2019 table structure recognition dataset.
arXiv Detail & Related papers (2020-04-27T08:12:48Z)
- Searching Central Difference Convolutional Networks for Face Anti-Spoofing [68.77468465774267]
Face anti-spoofing (FAS) plays a vital role in face recognition systems.
Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed networks.
Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC).
arXiv Detail & Related papers (2020-03-09T12:48:37Z)