Visual Understanding of Complex Table Structures from Document Images
- URL: http://arxiv.org/abs/2111.07129v1
- Date: Sat, 13 Nov 2021 14:54:33 GMT
- Title: Visual Understanding of Complex Table Structures from Document Images
- Authors: Sachin Raja, Ajoy Mondal, and C V Jawahar
- Abstract summary: We propose a novel object-detection-based deep model that captures the inherent alignments of cells within tables.
We also aim to improve structure recognition by deducing a novel rectilinear graph-based formulation.
Our framework improves the previous state-of-the-art performance by a 2.7% average F1-score on benchmark datasets.
- Score: 32.95187519339354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Table structure recognition is necessary for a comprehensive understanding of
documents. Tables in unstructured business documents are tough to parse due to
the high diversity of layouts, varying alignments of contents, and the presence
of empty cells. The problem is particularly difficult because of challenges in
identifying individual cells using visual or linguistic contexts or both.
Accurate detection of table cells (including empty cells) simplifies structure
extraction and hence, it becomes the prime focus of our work. We propose a
novel object-detection-based deep model that captures the inherent alignments
of cells within tables and is fine-tuned for fast optimization. Despite
accurate detection of cells, recognizing structures for dense tables may still
be challenging because of difficulties in capturing long-range row/column
dependencies in the presence of multi-row/column spanning cells. Therefore, we also
aim to improve structure recognition by deducing a novel rectilinear
graph-based formulation. From a semantics perspective, we highlight the
significance of empty cells in a table. To take these cells into account, we
suggest an enhancement to a popular evaluation criterion. Finally, we introduce
a modestly sized evaluation dataset with an annotation style inspired by human
cognition to encourage new approaches to the problem. Our framework improves
the previous state-of-the-art performance by a 2.7% average F1-score on
benchmark datasets.
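The role of empty cells in evaluation can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not name the evaluation criterion it enhances. The sketch assumes that cells already carry row/column spans (the hypothetical `Cell` record), that structure is scored by an F1 over directly adjacent cell pairs, and that an illustrative `include_empty` flag controls whether empty cells take part in the scoring.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Cell:
    # Hypothetical cell record: inclusive row/column spans plus text content.
    r0: int
    r1: int
    c0: int
    c1: int
    text: str = ""


def adjacency_relations(cells: List[Cell], include_empty: bool = True) -> Set[Tuple]:
    """Collect (left/top cell, right/bottom cell, direction) for touching cells.

    Simplification (assumption): with include_empty=False, empty cells and every
    relation touching them are dropped; neighbours are not re-linked across gaps.
    """
    if not include_empty:
        cells = [c for c in cells if c.text.strip()]
    relations = set()
    for a in cells:
        for b in cells:
            if a is b:
                continue
            rows_overlap = a.r0 <= b.r1 and b.r0 <= a.r1
            cols_overlap = a.c0 <= b.c1 and b.c0 <= a.c1
            if rows_overlap and b.c0 == a.c1 + 1:
                relations.add((a, b, "horizontal"))
            if cols_overlap and b.r0 == a.r1 + 1:
                relations.add((a, b, "vertical"))
    return relations


def f1_score(predicted: Set[Tuple], gold: Set[Tuple]) -> float:
    """F1 over two sets of adjacency relations (0.0 when nothing matches)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# A 2x2 table whose bottom-right cell is empty.
gold_cells = [
    Cell(0, 0, 0, 0, "Item"), Cell(0, 0, 1, 1, "Price"),
    Cell(1, 1, 0, 0, "Pen"), Cell(1, 1, 1, 1, ""),
]
# A prediction that misses the empty cell entirely.
predicted_cells = gold_cells[:3]

for flag in (True, False):
    score = f1_score(
        adjacency_relations(predicted_cells, include_empty=flag),
        adjacency_relations(gold_cells, include_empty=flag),
    )
    print(f"include_empty={flag}: F1={score:.3f}")
```

In this toy table the prediction that misses the empty cell still scores a perfect F1 when empty cells are ignored, while counting them lowers recall and exposes the missed detection.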
Related papers
- Multi-Cell Decoder and Mutual Learning for Table Structure and Character Recognition [1.2328446298523066]
We propose a multi-cell content decoder and bidirectional mutual learning mechanism to improve the end-to-end approach.
The effectiveness is demonstrated on two large datasets, and the experimental results show comparable performance to state-of-the-art models.
arXiv Detail & Related papers (2024-04-20T04:30:38Z)
- Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs [79.0426838808629]
We address TAT-DQA, i.e., answering questions over a visually-rich table-text document.
Specifically, we propose a novel Doc2SoarGraph framework with enhanced discrete reasoning capability.
We conduct extensive experiments on TAT-DQA dataset, and the results show that our proposed framework outperforms the best baseline model by 17.73% and 16.91% in terms of Exact Match (EM) and F1 score respectively on the test set.
arXiv Detail & Related papers (2023-05-03T07:30:32Z)
- SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge).
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
- TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST.
Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation.
We conduct experiments on several popular benchmarks, including PubTabNet and SynthTable, where our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
- Table Structure Recognition with Conditional Attention [13.976736586808308]
The Table Structure Recognition (TSR) problem aims to recognize the structure of a table and transform unstructured tables into a structured, machine-readable format.
In this study, we hypothesize that a complicated table structure can be represented by a graph whose vertices and edges represent the cells and association between cells, respectively.
Experimental results show that the alignment of a cell bounding box can help improve the micro-averaged F1 score from 0.915 to 0.963, and the macro-averaged F1 score from 0.787 to 0.923.
arXiv Detail & Related papers (2022-03-08T02:44:58Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
- TCN: Table Convolutional Network for Web Table Interpretation [52.32515851633981]
We propose a novel table representation learning approach considering both the intra- and inter-table contextual information.
Our method can outperform competitive baselines by +4.8% of F1 for column type prediction and by +4.1% of F1 for column pairwise relation prediction.
arXiv Detail & Related papers (2021-02-17T02:18:10Z)
- Table Structure Recognition using Top-Down and Bottom-Up Cues [28.65687982486627]
We present an approach for table structure recognition that combines cell detection and interaction modules.
We empirically validate our method on publicly available real-world datasets.
arXiv Detail & Related papers (2020-10-09T13:32:53Z)
- Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context [11.99452212008243]
We present a vision-guided systematic framework for joint table detection and cell structure recognition.
With GTE-Table, we invent a new penalty based on the natural cell containment constraint of tables to train our table network.
We use this to enhance PubTabNet with cell labels and create FinTabNet, real-world and complex scientific and financial datasets.
arXiv Detail & Related papers (2020-05-01T20:14:49Z)
- Identifying Table Structure in Documents using Conditional Generative Adversarial Networks [0.0]
In many industries and in academic research, information is primarily transmitted in the form of unstructured documents.
We propose a top-down approach, first using a conditional generative adversarial network to map a table image into a standardised 'skeleton' table form.
We then derive the latent table structure using xy-cut projection and Genetic Algorithm optimisation.
arXiv Detail & Related papers (2020-01-13T20:42:40Z)