Related papers: Visual Understanding of Complex Table Structures from Document Images

URL: http://arxiv.org/abs/2111.07129v1
Date: Sat, 13 Nov 2021 14:54:33 GMT
Title: Visual Understanding of Complex Table Structures from Document Images
Authors: Sachin Raja, Ajoy Mondal, and C V Jawahar
Abstract summary: We propose a novel object-detection-based deep model that captures the inherent alignments of cells within tables. We also aim to improve structure recognition by deducing a novel rectilinear graph-based formulation. Our framework improves the previous state-of-the-art performance by a 2.7% average F1-score on benchmark datasets.
Score: 32.95187519339354
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Table structure recognition is necessary for a comprehensive understanding of documents. Tables in unstructured business documents are tough to parse due to the high diversity of layouts, varying alignments of contents, and the presence of empty cells. The problem is particularly difficult because of challenges in identifying individual cells using visual or linguistic contexts or both. Accurate detection of table cells (including empty cells) simplifies structure extraction and hence, it becomes the prime focus of our work. We propose a novel object-detection-based deep model that captures the inherent alignments of cells within tables and is fine-tuned for fast optimization. Despite accurate detection of cells, recognizing structures for dense tables may still be challenging because of difficulties in capturing long-range row/column dependencies in presence of multi-row/column spanning cells. Therefore, we also aim to improve structure recognition by deducing a novel rectilinear graph-based formulation. From a semantics perspective, we highlight the significance of empty cells in a table. To take these cells into account, we suggest an enhancement to a popular evaluation criterion. Finally, we introduce a modestly sized evaluation dataset with an annotation style inspired by human cognition to encourage new approaches to the problem. Our framework improves the previous state-of-the-art performance by a 2.7% average F1-score on benchmark datasets.

Related papers

Multi-Cell Decoder and Mutual Learning for Table Structure and Character Recognition [1.2328446298523066]
We propose a multi-cell content decoder and bidirectional mutual learning mechanism to improve the end-to-end approach. The effectiveness is demonstrated on two large datasets, and the experimental results show comparable performance to state-of-the-art models.
arXiv Detail & Related papers (2024-04-20T04:30:38Z)
Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs [79.0426838808629]
We address the TAT-DQA task, i.e., answering questions over a visually-rich table-text document. Specifically, we propose a novel Doc2SoarGraph framework with enhanced discrete reasoning capability. We conduct extensive experiments on the TAT-DQA dataset, and the results show that our proposed framework outperforms the best baseline model by 17.73% and 16.91% in terms of Exact Match (EM) and F1 score, respectively, on the test set.
arXiv Detail & Related papers (2023-05-03T07:30:32Z)
SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge). We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution. To comprehensively evaluate SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST. Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation. We conduct experiments on several popular benchmarks, including PubTabNet and SynthTable, and our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
Table Structure Recognition with Conditional Attention [13.976736586808308]
The Table Structure Recognition (TSR) problem aims to recognize the structure of a table and transform unstructured tables into a structured, machine-readable format. In this study, we hypothesize that a complicated table structure can be represented by a graph whose vertices and edges represent the cells and the associations between cells, respectively. Experimental results show that the alignment of a cell bounding box can help improve the Micro-averaged F1 score from 0.915 to 0.963, and the Macro-averaged F1 score from 0.787 to 0.923.
arXiv Detail & Related papers (2022-03-08T02:44:58Z)
TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition. Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
TCN: Table Convolutional Network for Web Table Interpretation [52.32515851633981]
We propose a novel table representation learning approach considering both the intra- and inter-table contextual information. Our method can outperform competitive baselines by +4.8% of F1 for column type prediction and by +4.1% of F1 for column pairwise relation prediction.
arXiv Detail & Related papers (2021-02-17T02:18:10Z)
Table Structure Recognition using Top-Down and Bottom-Up Cues [28.65687982486627]
We present an approach for table structure recognition that combines cell detection and interaction modules. We empirically validate our method on the publicly available real-world datasets.
arXiv Detail & Related papers (2020-10-09T13:32:53Z)
Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context [11.99452212008243]
We present a vision-guided systematic framework for joint table detection and cell structure recognition. With GTE-Table, we invent a new penalty based on the natural cell containment constraint of tables to train our table network. We use this to enhance PubTabNet with cell labels and create FinTabNet, real-world and complex scientific and financial datasets.
arXiv Detail & Related papers (2020-05-01T20:14:49Z)
Identifying Table Structure in Documents using Conditional Generative Adversarial Networks [0.0]
In many industries and in academic research, information is primarily transmitted in the form of unstructured documents. We propose a top-down approach, first using a conditional generative adversarial network to map a table image into a standardised 'skeleton' table form. We then derive the latent table structure using xy-cut projection and Genetic Algorithm optimisation.
arXiv Detail & Related papers (2020-01-13T20:42:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.