Current Status and Performance Analysis of Table Recognition in Document
Images with Deep Neural Networks
- URL: http://arxiv.org/abs/2104.14272v1
- Date: Thu, 29 Apr 2021 11:43:48 GMT
- Title: Current Status and Performance Analysis of Table Recognition in Document
Images with Deep Neural Networks
- Authors: Khurram Azeem Hashmi, Marcus Liwicki, Didier Stricker, Muhammad Adnan
Afzal, Muhammad Ahtsham Afzal and Muhammad Zeshan Afzal
- Abstract summary: Table detection and structural recognition are pivotal problems in the domain of table understanding.
Recent advances in the computing capabilities of graphics processing units have enabled deep neural networks to outperform traditional machine learning methods.
This review paper provides a thorough analysis of the modern methodologies that utilize deep neural networks.
- Score: 12.161050209491496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The first phase of table recognition is to detect the tabular area in a
document. Subsequently, the tabular structures are recognized in the second
phase in order to extract information from the respective cells. Table
detection and structural recognition are pivotal problems in the domain of
table understanding. However, table analysis is a perplexing task due to the
colossal amount of diversity and asymmetry in tables. Therefore, it is an
active area of research in document image analysis. Recent advances in the
computing capabilities of graphics processing units have enabled deep neural
networks to outperform traditional state-of-the-art machine learning methods.
Table understanding has substantially benefited from the recent breakthroughs
in deep neural networks. However, there has not been a consolidated description
of the deep learning methods for table detection and table structure
recognition. This review paper provides a thorough analysis of the modern
methodologies that utilize deep neural networks. This work provides a thorough
understanding of the current state-of-the-art and related challenges of table
understanding in document images. Furthermore, the leading datasets and their
intricacies have been elaborated along with the quantitative results. Moreover,
a brief overview is given regarding the promising directions that can serve as
a guide to further improve table analysis in document images.
Related papers
- Knowledge-Aware Reasoning over Multimodal Semi-structured Tables [85.24395216111462]
This study investigates whether current AI models can perform knowledge-aware reasoning on multimodal structured data.
We introduce MMTabQA, a new dataset designed for this purpose.
Our experiments highlight substantial challenges for current AI models in effectively integrating and interpreting multiple text and image inputs.
(2024-08-25T15:17:43Z)
- From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models [98.41645229835493]
Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and aiding in informed decision-making.
Large foundation models, such as large language models, have revolutionized various natural language processing tasks.
This survey paper serves as a comprehensive resource for researchers and practitioners in the fields of natural language processing, computer vision, and data analysis.
(2024-03-18T17:57:09Z)
- Graph Neural Networks and Representation Embedding for Table Extraction in PDF Documents [1.1859913430860336]
The main contribution of this work is to tackle the problem of table extraction, exploiting Graph Neural Networks.
We experimentally evaluated the proposed approach on a new dataset obtained by merging the information provided in the PubLayNet and PubTables-1M datasets.
(2022-08-23T21:36:01Z)
- Temporal Graph Network Embedding with Causal Anonymous Walks Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
For evaluation, we provide a benchmark pipeline for the evaluation of temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
(2021-08-19T15:39:52Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
(2021-06-20T01:57:05Z)
- TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data [30.479822289380255]
We propose a novel neural network architecture, TabularNet, to simultaneously extract spatial and relational information from tables.
For relational information, we design a new graph construction method based on the WordNet tree and adopt a Graph Convolutional Network (GCN) based encoder.
Our neural network architecture can be a unified neural backbone for different understanding tasks and utilized in a multitask scenario.
(2021-06-06T11:48:09Z)
- Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition.
Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition.
We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
(2021-05-23T21:17:18Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute this performance deterioration to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
(2020-07-18T01:11:14Z)
- Table Structure Extraction with Bi-directional Gated Recurrent Unit Networks [5.350788087718877]
This paper proposes a robust deep learning based approach to extract rows and columns from a detected table in document images with a high precision.
We have benchmarked our system on publicly available UNLV as well as ICDAR 2013 datasets on which it outperformed the state-of-the-art table structure extraction systems by a significant margin.
(2020-01-08T13:17:44Z)
- TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images [18.016832803961165]
We propose a novel end-to-end deep learning model for both table detection and structure recognition.
TableNet exploits the interdependence between the twin tasks of table detection and table structure recognition.
The proposed model and extraction approach was evaluated on the publicly available ICDAR 2013 and Marmot Table datasets.
(2020-01-06T10:25:32Z)