Split, embed and merge: An accurate table structure recognizer
- URL: http://arxiv.org/abs/2107.05214v1
- Date: Mon, 12 Jul 2021 06:26:19 GMT
- Title: Split, embed and merge: An accurate table structure recognizer
- Authors: Zhenrong Zhang, Jianshu Zhang and Jun Du
- Abstract summary: We introduce Split, Embed and Merge (SEM) as an accurate table structure recognizer.
SEM can achieve an average F-Measure of $96.9%$ on the SciTSR dataset.
- Score: 42.579215135672094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of table structure recognition is to recognize the internal
structure of a table, which is a key step to make machines understand tables.
However, tabular data in unstructured digital documents, e.g. Portable Document
Format (PDF) and images, are difficult to parse into structured
machine-readable format, due to complexity and diversity in their structure and
style, especially for complex tables. In this paper, we introduce Split, Embed
and Merge (SEM), an accurate table structure recognizer. In the first stage, we
use the FCN to predict the potential regions of the table row (column)
separators, so as to obtain the bounding boxes of the basic grids in the table.
In the second stage, we not only extract the visual features corresponding to
each grid through RoIAlign, but also use the off-the-shelf recognizer and the
BERT to extract the semantic features. The fused features of both are used to
characterize each table grid. We find that by adding additional semantic
features to each grid, the ambiguity problem of the table structure from the
visual perspective can be solved to a certain extent and achieve higher
precision. Finally, we process the merging of these basic grids in a
self-regression manner. The correspondent merging results is learned by the
attention maps in attention mechanism. With the proposed method, we can
recognize the structure of tables well, even for complex tables. SEM can
achieve an average F-Measure of $96.9\%$ on the SciTSR dataset which
outperforms other methods by a large margin. Extensive experiments on other
publicly available table structure recognition datasets show that our model
achieves state-of-the-art.
Related papers
- UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition [55.153629718464565]
We introduce UniTabNet, a novel framework for table structure parsing based on the image-to-text model.
UniTabNet employs a divide-and-conquer'' strategy, utilizing an image-to-text model to decouple table cells and integrating both physical and logical decoders to reconstruct the complete table structure.
arXiv Detail & Related papers (2024-09-20T01:26:32Z) - SEMv3: A Fast and Robust Approach to Table Separation Line Detection [48.75713662571455]
Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image.
"Split-and-merge" paradigm is a pivotal approach to parse table structure, where the table separation line detection is crucial.
We propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines.
arXiv Detail & Related papers (2024-05-20T08:13:46Z) - SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge)
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate the SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z) - TRUST: An Accurate and End-to-End Table structure Recognizer Using
Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST.
Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation.
We conduct experiments on several popular benchmarks including PubTabNet and SynthTable, our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z) - Table Structure Recognition with Conditional Attention [13.976736586808308]
Table Structure Recognition (TSR) problem aims to recognize the structure of a table and transform the unstructured tables into a structured and machine-readable format.
In this study, we hypothesize that a complicated table structure can be represented by a graph whose vertices and edges represent the cells and association between cells, respectively.
Experimental results show that the alignment of a cell bounding box can help improve the Micro-averaged F1 score from 0.915 to 0.963, and the Macro-average F1 score from 0.787 to 0.923.
arXiv Detail & Related papers (2022-03-08T02:44:58Z) - TGRNet: A Table Graph Reconstruction Network for Table Structure
Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z) - Table Structure Recognition using Top-Down and Bottom-Up Cues [28.65687982486627]
We present an approach for table structure recognition that combines cell detection and interaction modules.
We empirically validate our method on the publicly available real-world datasets.
arXiv Detail & Related papers (2020-10-09T13:32:53Z) - Identifying Table Structure in Documents using Conditional Generative
Adversarial Networks [0.0]
In many industries and in academic research, information is primarily transmitted in the form of unstructured documents.
We propose a top-down approach, first using a conditional generative adversarial network to map a table image into a standardised skeleton' table form.
We then deriving latent table structure using xy-cut projection and Genetic Algorithm optimisation.
arXiv Detail & Related papers (2020-01-13T20:42:40Z) - Table Structure Extraction with Bi-directional Gated Recurrent Unit
Networks [5.350788087718877]
This paper proposes a robust deep learning based approach to extract rows and columns from a detected table in document images with a high precision.
We have benchmarked our system on publicly available UNLV as well as ICDAR 2013 datasets on which it outperformed the state-of-the-art table structure extraction systems by a significant margin.
arXiv Detail & Related papers (2020-01-08T13:17:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.