Split, embed and merge: An accurate table structure recognizer
- URL: http://arxiv.org/abs/2107.05214v1
- Date: Mon, 12 Jul 2021 06:26:19 GMT
- Title: Split, embed and merge: An accurate table structure recognizer
- Authors: Zhenrong Zhang, Jianshu Zhang and Jun Du
- Abstract summary: We introduce Split, Embed and Merge (SEM) as an accurate table structure recognizer.
SEM can achieve an average F-Measure of $96.9%$ on the SciTSR dataset.
- Score: 42.579215135672094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of table structure recognition is to recognize the internal
structure of a table, which is a key step to make machines understand tables.
However, tabular data in unstructured digital documents, e.g. Portable Document
Format (PDF) and images, are difficult to parse into structured
machine-readable format, due to complexity and diversity in their structure and
style, especially for complex tables. In this paper, we introduce Split, Embed
and Merge (SEM), an accurate table structure recognizer. In the first stage, we
use the FCN to predict the potential regions of the table row (column)
separators, so as to obtain the bounding boxes of the basic grids in the table.
In the second stage, we not only extract the visual features corresponding to
each grid through RoIAlign, but also use the off-the-shelf recognizer and the
BERT to extract the semantic features. The fused features of both are used to
characterize each table grid. We find that by adding additional semantic
features to each grid, the ambiguity problem of the table structure from the
visual perspective can be solved to a certain extent and achieve higher
precision. Finally, we process the merging of these basic grids in a
self-regression manner. The correspondent merging results is learned by the
attention maps in attention mechanism. With the proposed method, we can
recognize the structure of tables well, even for complex tables. SEM can
achieve an average F-Measure of $96.9\%$ on the SciTSR dataset which
outperforms other methods by a large margin. Extensive experiments on other
publicly available table structure recognition datasets show that our model
achieves state-of-the-art.
Related papers
- SEMv3: A Fast and Robust Approach to Table Separation Line Detection [48.75713662571455]
Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image.
"Split-and-merge" paradigm is a pivotal approach to parse table structure, where the table separation line detection is crucial.
We propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines.
arXiv Detail & Related papers (2024-05-20T08:13:46Z) - SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge)
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate the SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z) - TRUST: An Accurate and End-to-End Table structure Recognizer Using
Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST.
Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation.
We conduct experiments on several popular benchmarks including PubTabNet and SynthTable, our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z) - Table Structure Recognition with Conditional Attention [13.976736586808308]
Table Structure Recognition (TSR) problem aims to recognize the structure of a table and transform the unstructured tables into a structured and machine-readable format.
In this study, we hypothesize that a complicated table structure can be represented by a graph whose vertices and edges represent the cells and association between cells, respectively.
Experimental results show that the alignment of a cell bounding box can help improve the Micro-averaged F1 score from 0.915 to 0.963, and the Macro-average F1 score from 0.787 to 0.923.
arXiv Detail & Related papers (2022-03-08T02:44:58Z) - TGRNet: A Table Graph Reconstruction Network for Table Structure
Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z) - Multi-Type-TD-TSR -- Extracting Tables from Document Images using a
Multi-stage Pipeline for Table Detection and Table Structure Recognition:
from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition.
Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition.
We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z) - Table Structure Recognition using Top-Down and Bottom-Up Cues [28.65687982486627]
We present an approach for table structure recognition that combines cell detection and interaction modules.
We empirically validate our method on the publicly available real-world datasets.
arXiv Detail & Related papers (2020-10-09T13:32:53Z) - Identifying Table Structure in Documents using Conditional Generative
Adversarial Networks [0.0]
In many industries and in academic research, information is primarily transmitted in the form of unstructured documents.
We propose a top-down approach, first using a conditional generative adversarial network to map a table image into a standardised skeleton' table form.
We then deriving latent table structure using xy-cut projection and Genetic Algorithm optimisation.
arXiv Detail & Related papers (2020-01-13T20:42:40Z) - Table Structure Extraction with Bi-directional Gated Recurrent Unit
Networks [5.350788087718877]
This paper proposes a robust deep learning based approach to extract rows and columns from a detected table in document images with a high precision.
We have benchmarked our system on publicly available UNLV as well as ICDAR 2013 datasets on which it outperformed the state-of-the-art table structure extraction systems by a significant margin.
arXiv Detail & Related papers (2020-01-08T13:17:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.