GridFormer: Towards Accurate Table Structure Recognition via Grid
Prediction
- URL: http://arxiv.org/abs/2309.14962v1
- Date: Tue, 26 Sep 2023 14:29:45 GMT
- Title: GridFormer: Towards Accurate Table Structure Recognition via Grid
Prediction
- Authors: Pengyuan Lyu, Weihong Ma, Hongyi Wang, Yuechen Yu, Chengquan Zhang,
Kun Yao, Yang Xue, Jingdong Wang
- Abstract summary: We propose GridFormer, a novel approach for interpreting unconstrained table structures.
In this paper, we propose a flexible table representation in the form of an M×N grid.
Then, we introduce a DETR-style table structure recognizer to efficiently predict this multi-objective information of the grid in a single shot.
- Score: 35.15882175670814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: All tables can be represented as grids. Based on this observation, we propose
GridFormer, a novel approach for interpreting unconstrained table structures by
predicting the vertices and edges of a grid. First, we propose a flexible table
representation in the form of an M×N grid. In this representation, the vertices
and edges of the grid store the localization and adjacency information of the
table. Then, we introduce a DETR-style table structure recognizer that
efficiently predicts this multi-objective information of the grid in a single
shot. Specifically, given a set of learned row and column queries, the
recognizer directly outputs the vertex and edge information of the
corresponding rows and columns. Extensive experiments on five challenging
benchmarks, which include wired, wireless, multi-merge-cell, oriented, and
distorted tables, demonstrate the competitive performance of our model over
other methods.
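As a rough illustration of this grid representation (a minimal sketch under assumptions, not GridFormer's actual data structures or tensor layout), an M×N table of cells can be encoded as (M+1)×(N+1) vertex coordinates plus boolean edge flags, with merged cells expressed by dropping the shared border edge. The `TableGrid` class and its `merge_*` helpers below are hypothetical names.

```python
import numpy as np

class TableGrid:
    """An M x N table as a grid of (M+1) x (N+1) vertices plus edge flags.
    Illustrative sketch only; not GridFormer's actual encoding."""

    def __init__(self, m: int, n: int):
        self.m, self.n = m, n
        # Vertex (i, j) stores its (x, y) location in image coordinates.
        self.vertices = np.zeros((m + 1, n + 1, 2), dtype=np.float32)
        # h_edges[i, j] is True if the horizontal edge between vertices
        # (i, j) and (i, j + 1) is present.
        self.h_edges = np.ones((m + 1, n), dtype=bool)
        # v_edges[i, j] is True if the vertical edge between vertices
        # (i, j) and (i + 1, j) is present.
        self.v_edges = np.ones((m, n + 1), dtype=bool)

    def merge_right(self, row: int, col: int) -> None:
        # Removing the shared vertical border merges cell (row, col)
        # with cell (row, col + 1).
        self.v_edges[row, col + 1] = False

    def merge_down(self, row: int, col: int) -> None:
        # Removing the shared horizontal border merges cell (row, col)
        # with cell (row + 1, col).
        self.h_edges[row + 1, col] = False


# Example: a 2 x 3 table whose top-left cell spans two columns.
grid = TableGrid(2, 3)
grid.merge_right(0, 0)
```

Storing edges separately from vertices keeps localization (where the grid lines are) independent of adjacency (which borders exist), matching the two kinds of information the abstract attributes to the grid.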
Related papers
- UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition [55.153629718464565]
We introduce UniTabNet, a novel framework for table structure parsing based on the image-to-text model.
UniTabNet employs a "divide-and-conquer" strategy, utilizing an image-to-text model to decouple table cells and integrating both physical and logical decoders to reconstruct the complete table structure.
arXiv Detail & Related papers (2024-09-20T01:26:32Z) - TRACE: Table Reconstruction Aligned to Corner and Edges [7.536220920052911]
We analyze the natural characteristics of a table, where a table is composed of cells and each cell is made up of borders consisting of edges.
We propose a novel method to reconstruct the table in a bottom-up manner.
A simple design makes the model easier to train and requires less computation than previous two-stage methods.
arXiv Detail & Related papers (2023-05-01T02:26:15Z) - SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge).
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate the SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z) - TRUST: An Accurate and End-to-End Table Structure Recognizer Using
Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST.
Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation.
We conduct experiments on several popular benchmarks, including PubTabNet and SynthTable, where our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z) - Split, embed and merge: An accurate table structure recognizer [42.579215135672094]
We introduce Split, Embed and Merge (SEM) as an accurate table structure recognizer.
SEM can achieve an average F-Measure of 96.9% on the SciTSR dataset.
arXiv Detail & Related papers (2021-07-12T06:26:19Z) - TGRNet: A Table Graph Reconstruction Network for Table Structure
Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z) - Retrieving Complex Tables with Multi-Granular Graph Representation
Learning [20.72341939868327]
The task of natural language table retrieval (NLTR) seeks to retrieve semantically relevant tables based on natural language queries.
Existing learning systems treat tables as plain text based on the assumption that tables are structured as dataframes.
We propose Graph-based Table Retrieval (GTR), a generalizable NLTR framework with multi-granular graph representation learning.
arXiv Detail & Related papers (2021-05-04T20:19:03Z) - TCN: Table Convolutional Network for Web Table Interpretation [52.32515851633981]
We propose a novel table representation learning approach considering both the intra- and inter-table contextual information.
Our method can outperform competitive baselines by +4.8% F1 for column type prediction and by +4.1% F1 for column pairwise relation prediction.
arXiv Detail & Related papers (2021-02-17T02:18:10Z) - Identifying Table Structure in Documents using Conditional Generative
Adversarial Networks [0.0]
In many industries and in academic research, information is primarily transmitted in the form of unstructured documents.
We propose a top-down approach, first using a conditional generative adversarial network to map a table image into a standardised 'skeleton' table form.
We then derive the latent table structure using xy-cut projection (sketched below) and Genetic Algorithm optimisation.
arXiv Detail & Related papers (2020-01-13T20:42:40Z)
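For context on the xy-cut step mentioned in the last entry, the sketch below is a minimal version of the classic recursive xy-cut algorithm, not that paper's implementation. It projects the foreground pixels of a binary skeleton mask onto one axis, cuts at interior blank runs at least `min_gap` pixels wide (an assumed threshold), and recurses on the resulting segments along the alternating axis.

```python
import numpy as np

def xy_cut(mask: np.ndarray, min_gap: int = 2):
    """Return leaf boxes (top, left, bottom, right) found by recursive
    xy-cuts on a binary mask (nonzero = foreground)."""
    leaves = []

    def find_cuts(profile: np.ndarray):
        # Indices where the projection is empty, grouped into consecutive runs.
        blank = np.flatnonzero(profile == 0)
        if blank.size == 0:
            return []
        runs = np.split(blank, np.flatnonzero(np.diff(blank) > 1) + 1)
        # Cut at the center of each interior blank run that is wide enough.
        return [int(r.mean()) for r in runs
                if len(r) >= min_gap and r[0] > 0 and r[-1] < profile.size - 1]

    def recurse(top: int, left: int, bottom: int, right: int, axis: int):
        sub = mask[top:bottom, left:right]
        for _ in range(2):  # try the preferred axis, then the other one
            profile = sub.sum(axis=1 - axis)  # axis 0: row sums, axis 1: column sums
            cuts = find_cuts(profile)
            if cuts:
                bounds = [0] + cuts + [profile.size]
                for a, b in zip(bounds[:-1], bounds[1:]):
                    if axis == 0:
                        recurse(top + a, left, top + b, right, 1)
                    else:
                        recurse(top, left + a, bottom, left + b, 0)
                return
            axis = 1 - axis
        # No cut on either axis: this region is a leaf.
        leaves.append((top, left, bottom, right))

    recurse(0, 0, mask.shape[0], mask.shape[1], 0)
    return leaves
```

Calling `xy_cut(skeleton_mask)` yields the leaf boxes of the layout tree; in that paper's pipeline, a subsequent step (their Genetic Algorithm optimisation) would refine such regions into the final table structure.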