A large-scale dataset for end-to-end table recognition in the wild
- URL: http://arxiv.org/abs/2303.14884v1
- Date: Mon, 27 Mar 2023 02:48:51 GMT
- Title: A large-scale dataset for end-to-end table recognition in the wild
- Authors: Fan Yang, Lei Hu, Xinwu Liu, Shuangping Huang, Zhenghui Gu
- Abstract summary: Table recognition (TR) is one of the research hotspots in pattern recognition.
Currently, end-to-end TR in real scenarios, which accomplishes the three sub-tasks (table detection, table structure recognition and table content recognition) simultaneously, remains an unexplored research area.
We propose a new large-scale dataset named Table Recognition Set (TabRecSet) with diverse table forms sourced from multiple scenarios in the wild.
- Score: 13.717478398235055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Table recognition (TR) is one of the research hotspots in pattern
recognition, which aims to extract information from tables in an image. Common
table recognition tasks include table detection (TD), table structure
recognition (TSR) and table content recognition (TCR). TD locates tables
in the image, TCR recognizes text content, and TSR recognizes spatial and logical
structure. Currently, end-to-end TR in real scenarios, which accomplishes the
three sub-tasks simultaneously, remains an unexplored research area. One major
factor that inhibits researchers is the lack of a benchmark dataset. To this
end, we propose a new large-scale dataset named Table Recognition Set
(TabRecSet) with diverse table forms sourced from multiple scenarios in the
wild, providing complete annotation dedicated to end-to-end TR research. It is
the largest and first bilingual dataset for end-to-end TR, with 38.1K tables,
of which 20.4K are in English and 17.7K are in Chinese. The samples have
diverse forms, such as border-complete and border-incomplete tables and regular
and irregular tables (rotated, distorted, etc.). The in-the-wild scenarios are
varied, ranging from scanned to camera-taken images, documents to Excel tables,
and educational test papers to financial invoices. The annotations are complete,
consisting of the table body spatial annotation, cell spatial and logical
annotation and text content for TD, TSR and TCR, respectively. The spatial
annotation utilizes the polygon instead of the bounding box or quadrilateral
adopted by most datasets. The polygon spatial annotation is more suitable for
irregular tables that are common in wild scenarios. Additionally, we propose a
visualized and interactive annotation tool named TableMe to improve the
efficiency and quality of table annotation.
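The abstract's point that a polygon fits an irregular table better than a bounding box can be illustrated with a small sketch. This is not code from the paper or from TabRecSet: the outline coordinates and the helper names `polygon_area` and `bbox_area` are hypothetical, chosen only to show how much background an axis-aligned box includes when the table is rotated.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Area of a simple polygon, computed with the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bbox_area(points: List[Point]) -> float:
    """Area of the axis-aligned bounding box that encloses the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical outline of a table rotated by 45 degrees (made-up coordinates).
table_outline: List[Point] = [(2.0, 0.0), (4.0, 2.0), (2.0, 4.0), (0.0, 2.0)]

poly = polygon_area(table_outline)
box = bbox_area(table_outline)
coverage = poly / box  # fraction of the box actually occupied by the table

print(f"polygon area: {poly}, bounding-box area: {box}, coverage: {coverage}")
```

In this made-up case the table fills only half of its bounding box, so a box label would mark as much background as table. A polygon label follows the outline itself, and it can take more than four vertices when the table is distorted rather than merely rotated.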
Related papers
- UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition [55.153629718464565]
We introduce UniTabNet, a novel framework for table structure parsing based on the image-to-text model.
UniTabNet employs a "divide-and-conquer" strategy, utilizing an image-to-text model to decouple table cells and integrating both physical and logical decoders to reconstruct the complete table structure.
arXiv Detail & Related papers (2024-09-20T01:26:32Z)
- TRACE: Table Reconstruction Aligned to Corner and Edges [7.536220920052911]
We analyze the natural characteristics of a table, where a table is composed of cells and each cell is made up of borders consisting of edges.
We propose a novel method to reconstruct the table in a bottom-up manner.
A simple design makes the model easier to train and requires less computation than previous two-stage methods.
arXiv Detail & Related papers (2023-05-01T02:26:15Z)
- SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge).
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
- Split, embed and merge: An accurate table structure recognizer [42.579215135672094]
We introduce Split, Embed and Merge (SEM) as an accurate table structure recognizer.
SEM can achieve an average F-Measure of 96.9% on the SciTSR dataset.
arXiv Detail & Related papers (2021-07-12T06:26:19Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
- Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition.
Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition.
We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z)
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)
- ToTTo: A Controlled Table-To-Text Generation Dataset [61.83159452483026]
ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples.
We introduce a dataset construction process where annotators directly revise existing candidate sentences from Wikipedia.
While usually fluent, existing methods often hallucinate phrases that are not supported by the table.
arXiv Detail & Related papers (2020-04-29T17:53:45Z)
- TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images [18.016832803961165]
We propose a novel end-to-end deep learning model for both table detection and structure recognition.
TableNet exploits the interdependence between the twin tasks of table detection and table structure recognition.
The proposed model and extraction approach were evaluated on the publicly available ICDAR 2013 and Marmot Table datasets.
arXiv Detail & Related papers (2020-01-06T10:25:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.