Related papers: TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content

URL: http://arxiv.org/abs/2404.10305v2
Date: Fri, 19 Apr 2024 06:23:20 GMT
Title: TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content
Authors: Avinash Anand, Raj Jaiswal, Pijush Bhuyan, Mohit Gupta, Siddhesh Bangar, Md. Modassir Imam, Rajiv Ratn Shah, Shin'ichi Satoh
Abstract summary: We propose an end-to-end pipeline that integrates deep learning models, including DETR, CascadeTabNet, and PP OCR v2, to achieve comprehensive image-based table recognition. Our system achieves simultaneous table detection (TD), table structure recognition (TSR), and table content recognition (TCR). Our proposed approach achieves an IOU of 0.96 and an OCR Accuracy of 78%, showcasing a remarkable improvement of approximately 25% in the OCR Accuracy compared to the previous Table Transformer approach.
Score: 39.34067105360439
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The automatic recognition of tabular data in document images presents a significant challenge due to the diverse range of table styles and complex structures. Tables offer valuable content representation, enhancing the predictive capabilities of various systems such as search engines and Knowledge Graphs. Addressing the two main problems, namely table detection (TD) and table structure recognition (TSR), has traditionally been approached independently. In this research, we propose an end-to-end pipeline that integrates deep learning models, including DETR, CascadeTabNet, and PP OCR v2, to achieve comprehensive image-based table recognition. This integrated approach effectively handles diverse table styles, complex structures, and image distortions, resulting in improved accuracy and efficiency compared to existing methods like Table Transformers. Our system achieves simultaneous table detection (TD), table structure recognition (TSR), and table content recognition (TCR), preserving table structures and accurately extracting tabular data from document images. The integration of multiple models addresses the intricacies of table recognition, making our approach a promising solution for image-based table understanding, data extraction, and information retrieval applications. Our proposed approach achieves an IOU of 0.96 and an OCR Accuracy of 78%, showcasing a remarkable improvement of approximately 25% in the OCR Accuracy compared to the previous Table Transformer approach.
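
The abstract reports an IOU of 0.96 but this listing does not say how it is computed. As a minimal sketch, assuming the usual intersection-over-union between a predicted and a ground-truth axis-aligned box in (x1, y1, x2, y2) form, the metric can be illustrated as follows. The function name `iou` and the box format are assumptions made for illustration, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    Illustrative helper only; the box format is an assumption, not the
    paper's evaluation code.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; clamp to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0
```

Under this definition, identical boxes score 1.0 and disjoint boxes score 0.0, so a value of 0.96 would mean the predicted region almost exactly matches the reference region.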

Related papers

MonkeyOCR v1.5 Technical Report: Unlocking Robust Document Parsing for Complex Patterns [80.05126590825121]
MonkeyOCR v1.5 is a unified vision-language framework that enhances both layout understanding and content recognition. To address complex table structures, we propose a visual consistency-based reinforcement learning scheme. Two specialized modules, Image-Decoupled Table Parsing and Type-Guided Table Merging, are introduced to enable reliable parsing of tables.
arXiv Detail & Related papers (2025-11-13T15:12:17Z)
TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding [52.59372043981724]
TableDART is a training-efficient framework that integrates multimodal views by reusing pretrained single-modality models. In addition, we propose a novel agent for cross-modal knowledge integration that analyzes outputs from text- and image-based models.
arXiv Detail & Related papers (2025-09-18T07:00:13Z)
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy [81.76462101465354]
We present a novel large vision-language model, TabPedia, equipped with a concept synergy mechanism. This unified framework allows TabPedia to seamlessly integrate VTU tasks, such as table detection, table structure recognition, table querying, and table question answering. To better evaluate the VTU task in real-world scenarios, we establish a new and comprehensive table VQA benchmark, ComTQA.
arXiv Detail & Related papers (2024-06-03T13:54:05Z)
Efficient Information Sharing in ICT Supply Chain Social Network via Table Structure Recognition [12.79419287446918]
Table Structure Recognition (TSR) aims to represent tables with complex structures in a machine-interpretable format. We implement our proposed method based on Faster-RCNN and achieve 94.79% mean Average Precision (AP).
arXiv Detail & Related papers (2022-11-03T20:03:07Z)
TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST. Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation. We conduct experiments on several popular benchmarks, including PubTabNet and SynthTable, and our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
Robust Table Detection and Structure Recognition from Heterogeneous Document Images [6.961470641696773]
We introduce RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of the table from heterogeneous document images. For table detection, we propose to use CornerNet as a new region proposal network to generate higher quality table proposals for Faster R-CNN. Our table structure recognition approach achieves state-of-the-art performance on three public benchmarks, including SciTSR, PubTabNet and cTDaR TrackB.
arXiv Detail & Related papers (2022-03-17T03:35:12Z)
TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition. Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
Tab.IAIS: Flexible Table Recognition and Semantic Interpretation System [84.39812458417246]
We develop two rule-based algorithms that perform the complete table recognition process and support the most frequent table formats. To incorporate the extraction of semantic information into the table recognition process, we develop a graph-based table interpretation method. Our table recognition approach achieves results competitive with state-of-the-art approaches.
arXiv Detail & Related papers (2021-05-25T12:31:02Z)
Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition. Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition. We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z)
Table Structure Recognition using Top-Down and Bottom-Up Cues [28.65687982486627]
We present an approach for table structure recognition that combines cell detection and interaction modules. We empirically validate our method on the publicly available real-world datasets.
arXiv Detail & Related papers (2020-10-09T13:32:53Z)
Table Structure Extraction with Bi-directional Gated Recurrent Unit Networks [5.350788087718877]
This paper proposes a robust deep learning based approach to extract rows and columns from a detected table in document images with a high precision. We have benchmarked our system on publicly available UNLV as well as ICDAR 2013 datasets on which it outperformed the state-of-the-art table structure extraction systems by a significant margin.
arXiv Detail & Related papers (2020-01-08T13:17:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.