SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection
- URL: http://arxiv.org/abs/2406.15485v3
- Date: Mon, 8 Jul 2024 07:41:50 GMT
- Title: SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection
- Authors: Xingjian Hu, Baole Wei, Liangcai Gao, Jun Wang
- Abstract summary: Text line detection is a key task in historical document analysis.
We propose a general framework for historical document text detection (SegHist).
Integrating the SegHist framework with the commonly used method DB++, we develop DB-SegHist.
- Score: 10.08588082910962
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text line detection is a key task in historical document analysis, and it faces challenges such as arbitrary-shaped text lines, dense text, and text lines with high aspect ratios. In this paper, we propose a general framework for historical document text detection (SegHist), which enables existing segmentation-based text detection methods to address these challenges effectively, especially text lines with high aspect ratios. Integrating the SegHist framework with the commonly used method DB++, we develop DB-SegHist. This approach achieves state-of-the-art (SOTA) results on the CHDAC and MTHv2 datasets and competitive results on the HDRC dataset, with a significant improvement of 1.19% on the most challenging CHDAC dataset, which features more text lines with high aspect ratios. Moreover, our method attains SOTA on rotated MTHv2 and rotated HDRC, demonstrating its rotational robustness. The code is available at https://github.com/LumionHXJ/SegHist.
Related papers
- Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis [52.01356859448068]
HTS can recognize text in an image and identify its 4-level hierarchical structure: characters, words, lines, and paragraphs.
HTS achieves state-of-the-art results on multiple word-level text spotting benchmark datasets as well as geometric layout analysis tasks.
arXiv Detail & Related papers (2023-10-25T22:23:54Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- Towards End-to-End Unified Scene Text Detection and Layout Analysis [60.68100769639923]
We introduce the task of unified scene text detection and layout analysis.
The first hierarchical scene text dataset is introduced to enable this novel research task.
We also propose a novel method that is able to simultaneously detect scene text and form text clusters in a unified way.
arXiv Detail & Related papers (2022-03-28T23:35:45Z)
- SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition [73.61592015908353]
We propose a new end-to-end scene text spotting framework termed SwinTextSpotter.
Using a transformer with dynamic head as the detector, we unify the two tasks with a novel Recognition Conversion mechanism.
The design results in a concise framework that requires neither additional rectification module nor character-level annotation.
arXiv Detail & Related papers (2022-03-19T01:14:42Z)
- On Exploring and Improving Robustness of Scene Text Detection Models [20.15225372544634]
We evaluate scene text detection models on ICDAR2015-C (IC15-C) and CTW1500-C (CTW-C).
We perform a robustness analysis of six key components: pre-training data, backbone, feature fusion module, multi-scale predictions, representation of text instances and loss function.
We present a simple yet effective data-based method to destroy the smoothness of text regions by merging background and foreground.
arXiv Detail & Related papers (2021-10-12T02:36:48Z)
- Line Segmentation from Unconstrained Handwritten Text Images using Adaptive Approach [10.436029791699777]
Line segmentation from handwritten text images is a challenging task due to diversity and unknown variations.
An adaptive approach is used for the line segmentation from handwritten text images merging the alignment of connected component coordinates and text height.
The proposed scheme is tested on two different types of datasets: document pages with baselines and plain pages.
arXiv Detail & Related papers (2021-04-18T08:52:52Z)
- Combining Morphological and Histogram based Text Line Segmentation in the OCR Context [0.0]
The algorithmic approach proposed in this paper has been designed for this exact purpose.
The method was developed to be applied on a historic data collection that commonly features quality issues.
Because the promising segmentation results come at a low computational cost, the algorithm was incorporated into the OCR pipeline of the National Library of Luxembourg.
arXiv Detail & Related papers (2021-03-16T09:06:25Z)
- Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach [34.63444886780274]
Text segmentation is a prerequisite in real-world text-related tasks.
We introduce Text Refinement Network (TexRNet), a novel text segmentation approach.
TexRNet consistently improves text segmentation performance by nearly 2% compared to other state-of-the-art segmentation methods.
arXiv Detail & Related papers (2020-11-27T22:50:09Z)
- DGST: Discriminator Guided Scene Text detector [11.817428636084305]
This paper proposes a detector framework based on conditional generative adversarial networks to improve the segmentation quality of scene text detection.
Experiments on standard datasets demonstrate that the proposed DGST brings noticeable gains and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-02-28T01:47:36Z)
- Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)
- TextScanner: Reading Characters in Order for Robust Scene Text Recognition [60.04267660533966]
TextScanner is an alternative approach for scene text recognition.
It generates pixel-wise, multi-channel segmentation maps for character class, position and order.
It also adopts an RNN for context modeling and performs parallel prediction of character position and class.
arXiv Detail & Related papers (2019-12-28T07:52:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.