Text Region Multiple Information Perception Network for Scene Text
Detection
- URL: http://arxiv.org/abs/2401.10017v1
- Date: Thu, 18 Jan 2024 14:36:51 GMT
- Title: Text Region Multiple Information Perception Network for Scene Text
Detection
- Authors: Jinzhi Zheng, Libo Zhang, Yanjun Wu, Chen Zhao
- Abstract summary: This paper proposes a plug-and-play module called the Region Multiple Information Perception Module (RMIPM) to enhance the detection performance of segmentation-based algorithms.
Specifically, we design an improved module that can perceive various types of information about scene text regions, such as text foreground classification maps, distance maps, direction maps, etc.
- Score: 19.574306663095243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Segmentation-based scene text detection algorithms can handle
arbitrary-shape scene texts and have strong robustness and adaptability, so
they have attracted wide attention. Existing segmentation-based scene text
detection algorithms usually segment only the pixels in the center region of
the text, while ignoring other information about the text region, such as edge
information, distance information, etc., which limits their detection accuracy
on scene text. This paper proposes a plug-and-play module called the
Region Multiple Information Perception Module (RMIPM) to enhance the detection
performance of segmentation-based algorithms. Specifically, we design an
improved module that can perceive various types of information about scene text
regions, such as text foreground classification maps, distance maps, direction
maps, etc. Experiments on the MSRA-TD500 and Total-Text datasets show that our
method achieves performance comparable to current state-of-the-art
algorithms.
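The abstract does not describe the internal structure of RMIPM. As a rough, hedged illustration of the idea of predicting several kinds of region information from one shared feature map, the following PyTorch sketch defines a hypothetical multi-branch head with separate outputs for a text foreground classification map, a distance map, and a direction map. The class name, channel sizes, activations, and layer choices are assumptions made for illustration, not the authors' implementation.

    import torch
    import torch.nn as nn

    class RegionInfoHead(nn.Module):
        # Hypothetical multi-branch head: a shared conv stem followed by separate
        # 1x1 prediction branches for a text foreground map, a distance map, and
        # a 2-channel direction map (unit-vector components).
        def __init__(self, in_channels=256, mid_channels=64):
            super().__init__()
            self.stem = nn.Sequential(
                nn.Conv2d(in_channels, mid_channels, 3, padding=1),
                nn.BatchNorm2d(mid_channels),
                nn.ReLU(inplace=True),
            )
            self.fg_head = nn.Conv2d(mid_channels, 1, 1)    # text/background probability
            self.dist_head = nn.Conv2d(mid_channels, 1, 1)  # normalized distance to the text boundary
            self.dir_head = nn.Conv2d(mid_channels, 2, 1)   # (cos, sin) direction toward the boundary

        def forward(self, features):
            x = self.stem(features)
            fg = torch.sigmoid(self.fg_head(x))      # values in (0, 1)
            dist = torch.sigmoid(self.dist_head(x))  # values in (0, 1)
            direction = torch.tanh(self.dir_head(x)) # values in (-1, 1)
            return fg, dist, direction

    # Example: a 1/4-resolution backbone/FPN feature map for a 640x640 input image.
    feats = torch.randn(1, 256, 160, 160)
    fg, dist, direction = RegionInfoHead()(feats)
    print(fg.shape, dist.shape, direction.shape)  # (1,1,160,160) (1,1,160,160) (1,2,160,160)

In such a design, the extra maps act as auxiliary supervision: the foreground map drives the main segmentation, while the distance and direction maps encode how far and in which direction each text pixel lies from the text boundary.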
Related papers
- Spotlight Text Detector: Spotlight on Candidate Regions Like a Camera [31.180352896153682]
We propose an effective spotlight text detector (STD) for scene texts.
It consists of a spotlight calibration module (SCM) and a multivariate information extraction module (MIEM).
Our STD is superior to existing state-of-the-art methods on various datasets.
arXiv Detail & Related papers (2024-09-25T11:19:09Z) - Towards Unified Multi-granularity Text Detection with Interactive Attention [56.79437272168507]
"Detect Any Text" is an advanced paradigm that unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model.
A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances.
Tests demonstrate that DAT achieves state-of-the-art performances across a variety of text-related benchmarks.
arXiv Detail & Related papers (2024-05-30T07:25:23Z) - BPDO: Boundary Points Dynamic Optimization for Arbitrary Shape Scene Text
Detection [19.574306663095243]
We propose a novel arbitrary-shape scene text detector based on boundary points dynamic optimization (BPDO).
The model is designed with a text-aware module (TAM) and a boundary point dynamic optimization module (DOM).
Experiments on the CTW-1500, Total-Text, and MSRA-TD500 datasets show that the proposed model achieves performance better than or comparable to state-of-the-art algorithms.
arXiv Detail & Related papers (2024-01-18T14:13:46Z) - Adaptive Segmentation Network for Scene Text Detection [0.0]
We propose to automatically learn the discriminative segmentation threshold, which distinguishes text pixels from background pixels, for segmentation-based scene text detectors.
Besides, we design a Global-information Enhanced Feature Pyramid Network (GE-FPN) to capture text instances with large sizes and extreme aspect ratios.
Finally, together with the proposed threshold learning strategy and text detection structure, we design an Adaptive Segmentation Network (ASNet) for scene text detection.
arXiv Detail & Related papers (2023-07-27T17:37:56Z) - Towards End-to-End Unified Scene Text Detection and Layout Analysis [60.68100769639923]
We introduce the task of unified scene text detection and layout analysis.
The first hierarchical scene text dataset is introduced to enable this novel research task.
We also propose a novel method that is able to simultaneously detect scene text and form text clusters in a unified way.
arXiv Detail & Related papers (2022-03-28T23:35:45Z) - RSCA: Real-time Segmentation-based Context-Aware Scene Text Detection [14.125634725954848]
We propose RSCA: a Real-time Segmentation-based Context-Aware model for arbitrary-shaped scene text detection.
Based on these strategies, RSCA achieves state-of-the-art performance in both speed and accuracy, without complex label assignments or repeated feature aggregations.
arXiv Detail & Related papers (2021-05-26T18:43:17Z) - RayNet: Real-time Scene Arbitrary-shape Text Detection with Multiple
Rays [84.15123599963239]
We propose a novel detection framework for arbitrary-shape text detection, termed RayNet.
RayNet uses a Center Point Set (CPS) and Ray Distances (RD) to fit text: the CPS determines the rough position of the text, and the RD is combined with the CPS to compute Ray Points (RP) that localize the accurate shape of the text (a minimal decoding sketch is given after this list).
RayNet achieves impressive performance on the curved text dataset CTW1500 and the quadrangle text dataset ICDAR2015.
arXiv Detail & Related papers (2021-04-11T03:03:23Z) - Scene Text Retrieval via Joint Text Detection and Similarity Learning [68.24531728554892]
Scene text retrieval aims to localize and search, within an image gallery, all text instances that are the same as or similar to a given query text.
We address this problem by directly learning a cross-modal similarity between a query text and each text instance from natural images.
In this way, scene text retrieval can be simply performed by ranking the detected text instances with the learned similarity.
arXiv Detail & Related papers (2021-04-04T07:18:38Z) - MOST: A Multi-Oriented Scene Text Detector with Localization Refinement [67.35280008722255]
We propose a new algorithm for scene text detection, which puts forward a set of strategies to significantly improve the quality of text localization.
Specifically, a Text Feature Alignment Module (TFAM) is proposed to dynamically adjust the receptive fields of features.
A Position-Aware Non-Maximum Suppression (PA-NMS) module is devised to exclude unreliable detection results.
arXiv Detail & Related papers (2021-04-02T14:34:41Z) - BOTD: Bold Outline Text Detector [85.33700624095181]
We propose a new one-stage text detector, termed Bold Outline Text Detector (BOTD).
BOTD is able to process arbitrary-shaped text with low model complexity.
Experimental results on three real-world benchmarks show the state-of-the-art performance of BOTD.
arXiv Detail & Related papers (2020-11-30T11:54:14Z) - DGST : Discriminator Guided Scene Text detector [11.817428636084305]
This paper proposes a detection framework based on conditional generative adversarial networks to improve the segmentation quality of scene text detection.
Experiments on standard datasets demonstrate that the proposed DGST brings noticeable gains and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-02-28T01:47:36Z)
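As referenced in the RayNet entry above, the following NumPy sketch illustrates the general idea of recovering a text contour from a predicted center point and per-ray distances. The number of rays, the uniform angular spacing, and the function name are assumptions made for illustration; they are not RayNet's actual decoding procedure.

    import numpy as np

    def decode_ray_points(center, ray_distances):
        # Place one ray point per ray at evenly spaced angles around the center
        # point; joining the ray points in angular order gives a closed polygon
        # that approximates the text contour. The uniform-angle layout is an
        # illustrative assumption, not RayNet's exact formulation.
        k = len(ray_distances)
        angles = np.linspace(0.0, 2.0 * np.pi, num=k, endpoint=False)
        unit_rays = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (K, 2)
        return np.asarray(center) + unit_rays * np.asarray(ray_distances, dtype=float)[:, None]

    # Example: 8 rays around a single predicted center point.
    polygon = decode_ray_points(center=(100.0, 60.0),
                                ray_distances=[20, 25, 30, 25, 20, 25, 30, 25])
    print(polygon.shape)  # (8, 2): one (x, y) ray point per ray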