Attention-based Feature Decomposition-Reconstruction Network for Scene Text Detection
- URL: http://arxiv.org/abs/2111.14340v1
- Date: Mon, 29 Nov 2021 06:15:25 GMT
- Title: Attention-based Feature Decomposition-Reconstruction Network for Scene Text Detection
- Authors: Qi Zhao, Yufei Wang, Shuchang Lyu, Lijiang Chen
- Abstract summary: We propose an attention-based feature decomposition-reconstruction network for scene text detection.
We use contextual information and low-level features to enhance the performance of a segmentation-based text detector.
Experiments have been conducted on two public benchmark datasets and results show that our proposed method achieves state-of-the-art performance.
- Score: 20.85468268945721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, scene text detection has been a challenging task. Texts with arbitrary shapes or large aspect ratios are usually hard to detect. Previous segmentation-based methods can describe curved text more accurately but suffer from over-segmentation and text adhesion. In this paper, we propose an attention-based feature decomposition-reconstruction network for scene text detection, which utilizes contextual information and low-level features to enhance the performance of a segmentation-based text detector. In the feature fusion phase, we introduce a cross-level attention module that enriches the contextual information of text by applying an attention mechanism to the fused multi-scale features. In the probability map generation phase, a feature decomposition-reconstruction module is proposed to alleviate the over-segmentation problem for text with large aspect ratios: it decomposes text features according to their frequency characteristics and then reconstructs them by adding low-level features. Experiments conducted on two public benchmark datasets show that our proposed method achieves state-of-the-art performance.
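The abstract describes two components: a cross-level attention module applied to fused multi-scale features, and a feature decomposition-reconstruction module that splits text features by frequency and re-injects low-level features before the probability map is predicted. The paper's code is not reproduced here, so the following is only a minimal PyTorch sketch of how such modules might be wired together; the module names, channel sizes, the channel-attention gate, and the use of average pooling as a low-pass (low-frequency) filter are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (assumptions, not the authors' code):
# - "cross-level attention" is approximated by a channel-attention gate on the
#   fused multi-scale features.
# - "frequency decomposition" is approximated by an average-pooled low-pass
#   branch plus a residual high-pass branch, reconstructed with low-level features.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossLevelAttention(nn.Module):
    """Re-weights fused multi-scale features with a channel-attention gate."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),               # global context per channel
            nn.Conv2d(channels, channels // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        return fused * self.gate(fused)            # attention-weighted features


class FeatureDecompositionReconstruction(nn.Module):
    """Splits features into low/high-frequency parts, then reconstructs them
    while adding low-level features back in."""

    def __init__(self, channels: int):
        super().__init__()
        self.low_proj = nn.Conv2d(channels, channels, 3, padding=1)
        self.high_proj = nn.Conv2d(channels, channels, 3, padding=1)
        self.fuse = nn.Conv2d(channels * 2, channels, 1)

    def forward(self, feat: torch.Tensor, low_level: torch.Tensor) -> torch.Tensor:
        # Low-frequency component: blurred (downsampled then upsampled) copy.
        low_freq = F.interpolate(
            F.avg_pool2d(feat, kernel_size=4), size=feat.shape[-2:],
            mode="bilinear", align_corners=False)
        high_freq = feat - low_freq                # residual = high-frequency detail
        recon = self.fuse(torch.cat([self.low_proj(low_freq),
                                     self.high_proj(high_freq)], dim=1))
        # Re-inject low-level (fine spatial) features before prediction.
        if low_level.shape[-2:] != recon.shape[-2:]:
            low_level = F.interpolate(low_level, size=recon.shape[-2:],
                                      mode="bilinear", align_corners=False)
        return recon + low_level


if __name__ == "__main__":
    fused = torch.randn(1, 256, 160, 160)          # fused FPN features (assumed size)
    low_level = torch.randn(1, 256, 160, 160)      # e.g. projected shallow features
    feat = CrossLevelAttention(256)(fused)
    out = FeatureDecompositionReconstruction(256)(feat, low_level)
    prob_map = torch.sigmoid(nn.Conv2d(256, 1, 1)(out))  # text probability map
    print(prob_map.shape)                          # torch.Size([1, 1, 160, 160])
```

The sketch only illustrates the data flow implied by the abstract (attention over fused features, frequency split, reconstruction with low-level features feeding a probability map); the actual decomposition and attention formulations are defined in the paper itself.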
Related papers
- Leveraging Structure Knowledge and Deep Models for the Detection of Abnormal Handwritten Text [19.05500901000957]
We propose a two-stage detection algorithm that combines structure knowledge and deep models for handwritten text.
A shape regression network trained with a novel semi-supervised contrastive training strategy is introduced, and the positional relationship between characters is fully exploited.
Experiments on two handwritten text datasets show that the proposed method can greatly improve the detection performance.
arXiv Detail & Related papers (2024-10-15T14:57:10Z)
- Towards Unified Multi-granularity Text Detection with Interactive Attention [56.79437272168507]
"Detect Any Text" is an advanced paradigm that unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model.
A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances.
Tests demonstrate that DAT achieves state-of-the-art performances across a variety of text-related benchmarks.
arXiv Detail & Related papers (2024-05-30T07:25:23Z)
- Seeing Text in the Dark: Algorithm and Benchmark [28.865779563872977]
In this work, we propose an efficient and effective single-stage approach for localizing text in the dark.
We introduce a constrained learning module as an auxiliary mechanism during the training stage of the text detector.
We present a comprehensive low-light dataset for arbitrary-shaped text, encompassing diverse scenes and languages.
arXiv Detail & Related papers (2024-04-13T11:07:10Z)
- Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models [63.99110667987318]
We present DiffText, a pipeline that seamlessly blends foreground text with the background's intrinsic features.
With fewer text instances, our produced text images consistently surpass other synthetic data in aiding text detectors.
arXiv Detail & Related papers (2023-11-28T06:51:28Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning [65.57338873921168]
Localizing text instances in natural scenes is regarded as a fundamental challenge in computer vision.
In this work, we quantitatively analyze the sub-text problem and present a simple yet effective design, COntrastive RElation (CORE) module.
We integrate the CORE module into a two-stage text detector of Mask R-CNN and devise our text detector CORE-Text.
arXiv Detail & Related papers (2021-12-14T16:22:25Z)
- On Exploring and Improving Robustness of Scene Text Detection Models [20.15225372544634]
We evaluate scene text detection models on the ICDAR2015-C (IC15-C) and CTW1500-C (CTW-C) benchmarks.
We perform a robustness analysis of six key components: pre-training data, backbone, feature fusion module, multi-scale predictions, representation of text instances and loss function.
We present a simple yet effective data-based method to destroy the smoothness of text regions by merging background and foreground.
arXiv Detail & Related papers (2021-10-12T02:36:48Z)
- MOST: A Multi-Oriented Scene Text Detector with Localization Refinement [67.35280008722255]
We propose a new algorithm for scene text detection, which puts forward a set of strategies to significantly improve the quality of text localization.
Specifically, a Text Feature Alignment Module (TFAM) is proposed to dynamically adjust the receptive fields of features.
A Position-Aware Non-Maximum Suppression (PA-NMS) module is devised to exclude unreliable detection results.
arXiv Detail & Related papers (2021-04-02T14:34:41Z)
- DGST: Discriminator Guided Scene Text detector [11.817428636084305]
This paper proposes a detector framework based on conditional generative adversarial networks to improve the segmentation quality of scene text detection.
Experiments on standard datasets demonstrate that the proposed DGST brings a noticeable gain and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-02-28T01:47:36Z)
- Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)