A Simple and Strong Baseline: Progressively Region-based Scene Text
Removal Networks
- URL: http://arxiv.org/abs/2106.13029v1
- Date: Thu, 24 Jun 2021 14:06:06 GMT
- Title: A Simple and Strong Baseline: Progressively Region-based Scene Text
Removal Networks
- Authors: Yuxin Wang, Hongtao Xie, Shancheng Fang, Yadong Qu and Yongdong Zhang
- Abstract summary: This paper presents a novel ProgrEssively Region-based scene Text eraser (PERT)
PERT decomposes the STR task into several erasing stages.
PERT introduces a region-based modification strategy to ensure the integrity of text-free areas.
- Score: 72.32357172679319
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing scene text removal methods mainly train an elaborate
network with paired images to perform text localization and background
reconstruction simultaneously, but two problems remain: 1) incomplete erasure
of text regions and 2) excessive erasure of text-free areas. To handle these
issues, this paper presents a novel ProgrEssively Region-based scene Text
eraser (PERT), which introduces a region-based modification strategy to
progressively erase pixels only within text regions. First, PERT decomposes
the STR task into several erasing stages. Because each stage takes a further
step toward the text-removed image rather than directly regressing to the
final result, the decomposition reduces the learning difficulty of each stage,
and an exhaustive erasure result is obtained by iterating over lightweight
erasing blocks with shared parameters. Then, PERT introduces a region-based
modification strategy that preserves the integrity of text-free areas by
decoupling text localization from the erasure process to guide the removal.
Benefiting from its simple architecture, PERT is a simple and strong baseline
that is easy to follow and extend. Extensive experiments demonstrate that PERT
achieves state-of-the-art results on both synthetic and real-world datasets.
Code is available at https://github.com/wangyuxin87/PERT.
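The abstract above describes two mechanisms: a lightweight erasing block with shared parameters that is iterated over several stages, and a decoupled text-localization branch whose mask restricts modification to text regions. The following is a minimal, hypothetical PyTorch sketch of that idea, not the released PERT code; the block design, the mask head, and the number of stages are illustrative assumptions.

```python
# Minimal sketch of progressive, region-based text erasing (illustrative only;
# not the official PERT implementation).
import torch
import torch.nn as nn


class ErasingBlock(nn.Module):
    """Lightweight block that maps the current estimate toward a text-free image."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.body(x)


class ProgressiveEraser(nn.Module):
    """Iterates one shared-parameter erasing block over several stages and only
    modifies pixels inside a predicted text mask (hypothetical structure)."""

    def __init__(self, num_stages: int = 3):
        super().__init__()
        self.block = ErasingBlock()  # shared parameters across all stages
        self.num_stages = num_stages
        # Assumed text-localization head producing a soft text mask in [0, 1].
        self.localizer = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, image):
        mask = self.localizer(image)  # localization decoupled from erasure
        x = image
        for _ in range(self.num_stages):  # progressive erasure
            erased = self.block(x)
            # Region-based modification: text-free pixels are copied from the input.
            x = mask * erased + (1.0 - mask) * image
        return x, mask


if __name__ == "__main__":
    net = ProgressiveEraser()
    out, mask = net(torch.randn(1, 3, 64, 64))
    print(out.shape, mask.shape)
```

At every stage the output is recomposed as mask * erased + (1 - mask) * input, so pixels outside the predicted text region stay identical to the original image, which is the region-based modification strategy the abstract refers to.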
Related papers
- TextDestroyer: A Training- and Annotation-Free Diffusion Method for Destroying Anomal Text from Images [84.08181780666698]
TextDestroyer is the first training- and annotation-free method for scene text destruction.
Our method scrambles text areas in the latent start code using a Gaussian distribution before reconstruction.
The advantages of TextDestroyer include: (1) it eliminates labor-intensive data annotation and resource-intensive training; (2) it achieves more thorough text destruction, preventing recognizable traces; and (3) it demonstrates better generalization capabilities, performing well on both real-world scenes and generated images.
arXiv Detail & Related papers (2024-11-01T04:41:00Z)
- Leveraging Text Localization for Scene Text Removal via Text-aware Masked Image Modeling [44.70973195966149]
The existing scene text removal (STR) task suffers from insufficient training data due to expensive pixel-level labeling.
We introduce a Text-aware Masked Image Modeling algorithm (TMIM), which can pretrain STR models with low-cost text detection labels.
Our method outperforms other pretraining methods and achieves state-of-the-art performance (37.35 PSNR on SCUT-EnsText).
arXiv Detail & Related papers (2024-09-20T11:52:57Z)
- PSSTRNet: Progressive Segmentation-guided Scene Text Removal Network [1.7259824817932292]
Scene text removal (STR) is a challenging task due to the complex text fonts, colors, sizes, and background textures in scene images.
We propose a Progressive Segmentation-guided Scene Text Removal Network (PSSTRNet) to remove the text in an image iteratively.
arXiv Detail & Related papers (2023-06-13T15:20:37Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- Exploring Stroke-Level Modifications for Scene Text Editing [86.33216648792964]
Scene text editing (STE) aims to replace text with the desired one while preserving background and styles of the original text.
Previous methods of editing the whole image have to learn different translation rules of background and text regions simultaneously.
We propose a novel network, MOSTEL, for MOdifying Scene Text images at the strokE Level.
arXiv Detail & Related papers (2022-12-05T02:10:59Z)
- Stroke-Based Scene Text Erasing Using Synthetic Data [0.0]
Scene text erasing can replace text regions with reasonable content in natural images.
The lack of a large-scale real-world scene text removal dataset prevents existing methods from working at full strength.
We enhance and make full use of the synthetic text and consequently train our model only on the dataset generated by the improved synthetic text engine.
This model can partially erase text instances in a scene image with a bounding box provided or work with an existing scene text detector for automatic scene text erasing.
arXiv Detail & Related papers (2021-04-23T09:29:41Z)
- Scene text removal via cascaded text stroke detection and erasing [19.306751704904705]
Recent learning-based approaches show promising performance improvements on the scene text removal task.
We propose a novel "end-to-end" framework based on accurate text stroke detection.
arXiv Detail & Related papers (2020-11-19T11:05:13Z)
- ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection [147.10751375922035]
We propose ContourNet, which effectively handles false positives and the large scale variance of scene texts.
Our method suppresses these false positives by only outputting predictions with high response values in both directions.
arXiv Detail & Related papers (2020-04-10T08:15:23Z)
- Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [49.768327669098674]
We propose an end-to-end trainable text spotting approach named Text Perceptron.
It first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information.
Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies.
arXiv Detail & Related papers (2020-02-17T08:07:19Z)