PSSTRNet: Progressive Segmentation-guided Scene Text Removal Network
- URL: http://arxiv.org/abs/2306.07842v1
- Date: Tue, 13 Jun 2023 15:20:37 GMT
- Title: PSSTRNet: Progressive Segmentation-guided Scene Text Removal Network
- Authors: Guangtao Lyu, Anna Zhu
- Abstract summary: Scene text removal (STR) is a challenging task due to the complex text fonts, colors, sizes, and background textures in scene images.
We propose a Progressive Segmentation-guided Scene Text Removal Network (PSSTRNet) to remove the text in the image iteratively.
- Score: 1.7259824817932292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene text removal (STR) is a challenging task due to the complex text fonts,
colors, sizes, and background textures in scene images. However, most previous
methods learn both text location and background inpainting implicitly within a
single network, which weakens the text localization mechanism and makes a lossy
background. To tackle these problems, we propose a simple Progressive
Segmentation-guided Scene Text Removal Network (PSSTRNet) to remove the text in
the image iteratively. It contains two decoder branches, a text segmentation
branch, and a text removal branch, with a shared encoder. The text segmentation
branch generates text mask maps as the guidance for the regional removal
branch. In each iteration, the original image, previous text removal result,
and text mask are input to the network to extract the remaining text segments
and produce a cleaner text removal result. To get a more accurate text mask map,
an update module is developed to merge the mask map in the current and previous
stages. The final text removal result is obtained by adaptive fusion of results
from all previous stages. Extensive experiments and ablation studies conducted
on real and synthetic public datasets demonstrate that our proposed method
achieves state-of-the-art performance. The source code of our
work is available at:
\href{https://github.com/GuangtaoLyu/PSSTRNet}{https://github.com/GuangtaoLyu/PSSTRNet}.
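The iterative scheme described in the abstract can be illustrated with a minimal, purely illustrative sketch. All function bodies below are stand-ins, not the paper's actual networks, and names such as `seg_branch`, `removal_branch`, and `psstrnet_iterate` are hypothetical; the sketch only shows the data flow: each stage takes the original image, the previous removal result, and the current mask; the mask-update step merges the current and previous stage masks; and the final output is an adaptive (softmax-weighted) fusion of all stage results.

```python
import numpy as np

rng = np.random.default_rng(0)

def seg_branch(image, result, mask):
    # Stand-in for the text-segmentation decoder: returns a soft text mask.
    return np.clip(mask + 0.3 * rng.random(mask.shape), 0.0, 1.0)

def removal_branch(image, result, mask):
    # Stand-in for the text-removal decoder: returns an inpainted candidate.
    return np.clip(result - 0.2 * mask[..., None], 0.0, 1.0)

def psstrnet_iterate(image, n_iters=3):
    """Iterative removal: image is an HxWx3 array with values in [0, 1]."""
    result = image.copy()
    mask = np.zeros(image.shape[:2])
    results, scores = [], []
    for _ in range(n_iters):
        # Each stage sees the original image, previous result, and mask.
        new_mask = seg_branch(image, result, mask)
        # Mask update: merge the current and previous stage masks (union here).
        mask = np.maximum(mask, new_mask)
        raw = removal_branch(image, result, mask)
        # Regional removal: only pixels inside the text mask are modified,
        # so text-free areas keep the original image content.
        result = mask[..., None] * raw + (1.0 - mask[..., None]) * image
        results.append(result)
        scores.append(result.mean())  # stand-in per-stage fusion score
    # Adaptive fusion: softmax-weighted combination of all stage outputs.
    w = np.exp(scores) / np.exp(scores).sum()
    return sum(wi * ri for wi, ri in zip(w, results))
```

Because every stage output is a convex combination of in-range images, the fused result stays in [0, 1]; in the actual model the fusion weights and both branch outputs would be learned rather than hand-coded.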
Related papers
- TextDestroyer: A Training- and Annotation-Free Diffusion Method for Destroying Anomal Text from Images [84.08181780666698]
TextDestroyer is the first training- and annotation-free method for scene text destruction.
Our method scrambles text areas in the latent start code using a Gaussian distribution before reconstruction.
The advantages of TextDestroyer include: (1) it eliminates labor-intensive data annotation and resource-intensive training; (2) it achieves more thorough text destruction, preventing recognizable traces; and (3) it demonstrates better generalization capabilities, performing well on both real-world scenes and generated images.
arXiv Detail & Related papers (2024-11-01T04:41:00Z)
- DeepEraser: Deep Iterative Context Mining for Generic Text Eraser [103.39279154750172]
DeepEraser is a recurrent architecture that erases the text in an image via iterative operations.
DeepEraser is notably compact with only 1.4M parameters and trained in an end-to-end manner.
arXiv Detail & Related papers (2024-02-29T12:39:04Z)
- Text Augmented Spatial-aware Zero-shot Referring Image Segmentation [60.84423786769453]
We introduce a Text Augmented Spatial-aware (TAS) zero-shot referring image segmentation framework.
TAS incorporates a mask proposal network for instance-level mask extraction, a text-augmented visual-text matching score for mining the image-text correlation, and a spatial rectifier for mask post-processing.
The proposed method clearly outperforms state-of-the-art zero-shot referring image segmentation methods.
arXiv Detail & Related papers (2023-10-27T10:52:50Z)
- ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining [58.241008246380254]
Scene text removal (STR) aims at replacing text strokes in natural scenes with visually coherent backgrounds.
Recent STR approaches rely on iterative refinements or explicit text masks, resulting in high complexity and sensitivity to the accuracy of text localization.
We propose a simple-yet-effective ViT-based text eraser, dubbed ViTEraser.
arXiv Detail & Related papers (2023-06-21T08:47:20Z)
- FETNet: Feature Erasing and Transferring Network for Scene Text Removal [14.763369952265796]
Scene text removal (STR) task aims to remove text regions and recover the background smoothly in images for private information protection.
Most existing STR methods adopt encoder-decoder-based CNNs, with direct copies of the features in the skip connections.
We propose a novel Feature Erasing and Transferring (FET) mechanism to reconfigure the encoded features for STR.
arXiv Detail & Related papers (2023-06-16T02:38:30Z)
- Exploring Stroke-Level Modifications for Scene Text Editing [86.33216648792964]
Scene text editing (STE) aims to replace text with the desired one while preserving background and styles of the original text.
Previous methods of editing the whole image have to learn different translation rules of background and text regions simultaneously.
We propose a novel network by MOdifying Scene Text image at strokE Level (MOSTEL).
arXiv Detail & Related papers (2022-12-05T02:10:59Z)
- A Simple and Strong Baseline: Progressively Region-based Scene Text Removal Networks [72.32357172679319]
This paper presents a novel ProgrEssively Region-based scene Text eraser (PERT).
PERT decomposes the STR task to several erasing stages.
PERT introduces a region-based modification strategy to ensure the integrity of text-free areas.
arXiv Detail & Related papers (2021-06-24T14:06:06Z)
- Scene text removal via cascaded text stroke detection and erasing [19.306751704904705]
Recent learning-based approaches show promising performance improvement for the scene text removal task.
We propose a novel "end-to-end" framework based on accurate text stroke detection.
arXiv Detail & Related papers (2020-11-19T11:05:13Z)
- SwapText: Image Based Texts Transfer in Scenes [13.475726959175057]
We present SwapText, a framework to transfer texts across scene images.
A novel text swapping network is proposed to replace text labels only in the foreground image.
The generated foreground image and background image are used to generate the word image by the fusion network.
arXiv Detail & Related papers (2020-03-18T11:02:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.