Related papers: DeepEraser: Deep Iterative Context Mining for Generic Text Eraser

URL: http://arxiv.org/abs/2402.19108v1
Date: Thu, 29 Feb 2024 12:39:04 GMT
Title: DeepEraser: Deep Iterative Context Mining for Generic Text Eraser
Authors: Hao Feng, Wendi Wang, Shaokai Liu, Jiajun Deng, Wengang Zhou, Houqiang Li
Abstract summary: DeepEraser is a recurrent architecture that erases the text in an image via iterative operations. DeepEraser is notably compact with only 1.4M parameters and trained in an end-to-end manner.
Score: 103.39279154750172
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this work, we present DeepEraser, an effective deep network for generic text removal. DeepEraser utilizes a recurrent architecture that erases the text in an image via iterative operations. Our idea comes from the process of erasing pencil script, where the text area designated for removal is subject to continuous monitoring and the text is attenuated progressively, ensuring a thorough and clean erasure. Technically, at each iteration, an innovative erasing module is deployed, which not only explicitly aggregates the previous erasing progress but also mines additional semantic context to erase the target text. Through iterative refinements, the text regions are progressively replaced with more appropriate content and finally converge to a relatively accurate status. Furthermore, a custom mask generation strategy is introduced to improve the capability of DeepEraser for adaptive text removal, as opposed to indiscriminately removing all the text in an image. Our DeepEraser is notably compact with only 1.4M parameters and trained in an end-to-end manner. To verify its effectiveness, extensive experiments are conducted on several prevalent benchmarks, including SCUT-Syn, SCUT-EnsText, and Oxford Synthetic text dataset. The quantitative and qualitative results demonstrate the effectiveness of our DeepEraser over the state-of-the-art methods, as well as its strong generalization ability in custom mask text removal. The codes and pre-trained models are available at https://github.com/fh2019ustc/DeepEraser

Related papers

Knowing Where to Focus: Attention-Guided Alignment for Text-based Person Search [64.15205542003056]
We introduce the Attention-Guided Alignment (AGA) framework, featuring two innovative components: Attention-Guided Mask (AGM) Modeling and Text Enrichment Module (TEM). AGA achieves new state-of-the-art results with Rank-1 accuracy reaching 78.36%, 67.31%, and 67.4% on CUHK-PEDES, ICFG-PEDES, and RSTP, respectively.
arXiv Detail & Related papers (2024-12-19T17:51:49Z)
TextDestroyer: A Training- and Annotation-Free Diffusion Method for Destroying Anomal Text from Images [84.08181780666698]
TextDestroyer is the first training- and annotation-free method for scene text destruction. Our method scrambles text areas in the latent start code using a Gaussian distribution before reconstruction. The advantages of TextDestroyer include: (1) it eliminates labor-intensive data annotation and resource-intensive training; (2) it achieves more thorough text destruction, preventing recognizable traces; and (3) it demonstrates better generalization capabilities, performing well on both real-world scenes and generated images.
arXiv Detail & Related papers (2024-11-01T04:41:00Z)
Copy Is All You Need [66.00852205068327]
We formulate text generation as progressively copying text segments from an existing text collection. Our approach achieves better generation quality according to both automatic and human evaluations. Our approach attains additional performance gains by simply scaling up to larger text collections.
arXiv Detail & Related papers (2023-07-13T05:03:26Z)
TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture. TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling. It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
Progressive Scene Text Erasing with Self-Supervision [7.118419154170154]
Scene text erasing seeks to erase text contents from scene images. Current state-of-the-art text erasing models are trained on large-scale synthetic data. We employ self-supervision for feature representation on unlabeled real-world scene text images.
arXiv Detail & Related papers (2022-07-23T09:05:13Z)
Text-DIAE: Degradation Invariant Autoencoders for Text Recognition and Document Enhancement [8.428866479825736]
Text-DIAE aims to solve two tasks, text recognition (handwritten or scene-text) and document image enhancement. We define three pretext tasks as learning objectives to be optimized during pre-training without the usage of labelled data. Our method surpasses the state-of-the-art significantly in existing supervised and self-supervised settings.
arXiv Detail & Related papers (2022-03-09T15:44:36Z)
A Simple and Strong Baseline: Progressively Region-based Scene Text Removal Networks [72.32357172679319]
This paper presents a novel ProgrEssively Region-based scene Text eraser (PERT). PERT decomposes the STR task into several erasing stages. PERT introduces a region-based modification strategy to ensure the integrity of text-free areas.
arXiv Detail & Related papers (2021-06-24T14:06:06Z)
Stroke-Based Scene Text Erasing Using Synthetic Data [0.0]
Scene text erasing can replace text regions with reasonable content in natural images. The lack of a large-scale real-world scene-text removal dataset prevents existing methods from working at full strength. We enhance and make full use of the synthetic text and consequently train our model only on the dataset generated by the improved synthetic text engine. This model can partially erase text instances in a scene image with a bounding box provided or work with an existing scene text detector for automatic scene text erasing.
arXiv Detail & Related papers (2021-04-23T09:29:41Z)
MOST: A Multi-Oriented Scene Text Detector with Localization Refinement [67.35280008722255]
We propose a new algorithm for scene text detection, which puts forward a set of strategies to significantly improve the quality of text localization. Specifically, a Text Feature Alignment Module (TFAM) is proposed to dynamically adjust the receptive fields of features. A Position-Aware Non-Maximum Suppression (PA-NMS) module is devised to exclude unreliable detections.
arXiv Detail & Related papers (2021-04-02T14:34:41Z)
Scene text removal via cascaded text stroke detection and erasing [19.306751704904705]
Recent learning-based approaches show promising performance improvements for the scene text removal task. We propose a novel "end-to-end" framework based on accurate text stroke detection.
arXiv Detail & Related papers (2020-11-19T11:05:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.