A Novel Framework For Text Detection From Natural Scene Images With Complex Background
- URL: http://arxiv.org/abs/2409.09635v1
- Date: Sun, 15 Sep 2024 07:12:33 GMT
- Title: A Novel Framework For Text Detection From Natural Scene Images With Complex Background
- Authors: Basavaraj Kaladagi, Jagadeesh Pujari,
- Abstract summary: We propose a novel and efficient method to detect text region from images with complex background using Wavelet Transforms.
The framework uses Wavelet Transformation of the original image in its grayscale form followed by Sub-band filtering.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recognizing texts from camera images is a known hard problem because of the difficulties in text detection from the varied and complicated background. In this paper we propose a novel and efficient method to detect text region from images with complex background using Wavelet Transforms. The framework uses Wavelet Transformation of the original image in its grayscale form followed by Sub-band filtering. Then Region clustering technique is applied using centroids of the regions, further Bounding box is fitted to each region thus identifying the text regions. This method is much sophisticated and efficient than the previous methods as it doesn't stick to a particular font size of the text thus, making it generalized. The sample set used for experimental purpose consists of 50 images with varying backgrounds. Images with edge prominence are considered. Furthermore, our method can be easily customized for applications with different scopes.
Related papers
- Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model [31.819060415422353]
Diff-Text is a training-free scene text generation framework for any language.
Our method outperforms the existing method in both the accuracy of text recognition and the naturalness of foreground-background blending.
arXiv Detail & Related papers (2023-12-19T15:18:40Z) - Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using
Diffusion Models [63.99110667987318]
We present DiffText, a pipeline that seamlessly blends foreground text with the background's intrinsic features.
With fewer text instances, our produced text images consistently surpass other synthetic data in aiding text detectors.
arXiv Detail & Related papers (2023-11-28T06:51:28Z) - Text-Driven Image Editing via Learnable Regions [74.45313434129005]
We introduce a method for region-based image editing driven by textual prompts, without the need for user-provided masks or sketches.
We show that this simple approach enables flexible editing that is compatible with current image generation models.
Experiments demonstrate the competitive performance of our method in manipulating images with high fidelity and realism that correspond to the provided language descriptions.
arXiv Detail & Related papers (2023-11-28T02:27:31Z) - Towards Robust Scene Text Image Super-resolution via Explicit Location
Enhancement [59.66539728681453]
Scene text image super-resolution (STISR) aims to improve image quality while boosting downstream scene text recognition accuracy.
Most existing methods treat the foreground (character regions) and background (non-character regions) equally in the forward process.
We propose a novel method LEMMA that explicitly models character regions to produce high-level text-specific guidance for super-resolution.
arXiv Detail & Related papers (2023-07-19T05:08:47Z) - Expressive Text-to-Image Generation with Rich Text [42.923053338525804]
We propose a rich-text editor supporting formats such as font style, size, color, and footnote.
We extract each word's attributes from rich text to enable local style control, explicit token reweighting, precise color rendering, and detailed region synthesis.
arXiv Detail & Related papers (2023-04-13T17:59:55Z) - Exploring Stroke-Level Modifications for Scene Text Editing [86.33216648792964]
Scene text editing (STE) aims to replace text with the desired one while preserving background and styles of the original text.
Previous methods of editing the whole image have to learn different translation rules of background and text regions simultaneously.
We propose a novel network by MOdifying Scene Text image at strokE Level (MOSTEL)
arXiv Detail & Related papers (2022-12-05T02:10:59Z) - SpaText: Spatio-Textual Representation for Controllable Image Generation [61.89548017729586]
SpaText is a new method for text-to-image generation using open-vocabulary scene control.
In addition to a global text prompt that describes the entire scene, the user provides a segmentation map.
We show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-conditional-based.
arXiv Detail & Related papers (2022-11-25T18:59:10Z) - Aggregated Text Transformer for Scene Text Detection [5.387121933662753]
We present the Aggregated Text TRansformer(ATTR), which is designed to represent texts in scene images with a multi-scale self-attention mechanism.
The multi-scale image representations are robust and contain rich information on text contents of various sizes.
The proposed method detects scene texts by representing each text instance as an individual binary mask, which is tolerant of curve texts and regions with dense instances.
arXiv Detail & Related papers (2022-11-25T09:47:34Z) - Scene Text Retrieval via Joint Text Detection and Similarity Learning [68.24531728554892]
Scene text retrieval aims to localize and search all text instances from an image gallery, which are the same or similar to a given query text.
We address this problem by directly learning a cross-modal similarity between a query text and each text instance from natural images.
In this way, scene text retrieval can be simply performed by ranking the detected text instances with the learned similarity.
arXiv Detail & Related papers (2021-04-04T07:18:38Z) - SwapText: Image Based Texts Transfer in Scenes [13.475726959175057]
We present SwapText, a framework to transfer texts across scene images.
A novel text swapping network is proposed to replace text labels only in the foreground image.
The generated foreground image and background image are used to generate the word image by the fusion network.
arXiv Detail & Related papers (2020-03-18T11:02:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.