Urdu text in natural scene images: a new dataset and preliminary text
detection
- URL: http://arxiv.org/abs/2109.08060v1
- Date: Thu, 16 Sep 2021 15:41:50 GMT
- Title: Urdu text in natural scene images: a new dataset and preliminary text
detection
- Authors: Hazrat Ali, Khalid Iqbal, Ghulam Mujtaba, Ahmad Fayyaz, Mohammad
Farhad Bulbul, Fazal Wahab Karam and Ali Zahir
- Abstract summary: This work introduces a new dataset for Urdu text in natural scene images.
The dataset comprises 500 standalone images acquired from real scenes.
The MSER method is applied to extract Urdu text regions as candidates in an image.
- Score: 3.070994681743188
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Text detection in natural scene images for content analysis is an interesting
task. The research community has seen some great developments for
English/Mandarin text detection. However, Urdu text extraction in natural scene
images is a task not well addressed. In this work, firstly, a new dataset is
introduced for Urdu text in natural scene images. The dataset comprises 500
standalone images acquired from real scenes. Secondly, the channel-enhanced
Maximally Stable Extremal Region (MSER) method is applied to extract Urdu text
regions as candidates in an image. A two-stage filtering mechanism is applied to
eliminate non-candidate regions. In the first stage, text and noise are
classified based on their geometric properties. In the second stage, a support
vector machine classifier is trained to discard non-text candidate regions.
After this, text candidate regions are linked using centroid-based vertical and
horizontal distances. Text lines are further analyzed by a different classifier
based on HOG features to remove non-text regions. Extensive experimentation is
performed on the locally developed dataset to evaluate the performance. The
experimental results show good performance on test set images. The dataset will
be made available for research use. To the best of our knowledge, the work is
the first of its kind for the Urdu language and would provide a good dataset
for free research use and serve as a baseline performance on the task of Urdu
text extraction.
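
The geometric filtering and centroid-based linking steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Box` type, the function names, and every threshold (`min_area`, `max_aspect`, `max_dx`, `max_dy`) are assumptions chosen for illustration, and the MSER extraction, SVM, and HOG stages are omitted. The input is assumed to be a list of candidate bounding boxes such as an MSER detector might produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one candidate region, in pixels."""
    x: int
    y: int
    w: int
    h: int

    @property
    def centroid(self):
        return (self.x + self.w / 2.0, self.y + self.h / 2.0)


def geometric_filter(boxes, min_area=30, max_aspect=8.0):
    """Drop candidates that are tiny or extremely elongated.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    kept = []
    for b in boxes:
        if b.w <= 0 or b.h <= 0:
            continue
        aspect = max(b.w / b.h, b.h / b.w)
        if b.w * b.h >= min_area and aspect <= max_aspect:
            kept.append(b)
    return kept


def link_into_lines(boxes, max_dx=40.0, max_dy=10.0):
    """Group boxes whose centroids are close both horizontally and vertically.

    Uses a small union-find so that chains of neighbouring boxes end up in
    the same group. The distance limits are hypothetical.
    """
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            (xi, yi), (xj, yj) = boxes[i].centroid, boxes[j].centroid
            if abs(xi - xj) <= max_dx and abs(yi - yj) <= max_dy:
                parent[find(i)] = find(j)

    groups = {}
    for i, b in enumerate(boxes):
        groups.setdefault(find(i), []).append(b)
    # Each group is one candidate text line, ordered by x coordinate.
    return [sorted(g, key=lambda b: b.x) for g in groups.values()]


boxes = [
    Box(10, 10, 20, 20), Box(35, 12, 20, 20), Box(60, 11, 20, 20),  # one row
    Box(10, 100, 20, 20),                                           # separate row
    Box(0, 0, 2, 2),                                                # too small
    Box(0, 50, 200, 5),                                             # too elongated
]
kept = geometric_filter(boxes)
lines = link_into_lines(kept)
```

In this toy input, the two noise-like boxes are rejected by the geometric rules, and the three boxes on the first row are chained into a single group even though the outer two are farther apart than `max_dx`, because each is within range of the middle box.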
Related papers
- KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark [1.5409800688911346]
We introduce the first Khmer scene-text dataset, featuring 1,544 expert-annotated images.
This diverse dataset includes flat text, raised text, poorly illuminated text, distant polygon and partially obscured text.
(2024-10-23T21:04:24Z)
- Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering [50.52792174648067]
This initiative seeks to bridge the gap between textual and visual comprehension.
We propose a new multi-task Urdu scene text dataset comprising over 1000 natural scene images.
We provide fine-grained annotations for text instances, addressing the limitations of previous datasets.
(2024-05-21T06:48:26Z)
- The First Swahili Language Scene Text Detection and Recognition Dataset [55.83178123785643]
There is a significant gap in low-resource languages, especially the Swahili Language.
Swahili is widely spoken in East African countries but is still an under-explored language in scene text recognition.
We propose a comprehensive dataset of Swahili scene text images and evaluate the dataset on different scene text detection and recognition models.
(2024-05-19T03:55:02Z)
- Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models [63.99110667987318]
We present DiffText, a pipeline that seamlessly blends foreground text with the background's intrinsic features.
With fewer text instances, our produced text images consistently surpass other synthetic data in aiding text detectors.
(2023-11-28T06:51:28Z)
- Orientation-Independent Chinese Text Recognition in Scene Images [61.34060587461462]
We make the first attempt to extract orientation-independent visual features by disentangling content and orientation information of text images.
Specifically, we introduce a Character Image Reconstruction Network (CIRN) to recover corresponding printed character images with disentangled content and orientation information.
(2023-09-03T05:30:21Z)
- SpaText: Spatio-Textual Representation for Controllable Image Generation [61.89548017729586]
SpaText is a new method for text-to-image generation using open-vocabulary scene control.
In addition to a global text prompt that describes the entire scene, the user provides a segmentation map.
We show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-conditional-based.
(2022-11-25T18:59:10Z)
- The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis [0.76146285961466]
Scene text removal (STR) is the task of erasing text from natural scene images.
We introduce a simple yet extremely effective Gated Attention (GA) and Region-of-Interest Generation (RoIG) methodology in this paper.
Experimental results on the benchmark dataset show that our method significantly outperforms existing state-of-the-art methods in almost all metrics.
(2022-10-14T03:34:21Z)
- Leveraging machine learning for less developed languages: Progress on Urdu text detection [0.76146285961466]
We present the use of machine learning methods to detect Urdu text in scene images.
To support research on Urdu text, we aim to make the data freely available for research use.
(2022-09-28T12:00:34Z)
- Towards End-to-End Unified Scene Text Detection and Layout Analysis [60.68100769639923]
We introduce the task of unified scene text detection and layout analysis.
The first hierarchical scene text dataset is introduced to enable this novel research task.
We also propose a novel method that is able to simultaneously detect scene text and form text clusters in a unified way.
(2022-03-28T23:35:45Z)