Leveraging machine learning for less developed languages: Progress on
Urdu text detection
- URL: http://arxiv.org/abs/2209.14022v1
- Date: Wed, 28 Sep 2022 12:00:34 GMT
- Title: Leveraging machine learning for less developed languages: Progress on
Urdu text detection
- Authors: Hazrat Ali
- Abstract summary: We present the use of machine learning methods to detect Urdu text in scene images.
To support research on Urdu text, we aim to make the data freely available for research use.
- Score: 0.76146285961466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text detection in natural scene images has applications in autonomous
driving and in navigation assistance for elderly and blind people. However, research on
Urdu text detection is usually hindered by a lack of data resources. We have
developed a dataset of scene images with Urdu text. We present the use of
machine learning methods to detect Urdu text in scene
images. We extract text regions using a channel-enhanced Maximally Stable
Extremal Regions (MSER) method. First, we classify text and noise based on their
geometric properties. Next, we use a support vector machine for early
discarding of non-text regions. To further remove the non-text regions, we
extract histogram of oriented gradients (HoG) features and train a second SVM
classifier. This improves the overall performance of text region detection
within the scene images. To support research on Urdu text, we aim to make the
data freely available for research use. We also aim to highlight the challenges
and the research gap for Urdu text detection.
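The filtering cascade described in the abstract (geometric checks first, then classifier stages) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `Region` type, the function names, and every threshold value are hypothetical, and the two SVM stages are represented only as pluggable callables rather than trained models.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Region:
    """Hypothetical candidate region: bounding box plus region pixel count."""
    x: int
    y: int
    width: int
    height: int
    pixel_count: int  # number of pixels that belong to the region itself


def passes_geometric_filter(
    region: Region,
    min_pixels: int = 30,           # assumed threshold, not from the paper
    max_aspect_ratio: float = 8.0,  # assumed threshold, not from the paper
    min_extent: float = 0.2,        # assumed threshold, not from the paper
    max_extent: float = 0.9,        # assumed threshold, not from the paper
) -> bool:
    """Return True if the region's geometry is plausible for text."""
    if region.width <= 0 or region.height <= 0:
        return False
    if region.pixel_count < min_pixels:
        return False  # too small: likely noise
    # Symmetric aspect ratio, so very wide and very tall boxes are both rejected.
    ratio = max(region.width / region.height, region.height / region.width)
    if ratio > max_aspect_ratio:
        return False
    # Extent: fraction of the bounding box covered by region pixels.
    extent = region.pixel_count / (region.width * region.height)
    return min_extent <= extent <= max_extent


# Each stage maps a region to keep (True) or discard (False).
Stage = Callable[[Region], bool]


def run_cascade(regions: Iterable[Region], stages: Iterable[Stage]) -> List[Region]:
    """Apply the stages in order, keeping only regions that pass every stage."""
    survivors = list(regions)
    for stage in stages:
        survivors = [r for r in survivors if stage(r)]
    return survivors


if __name__ == "__main__":
    candidates = [
        Region(0, 0, 40, 20, 400),   # plausible text-like region
        Region(0, 0, 200, 5, 600),   # extremely elongated
        Region(0, 0, 3, 3, 5),       # tiny speck
        Region(0, 0, 20, 20, 400),   # completely filled box
    ]
    kept = run_cascade(candidates, [passes_geometric_filter])
    print(len(kept), "of", len(candidates), "candidates kept")
```

In a fuller pipeline, the candidates would come from an MSER detector, and the two classifier stages (an early SVM and a second SVM on HoG features) would be appended to the `stages` list as further callables. Ordering the cheap geometric test first means the more expensive classifiers only see regions that survive it.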
Related papers
- KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark [1.5409800688911346]
We introduce the first Khmer scene-text dataset, featuring 1,544 expert-annotated images.
This diverse dataset includes flat text, raised text, poorly illuminated text, distant polygon and partially obscured text.
arXiv Detail & Related papers (2024-10-23T21:04:24Z)
- Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering [50.52792174648067]
This initiative seeks to bridge the gap between textual and visual comprehension.
We propose a new multi-task Urdu scene text dataset comprising over 1000 natural scene images.
We provide fine-grained annotations for text instances, addressing the limitations of previous datasets.
arXiv Detail & Related papers (2024-05-21T06:48:26Z)
- The First Swahili Language Scene Text Detection and Recognition Dataset [55.83178123785643]
There is a significant gap in low-resource languages, especially the Swahili Language.
Swahili is widely spoken in East African countries but is still an under-explored language in scene text recognition.
We propose a comprehensive dataset of Swahili scene text images and evaluate the dataset on different scene text detection and recognition models.
arXiv Detail & Related papers (2024-05-19T03:55:02Z)
- Efficiently Leveraging Linguistic Priors for Scene Text Spotting [63.22351047545888]
This paper proposes a method that leverages linguistic knowledge from a large text corpus to replace the traditional one-hot encoding used in auto-regressive scene text spotting and recognition models.
We generate text distributions that align well with scene text datasets, removing the need for in-domain fine-tuning.
Experimental results show that our method not only improves recognition accuracy but also enables more accurate localization of words.
arXiv Detail & Related papers (2024-02-27T01:57:09Z)
- Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models [63.99110667987318]
We present DiffText, a pipeline that seamlessly blends foreground text with the background's intrinsic features.
With fewer text instances, our produced text images consistently surpass other synthetic data in aiding text detectors.
arXiv Detail & Related papers (2023-11-28T06:51:28Z)
- Orientation-Independent Chinese Text Recognition in Scene Images [61.34060587461462]
We make the first attempt to extract orientation-independent visual features by disentangling content and orientation information of text images.
Specifically, we introduce a Character Image Reconstruction Network (CIRN) to recover corresponding printed character images with disentangled content and orientation information.
arXiv Detail & Related papers (2023-09-03T05:30:21Z)
- SpaText: Spatio-Textual Representation for Controllable Image Generation [61.89548017729586]
SpaText is a new method for text-to-image generation using open-vocabulary scene control.
In addition to a global text prompt that describes the entire scene, the user provides a segmentation map.
We show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-conditional-based.
arXiv Detail & Related papers (2022-11-25T18:59:10Z)
- Text Detection & Recognition in the Wild for Robot Localization [1.52292571922932]
We propose an end-to-end scene text spotting model that simultaneously outputs the text string and bounding boxes.
Our central contribution is the use of an end-to-end scene text spotting framework to adequately capture irregular and occluded text regions.
arXiv Detail & Related papers (2022-05-17T18:16:34Z)
- Urdu text in natural scene images: a new dataset and preliminary text detection [3.070994681743188]
This work introduces a new dataset for Urdu text in natural scene images.
The dataset comprises 500 standalone images acquired from real scenes.
The MSER method is applied to extract Urdu text regions as candidates in an image.
arXiv Detail & Related papers (2021-09-16T15:41:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.