Comprehensive Benchmark Datasets for Amharic Scene Text Detection and
Recognition
- URL: http://arxiv.org/abs/2203.12165v1
- Date: Wed, 23 Mar 2022 03:19:35 GMT
- Authors: Wondimu Dikubab, Dingkang Liang, Minghui Liao, Xiang Bai
- Abstract summary: Ethiopic/Amharic script is one of the oldest African writing systems, which serves at least 23 languages in East Africa.
The Amharic writing system, Abugida, has 282 syllables, 15 punctuation marks, and 20 numerals.
We present the first comprehensive public datasets, named HUST-ART, HUST-AST, ABE, and Tana, for Amharic script detection and recognition in natural scenes.
- Score: 56.048783994698425
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Ethiopic/Amharic script is one of the oldest African writing systems, which
serves at least 23 languages (e.g., Amharic, Tigrinya) in East Africa for more
than 120 million people. The Amharic writing system, Abugida, has 282
syllables, 15 punctuation marks, and 20 numerals. The Amharic syllabic matrix
is derived from 34 base graphemes/consonants by adding up to 12 appropriate
diacritics or vocalic markers to the characters. Syllables that share a
consonant or vocalic marker are likely to be visually similar, which makes
text recognition challenging. In this work, we present the first comprehensive
public datasets, named HUST-ART, HUST-AST, ABE, and Tana, for Amharic script
detection and recognition in natural scenes. We have also conducted
extensive experiments to evaluate the performance of state-of-the-art methods
in detecting and recognizing Amharic scene text on our datasets. The evaluation
results demonstrate the robustness of our datasets for benchmarking and their
potential to promote the development of robust Amharic script detection and
recognition algorithms. Consequently, the outcome will benefit people in East
Africa, including diplomats from several countries and international
communities.
Related papers
- KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark [1.5409800688911346]
We introduce the first Khmer scene-text dataset, featuring 1,544 expert-annotated images.
This diverse dataset includes flat text, raised text, poorly illuminated text, distant polygon and partially obscured text.
arXiv Detail & Related papers (2024-10-23T21:04:24Z)
- Bukva: Russian Sign Language Alphabet [75.42794328290088]
This paper investigates the recognition of the Russian fingerspelling alphabet, also known as the Russian Sign Language (RSL) dactyl.
Dactyl is a component of sign languages where distinct hand movements represent individual letters of a written language.
We provide Bukva, the first full-fledged open-source video dataset for RSL dactyl recognition.
arXiv Detail & Related papers (2024-10-11T09:59:48Z)
- The First Swahili Language Scene Text Detection and Recognition Dataset [55.83178123785643]
There is a significant gap in low-resource languages, especially the Swahili Language.
Swahili is widely spoken in East African countries but is still an under-explored language in scene text recognition.
We propose a comprehensive dataset of Swahili scene text images and evaluate the dataset on different scene text detection and recognition models.
arXiv Detail & Related papers (2024-05-19T03:55:02Z)
- Semantically Corrected Amharic Automatic Speech Recognition [27.569469583183423]
We build a set of ASR tools for Amharic, a language spoken by more than 50 million people in eastern Africa.
We release corrected transcriptions of existing Amharic ASR test datasets, enabling the community to accurately evaluate progress.
We introduce a post-processing approach using a transformer encoder-decoder architecture to organize raw ASR outputs into a grammatically complete and semantically meaningful Amharic sentence.
arXiv Detail & Related papers (2024-04-20T12:08:00Z)
- AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages [45.88640066767242]
Africa is home to over 2,000 languages from more than six language families and has the highest linguistic diversity among all continents.
Yet, there is little NLP research conducted on African languages. Crucial to enabling such research is the availability of high-quality annotated datasets.
In this paper, we introduce AfriSenti, a sentiment analysis benchmark that contains a total of >110,000 tweets in 14 African languages.
arXiv Detail & Related papers (2023-02-17T15:40:12Z)
- ASR2K: Speech Recognition for Around 2000 Languages without Audio [100.41158814934802]
We present a speech recognition pipeline that does not require any audio for the target language.
Our pipeline consists of three components: acoustic, pronunciation, and language models.
We build speech recognition for 1909 languages by combining it with Crubadan: a large endangered languages n-gram database.
arXiv Detail & Related papers (2022-09-06T22:48:29Z)
- Towards Boosting the Accuracy of Non-Latin Scene Text Recognition [27.609596088151644]
Scene-text recognition is remarkably better for Latin languages than for non-Latin languages.
This paper examines the possible reasons for low accuracy by comparing English datasets with non-Latin languages.
arXiv Detail & Related papers (2022-01-10T06:36:43Z)
- Phoneme Recognition through Fine Tuning of Phonetic Representations: a Case Study on Luhya Language Varieties [77.2347265289855]
We focus on phoneme recognition using Allosaurus, a method for multilingual recognition based on phonetic annotation.
To evaluate in a challenging real-world scenario, we curate phone recognition datasets for Bukusu and Saamia, two varieties of the Luhya language cluster of western Kenya and eastern Uganda.
We find that fine-tuning of Allosaurus, even with just 100 utterances, leads to significant improvements in phone error rates.
arXiv Detail & Related papers (2021-04-04T15:07:55Z)
- Arabic Dialect Identification in the Wild [10.010733302895938]
We present QADI, an automatically collected dataset of tweets belonging to a wide range of country-level Arabic dialects.
The resultant dataset contains 540k tweets from 2,525 users who are evenly distributed across 18 Arab countries.
arXiv Detail & Related papers (2020-05-13T19:46:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.