Quran Recitation Recognition using End-to-End Deep Learning
- URL: http://arxiv.org/abs/2305.07034v1
- Date: Wed, 10 May 2023 18:40:01 GMT
- Title: Quran Recitation Recognition using End-to-End Deep Learning
- Authors: Ahmad Al Harere, Khloud Al Jallad
- Abstract summary: The Quran is the holy scripture of Islam, and its recitation is an important aspect of the religion.
Recognizing the recitation of the Holy Quran automatically is a challenging task due to its unique rules.
We propose a novel end-to-end deep learning model for recognizing the recitation of the Holy Quran.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Quran is the holy scripture of Islam, and its recitation is an important
aspect of the religion. Recognizing the recitation of the Holy Quran
automatically is a challenging task due to its unique rules, which are not
applied in ordinary speech. Much research has been done in this
domain, but previous works have detected recitation errors as a classification
task or used traditional automatic speech recognition (ASR). In this paper, we
propose a novel end-to-end deep learning model for recognizing the recitation
of the Holy Quran. The proposed model is a CNN-Bidirectional GRU encoder that
uses CTC as an objective function, followed by a character-based beam search
decoder. Moreover, all previous work was done on small private
datasets consisting of short verses and a few chapters of the Holy Quran. As a
result of using private datasets, no comparisons were made. To overcome this
issue, we used a public dataset that has recently been published (Ar-DAD) and
contains about 37 chapters that were recited by 30 reciters, with different
recitation speeds and different types of pronunciation rules. The proposed
model's performance was evaluated using the most common evaluation metrics in
speech recognition: word error rate (WER) and character error rate (CER). The
results were 8.34% WER and 2.42% CER. We hope this research will be a baseline
for comparisons with future research on this new public dataset (Ar-DAD).
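The two metrics named in the abstract, WER and CER, are conventionally computed as the Levenshtein (edit) distance between a reference transcript and a hypothesis, divided by the reference length in words or characters. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function names `edit_distance`, `wer`, and `cer` are illustrative, and the choices to split words on whitespace and to count spaces as characters are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # Assumption: spaces are counted as characters.
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical transliterated transcripts, for illustration only.
    reference = "qul huwa allahu ahad"
    hypothesis = "qul huwa allah ahad"
    print(f"WER: {wer(reference, hypothesis):.2%}")
    print(f"CER: {cer(reference, hypothesis):.2%}")
```

Because a single wrong character makes the whole word count as an error, WER is usually higher than CER for the same output, which is consistent with the gap between the two figures reported above.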
Related papers
- Quranic Audio Dataset: Crowdsourced and Labeled Recitation from Non-Arabic Speakers [1.2124551005857038]
This paper addresses the challenge of learning to recite the Quran for non-Arabic speakers.
We use the volunteer-based crowdsourcing genre and implement a crowdsourcing API to gather audio assets.
We have collected around 7000 Quranic recitations from a pool of 1287 participants across more than 11 non-Arabic countries.
arXiv Detail & Related papers (2024-05-04T14:29:05Z)
- Continuously Learning New Words in Automatic Speech Recognition [56.972851337263755]
We propose a self-supervised continual learning approach to recognize new words.
We use a memory-enhanced Automatic Speech Recognition model from previous work.
We show that with this approach, we obtain increasing performance on the new words when they occur more frequently.
arXiv Detail & Related papers (2024-01-09T10:39:17Z)
- Quranic Conversations: Developing a Semantic Search tool for the Quran using Arabic NLP Techniques [0.7673339435080445]
The Holy Book of Quran is believed to be the literal word of God (Allah) as revealed to the Prophet Muhammad (PBUH) over a period of approximately 23 years.
It is challenging for Muslims to get all relevant ayahs (verses) pertaining to a matter or inquiry of interest.
We developed a Quran semantic search tool which finds the verses pertaining to the user inquiry or prompt.
arXiv Detail & Related papers (2023-11-09T03:14:54Z)
- HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
With a reasonable prompt, LLMs can use their generative capability to correct even tokens that are missing from the N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z)
- SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings [76.87664008338317]
Contextual spelling correction models are an alternative to shallow fusion to improve automatic speech recognition.
We propose a novel algorithm for candidate retrieval based on misspelled n-gram mappings.
Experiments on Spoken Wikipedia show 21.4% word error rate improvement compared to a baseline ASR system.
arXiv Detail & Related papers (2023-06-04T10:00:12Z)
- Mispronunciation Detection of Basic Quranic Recitation Rules using Deep Learning [0.0]
In Islam, readers must apply a set of pronunciation rules called Tajweed rules to recite the Quran.
The number of Tajweed teachers is not enough nowadays for daily recitation practice for every Muslim.
We propose a solution that consists of Mel-Frequency Cepstral Coefficient (MFCC) features with Long Short-Term Memory (LSTM) neural networks which use the time series.
arXiv Detail & Related papers (2023-05-10T19:31:25Z)
- A Gold Standard Dataset for the Reviewer Assignment Problem [117.59690218507565]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper.
Our dataset consists of 477 self-reported expertise scores provided by 58 researchers.
For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
- Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model [0.0]
We release contextual biasing lists to accompany the Earnings21 dataset.
We show results for shallow fusion contextual biasing applied to two different decoding algorithms.
We propose an alternate spelling prediction model that improves recall of rare words by 34.7% relative.
arXiv Detail & Related papers (2022-09-02T19:30:16Z)
- DTW at Qur'an QA 2022: Utilising Transfer Learning with Transformers for Question Answering in a Low-resource Domain [10.172732008860539]
Machine reading comprehension has been understudied in several domains, including religious texts.
The goal of the Qur'an QA 2022 shared task is to fill this gap by producing state-of-the-art question answering and reading comprehension research on the Qur'an.
arXiv Detail & Related papers (2022-05-12T11:17:23Z)
- Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information [55.75018546938499]
We propose the speaker embedding-aware neural diarization (SEND) method, which predicts the power set encoded labels.
Our method achieves a lower diarization error rate than target-speaker voice activity detection.
arXiv Detail & Related papers (2021-11-28T12:51:04Z)
- On Addressing Practical Challenges for RNN-Transduce [72.72132048437751]
We adapt a well-trained RNN-T model to a new domain without collecting the audio data.
We obtain word-level confidence scores by utilizing several types of features calculated during decoding.
The proposed time stamping method achieves an average word timing difference of less than 50 ms.
arXiv Detail & Related papers (2021-04-27T23:31:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.