An ensemble-based framework for mispronunciation detection of Arabic
phonemes
- URL: http://arxiv.org/abs/2301.01378v1
- Date: Tue, 3 Jan 2023 22:17:08 GMT
- Title: An ensemble-based framework for mispronunciation detection of Arabic
phonemes
- Authors: Sukru Selim Calik, Ayhan Kucukmanisa, Zeynep Hilal Kilimci
- Abstract summary: This work introduces an ensemble model that detects the mispronunciation of Arabic phonemes.
Experimental results show that a voting ensemble combined with Mel spectrogram feature extraction achieves a classification accuracy of 95.9%.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Computer-assisted language learning (CALL) systems detect
mispronunciations and provide feedback to users. In this work, we introduce an
ensemble model that detects the mispronunciation of Arabic phonemes and
effectively assists the learning of Arabic. To the best of our knowledge, this
is the first attempt to detect mispronunciations of Arabic phonemes by
comprehensively employing ensemble learning techniques and conventional machine
learning models. To observe the effect of feature extraction techniques,
mel-frequency cepstral coefficients (MFCC) and Mel spectrograms are each
combined with every learning algorithm. To evaluate the proposed model, 29
Arabic letters are voiced by a total of 11 different people, 8 of whom are
hafiz. The data set has been augmented by adding noise, time shifting, time
stretching, and pitch shifting. Extensive experimental results demonstrate that
a voting classifier used as the ensemble algorithm with Mel spectrogram feature
extraction achieves a classification accuracy of 95.9%.
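The abstract names two ideas that a short example can make concrete: augmenting recordings (adding noise, time shifting) and combining several classifiers by voting. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the noise level, the circular form of the shift, and the use of hard (majority) voting are all hypothetical choices made for illustration; the abstract does not specify them. Time stretching and pitch shifting are left out because they need a signal-processing library rather than a few lines of plain Python.

```python
import random
from collections import Counter


def add_noise(signal, noise_std=0.005, seed=0):
    """Return a copy of the signal with zero-mean Gaussian noise added.

    noise_std and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_std) for s in signal]


def time_shift(signal, shift):
    """Circularly shift the signal by `shift` samples (positive = later)."""
    if not signal:
        return []
    k = shift % len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


def hard_vote(predictions):
    """Majority vote over per-model label lists.

    `predictions` holds one list of labels per model, all of equal length.
    On a tie, the label predicted by the earliest model in the list wins
    (an assumed tie-breaking rule).
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        voted.append(next(label for label in labels if counts[label] == best))
    return voted


if __name__ == "__main__":
    # Toy waveform and toy predictions; real inputs would be audio samples
    # and the outputs of classifiers trained on MFCC or Mel spectrograms.
    wave = [0.0, 0.1, 0.2, 0.1]
    print(time_shift(wave, 1))
    print(len(add_noise(wave)))
    model_outputs = [
        ["correct", "wrong", "wrong"],
        ["correct", "correct", "wrong"],
        ["wrong", "correct", "wrong"],
    ]
    print(hard_vote(model_outputs))
```

In this sketch each augmented copy would be passed through the same feature extraction as the original recording, and the voted label is whichever class most base models agree on. A soft-voting variant that averages class probabilities is another reasonable reading of "voting classifier".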
Related papers
- Strategies for Arabic Readability Modeling [9.976720880041688]
Automatic readability assessment is relevant to building NLP applications for education, content analysis, and accessibility.
We present a set of experimental results on Arabic readability assessment using a diverse range of approaches.
arXiv Detail & Related papers (2024-07-03T11:54:11Z)
- SLICER: Learning universal audio representations using low-resource self-supervised pre-training [53.06337011259031]
We present a new Self-Supervised Learning approach to pre-train encoders on unlabeled audio data.
Our primary aim is to learn audio representations that can generalize across a large variety of speech and non-speech tasks.
arXiv Detail & Related papers (2022-11-02T23:45:33Z)
- Pronunciation Generation for Foreign Language Words in Intra-Sentential Code-Switching Speech Recognition [14.024346215923972]
Code-Switching refers to the phenomenon of switching languages within a sentence or discourse.
In this paper, we make use of limited code-switching data as driving materials and explore a shortcut to quickly develop intra-sentential code-switching recognition skill.
arXiv Detail & Related papers (2022-10-26T13:19:35Z)
- Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information [55.75018546938499]
We propose the speaker embedding-aware neural diarization (SEND) method, which predicts the power set encoded labels.
Our method achieves lower diarization error rate than the target-speaker voice activity detection.
arXiv Detail & Related papers (2021-11-28T12:51:04Z)
- Efficient Measuring of Readability to Improve Documents Accessibility for Arabic Language Learners [0.0]
The approach is based on machine learning classification methods to discriminate between different levels of difficulty in reading and understanding a text.
Several models were trained on a large corpus mined from online Arabic websites and manually annotated.
Best results were achieved using TF-IDF Vectors trained by a combination of word-based unigrams and bigrams with an overall accuracy of 87.14% over four classes of complexity.
arXiv Detail & Related papers (2021-09-09T10:05:38Z)
- Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification [81.80311855996584]
We propose a novel intent classification framework that employs acoustic features extracted from a pretrained speech recognition system and linguistic features learned from a pretrained language model.
We achieve 90.86% and 99.07% accuracy on the ATIS and Fluent Speech corpora, respectively.
arXiv Detail & Related papers (2021-02-15T07:20:06Z)
- UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data [54.733889961024445]
We propose a unified pre-training approach called UniSpeech to learn speech representations with both unlabeled and labeled data.
We evaluate the effectiveness of UniSpeech for cross-lingual representation learning on the public CommonVoice corpus.
arXiv Detail & Related papers (2021-01-19T12:53:43Z)
- Multitask Training with Text Data for End-to-End Speech Recognition [45.35605825009208]
We propose a multitask training method for attention-based end-to-end speech recognition models.
We regularize the decoder in a listen, attend, and spell model by multitask training it on both audio-text and text-only data.
arXiv Detail & Related papers (2020-10-27T14:29:28Z)
- Arabic Offensive Language Detection Using Machine Learning and Ensemble Machine Learning Approaches [0.0]
The study shows that an ensemble machine learning approach significantly outperforms a single-learner machine learning approach.
Among the trained ensemble machine learning classifiers, bagging performs best in offensive language detection, with an F1 score of 88%.
arXiv Detail & Related papers (2020-05-16T06:40:36Z)
- Towards Zero-shot Learning for Automatic Phonemic Transcription [82.9910512414173]
A more challenging problem is to build phonemic transcribers for languages with zero training data.
Our model is able to recognize unseen phonemes in the target language without any training data.
It achieves 7.7% better phoneme error rate on average over a standard multilingual model.
arXiv Detail & Related papers (2020-02-26T20:38:42Z)