Spell Correction for Azerbaijani Language using Deep Neural Networks
- URL: http://arxiv.org/abs/2102.03218v1
- Date: Fri, 5 Feb 2021 15:02:35 GMT
- Title: Spell Correction for Azerbaijani Language using Deep Neural Networks
- Authors: Ahmad Ahmadzade and Saber Malekzadeh
- Abstract summary: In this paper, a sequence-to-sequence model with an attention mechanism is used to develop spelling correction for Azerbaijani.
A total of 12,000 wrong and correct sentence pairs were used for training, and the model is tested on 1,000 real-world misspelled words.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spell correction is used to detect and correct orthographic mistakes in
texts. Most of the time, traditional dictionary lookup combined with string
similarity methods is suitable for languages with a less complex structure,
such as English. The Azerbaijani language, however, has a richer morphological
structure: many word forms are derived by attaching suffixes and affixes to
stems. Therefore, in this paper a sequence-to-sequence model with an attention
mechanism is used to develop spelling correction for Azerbaijani. A total of
12,000 wrong and correct sentence pairs were used for training, and the model
is tested on 1,000 real-world misspelled words; the F1-score results are 75%
for distance 0, 90% for distance 1, and 96% for distance 2.
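As an illustration of the modeling choice described above, the following is a minimal sketch of a character-level sequence-to-sequence speller with dot-product attention in PyTorch. The vocabulary size, layer widths, and dummy batch are assumptions for illustration only, not the authors' actual architecture or data.

```python
# A minimal, illustrative sketch of a character-level sequence-to-sequence
# speller with dot-product attention. Vocabulary size, layer widths, and the
# dummy batch below are assumed placeholders, not the paper's configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

PAD, SOS, EOS = 0, 1, 2
VOCAB = 64          # assumed size of the Azerbaijani character vocabulary
EMB, HID = 128, 256

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=PAD)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)

    def forward(self, src):                       # src: (B, S) character ids
        out, h = self.rnn(self.emb(src))          # out: (B, S, HID)
        return out, h

class AttnDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB, padding_idx=PAD)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID * 2, VOCAB)

    def forward(self, prev_tok, h, enc_out):      # one decoding step
        e = self.emb(prev_tok).unsqueeze(1)       # (B, 1, EMB)
        dec_out, h = self.rnn(e, h)               # (B, 1, HID)
        # Attend over encoder states with dot-product scores
        scores = torch.bmm(dec_out, enc_out.transpose(1, 2))  # (B, 1, S)
        context = torch.bmm(F.softmax(scores, dim=-1), enc_out)
        logits = self.out(torch.cat([dec_out, context], dim=-1)).squeeze(1)
        return logits, h                          # logits: (B, VOCAB)

# Forward pass on a dummy batch of eight misspelled sequences of length 20.
encoder, decoder = Encoder(), AttnDecoder()
enc_out, h = encoder(torch.randint(3, VOCAB, (8, 20)))
tok = torch.full((8,), SOS, dtype=torch.long)
logits, h = decoder(tok, h, enc_out)              # scores for first output char
```

Training such a model would minimize cross-entropy between the per-character logits and the correct sequence. The distance buckets in the reported F1 scores presumably refer to the Levenshtein distance between a misspelled word and its correction, though the abstract does not spell out the exact evaluation protocol.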
Related papers
- Automatic Real-word Error Correction in Persian Text [0.0]
This paper introduces a cutting-edge approach for precise and efficient real-word error correction in Persian text.
We employ semantic analysis, feature selection, and advanced classifiers to enhance error detection and correction efficacy.
Our method achieves an impressive F-measure of 96.6% in the detection phase and an accuracy of 99.1% in the correction phase.
arXiv Detail & Related papers (2024-07-20T07:50:52Z)
- PERCORE: A Deep Learning-Based Framework for Persian Spelling Correction with Phonetic Analysis [0.0]
This research introduces a state-of-the-art Persian spelling correction system that seamlessly integrates deep learning techniques with phonetic analysis.
Our methodology effectively combines deep contextual analysis with phonetic insights, adeptly correcting both non-word and real-word spelling errors.
A thorough evaluation on a wide-ranging dataset confirms our system's superior performance compared to existing methods.
arXiv Detail & Related papers (2024-07-20T07:41:04Z)
- AraSpell: A Deep Learning Approach for Arabic Spelling Correction [0.0]
"AraSpell" is a framework for Arabic spelling correction using different seq2seq model architectures.
It was trained on more than 6.9 million Arabic sentences.
arXiv Detail & Related papers (2024-05-11T10:36:28Z)
- SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings [76.87664008338317]
Contextual spelling correction models are an alternative to shallow fusion to improve automatic speech recognition.
We propose a novel algorithm for candidate retrieval based on misspelled n-gram mappings.
Experiments on Spoken Wikipedia show 21.4% word error rate improvement compared to a baseline ASR system.
arXiv Detail & Related papers (2023-06-04T10:00:12Z)
- Persian Typographical Error Type Detection Using Deep Neural Networks on Algorithmically-Generated Misspellings [2.2503811834154104]
Typographical Error Type Detection in Persian is a relatively understudied area.
This paper presents a compelling approach for detecting typographical errors in Persian texts.
The outcomes of our final method proved to be highly competitive, achieving an accuracy of 97.62%, precision of 98.83%, recall of 98.61%, and surpassing others in terms of speed.
arXiv Detail & Related papers (2023-05-19T15:05:39Z)
- A transformer-based spelling error correction framework for Bangla and resource scarce Indic languages [2.5874041837241304]
Spelling error correction is the task of identifying and rectifying misspelled words in texts.
Earlier efforts on spelling error correction in Bangla and resource-scarce Indic languages focused on rule-based, statistical, and machine learning-based methods.
We propose a novel detector-purificator-corrector (DPC) based on denoising transformers that addresses these earlier shortcomings.
arXiv Detail & Related papers (2022-11-07T17:59:05Z)
- Improving Pre-trained Language Models with Syntactic Dependency Prediction Task for Chinese Semantic Error Recognition [52.55136323341319]
Existing Chinese text error detection mainly focuses on spelling and simple grammatical errors.
Chinese semantic errors are understudied and so complex that humans cannot easily recognize them.
arXiv Detail & Related papers (2022-04-15T13:55:32Z)
- NeuSpell: A Neural Spelling Correction Toolkit [88.79419580807519]
NeuSpell is an open-source toolkit for spelling correction in English.
It comprises ten different models, and benchmarks them on misspellings from multiple sources.
We train neural models using spelling errors in context, synthetically constructed by reverse engineering isolated misspellings.
arXiv Detail & Related papers (2020-10-21T15:53:29Z)
- On the Robustness of Language Encoders against Grammatical Errors [66.05648604987479]
We collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data.
Results confirm that the performance of all tested models is affected but the degree of impact varies.
arXiv Detail & Related papers (2020-05-12T11:01:44Z)
- Phonotactic Complexity and its Trade-offs [73.10961848460613]
This simple measure, bits per phoneme, allows us to compare entropy across languages.
We demonstrate a very strong negative correlation of -0.74 between bits per phoneme and the average length of words.
arXiv Detail & Related papers (2020-05-07T21:36:59Z)
- Towards Zero-shot Learning for Automatic Phonemic Transcription [82.9910512414173]
A more challenging problem is to build phonemic transcribers for languages with zero training data.
Our model is able to recognize unseen phonemes in the target language without any training data.
It achieves 7.7% better phoneme error rate on average over a standard multilingual model.
arXiv Detail & Related papers (2020-02-26T20:38:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.