Handwritten Arabic Character Recognition for Children Writing Using
Convolutional Neural Network and Stroke Identification
- URL: http://arxiv.org/abs/2211.02119v1
- Date: Thu, 3 Nov 2022 19:48:11 GMT
- Title: Handwritten Arabic Character Recognition for Children Writing Using
Convolutional Neural Network and Stroke Identification
- Authors: Mais Alheraki, Rawan Al-Matham and Hend Al-Khalifa
- Abstract summary: We propose a convolutional neural network (CNN) model that recognizes children's handwriting with an accuracy of 91% on the Hijja dataset.
We propose a new approach that uses multiple models, selected by the number of strokes in a character, instead of a single model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic Arabic handwriting recognition is one of the recently studied
problems in the field of Machine Learning. Unlike Latin languages, Arabic is a
Semitic language that poses a harder challenge, especially given the variability of
patterns caused by factors such as writer age. Most studies have focused on
adults, with only one recent study on children. Moreover, many recent
Machine Learning methods have focused on Convolutional Neural Networks, a
powerful class of neural networks that can extract complex features from
images. In this paper we propose a convolutional neural network (CNN) model
that recognizes children's handwriting with an accuracy of 91% on the Hijja
dataset, a recent dataset built by collecting images of Arabic characters
written by children, and 97% on the Arabic Handwritten Character Dataset (AHCD). The
results show a good improvement over the model proposed by the Hijja
dataset authors, yet they reveal a bigger challenge to solve for children's Arabic
handwritten character recognition. Moreover, we propose a new approach that uses
multiple models instead of a single model, based on the number of strokes in a
character, and that merges Hijja with AHCD; it reached an averaged prediction
accuracy of 96%.
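The multi-model idea in the abstract can be pictured as a router that picks a classifier by stroke count. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the stroke count is obtained or how the models are organized, so `StrokeRouter`, the `fallback` classifier, the stub classifiers, and the 32x32 placeholder image are all hypothetical, and the stroke count is assumed to be supplied by the caller.

```python
from typing import Callable, Mapping, Sequence

# Hypothetical types: an image is a 2D grid of pixel intensities, and a
# classifier maps an image to a character label.
Image = Sequence[Sequence[float]]
Classifier = Callable[[Image], str]


class StrokeRouter:
    """Dispatch each image to the classifier registered for its stroke count."""

    def __init__(self, models: Mapping[int, Classifier], fallback: Classifier):
        self.models = dict(models)
        self.fallback = fallback

    def predict(self, image: Image, stroke_count: int) -> str:
        # Use the model for this stroke count, or the fallback if none exists.
        model = self.models.get(stroke_count, self.fallback)
        return model(image)


# Stub classifiers stand in for trained CNNs; their outputs are placeholders.
router = StrokeRouter(
    models={1: lambda img: "one-stroke label", 2: lambda img: "two-stroke label"},
    fallback=lambda img: "single-model label",
)
blank = [[0.0] * 32 for _ in range(32)]  # placeholder image, not real data
print(router.predict(blank, stroke_count=2))
```

Keeping the routing separate from the classifiers means each per-stroke model only has to distinguish characters that share a stroke count, which is one plausible reading of why a multi-model setup could help.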
Related papers
- Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
- ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic [51.922112625469836]
We present ArabicMMLU, the first multi-task language understanding benchmark for the Arabic language.
Our data comprises 40 tasks and 14,575 multiple-choice questions in Modern Standard Arabic (MSA) and is carefully constructed by collaborating with native speakers in the region.
Our evaluations of 35 models reveal substantial room for improvement, particularly among the best open-source models.
arXiv Detail & Related papers (2024-02-20T09:07:41Z)
- NusaWrites: Constructing High-Quality Corpora for Underrepresented and
Extremely Low-Resource Languages [54.808217147579036]
We conduct a case study on Indonesian local languages.
We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets.
Our findings demonstrate that datasets generated through paragraph writing by native speakers exhibit superior quality in terms of lexical diversity and cultural content.
arXiv Detail & Related papers (2023-09-19T14:42:33Z)
- Huruf: An Application for Arabic Handwritten Character Recognition Using
Deep Learning [0.0]
We propose a lightweight Convolutional Neural Network-based architecture for recognizing Arabic characters and digits.
The proposed pipeline consists of a total of 18 layers containing four layers each for convolution, pooling, batch normalization, dropout, and finally one Global average layer.
The proposed model achieved accuracies of 96.93% and 99.35%, respectively, which are comparable to the state of the art and make it a suitable solution for real-life end-level applications.
arXiv Detail & Related papers (2022-12-16T17:39:32Z)
- Graphemic Normalization of the Perso-Arabic Script [47.429213930688086]
This paper documents the challenges that Perso-Arabic presents beyond the best-documented languages.
We focus on the situation in natural language processing (NLP), which is affected by multiple, often neglected, issues.
We evaluate the effects of script normalization on eight languages from diverse language families in the Perso-Arabic script diaspora on machine translation and statistical language modeling tasks.
arXiv Detail & Related papers (2022-10-21T21:59:44Z)
- Kurdish Handwritten Character Recognition using Deep Learning Techniques [26.23274417985375]
This paper attempts to design and develop a model that can recognize handwritten characters for Kurdish alphabets using deep learning techniques.
A comprehensive dataset was created for handwritten Kurdish characters, which contains more than 40 thousand images.
Testing yielded a 96% accuracy rate, and training yielded a 97% accuracy rate.
arXiv Detail & Related papers (2022-10-18T16:48:28Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- VidLanKD: Improving Language Understanding via Video-Distilled Knowledge
Transfer [76.3906723777229]
We present VidLanKD, a video-language knowledge distillation method for improving language understanding.
We train a multi-modal teacher model on a video-text dataset, and then transfer its knowledge to a student language model with a text dataset.
In our experiments, VidLanKD achieves consistent improvements over text-only language models and vokenization models.
arXiv Detail & Related papers (2021-07-06T15:41:32Z)
- Classification of Handwritten Names of Cities and Handwritten Text
Recognition using Various Deep Learning Models [0.0]
We have tried to describe various approaches and achievements of recent years in the development of handwritten recognition models.
The first model uses deep convolutional neural networks (CNNs) for feature extraction and a fully connected multilayer perceptron neural network (MLP) for word classification.
The second model, called SimpleHTR, uses CNN and recurrent neural network (RNN) layers to extract information from images.
arXiv Detail & Related papers (2021-02-09T13:34:16Z)
- Arabic Handwritten Character Recognition based on Convolution Neural
Networks and Support Vector Machine [0.0]
We present an algorithm for recognizing Arabic letters and characters based on deep convolution neural networks (DCNN) and a support vector machine (SVM).
This paper addresses the problem of recognizing the Arabic handwritten characters by determining the similarity between the input templates and the pre-stored templates.
The experimental results of this work indicate the ability of the proposed algorithm to recognize, identify, and verify the input handwritten Arabic characters.
arXiv Detail & Related papers (2020-09-28T16:18:52Z)
- Neural Computing for Online Arabic Handwriting Character Recognition
using Hard Stroke Features Mining [0.0]
An enhanced method of detecting the desired critical points from vertical and horizontal direction-length of handwriting stroke features of online Arabic script recognition is proposed.
A minimum feature set is extracted from these tokens for classification of characters using a multilayer perceptron with a back-propagation learning algorithm and modified sigmoid function-based activation function.
The proposed method achieves an average accuracy of 98.6%, comparable to state-of-the-art character recognition techniques.
arXiv Detail & Related papers (2020-05-02T23:17:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.