Persian Handwritten Digit, Character and Word Recognition Using Deep
Learning
- URL: http://arxiv.org/abs/2010.12880v2
- Date: Sat, 14 Nov 2020 06:20:53 GMT
- Title: Persian Handwritten Digit, Character and Word Recognition Using Deep
Learning
- Authors: Mehdi Bonyani, Simindokht Jahangard, Morteza Daneshmand
- Abstract summary: In this paper, deep neural networks are utilized through various DenseNet architectures, as well as Xception.
We come up with an optical character recognition accounting for the particularities of the Persian language and the corresponding handwritings.
On the HODA database, we achieve recognition rates of 99.72% and 89.99% for digits and characters, and on the Sadri database 99.72%, 98.32% and 98.82% for digits, characters and words, respectively.
- Score: 0.5188841610098436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Digit, letter and word recognition for a particular script has various
applications in today's commercial contexts. Nevertheless, only a limited number
of relevant studies have dealt with Persian scripts. In this paper, deep neural
networks, namely various DenseNet architectures as well as Xception, are
adopted, modified and further boosted through data augmentation and test time
augmentation, in order to come up with an optical character recognition method
accounting for the particularities of the Persian language and the
corresponding handwritings. Taking advantage of dividing the databases into
training, validation and test sets, as well as k-fold cross validation, the
comparison of the proposed method with various state-of-the-art alternatives is
performed on the basis of the HODA and Sadri databases, which offer the most
comprehensive collection of samples in terms of the various handwriting styles
possessed by different human beings, as well as different forms each letter may
take, which depend on its position within a word. On the HODA database, we
achieve recognition rates of 99.72% and 89.99% for digits and characters,
respectively. On the Sadri database, the rates are 99.72%, 98.32% and 98.82%
for digits, characters and words, respectively.
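The abstract names test time augmentation but does not say which transformations are used. As a rough sketch only, the idea can be illustrated by averaging a classifier's class probabilities over a few slightly shifted copies of the input image. The choice of one-pixel shifts, the helper names `shift` and `predict_with_tta`, and the `predict_proba` callable are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def shift(image, dx, dy):
    """Translate a 2-D image by (dx, dy) pixels, filling exposed pixels with zeros."""
    out = np.zeros_like(image)
    h, w = image.shape
    # Part of the source image that remains visible after the shift.
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    top, left = max(0, dy), max(0, dx)
    out[top:top + src.shape[0], left:left + src.shape[1]] = src
    return out

def predict_with_tta(predict_proba, image,
                     shifts=((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))):
    """Average class probabilities over shifted copies of one image.

    predict_proba is a hypothetical callable mapping a 2-D array to a
    1-D array of class probabilities (for example, a trained network).
    """
    probs = [predict_proba(shift(image, dx, dy)) for dx, dy in shifts]
    return np.mean(probs, axis=0)
```

Small shifts are used here rather than flips because mirroring a handwritten digit or letter can change its identity; the predicted class is then the index of the largest averaged probability.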
Related papers
- Bukva: Russian Sign Language Alphabet [75.42794328290088]
This paper investigates the recognition of the Russian fingerspelling alphabet, also known as the Russian Sign Language (RSL) dactyl.
Dactyl is a component of sign languages where distinct hand movements represent individual letters of a written language.
We provide Bukva, the first full-fledged open-source video dataset for RSL dactyl recognition.
arXiv Detail & Related papers (2024-10-11T09:59:48Z)
- MDIW-13: a New Multi-Lingual and Multi-Script Database and Benchmark for Script Identification [19.021909090693505]
This paper provides a new database for benchmarking script identification algorithms.
The dataset consists of 1,135 documents scanned from local newspapers and handwritten letters, as well as notes from different native writers.
Easy-to-go benchmarks are proposed with handcrafted and deep learning methods.
arXiv Detail & Related papers (2024-05-29T09:29:09Z)
- Sampling and Ranking for Digital Ink Generation on a tight computational budget [69.15275423815461]
We study ways to maximize the quality of the output of a trained digital ink generative model.
We use and compare the effect of multiple sampling and ranking techniques, in the first ablation study of its kind in the digital ink domain.
arXiv Detail & Related papers (2023-06-02T09:55:15Z)
- Huruf: An Application for Arabic Handwritten Character Recognition Using Deep Learning [0.0]
We propose a lightweight Convolutional Neural Network-based architecture for recognizing Arabic characters and digits.
The proposed pipeline consists of a total of 18 layers containing four layers each for convolution, pooling, batch normalization, dropout, and finally one Global average layer.
The proposed model respectively achieved an accuracy of 96.93% and 99.35% which is comparable to the state-of-the-art and makes it a suitable solution for real-life end-level applications.
arXiv Detail & Related papers (2022-12-16T17:39:32Z)
- Kurdish Handwritten Character Recognition using Deep Learning Techniques [26.23274417985375]
This paper attempts to design and develop a model that can recognize handwritten characters for Kurdish alphabets using deep learning techniques.
A comprehensive dataset was created for handwritten Kurdish characters, which contains more than 40 thousand images.
Testing yielded a 96% accuracy rate, and training yielded a 97% accuracy rate.
arXiv Detail & Related papers (2022-10-18T16:48:28Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- Writer Recognition Using Off-line Handwritten Single Block Characters [59.17685450892182]
We use personal identity numbers consisting of the six digits of the date of birth, DoB.
We evaluate two recognition approaches, one based on handcrafted features that compute directional measurements, and another based on deep features from a ResNet50 model.
Results show the presence of identity-related information in a piece of handwritten information as small as six digits with the DoB.
arXiv Detail & Related papers (2022-01-25T23:04:10Z)
- Letter-level Online Writer Identification [86.13203975836556]
We focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues.
A main challenge is that a person often writes a letter in different styles from time to time.
We refer to this problem as the variance of online writing styles (Var-O-Styles).
arXiv Detail & Related papers (2021-12-06T07:21:53Z)
- KOHTD: Kazakh Offline Handwritten Text Dataset [0.0]
We propose an extensive Kazakh offline Handwritten Text dataset (KOHTD).
KOHTD has 3000 handwritten exam papers and more than 140335 segmented images and there are approximately 922010 symbols.
We used a variety of popular text recognition methods for word and line recognition in our studies, including CTC-based and attention-based methods.
arXiv Detail & Related papers (2021-09-22T16:19:38Z)
- Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining [0.0]
An enhanced method of detecting the desired critical points from vertical and horizontal direction-length of handwriting stroke features of online Arabic script recognition is proposed.
A minimum feature set is extracted from these tokens for classification of characters using a multilayer perceptron with a back-propagation learning algorithm and modified sigmoid function-based activation function.
The proposed method achieves an average accuracy of 98.6%, which is comparable to state-of-the-art character recognition techniques.
arXiv Detail & Related papers (2020-05-02T23:17:08Z)
- Differentiable Reasoning over a Virtual Knowledge Base [156.94984221342716]
We consider the task of answering complex multi-hop questions using a corpus as a virtual knowledge base (KB).
In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus.
DrKIT is very efficient, processing 10-100x more queries per second than existing multi-hop systems.
arXiv Detail & Related papers (2020-02-25T03:13:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.