Does color modalities affect handwriting recognition? An empirical study
on Persian handwritings using convolutional neural networks
- URL: http://arxiv.org/abs/2307.12150v1
- Date: Sat, 22 Jul 2023 19:47:52 GMT
- Title: Does color modalities affect handwriting recognition? An empirical study
on Persian handwritings using convolutional neural networks
- Authors: Abbas Zohrevand, Zahra Imani, Javad Sadri, Ching Y. Suen
- Abstract summary: We investigate whether the color modalities of handwritten digits and words affect their recognition accuracy or speed.
We selected 13,330 isolated digits and 62,500 words from a novel Persian handwritten database.
CNNs achieve higher performance on BW digit and word images than on the other two color modalities.
- Score: 7.965705015476877
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most handwriting recognition methods in the literature are developed and evaluated on Black and White (BW) image databases. In this paper we try to answer a fundamental question in document recognition: using Convolutional Neural Networks (CNNs) as an eye simulator, do the color modalities of handwritten digits and words affect their recognition accuracy or speed? To the best of our knowledge, this question has not been answered so far because of the lack of handwritten databases that provide all three color modalities of the same handwriting. To answer it, we selected 13,330 isolated digits and 62,500 words from a novel Persian handwritten database that contains three different color modalities and is unique in terms of size and variety. The selected datasets are divided into training, validation, and testing sets, and similar conventional CNN models are trained on the training samples. The experimental results on the testing sets show that CNNs perform somewhat better on BW digit and word images than on the other two color modalities, but overall the differences in accuracy across color modalities are not significant. Comparisons of training times across the three color modalities also show that recognizing handwritten digits and words in BW images with a CNN is much more efficient.
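As a rough illustration of the experimental protocol described in the abstract, the sketch below trains the same small CNN separately on each color modality and records the training time; the folder layout, image size, architecture, and hyperparameters are illustrative assumptions, not the authors' actual setup.

```python
# Minimal sketch (not the authors' code): train the same conventional CNN on
# each color modality and compare test accuracy and training time.
import time
import torch
import torch.nn as nn
from torchvision import datasets, transforms

def make_cnn(in_channels, num_classes=10):
    # Identical architecture for every modality; only the number of input
    # channels differs (1 for BW/grayscale, 3 for color).
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
        nn.Linear(128, num_classes),
    )

def train_one_modality(data_root, in_channels, epochs=5, device="cpu"):
    tfms = [transforms.Resize((32, 32))]
    if in_channels == 1:
        tfms.append(transforms.Grayscale(num_output_channels=1))
    tfms.append(transforms.ToTensor())
    train_set = datasets.ImageFolder(data_root, transform=transforms.Compose(tfms))
    loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

    model = make_cnn(in_channels).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    start = time.time()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss_fn(model(images), labels).backward()
            optimizer.step()
    return model, time.time() - start

# Hypothetical folder names: one directory of class subfolders per modality.
for name, channels in [("bw", 1), ("gray", 1), ("color", 3)]:
    _, seconds = train_one_modality(f"digits/{name}/train", channels)
    print(f"{name}: trained in {seconds:.1f}s")
```

Keeping the architecture identical across modalities, with only the number of input channels changing, is what makes the accuracy and timing comparison meaningful.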
Related papers
- Application of convolutional neural networks in image super-resolution [99.25287909319401]
Convolutional neural networks (CNNs) have become mainstream methods for image super-resolution, yet different deep learning methods differ considerably in design. This paper first introduces the principles of CNNs for image super-resolution, then surveys CNN-based up-sampling approaches, including bicubic, nearest-neighbor, bilinear, transposed-convolution, sub-pixel, and meta-up-sampling layers. Finally, it discusses drawbacks and potential research directions and summarizes the paper, which can facilitate further development of CNNs for image super-resolution.
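The up-sampling operators named in this summary can be contrasted in a few lines of PyTorch; the channel count and scale factor below are arbitrary illustrative choices, and meta-up-sampling (which predicts filter weights per scale) is omitted for brevity.

```python
# Minimal comparison of common up-sampling blocks used in CNN super-resolution.
import torch
import torch.nn as nn

x = torch.randn(1, 64, 16, 16)            # a low-resolution feature map
scale = 2

nearest    = nn.Upsample(scale_factor=scale, mode="nearest")
bilinear   = nn.Upsample(scale_factor=scale, mode="bilinear", align_corners=False)
bicubic    = nn.Upsample(scale_factor=scale, mode="bicubic", align_corners=False)
transposed = nn.ConvTranspose2d(64, 64, kernel_size=4, stride=scale, padding=1)
subpixel   = nn.Sequential(               # sub-pixel (pixel-shuffle) layer
    nn.Conv2d(64, 64 * scale ** 2, kernel_size=3, padding=1),
    nn.PixelShuffle(scale),
)

for name, layer in [("nearest", nearest), ("bilinear", bilinear),
                    ("bicubic", bicubic), ("transposed", transposed),
                    ("sub-pixel", subpixel)]:
    print(name, layer(x).shape)           # each produces (1, 64, 32, 32)
```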
arXiv Detail & Related papers (2025-06-03T08:28:08Z) - Attention based End to end network for Offline Writer Identification on Word level data [3.5829161769306244]
We propose a writer identification system based on an attention-driven Convolutional Neural Network (CNN)
The system is trained on image segments, known as fragments, that are extracted from word images using a pyramid-based strategy.
The efficacy of the proposed algorithm is evaluated on three benchmark databases.
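A hedged sketch of the general idea, assuming a tiny CNN backbone and a single soft-attention pooling layer over fragment embeddings (the paper's actual fragment extraction and network are not reproduced here):

```python
# Illustrative only: encode word-image fragments with a shared CNN and pool
# them with attention into one writer descriptor.
import torch
import torch.nn as nn

class AttentionWriterID(nn.Module):
    def __init__(self, num_writers, emb_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(      # tiny CNN backbone per fragment
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(32 * 4 * 4, emb_dim), nn.ReLU(),
        )
        self.attn = nn.Linear(emb_dim, 1)  # one attention score per fragment
        self.classifier = nn.Linear(emb_dim, num_writers)

    def forward(self, fragments):          # fragments: (B, N, 1, H, W)
        b, n = fragments.shape[:2]
        feats = self.encoder(fragments.flatten(0, 1)).view(b, n, -1)
        weights = torch.softmax(self.attn(feats), dim=1)   # (B, N, 1)
        pooled = (weights * feats).sum(dim=1)              # attention-weighted sum
        return self.classifier(pooled)

model = AttentionWriterID(num_writers=50)
logits = model(torch.randn(2, 6, 1, 32, 32))   # 6 fragments per word image
print(logits.shape)                            # (2, 50)
```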
arXiv Detail & Related papers (2024-04-11T09:41:14Z) - A CNN Based Framework for Unistroke Numeral Recognition in Air-Writing [17.426389959819538]
This paper proposes a generic video camera-aided convolutional neural network (CNN) based air-writing framework.
Gestures are performed using a marker of fixed color in front of a generic video camera, followed by color-based segmentation to identify the marker and track the trajectory of the marker tip.
The proposed framework has achieved 97.7%, 95.4%, and 93.7% recognition rates in person-independent evaluations on English, Bengali, and Devanagari numerals, respectively.
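The color-based marker tracking step could look roughly like the OpenCV sketch below; the HSV range is a placeholder for whatever fixed marker color is used and is not taken from the paper.

```python
# Minimal sketch of color-based marker segmentation and tip tracking.
import cv2
import numpy as np

LOWER_HSV = np.array([40, 80, 80])      # assumed green-ish marker (placeholder)
UPPER_HSV = np.array([80, 255, 255])

def track_marker(frames):
    """Return the (x, y) trajectory of the marker across video frames."""
    trajectory = []
    for frame in frames:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)       # color segmentation
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        moments = cv2.moments(mask)
        if moments["m00"] > 0:                               # marker visible
            cx = int(moments["m10"] / moments["m00"])
            cy = int(moments["m01"] / moments["m00"])
            trajectory.append((cx, cy))
    return trajectory

# The trajectory can then be rasterized into an image and passed to a CNN
# numeral classifier, which is the recognition step of such a framework.
```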
arXiv Detail & Related papers (2023-03-14T15:44:45Z) - Handwritten Word Recognition using Deep Learning Approach: A Novel Way
of Generating Handwritten Words [14.47529728678643]
This paper proposes a novel way of generating diverse handwritten word images using handwritten characters.
The approach covers the generation of two types of large and diverse handwritten word datasets.
For the experiments, we targeted the Bangla language, which lacks a handwritten word dataset.
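A minimal sketch of the underlying idea, pasting isolated character images side by side to synthesize a word image; spacing, sizing, and the character source are assumptions for illustration only.

```python
# Build a synthetic handwritten word image from isolated character images.
import numpy as np

def synthesize_word(char_images, gap=4, background=255):
    """Concatenate grayscale character images (H x W uint8) into one word image."""
    height = max(img.shape[0] for img in char_images)
    width = sum(img.shape[1] for img in char_images) + gap * (len(char_images) - 1)
    word = np.full((height, width), background, dtype=np.uint8)
    x = 0
    for img in char_images:
        y = (height - img.shape[0]) // 2          # vertically center each character
        word[y:y + img.shape[0], x:x + img.shape[1]] = img
        x += img.shape[1] + gap
    return word

# e.g. synthesize_word([char_a, char_b, char_c]) builds one synthetic word;
# random jitter, scaling, and overlap can be added to increase diversity.
```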
arXiv Detail & Related papers (2023-03-13T22:58:34Z) - Siamese based Neural Network for Offline Writer Identification on word
level data [7.747239584541488]
We propose a novel scheme to identify the author of a document based on the input word image.
Our method is text independent and does not impose any constraint on the size of the input image under examination.
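A minimal Siamese sketch with a shared encoder and contrastive loss, in the spirit of this entry; the architecture, margin, and the global pooling that makes it input-size independent are illustrative assumptions.

```python
# Siamese writer verification sketch: shared CNN encoder + contrastive loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),         # accepts any input height/width
            nn.Flatten(), nn.Linear(64, emb_dim),
        )

    def forward(self, a, b):
        return self.net(a), self.net(b)      # same weights for both branches

def contrastive_loss(za, zb, same_writer, margin=1.0):
    d = F.pairwise_distance(za, zb)
    return (same_writer * d.pow(2) +
            (1 - same_writer) * F.relu(margin - d).pow(2)).mean()

model = SiameseEncoder()
za, zb = model(torch.randn(4, 1, 48, 160), torch.randn(4, 1, 48, 160))
loss = contrastive_loss(za, zb, torch.tensor([1., 0., 1., 0.]))
```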
arXiv Detail & Related papers (2022-11-17T10:01:46Z) - Iris super-resolution using CNNs: is photo-realism important to iris
recognition? [67.42500312968455]
Single image super-resolution techniques are emerging, especially with the use of convolutional neural networks (CNNs)
In this work, the authors explore single image super-resolution using CNNs for iris recognition.
They validate their approach on a database of 1,872 near-infrared iris images and on a mobile phone image database.
arXiv Detail & Related papers (2022-10-24T11:19:18Z) - Kurdish Handwritten Character Recognition using Deep Learning Techniques [26.23274417985375]
This paper attempts to design and develop a model that can recognize handwritten characters for Kurdish alphabets using deep learning techniques.
A comprehensive dataset was created for handwritten Kurdish characters, which contains more than 40 thousand images.
The reported test accuracy was 96%, and the training accuracy was 97%.
arXiv Detail & Related papers (2022-10-18T16:48:28Z) - Reading and Writing: Discriminative and Generative Modeling for
Self-Supervised Text Recognition [101.60244147302197]
We introduce contrastive learning and masked image modeling to learn discrimination and generation of text images.
Our method outperforms previous self-supervised text recognition methods by 10.2%-20.2% on irregular scene text recognition datasets.
Our proposed text recognizer exceeds previous state-of-the-art text recognition methods by an average of 5.3% on 11 benchmarks, with similar model size.
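A conceptual sketch of combining a contrastive ("reading") objective with masked image modeling ("writing") on text images; the encoder, decoder, masking scheme, and loss weighting below are assumptions, not the paper's design.

```python
# Toy self-supervised objective: contrastive term + masked reconstruction term.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(64, 64, 3, padding=1))
decoder = nn.Conv2d(64, 1, 3, padding=1)         # reconstructs masked pixels
proj = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 128))

def self_supervised_loss(view1, view2, images, temperature=0.1, mask_ratio=0.5):
    # Contrastive branch: two augmented views of the same text image should
    # map to nearby embeddings (simplified InfoNCE over the batch).
    z1 = F.normalize(proj(encoder(view1)), dim=1)
    z2 = F.normalize(proj(encoder(view2)), dim=1)
    logits = z1 @ z2.t() / temperature
    contrastive = F.cross_entropy(logits, torch.arange(len(z1)))

    # Masked modeling branch: hide random pixels (patches in practice) and
    # penalize reconstruction error only on the hidden region.
    mask = (torch.rand_like(images) < mask_ratio).float()
    recon = decoder(encoder(images * (1 - mask)))
    mim = ((recon - images) ** 2 * mask).sum() / mask.sum().clamp(min=1)

    return contrastive + mim
```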
arXiv Detail & Related papers (2022-07-01T03:50:26Z) - Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be further ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation (Jin et al., 2020), we propose recognizing sign language based on whole-body keypoints and features.
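A small illustrative sketch of the downstream step, classifying a sign from a sequence of whole-body keypoints that a pose estimator has already extracted; the 133-keypoint count follows common whole-body pose formats, and the GRU classifier is an assumption.

```python
# Classify a sign from a sequence of (x, y) whole-body keypoints.
import torch
import torch.nn as nn

class SkeletonSignClassifier(nn.Module):
    def __init__(self, num_signs, num_keypoints=133, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(num_keypoints * 2, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_signs)

    def forward(self, keypoints):            # (B, T, num_keypoints, 2)
        b, t = keypoints.shape[:2]
        _, h = self.rnn(keypoints.view(b, t, -1))
        return self.head(h[-1])               # classify from the final hidden state

model = SkeletonSignClassifier(num_signs=100)
print(model(torch.randn(2, 30, 133, 2)).shape)   # (2, 100)
```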
arXiv Detail & Related papers (2021-03-16T03:38:17Z) - Assessing The Importance Of Colours For CNNs In Object Recognition [70.70151719764021]
Convolutional neural networks (CNNs) have been shown to exhibit conflicting properties.
We demonstrate that CNNs often rely heavily on colour information while making a prediction.
We evaluate a model trained with congruent images on congruent, greyscale, and incongruent images.
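The evaluation idea translates into a short routine: take a classifier trained on ordinary (congruent) color images and measure how much accuracy drops when the test images are converted to greyscale. The model and data loader below are placeholders.

```python
# Compare accuracy on original color images vs. greyscale-converted images.
import torch
from torchvision import transforms

def accuracy(model, loader, to_greyscale=False, device="cpu"):
    grey = transforms.Grayscale(num_output_channels=3)   # keep 3 channels
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            if to_greyscale:
                images = grey(images)                    # remove colour cues only
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.numel()
    return correct / total

# A large gap between accuracy(model, test_loader) and
# accuracy(model, test_loader, to_greyscale=True) indicates the CNN relies
# heavily on colour information.
```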
arXiv Detail & Related papers (2020-12-12T22:55:06Z) - Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene
Text [93.08109196909763]
We propose a novel VQA approach, Multi-Modal Graph Neural Network (MM-GNN)
It first represents an image as a graph consisting of three sub-graphs, depicting visual, semantic, and numeric modalities respectively.
It then introduces three aggregators which guide the message passing from one graph to another to utilize the contexts in various modalities.
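A toy sketch of the cross-modal aggregation idea, with nodes of one modality attending to and collecting messages from nodes of another; the dimensions and the single attention-style aggregator are illustrative, not the MM-GNN architecture.

```python
# Pass messages from source-modality nodes to target-modality nodes.
import torch
import torch.nn as nn

class CrossModalAggregator(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, target_nodes, source_nodes):   # (B, Nt, D), (B, Ns, D)
        attn = torch.softmax(
            self.query(target_nodes) @ self.key(source_nodes).transpose(1, 2)
            / target_nodes.shape[-1] ** 0.5, dim=-1)
        return target_nodes + attn @ self.value(source_nodes)  # updated targets

# e.g. visual object nodes gathering context from scene-text (semantic) nodes:
visual, text_nodes = torch.randn(1, 36, 128), torch.randn(1, 10, 128)
updated_visual = CrossModalAggregator()(visual, text_nodes)
```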
arXiv Detail & Related papers (2020-03-31T05:56:59Z) - Learning to Structure an Image with Few Colors [59.34619548026885]
We propose a color quantization network, ColorCNN, which learns to structure the images from the classification loss in an end-to-end manner.
With only a 1-bit color space (i.e., two colors), the proposed network achieves 82.1% top-1 accuracy on the CIFAR10 dataset.
For applications, when encoded with PNG, the proposed color quantization shows superiority over other image compression methods in the extremely low bit-rate regime.
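A heavily simplified sketch of learning a tiny palette end-to-end from a classification loss, in the spirit of this entry rather than the actual ColorCNN architecture: each pixel is softly assigned to one of K learned colors so gradients can reach the palette.

```python
# Soft color quantization with a learned palette, trainable from any
# downstream classification loss.
import torch
import torch.nn as nn

class SoftColorQuantizer(nn.Module):
    def __init__(self, num_colors=2, temperature=0.1):   # 1-bit space: 2 colors
        super().__init__()
        self.palette = nn.Parameter(torch.rand(num_colors, 3))  # learned RGB colors
        self.temperature = temperature

    def forward(self, images):                    # images: (B, 3, H, W) in [0, 1]
        pixels = images.permute(0, 2, 3, 1)       # (B, H, W, 3)
        dists = ((pixels.unsqueeze(-2) - self.palette) ** 2).sum(-1)  # (B, H, W, K)
        assign = torch.softmax(-dists / self.temperature, dim=-1)     # soft one-hot
        quantized = assign @ self.palette          # (B, H, W, 3)
        return quantized.permute(0, 3, 1, 2)       # differentiable w.r.t. palette

# quantizer = SoftColorQuantizer(); logits = classifier(quantizer(images))
# Training on the classification loss pushes the palette toward the most
# class-informative colors; a hard argmax replaces the softmax at test time.
```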
arXiv Detail & Related papers (2020-03-17T17:56:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.