Performance Evaluation of Advanced Deep Learning Architectures for
Offline Handwritten Character Recognition
- URL: http://arxiv.org/abs/2003.06794v1
- Date: Sun, 15 Mar 2020 11:17:16 GMT
- Authors: Moazam Soomro, Muhammad Ali Farooq, Rana Hammad Raza
- Abstract summary: The system uses an advanced multilayer deep neural network that extracts features from raw pixel values.
Two state-of-the-art deep learning architectures were used: the Caffe AlexNet and GoogleNet models in NVIDIA DIGITS.
AlexNet achieved an accuracy of 77.77% and GoogleNet 88.89%.
- Score: 0.6445605125467573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a comparison and performance evaluation of
handwritten character recognition for robust and precise classification of
different handwritten characters. The system uses an advanced multilayer deep
neural network that extracts features from raw pixel values. The hidden layers
stack deep hierarchies of non-linear features, since learning complex features
with conventional neural networks is very challenging. Two state-of-the-art deep
learning architectures were used: the Caffe AlexNet and GoogleNet models in
NVIDIA DIGITS. The models were trained and tested on two different datasets to
incorporate diversity and complexity. The first is the publicly available
Chars74K dataset, comprising 7705 characters that cover uppercase and lowercase
English letters along with numerical digits. The second, created locally,
consists of 4320 characters in 62 classes and was written by 40 subjects. It
also contains uppercase and lowercase English letters along with numerical
digits. The overall dataset is split 80% for training and 20% for testing.
Training takes approximately 90 minutes. For validation, the results obtained
were compared with the ground truth. AlexNet achieved an accuracy of 77.77% and
GoogleNet 88.89%. The higher accuracy of GoogleNet is due to its unique
combination of inception modules, each including pooling, convolutions at
various scales, and concatenation operations.
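The 62-class label set and the 80/20 split described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `split_indices`, the fixed seed, the random shuffle, and the choice to split each dataset separately are all assumptions. The abstract states only the class makeup, the sample counts, and the split ratio.

```python
import random
import string

# 62 classes as described in the abstract: 26 uppercase letters,
# 26 lowercase letters, and 10 digits.
CLASSES = list(string.ascii_uppercase + string.ascii_lowercase + string.digits)

# Sample counts quoted in the abstract.
DATASET_SIZES = {"Chars74K": 7705, "local": 4320}


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into train and test lists.

    Hypothetical helper: the shuffle and the seed are assumptions made
    for reproducibility, not details taken from the paper.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


print(len(CLASSES), "classes")
for name, size in DATASET_SIZES.items():
    train, test = split_indices(size)
    print(f"{name}: {len(train)} train / {len(test)} test")
```

Under these assumptions, Chars74K yields 6164 training and 1541 test samples, and the local dataset yields 3456 and 864. A stratified split (equal class proportions in both parts) would be a natural refinement, but the abstract does not say whether one was used.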
Related papers
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding [110.07170245531464]
Current 3D models are limited by datasets with a small number of annotated data and a pre-defined set of categories.
Recent advances have shown that similar problems can be significantly alleviated by employing knowledge from other modalities, such as language.
We learn a unified representation of images, texts, and 3D point clouds by pre-training with object triplets from the three modalities.
arXiv Detail & Related papers (2022-12-10T01:34:47Z)
- Kurdish Handwritten Character Recognition using Deep Learning Techniques [26.23274417985375]
This paper attempts to design and develop a model that can recognize handwritten characters for Kurdish alphabets using deep learning techniques.
A comprehensive dataset was created for handwritten Kurdish characters, which contains more than 40 thousand images.
The tested results reported a 96% accuracy rate, and training accuracy reported a 97% accuracy rate.
arXiv Detail & Related papers (2022-10-18T16:48:28Z)
- Learning Rate Curriculum [75.98230528486401]
We propose a novel curriculum learning approach termed Learning Rate Curriculum (LeRaC)
LeRaC uses a different learning rate for each layer of a neural network to create a data-agnostic curriculum during the initial training epochs.
We compare our approach with Curriculum by Smoothing (CBS), a state-of-the-art data-agnostic curriculum learning approach.
arXiv Detail & Related papers (2022-05-18T18:57:36Z)
- Investigating Neural Architectures by Synthetic Dataset Design [14.317837518705302]
Recent years have seen the emergence of many new neural network structures (architectures and layers)
We sketch a methodology to measure the effect of each structure on a network's ability, by designing ad hoc synthetic datasets.
We illustrate our methodology by building three datasets to evaluate each of the three following network properties.
arXiv Detail & Related papers (2022-04-23T10:50:52Z)
- Deep ensembles in bioimage segmentation [74.01883650587321]
In this work, we propose an ensemble of convolutional neural networks (CNNs)
In ensemble methods, many different models are trained and then used for classification, the ensemble aggregates the outputs of the single classifiers.
The proposed ensemble is implemented by combining different backbone networks using the DeepLabV3+ and HarDNet environment.
arXiv Detail & Related papers (2021-12-24T05:54:21Z)
- Dive into Layers: Neural Network Capacity Bounding using Algebraic Geometry [55.57953219617467]
We show that the learnability of a neural network is directly related to its size.
We use Betti numbers to measure the topological geometric complexity of input data and the neural network.
We perform the experiments on a real-world dataset MNIST and the results verify our analysis and conclusion.
arXiv Detail & Related papers (2021-09-03T11:45:51Z)
- Facial Age Estimation using Convolutional Neural Networks [0.0]
This paper is a part of a student project in Machine Learning at the Norwegian University of Science and Technology.
A deep convolutional neural network with five convolutional layers and three fully-connected layers is presented to estimate the ages of individuals based on images.
arXiv Detail & Related papers (2021-05-14T10:09:47Z)
- Satellite Image Classification with Deep Learning [0.0]
We describe a deep learning system for classifying objects and facilities from the IARPA Functional Map of the World (fMoW) dataset into 63 different classes.
The system consists of an ensemble of convolutional neural networks and additional neural networks that integrate satellite metadata with image features.
At the time of writing the system is in 2nd place in the fMoW TopCoder competition.
arXiv Detail & Related papers (2020-10-13T15:56:58Z)
- Alpha-Net: Architecture, Models, and Applications [0.0]
We present a novel network architecture for custom training and weight evaluations.
We implement Alpha-Net with 4 different layer configurations to express the architecture behavior.
The Alpha-Net v3 gives improved accuracy of approx. 3% over the last state-of-the-art network ResNet 50 on ImageNet benchmark.
arXiv Detail & Related papers (2020-06-27T05:05:01Z)
- Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition [98.10703825716142]
This work introduces pyramidal convolution (PyConv), which is capable of processing the input at multiple filter scales.
We present different architectures based on PyConv for four main tasks on visual recognition: image classification, video action classification/recognition, object detection and semantic image segmentation/parsing.
arXiv Detail & Related papers (2020-06-20T10:19:29Z)