2D Self-Organized ONN Model For Handwritten Text Recognition
- URL: http://arxiv.org/abs/2207.08139v1
- Date: Sun, 17 Jul 2022 11:18:20 GMT
- Title: 2D Self-Organized ONN Model For Handwritten Text Recognition
- Authors: Hanadi Hassen Mohammed, Junaid Malik, Somaya Al-Madeed, and Serkan Kiranyaz
- Abstract summary: This study proposes a novel network model built around 2D Self-organized ONNs (Self-ONNs).
The model also uses deformable convolutions, which have recently been shown to handle variations in writing style better.
Results show that the proposed model with the operational layers of Self-ONNs significantly improves Character Error Rate (CER) and Word Error Rate (WER).
- Score: 4.66970207245168
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Convolutional Neural Networks (CNNs) have recently reached
state-of-the-art Handwritten Text Recognition (HTR) performance. However,
recent research has shown that typical CNNs' learning performance is limited
since they are homogeneous networks with a simple (linear) neuron model. With
their heterogeneous network structure incorporating non-linear neurons,
Operational Neural Networks (ONNs) have recently been proposed to address this
drawback. Self-ONNs are self-organized variations of ONNs with the generative
neuron model that can generate any non-linear function using the Taylor
approximation. To improve the state-of-the-art performance level in HTR, this
study proposes a novel network model with 2D Self-organized ONNs (Self-ONNs) at
its core. Moreover, deformable convolutions, which
have recently been demonstrated to tackle variations in the writing styles
better, are utilized in this study. The results over the IAM English dataset
and HADARA80P Arabic dataset show that the proposed model with the operational
layers of Self-ONNs significantly improves Character Error Rate (CER) and Word
Error Rate (WER). Compared with their CNN counterparts, Self-ONNs reduce CER and
WER by 1.2% and 3.4% on the HADARA80P dataset and by 0.199% and 1.244% on the IAM
dataset. The results over the benchmark IAM demonstrate that the proposed model
with the operational layers of Self-ONNs outperforms recent deep CNN models by
a significant margin while the use of Self-ONNs with deformable convolutions
demonstrates exceptional results.
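The abstract reports its gains as Character Error Rate (CER) and Word Error Rate (WER). As a minimal sketch, assuming the common definition of both metrics as Levenshtein edit distance divided by reference length (the abstract does not spell out the exact evaluation protocol), they could be computed as follows. The function names `edit_distance`, `error_rate`, `cer`, and `wer` are illustrative and do not come from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum number of substitutions,
    insertions, and deletions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence, hyp: Sequence) -> float:
    """Edit distance normalized by reference length (assumed definition)."""
    if len(ref) == 0:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: error rate over characters."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: error rate over whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())
```

For example, `wer("the quick brown fox", "the quick brown fax")` is 0.25 (one wrong word out of four), while the CER for the same pair is much lower because only one character differs. Both metrics can exceed 1.0 when the hypothesis is much longer than the reference.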
Related papers
- DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects [48.65846477275723]
This study proposes novel dual-current neural networks (DCNN) to improve the accuracy of fine-grained image classification.
The main novel design features for constructing a weakly supervised learning backbone model DCNN include (a) extracting heterogeneous data, (b) keeping the feature map resolution unchanged, (c) expanding the receptive field, and (d) fusing global representations and local features.
arXiv Detail & Related papers (2024-05-07T07:51:28Z)
- Harnessing Neuron Stability to Improve DNN Verification [42.65507402735545]
We present VeriStable, a novel extension of the recently proposed DPLL-based constraint DNN verification approach.
We evaluate the effectiveness of VeriStable across a range of challenging benchmarks including fully-connected feedforward networks (FNNs), convolutional neural networks (CNNs) and residual networks (ResNets).
Preliminary results show that VeriStable is competitive and outperforms state-of-the-art verification tools, including $\alpha$-$\beta$-CROWN and MN-BaB, the first and second performers of the VNN-COMP, respectively.
arXiv Detail & Related papers (2024-01-19T23:48:04Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Adaptive-SpikeNet: Event-based Optical Flow Estimation using Spiking Neural Networks with Learnable Neuronal Dynamics [6.309365332210523]
Spiking Neural Networks (SNNs) with their neuro-inspired event-driven processing can efficiently handle asynchronous data.
We propose an adaptive fully-spiking framework with learnable neuronal dynamics to alleviate the spike vanishing problem.
Our experiments on datasets show an average reduction of 13% in average endpoint error (AEE) compared to state-of-the-art ANNs.
arXiv Detail & Related papers (2022-09-21T21:17:56Z)
- Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs) represented by long short term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Supervised Training of Siamese Spiking Neural Networks with Earth's Mover Distance [4.047840018793636]
This study adapts the highly-versatile siamese neural network model to the event data domain.
We introduce a supervised training framework for optimizing Earth Mover's Distance between spike trains with spiking neural networks (SNNs).
arXiv Detail & Related papers (2022-02-20T00:27:57Z)
- BackEISNN: A Deep Spiking Neural Network with Adaptive Self-Feedback and Balanced Excitatory-Inhibitory Neurons [8.956708722109415]
Spiking neural networks (SNNs) transmit information through discrete spikes and perform well in processing spatial-temporal information.
We propose a deep spiking neural network with adaptive self-feedback and balanced excitatory and inhibitory neurons (BackEISNN).
For the MNIST, FashionMNIST, and N-MNIST datasets, our model has achieved state-of-the-art performance.
arXiv Detail & Related papers (2021-05-27T08:38:31Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- Auditory Attention Decoding from EEG using Convolutional Recurrent Neural Network [20.37214453938965]
The auditory attention decoding (AAD) approach was proposed to determine the identity of the attended talker in a multi-talker scenario.
Recent models based on deep neural networks (DNN) have been proposed to solve this problem.
In this paper, we propose novel convolutional recurrent neural network (CRNN) based regression and classification models.
arXiv Detail & Related papers (2021-03-03T05:09:40Z)
- Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper-parameters of state-of-the-art factored time delay neural networks (TDNNs).
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z)