Related papers: 2D Self-Organized ONN Model For Handwritten Text Recognition

2D Self-Organized ONN Model For Handwritten Text Recognition

URL: http://arxiv.org/abs/2207.08139v1
Date: Sun, 17 Jul 2022 11:18:20 GMT
Title: 2D Self-Organized ONN Model For Handwritten Text Recognition
Authors: Hanadi Hassen Mohammed, Junaid Malik, Somaya Al-Madeed, and Serkan Kiranyaz
Abstract summary: This study proposes the 2D Self-organized ONNs (Self-ONNs) in the core of a novel network model. Deformable convolutions, which have recently been demonstrated to tackle variations in the writing styles better, are utilized in this study. Results show that the proposed model with the operational layers of Self-ONNs significantly improves Character Error Rate (CER) and Word Error Rate (WER)
Score: 4.66970207245168
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep Convolutional Neural Networks (CNNs) have recently reached state-of-the-art Handwritten Text Recognition (HTR) performance. However, recent research has shown that typical CNNs' learning performance is limited since they are homogeneous networks with a simple (linear) neuron model. With their heterogeneous network structure incorporating non-linear neurons, Operational Neural Networks (ONNs) have recently been proposed to address this drawback. Self-ONNs are self-organized variations of ONNs with the generative neuron model that can generate any non-linear function using the Taylor approximation. In this study, in order to improve the state-of-the-art performance level in HTR, the 2D Self-organized ONNs (Self-ONNs) in the core of a novel network model are proposed. Moreover, deformable convolutions, which have recently been demonstrated to tackle variations in the writing styles better, are utilized in this study. The results over the IAM English dataset and HADARA80P Arabic dataset show that the proposed model with the operational layers of Self-ONNs significantly improves Character Error Rate (CER) and Word Error Rate (WER). Compared with its counterpart CNNs, Self-ONNs reduce CER and WER by 1.2% and 3.4 % in the HADARA80P and 0.199% and 1.244% in the IAM dataset. The results over the benchmark IAM demonstrate that the proposed model with the operational layers of Self-ONNs outperforms recent deep CNN models by a significant margin while the use of Self-ONNs with deformable convolutions demonstrates exceptional results.

Related papers

Threshold Modulation for Online Test-Time Adaptation of Spiking Neural Networks [13.112288560806359]
spiking neural networks (SNNs) deployed on neuromorphic chips provide efficient solutions on edge devices in different scenarios.<n>Online test-time adaptation (OTTA) offers a promising solution by enabling models to adjust to new data distributions without requiring source data or labeled target samples.<n>Existing OTTA methods are largely designed for traditional artificial neural networks and are not well-suited for SNNs.<n>We propose a low-power, neuromorphic chip-friendly online test-time adaptation framework, aiming to enhance model generalization under distribution shifts.
arXiv Detail & Related papers (2025-05-08T16:09:40Z)
Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects [48.65846477275723]
This study proposes novel dual-current neural networks (DCNN) to improve the accuracy of fine-grained image classification. The main novel design features for constructing a weakly supervised learning backbone model DCNN include (a) extracting heterogeneous data, (b) keeping the feature map resolution unchanged, (c) expanding the receptive field, and (d) fusing global representations and local features.
arXiv Detail & Related papers (2024-05-07T07:51:28Z)
Harnessing Neuron Stability to Improve DNN Verification [42.65507402735545]
We present VeriStable, a novel extension of recently proposed DPLL-based constraint DNN verification approach. We evaluate the effectiveness of VeriStable across a range of challenging benchmarks including fully-connected feed networks (FNNs), convolutional neural networks (CNNs) and residual networks (ResNets) Preliminary results show that VeriStable is competitive and outperforms state-of-the-art verification tools, including $alpha$-$beta$-CROWN and MN-BaB, the first and second performers of the VNN-COMP, respectively.
arXiv Detail & Related papers (2024-01-19T23:48:04Z)
How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series. We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
Adaptive-SpikeNet: Event-based Optical Flow Estimation using Spiking Neural Networks with Learnable Neuronal Dynamics [6.309365332210523]
Spiking Neural Networks (SNNs) with their neuro-inspired event-driven processing can efficiently handle asynchronous data. We propose an adaptive fully-spiking framework with learnable neuronal dynamics to alleviate the spike vanishing problem. Our experiments on datasets show an average reduction of 13% in average endpoint error (AEE) compared to state-of-the-art ANNs.
arXiv Detail & Related papers (2022-09-21T21:17:56Z)
Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs) represented by long short term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex. In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications. In this paper we propose an uncertainty quantification approach by modelling the distribution of features. We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem. We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
Supervised Training of Siamese Spiking Neural Networks with Earth's Mover Distance [4.047840018793636]
This study adapts the highly-versatile siamese neural network model to the event data domain. We introduce a supervised training framework for optimizing Earth's Mover Distance between spike trains with spiking neural networks (SNN)
arXiv Detail & Related papers (2022-02-20T00:27:57Z)
BackEISNN: A Deep Spiking Neural Network with Adaptive Self-Feedback and Balanced Excitatory-Inhibitory Neurons [8.956708722109415]
Spiking neural networks (SNNs) transmit information through discrete spikes, which performs well in processing spatial-temporal information. We propose a deep spiking neural network with adaptive self-feedback and balanced excitatory and inhibitory neurons (BackEISNN) For the MNIST, FashionMNIST, and N-MNIST datasets, our model has achieved state-of-the-art performance.
arXiv Detail & Related papers (2021-05-27T08:38:31Z)
ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware. The proposed methodology extracts a set of models from micro- kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
Auditory Attention Decoding from EEG using Convolutional Recurrent Neural Network [20.37214453938965]
The auditory attention decoding (AAD) approach was proposed to determine the identity of the attended talker in a multi-talker scenario. Recent models based on deep neural networks (DNN) have been proposed to solve this problem. In this paper, we proposed novel convolutional recurrent neural network (CRNN) based regression model and classification model.
arXiv Detail & Related papers (2021-03-03T05:09:40Z)
Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper- parameters of state-of-the-art factored time delay neural networks (TDNNs) These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training. Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.