A Lexicon and Depth-wise Separable Convolution Based Handwritten Text
Recognition System
- URL: http://arxiv.org/abs/2207.04651v1
- Date: Mon, 11 Jul 2022 06:24:26 GMT
- Title: A Lexicon and Depth-wise Separable Convolution Based Handwritten Text
Recognition System
- Authors: Lalita Kumari, Sukhdeep Singh, VVS Rathore and Anuj Sharma
- Abstract summary: We have used depthwise separable convolutions in place of standard convolutions to reduce the total number of trainable parameters.
We have also included a lexicon-based word beam search decoder at the testing step.
We obtain a 3.84% character error rate and 9.40% word error rate on the IAM dataset, and a 4.88% character error rate and 14.56% word error rate on the George Washington dataset.
- Score: 3.9097549127191473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cursive handwritten text recognition is a challenging research problem in the
domain of pattern recognition. The current state-of-the-art approaches include
models based on convolutional recurrent neural networks and multi-dimensional
long short-term memory recurrent neural networks. These methods are
computationally expensive, and the models are complex at the design level. Recent
studies have shown that models combining convolutional neural networks with
gated convolutional neural networks require fewer parameters than models based
on convolutional recurrent neural networks. To further reduce the total number
of trainable parameters, in this work we replace standard convolutions with
depthwise separable convolutions in a combination of a gated convolutional
neural network and a bidirectional gated recurrent unit. Additionally, we
include a lexicon-based word beam search decoder at the testing step, which
further improves the overall accuracy of the model. We obtain a 3.84% character
error rate and a 9.40% word error rate on the IAM dataset, and a 4.88% character
error rate and a 14.56% word error rate on the George Washington dataset.
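
The parameter saving described above comes from factoring each k x k standard
convolution into a per-channel depthwise convolution plus a 1x1 pointwise
convolution. Below is a minimal PyTorch sketch of such a block together with a
gated convolution and a bidirectional GRU, in the spirit of the architecture
the abstract describes; all channel counts, kernel sizes, and the layer
arrangement are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """A k x k standard convolution (C_in*C_out*k*k weights) factored into a
    per-channel depthwise convolution (C_in*k*k weights) and a 1x1 pointwise
    convolution (C_in*C_out weights)."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, k, padding=k // 2, groups=c_in)
        self.pointwise = nn.Conv2d(c_in, c_out, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class GatedConv(nn.Module):
    """Gated convolution: features modulated by a learned sigmoid gate."""
    def __init__(self, c, k=3):
        super().__init__()
        self.feat = DepthwiseSeparableConv(c, c, k)
        self.gate = DepthwiseSeparableConv(c, c, k)

    def forward(self, x):
        return self.feat(x) * torch.sigmoid(self.gate(x))

# Weight count for one 3x3 layer at 64 -> 64 channels (biases ignored):
# standard conv: 64*64*3*3 = 36864; separable: 64*3*3 + 64*64 = 4672.
block = nn.Sequential(DepthwiseSeparableConv(1, 64), nn.ReLU(), GatedConv(64))
rnn = nn.GRU(input_size=64, hidden_size=128, bidirectional=True,
             batch_first=True)

line = torch.randn(1, 1, 32, 128)         # (batch, channels, height, width)
feats = block(line)                       # (1, 64, 32, 128)
seq = feats.mean(dim=2).permute(0, 2, 1)  # collapse height -> (1, 128, 64)
out, _ = rnn(seq)                         # per-timestep features, e.g. for CTC
print(out.shape)                          # torch.Size([1, 128, 256])
```

The per-timestep outputs would then feed a CTC layer, whose decoding the paper
constrains with the lexicon-based word beam search decoder (not sketched here).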
Related papers
- Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
- The sampling complexity of learning invertible residual neural networks [9.614718680817269]
It has been shown that determining a feedforward ReLU neural network to within high uniform accuracy from point samples suffers from the curse of dimensionality.
We consider the question of whether the sampling complexity can be improved by restricting the specific neural network architecture.
Our main result shows that the residual neural network architecture and invertibility do not help overcome the complexity barriers encountered with simpler feedforward architectures.
arXiv Detail & Related papers (2024-11-08T10:00:40Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
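
As a rough illustration of the parameter-graph idea in the entry above (not
the paper's actual construction), a small MLP can be encoded as a graph whose
nodes are neurons and whose edges carry the weights, which a GNN could then
process; the layer sizes below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [2, 3, 1]                   # toy MLP: 2 -> 3 -> 1
weights = [rng.standard_normal((m, n))
           for m, n in zip(layer_sizes, layer_sizes[1:])]

edges = []                                # (src_node, dst_node, weight)
offsets = np.cumsum([0] + layer_sizes)    # global node index per layer
for l, W in enumerate(weights):
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            edges.append((offsets[l] + i, offsets[l + 1] + j, W[i, j]))
nodes = list(range(offsets[-1]))

print(len(nodes), "nodes,", len(edges), "edges")  # 6 nodes, 9 edges
```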
- Tensor Decomposition for Model Reduction in Neural Networks: A Review [13.96938227911258]
Modern neural networks have revolutionized the fields of computer vision (CV) and natural language processing (NLP).
They are widely used for solving complex CV and NLP tasks such as image classification, image generation, and machine translation.
This paper reviews six tensor decomposition methods and illustrates their ability to compress model parameters.
arXiv Detail & Related papers (2023-04-26T13:12:00Z)
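
To make the compression idea behind the review above concrete, here is the
simplest matrix-level instance, a truncated SVD of a dense layer's weights;
the tensor methods the review surveys (e.g., CP, Tucker, tensor-train)
generalize this low-rank factorization to higher-order weight tensors. Shapes
and the target rank below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))      # dense layer: 262144 parameters

U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 32                                   # assumed target rank
A = U[:, :r] * s[:r]                     # (512, r) factor
B = Vt[:r, :]                            # (r, 512) factor

# Two thin factors replace one dense matrix:
# 512*32 + 32*512 = 32768 parameters, an 8x reduction.
W_hat = A @ B
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"rank {r}: {A.size + B.size} params, relative error {err:.3f}")
```

A random matrix like this one compresses poorly; trained weight matrices often
have fast-decaying spectra, which is what makes such truncation practical.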
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Generalization Error Bounds for Iterative Recovery Algorithms Unfolded as Neural Networks [6.173968909465726]
We introduce a general class of neural networks suitable for sparse reconstruction from few linear measurements.
By allowing a wide range of degrees of weight-sharing between the layers, we enable a unified analysis for very different neural network types.
arXiv Detail & Related papers (2021-12-08T16:17:33Z)
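
The unrolling idea in the entry above can be sketched with ISTA for sparse
recovery: each iteration becomes a network layer. This LISTA-style sketch is a
generic illustration, not the paper's construction; here the mixing matrices
are tied across layers, while a learned variant would untie them per layer,
which is the range of weight-sharing regimes the analysis covers.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def unrolled_ista(y, A, n_layers, lam=0.1):
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1/L with L = ||A||_2^2
    W1 = step * A.T                             # input-mixing matrix
    W2 = np.eye(A.shape[1]) - step * (A.T @ A)  # state-mixing matrix
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):                   # one layer per ISTA iteration
        x = soft_threshold(W2 @ x + W1 @ y, step * lam)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 50)) / np.sqrt(20)  # 20 measurements, dim 50
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.0, -2.0, 0.5]
x_hat = unrolled_ista(A @ x_true, A, n_layers=200)
print(np.round(x_hat[[3, 17, 41]], 2))           # approximate sparse recovery
```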
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose a new approach for the regularization of neural networks by the local Rademacher complexity called LocalDrop.
A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
- Measurement error models: from nonparametric methods to deep neural networks [3.1798318618973362]
We propose an efficient neural network design for estimating measurement error models.
We use a fully connected feed-forward neural network to approximate the regression function $f(x)$.
We conduct an extensive numerical study to compare the neural network approach with classical nonparametric methods.
arXiv Detail & Related papers (2020-07-15T06:05:37Z)
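
A minimal sketch of the feed-forward building block mentioned in the entry
above, fit by plain least squares; the measurement-error-specific machinery
(contaminated covariates and their handling) is not reproduced here, and the
data-generating function is an assumption for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(256, 1) * 6 - 3                # covariates on [-3, 3]
y = torch.sin(x) + 0.1 * torch.randn_like(x)  # assumed f(x) = sin(x) + noise

# Fully connected feed-forward network approximating f(x).
net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for epoch in range(500):                      # plain least-squares training
    opt.zero_grad()
    loss = loss_fn(net(x), y)
    loss.backward()
    opt.step()
print(f"final MSE: {loss.item():.4f}")
```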