Have convolutions already made recurrence obsolete for unconstrained handwritten text recognition?
- URL: http://arxiv.org/abs/2012.04954v1
- Date: Wed, 9 Dec 2020 10:15:24 GMT
- Title: Have convolutions already made recurrence obsolete for unconstrained handwritten text recognition?
- Authors: Denis Coquenet, Yann Soullard, Clément Chatelain, Thierry Paquet
- Abstract summary: Unconstrained handwritten text recognition remains an important challenge for deep neural networks.
In recent years, recurrent networks, and Long Short-Term Memory networks in particular, have achieved state-of-the-art performance in this field.
We propose an experimental study regarding different architectures on an offline handwriting recognition task using the RIMES dataset.
- Score: 3.0969191504482247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unconstrained handwritten text recognition remains an important challenge for deep neural networks. In recent years, recurrent networks, and more specifically Long Short-Term Memory networks, have achieved state-of-the-art performance in this field. Nevertheless, they comprise a large number of trainable parameters, and the training of recurrent neural networks cannot be parallelized over time steps. This directly affects the training time of such architectures, and consequently the time required to explore architectural variants. Recently, recurrence-free architectures such as Fully Convolutional Networks with gating mechanisms have been proposed as one possible alternative, achieving competitive results. In this paper, we explore convolutional architectures and compare them to a CNN+BLSTM baseline. We propose an experimental study of different architectures on an offline handwriting recognition task using the RIMES dataset, and a modified version of it in which the images are augmented with notebook backgrounds, i.e., printed grids.
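To ground the comparison, here is a minimal sketch of the kind of CNN+BLSTM baseline the paper compares against, trained with CTC as is standard for line-level handwriting recognition. All layer sizes and the class count are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class CNNBLSTMBaseline(nn.Module):
    """Sketch of a CNN+BLSTM line recognizer trained with CTC.

    Channel widths and the number of LSTM layers are illustrative
    assumptions, not the paper's exact configuration.
    """
    def __init__(self, num_classes: int, height: int = 32):
        super().__init__()
        # Convolutional feature extractor: downsamples height and width.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
        )
        feat_dim = 128 * (height // 4)  # channels * remaining height
        # Bidirectional LSTM models the horizontal character sequence.
        self.blstm = nn.LSTM(feat_dim, 256, num_layers=2,
                             bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * 256, num_classes + 1)  # +1 for CTC blank

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # images: (batch, 1, height, width)
        f = self.cnn(images)                             # (B, C, H', W')
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)   # (B, W', C*H')
        seq, _ = self.blstm(f)
        return self.head(seq).log_softmax(-1)            # (B, W', classes+1)

# Usage with CTC loss (targets and lengths are dummy values):
model = CNNBLSTMBaseline(num_classes=80)
logp = model(torch.randn(2, 1, 32, 256)).permute(1, 0, 2)  # (T, B, C)
targets = torch.randint(1, 81, (2, 20))
loss = nn.CTCLoss()(logp, targets,
                    input_lengths=torch.full((2,), logp.size(0)),
                    target_lengths=torch.full((2,), 20))
```

The recurrent layers are the sequential bottleneck the abstract refers to: the convolutional stack runs in parallel over the whole image, while the BLSTM must step across the width one position at a time.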
Related papers
- Centered Self-Attention Layers [89.21791761168032]
The self-attention mechanism in transformers and the message-passing mechanism in graph neural networks are repeatedly applied.
We show that this application inevitably leads to oversmoothing, i.e., to similar representations at the deeper layers.
We present a correction term to the aggregating operator of these mechanisms.
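As a hedged illustration of what a correction term on the aggregating operator can look like: since softmax attention computes a convex average, one simple centering is to subtract the uniform-average component from the attention weights. This formulation is an assumption for illustration, not necessarily the paper's exact correction.

```python
import torch

def centered_attention(q, k, v):
    """Dot-product attention whose aggregation is re-centered.

    Subtracting the uniform-average aggregation is one illustrative
    way to counteract oversmoothing; the paper's exact correction
    term may differ from this sketch.
    """
    d = q.size(-1)
    attn = torch.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
    # Remove the "average everything" component that repeated
    # aggregation pushes all token representations toward.
    centered = attn - 1.0 / attn.size(-1)
    return centered @ v

q = k = v = torch.randn(2, 5, 16)   # (batch, tokens, dim)
out = centered_attention(q, k, v)
```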
arXiv Detail & Related papers (2023-06-02T15:19:08Z)
- Deep Learning Architecture for Automatic Essay Scoring [0.0]
We propose a novel architecture based on recurrent neural networks (RNNs) and convolutional neural networks (CNNs).
In the proposed architecture, the multichannel convolutional layer learns and captures the contextual features of word n-grams from the word embedding vectors.
Our proposed system achieves significantly higher grading accuracy than other deep learning-based AES systems.
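A minimal sketch of the multichannel convolutional idea, where parallel kernel widths stand in for different word n-gram sizes over the embedding sequence; the widths and channel counts are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MultiChannelNgramConv(nn.Module):
    """Parallel 1-D convolutions over word embeddings.

    Each kernel width corresponds to one n-gram size; widths and
    channel counts here are illustrative assumptions.
    """
    def __init__(self, emb_dim: int = 100, channels: int = 64,
                 ngram_sizes=(2, 3, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, channels, n, padding=n // 2)
            for n in ngram_sizes
        )

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        # emb: (batch, seq_len, emb_dim) -> Conv1d expects (B, C, T)
        x = emb.transpose(1, 2)
        # Max-pool each channel over time, then concatenate branches.
        feats = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return torch.cat(feats, dim=1)   # (batch, channels * n_branches)

emb = torch.randn(8, 300, 100)           # 8 essays, 300 tokens each
features = MultiChannelNgramConv()(emb)  # (8, 192)
```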
arXiv Detail & Related papers (2022-06-16T14:56:24Z)
- Neural Architecture Search for Dense Prediction Tasks in Computer Vision [74.9839082859151]
Deep learning has led to a rising demand for neural network architecture engineering.
Neural architecture search (NAS) aims to automatically design neural network architectures in a data-driven manner rather than manually.
NAS has become applicable to a much wider range of problems in computer vision.
arXiv Detail & Related papers (2022-02-15T08:06:50Z)
- SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and Residual Connections for Structure Preserving Object Classification [28.02302915971059]
In this paper, we introduce an interlaced multi-task learning strategy, named SIRe, to reduce the vanishing gradient problem in the object classification task.
The presented methodology directly improves a convolutional neural network (CNN) by enforcing preservation of the input image structure through auto-encoders.
To validate the presented methodology, a simple CNN and various implementations of famous networks are extended via the SIRe strategy and extensively tested on the CIFAR100 dataset.
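The core mechanism, intermediate features that must also reconstruct the input image through an auto-encoder branch, can be sketched roughly as follows; the layer sizes and the loss weighting are assumptions, not the exact SIRe design.

```python
import torch
import torch.nn as nn

class StructurePreservingCNN(nn.Module):
    """Classifier with an auxiliary decoder that reconstructs the input.

    Forcing intermediate features to retain enough information to
    rebuild the image is the rough idea; sizes and the loss weight
    below are illustrative assumptions.
    """
    def __init__(self, num_classes: int = 100):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(  # auxiliary reconstruction branch
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z)

model = StructurePreservingCNN()
x = torch.randn(4, 3, 32, 32)              # CIFAR-sized images
y = torch.randint(0, 100, (4,))
logits, recon = model(x)
# Joint objective: classification plus structure preservation.
loss = nn.CrossEntropyLoss()(logits, y) + 0.5 * nn.MSELoss()(recon, x)
```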
arXiv Detail & Related papers (2021-10-06T13:54:49Z)
- Applications of Recurrent Neural Network for Biometric Authentication & Anomaly Detection [0.0]
Recurrent Neural Networks are powerful machine learning frameworks that allow for data to be saved and referenced in a temporal sequence.
This paper explores current research on RNNs in four important areas: biometric authentication, expression recognition, anomaly detection, and applications to aircraft.
arXiv Detail & Related papers (2021-09-13T04:37:18Z)
- Multi-Exit Vision Transformer for Dynamic Inference [88.17413955380262]
We propose seven different architectures for early exit branches that can be used for dynamic inference in Vision Transformer backbones.
We show that each one of our proposed architectures could prove useful in the trade-off between accuracy and speed.
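The dynamic-inference mechanism itself is compact to sketch: attach a head after each block and stop as soon as one is confident enough. The plain linear heads and the confidence threshold below are assumptions; the paper proposes seven more elaborate branch designs.

```python
import torch
import torch.nn as nn

class EarlyExitEncoder(nn.Module):
    """Transformer encoder with an exit head after every block."""
    def __init__(self, dim=64, depth=4, num_classes=10):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(depth))
        self.exits = nn.ModuleList(nn.Linear(dim, num_classes)
                                   for _ in range(depth))

    @torch.no_grad()
    def forward(self, tokens, threshold=0.9):
        # tokens: (batch=1, seq, dim); exit once a head is confident.
        for block, exit_head in zip(self.blocks, self.exits):
            tokens = block(tokens)
            logits = exit_head(tokens.mean(dim=1))  # pool over tokens
            if torch.softmax(logits, -1).max() >= threshold:
                break  # confident enough: skip the remaining blocks
        return logits

model = EarlyExitEncoder().eval()
logits = model(torch.randn(1, 16, 64))
```

Lowering the threshold trades accuracy for speed, since more inputs exit at the shallow heads.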
arXiv Detail & Related papers (2021-06-29T09:01:13Z)
- Recurrence-free unconstrained handwritten text recognition using gated fully convolutional network [2.277447144331876]
Unconstrained handwritten text recognition is a major step in most document analysis tasks.
One alternative to using LSTM cells is to compensate for the loss of long-term memory with a heavy use of convolutional layers.
We present a Gated Fully Convolutional Network architecture that is a recurrence-free alternative to the well-known CNN+LSTM architectures.
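The gating idea can be sketched as a convolution whose output is modulated elementwise by a learned sigmoid gate, in the spirit of gated convolutional language models; the exact block layout in the paper may differ.

```python
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    """Convolution modulated by a learned sigmoid gate.

    out = tanh(conv(x)) * sigmoid(gate(x)) lets the network control,
    per position, how much context flows through, which is how gated
    convolutions stand in for the forgetting behavior of LSTM cells.
    Layout details are an assumption, not the paper's exact block.
    """
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.gate = nn.Conv2d(in_ch, out_ch, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.feature(x)) * torch.sigmoid(self.gate(x))

# A recurrence-free recognizer body is then just a stack of such blocks:
body = nn.Sequential(GatedConv2d(1, 32), nn.MaxPool2d(2),
                     GatedConv2d(32, 64), nn.MaxPool2d(2))
features = body(torch.randn(2, 1, 32, 256))   # (2, 64, 8, 64)
```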
arXiv Detail & Related papers (2020-12-09T10:30:13Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
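A rough sketch of such a building block: a spatial path that mixes joints through a normalized adjacency matrix, in parallel with temporal convolutions of several kernel sizes, concatenated inception-style. The adjacency handling, branch widths, and kernel sizes are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class STInceptionBlock(nn.Module):
    """Multi-granularity spatial + temporal block for skeleton input.

    Input: (batch, channels, frames, joints). The spatial branch mixes
    joints via a row-normalized adjacency matrix; temporal branches
    use different kernel sizes. All widths are illustrative assumptions.
    """
    def __init__(self, in_ch, branch_ch, adjacency: torch.Tensor):
        super().__init__()
        self.register_buffer("adj", adjacency / adjacency.sum(-1, keepdim=True))
        self.spatial = nn.Conv2d(in_ch, branch_ch, 1)
        self.temporal = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, (k, 1), padding=(k // 2, 0))
            for k in (3, 5, 7))

    def forward(self, x):
        # Spatial path: 1x1 conv, then aggregate neighboring joints.
        s = torch.einsum("bctv,vw->bctw", self.spatial(x), self.adj)
        # Temporal paths: convolutions along the frame axis only.
        t = [conv(x) for conv in self.temporal]
        return torch.relu(torch.cat([s] + t, dim=1))

joints = 25
adj = torch.eye(joints) + torch.rand(joints, joints).round()  # toy graph
block = STInceptionBlock(in_ch=3, branch_ch=16, adjacency=adj)
out = block(torch.randn(2, 3, 30, joints))   # (2, 64, 30, 25)
```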
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- EASTER: Efficient and Scalable Text Recognizer [0.0]
We present an Efficient And Scalable TExt Recognizer (EASTER) to perform optical character recognition on both machine-printed and handwritten text.
Our model utilises 1-D convolutional layers without any recurrence, which enables parallel training with a considerably smaller volume of data.
We also showcase improvements over the current best results on offline handwritten text recognition task.
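The recurrence-free idea can be sketched by folding the image height into the channel axis and stacking only 1-D convolutions along the width before a CTC-style output; the widths and depths here are assumptions, not EASTER's exact configuration.

```python
import torch
import torch.nn as nn

class Conv1DRecognizer(nn.Module):
    """Only 1-D convolutions along the width axis, then a CTC head.

    The image height is folded into the channel dimension so every
    layer is parallel over positions; widths and depths are
    illustrative assumptions, not EASTER's exact configuration.
    """
    def __init__(self, height: int = 32, num_classes: int = 80):
        super().__init__()
        self.stack = nn.Sequential(
            nn.Conv1d(height, 128, 5, padding=2), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 128, 5, padding=2), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, num_classes + 1, 1),   # +1 for the CTC blank
        )

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # images: (batch, 1, height, width) -> (batch, height, width)
        x = images.squeeze(1)
        return self.stack(x).log_softmax(1)       # (B, classes+1, width)

model = Conv1DRecognizer()
logp = model(torch.randn(2, 1, 32, 256))           # fully parallel over width
```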
arXiv Detail & Related papers (2020-08-18T10:26:03Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture explicitly targeting multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
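A hedged sketch of the multi-scale idea: split the hidden state into modules and refresh module i only every 2^i steps, so later modules integrate information over longer spans. The update rule and sizes are illustrative assumptions; the paper's incremental training procedure is not shown.

```python
import torch
import torch.nn as nn

class MultiScaleRNN(nn.Module):
    """Hidden state split into modules with different update rates.

    Module i is refreshed only every 2**i time steps, so later modules
    see a coarser time scale. This is an illustrative approximation of
    the multi-scale memory idea, not the paper's exact update rule or
    its incremental training procedure.
    """
    def __init__(self, in_dim: int, hidden: int, num_modules: int = 3):
        super().__init__()
        self.cells = nn.ModuleList(
            nn.RNNCell(in_dim, hidden) for _ in range(num_modules))

    def forward(self, x: torch.Tensor):
        # x: (batch, time, in_dim)
        b = x.size(0)
        states = [torch.zeros(b, self.cells[0].hidden_size) for _ in self.cells]
        for t in range(x.size(1)):
            for i, cell in enumerate(self.cells):
                if t % (2 ** i) == 0:          # slow modules update rarely
                    states[i] = cell(x[:, t], states[i])
        return torch.cat(states, dim=-1)       # multi-scale summary

model = MultiScaleRNN(in_dim=8, hidden=16)
summary = model(torch.randn(4, 20, 8))          # (4, 48)
```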
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
- Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack [73.48927855855219]
Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction.
RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems.
One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack.
In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms.
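A minimal sketch of the external differentiable stack: push and pop become soft operations whose action weights blend the candidate next stack states, so gradients flow through the controller's choices. This follows the classic continuous-stack formulation in spirit; the paper's specific architectural and state-updating mechanisms go further.

```python
import torch
import torch.nn as nn

def soft_stack_step(stack, push_val, action):
    """One differentiable stack update.

    stack:    (batch, depth, dim)  current stack contents
    push_val: (batch, dim)         value a hard PUSH would insert
    action:   (batch, 3)           softmax weights over (push, pop, no-op)
    Everything is a convex blend, so the update stays differentiable.
    """
    pushed = torch.cat([push_val.unsqueeze(1), stack[:, :-1]], dim=1)
    popped = torch.cat([stack[:, 1:], torch.zeros_like(stack[:, :1])], dim=1)
    a = action.view(-1, 3, 1, 1)
    return a[:, 0] * pushed + a[:, 1] * popped + a[:, 2] * stack

# Controller RNN reads the input and the stack top, then acts on the stack.
batch, dim, depth = 2, 8, 10
cell = nn.GRUCell(dim * 2, dim)
act_head, val_head = nn.Linear(dim, 3), nn.Linear(dim, dim)

h = torch.zeros(batch, dim)
stack = torch.zeros(batch, depth, dim)
for x in torch.randn(5, batch, dim):               # 5 input steps
    h = cell(torch.cat([x, stack[:, 0]], dim=-1), h)
    stack = soft_stack_step(stack, torch.tanh(val_head(h)),
                            torch.softmax(act_head(h), dim=-1))
```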
arXiv Detail & Related papers (2020-04-04T14:19:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.